Commit Graph

331 Commits

Author SHA1 Message Date
Daniel P. Berrange
10a8b1f958 Add support for forcing a private network namespace for LXC guests
If no <interface> elements are included in an LXC guest XML
description, then the LXC guest will just see the host's
network interfaces. It is desirable to be able to hide the
host interfaces, without having to define any guest interfaces.

This patch introduces a new feature flag <privnet/> to allow
forcing of a private network namespace for LXC. In the future
I also anticipate that we will add <privuser/> to force a
private user ID namespace.

* src/conf/domain_conf.c, src/conf/domain_conf.h: Add support
  for <privnet/> feature. Auto-set <privnet> if any <interface>
  devices are defined
* src/lxc/lxc_container.c: Honour request for private network
  namespace
2012-03-15 17:00:39 +00:00
Daniel P. Berrange
6e6aa000c6 Add container_uuid env variable to LXC guests
Systemd has declared that all container virtualization technologies
should set 'container_uuid' to identify themselves.

http://cgit.freedesktop.org/systemd/systemd/commit/?id=09b967eaa51a39dabb7f238927f67bd682466dbc
2012-03-15 11:20:20 +00:00
Daniel Veillard
dd39f13af0 Fix a few typo in translated strings
this was raised by our hindi localization team
chandan kumar <chandankumar.093047@gmail.com>
2012-03-12 17:41:26 +08:00
Ansis Atteka
ac8bbdbdfa Attach vm-id to Open vSwitch interfaces.
This patch will allow OpenFlow controllers to identify which interface
belongs to a particular VM by using the Domain UUID.

ovs-vsctl get Interface vnet0 external_ids
{attached-mac="52:54:00:8C:55:2C", iface-id="83ce45d6-3639-096e-ab3c-21f66a05f7fa", iface-status=active, vm-id="142a90a7-0acc-ab92-511c-586f12da8851"}

V2 changes:
Replaced vm-uuid with vm-id. There was a discussion in Open vSwitch
mailinglist that we should stick with the same DB key postfixes for the
sake of consistency (e.g iface-id, vm-id ...).
2012-03-08 14:44:15 -05:00
Eric Blake
73b9977140 xml: use long long internally, to centralize overflow checks
On 64-bit platforms, unsigned long and unsigned long long are
identical, so we don't have to worry about overflow checks.
On 32-bit platforms, anywhere we narrow unsigned long long back
to unsigned long, we have to worry about overflow; it's easier
to do this in one place by having most of the code use the same
or wider types, and only doing the narrowing at the last minute.
Therefore, the memory set commands remain unsigned long, and
the memory get command now centralizes the overflow check into
libvirt.c, so that drivers don't have to repeat the work.

This also fixes a bug where xen returned the wrong value on
failure (most APIs return -1 on failure, but getMaxMemory
must return 0 on failure).

* src/driver.h (virDrvDomainGetMaxMemory): Use long long.
* src/libvirt.c (virDomainGetMaxMemory): Raise overflow.
* src/test/test_driver.c (testGetMaxMemory): Fix driver.
* src/rpc/gendispatch.pl (name_to_ProcName): Likewise.
* src/xen/xen_hypervisor.c (xenHypervisorGetMaxMemory): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainGetMaxMemory): Likewise.
* src/xen/xend_internal.c (xenDaemonDomainGetMaxMemory):
Likewise.
* src/xen/xend_internal.h (xenDaemonDomainGetMaxMemory):
Likewise.
* src/xen/xm_internal.c (xenXMDomainGetMaxMemory): Likewise.
* src/xen/xm_internal.h (xenXMDomainGetMaxMemory): Likewise.
* src/xen/xs_internal.c (xenStoreDomainGetMaxMemory): Likewise.
* src/xen/xs_internal.h (xenStoreDomainGetMaxMemory): Likewise.
* src/xenapi/xenapi_driver.c (xenapiDomainGetMaxMemory):
Likewise.
* src/esx/esx_driver.c (esxDomainGetMaxMemory): Likewise.
* src/libxl/libxl_driver.c (libxlDomainGetMaxMemory): Likewise.
* src/qemu/qemu_driver.c (qemudDomainGetMaxMemory): Likewise.
* src/lxc/lxc_driver.c (lxcDomainGetMaxMemory): Likewise.
* src/uml/uml_driver.c (umlDomainGetMaxMemory): Likewise.
2012-03-07 18:24:43 -07:00
Martin Kletzander
6ba4b300b0 lxc: Cleaner fix for compilation without SELinux
Just a cleanup of commit 32f881c6c4.
2012-02-29 14:55:32 +01:00
Jiri Denemark
8ab785783f hooks: Add support for capturing hook output
Hooks may now be used as filters.
2012-02-29 12:27:12 +01:00
Martin Kletzander
9f748277bb Fixed URI parsing
Function xmlParseURI does not remove square brackets around IPv6
address when parsing. One of the solutions is making wrappers around
functions working with xmlURI*. This assures that uri->server will be
always properly assigned and it doesn't have to be changed when used
on some new place in the code.
For this purpose, functions virParseURI and virSaveURI were
added. These function are wrappers around xmlParseURI and xmlSaveUri
respectively.
Also there is one new syntax check function to prohibit these functions
anywhere else.

File changes:
 - src/util/viruri.h        -- declaration
 - src/util/viruri.c        -- definition
 - src/libvirt_private.syms -- symbol export
 - src/Makefile.am          -- added source and header files
 - cfg.mk                   -- added sc_prohibit_xmlURI
 - all others               -- ID name and include fixes
2012-02-24 16:49:21 -07:00
Ansis Atteka
df81004632 network: support Open vSwitch
This patch allows libvirt to add interfaces to already
existing Open vSwitch bridges. The following syntax in
domain XML file can be used:

    <interface type='bridge'>
      <mac address='52:54:00:d0:3f:f2'/>
      <source bridge='ovsbr'/>
      <virtualport type='openvswitch'>
        <parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'/>
      </virtualport>
      <address type='pci' domain='0x0000' bus='0x00'
                          slot='0x03' function='0x0'/>
    </interface>

or if libvirt should auto-generate the interfaceid use
following syntax:

    <interface type='bridge'>
      <mac address='52:54:00:d0:3f:f2'/>
      <source bridge='ovsbr'/>
      <virtualport type='openvswitch'>
      </virtualport>
      <address type='pci' domain='0x0000' bus='0x00'
                          slot='0x03' function='0x0'/>
    </interface>

It is also possible to pass an optional profileid. To do that
use following syntax:

   <interface type='bridge'>
     <source bridge='ovsbr'/>
     <mac address='00:55:1a:65:a2:8d'/>
     <virtualport type='openvswitch'>
       <parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'
                   profileid='test-profile'/>
     </virtualport>
   </interface>

To create Open vSwitch bridge install Open vSwitch and
run the following command:

    ovs-vsctl add-br ovsbr
2012-02-15 16:04:54 -05:00
Laine Stump
9368465f75 conf: rename virDomainNetGetActualDirectVirtPortProfile
An upcoming patch will add a <virtualport> element to interfaces of
type='bridge', so it makes sense to give this function a more generic
name.
2012-02-15 16:04:53 -05:00
Daniel P. Berrange
d474dbadde Populate /dev/std{in,out,err} symlinks in LXC containers
Some applications expect /dev/std{in,out,err} to exist. Populate
them during container startup as symlinks to /proc/self/fd
2012-02-08 19:50:15 +00:00
Philipp Hahn
99d24ab2e0 virterror.c: Fix several spelling mistakes
compat{a->i}bility
erron{->e}ous
nec{c->}essary.
Either "the" or "a".

Signed-off-by: Philipp Hahn <hahn@univention.de>
2012-02-03 11:32:51 -07:00
Martin Kletzander
32f881c6c4 Fixed connection definition for non-SELinux builds
This patch fixes the access of variable "con" in two files where the
variable was declared only on SELinux builds and thus the build failed
without SELinux. It's a rather nasty fix but helps fix the build
quickly and without any major changes to the code.
2012-02-03 16:13:45 +01:00
Daniel P. Berrange
5df67cdcd3 Set a security context on /dev and /dev/pts mounts
To allow the container to access /dev and /dev/pts when under
sVirt, set an explicit mount option. Also set a max size on
the /dev mount to prevent DOS on memory usage

* src/lxc/lxc_container.c: Set /dev mount context
* src/lxc/lxc_controller.c: Set /dev/pts mount context
2012-02-02 17:45:19 -07:00
Daniel P. Berrange
0f01192e7e Add support for sVirt in the LXC driver
For the sake of backwards compat, LXC guests are *not*
confined by default. This is because it is not practical
to dynamically relabel containers using large filesystem
trees. Applications can create confined containers though,
by giving suitable XML configs

* src/Makefile.am: Link libvirt_lxc to security drivers
* src/lxc/libvirtd_lxc.aug, src/lxc/lxc_conf.h,
  src/lxc/lxc_conf.c, src/lxc/lxc.conf,
  src/lxc/test_libvirtd_lxc.aug: Config file handling for
  security driver
* src/lxc/lxc_driver.c: Wire up security driver functions
* src/lxc/lxc_controller.c: Add a '--security' flag to
  specify which security driver to activate
* src/lxc/lxc_container.c, src/lxc/lxc_container.h: Set
  the process label just before exec'ing init.
2012-02-02 17:44:39 -07:00
Eric Blake
16dc4ade7a lxc: export container=lxc-libvirt for systemd
Systemd detects containers based on whether they have
an environment variable starting with 'container=lxc';
using a longer name fits the expectations, while also
allowing detection of who created the container.

Requested by Lennart Poettering, in response to
https://bugs.freedesktop.org/show_bug.cgi?id=45175

* src/lxc/lxc_container.c (lxcContainerBuildInitCmd): Add another
env-var.
2012-01-25 08:25:37 -07:00
Daniel P. Berrange
c30a78c398 Don't bind mount onto a char device for /dev/ptmx in LXC
The current setup code for LXC is bind mounting /dev/pts/ptmx
on top of a character device /dev/ptmx. This is denied by SELinux
policy and is just wrong. The target of a bind mount should just
be a plain file

* src/lxc/lxc_container.c: Don't bind /dev/pts/ptmx onto
  a char device
2012-01-25 14:11:08 +00:00
Eric Blake
9e48c22534 util: use new virTypedParameter helpers
Reusing common code makes things smaller; it also buys us some
additional safety, such as now rejecting duplicate parameters
during a set operation.

* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters)
(qemuDomainSetMemoryParameters, qemuDomainSetNumaParameters)
(qemuSetSchedulerParametersFlags)
(qemuDomainSetInterfaceParameters, qemuDomainSetBlockIoTune)
(qemuDomainGetBlkioParameters, qemuDomainGetMemoryParameters)
(qemuDomainGetNumaParameters, qemuGetSchedulerParametersFlags)
(qemuDomainBlockStatsFlags, qemuDomainGetInterfaceParameters)
(qemuDomainGetBlockIoTune): Use new helpers.
* src/esx/esx_driver.c (esxDomainSetSchedulerParametersFlags)
(esxDomainSetMemoryParameters)
(esxDomainGetSchedulerParametersFlags)
(esxDomainGetMemoryParameters): Likewise.
* src/libxl/libxl_driver.c
(libxlDomainSetSchedulerParametersFlags)
(libxlDomainGetSchedulerParametersFlags): Likewise.
* src/lxc/lxc_driver.c (lxcDomainSetMemoryParameters)
(lxcSetSchedulerParametersFlags, lxcDomainSetBlkioParameters)
(lxcDomainGetMemoryParameters, lxcGetSchedulerParametersFlags)
(lxcDomainGetBlkioParameters): Likewise.
* src/test/test_driver.c (testDomainSetSchedulerParamsFlags)
(testDomainGetSchedulerParamsFlags): Likewise.
* src/xen/xen_hypervisor.c (xenHypervisorSetSchedulerParameters)
(xenHypervisorGetSchedulerParameters): Likewise.
2012-01-19 13:20:30 -07:00
Eric Blake
9c3775765e lxc: use live/config helper
Based on qemu changes made in commits ae523427 and 659ded58.

* src/lxc/lxc_driver.c (lxcSetSchedulerParametersFlags)
(lxcGetSchedulerParametersFlags, lxcDomainSetBlkioParameters)
(lxcDomainGetBlkioParameters): Use helpers.
(lxcDomainSetBlkioParameters): Allow setting live and config at
once.
2012-01-19 13:14:10 -07:00
Daniel P. Berrange
c53ba61b21 Fix startup of LXC containers with filesystems containing symlinks
Given an LXC guest with a root filesystem path of

  /export/lxc/roots/helloworld/root

During startup, we will pivot the root filesystem to end up
at

  /.oldroot/export/lxc/roots/helloworld/root

We then try to open

  /.oldroot/export/lxc/roots/helloworld/root/dev/pts

Now consider if '/export/lxc' is an absolute symlink pointing
to '/media/lxc'. The kernel will try to open

  /media/lxc/roots/helloworld/root/dev/pts

whereas it should be trying to open

  /.oldroot//media/lxc/roots/helloworld/root/dev/pts

To deal with the fact that the root filesystem can be moved,
we need to resolve symlinks in *any* part of the filesystem
source path.

* src/libvirt_private.syms, src/util/util.c,
  src/util/util.h: Add virFileResolveAllLinks to resolve
  all symlinks in a path
* src/lxc/lxc_container.c: Resolve all symlinks in filesystem
  paths during startup
2012-01-18 13:34:42 +00:00
Daniel P. Berrange
9130396214 Re-write LXC controller end-of-file I/O handling yet again
Currently the LXC controller attempts to deal with EOF on a
tty by spawning a thread to do an edge triggered epoll_wait().
This avoids the normal event loop spinning on POLLHUP. There
is a subtle mistake though - even after seeing POLLHUP on a
master PTY, it is still perfectly possible & valid to write
data to the PTY. There is a buffer that can be filled with
data, even when no client is present.

The second mistake is that the epoll_wait() thread was not
looking for the EPOLLOUT condition, so when a new client
connects to the LXC console, it had to explicitly send a
character before any queued output would appear.

Finally, there was in fact no need to spawn a new thread to
deal with epoll_wait(). The epoll file descriptor itself
can be poll()'d on normally.

This patch attempts to deal with all these problems.

 - The blocking epoll_wait() thread is replaced by a poll
   on the epoll file descriptor which then does a non-blocking
   epoll_wait() to handle events
 - Even if POLLHUP is seen, we continue trying to write
   any pending output until getting EAGAIN from write.
 - Once write returns EAGAIN, we modify the epoll event
   mask to also look for EPOLLOUT

* src/lxc/lxc_controller.c: Avoid stalled I/O upon
  connected to an LXC console
2012-01-12 20:42:52 +00:00
Daniel P. Berrange
707781fe12 Only add the timer when a callback is registered
The lifetime of the virDomainEventState object is tied to
the lifetime of the driver, which in stateless drivers is
tied to the lifetime of the virConnectPtr.

If we add & remove a timer when allocating/freeing the
virDomainEventState object, we can get a situation where
the timer still triggers once after virDomainEventState
has been freed. The timeout callback can't keep a ref
on the event state though, since that would be a circular
reference.

The trick is to only register the timer when a callback
is registered with the event state & remove the timer
when the callback is unregistered.

The demo for the bug is to run

  while true ; do date ; ../tools/virsh -q -c test:///default 'shutdown test; undefine test; dominfo test' ; done

prior to this fix, it will frequently hang and / or
crash, or corrupt memory
2011-12-19 11:08:25 +00:00
Daniel P. Berrange
34ad13536e Hide use of timers for domain event dispatch
Currently all drivers using domain events need to provide a callback
for handling a timer to dispatch events in a clean stack. There is
no technical reason for dispatch to go via driver specific code. It
could trivially be dispatched directly from the domain event code,
thus removing tedious boilerplate code from all drivers

Also fix the libxl & xen drivers to pass 'true' when creating the
virDomainEventState, since they run inside the daemon & thus always
expect events to be present.

* src/conf/domain_event.c, src/conf/domain_event.h: Internalize
  dispatch of events from timer callback
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
  src/qemu/qemu_domain.c, src/qemu/qemu_driver.c,
  src/remote/remote_driver.c, src/test/test_driver.c,
  src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
  src/xen/xen_driver.c: Remove all timer dispatch functions
2011-12-19 11:08:24 +00:00
Daniel P. Berrange
7b87a30f15 Convert drivers to thread safe APIs for adding callbacks
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
  src/qemu/qemu_driver.c, src/remote/remote_driver.c,
  src/test/test_driver.c, src/uml/uml_driver.c,
  src/vbox/vbox_tmpl.c, src/xen/xen_driver.c: Convert
  to threadsafe APIs
2011-12-19 11:08:10 +00:00
Daniel P. Berrange
d09f6ba5fe Return count of callbacks when registering callbacks
When registering a callback for a particular event some callers
need to know how many callbacks already exist for that event.
While it is possible to ask for a count, this is not free from
race conditions when threaded. Thus the API for registering
callbacks should return the count of callbacks. Also rename
virDomainEventStateDeregisterAny to virDomainEventStateDeregisterID

* src/conf/domain_event.c, src/conf/domain_event.h,
  src/libvirt_private.syms: Return count of callbacks when
  registering callbacks
* src/libxl/libxl_driver.c, src/libxl/libxl_driver.c,
  src/qemu/qemu_driver.c, src/remote/remote_driver.c,
  src/remote/remote_driver.c, src/uml/uml_driver.c,
  src/vbox/vbox_tmpl.c, src/xen/xen_driver.c: Update
  for change in APIs
2011-12-19 11:08:10 +00:00
Stefan Berger
33eb3567dd Pass the VM's UUID into the nwfilter subsystem
A preparatory patch for DHCP snooping where we want to be able to
differentiate between a VM's interface using the tuple of
<VM UUID, Interface MAC address>. We assume that MAC addresses could
possibly be re-used between different networks (VLANs) thus do not only
want to rely on the MAC address to identify an interface.

At the current 'final destination' in virNWFilterInstantiate I am leaving
the vmuuid parameter as ATTRIBUTE_UNUSED until the DHCP snooping patches arrive.
(we may not post the DHCP snooping patches for 0.9.9, though)

Mostly this is a pretty trivial patch. On the lowest layers, in lxc_driver
and uml_conf, I am passing the virDomainDefPtr around until I am passing
only the VM's uuid into the NWFilter calls.
2011-12-08 21:35:20 -05:00
Daniel P. Berrange
4d82fa688e When checking nttyFDs to see if it is != 1, be sure to use '1' and not '-1'
* src/lxc/lxc_controller.c: Fix check for tty count
2011-12-08 15:48:49 +00:00
Daniel P. Berrange
a8bb75a3e6 Remove time APIs from src/util/util.h
The virTimestamp and virTimeMs functions in src/util/util.h
duplicate functionality from virtime.h, in a non-async signal
safe manner. Remove them, and convert all code over to the new
APIs.

* src/util/util.c, src/util/util.h: Delete virTimeMs and virTimestamp
* src/lxc/lxc_driver.c, src/qemu/qemu_domain.c,
  src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
  src/qemu/qemu_process.c, src/util/event_poll.c: Convert to use
  virtime APIs
2011-11-30 11:43:50 +00:00
Daniel P. Berrange
9ae0b8349c Add suspend info to Xen, LXC and UML hypervisor capabilities
* src/lxc/lxc_conf.c, src/uml/uml_conf.c,
  src/xen/xen_hypervisor.c: Initialize suspend capabilities
* tests/xencapsdata/*xml: Add empty powermgmt capabilities
2011-11-30 10:12:30 +00:00
Eric Blake
51727c1dc0 qemu, lxc: drop redundant checks
After the previous patch, there are now some redundant checks.

* src/qemu/qemu_driver.c (qemudDomainGetVcpuPinInfo)
(qemuGetSchedulerParametersFlags): Drop checks now guaranteed by
libvirt.c.
* src/lxc/lxc_driver.c (lxcGetSchedulerParametersFlags):
Likewise.
2011-11-29 10:54:29 -07:00
Srivatsa S. Bhat
4ddb37c395 Implement the core API to suspend/resume the host
Add the core functions that implement the functionality of the API.
Suspend is done by using an asynchronous mechanism so that we can return
the status to the caller before the host gets suspended. This asynchronous
operation is achieved by suspending the host in a separate thread of
execution. However, returning the status to the caller is only best-effort,
but not guaranteed.

To resume the host, an RTC alarm is set up (based on how long we want to
suspend) before suspending the host. When this alarm fires, the host
gets woken up.

Suspend-to-RAM operation on a host running Linux can take upto more than 20
seconds, depending on the load of the system. (Freezing of tasks, an operation
preceding any suspend operation, is given up after a 20 second timeout).
And Suspend-to-Disk can take even more time, considering the time required
for compaction, creating the memory image and writing it to disk etc.
So, we do not allow the user to specify a suspend duration of less than 60
seconds, to be on the safer side, since we don't want to prematurely declare
failure when we only had to wait for some more time.
2011-11-29 17:29:17 +08:00
Daniel P. Berrange
508aef9b0e Refactor initial LXC mem tune / device ACL code
To make lxcSetContainerResources smaller, pull the mem tune
and device ACL setup code out into separate methods

* src/lxc/lxc_controller.c: Introduce lxcSetContainerMemTune
  and lxcSetContainerDeviceACL
2011-11-28 12:06:51 +00:00
Daniel P. Berrange
a04699fc12 Add support for blkio tuning of LXC containers
* src/lxc/lxc_controller.c: Refactor setting of initial blkio
  tuning parameters
* src/lxc/lxc_driver.c: Enable live change of blkio tuning
2011-11-28 12:06:51 +00:00
Daniel P. Berrange
d9724a81b3 Add support for CPU quota/period to LXC driver
* src/lxc/lxc_driver.c: Support changing quota/period for LXC
  containers
* src/lxc/lxc_controller.c: Set initial quota/period at startup
2011-11-28 12:06:29 +00:00
Daniel P. Berrange
9175347828 Support CPU placement in LXC driver
While LXC does not have the concept of VCPUS, so we can't do
per-VCPU pCPU placement, we can support the VM level CPU
placement. Todo this simply set the CPU affinity of the LXC
controller at startup. All child processes will inherit this
affinity.

* src/lxc/lxc_controller.c: Set process affinity
2011-11-28 12:06:27 +00:00
Daniel P. Berrange
3e1b6d7575 Support NUMA memory placement for LXC containers
Use numactl to set NUMA memory placement for LXC containers

* src/lxc/lxc_controller.c: Support NUMA memory placement
2011-11-28 12:05:33 +00:00
Jiri Denemark
2c4cdb736c Fix version numbers for isAlive and setKeepAlive driver APIs 2011-11-24 14:44:59 +01:00
Jiri Denemark
e401b0cd02 Implement virConnectIsAlive in all drivers 2011-11-24 12:00:10 +01:00
Daniel P. Berrange
bfe952c9b2 Add support for interfaces with type=direct to LXC
Support creation of macvlan devices for LXC containers. Do not
allow setting of bandwidth controls or vport profiles due to the
complication that there is no host side visible device to work
with.

* src/lxc/lxc_driver.c: Support type=direct interfaces
2011-11-18 16:12:34 +00:00
Daniel P. Berrange
f3b1b9b184 Refactor LXC network setup to allow future enhancements
The current lxcSetupInterfaces() method directly performs setup
of the bridge devices. Since it will shortly need to also create
macvlan devices, move the bridge related code into a separate
method

* src/lxc/lxc_driver.c: Split lxcSetupInterfaces() to create a
  new lxcSetupInterfaceBridge()
2011-11-18 16:10:37 +00:00
Daniel P. Berrange
428cffb1e7 Move LXC veth.c code into shared utility APIs
Move the virNetDevSetName and virNetDevSetNamespace APIs out
of LXC's veth.c and into virnetdev.c.

Move the remaining content of the file to src/util/virnetdevveth.c

* src/lxc/veth.c: Rename to src/util/virnetdevveth.c
* src/lxc/veth.h: Rename to src/util/virnetdevveth.h
* src/util/virnetdev.c, src/util/virnetdev.h: Add
  virNetDevSetName and virNetDevSetNamespace
* src/lxc/lxc_container.c, src/lxc/lxc_controller.c,
  src/lxc/lxc_driver.c: Update include paths
2011-11-15 10:28:02 +00:00
Daniel P. Berrange
29b242ad80 Rename the LXC veth management APIs and delete duplicated APIs
The src/lxc/veth.c file contains APIs for managing veth devices,
but some of the APIs duplicate stuff from src/util/virnetdev.h.
Delete thed duplicate APIs and rename the remaining ones to
follow virNetDevVethXXXX

* src/lxc/veth.c, src/lxc/veth.h: Rename APIs & delete duplicates
* src/lxc/lxc_container.c, src/lxc/lxc_controller.c,
  src/lxc/lxc_driver.c: Update for API renaming
2011-11-15 10:28:02 +00:00
Eric Blake
e55ec69de6 build: drop useless dirent.h includes
* .gnulib: Update to latest, for improved syntax-check.
* src/lxc/lxc_container.c (includes): Drop unused include.
* src/network/bridge_driver.c: Likewise.
* src/node_device/node_device_linux_sysfs.c: Likewise.
* src/openvz/openvz_driver.c: Likewise.
* src/qemu/qemu_conf.c: Likewise.
* src/storage/storage_backend_iscsi.c: Likewise.
* src/storage/storage_backend_mpath.c: Likewise.
* src/uml/uml_conf.c: Likewise.
* src/uml/uml_driver.c: Likewise.
2011-11-11 14:12:37 -07:00
Daniel P. Berrange
0eee075dc7 Adjust naming of network device bandwidth management APIs
Rename virBandwidth to virNetDevBandwidth, and virRate to
virNetDevBandwidthRate.

* src/util/network.c, src/util/network.h: Rename bandwidth
  structs and APIs
* src/conf/domain_conf.c, src/conf/domain_conf.h,
  src/conf/network_conf.c, src/conf/network_conf.h,
  src/lxc/lxc_driver.c, src/network/bridge_driver.c,
  src/qemu/qemu_command.c, src/util/macvtap.c,
  src/util/macvtap.h, tools/virsh.c: Update for API changes.
2011-11-09 17:10:28 +00:00
Daniel P. Berrange
e49c9bf25c Split bridge.h into three separate files
Following the renaming of the bridge management APIs, we can now
split the source file into 3 corresponding pieces

 * src/util/virnetdev.c: APIs for any type of network interface
 * src/util/virnetdevbridge.c: APIs for bridge interfaces
 * src/util/virnetdevtap.c: APIs for TAP interfaces

* src/util/virnetdev.c, src/util/virnetdev.h,
  src/util/virnetdevbridge.c, src/util/virnetdevbridge.h,
  src/util/virnetdevtap.c, src/util/virnetdevtap.h: Copied
  from bridge.{c,h}
* src/util/bridge.c, src/util/bridge.h: Split into 3 pieces
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
  src/openvz/openvz_driver.c, src/qemu/qemu_command.c,
  src/qemu/qemu_conf.h, src/uml/uml_conf.c, src/uml/uml_conf.h,
  src/uml/uml_driver.c: Update #include directives
2011-11-09 16:34:25 +00:00
Daniel P. Berrange
dced27c89e Rename all brXXXX APIs to follow new convention
The existing brXXX APIs in src/util/bridge.h are renamed to
follow one of three different conventions

 - virNetDevXXX       - operations for any type of interface
 - virNetDevBridgeXXX - operations for bridge interfaces
 - virNetDevTapXXX    - operations for tap interfaces

* src/util/bridge.h, src/util/bridge.c: Rename all APIs
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
  src/qemu/qemu_command.c, src/uml/uml_conf.c,
  src/uml/uml_driver.c: Update for API renaming
2011-11-09 16:33:28 +00:00
Daniel P. Berrange
4f4fd8f7ad Make all brXXX APIs raise errors, instead of returning errnos
Currently every caller of the brXXX APIs has to store the returned
errno value and then raise an error message. This results in
inconsistent error messages across drivers, additional burden on
the callers and makes the error reporting inaccurate since it is
hard to distinguish different scenarios from 1 errno value.

* src/util/bridge.c: Raise errors instead of returning errnos
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
  src/qemu/qemu_command.c, src/uml/uml_conf.c,
  src/uml/uml_driver.c: Remove error reporting code
2011-11-09 16:33:19 +00:00
Daniel P. Berrange
6cfeb9a766 Remove 'brControl' object
The bridge management APIs in src/util/bridge.c require a brControl
object to be passed around. This holds the file descriptor for the
control socket. This extra object complicates use of the API for
only a minor efficiency gain, which is in turn entirely offset by
the need to fork/exec the brctl command for STP configuration.

This patch removes the 'brControl' object entirely, instead opening
the control socket & closing it again within the scope of each method.

The parameter names for the APIs are also made to consistently use
'brname' for bridge device name, and 'ifname' for an interface
device name. Finally annotations are added for non-NULL parameters
and return check validation

* src/util/bridge.c, src/util/bridge.h: Remove brControl object
  and update API parameter names & annotations.
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
  src/uml/uml_conf.h, src/uml/uml_conf.c, src/uml/uml_driver.c,
  src/qemu/qemu_command.c, src/qemu/qemu_conf.h,
  src/qemu/qemu_driver.c: Remove reference to 'brControl' object
2011-11-09 16:33:14 +00:00
Alex Jia
0dbc10a89e lxc: free error object to avoid memory leak
Detected by Coverity. Leak introduced in commit 9d201a5.

* src/lxc/lxc_driver.c: Clean up on failure.

Signed-off-by: Alex Jia <ajia@redhat.com>
2011-11-09 10:35:17 +01:00
Alex Jia
b9338ac828 lxc: free 'ttyFDs' array on return from lxcVmStart
Detected by Coverity. Leak introduced in commit 0f31f7b.

* src/lxc/lxc_driver.c: Clean up on failure.

Signed-off-by: Alex Jia <ajia@redhat.com>
2011-11-09 10:28:50 +01:00
Eric Blake
04d2a7f253 lxc: avoid use-after-free
I got this weird failure:

error: Failed to start domain simple
error: internal error cannot mix caller fds with blocking execution

and tracked it down to a use-after-free - virCommandSetOutputFD
was storing the address of a stack-local variable, which then
went out of scope before the virCommandRun that dereferenced it.

Bug introduced in commit 451cfd05 (0.9.2).

* src/lxc/lxc_driver.c (lxcBuildControllerCmd): Move log fd
registration...
(lxcVmStart): ...to caller.
2011-11-04 08:08:42 -06:00
Eric Blake
8aee48bdaa lxc: use common code for process cleanup
Based on a Coverity report - the return value of waitpid() should
always be checked, to avoid problems with leaking resources.

* src/lxc/lxc_controller.c (lxcControllerRun): Use simpler virPidAbort.
2011-11-03 08:44:19 -06:00
Daniel P. Berrange
209c2880b9 Fix default console type setting
The default console type may vary based on the OS type. ie a Xen
paravirt guests wants a 'xen' console, while a fullvirt guests
wants a 'serial' console.

A plain integer default console type in the capabilities does
not suffice. Instead introduce a callback that is passed the
OS type.

* src/conf/capabilities.h: Use a callback for default console
  type
* src/conf/domain_conf.c, src/conf/domain_conf.h: Use callback
  for default console type. Add missing LXC/OpenVZ console types.
* src/esx/esx_driver.c, src/libxl/libxl_conf.c,
  src/lxc/lxc_conf.c, src/openvz/openvz_conf.c,
  src/phyp/phyp_driver.c, src/qemu/qemu_capabilities.c,
  src/uml/uml_conf.c, src/vbox/vbox_tmpl.c,
  src/vmware/vmware_conf.c, src/xen/xen_hypervisor.c,
  src/xenapi/xenapi_driver.c: Set default console type callback
2011-11-03 12:01:48 +00:00
Daniel P. Berrange
8866eed097 Set aliases for LXC/UML console devices
To allow virDomainOpenConsole to access non-primary consoles,
device aliases are required to be set. Until now only the QEMU
driver has done this. Update LXC & UML to set aliases for any
console devices

* src/lxc/lxc_driver.c, src/uml/uml_driver.c: Set aliases
  for console devices
2011-11-03 12:01:43 +00:00
Daniel P. Berrange
0f31f7b794 Add support for multiple consoles in LXC
Currently the LXC controller only supports setup of a single
text console. This is wired up to the container init's stdio,
as well as /dev/console and /dev/tty1. Extending support for
multiple consoles, means wiring up additional PTYs to /dev/tty2,
/dev/tty3, etc, etc. The LXC controller is passed multiple open
file handles, one for each console requested.

* src/lxc/lxc_container.c, src/lxc/lxc_container.h: Wire up
  all the /dev/ttyN links required to symlink to /dev/pts/NN
* src/lxc/lxc_container.h: Open more container side /dev/pts/NN
  devices, and adapt event loop to handle I/O from all consoles
* src/lxc/lxc_driver.c: Setup multiple host side PTYs
2011-11-03 12:01:13 +00:00
Daniel P. Berrange
86b53e59d8 Rewrite LXC I/O forwarding to use main event loop
The current I/O code for LXC uses a hand crafted event loop
to forward I/O between the container & host app, based on
epoll to handle EOF on PTYs. This event loop is not easily
extensible to add more consoles, or monitor other types of
file descriptors.

Remove the custom event loop and replace it with a normal
libvirt event loop. When detecting EOF on a PTY, disable
the event watch on that FD, and fork off a background thread
that does a edge-triggered epoll() on the FD. When the FD
finally shows new incoming data, the thread re-enables the
watch on the FD and exits.

When getting EOF from a read() on the PTY, the existing code
would do waitpid(WNOHANG) to see if the container had exited.
Unfortunately there is a race condition, because even though
the process has closed its stdio handles, it might still
exist.

To deal with this the new event loop uses a SIG_CHILD handler
to perform the waitpid only when the container is known to
have actually exited.

* src/lxc/lxc_controller.c: Rewrite the event loop to use
  the standard APIs.
2011-11-03 12:01:12 +00:00
Daniel P. Berrange
0873b688c6 Allow multiple consoles per virtual guest
While Xen only has a single paravirt console, UML, and
QEMU both support multiple paravirt consoles. The LXC
driver can also be trivially made to support multiple
consoles. This patch extends the XML to allow multiple
<console> elements in the XML. It also makes the UML
and QEMU drivers support this config.

* src/conf/domain_conf.c, src/conf/domain_conf.h: Allow
  multiple <console> devices
* src/lxc/lxc_driver.c, src/xen/xen_driver.c,
  src/xenxs/xen_sxpr.c, src/xenxs/xen_xm.c: Update for
  internal API changes
* src/security/security_selinux.c, src/security/virt-aa-helper.c:
  Only label consoles that aren't a copy of the serial device
* src/qemu/qemu_command.c, src/qemu/qemu_driver.c,
  src/qemu/qemu_process.c, src/uml/uml_conf.c,
  src/uml/uml_driver.c: Support multiple console devices
* tests/qemuxml2xmltest.c, tests/qemuxml2argvtest.c: Extra
  tests for multiple virtio consoles. Set QEMU_CAPS_CHARDEV
  for all console /channel tests
* tests/qemuxml2argvdata/qemuxml2argv-channel-virtio-auto.args,
  tests/qemuxml2argvdata/qemuxml2argv-channel-virtio.args
  tests/qemuxml2argvdata/qemuxml2argv-console-virtio.args: Update
  for correct chardev syntax
* tests/qemuxml2argvdata/qemuxml2argv-console-virtio-many.args,
  tests/qemuxml2argvdata/qemuxml2argv-console-virtio-many.xml: New
  test file
2011-11-03 12:01:05 +00:00
Eric Blake
f4e584decf lxc: allow getting < max typed parameters
Allow the user to call with nparams too small, per API documentation.
Also, libvirt.c filters out nparams of 0 for scheduler parameters.

* src/lxc/lxc_driver.c (lxcDomainGetMemoryParameters): Allow fewer
than max.
(lxcGetSchedulerParametersFlags): Drop redundant check.
2011-11-02 14:00:13 -06:00
Eric Blake
319992d4b6 API: document scheduler parameter names
Document the parameter names that will be used by
virDomain{Get,Set}SchedulerParameters{,Flags}, rather than
hard-coding those names in each driver, to match what is
done with memory, blkio, and blockstats parameters.

* include/libvirt/libvirt.h.in (VIR_DOMAIN_SCHEDULER_CPU_SHARES)
(VIR_DOMAIN_SCHEDULER_VCPU_PERIOD)
(VIR_DOMAIN_SCHEDULER_VCPU_QUOTA, VIR_DOMAIN_SCHEDULER_WEIGHT)
(VIR_DOMAIN_SCHEDULER_CAP, VIR_DOMAIN_SCHEDULER_RESERVATION)
(VIR_DOMAIN_SCHEDULER_LIMIT, VIR_DOMAIN_SCHEDULER_SHARES): New
field name macros.
* src/qemu/qemu_driver.c (qemuSetSchedulerParametersFlags)
(qemuGetSchedulerParametersFlags): Use new defines.
* src/test/test_driver.c (testDomainGetSchedulerParamsFlags)
(testDomainSetSchedulerParamsFlags): Likewise.
* src/xen/xen_hypervisor.c (xenHypervisorGetSchedulerParameters)
(xenHypervisorSetSchedulerParameters): Likewise.
* src/xen/xend_internal.c (xenDaemonGetSchedulerParameters)
(xenDaemonSetSchedulerParameters): Likewise.
* src/lxc/lxc_driver.c (lxcSetSchedulerParametersFlags)
(lxcGetSchedulerParametersFlags): Likewise.
* src/esx/esx_driver.c (esxDomainGetSchedulerParametersFlags)
(esxDomainSetSchedulerParametersFlags): Likewise.
* src/libxl/libxl_driver.c (libxlDomainGetSchedulerParametersFlags)
(libxlDomainSetSchedulerParametersFlags): Likewise.
2011-11-02 13:52:56 -06:00
Daniel P. Berrange
9d201a5c22 Don't overwrite error message during VM cleanup
If an LXC VM fails to start, quite a few cleanup paths will
result in the original error message being overwritten. Some
other cleanup paths also forgot to actually terminate the VM.

* src/lxc/lxc_driver.c: Ensure VM is terminated on startup
  failure and preserve original error
2011-11-01 18:40:37 +00:00
Daniel P. Berrange
26798492e3 Add support for probing filesystem with libblkid
The LXC code for mounting container filesystems from block devices
tries all filesystems in /etc/filesystems and possibly those in
/proc/filesystems. The regular mount binary, however, first tries
using libblkid to detect the format. Add support for doing the same
in libvirt, since Fedora's /etc/filesystems is missing many formats,
most notably ext4 which is the default filesystem Fedora uses!

* src/Makefile.am: Link libvirt_lxc to libblkid
* src/lxc/lxc_container.c: Probe filesystem format with libblkid
2011-11-01 18:40:37 +00:00
Daniel P. Berrange
6828535669 Fix error message when failing to detect filesystem
If we looped through /etc/filesystems trying to mount with each
type and failed all options, we forget to actually raise an
error message.

* src/lxc/lxc_container.c: Raise error if unable to detect
  the filesystems. Also fix existing error message
2011-11-01 18:40:37 +00:00
Daniel P. Berrange
878cc33a6a Workaround for broken kernel autofs mounts
The kernel automounter is mostly broken wrt to containers. Most
notably if you start a new filesystem namespace and then attempt
to unmount any autofs filesystem, it will typically fail with a
weird error message like

  Failed to unmount '/.oldroot/sys/kernel/security':Too many levels of symbolic links

Attempting to detach the autofs mount using umount2(MNT_DETACH)
will also fail with the same error. Therefore if we get any error on
unmount()ing a filesystem from the old root FS when starting a
container, we must immediately break out and detach the entire
old root filesystem (ignoring any mounts below it).

This has the effect of making the old root filesystem inaccessible
to anything inside the container, but at the cost that the mounts
live on in the kernel until the container exits. Given that SystemD
uses autofs by default, we need LXC to be robust this scenario and
thus this tradeoff is worthwhile.

* src/lxc/lxc_container.c: Detach root filesystem if any umount
  operation fails.
2011-11-01 18:40:37 +00:00
Daniel P. Berrange
a02f57faa9 Correctly handle '*' in /etc/filesystems
The /etc/filesystems file can contain a '*' on the last line to
indicate that /proc/filessystems should be tried next. We have
a check that this '*' only occurs on the last line. Unfortunately
when we then start reading /proc/filesystems, we mistakenly think
we've seen '*' in /proc/filesystems and fail

* src/lxc/lxc_container.c: Skip '*' validation when we're reading
  /proc/filesystems
2011-11-01 18:40:37 +00:00
Daniel P. Berrange
065ecf5162 Ensure errno is valid when returning from lxcContainerWaitForContinue
Only some of the return paths of lxcContainerWaitForContinue will
have set errno. In other paths we need to set it manually to avoid
the caller getting a random stale errno value

* src/lxc/lxc_container.c: Set errno in lxcContainerWaitForContinue
2011-11-01 18:40:37 +00:00
Peter Krempa
95d3b4de71 lxc: Revert zeroing count of allocated items if VIR_REALLOC_N fails
Previous commit clears number of items alocated in lxcSetupLoopDevices
if VIR_REALLOC_N fails. In that case, the pointer is not NULL, and
causes leaking FDs that have been allocated.

 *  src/lxc/lxc_controller.c: revert zeroing array size
2011-10-27 10:32:21 +02:00
Alex Jia
3fd2b1e9d0 lxc: avoid null deref on lxcSetupLoopDevices failure
If the function lxcSetupLoopDevices(def, &nloopDevs, &loopDevs) failed,
the variable loopDevs will keep a initial NULL value, however, the
function VIR_FORCE_CLOSE(loopDevs[i]) will directly deref it.

This patch also fixes returning a bogous number of devices from
lxcSetupLoopDevices on an error path.

* rc/lxc/lxc_controller.c: fixed a null pointer dereference.

Signed-off-by: Alex Jia <ajia@redhat.com>
2011-10-27 10:03:10 +02:00
Alex Jia
d2dff42598 lxc: avoid missing '{' in the function
Cppcheck detected a syntaxError on lxcDomainInterfaceStats.

* src/lxc/lxc_driver.c: fixed missing '{' in the function lxcDomainInterfaceStats.

Signed-off-by: Alex Jia <ajia@redhat.com>
2011-10-27 09:33:26 +02:00
Eric Blake
69d044c034 waitpid: improve safety
Based on a report by Coverity.  waitpid() can leak resources if it
fails with EINTR, so it should never be used without checking return
status.  But we already have a helper function that does that, so
use it in more places.

* src/lxc/lxc_container.c (lxcContainerAvailable): Use safer
virWaitPid.
* daemon/libvirtd.c (daemonForkIntoBackground): Likewise.
* tests/testutils.c (virtTestCaptureProgramOutput, virtTestMain):
Likewise.
* src/libvirt.c (virConnectAuthGainPolkit): Simplify with virCommand.
2011-10-24 15:42:52 -06:00
Serge E. Hallyn
80710c69fe lxc: use hand-rolled code in place of unlockpt and grantpt
The glibc ones (intentionally) cannot handle ptys opened in a
devpts not mounted at /dev/pts.

Drop the (un-exported, unused) virFileOpenTtyAt.

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2011-10-19 14:47:16 -06:00
Daniel P. Berrange
02e92dc470 Add support for autodestroy of guests to the LXC and UML drivers
We recently added support for VIR_DOMAIN_START_AUTODESTROY and
an impl to the QEMU driver. It is very desirable to support in
other drivers, so this adds it to LXC and UML

* src/lxc/lxc_conf.h, src/lxc/lxc_driver.c,
  src/uml/uml_conf.h, src/uml/uml_driver.c: Wire up autodestroy
  functions
2011-10-19 09:14:27 +01:00
Serge E. Hallyn
d60299c3ec Fix typo in lxc_controller
s/Mouting/Mounting.

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
2011-10-13 09:44:17 -06:00
Eric Blake
dbbe16c26e maint: typo fixes
I noticed a couple typos in recent commits, and fixed the remaining
instances of them.

* docs/internals/command.html.in: Fix spelling errors.
* include/libvirt/libvirt.h.in (virConnectDomainEventCallback):
Likewise.
* python/libvirt-override.py (virEventAddHandle): Likewise.
* src/lxc/lxc_container.c (lxcContainerChild): Likewise.
* src/util/hash.c (virHashCreateFull): Likewise.
* src/storage/storage_backend_logical.c
(virStorageBackendLogicalMakeVol): Likewise.
* src/esx/esx_driver.c (esxFormatVMXFileName): Likewise.
* src/vbox/vbox_tmpl.c (vboxIIDIsEqual_v3_x): Likewise.
2011-10-10 14:02:06 -06:00
Eric Blake
2e593ba518 lxc: fix logic bug
Detected by Coverity.  We want to increment the size_t counter,
not the pointer to the counter.  Bug present since 5f5c6fde (0.9.5).

* src/lxc/lxc_controller.c (lxcSetupLoopDevices): Use correct
precedence.
2011-10-07 20:49:12 -06:00
Daniel P. Berrange
b59bb93129 Make LXC work with new network configuration types
If using one of the new non-NAT/routed virtual network
configurations, the LXC driver would not know how to
setup the VETH devices. Adding in calls to setup the
"actual" network configuration at VM startup and cleanup
when shutting down fixes this.

* src/lxc/lxc_driver.c: Setup/cleanup actual net devs
2011-10-06 10:20:01 +01:00
Daniel P. Berrange
652f887144 Allow passing of command line args to LXC container
When booting a virtual machine with a kernel/initrd it is possible
to pass command line arguments using the <cmdline>...args...</cmdline>
element in the guest XML. These appear to the kernel / init process
in /proc/cmdline.

When booting a container we do not have a custom /proc/cmdline,
but we can easily set an environment variable for it. Ideally
we could pass individual arguments to the init process as a
regular set of 'char *argv[]' parameters, but that would involve
libvirt parsing the <cmdline> XML text. This can easily be added
later, even if we add the env variable now

* docs/drvlxc.html.in: Document env variables passed to LXC
* src/conf/domain_conf.c: Add <cmdline> to be parsed for
  guests of type='exe'
* src/lxc/lxc_container.c: Set LIBVIRT_LXC_CMDLINE env var
2011-10-04 14:15:09 +01:00
Daniel P. Berrange
6cc9ee9b18 Add support for bandwidth filtering on LXC guests
Call virBandwidthEnable after creating the LXC veth, so that any
bandwidth controls get applied

* src/lxc/lxc_driver.c: Enable bandwidth limiting
2011-10-04 14:15:09 +01:00
Michal Privoznik
45ad3d6962 debug: Annotate some variables as unused
as they are not used with debugging turned off.
2011-09-27 10:16:46 +02:00
Peter Krempa
79cf07af7c Avoid using "devname" as an identifier.
/usr/lib/stdlib.h in Mac OS X and probably also in BSD's
exports this symbol :(
2011-09-16 20:49:04 +08:00
Scott Moser
f0fe28cb8d lxc: do not require 'ifconfig' or 'ipconfig' in container
Currently, the lxc implementation invokes 'ip' and 'ifconfig' commands
inside a container using 'virRun'.  That has the side effect of requiring
those commands to be present and to function in a manner consistent with
the usage.  Some small roots (such as ttylinux) may not have 'ip' or
'ifconfig'.

This patch replaces the use of these commands with usage of
netdevice.  The result is that lxc containers do not have to implement
those commands, and lxc in libvirt is only dependent on the netdevice
interface.

I've tested this patch locally against the ubuntu libvirt version enough
to verify its generally sane.  I attempted to build upstream today, but
failed with:
  /usr/bin/ld:
    ../src/.libs/libvirt_driver_qemu.a(libvirt_driver_qemu_la-qemu_domain.o):
   undefined reference to symbol 'xmlXPathRegisterNs@@LIBXML2_2.4.30

Thats probably a local issue only, but I wanted to get this patch up and
see what others thought of it.  This is ubuntu bug
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/828211 .
2011-09-01 20:11:50 -06:00
Serge Hallyn
c1665ba872 Create ptmx as a device
Hi,

I'm seeing an issue with udev and libvirt-lxc.  Libvirt-lxc creates
/dev/ptmx as a symlink to /dev/pts/ptmx.  When udev starts up, it
checks the device type, sees ptmx is 'not right', and replaces it
with a 'proper' ptmx.

In lxc, /dev/ptmx is bind-mounted from /dev/pts/ptmx instead of being
symlinked, so udev sees the right device type and leaves it alone.

A patch like the following seems to work for me.  Would there be
any objections to this?

>From 4c5035de52de7e06a0de9c5d0bab8c87a806cba7 Mon Sep 17 00:00:00 2001
From: Ubuntu <ubuntu@domU-12-31-39-14-F0-B3.compute-1.internal>
Date: Wed, 31 Aug 2011 18:15:54 +0000
Subject: [PATCH 1/1] make ptmx a bind mount rather than symlink

udev on some systems checks the device type of /dev/ptmx, and replaces it if
not as expected.  The symlink created by libvirt-lxc therefore gets replaced.
By creating it as a bind mount, the device type is correct and udev leaves it
alone.

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
2011-09-01 20:11:50 -06:00
Osier Yang
6af0c3e82b lxc: Fix incorrect changes on error codes.
Fix incorrect changes introduced by commit 6ac47762bb.
2011-09-01 17:34:31 +08:00
Eric Blake
3a52b864dd maint: fix comment typos
* src/qemu/qemu_driver.c (qemuDomainSaveInternal): Fix typo.
* src/conf/domain_event.c (virDomainEventDispatchMatchCallback):
Likewise.
* daemon/libvirtd.c (daemonRunStateInit): Likewise.
* src/lxc/lxc_container.c (lxcContainerChildMountSort): Likewise.
* src/util/virterror.c (virCopyError, virRaiseErrorFull): Likewise.
* src/xenxs/xen_sxpr.c (xenParseSxprSound): Likewise.
2011-08-23 11:31:28 -06:00
Osier Yang
6ac47762bb lxc: Cleanup improper VIR_ERR_NO_SUPPORT use
s/VIR_ERR_NO_SUPPORT/VIR_ERR_OPERATION_INVALID/

Special case is changes on lxcDomainInterfaceStats, if it's not
implemented on the platform, prints error like:

    lxcError(VIR_ERR_OPERATION_INVALID, "%s",
             _("interface stats not implemented on this platform"));

As the function is supported by driver actually, error like
VIR_ERR_NO_SUPPORT is confused.
2011-08-23 16:17:10 +08:00
Osier Yang
b375fc01e2 lxc: Allow to undefine a running domain 2011-08-19 21:47:33 +08:00
Daniel P. Berrange
f80a4ed77a Move pidfile functions into util/virpidfile.{c,h}
The functions for manipulating pidfiles are in util/util.{c,h}.
We will shortly be adding some further pidfile related functions.
To avoid further growing util.c, this moves the pidfile related
functions into a dedicated virpidfile.{c,h}. The functions are
also all renamed to have 'virPidFile' as their name prefix

* util/util.h, util/util.c: Remove all pidfile code
* util/virpidfile.c, util/virpidfile.h: Add new APIs for pidfile
  handling.
* lxc/lxc_controller.c, lxc/lxc_driver.c, network/bridge_driver.c,
  qemu/qemu_process.c: Add virpidfile.h include and adapt for API
  renames
2011-08-12 20:37:00 +01:00
Daniel P. Berrange
5f5c6fde00 Allow use of file images for LXC container filesystems
A previous commit gave the LXC driver the ability to mount
block devices for the container filesystem. Through use of
the loopback device functionality, we can build on this to
support use of plain file images for LXC filesytems.

By setting the LO_FLAGS_AUTOCLEAR flag we can ensure that
the loop device automatically disappears when the container
dies / shuts down

* src/lxc/lxc_container.c: Raise error if we see a file
  based filesystem, since it should have been turned into
  a loopback device already
* src/lxc/lxc_controller.c: Rewrite any filesystems of
  type=file, into type=block, by binding the file image
  to a free loop device
2011-08-08 11:38:09 +01:00
Daniel P. Berrange
8c7477c481 Fix typo in LXC cgroups setup error message
* src/lxc/lxc_controller.c: s/PYT/PTY/
2011-08-08 11:38:09 +01:00
Daniel P. Berrange
77791dc0e1 Allow use of block devices for guest filesystem
Currently the LXC driver can only populate filesystems from
host filesystems, using bind mounts. This patch allows host
block devices to be mounted. It autodetects the filesystem
format at mount time, and adds the block device to the cgroups
ACL. Example usage is

    <filesystem type='block' accessmode='passthrough'>
      <source dev='/dev/sda1'/>
      <target dir='/home'/>
    </filesystem>

* src/lxc/lxc_container.c: Mount block device filesystems
* src/lxc/lxc_controller.c: Add block device filesystems
  to cgroups ACL
2011-08-08 11:38:05 +01:00
Daniel P. Berrange
b6bd2d3466 Don't mount /dev for application containers
An application container shouldn't get a private /dev. Fix
the regression from 6d37888e6a

* src/lxc/lxc_container.c: Don't mount /dev for app containers
2011-08-08 11:24:35 +01:00
Eric Blake
00ef048f62 fdstream: drop delete argument
Revert 6a1f5f568f.  Now that libvirt_iohelper takes fds by
inheritance rather than by open() (commit 1eb66479), there is
no longer a race where the parent can unlink() a file prior to
the iohelper open()ing the same file.  From there, it makes
more sense to have the callers both create and unlink, rather
than the caller create and the stream unlink, since the latter
was only needed when iohelper had to do the unlink.

* src/fdstream.h (virFDStreamOpenFile, virFDStreamCreateFile):
Callers are responsible for deletion.
* src/fdstream.c (virFDStreamOpenFileInternal): Don't leak created
file on failure.
(virFDStreamOpenFile, virFDStreamCreateFile): Drop parameter.
* src/lxc/lxc_driver.c (lxcDomainOpenConsole): Update callers.
* src/qemu/qemu_driver.c (qemuDomainScreenshot)
(qemuDomainOpenConsole): Likewise.
* src/storage/storage_driver.c (storageVolumeDownload)
(storageVolumeUpload): Likewise.
* src/uml/uml_driver.c (umlDomainOpenConsole): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainOpenConsole): Likewise.
2011-08-02 14:53:43 -06:00
Laine Stump
d6354c1696 util: change virFile*Pid functions to return < 0 on failure
Although most functions in libvirt return 0 on success and < 0 on
failure, there are a few functions lingering around that return errno
(a positive value) on failure, and sometimes code calling those
functions incorrectly assumes the <0 standard. I noticed one of these
the other day when auditing networkStartDhcpDaemon after Guido Gunther
found a place where success was improperly returned on failure (that
patch has been acked and is pending a push). The problem was that it
expected the return value from virFileReadPid to be < 0 on failure,
but it was actually positive (it was also neglected to set the return
code in this case, similar to the bug found by Guido).

This all led to the fact that *all* of the virFile*Pid functions in
util.c are returning errno on failure. This patch remedies that
problem by changing them all to return -errno on failure, and makes
any necessary changes to callers of the functions. (In the meantime, I
also properly set the return code on failure of virFileReadPid in
networkStartDhcpDaemon).
2011-07-25 16:56:26 -04:00
Daniel P. Berrange
b3ad9b9b80 Honour filesystem readonly flag & make special FS readonly
A container should not be allowed to modify stuff in /sys
or /proc/sys so make them readonly. Make /selinux readonly
so that containers think that selinux is disabled.

Honour the readonly flag when mounting container filesystems
from the guest XML config

* src/lxc/lxc_container.c: Support readonly mounts
2011-07-22 15:31:11 +01:00
Daniel P. Berrange
6d37888e6a Refactor mounting of special filesystems
Even in non-virtual root filesystem mode we should be mounting
more than just a new /proc. Refactor lxcContainerMountBasicFS
so that it does everything except for /dev and /dev/pts moving
that into lxcContainerMountDevFS. Pass in a source prefix
to lxcContainerMountBasicFS() so it can be used in both shared
root and private root modes.

* src/lxc/lxc_container.c: Unify mounting code for special
  filesystems
2011-07-22 15:31:11 +01:00
Daniel P. Berrange
66a00e61a4 Pull code for doing a bind mount into separate method
The bind mount setup is about to get more complicated.
To avoid having to deal with several copies, pull it
out into a separate lxcContainerMountFSBind method.

Also pull out the iteration over container filesystems,
so that it will be easier to drop in support for non-bind
mount filesystems

* src/lxc/lxc_container.c: Pull bind mount code out into
  lxcContainerMountFSBind
2011-07-22 15:31:07 +01:00
Michal Privoznik
2dd3f025a0 destroy: Implement internal API for lxc driver 2011-07-21 20:41:27 +02:00
Eric Blake
8e22e08935 build: rename files.h to virfile.h
In preparation for a future patch adding new virFile APIs.

* src/util/files.h, src/util/files.c: Move...
* src/util/virfile.h, src/util/virfile.c: ...here, and rename
functions to virFile prefix.  Macro names are intentionally
left alone.
* *.c: All '#include "files.h"' uses changed.
* src/Makefile.am (UTIL_SOURCES): Reflect rename.
* cfg.mk (exclude_file_name_regexp--sc_prohibit_close): Likewise.
* src/libvirt_private.syms: Likewise.
* docs/hacking.html.in: Likewise.
* HACKING: Regenerate.
2011-07-21 10:34:51 -06:00
Osier Yang
39babffb73 undefine: Implement undefineFlags for all other drivers 2011-07-20 11:08:21 +08:00
Daniel P. Berrange
80cafba310 Fix now dead cleanup of VMs on libvirtd restart
When libvirtd restarts it will attempt to reconnect to existing
LXC containers. If it loads a XML state file for the container
the container will appear running. If we fail to read the PID
file, or fail to connect to the LXC monitor, we should be killing
off the guest, but if the VMs cgroup does not exist any more,
cleanup will get skipped. Reading the PID file is also pointless
since the PID is in the XML statefile

In lxcReconnectVM we do not need to read the PID file. If part
of the reconnect process fails we need to run the VM terminate
code as a safety net.

In lxcVMTerminate, if we can't obtain the VM cgroup, we know
the process has died, but we must still run lxcVMCleanup to
clear out the virDomainObjPtr live state

* src/lxc/lxc_driver.c: Fix cleanup of dead VMs on restart
2011-07-18 16:13:56 +01:00
Eric Blake
461e0f1a2d flags: use common dumpxml flags check
The previous patches only cleaned up ATTRIBUTE_UNUSED flags cases;
auditing the drivers found other places where flags was being used
but not validated.  In particular, domainGetXMLDesc had issues with
clients accepting a different set of flags than the common
virDomainDefFormat helper function.

* src/conf/domain_conf.c (virDomainDefFormat): Add common flag check.
* src/uml/uml_driver.c (umlDomainAttachDeviceFlags)
(umlDomainDetachDeviceFlags): Reject unknown
flags.
* src/vbox/vbox_tmpl.c (vboxDomainGetXMLDesc)
(vboxDomainAttachDeviceFlags)
(vboxDomainDetachDeviceFlags): Likewise.
* src/qemu/qemu_driver.c (qemudDomainMemoryPeek): Likewise.
(qemuDomainGetXMLDesc): Document common flag handling.
* src/libxl/libxl_driver.c (libxlDomainGetXMLDesc): Likewise.
* src/lxc/lxc_driver.c (lxcDomainGetXMLDesc): Likewise.
* src/openvz/openvz_driver.c (openvzDomainGetXMLDesc): Likewise.
* src/phyp/phyp_driver.c (phypDomainGetXMLDesc): Likewise.
* src/test/test_driver.c (testDomainGetXMLDesc): Likewise.
* src/vmware/vmware_driver.c (vmwareDomainGetXMLDesc): Likewise.
* src/xenapi/xenapi_driver.c (xenapiDomainGetXMLDesc): Likewise.
2011-07-15 12:22:20 -06:00