We recently added support for VIR_DOMAIN_START_AUTODESTROY and
an impl to the QEMU driver. It is very desirable to support in
other drivers, so this adds it to LXC and UML
* src/lxc/lxc_conf.h, src/lxc/lxc_driver.c,
src/uml/uml_conf.h, src/uml/uml_driver.c: Wire up autodestroy
functions
Detected by Coverity. We want to increment the size_t counter,
not the pointer to the counter. Bug present since 5f5c6fde (0.9.5).
* src/lxc/lxc_controller.c (lxcSetupLoopDevices): Use correct
precedence.
If using one of the new non-NAT/routed virtual network
configurations, the LXC driver would not know how to
setup the VETH devices. Adding in calls to setup the
"actual" network configuration at VM startup and cleanup
when shutting down fixes this.
* src/lxc/lxc_driver.c: Setup/cleanup actual net devs
When booting a virtual machine with a kernel/initrd it is possible
to pass command line arguments using the <cmdline>...args...</cmdline>
element in the guest XML. These appear to the kernel / init process
in /proc/cmdline.
When booting a container we do not have a custom /proc/cmdline,
but we can easily set an environment variable for it. Ideally
we could pass individual arguments to the init process as a
regular set of 'char *argv[]' parameters, but that would involve
libvirt parsing the <cmdline> XML text. This can easily be added
later, even if we add the env variable now
* docs/drvlxc.html.in: Document env variables passed to LXC
* src/conf/domain_conf.c: Add <cmdline> to be parsed for
guests of type='exe'
* src/lxc/lxc_container.c: Set LIBVIRT_LXC_CMDLINE env var
Currently, the lxc implementation invokes 'ip' and 'ifconfig' commands
inside a container using 'virRun'. That has the side effect of requiring
those commands to be present and to function in a manner consistent with
the usage. Some small roots (such as ttylinux) may not have 'ip' or
'ifconfig'.
This patch replaces the use of these commands with usage of
netdevice. The result is that lxc containers do not have to implement
those commands, and lxc in libvirt is only dependent on the netdevice
interface.
I've tested this patch locally against the ubuntu libvirt version enough
to verify its generally sane. I attempted to build upstream today, but
failed with:
/usr/bin/ld:
../src/.libs/libvirt_driver_qemu.a(libvirt_driver_qemu_la-qemu_domain.o):
undefined reference to symbol 'xmlXPathRegisterNs@@LIBXML2_2.4.30
Thats probably a local issue only, but I wanted to get this patch up and
see what others thought of it. This is ubuntu bug
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/828211 .
Hi,
I'm seeing an issue with udev and libvirt-lxc. Libvirt-lxc creates
/dev/ptmx as a symlink to /dev/pts/ptmx. When udev starts up, it
checks the device type, sees ptmx is 'not right', and replaces it
with a 'proper' ptmx.
In lxc, /dev/ptmx is bind-mounted from /dev/pts/ptmx instead of being
symlinked, so udev sees the right device type and leaves it alone.
A patch like the following seems to work for me. Would there be
any objections to this?
>From 4c5035de52de7e06a0de9c5d0bab8c87a806cba7 Mon Sep 17 00:00:00 2001
From: Ubuntu <ubuntu@domU-12-31-39-14-F0-B3.compute-1.internal>
Date: Wed, 31 Aug 2011 18:15:54 +0000
Subject: [PATCH 1/1] make ptmx a bind mount rather than symlink
udev on some systems checks the device type of /dev/ptmx, and replaces it if
not as expected. The symlink created by libvirt-lxc therefore gets replaced.
By creating it as a bind mount, the device type is correct and udev leaves it
alone.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
s/VIR_ERR_NO_SUPPORT/VIR_ERR_OPERATION_INVALID/
Special case is changes on lxcDomainInterfaceStats, if it's not
implemented on the platform, prints error like:
lxcError(VIR_ERR_OPERATION_INVALID, "%s",
_("interface stats not implemented on this platform"));
As the function is supported by driver actually, error like
VIR_ERR_NO_SUPPORT is confused.
The functions for manipulating pidfiles are in util/util.{c,h}.
We will shortly be adding some further pidfile related functions.
To avoid further growing util.c, this moves the pidfile related
functions into a dedicated virpidfile.{c,h}. The functions are
also all renamed to have 'virPidFile' as their name prefix
* util/util.h, util/util.c: Remove all pidfile code
* util/virpidfile.c, util/virpidfile.h: Add new APIs for pidfile
handling.
* lxc/lxc_controller.c, lxc/lxc_driver.c, network/bridge_driver.c,
qemu/qemu_process.c: Add virpidfile.h include and adapt for API
renames
A previous commit gave the LXC driver the ability to mount
block devices for the container filesystem. Through use of
the loopback device functionality, we can build on this to
support use of plain file images for LXC filesytems.
By setting the LO_FLAGS_AUTOCLEAR flag we can ensure that
the loop device automatically disappears when the container
dies / shuts down
* src/lxc/lxc_container.c: Raise error if we see a file
based filesystem, since it should have been turned into
a loopback device already
* src/lxc/lxc_controller.c: Rewrite any filesystems of
type=file, into type=block, by binding the file image
to a free loop device
Currently the LXC driver can only populate filesystems from
host filesystems, using bind mounts. This patch allows host
block devices to be mounted. It autodetects the filesystem
format at mount time, and adds the block device to the cgroups
ACL. Example usage is
<filesystem type='block' accessmode='passthrough'>
<source dev='/dev/sda1'/>
<target dir='/home'/>
</filesystem>
* src/lxc/lxc_container.c: Mount block device filesystems
* src/lxc/lxc_controller.c: Add block device filesystems
to cgroups ACL
An application container shouldn't get a private /dev. Fix
the regression from 6d37888e6a
* src/lxc/lxc_container.c: Don't mount /dev for app containers
Revert 6a1f5f568f. Now that libvirt_iohelper takes fds by
inheritance rather than by open() (commit 1eb66479), there is
no longer a race where the parent can unlink() a file prior to
the iohelper open()ing the same file. From there, it makes
more sense to have the callers both create and unlink, rather
than the caller create and the stream unlink, since the latter
was only needed when iohelper had to do the unlink.
* src/fdstream.h (virFDStreamOpenFile, virFDStreamCreateFile):
Callers are responsible for deletion.
* src/fdstream.c (virFDStreamOpenFileInternal): Don't leak created
file on failure.
(virFDStreamOpenFile, virFDStreamCreateFile): Drop parameter.
* src/lxc/lxc_driver.c (lxcDomainOpenConsole): Update callers.
* src/qemu/qemu_driver.c (qemuDomainScreenshot)
(qemuDomainOpenConsole): Likewise.
* src/storage/storage_driver.c (storageVolumeDownload)
(storageVolumeUpload): Likewise.
* src/uml/uml_driver.c (umlDomainOpenConsole): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainOpenConsole): Likewise.
Although most functions in libvirt return 0 on success and < 0 on
failure, there are a few functions lingering around that return errno
(a positive value) on failure, and sometimes code calling those
functions incorrectly assumes the <0 standard. I noticed one of these
the other day when auditing networkStartDhcpDaemon after Guido Gunther
found a place where success was improperly returned on failure (that
patch has been acked and is pending a push). The problem was that it
expected the return value from virFileReadPid to be < 0 on failure,
but it was actually positive (it was also neglected to set the return
code in this case, similar to the bug found by Guido).
This all led to the fact that *all* of the virFile*Pid functions in
util.c are returning errno on failure. This patch remedies that
problem by changing them all to return -errno on failure, and makes
any necessary changes to callers of the functions. (In the meantime, I
also properly set the return code on failure of virFileReadPid in
networkStartDhcpDaemon).
A container should not be allowed to modify stuff in /sys
or /proc/sys so make them readonly. Make /selinux readonly
so that containers think that selinux is disabled.
Honour the readonly flag when mounting container filesystems
from the guest XML config
* src/lxc/lxc_container.c: Support readonly mounts
Even in non-virtual root filesystem mode we should be mounting
more than just a new /proc. Refactor lxcContainerMountBasicFS
so that it does everything except for /dev and /dev/pts moving
that into lxcContainerMountDevFS. Pass in a source prefix
to lxcContainerMountBasicFS() so it can be used in both shared
root and private root modes.
* src/lxc/lxc_container.c: Unify mounting code for special
filesystems
The bind mount setup is about to get more complicated.
To avoid having to deal with several copies, pull it
out into a separate lxcContainerMountFSBind method.
Also pull out the iteration over container filesystems,
so that it will be easier to drop in support for non-bind
mount filesystems
* src/lxc/lxc_container.c: Pull bind mount code out into
lxcContainerMountFSBind
When libvirtd restarts it will attempt to reconnect to existing
LXC containers. If it loads a XML state file for the container
the container will appear running. If we fail to read the PID
file, or fail to connect to the LXC monitor, we should be killing
off the guest, but if the VMs cgroup does not exist any more,
cleanup will get skipped. Reading the PID file is also pointless
since the PID is in the XML statefile
In lxcReconnectVM we do not need to read the PID file. If part
of the reconnect process fails we need to run the VM terminate
code as a safety net.
In lxcVMTerminate, if we can't obtain the VM cgroup, we know
the process has died, but we must still run lxcVMCleanup to
clear out the virDomainObjPtr live state
* src/lxc/lxc_driver.c: Fix cleanup of dead VMs on restart
The previous patches only cleaned up ATTRIBUTE_UNUSED flags cases;
auditing the drivers found other places where flags was being used
but not validated. In particular, domainGetXMLDesc had issues with
clients accepting a different set of flags than the common
virDomainDefFormat helper function.
* src/conf/domain_conf.c (virDomainDefFormat): Add common flag check.
* src/uml/uml_driver.c (umlDomainAttachDeviceFlags)
(umlDomainDetachDeviceFlags): Reject unknown
flags.
* src/vbox/vbox_tmpl.c (vboxDomainGetXMLDesc)
(vboxDomainAttachDeviceFlags)
(vboxDomainDetachDeviceFlags): Likewise.
* src/qemu/qemu_driver.c (qemudDomainMemoryPeek): Likewise.
(qemuDomainGetXMLDesc): Document common flag handling.
* src/libxl/libxl_driver.c (libxlDomainGetXMLDesc): Likewise.
* src/lxc/lxc_driver.c (lxcDomainGetXMLDesc): Likewise.
* src/openvz/openvz_driver.c (openvzDomainGetXMLDesc): Likewise.
* src/phyp/phyp_driver.c (phypDomainGetXMLDesc): Likewise.
* src/test/test_driver.c (testDomainGetXMLDesc): Likewise.
* src/vmware/vmware_driver.c (vmwareDomainGetXMLDesc): Likewise.
* src/xenapi/xenapi_driver.c (xenapiDomainGetXMLDesc): Likewise.
* src/lxc/lxc_driver.c (lxcOpen, lxcDomainSetMemoryParameters)
(lxcDomainGetMemoryParameters): Reject unknown flags.
* src/lxc/lxc_container.c (lxcContainerStart): Rename flags to
cflags to reflect that it is not tied to libvirt.
The drivers were accepting domain configs without checking if those
were actually meant for them. For example the LXC driver happily
accepts configs with type QEMU.
Add a check for the expected domain types to the virDomainDefParse*
functions.
Some callers expected virFileMakePath to set errno, some expected
it to return an errno value. Unify this to return 0 on success and
-1 on error. Set errno to report detailed error information.
Also optimize virFileMakePath if stat fails with an errno different
from ENOENT.
To avoid regressions, we let callers specify whether to require a
minor and micro version. Callers that were parsing uname() output
benefit from defaulting to 0, whereas callers that were parsing
version strings from other sources should not change in behavior.
* src/util/util.c (virParseVersionString): Allow caller to choose
whether to fail if minor or micro is missing.
* src/util/util.h (virParseVersionString): Update signature.
* src/esx/esx_driver.c (esxGetVersion): Update callers.
* src/lxc/lxc_driver.c (lxcVersion): Likewise.
* src/openvz/openvz_conf.c (openvzExtractVersionInfo): Likewise.
* src/uml/uml_driver.c (umlGetVersion): Likewise.
* src/vbox/vbox_MSCOMGlue.c (vboxLookupVersionInRegistry):
Likewise.
* src/vbox/vbox_tmpl.c (vboxExtractVersion): Likewise.
* src/vmware/vmware_conf.c (vmwareExtractVersion): Likewise.
* src/xenapi/xenapi_driver.c (xenapiGetVersion): Likewise.
Reported by Matthias Bolte.
Since we virEventRegisterDefaultImpl is now a public API, callers need
a way to invoke the default registered Handle and Timeout functions. We
already have general functions for these internally, so promote
them to the public API.
v2:
Actually add APIs to libvirt.h
The LXC driver networking uses veth device pairs. These can
be easily hooked into the network filtering code.
* src/lxc/lxc_driver.c: Add calls to setup/teardown nwfilter
The algorithm for autoassigning vethXXX devices, was always
skipping over the starting dev index when finding a free
name for the guest device. This should only be done if the host
device was autoallocated.
* src/lxc/veth.c: Don't skip over veth indexes
Add a simple handshake with the lxc_controller process so we can detect
process startup failures. We do this by adding a new --handshake cli arg
to lxc_controller for passing a file descriptor. If the process fails to
launch, we scrape all output from the logfile and report it to the user.
Seems reasonable to have all command wrappers in the same place
v2:
Dont move SetInherit
v3:
Comment spelling fix
Adjust WARN0 comment
Remove spurious #include movement
Don't include sys/types.h
Combine virExec enums
Signed-off-by: Cole Robinson <crobinso@redhat.com>
This patch seperate the domain config loading just as qemu driver
does, first loading config of running or trasient domains, then
of persistent inactive domains. And only try to reconnect the
monitor of running domains, so that it won't always throws errors
saying can't connect to domain monitor.
And as "virDomainLoadConfig->virDomainAssignDef->virDomainObjAssignDef",
already do things like "vm->newDef = def", removed the codes
in "lxcReconnectVM" that does the same work.