Like the comment suggested, we just open the file and pass the file
descriptor to uml. The input "stream" is set to "null", since I couldn't
find any useful way to actually use a file for input for a chardev and
this also mimics what e.g. QEmu does internally.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
Instead of using one big traversal spec for lookup use a set of
more fine grained traversal specs that are selected based on the
actual needs of the lookup.
This gives up to 20% speedup for certain operations like domain
listing due to less HTTP(S) traffic.
* src/xenapi/xenapi_driver.c (xenapiDomainGetInfo): Avoid using
XEN_VM_POWER_STATE_UNKNOWN, which disappeared in newer xenapi.
* src/xenapi/xenapi_utils.c (mapPowerState): Likewise.
Previously QEMU enabled KQEMU by default and had -no-kqemu.
0.11.x switched to requiring -enable-kqemu. 0.12.x dropped
kqemu entirely. This patch adds support for -enable-kqemu
so 0.11.x works. It replaces a huge set of if() with a
switch() to make the code a bit more readable.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Support
-enable-kqemu
With the previous storage pool UUID source not all storage pools
had a proper UUID, especially GSX storage pools. The mount path
is unique per host and cannot change during the lifetime of the
datastore. Therefore, it's MD5 sum can be used as UUID.
Use gnulib's crypto/md5 module to generate the MD5 sum.
Xen supports on_crash actions coredump-{destroy,restart}. libvirt
cannot parse config returned by xend that contains either of these
actions
xen52 # xm li -l test | grep on_crash
(on_crash coredump-restart)
xen52 # virsh dumpxml test
error: internal error unknown lifecycle type coredump-restart
This patch adds a new virDomainLifecycleCrash enum and appends
the new options to existing destroy, restart, preserve, and
rename-restart options.
We already filled the PCI address structure when we checked whether it's
free or not, so let's just use the structure here instead of filling it
again.
I wrote a patch to add support for listing the Vendor and Model of a
storage pool in the storage pool XML. This would allow vendor
extensions of specific devices. The patch includes a test for the new
attributes as well.
Patrick Dignan
* src/uml/uml_driver.c (umlMonitorCommand): Validate that enough
bytes were read to dereference both res.length, and that many
bytes from res.data.
Reported by Soren Hansen.
node_device/node_device_driver.c: In function 'nodeDeviceVportCreateDelete':
node_device/node_device_driver.c:423: error: implicit declaration of function 'stat' [-Wimplicit-function-declaration]
* src/node_device/node_device_driver.c (includes): Add <sys/stat.h>.
Doing `virsh schedinfo rhel5u3 --cap 65535' the hypervisor does the
call, but does not change the value nor raise an error. Best is just to
consider it's not in the allowed values. The problem is that the error
won't be output since the xend driver will then be called and raise an
error
error: this function is not supported by the hypervisor: unsupported
in xendConfigVersion < 4
which will override the useful information from
xenUnifiedDomainSetSchedulerParameters(). So best is to also invert the
order in which the xen sub-drivers are called.
* src/xen/xen_hypervisor.c: mark 65535 cap value as out of bound
* src/xen/xen_hypervisor.c: reverse the order of the calls to the xen
sub drivers to get the error message if needed
Some kernels, such as the one used in RHEL-5, have vport_create and
vport_delete operation files in /sys/class/scsi_host/hostN directory
instead of /sys/class/fc_host/hostN. Let's check both paths for
compatibility reasons.
This also removes unnecessary '/' characters from sysfs paths containing
LINUX_SYSFS_FC_HOST_PREFIX.
NodeDeviceCreateXML and NodeDeviceDestroy methods added for NPIV were
using the wrong privateData field for the remote driver. This doesn't
impact KVM, since the remote driver handles everything, thus
privateData == devMonPrivateData. It does impact Xen though, because
the remote driver only handles a subset of methods and thus
privateData != devMonPrivateData.
In case an optional object cannot be found the lookup function is
left early and the cleanup code is not executed.
This pattern occurs in some other functions too.
The current version of the qemu managed save implementation
is subject to a race where the domain shuts down between
the time that we start the command and the time that we
actually try to do the save. Close this race by making
qemuDomainSaveFlags() expect both the driver and the passed-in
vm object to be locked before executing.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When reconnecting to existing VMs, we re-reserved only those PCI
addresses which were explicitly mentioned in domain XML. Since some
addresses are always reserved (e.g., 0:0:0 and 0:0:1), we need to handle
those too.
Also all this should only be done if device flag is supported by qemu.
In this patch I am extending and fixing the nwfilter module's reload support to stop all ongoing threads (for learning IP addresses of interfaces) and rebuild the filtering rules of all interfaces of all VMs when libvirt is started. Now libvirtd rebuilds the filters upon the SIGHUP signal and libvirtd restart.
About the patch: The nwfilter functions require a virConnectPtr. Therefore I am opening a connection in qemudStartup, which later on needs to be closed outside where the driver lock is held since otherwise it ends up in a deadlock due to virConnectClose() trying to lock the driver as well.
I have tested this now for a while with several machines running and needing the IP address learner thread(s). The rebuilding of the firewall rules seems to work fine following libvirtd restart or a SIGHUP. Also the termination of libvirtd worked fine.
floppy0.present defaults to true. Therefore, it needs to be
explicitly set to false when the XML config doesn't specify the
corresponding floppy device.
Also update tests accordingly.
I changed virStorage[Open|Close] to virVIOSDriver[Open|Close] so
the network driver can use it - since the network driver deals
with Open/Close in the same way.
This patch does two things:
* It makes umlConnectTapDevice ask brAddTap for a persistent tap by
passing it a NULL tapfd argument.
* Stops umlConnectTapDevice from immediately dismantling the bridge
it just set up.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
When passing a NULL tapfd argument to brAddTap, we need to close the fd
of the tap device. If we don't, libvirt will keep the fd open
indefinitely and renders the the guest unable to configure its side of
the tap device.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
If umlBuildCommandLineChr fails (e.g. due to an unsupported chardev
type), it returns NULL. umlBuildCommandLine does not check for this and
sets this as an argument on the comand line, effectively ending the
argument list. This patch checks for this case and sets the chardev to
"none".
Signed-off-by: Soren Hansen <soren@linux2go.dk>
When sniffing the network traffic, discard class D and E IP addresses when sniffing traffic. This was a reason why filters were not correctly rebuilt on VMs on the local 192.* network when libvirt was restarted and those VMs did not use a DHCP request to get its IP address.
While testing the SIGHUP handling and reloading of the nwfilter driver, I found that when the filters are rebuilt and mutlipe threads handled the individual interfaces, concurrently running multiple external bash scripts causes strange failures even though the executed ebtables commands are working on different tables for different interfaces. I cannot say for sure where the concurrency problems are caused, but introducing this lock definitely helps.
Since the qemu process is running as qemu:qemu, it can't actually
look at the unix socket in /var/run/libvirt/qemu which is owned by
root and has permission 700. Move the unix socket to
/var/lib/libvirt/qemu, which is already owned by qemu:qemu.
Thanks to Justin Clift for test this out for me.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The problem is that on the source of the migration, libvirtd
is responsible for creating the unix socket over which the data
will flow. Since libvirtd is running as root, this file will
be created as root. When the qemu process running as qemu:qemu
goes to access the unix file to write data to it, it will get
permission denied and fail. Make sure to change the owner
of the unix file to qemu:qemu.
Thanks to Justin Clift for testing this patch out for me.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
This patch fixes a couple of complaints from valgrind when tickling libvirtd with SIGHUP.
The first two files contain fixes for memory leaks. The 3rd one initializes an uninitialized variable. The 4th one is another memory leak.
Basically a followup of the previous patch about balloon desactivation
if desactivated, to not ask for balloon information to qemu as we will
just get an error back.
This can make a huge difference in the time needed for domain
information or list when a machine is loaded, and balloon has been
desactivated in the guests.
* src/qemu/qemu_driver.c: do not get the balloon info if the balloon
suppor is disabled
--dhcp-no-override description from dnsmasq man page:
Disable re-use of the DHCP servername and filename fields as
extra option space. If it can, dnsmasq moves the boot server and
filename information (from dhcp-boot) out of their dedicated
fields into DHCP options. This make extra space available in the
DHCP packet for options but can, rarely, confuse old or broken
clients. This flag forces "simple and safe" behaviour to avoid
problems in such a case.
It seems some virtual network card ROMs are this old/buggy so let's add
--dhcp-no-override as a workaround for them. We don't use extra DHCP
options so this should be safe. The option was added in dnsmasq-2.41,
which becomes the minimum required version.
For parsing try to match by datastore mount path first, if that
fails fallback to /vmfs/volumes/<datastore>/<path> parsing. This
also fixes problems with GSX on Windows. Because GSX on Windows
doesn't use /vmfs/volumes/ style file names.
For formatting use the datastore mount path too, instead of using
/vmfs/volumes/<datastore>/<path> as fixed format.
We add --dhcp-lease-max=xxx argument when network->def->nranges > 0 but
we only allocate space for in the opposite case :-) I guess we are lucky
enough to miscount somewhere else so that we actually allocate more
space than we need since no-one has hit this bug so far.
Introduce esxVMX_Context containing functions pointers to
glue both parts together in a generic way.
Move the ESX specific part to esx_driver.c.
This is a step towards making the VMX code reusable in a
potential VMware Workstation and VMware Player driver.
The balloon device is automatically added to qemu guests if supported,
but it may be useful to desactivate it. The simplest to not change the
existing behaviour is to allow
<memballoon type="none"/>
as an extra option to desactivate it (it is automatically added if the
memballoon construct is missing for the domain).
The following simple patch just adds the extra option and does not
change the default behaviour but avoid creating a balloon device if
type="none" is used.
* docs/schemas/domain.rng: add the extra type attribute value
* src/conf/domain_conf.c src/conf/domain_conf.h: add the extra enum
value
* src/qemu/qemu_conf.c: if enum is NONE, don't activate the device,
i.e. don't pass the args to qemu/kvm
Added a more detailed error message when adding a tap devices fails and
the kernel is missing tun support.
Signed-off-by: Doug Goldstein <cardoe@gentoo.org>
Fix the error checking to use the return value from brAddTap() instead
of checking the current errno value which might have been changed by
clean up calls inside of brAddTap().
Signed-off-by: Doug Goldstein <cardoe@gentoo.org>
https://bugzilla.redhat.com/622515 - When hot-unplugging CPUs,
libvirt failed to start a guest that had been pinned to CPUs that
were still online.
Tested on a dual-core laptop, where I also discovered that, per
http://www.cyberciti.biz/files/linux-kernel/Documentation/cpu-hotplug.txt,
/sys/devices/system/cpu/cpu0/online does not exist on systems where it
cannot be hot-unplugged.
* src/nodeinfo.c (linuxNodeInfoCPUPopulate): Ignore CPUs that are
currently offline. Detect readdir failure.
(parse_socket): Move guts...
(get_cpu_value): ...to new function, shared with...
(cpu_online): New function.
device_del command is not synchronous for PCI devices, it merely asks
the guest to release the device and returns. If the host wants to use
that device before the guest actually releases it, we are in big
trouble. To avoid this, we already added a loop which waits up to 10
seconds until the device is actually released before we do anything else
with that device. But we only added this loop for managed PCI devices
before we try reattach them back to the host.
However, we need to wait even for non-managed devices. We don't reattach
them automatically, but we still want to prevent the host from using it.
This was revealed thanks to sVirt: when we relabel sysfs files
corresponding to the PCI device before the guest finished releasing the
device, qemu is no longer allowed to access those files and if it wants
(as a result of guest's request) to write anything to them, it just
exits, which kills the guest.
This is not a proper fix and needs some further work both on libvirt and
qemu side in the future.
virDiskNameToIndex has a list of disk name prefixes that it uses in the
process of finding the disk's index. This list is missing "ubd" which
is the disk prefix used for UML domains.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
That way it can be used to verify a numeric address without storing
the details
* src/util/network.c: change virSocketParseAddr to allow a null @addr
parameter
According to <xen-3.4.3/tools/python/xen/xm/create.py:158>
gopts.var('bootargs', val='NAME',
fn=set_value, default=None,
use="Arguments to pass to boot loader")
the "bootloader_args" parameter needs to be translated into "bootargs"
when using "virsh domxml-to-native xen-xm".
The reverse direction (domxml-from-native) is already okay.
This patch fixes domxml-to-native and adds two test files to catch this
problem.
Signed-off-by: Philipp Hahn <hahn@univention.de>
src/phyp/phyp_driver.c:phypListDomainsGeneric was crashing due to a buffer
overflow if any line returned from virRun wasn't <=10 characters.
Since virStrToLong_i recognizes any non-numeric as a terminator (not
just NULL), there actually is no need to copy the number into a
separate string anyway, so this patch eliminates that copy, the fixed
length buffer, and therefore the potential to overflow.
This change also provided the oppurtunity to eliminate the character
counting loop, instead using the return from virStrToLong_i to point
past the end of the number, then simply skip the \n to get to the
next.
Fix the error checking to use the return value from brAddTap() instead
of checking the current errno value which might have been changed by
clean up calls inside of brAddTap().
Signed-off-by: Doug Goldstein <cardoe@gentoo.org>
Added a more detailed error message when adding a tap devices fails and
the kernel is missing tun support.
Signed-off-by: Doug Goldstein <cardoe@gentoo.org>
the followup on the boot=on problem, basically it's not needed to
specify it when booting out of IDE devices when using KVM
* src/qemu/qemu_conf.c: do not use boot=on for IDE devices
* tests/qemuxml2argvdata/qemuxml2argv*.args: this changes the output
for 5 of the tests
Patch version revamped by Eric Blake <eblake@redhat.com> of Jiri
Denemark <jdenemar@redhat.com> original patch
When attaching a PCI device which doesn't explicitly set its PCI
address, libvirt allocates the address automatically. The problem is
that when checking which PCI address is unused, we only check for those
with slot number higher than the highest slot number ever used.
Thus attaching/detaching such device several times in a row (31 is the
theoretical limit, less then 30 tries are enough in practise) makes any
further device attachment fail. Furthermore, attaching a device with
predefined PCI address to 0:0:31 immediately forbids attachment of any
PCI device without explicit address.
This patch changes the logic so that we always check all PCI addresses
before we say there is no PCI address available.
Modifications from v1: revert back to remembering the last slot
reserved, but allow wraparound to not be limited by the end.
In this way, slots are still assigned in the same order as
before the patch, rather than filling in the gaps closest to
0 and risking making windows guests mad.
* src/qemu/qemu_conf.c: fix pci reservation code to do a round-robbin
check of all available PCI splot availability before failing.
Don't rely on summary.url anymore, because its value is different
between an esx:// and vpx:// connection. Use host.mountInfo.path
instead.
Don't fallback to lookup by UUID (actually lookup by absolute path)
in esxVI_LookupDatastoreByName when lookup by name fails. Add a
seperate function for this: esxVI_LookupDatastoreByAbsolutePath
Now a vpx:// connection has an explicitly specified host. This
allows to enabled several functions for a vpx:// connection
again, like host UUID, hostname, general node info, max vCPU
count, free memory, migration and defining new domains.
Lookup datacenter, compute resource, resource pool and host
system once and cache them. This simplifies the rest of the
code and reduces overall HTTP(S) traffic a bit.
esx:// and vpx:// can be mixed freely for a migration.
Ensure that migration source and destination refer to the
same vCenter. Also directly encode the resource pool and
host system object IDs into the migration URI in the prepare
function. Then directly build managed object references in
the perform function instead of re-looking up already known
information.
This patch attempts to take advantage of a newly added netfilter
module to correct for a problem with some guest DHCP client
implementations when used in conjunction with a DHCP server run on the
host systems with packet checksum offloading enabled.
The problem is that, when the guest uses a RAW socket to read the DHCP
response packets, the checksum hasn't yet been fixed by the IP stack,
so it is incorrect.
The fix implemented here is to add a rule to the POSTROUTING chain of
the mangle table in iptables that fixes up the checksum for packets on
the virtual network's bridge that are destined for the bootpc port (ie
"dhcpc", ie port 68) port on the guest.
Only very new versions of iptables will have this support (it will be
in the next upstream release), so a failure to add this rule only
results in a warning message. The iptables patch is here:
http://patchwork.ozlabs.org/patch/58525/
A corresponding kernel module patch is also required (the backend of
the iptables patch) and that will be in the next release of the
kernel.
When trying to assign a PCI device to a guest, we have
to check that all bridges upstream of that device support
ACS. That means that we have to find the parent bridge of
the current device, check for ACS, then find the parent bridge
of that device, check for ACS, etc. As it currently stands,
the code to do this iterates through all PCI devices on the
system, looking for a device that has a range of busses that
included the current device's bus.
That check is not restrictive enough, though. Depending on
how we iterated through the list of PCI devices, we could first
find the *topmost* bridge in the system; since it necessarily had
a range of busses including the current device's bus, we
would only ever check the topmost bridge, and not check
any of the intermediate bridges.
Note that this also caused a fairly serious bug in the
secondary bus reset code, where we could erroneously
find and reset the topmost bus instead of the inner bus.
This patch changes pciGetParentDevice() so that it first
checks if a bridge device's secondary bus exactly matches
the bus of the device we are looking for. If it does, we've
found the correct parent bridge and we are done. If it does not,
then we check to see if this bridge device's busses *include* the
bus of the device we care about. If so, we mark this bridge device
as best, and go on. If we later find another bridge device whose
busses include this device, but is more restrictive, then we
free up the previous best and mark the new one as best. This
algorithm ensures that in the normal case we find the direct
parent, but in the case that the parent bridge secondary bus
is not exactly the same as the device, we still find the
correct bridge.
This patch was tested by me on a 4-port NIC with a
bridge without ACS (where assignment failed), a 4-port
NIC with a bridge with ACS (where assignment succeeded),
and a 2-port NIC with no bridges (where assignment
succeeded).
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When parsing hostdev, the following message would be emitted:
10:17:19.052: error : virDomainHostdevDefParseXML:3748 : internal error unknown node alias
However, alias is appropriately parsed in
virDomainDeviceInfoParseXML anyway. Disable the error message
in the initial XML parsing loop.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When an openvz domain is defined with virDomainDefineXML,
domain id is set to -1. A call to virDomainGetInfo after
starting the domain would then fail because this invalid
id is passed to openvzGetProcessInfo.
Found by clang. Clang complained that virStorageBackendProbeTarget
could dereference NULL if backingStoreFormat was NULL, but since all
callers passed a valid pointer, I added attributes instead of null
checks.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Kill dead store.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise. Skip null checks, by adding attributes.
valgrind was complaining that virUUIDParse was depending on
an uninitialized value. Indeed it was; virSetHostUUIDStr()
didn't initialize the dmiuuid buffer to 0's, meaning that
anything after the string read from /sys was uninitialized.
Clear out the dmiuuid buffer before use, and make sure to
always leave a \0 at the end.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Basically the 'boot=on' boot selection device is something present in
KVM but not in upstream QEmu, as a result if we boot a QEmu domain
without KVM acceleration we must disable boot=on ... even if the front
end kvm binary expose that capability in the help page.
* src/qemu/qemu_conf.c: in qemudBuildCommandLine if -no-kvm
is passed, then deactivate QEMUD_CMD_FLAG_DRIVE_BOOT
ADD_ARG_LIT should only be used for literal arguments,
since it duplicates the memory. Since virBufferContentAndReset
is already allocating memory, we should only use ADD_ARG.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
esxVI_WaitForTaskCompletion can take a UUID to lookup the
corresponding domain and check if the current task for it
is blocked by a question. It calls another function to do
this: esxVI_LookupAndHandleVirtualMachineQuestion looks up
the VirtualMachine and checks for a question. If there is
a question it calls esxVI_HandleVirtualMachineQuestion to
handle it.
If there was no question or it has been answered the call
to esxVI_LookupAndHandleVirtualMachineQuestion returns 0.
If any error occurred during the lookup and answering
process -1 is returned. The problem with this is, that -1
is also returned when there was no error but the question
could not be answered. So esxVI_WaitForTaskCompletion cannot
distinguish between this two situations and reports that a
question is blocking the task even when there was actually
another problem.
This inherent problem didn't surface until vSphere 4.1 when
you try to define a new domain. The driver tries to lookup
the domain that is just in the process of being registered.
There seems to be some kind of race condition and the driver
manages to issue a lookup command before the ESX server was
able to register the domain. This used to work before.
Due to the return value problem described above the driver
reported a false error message in that case.
To solve this esxVI_WaitForTaskCompletion now takes an
additional occurrence parameter that describes whether or
not to expect the domain to be existent. Also add a new
parameter to esxVI_LookupAndHandleVirtualMachineQuestion
that allows to distinguish if the call returned -1 because
of an actual error or because the question could not be
answered.
'./autobuild.sh' with lcov installed discovered that our
coverage support has been bit-rotting for a while. This
restores it back to a successful state, although I have
not yet spent any time looking through the resulting files to
look for low-hanging fruit in the unit test coverage front.
* configure.ac: Clear COMPILER_FLAGS at right place.
* Makefile.am (cov): Newer genhtml no longer likes plain -s.
* m4/compiler-flags.m4 (gl_COMPILER_FLAGS): Don't AC_SUBST
COMPILER_FLAGS; it is a shell variable for use in configure only.
* src/Makefile.am (AM_CFLAGS, AM_LDFLAGS): New variables, to make
it easier to provide global flag additions. Use throughout, to
uniformly apply coverage flags.
* .gitignore: Globally ignore gcov output.
* daemon/.gitignore: Simplify.
* src/.gitignore: Likewise.
* tests/.gitignore: Likewise.
The recent switch to enable -Wlogical-op paid off again.
gcc 4.5.0 (rawhide) is smarter than 4.4.4 (Fedora 13).
* src/xen/xend_internal.c (xenDaemonAttachDeviceFlags)
(xenDaemonUpdateDeviceFlags, xenDaemonDetachDeviceFlags): Use
correct operator.
Signed-off-by: Eric Blake <eblake@redhat.com>
src/lxc/veth.c:150: VIR_DEBUG(_("Failed to delete '%s' (%d)"),
src/lxc/veth.c:188: VIR_DEBUG(_("Failed to disable '%s' (%d)"),
maint.mk: do not mark these strings for translation
* src/lxc/veth.c (vethDelete, vethInterfaceUpOrDown): Don't
translate VIR_DEBUG.
Previously, the functions in src/lxc/veth.c could sometimes return
positive values on failure rather than -1. This made accurate error
reporting difficult, and led to one failure to catch an error in a
calling function.
This patch makes all the functions in veth.c consistently return 0 on
success, and -1 on failure. It also fixes up the callers to the veth.c
functions where necessary.
Note that this patch may be related to the bug:
https://bugzilla.redhat.com/show_bug.cgi?id=607496.
It will not fix the bug, but should unveil what happens.
* po/POTFILES.in - add veth.c, which previously had no translatable strings
* src/lxc/lxc_controller.c
* src/lxc/lxc_container.c
* src/lxc/lxc_driver.c - fixup callers to veth.c, and remove error logs,
as they are now done in veth.c
* src/lxc/veth.c - make all functions consistently return -1 on error.
* src/lxc/veth.h - use ATTRIBUTE_NONNULL to protect against NULL args.
This fixes a leak described in
https://bugzilla.redhat.com/show_bug.cgi?id=590073
xenUnifiedDomainInfoList has a pointer to a list of pointers to
xenUnifiedDomain. We were freeing up all the domains, but neglecting
to free the list.
This was found by Paolo Bonzini <pbonzini@redhat.com>.
If detecting the FLR flag of a pci device fails, then we
could run into the situation of trying to close a file
descriptor twice, once in pciInitDevice() and once in pciFreeDevice().
Fix that by removing the pciCloseConfig() in pciInitDevice() and
just letting pciFreeDevice() handle it.
Thanks to Chris Wright for pointing out this problem.
While we are at it, fix an error check. While it would actually
work as-is (since success returns 0), it's still more clear to
check for < 0 (as the rest of the code does).
Signed-off-by: Chris Lalancette <clalance@redhat.com>
All <console> devices now export a <target> type attribute. QEMU defaults
to 'serial', UML defaults to 'uml, xen can be either 'serial' or 'xen'
depending on fullvirt. Understandably there is lots of test fallout.
This will be used to differentiate between a serial vs. virtio console for
QEMU.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
targetType only tracks the actual <target> format we are parsing. Currently
we only fill abide this value for channel devices.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
There is actually a difference between the character device type (serial,
parallel, channel, ...) and the target type (virtio, guestfwd). Currently
they are awkwardly conflated.
Start to pull them apart by renaming targetType -> deviceType. This is
an entirely mechanical change.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
During function test of the 802.1Qbg implementation in lldpad we came
across a small problem in the handling of the netlink message
corresponding to PORT_PROFILE_RESPONSE_INPROGRESS. This should not
result in returning the default rc=1.
- src/util/macvtap.c: fix getPortProfileStatus() to return 0 in that
case and also fix an indentation problem
QEMU has had two different syntax for disk cache options
Old: on|off
New: writeback|writethrough|none
QEMU recently added another 'unsafe' option which broke the
libvirt check. We can avoid this & future breakage, if we
do a negative check for the old syntax, instead of a positive
check for the new syntax
* src/qemu/qemu_conf.c: Invert cache option check
Add a new element to the <os> block:
<bootmenu enable="yes|no"/>
Which maps to -boot,menu=on|off on the QEMU command line.
I decided to use an explicit 'enable' attribute rather than just make the
bootmenu element boolean. This allows us to treat lack of a bootmenu element
as 'use hypervisor default'.
Some buggy PCI devices actually support FLR, but
forget to advertise that fact in their PCI config space.
However, Virtual Functions on SR-IOV devices are
*required* to support FLR by the spec, so force has_flr
on if this is a virtual function.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When doing a PCI secondary bus reset, we must be sure that there are no
active devices on the same bus segment. The active device tracking is
designed to only track host devices that are active in use by guests.
This ignores host devices that are actively in use by the host. So the
current logic will reset host devices.
Switch this logic around and allow sbus reset when we are assigning all
devices behind a bridge to the same guest at guest startup or as a result
of a single attach-device command.
* src/util/pci.h: change signature of pciResetDevice to add an
inactive devices list
* src/qemu/qemu_driver.c src/xen/xen_driver.c: use (or not) the new
functionality of pciResetDevice() depending on the place of use
* src/util/pci.c: implement the interface and logic changes
- src/qemu/qemu_driver.c: Eliminate code duplication by using the new
helpers qemuPrepareHostdevPCIDevices and qemuDomainReAttachHostdevDevices.
This reduces the number of open coded calls to pciResetDevice.
- src/qemu/qemu_driver.c: These new helpers take hostdev list and count
directly rather than getting them indirectly from domain definition.
This will allow reuse for the attach-device case.
- src/qemu/qemu_driver.c: Update qemuGetPciHostDeviceList to take a
hostdev list and count directly, rather than getting this indirectly
from domain definition. This will allow reuse for the attach-device case.
Add a pointer to the primary context of a connection and use it in all
driver functions that don't dependent on the context type. This includes
almost all functions that deal with a virDomianPtr. Therefore, using
a vpx:// connection allows you to perform all the usual domain related
actions like start, destroy, suspend, resume, dumpxml etc.
Some functions that require an explicitly specified ESX server don't work
yet. This includes the host UUID, the hostname, the general node info, the
max vCPU count and the free memory. Also not working yet are migration and
defining new domains.
Since 070f61002f the vcenter query
parameter has been ignored, because the refactoring to use
esxUtil_ParseQuery was incomplete. This effectively broke migration,
because the vcenter query parameter is essential for a migration.
Commit 68719c4bdd added the
p option to control disk format probing, but it wasn't added
to the getopt_long optstring parameter.
Add the p option to the getopt_long optstring parameter.
Commit a885334499 added this
function and wrapped vah_add_file in it. vah_add_file may
return -1, 0, 1. It returns 1 in case the call to valid_path
detects a restricted file. The original code treated a return
value != 0 as error. The refactored code treats a return
value < 0 as error. This triggers segfault in virt-aa-helper
and breaks virt-aa-helper-test for the restricted file tests.
Make sure that add_file_path returns -1 on error.
virt-aa-helper used to ignore errors when opening files.
Commit a885334499 refactored
the related code and changed this behavior. virt-aa-helper
didn't ignore open errors anymore and virt-aa-helper-test
fails.
Make sure that virt-aa-helper ignores open errors again.
Thanks to DV for knocking together the Relax-NG changes
quickly for me.
Changes since v1:
- Change the domain.rng to correspond to the new schema
- Don't allocate caps->ns in testQemuCapsInit since it is a static table
Changes since v2:
- Change domain.rng to add restrictions on allowed environment names
Changes since v3:
- Remove a bogus comment in the tests
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Since we are adding a new "per-hypervisor" protocol, we
make it so that the qemu remote protocol uses a new
PROTOCOL and PROGRAM number. This allows us to easily
distinguish it from the normal REMOTE protocol.
This necessitates changing the proc in remote_message_header
from a "remote_procedure" to an "unsigned", which should
be the same size (and thus preserve the on-wire protocol).
Changes since v1:
- Fixed up a couple of script problems in remote_generate_stubs.pl
- Switch an int flag to a bool in dispatch.c
Changes since v2:
- None
Changes since v3:
- Change unsigned proc to signed proc, to conform to spec
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Implement the qemu driver's virDomainQemuMonitorCommand
and hook it into the API entry point.
Changes since v1:
- Rename the (external) qemuMonitorCommand to qemuDomainMonitorCommand
- Add virCheckFlags to qemuDomainMonitorCommand
Changes since v2:
- Drop ATTRIBUTE_UNUSED from the flags
Changes since v3:
- Add a flag to priv so we only print out monitor command warning once. Note
that this has not been plumbed into qemuDomainObjPrivateXMLFormat or
qemuDomainObjPrivateXMLParse, which means that if you run a monitor command,
restart libvirtd, and then run another monitor command, you may get an
an erroneous VIR_INFO. It's a pretty minor matter, and I didn't think it
warranted the additional code.
- Add BeginJob/EndJob calls around EnterMonitor/ExitMonitor
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Add the library entry point for the new virDomainQemuMonitorCommand()
entry point. Because this is not part of the "normal" libvirt API,
it gets its own header file, library file, and will eventually
get its own over-the-wire protocol later in the series.
Changes since v1:
- Go back to using the virDriver table for qemuDomainMonitorCommand, due to
linking issues
- Added versioning information to the libvirt-qemu.so
Changes since v2:
- None
Changes since v3:
- Add LGPL header to libvirt-qemu.c
- Make virLibConnError and virLibDomainError macros instead of function calls
Changes since v4:
- Move exported symbols to libvirt_qemu.syms
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Now that we have the ability to specify arbitrary qemu
command-line parameters in the XML, use it to handle unknown
command-line parameters when doing a native-to-xml conversion.
Changes since v1:
- Rename num_extra to num_args
- Fix up a memory leak on an error path
Changes since v2:
- Add a VIR_WARN when adding the argument via qemu:arg
Changes since v3:
- None
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Implement the qemu hooks for XML namespace data. This
allows us to specify a qemu XML namespace, and then
specify:
<qemu:commandline>
<qemu:arg value='arg'/>
<qemu:env name='name' value='value'/>
</qemu:commandline>
In the domain XML.
Changes since v1:
- Change the <qemu:arg>arg</qemu:arg> XML to <qemu:arg value='arg'/> XML
- Fix up some memory leaks in qemuDomainDefNamespaceParse
- Rename num_extra and extra to num_args and args, respectively
- Fixed up some error messages
- Make sure to escape user-provided data in qemuDomainDefNamespaceFormatXML
Changes since v2:
- Add checking to ensure environment variable names are valid
- Invert the logic in qemuDomainDefNamespaceFormatXML to return early
Changes since v3:
- Change strspn() to c_isalpha() check of first letter of environment variable
Signed-off-by: Chris Lalancette <clalance@redhat.com>
This patch adds namespace XML parsers to be hooked into
the main domain parser. This allows for individual hypervisor
drivers to add per-namespace XML into the main domain XML.
Changes since v1:
- Use a statically declared table for caps->ns, removing the need to
allocate/free it.
Changes since v2:
- None
Changes since v3:
- None
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The first conditional is always true which means the iterator will
never find another device on the same bus.
if (dev->domain != check->domain ||
dev->bus != check->bus ||
----> (check->slot == check->slot &&
check->function == check->function)) <-----
The goal of that check is to verify that the device is either:
in a different pci domain
on a different bus
is the same identical device
This means libvirt may issue a secondary bus reset when there are
devices
on that bus that actively in use by the host or another guest.
* src/util/pci.c: fix a bogus test in pciSharesBusWithActive()
The remote driver is using the wrong privateData field in
a couple of functions. THis is harmless for stateful
drivers like QEMU/UML/LXC, but will crash with Xen
* src/remote/remote_driver.c: Fix use of privateData field
A Linux software bridge will assume the MAC address of the enslaved
interface with the numerically lowest MAC addr. When the bridge
changes MAC address there is a period of network blackout, so a
change should be avoided. The kernel gives TAP devices a completely
random MAC address. Occassionally the random TAP device MAC is lower
than that of the physical interface (eth0, eth1etc) that is enslaved,
causing the bridge to change its MAC.
This change sets an explicit MAC address for all TAP devices created
using the configured MAC from the XML, but with the high byte set
to 0xFE. This should ensure TAP device MACs are higher than any
physical interface MAC.
* src/qemu/qemu_conf.c, src/uml/uml_conf.c: Pass in a MAC addr
for the TAP device with high byte set to 0xFE
* src/util/bridge.c, src/util/bridge.h: Set a MAC when creating
the TAP device to override random MAC
The PCI slot 1 must be reserved at all times, since PIIX3 is
always present, even if no IDE device is in use for guest disks
* src/qemu/qemu_conf.c: Always reserve slot 1 for PIIX3
Init process may remain after sending SIGTERM for some reason.
For example, if original init program is used, it is definitely
not killed by SIGTERM.
* src/lxc/lxc_controller.c: kill with SIGKILL if SIGTERM wasn't
sufficient
One error exit in virStorageBackendCreateBlockFrom was setting the
return value to errno. The convention for volume build functions is to
return 0 on success or -1 on failure. Not only was it not necessary to
set the return value (it defaults to -1, and is set to 0 when
everything has been successfully completed), in the case that some
caller were checking for < 0 rather than != 0, they would incorrectly
believe that it completed successfully.
virDirCreate also previously returned 0 on success and errno on
failure. This makes it fit the recommended convention of returning 0
on success, -errno (ie a negative number) on failure.
Previously virStorageBackendCopyToFD would simply return -1 on
error. This made the error return from one of its callers inconsistent
(createRawFileOpHook is supposed to return -errno, but if
virStorageBackendCopyToFD failed, createRawFileOpHook would just
return -1). Since there is a useful errno in every case of error
return from virStorageBackendCopyToFD, and since the other uses of
that function ignore the return code (beyond simply checking to see if
it is < 0), this is a safe change.
virFileOperation previously returned 0 on success, or the value of
errno on failure. Although there are other functions in libvirt that
use this convention, the preferred (and more common) convention is to
return 0 on success and -errno (or simply -1 in some cases) on
failure. This way the check for failure is always (ret < 0).
* src/util/util.c - change virFileOperation and virFileOperationNoFork to
return -errno on failure.
* src/storage/storage_backend.c, src/qemu/qemu_driver.c
- change the hook functions passed to virFileOperation to return
-errno on failure.
To try and ensure that people upgrading from old QEMU get guests
with the same PCI device ordering, change the way we assign addrs
to match QEMU's default order. This should make Windows less
annoyed.
* src/qemu/qemu_conf.c: Follow QEMU's default PCI ordering
logic when assigning addresses
* tests/*.args: Update for changed PCI addresses
To allow compatibility with older QEMU PCI device slot assignment
it is necessary to explicitly track the balloon device in the
XML. This introduces a new device
<memballoon model='virtio|xen'/>
It can also have a PCI address, auto-assigned if necessary.
The memballoon will be automatically added to all Xen and QEMU
guests by default.
* docs/schemas/domain.rng: Add <memballoon> element
* src/conf/domain_conf.c, src/conf/domain_conf.h: parsing
and formatting for memballoon device. Always add a memory
balloon device to Xen/QEMU if none exists in XML
* src/libvirt_private.syms: Export memballoon model APIs
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Honour the
PCI device address in memory balloon device
* tests/*: Update to test new functionality
The first VGA and IDE devices need to have fixed PCI address
reservations. Currently this is handled inline with the other
non-primary VGA/IDE devices. The fixed virtio balloon device
at slot 3, ensures auto-assignment skips the slots 1/2. The
virtio address will shortly become configurable though. This
means the reservation of fixed slots needs to be done upfront
to ensure that they don't get re-used for other devices.
This is more or less reverting the previous changeset:
commit 83acdeaf17
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Wed Feb 3 16:11:29 2010 +0000
Fix restore of QEMU guests with PCI device reservation
The difference is that this time, instead of unconditionally
reserving the address, we only reserve the address if it was
initially type=none. Addresses of type=pci were handled
earlier in process by qemuDomainPCIAddressSetCreate(). This
ensures restore step doesn't have problems
* src/qemu/qemu_conf.c: Reserve first VGA + IDE address
upfront
The VIR_ERR_NO_SUPPORT refers to an API which is not implemented.
There is a separate VIR_ERR_CONFIG_UNSUPPORTED for XML config
options that are not available with the current hypervisor.
* src/qemu/qemu_conf.c, src/qemu/qemu_driver.c: Remove
many VIR_ERR_NO_SUPPORT replace with VIR_ERR_CONFIG_UNSUPPORTED
If you try to execute two concurrent migrations p2p
from A->B and B->A, the two libvirtd's will deadlock
trying to perform the migrations. The reason for this is
that in p2p migration, the libvirtd's are responsible for
making the RPC Prepare, Migrate, and Finish calls. However,
they are currently holding the driver lock while doing so,
which basically guarantees deadlock in this scenario.
This patch fixes the situation by adding
qemuDomainObjEnterRemoteWithDriver and
qemuDomainObjExitRemoteWithDriver helper methods. The Enter
take an additional object reference, then drops both the
domain object lock and the driver lock. The Exit takes
both the driver and domain object lock, then drops the
reference. Adding calls to these Enter and Exit helpers
around remote calls in the various migration methods
seems to fix the problem for me in testing.
This should make the situation safe. The additional domain
object reference ensures that the domain object won't disappear
while this operation is happening. The BeginJob that is called
inside of qemudDomainMigratePerform ensures that we can't execute a
second migrate (or shutdown, or save, etc) job while the
migration is active. Finally, the additional check on the state
of the vm after we reacquire the locks ensures that we can't
be surprised by an external event (domain crash, etc).
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Originally the storage volume files were opened with O_DSYNC to make
sure they were flushed to disk immediately. It turned out that this
was extremely slow in some cases, so the O_DSYNC was removed in favor
of just calling fsync() after all the data had been written. However,
this call to fsync was inside the block that is executed to zero-fill
the end of the volume file. In cases where the new volume is copied
from an old volume, and they are the same length, this fsync would
never take place.
Now the fsync is *always* done, unless there is an error (in which
case it isn't important, and is most likely inappropriate.
A missing set of braces around an error condition caused us to skip
zero'ing out the remainder of a new volume file if the new volume was
longer than the original (the goto was supposed to be taken only in
the case of error, but was always being taken).
The storage volume lookup code was probing for the backing store
format, instead of using the format extracted from the file
itself. This meant it could report in accurate information. If
a format is included in the file, then use that in preference,
with probing as a fallback.
* src/storage/storage_backend_fs.c: Use extracted backing store
format
When creating qcow2 files with a backing store, it is important
to set an explicit format to prevent QEMU probing. The storage
backend was only doing this if it found a 'kvm-img' binary. This
is wrong because plenty of kvm-img binaries don't support an
explicit format, and plenty of 'qemu-img' binaries do support
a format. The result was that most qcow2 files were not getting
a backing store format.
This patch runs 'qemu-img -h' to check for the two support
argument formats
'-o backing_format=raw'
'-F raw'
and use whichever option it finds
* src/storage/storage_backend.c: Query binary to determine
how to set the backing store format
Record a default driver name/type in capabilities struct. Use this
when parsing disks if value is not set in XML config.
* src/conf/capabilities.h: Record default driver name/type for disks
* src/conf/domain_conf.c: Fallback to default driver name/type
when parsing disks
* src/qemu/qemu_driver.c: Set default driver name/type to raw
Disk format probing is now disabled by default. A new config
option in /etc/qemu/qemu.conf will re-enable it for existing
deployments where this causes trouble
The implementation of security driver callbacks often needs
to access the security driver object. Currently only a handful
of callbacks include the driver object as a parameter. Later
patches require this is many more places.
* src/qemu/qemu_driver.c: Pass in the security driver object
to all callbacks
* src/qemu/qemu_security_dac.c, src/qemu/qemu_security_stacked.c,
src/security/security_apparmor.c, src/security/security_driver.h,
src/security/security_selinux.c: Add a virSecurityDriverPtr
param to all security callbacks
Update the QEMU cgroups code, QEMU DAC security driver, SELinux
and AppArmour security drivers over to use the shared helper API
virDomainDiskDefForeachPath().
* src/qemu/qemu_driver.c, src/qemu/qemu_security_dac.c,
src/security/security_selinux.c, src/security/virt-aa-helper.c:
Convert over to use virDomainDiskDefForeachPath()
There is duplicated code which iterates over disk backing stores
performing some action. Provide a convenient helper for doing
this to eliminate duplication & risk of mistakes with disk format
probing
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/libvirt_private.syms: Add virDomainDiskDefForeachPath()
Require the disk image to be passed into virStorageFileGetMetadata.
If this is set to VIR_STORAGE_FILE_AUTO, then the format will be
resolved using probing. This makes it easier to control when
probing will be used
* src/qemu/qemu_driver.c, src/qemu/qemu_security_dac.c,
src/security/security_selinux.c, src/security/virt-aa-helper.c:
Set VIR_STORAGE_FILE_AUTO when calling virStorageFileGetMetadata.
* src/storage/storage_backend_fs.c: Probe for disk format before
calling virStorageFileGetMetadata.
* src/util/storage_file.h, src/util/storage_file.c: Remove format
from virStorageFileMeta struct & require it to be passed into
method.
The virStorageFileGetMetadataFromFD did two jobs in one. First
it probed for storage type, then it extracted metadata for the
type. It is desirable to be able to separate these jobs, allowing
probing without querying metadata, and querying metadata without
probing.
To prepare for this, split out probing code into a new pair of
methods
virStorageFileProbeFormatFromFD
virStorageFileProbeFormat
* src/util/storage_file.c, src/util/storage_file.h,
src/libvirt_private.syms: Introduce virStorageFileProbeFormat
and virStorageFileProbeFormatFromFD
Instead of including a field in FileTypeInfo struct for the
disk format, rely on the array index matching the format.
Use verify() to assert the correct number of elements in the
array.
* src/util/storage_file.c: remove type field from FileTypeInfo
When QEMU opens a backing store for a QCow2 file, it will
normally auto-probe for the format of the backing store,
rather than assuming it has the same format as the referencing
file. There is a QCow2 extension that allows an explicit format
for the backing store to be embedded in the referencing file.
This closes the auto-probing security hole in QEMU.
This backing store format can be useful for libvirt users
of virStorageFileGetMetadata, so extract this data and report
it.
QEMU does not require disk image backing store files to be in
the same format the file linkee. It will auto-probe the disk
format for the backing store when opening it. If the backing
store was intended to be a raw file this could be a security
hole, because a guest may have written data into its disk that
then makes the backing store look like a qcow2 file. If it can
trick QEMU into thinking the raw file is a qcow2 file, it can
access arbitrary files on the host by adding further backing
store links.
To address this, callers of virStorageFileGetMeta need to be
told of the backing store format. If no format is declared,
they can make a decision whether to allow format probing or
not.
IPtables will seek to preserve the source port unchanged when
doing masquerading, if possible. NFS has a pseudo-security
option where it checks for the source port <= 1023 before
allowing a mount request. If an admin has used this to make the
host OS trusted for mounts, the default iptables behaviour will
potentially allow NAT'd guests access too. This needs to be
stopped.
With this change, the iptables -t nat -L -n -v rules for the
default network will be
Chain POSTROUTING (policy ACCEPT 95 packets, 9163 bytes)
pkts bytes target prot opt in out source destination
14 840 MASQUERADE tcp -- * * 192.168.122.0/24 !192.168.122.0/24 masq ports: 1024-65535
75 5752 MASQUERADE udp -- * * 192.168.122.0/24 !192.168.122.0/24 masq ports: 1024-65535
0 0 MASQUERADE all -- * * 192.168.122.0/24 !192.168.122.0/24
* src/network/bridge_driver.c: Add masquerade rules for TCP
and UDP protocols
* src/util/iptables.c, src/util/iptables.c: Add source port
mappings for TCP & UDP protocols when masquerading.
There are many naming conventions for partitions associated with a
block device. Some of the major ones are:
/dev/foo -> /dev/foo1
/dev/foo1 -> /dev/foo1p1
/dev/mapper/foo -> /dev/mapper/foop1
/dev/disk/by-path/foo -> /dev/disk/by-path/foo-part1
The universe of possible conventions isn't clear. Rather than trying
to understand all possible conventions, this patch divides devices
into two groups, device mapper devices and everything else. Device
mapper devices seem always to follow the convention of device ->
devicep1; everything else is canonicalized.
* src/uml/uml_driver.c (umlMonitorCommand): Correct flaw that would
cause unconditional "incomplete reply ..." failure, since "nbytes"
was always 0 or 1.
* src/qemu/qemu_driver.c (qemuConnectMonitor): Correct erroneous
parenthesization in two expressions. Without this fix, failure
to set or clear SELinux security context in the monitor would go
undiagnosed. Also correct a diagnostic and split some long lines.
When comparing a CPU without <model> element, such as
<cpu>
<topology sockets='1' cores='1' threads='1'/>
</cpu>
libvirt would happily crash without warning.
When autodetecting whether XML describes guest or host CPU, the presence
of <arch> element is checked. If it's present, we treat the XML as host
CPU definition. Which is right, since guest CPU definitions do not
contain <arch> element. However, if at the same time the root <cpu>
element contains `match' attribute, we would silently ignore it and
still treat the XML as host CPU. We should rather refuse such invalid
XML.
When a CPU to be compared with host CPU describes a host CPU instead of
a guest CPU, the result is incorrect. This is because instead of
treating additional features in host CPU description as required, they
were treated as if they were mentioned with all possible policies at the
same time.
In case qemu supports -nodefconfig, libvirt adds uses it when launching
new guests. Since this option may affect CPU models supported by qemu,
we need to use it when probing for available models.
An indentation mistake meant that a check for return status
was not properly performed in all cases. This could result
in a crash on NULL pointer in a following line.
* src/qemu/qemu_monitor_json.c: Fix check for return status
when processing JSON for blockstats
By specifying <vendor> element in CPU requirements a guest can be
restricted to run only on CPUs by a given vendor. Host CPU vendor is
also specified in capabilities XML.
The vendor is checked when migrating a guest but it's not forced, i.e.,
guests configured without <vendor> element can be freely migrated.
All features in the baseline CPU definition were always created with
policy='require' even though an arch driver returned them with different
policy settings.
This allows the user to give an explicit path to configure
./configure --with-vbox=/path/to/virtualbox
instead of having the VirtualBox driver probe a set of possible
paths at runtime. If no explicit path is specified then configure
probes the set of "known" paths.
https://bugzilla.redhat.com/show_bug.cgi?id=609185
We only use libpciaccess for resolving device product/vendor. If
initializing the library fails (say if using qemu:///session), don't
warn so loudly, and carry on as usual.
Any error message raised after the process has forked needs
to be followed by virDispatchError, otherwise we have no chance of
ever seeing it. This was selectively done for hook functions in the past,
but really applies to all post-fork errors.
As pointed out by Eric Blake, using dirent->d_type breaks
compilation on MinGW. This patch addresses this by using
'#if defined' as same as doing for virCgroupForDriver.
Some, but not all, codepaths in the qemuMonitorOpen() method
would trigger the destroy callback. The caller does not expect
this to be invoked if construction fails, only during normal
release of the monitor. This resulted in a possible double-unref
of the virDomainObjPtr, because the caller explicitly unrefs
the virDomainObjPtr if qemuMonitorOpen() fails
* src/qemu/qemu_monitor.c: Don't invoke destroy callback from
qemuMonitorOpen() failure paths
ENOENT happens normally when a subsystem is enabled with any other
subsystems and the directory of the target group has already removed
in a prior loop. In that case, the function should just return without
leaving an error message.
NB this is the same behavior as before introducing virCgroupRemoveRecursively.
Make sure to *not* call qemuDomainPCIAddressReleaseAddr if
QEMUD_CMD_FLAG_DEVICE is *not* set (for older qemu). This
prevents a crash when trying to do device detachment from
a qemu guest.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
In the current libvirt PCI code, there is no checking whether
a PCI device is in use by a guest when doing node device
detach or reattach. This causes problems when a device is
assigned to a guest, and the administrator starts issuing
nodedevice commands. Make it so that we check the list
of active devices when trying to detach/reattach, and only
allow the operation if the device is not assigned to a guest.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Fix regression introduced in commit a4a287242 - basically, the
phyp storage driver should only accept the same URIs that the
main phyp driver is willing to accept. Blindly accepting all
URIs meant that the phyp storage driver was being consulted for
'virsh -c qemu:///session pool-list --all', rather than the
qemu storage driver, then since the URI was not for phyp, attempts
to then use the phyp driver crashed because it was not initialized.
* src/phyp/phyp_driver.c (phypStorageOpen): Only accept connections
already open to a phyp driver.
This code was just recently added (by me) and didn't account for the
fact that stdin_path is sometimes NULL. If it's NULL, and
SetSecurityAllLabel fails, a segfault would result.
Because tty path is unexpectedly not saved in the live configuration
file of a domain, libvirtd cannot get the console of the domain back
after restarting.
The reason why the tty path isn't saved is that, to save the tty path,
the save function, virDomainSaveConfig, requires that the target domain
is running (pid != -1), however, lxc driver calls the function before
starting the domain to pass the configuration to controller.
To ensure to save the tty path, the patch lets lxc driver call the save
function again after starting the domain.
The function is expected to return negative value on failure,
however, it returns positive value when either setInterfaceName
or vethInterfaceUpOrDown fails. Because the function returns
the return value of either as is, however, the two functions
may return positive value on failure.
The patch fixes the defects and add error messages.
When the saved domain image is on an NFS share, at least some part of
domainSetSecurityAllLabel will fail (for example, selinux labels can't
be modified). To allow domain restore to still work in this case, just
ignore the errors.
virStorageFileIsSharedFS would previously only work if the entire path
in question was stat'able by the uid of the libvirtd process. This
patch changes it to crawl backwards up the path retrying the statfs
call until it gets to a partial path that *can* be stat'ed.
This is necessary to use the function to learn the fstype for files
stored as a different user (and readable only by that user) on a
root-squashed remote filesystem.
Also restore the label to its original value after qemu is finished
with the file.
Prior to this patch, qemu domain restore did not function properly if
selinux was set to enforce.
If an active migration operation fails, or is cancelled by the
admin, the QEMU on the destination is shutdown and the one on
the source continues running. It is important in shutting down
the QEMU on the destination, the security drivers don't reset
the file labelling/permissions.
* src/qemu/qemu_driver.c: Don't reset labelling/permissions
on migration abort
Minor speedups by using the full power of sed.
* src/phyp/phyp_driver.c (phypGetVIOSFreeSCSIAdapter)
(phypDiskType, phypListDefinedDomains): Use fewer processes, by
folding other work into sed.
(phypGetVIOSPartitionID): Likewise. Also avoid non-portable use
of 'sed -s'.
Add the storage management driver to the Power Hypervisor driver.
This is a big but simple patch, it's just a new set of functions.
This patch includes:
* Storage driver: The set of pool-* and vol-* functions.
* attach-disk function.
* Support for IVM on the new functions.
Signed-off-by: Eric Blake <eblake@redhat.com>
* src/phyp/phyp_driver.c (phypStorageDriver): New driver.
(phypStorageOpen, phypStorageClose): New functions.
(phypRegister): Register it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Several phyp functions are not namespace clean, and had no reason
to be exported since no one outside the phyp driver needed to use
them. Rather than do lots of forward declarations, I was able
to topologically sort the file. So, this patch looks huge, but
is really just a matter of marking things static and dealing with
the compiler fallout.
* src/phyp/phyp_driver.h (PHYP_DRIVER_H): Add include guard.
(phypCheckSPFreeSapce): Delete unused declaration.
(phypGetSystemType, phypGetVIOSPartitionID, phypCapsInit)
(phypBuildLpar, phypUUIDTable_WriteFile, phypUUIDTable_ReadFile)
(phypUUIDTable_AddLpar, phypUUIDTable_RemLpar, phypUUIDTable_Pull)
(phypUUIDTable_Push, phypUUIDTable_Init, phypUUIDTable_Free)
(escape_specialcharacters, waitsocket, phypGetLparUUID)
(phypGetLparMem, phypGetLparCPU, phypGetLparCPUGeneric)
(phypGetRemoteSlot, phypGetBackingDevice, phypDiskType)
(openSSHSession): Move declarations to phyp_driver.c and make static.
* src/phyp/phyp_driver.c: Rearrange file contents to provide
topological sorting of newly-static funtions (no semantic changes
other than reduced scope).
(phypGetBackingDevice, phypDiskType): Mark unused, for now.
The patches for shared storage migration were not correctly written
for json mode. Thus the 'blk' and 'inc' parameters were never being
set. In addition they didn't set the QEMU_MONITOR_MIGRATE_BACKGROUND
so migration was synchronous. Due to multiple bugs in QEMU's JSON
impl this wasn't noticed because it treated the sync migration requst
as asynchronous anyway. Finally 'background' parameter was converted
to take arbitrary flags but not renamed, and not all uses were changed
to unsigned int.
* src/qemu/qemu_driver.c: Set QEMU_MONITOR_MIGRATE_BACKGROUND in
doNativeMigrate
* src/qemu/qemu_monitor_json.c: Process QEMU_MONITOR_MIGRATE_NON_SHARED_DISK
and QEMU_MONITOR_MIGRATE_NON_SHARED_INC flags
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.h, src/qemu/qemu_monitor_text.c,
src/qemu/qemu_monitor_text.h: change 'int background' to
'unsigned int flags' in migration APIs. Add logging of flags
parameter
During incoming migration the QEMU monitor is not able to be
used. The incoming migration code did not keep hold of the
job lock because migration is split across multiple API calls.
This meant that further monitor commands on the guest would
hang until migration finished with no timeout.
In this change the qemuDomainMigratePrepare method sets the
job flag just before it returns. The qemuDomainMigrateFinish
method checks for this job flag & clears it once done. This
prevents any use of the monitor between prepare+finish steps.
The qemuDomainGetJobInfo method is also updated to refresh
the job elapsed time. This means that virsh domjobinfo can
return time data during incoming migration
* src/qemu/qemu_driver.c: Keep a job active during incoming
migration. Refresh job elapsed time when returning job info
When configuring serial, parallel, console or channel devices
with a file, dev or pipe backend type, it is necessary to label
the file path in the security drivers. For char devices of type
file, it is neccessary to pre-create (touch) the file if it does
not already exist since QEMU won't be allowed todo so itself.
dev/pipe configs already require the admin to pre-create before
starting the guest.
* src/qemu/qemu_security_dac.c: set file ownership for character
devices
* src/security/security_selinux.c: Set file labeling for character
devices
* src/qemu/qemu_driver.c: Add character devices to cgroup ACL
The parallel, serial, console and channel devices are all just
character devices. A lot of code needs todo the same thing to
all these devices. This provides an convenient API for iterating
over all of them.
* src/conf/domain_conf.c, src/conf/domain_conf.c,
src/libvirt_private.syms: Add virDomainChrDefForeach
We previously assumed that if the -device option existed in qemu, that
-nodefconfig would also exist. It turns out that isn't the case, as
demonstrated by qemu-kvm-0.12.3 in Fedora 13.
*/src/qemu/qemu_conf.[hc] - add a new QEMUD_CMD_FLAG, set it via the
help output, and check it before adding
-nodefconfig to the qemu commandline.
Also don't abuse the disk driver name to specify the SCSI controller
model anymore:
<driver name='buslogic'/>
Use the newly added model attribute of the controller element for this:
<controller type='scsi' index='0' model='buslogic'/>
The disk driver name approach is deprecated now, but still works for
backward compatibility reasons.
Update the documentation and tests accordingly.
Fix usage of the words controller and id in the VMX handling code. Use
controller, bus and unit properly.
The domain XML parsing code autogenerates disk address and
controller elements when they are not explicitly specified.
The code assumes a narrow SCSI bus (7 units per bus). ESX
uses a wide SCSI bus (16 units per bus).
This is a step towards controller support for the ESX driver.
Move libnl to libvirt_util.la, because macvtap.c requires it.
Add GnuTLS to libvirt_driver.la, because libvirt.c calls gcrypt functions.
When built without loadable driver modules, then the remote driver pulls
in GnuTLS.
Move libgnu.la from libvirt_parthelper_CFLAGS to libvirt_parthelper_LDADD.
Through conversation with Kumar L Srikanth-B22348, I found
that the function of getting memory usage (e.g., virsh dominfo)
doesn't work for lxc with ns subsystem of cgroup enabled.
This is because of features of ns and memory subsystems.
Ns creates child cgroup on every process fork and as a result
processes in a container are not assigned in a cgroup for
domain (e.g., libvirt/lxc/test1/). For example, libvirt_lxc
and init (or somewhat specified in XML) are assigned into
libvirt/lxc/test1/8839/ and libvirt/lxc/test1/8839/8849/,
respectively. On the other hand, memory subsystem accounts
memory usage within a group of processes by default, i.e.,
it does not take any child (and descendant) groups into
account. With the two features, virsh dominfo which just
checks memory usage of a cgroup for domain always returns
zero because the cgroup has no process.
Setting memory.use_hierarchy of a group allows to account
(and limit) memory usage of every descendant groups of the group.
By setting it of a cgroup for domain, we can get proper memory
usage of lxc with ns subsystem enabled. (To be exact, the
setting is required only when memory and ns subsystems are
enabled at the same time, e.g., mount -t cgroup none /cgroup.)
As same as normal directories, a cgroup cannot be removed if it
contains sub groups. This patch changes virCgroupRemove to remove
all descendant groups (subdirectories) of a target group before
removing the target group.
The handling is required when we run lxc with ns subsystem of cgroup.
Ns subsystem automatically creates child cgroups on every process
forks, but unfortunately the groups are not removed on process exits,
so we have to remove them by ourselves.
With this patch, such child (and descendant) groups are surely removed
at lxc shutdown, i.e., lxcVmCleanup which calls virCgroupRemove.
add iptables rules to allow TFTP from the virtual network if <tftp>
element is defined in the network definition.
Fedora bz#580215
* src/network/bridge_driver.c: open UDP port 69 for TFTP traffic if
tftproot is defined
We already use the '-nodefaults' command line arg with QEMU to stop
it adding any default devices to guests. Unfortunately, QEMU will
load global config files from /etc/qemu that may also add default
devices. These aren't blocked by '-nodefaults', so we need to also
add the '-nodefconfig' arg to prevent that.
Unfortunately these global config files are also used to define
custom CPU models. So in blocking global hardware device addition
we also block definitions of new CPU models. Libvirt doesn't know
about these custom CPU models though, so it would never make use
of them anyway. Thus blocking them via -nodefconfig isn't a show
stopping problem. We would need to expand libvirt's own CPU model
XML database to support these instead.
* src/qemu/qemu_conf.c: Add '-nodefconfig' if available
* tests/qemuxml2argvdata/: Add '-nodefconfig' to all data files which
have '-nodefaults' present
The current code pattern requires that callers of qemuMonitorClose
check for the return value == 0, and if so, set priv->mon = NULL
and release the reference held on the associated virDomainObjPtr
The change d84bb6d6a3 violated that
requirement, meaning that priv->mon never gets set to NULL, and
a reference count is leaked on virDomainObjPtr.
This design was a bad one, so remove the need to check the return
valueof qemuMonitorClose(). Instead allow registration of a
callback that's invoked just when the last reference on qemuMonitorPtr
is released.
Finally there was a potential reference leak in qemuConnectMonitor
in the failure path.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add a destroy
callback invoked from qemuMonitorFree
* src/qemu/qemu_driver.c: Use the destroy callback to release the
reference on virDomainObjPtr when the monitor is freed. Fix other
potential reference count leak in connecting to monitor
Before issuing monitor commands it is neccessary to check whether
the guest is still running. Most places use virDomainIsActive()
correctly, but a few relied on 'priv->mon != NULL'. In theory
these should be equivalent, but the release of the last reference
count on priv->mon can be delayed a small amount of time until
the event handler is finally deregistered. A further ref counting
bug also means that priv->mon might be never released. In such a
case, code could mistakenly issue a monitor command and wait for
a response that will never arrive, effectively leaving the QEMU
driver waiting on virCondWait() forever..
To protect against these possibilities, make sure all code uses
virDomainIsActive(), not 'priv->mon != NULL'
* src/qemu/qemu_driver.c: Replace 'priv->mon != NULL' with
calls to 'priv->mon != NULL'()
If there is no driver for a URI we report
"no hypervisor driver available"
This is bad because not all virt drivers are hypervisors (ie container
based virt).
If there is no driver support for an API we report
"this function is not supported by the hypervisor"
This is bad for the same reason, and additionally because it is
also used for the network, interface & storage drivers.
* src/util/virterror.c: Improve error messages
Following Daniel Berrange's multiple helpful suggestions for improving
this patch and introducing another driver interface, I now wrote the
below patch where the nwfilter driver registers the functions to
instantiate and teardown the nwfilters with a function in
conf/domain_nwfilter.c called virDomainConfNWFilterRegister. Previous
helper functions that were called from qemu_driver.c and qemu_conf.c
were move into conf/domain_nwfilter.h with slight renaming done for
consistency. Those functions now call the function expored by
domain_nwfilter.c, which in turn call the functions of the new driver
interface, if available.
- Fix documentation for virGetStorageVol: it has 'key' argument instead
of 'uuid'.
- Remove TODO comment from virReleaseStorageVol: we use volume key as an
identifier instead of UUID.
- Print human-readable UUID string in debug message in virReleaseSecret.
Per-connection hashes for domains, networks, storage pools and network
filter pools were indexed by names which was not the best choice. UUIDs
are better identifiers, so lets use them.
If VM startup fails early enough (can't find a referenced USB device),
libvirtd will crash trying to clear the VNC port bit, since port = 0,
which overflows us out of the bitmap bounds.
Fix this by being more defensive in the bitmap operations, and only
clearing a previously set VNC port.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Followup to https://bugzilla.redhat.com/show_bug.cgi?id=599091,
commit 20206a4b, to reduce disk waste in padding.
* src/qemu/qemu_monitor.h (QEMU_MONITOR_MIGRATE_TO_FILE_BS): Drop
back to 4k.
(QEMU_MONITOR_MIGRATE_TO_FILE_TRANSFER_SIZE): New macro.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Update comment.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextMigrateToFile): Use
two invocations of dd to output non-aligned large blocks.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMigrateToFile):
Likewise.
This patch adds an optional XML attribute to a nwfilter rule to give the user control over whether the rule is supposed to be using the iptables state match or not. A rule may now look like shown in the XML below with the statematch attribute either having value '0' or 'false' (case-insensitive).
[...]
<rule action='accept' direction='in' statematch='false'>
<tcp srcmacaddr='1:2:3:4:5:6'
srcipaddr='10.1.2.3' srcipmask='32'
dscp='33'
srcportstart='20' srcportend='21'
dstportstart='100' dstportend='1111'/>
</rule>
[...]
I am also extending the nwfilter schema and add this attribute to a test case.
Use virBuffer* API to conditionally keep the portion of the command
line specific to HMC, so that IVM can work.
Signed-off-by: Eric Blake <eblake@redhat.com>
This patch works around a recent extension of the netlink driver I had made use of when building the netlink messages. Unfortunately older kernels don't accept IFLA_IFNAME + name of interface as a replacement for the interface's index, so this patch now gets the interface index ifindex if it's not provided (ifindex <= 0).
Match earlier change for qemu pause support with virDomainCreateXML.
* src/qemu/qemu_driver.c (qemudDomainObjStart): Add parameter; all
callers changed.
(qemudDomainStartWithFlags): Implement flag support.
Define the wire format for the new virDomainCreateWithFlags
API, and implement client and server side of marshaling code.
* daemon/remote.c (remoteDispatchDomainCreateWithFlags): Add
server side dispatch for virDomainCreateWithFlags.
* src/remote/remote_driver.c (remoteDomainCreateWithFlags)
(remote_driver): Client side serialization.
* src/remote/remote_protocol.x
(remote_domain_create_with_flags_args)
(remote_domain_create_with_flags_ret)
(REMOTE_PROC_DOMAIN_CREATE_WITH_FLAGS): Define wire format.
* daemon/remote_dispatch_args.h: Regenerate.
* daemon/remote_dispatch_prototypes.h: Likewise.
* daemon/remote_dispatch_table.h: Likewise.
* src/remote/remote_protocol.c: Likewise.
* src/remote/remote_protocol.h: Likewise.
* src/remote_protocol-structs: Likewise.
Daniel's patch works with gcc and CFLAGS containing -O (the
autoconf default), but fails with non-gcc or with other
CFLAGS (such as -g), since c-ctype.h declares c_isdigit as
a macro only for certain compilation settings.
* src/Makefile.am (libvirt_parthelper_LDFLAGS): Add gnulib
library, for when c_isdigit is not a macro.
* src/storage/parthelper.c (main): Avoid out-of-bounds
dereference, noticed by Jim Meyering.
Disks with a trailing digit in their path (eg /dev/loop0 or
/dev/dm0) have an extra 'p' appended before the partition
number (eg, to form /dev/loop0p1 not /dev/loop01). Fix the
partition lookup to append this extra 'p' when required
* src/storage/parthelper.c: Add a 'p' before partition
number if required
Otherwise, a malicious packet could cause a DoS via spurious
out-of-memory failure.
* src/uml/uml_driver.c (umlMonitorCommand): Validate that incoming
data is reliable before using it to allocate/dereference memory.
Don't report bogus errno on short read.
Reported by Jim Meyering.
* src/util/threads.c (includes) [WIN32]: On mingw, favor native
threading over pthreads-win32 library.
* src/util/thread.h [WIN32] Likewise.
Suggested by Daniel P. Berrange.
When a disk is on a root squashed NFS server, it may not be
possible to stat() the disk file in virCgroupAllowDevice.
The virStorageFileGetMeta method may also fail to extract
the parent backing store. Both of these errors have to be
ignored to avoid breaking NFS deployments
* src/qemu/qemu_driver.c: Ignore errors in cgroup setup to
keep root squash NFS happy
https://bugzilla.redhat.com/show_bug.cgi?id=589465
Some guests (eg with badly configured grub, or Windows' installation cd)
require quick response from the console user. That's why we have a
"launchPaused" option in vdsm.
To implement it via libvirt, we need to ask libvirt not to call
qemuMonitorStartCPUs() after starting qemu. Calling virDomainStop
immediately after the domain is up is inherently raceful.
* src/qemu/qemu_driver.c (qemudStartVMDaemon): Add new parameter;
all callers adjusted.
(qemudDomainCreate): Implement support for new flag.
* This patch is a modification of a patch submitted by Nigel Jones.
It fixes several memory leaks on device addition/removal:
1. Free the virNodeDeviceDefPtr in udevAddOneDevice if the return
value is non-zero
2. Always release the node device reference after the device has been
processed.
* Refactored for better readability per the suggestion of clalance
A look at the QEMU source revealed the missing bits of info about
the VPC file format, so we can enable this now
* src/util/storage_file.c: Enable VPC format, providing version
and disk size offset fields
When an attempt to hotplug a PCI device to a guest fails,
the device was left attached to pci-stub. It is neccessary
to reset the device and then attach it to the host driver
again.
* src/qemu/qemu_driver.c: Reattach PCI device to host if
hotadd fails
The restore code is done in places where errors cannot be
raised, since they will overwrite over pre-existing errors.
* src/security/security_selinux.c: Only warn about failures
in label restore, don't report errors
Any output at all from device_add indicates an error in the
command execution. Thus it needs to check for reply != ""
* src/qemu/qemu_monitor_text.c: Fix reply check for errors
to treat any output as an error
HAL is deprecated and UDEV is the future. Thus if both
options are compiled, we should prefer use of UDEV over
HAL
* src/node_device/node_device_driver.c: Switch init
order to try UDEV first, then HAL
When SELinux is running in MLS mode, libvirtd will have a
different security level to the VMs. For libvirtd to be
able to connect to the monitor console, the client end of
the UNIX domain socket needs a different label. This adds
infrastructure to set the socket label via the security
driver framework
* src/qemu/qemu_driver.c: Call out to socket label APIs in
security driver
* src/qemu/qemu_security_stacked.c: Wire up socket label
drivers
* src/security/security_driver.h: Define security driver
entry points for socket labelling
* src/security/security_selinux.c: Set socket label based on
VM label
The network driver is not doing correct checking for
duplicate UUID/name values. This introduces a new method
virNetworkObjIsDuplicate, based on the previously
written virDomainObjIsDuplicate.
* src/conf/network_conf.c, src/conf/network_conf.c,
src/libvirt_private.syms: Add virNetworkObjIsDuplicate,
* src/network/bridge_driver.c: Call virNetworkObjIsDuplicate
for checking uniqueness of uuid/names
The storage pool driver is mistakenly using the error code
VIR_ERR_INVALID_STORAGE_POOL which is for diagnosing invalid
pointers. This patch switches it to use VIR_ERR_NO_STORAGE_POOL
which is the correct code for cases where the storage pool does
not exist
* src/storage/storage_driver.c: Replace VIR_ERR_INVALID_STORAGE_POOL
with VIR_ERR_NO_STORAGE_POOL
The storage pool driver is not doing correct checking for
duplicate UUID/name values. This introduces a new method
virStoragePoolObjIsDuplicate, based on the previously
written virDomainObjIsDuplicate.
* src/conf/storage_conf.c, src/conf/storage_conf.c,
src/libvirt_private.syms: Add virStoragePoolObjIsDuplicate,
* src/storage/storage_driver.c: Call virStoragePoolObjIsDuplicate
for checking uniqueness of uuid/names
The domain parsing code would auto-add a virtio serial controller
if it saw any virtio serial channel defined. Unfortunately it
always added a controller with index=0, even if the channel address
specified an index != 0. It only added one controller, even if
multiple controllers were referenced by channels. Finally, it let
the ports+vectors parameters initialize to zero instead of -1, which
prevented the controllers accepting any ports.
* src/conf/domain_conf.c: Initialize ports+vectors when adding
virtio serial controllers. Add all neccessary virtio serial
controllers, instead of hardcoding controller 0
* qemuxml2argvdata/qemuxml2argv-channel-virtio.args,
qemuxml2argvdata/qemuxml2argv-channel-virtio.xml: Expand to
test controller auto-add behaviour
To ensure that the device addressing scheme is stable across
hotplug/unplug, all virtio serial channels needs to have an
associated port number in their address. This is then specified
to QEMU using the nr=NNN parameter
* src/conf/domain_conf.c, src/conf/domain_conf.h: Parsing
for port number in vioserial address types.
* src/qemu/qemu_conf.c: Set 'nr=NNN' parameter with virtio
serial port number
* tests/qemuxml2argvdata/qemuxml2argv-channel-virtio.args,
tests/qemuxml2argvdata/qemuxml2argv-channel-virtio.xml: Expand
data set to ensure coverage of port addressing
QEMU upstream decided against adding a 'reason' field to
the block IO event in QMP. Disable this code to remove a
annoying warning message. It will be renabled when the
error string reason is re-introduced in QEMU
Adjust args to qemudStartVMDaemon() to also specify path to stdin_fd,
so this can be passed to the AppArmor driver via SetSecurityAllLabel().
This updates all calls to qemudStartVMDaemon() as well as setting up
the non-AppArmor security driver *SetSecurityAllLabel() declarations
for the above. This is required for the following
"apparmor-fix-save-restore" patch since AppArmor resolves the passed
file descriptor to the pathname given to open().
See https://bugzilla.redhat.com/show_bug.cgi?id=599091
Saving a paused 512MB domain took 3m47s with the old block size of 512
bytes. Changing the block size to 1024*1024 decreased the time to 56
seconds. (Doubling again to 2048*1024 yielded 0 improvement; lowering
to 512k increased the save time to 1m10s, about 20%)
The pointer to the xml describing the domain is saved into an object
prior to calling VIR_REALLOC_N() to make the size of the memory it
points to a multiple of QEMU_MONITOR_MIGRATE_TO_FILE_BS. If that
operation needs to allocate new memory, the pointer that was saved is
no longer valid.
To avoid this situation, adjust the size *before* saving the pointer.
(This showed up when experimenting with very large values of
QEMU_MONITOR_MIGRATE_TO_FILE_BS).
Fixes for issues in commit 211dd1e9 noted by by Jim Meyering.
1. Allocate content buffer of size content_length + 1 to ensure
NUL-termination.
2. Limit content buffer size to 64k
3. Fix whitespace issue
V2:
- Add comment to clarify allocation of content buffer
- Add ATTRIBUTE_NONNULL where appropriate
- User NULLSTR macro
There are cases when a response from xend can exceed 4096 bytes, in
which case anything beyond 4096 is ignored. This patch changes the
current fixed-size, stack-allocated buffer to a dynamically allocated
buffer based on Content-Length in HTTP header.
* It appears that the udev event for HBA creation arrives before the
associated sysfs data is fully populated, resulting in bogus data
for the nodedev entry until the entry is refreshed. This problem is
particularly troublesome when creating NPIV vHBAs because it results
in libvirt failing to find the newly created adapter and waiting for
the full timeout period before erroneously failing the create
operation. This patch forces an update before any attempt to use
any scsi_host nodedev entry.
This patch that adds support for configuring 802.1Qbg and 802.1Qbh
switches. The 802.1Qbh part has been successfully tested with real
hardware. The 802.1Qbg part has only been tested with a (dummy)
server that 'behaves' similarly to how we expect lldpad to 'behave'.
The following changes were made during the development of this patch:
- Merging Scott's v13-pre1 patch
- Fixing endptr related bug while using virStrToLong_ui() pointed out
by Jim Meyering
- Addressing Jim Meyering's comments to v11
- requiring mac address to the vpDisassociateProfileId() function to
pass it further to the 802.1Qbg disassociate part (802.1Qbh untouched)
- determining pid of lldpad daemon by reading it from /var/run/libvirt.pid
(hardcode as is hardcode alson in lldpad sources)
- merging netlink send code for kernel target and user space target
(lldpad) using one function nlComm() to send the messages
- adding a select() after the sending and before the reading of the
netlink response in case lldpad doesn't respond and so we don't hang
- when reading the port status, in case of 802.1Qbg, no status may be
received while things are 'in progress' and only at the end a status
will be there.
- when reading the port status, use the given instanceId and vf to pick
the right IFLA_VF_PORT among those nested under IFLA_VF_PORTS.
- never sending nor parsing IFLA_PORT_SELF type of messages in the
802.1Qbg case
- iterating over the elements in a IFLA_VF_PORTS to pick the right
IFLA_VF_PORT by either IFLA_PORT_PROFILE and given profileId
(802.1Qbh) or IFLA_PORT_INSTANCE_UUID and given instanceId (802.1Qbg)
and reading the current status in IFLA_PORT_RESPONSE.
- recycling a previous patch that adds functionality to interface.c to
- get the vlan identifier on an interface
- get the flags of an interface and some convenience function to
check whether an interface is 'up' or not (not currently used here)
- adding function to determine the root physical interface of an
interface. For example if a macvtap is linked to eth0.100, it will
find eth0. Also adding a function that finds the vlan on the 'way to
the root physical interface'
- conveying the root physical interface name and index in case of 802.1Qbg
- conveying mac address of macvlan device and vlan identifier in
IFLA_VFINFO_LIST[ IFLA_VF_INFO[ IFLA_VF_MAC(mac), IFLA_VF_VLAN(vlan) ] ]
to (future) lldpad via netlink
- To enable build with --without-macvtap rename the
[dis|]associatePortProfileId functions, prepend 'vp' before their
name and make them non-static functions.
- Renaming variable multicast to nltarget_kernel and inverting
the logic
- Addressing Jim Meyering's comments; this also touches existing
code for example for correcting indentation of break statements or
simplification of switch statements.
- Renamed occurrencvirVirtualPortProfileDef to virVirtualPortProfileParamses
- 802.1Qbg part prepared for sending a RTM_SETLINK and getting
processing status back plus a subsequent RTM_GETLINK to
get IFLA_PORT_RESPONSE.
Note: This interface for 802.1Qbg may still change
- [David Allan] move getPhysfn inside IFLA_VF_PORT_MAX to avoid
compiler
warning when latest if_link.h isn't available
- move from Stefan's 802.1Qb{g|h} XML v8 to v9
- move hostuuid and vf index calcs to inside doPortProfileOp8021Qbh
- remove debug fprintfs
- use virGetHostUUID (thanks Stefan!)
- fix compile issue when latest if_link.h isn't available
- change poll timeout to 10s, at 1/8 intervals
- if polling times out, log msg and return -ETIMEDOUT
- Add Stefan's code for getPortProfileStatus
- Poll for up to 2 secs for port-profile status, at 1/8 sec intervals:
- if status indicates error, abort openMacvtapTap
- if status indicates success, exit polling
- if status is "in-progress" after 2 secs of polling, exit
polling loop silently, without error
My patch finishes out the 802.1Qbh parts, which Stefan had mostly complete.
I've tested using the recent kernel updates for VF_PORT netlink msgs and
enic for Cisco's 10G Ethernet NIC. I tested many VMs, each with several
direct interfaces, each configured with a port-profile per the XML. VM-to-VM,
and VM-to-external work as expected. VM-to-VM on same host (using same NIC)
works same as VM-to-VM where VMs are on diff hosts. I'm able to change
settings on the port-profile while the VM is running to change the virtual
port behaviour. For example, adjusting a QoS setting like rate limit. All
VMs with interfaces using that port-profile immediatly see the effect of the
change to the port-profile.
I don't have a SR-IOV device to test so source dev is a non-SR-IOV device,
but most of the code paths include support for specifing the source dev and
VF index. We'll need to complete this by discovering the PF given the VF
linkdev. Once we have the PF, we'll also have the VF index. All this info-
mation is available from sysfs.
Fedora bug https://bugzilla.redhat.com/show_bug.cgi?id=598272
Some files under /sys/bus/usb/devices/ have the format 'usbX', where
X is the USB bus number. Use STRPREFIX to correctly parse the bus numbers.
If a directory pool contains pipes or sockets, a pool start can fail or hang:
https://bugzilla.redhat.com/show_bug.cgi?id=589577
We already try to avoid these special files, but only attempt after
opening the path, which is where the problems lie. Unify volume opening
into helper functions, which use the proper open() flags to avoid error,
followed by fstat to validate storage mode.
Previously, virStorageBackendUpdateVolTargetInfoFD attempted to enforce the
storage mode check, but allowed callers to detect this case and silently
continue. In practice, only the FS backend was using this feature, the rest
were treating unknown mode as an error condition. Unfortunately the InfoFD
function wasn't raising an error message here, so error reporting was
busted.
This patch adds 2 functions: virStorageBackendVolOpen, and
virStorageBackendVolOpenModeSkip. The latter retains the original opt out
semantics, the former now throws an explicit error.
This patch maintains the previous volume mode checks: allowing specific
modes for specific pool types requires a bit of surgery, since VolOpen
is called through several different helper functions.
v2: Use ATTRIBUTE_NONNULL. Drop stat check, just open with
O_NONBLOCK|O_NOCTTY.
v3: Move mode check logic back to VolOpen. Use 2 VolOpen functions with
different error semantics.
v4: Make second VolOpen function more extensible. Didn't opt to change
FS backend defaults, this can just be to fix the original bug.
v5: Prefix default flags with VIR_, use ATTRIBUTE_RETURN_CHECK
Since the macvtap device needs active tear-down and the teardown logic
is based on the interface name, it can happen that if for example 1 out
of 3 interfaces was successfully created, that during the failure path
the macvtap's target device name is used to tear down an interface that
is doesn't own (owned by another VM).
So, in this patch, the target interface name is reset so that there is
no target interface name and the interface name is always cleared after
a tear down.
* If a nodedev has a parent that we don't want to display, we should
continue walking up the udev device tree to see if any of its
earlier ancestors are devices that we display. It makes the tree
much nicer looking than having a whole lot of devices hanging off
the root node.
These files are borrowed from upstream release versions, and should
not need further edits in the context of libvirt (instead, a new
upstream vbox release would entail adding a new header file). We do
not re-generate these files as part of libvirt, nor do we want to lose
our minor edits (such as cppi cleanups).
* src/vbox/vbox_CAPI_v2_2.h: Clarify file origins.
* src/vbox/vbox_CAPI_v3_0.h: Likewise.
* src/vbox/vbox_CAPI_v3_1.h: Likewise.
* src/vbox/vbox_CAPI_v3_2.h: Likewise. Reindent with cppi.
Fedora bug https://bugzilla.redhat.com/show_bug.cgi?id=235961
If using the default virtual network, an easy way to lose guest network
connectivity is to install libvirt inside the VM. The autostarted
default network inside the guest collides with host virtual network
routing. This is a long standing issue that has caused users quite a
bit of pain and confusion.
On network startup, parse /proc/net/route and compare the requested
IP+netmask against host routing destinations: if any matches are found,
refuse to start the network.
v2: Drop sscanf, fix a comment typo, comment that function could use
libnl instead of /proc
v3: Consider route netmask. Compare binary data rather than convert to
string.
v4: Return to using sscanf, drop inet functions in favor of virSocket,
parsing safety checks. Don't make parse failures fatal, in case
expected format changes.
v5: Try and continue if we receive unexpected. Delimit parsed lines to
prevent scanning past newline
'listen' isn't a valid qemu-dm option, as reported a long time ago here:
https://bugzilla.redhat.com/show_bug.cgi?id=492958
Matches the near identical logic in qemu_conf.c
v2: When parsing sexpr, only match on ",server", rather than
full ',server,nowait'.
* Incorporated Jim's feedback (v1 & v2)
* Moved case of DEVTYPE == "wlan" up as it's definitive that we have a network interface.
* Made comment more detailed about the wired case to explain better
how it differentiates between wired network interfaces and USB
devices.
Eliminate almost all backward jumps by replacing this common pattern:
int
some_random_function(void)
{
int result = 0;
...
cleanup:
<unconditional cleanup code>
return result;
failure:
<cleanup code in case of an error>
result = -1;
goto cleanup
}
with this simpler pattern:
int
some_random_function(void)
{
int result = -1;
...
result = 0;
cleanup:
if (result < 0) {
<cleanup code in case of an error>
}
<unconditional cleanup code>
return result;
}
Add a bool success variable in functions that don't have a int result
that can be used for the new pattern.
Also remove some unnecessary memsets in error paths.
The hotplug methods still had the qemuCmdFlags variable declared
as an int, instead of unsigned long long. This caused flag checks
to be incorrect for flags > 31
* src/qemu/qemu_driver.c: Fix integer overflow in hotplug
This allows libvirt to open the PCI device sysfs config file prior
to dropping privileges so qemu can access the full config space.
Without this, a de-privileged qemu can only access the first 64
bytes of config space.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Detect support
for pci-assign.configfd option. Use this option when formatting
PCI device string if possible
* src/qemu/qemu_driver.c: Pre-open PCI sysfs config file and pass
to QEMU
We've been running into a lot of situations where
virGetHostname() is returning "localhost", where a plain
gethostname() would have returned the correct thing. This
is because virGetHostname() is *always* trying to canonicalize
the name returned from gethostname(), even when it doesn't
have to.
This patch changes virGetHostname so that if the value returned
from gethostname() is already FQDN or localhost, it returns
that string directly. If the value returned from gethostname()
is a shortened hostname, then we try to canonicalize it. If
that succeeds, we returned the canonicalized hostname. If
that fails, and/or returns "localhost", then we just return
the original string we got from gethostname() and hope for
the best.
Note that after this patch it is up to clients to check whether
"localhost" is an allowed return value. The only place
where it's currently not is in qemu migration.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The virVirtualPortProfileFormat just went below the
virVirtualPortProfileParamsParseXML function and got inside the
The attached patch moves virVirtualPortProfileFormat below the #ifndef
PROXY block.
Allows listing existing pools and requesting information about them.
Alter the esxVI_ProductVersion enum in a way that allows to check for
product type by masking.
This patch parses the following two XML descriptions, one for
802.1Qbg and one for 802.1Qbh, and stores the data internally.
The actual triggering of the switch setup protocol has not been
implemented here but the relevant code to do that should go into
the functions associatePortProfileId() and disassociatePortProfileId().
<interface type='direct'>
<source dev='eth0.100' mode='vepa'/>
<model type='virtio'/>
<virtualport type='802.1Qbg'>
<parameters managerid='12' typeid='0x123456' typeidversion='1'
instanceid='fa9b7fff-b0a0-4893-8e0e-beef4ff18f8f'/>
</virtualport>
<filterref filter='clean-traffic'/>
</interface>
<interface type='direct'>
<source dev='eth0.100' mode='vepa'/>
<model type='virtio'/>
<virtualport type='802.1Qbh'>
<parameters profileid='my_profile'/>
</virtualport>
</interface>
I'd suggest to use this patch as a base for triggering the setup
protocol with the 802.1Qb{g|h} switch.
Several rounds of changes were made to this patch. The
following is a list of these changes.
- Renamed structure virVirtualPortProfileDef to virVirtualPortProfileParams
as per Daniel Berrange's request
- Addressing Daniel Berrange's comments:
- removing macvtap.h's dependency on domain_conf.h by
moving the virVirtualPortProfileDef structure into macvtap.h
and not passing virtDomainNetDefPtr to any functions in
macvtap.c
- Addressed most of Chris Wright's comments:
- indicating error in case virtualport XML node cannot be parsed
properly
- parsing hex and decimal numbers using virStrToLong_ui() with
parameter '0' for base
- tgifname (target interface name) variable wasn't necessary
to pass to openMacvtapTap function anymore
- assigning the virtual port data structure to the virDomainNetDef
only if it was previously parsed
- make sure that the error code returned by openMacvtapTap() is a negative n
in case the associatePortProfileId() function failed.
- renaming vsi in the XML to virtualport
- replace all occurrences of vsi in the source as well
- removing mode and MAC address parameters from the functions that
will communicate with the hareware diretctly or indirectly
- moving the associate and disassociate functions to the end of the
file for subsequent patches to easier make them generally available
for export
- passing the macvtap interface name rather than the link device since
this otherwise gives funny side effects when using netlink messages
where IFLA_IFNAME and IFLA_ADDRESS are specified and the link dev
all of a sudden gets the MAC address of the macvtap interface.
- Removing rc = -1 error indications in the case of 802.1Qbg|h setup in case
we wanted to use hook scripts for the setup and so the setup doesn't fail
here.
- if instance ID UUID is not supplied it will automatically be generated
- adapted schema to make instance ID UUID optional
- added test case
- parser and XML generator have been separated into their own
functions so they can be re-used elsewhere (passthrough case
for example)
- Adapted XML parser and generator support the above shown type
(802.1Qbg, 802.1Qbh).
- Adapted schema to above XML
- Adapted test XML to above XML
- Passing through the VM's UUID which seems to be necessary for
802.1Qbh -- sorry no host UUID
- adding virtual function ID to association function, in case it's
necessary to use (for SR-IOV)
This patch introduces a dependency on libnl, which subsequent patches
will then use.
Changes from V1 to V2:
- added diffstats
- following changes in tree
Spurious / in a pool target path makes life difficult for apps using the
GetVolByPath, and doing other path based comparisons with pools. This
has caused a few issues for virt-manager users:
https://bugzilla.redhat.com/show_bug.cgi?id=494005https://bugzilla.redhat.com/show_bug.cgi?id=593565
Add a new util API which removes spurious /, virFileSanitizePath. Sanitize
target paths when parsing pool XML, and for paths passed to GetVolByPath.
v2: Leading // must be preserved, properly sanitize path=/, sanitize
away /./ -> /
v3: Properly handle starting ./ and ending /.
v4: Drop all '.' handling, just sanitize / for now.
Allow for a host UUID in the capabilities XML. Local drivers
will initialize this from the SMBIOS data. If a sanity check
shows SMBIOS uuid is invalid, allow an override from the
libvirtd.conf configuration file
* daemon/libvirtd.c, daemon/libvirtd.conf: Support a host_uuid
configuration option
* docs/schemas/capability.rng: Add optional host uuid field
* src/conf/capabilities.c, src/conf/capabilities.h: Include
host UUID in XML
* src/libvirt_private.syms: Export new uuid.h functions
* src/lxc/lxc_conf.c, src/qemu/qemu_driver.c,
src/uml/uml_conf.c: Set host UUID in capabilities
* src/util/uuid.c, src/util/uuid.h: Support for host UUIDs
* src/node_device/node_device_udev.c: Use the host UUID functions
* tests/confdata/libvirtd.conf, tests/confdata/libvirtd.out: Add
new host_uuid config option to test
The cgroups ACL code was only allowing the primary disk image.
It is possible to chain images together, so we need to search
for backing stores and add them to the ACL too. Since the ACL
only handles block devices, we ignore the EINVAL we get from
plain files. In addition it was missing code to teardown the
cgroup when hot-unplugging a disk
* src/qemu/qemu_driver.c: Allow backing stores in cgroup ACLs
and add missing teardown code in unplug path
Basic live migration was broken by the commit that added
non-shared block support in two ways:
1) It added a virCheckFlags() to doNativeMigrate(). Besides
the fact that typical usage of virCheckFlags() is in driver
entry points, and doNativeMigrate() is not an entry point,
it was missing important flags like VIR_MIGRATE_LIVE. Move
the virCheckFlags to the top-level qemuDomainMigratePrepare2
and friends.
2) It also added a memory leak in qemuMonitorTextMigrate()
by not freeing the memory used by virBufferContentAndReset().
This is fixed by storing the pointer in a temporary variable
and freeing it at the end.
With this patch in place, normal live migration works again.
v3: Instead of the churn for virCheckFlagsUI and UL, instead
always promote flags to an unsigned long and always use %lx
for the fprintf.
v2: Add back flags check, which required adding virCheckFlagsUI
and virCheckFlagsUL
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Currently all host audio backends are disabled if a VM is using VNC, in
favor of the QEMU VNC audio extension. Unfortunately no released VNC
client supports this extension, so users have no way of getting audio
to work if using VNC.
Add a new config option in qemu.conf which allows changing libvirt's
behavior, but keep the default intact.
v2: Fix doc typos, change name to vnc_allow_host_audio
The device path doesn't make use of guestAddr, so the memcpy corrupts
the guest info struct.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The virDomainGetBlockInfo API allows query physical block
extent and allocated block extent. These are normally the
same value unless storing a special format like qcow2
inside a block device. In this scenario we can query QEMU
to get the actual allocated extent.
Since last time:
- Return fatal error in text monitor
- Only invoke monitor command for block devices
- Fix error handling JSON code
* src/qemu/qemu_driver.c: Fill in block aloction extent when VM
is running
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
API to query the highest block extent via info blockstats
* src/lxc/lxc_driver.c (lxcSetSchedulerParameters): Ensure that
"->field" is "cpu_shares" before possibly giving a diagnostic about
a type for a "cpu_shares" value.
Also, virCgroupSetCpuShares could fail without evoking a diagnostic.
Add one.
Volume detection in the scsi backend was duplicating code already
present in storage_backend.c. Let's drop the duplicate code.
Also, change the shared function name to be less generic, and remove
some error squashing in the other call site.
We were squashing error messages in a few cases. Recode to follow common
ret = -1 convention.
v2: Handle more error squashing issues further up in MakeNewVol and
CreateVols. Use ret = -1 convention in MakeVols.
The qemu driver contains a subtle race in the logic to find next
available vnc port. Currently it iterates through all available ports
and returns the first for which bind(2) succeeds. However it is possible
that a previously issued port has not yet been bound by qemu, resulting
in the same port used for a subsequent domain.
This patch addresses the race by using a simple bitmap to "reserve" the
ports allocated by libvirt.
V2:
- Put port bitmap in struct qemud_driver
- Initialize bitmap in qemudStartup
V3:
- Check for failure of virBitmapGetBit
- Additional check for port != -1 before calling virbitmapClearBit
V4:
- Check for failure of virBitmap{Set,Clear}Bit
V2:
- Move bitmap impl to src/util/bitmap.[ch]
- Use CHAR_BIT instead of explicit '8'
- Use size_t instead of unsigned int
- Fix calculation of bitmap size in virBitmapAlloc
- Ensure bit is within range of map in the set, clear, and get
operations
- Use bool in virBitmapGetBit
- Add virBitmapFree to free-like funcs in cfg.mk
V3:
- Check for overflow in virBitmapAlloc
- Fix copy and paste bug in virBitmapAlloc
- Use size_t in prototypes
- Add ATTRIBUTE_NONNULL in prototypes where appropriate
and remove NULL check from impl
V4:
- Add ATTRIBUTE_RETURN_CHECK in prototypes where appropriate.
We shouldn't be checking validity in domain_conf, since
it can be used by multiple different hosts and hypervisors.
Remove the check completely.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
We need a common internal function for starting managed domains to be
used during autostart. This patch factors out relevant code from
qemudDomainStart into qemudDomainObjStart and makes it use the
refactored code for domain restore instead of calling qemudDomainRestore
API directly.
We need to be able to assign new def to an existing virDomainObj which
is already locked. This patch factors out the relevant code from
virDomainAssignDef into virDomainObjAssignDef.
We need to be able to restore a domain which we already locked and
started a job for it without undoing these steps. This patch factors
out internals of qemudDomainRestore into separate functions which work
for locked objects.
* src/qemu/qemu_conf.c (qemudParseHelpStr): Fix errors that made
it impossible to diagnose invalid minor and micro version number
components.
Signed-off-by: Chris Wright <chrisw@redhat.com>
The current cleanup: in StartVMDaemon path is a poor duplication.
qemuShutdownVMDaemon can handle teardown for inactive VMs, so let's use it.
v2: Remove old abort: label, only use cleanup:
* src/qemu/qemu_conf.c (QEMU_VERSION_STR_1, QEMU_VERSION_STR_2):
Define these instead of...
(QEMU_VERSION_STR): ... this. Remove definition.
(qemudParseHelpStr): Check first for the new, shorter prefix,
"QEMU emulator version", and then for the old one,
"QEMU PC emulator version" when trying to parse the version number.
Based on a patch by Chris Wright.
* src/lxc/lxc_controller.c (ignorable_epoll_accept_errno): New function.
(lxcControllerMain): Handle a failed accept carefully:
most errno values indicate legitimate failure and must be fatal.
However, ignore a special case: that in which an incoming client quits
between the poll() indicating its presence, and our accept() which
is trying to process it.
There doesn't seem to be anything specific to tap devices for this
array of file descriptors which need to stay open of the guest to use.
Rename then for others to make use of.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Chris Lalancette <clalance@redhat.com>
These files may be useful for anyone making modifications to
source files in a tarball distribution.
* src/Makefile.am (EXTRA_DIST): Add THREADS.txt.
* daemon/Makefile.am (EXTRA_DIST): Add THREADING.txt.
This test was failing on systems using pdwtags from dwarves-1.3.
Reported by Matthias Bolte.
Two-pronged fix:
- use --verbose to work also with dwarves-1.3; adapt regular
expressions to handle now-varying separators
- require a minimum number of post-split clauses, in order to
skip upon any future format change.
Currently there are 318; if there are 300 or fewer,
give a warning similar to when pdwtags is missing.
* src/Makefile.am (remote_protocol-structs): Use pdwtags' --verbose
option to make 1.3 emit member sizes and offsets.
Consistently output WARNING messages to stderr.
Do not require each caller of virStorageFileGetMetadata and
virStorageFileGetMetadataFromFD to first clear the storage of the
"meta" buffer. Instead, initialize that storage in
virStorageFileGetMetadataFromFD.
* src/util/storage_file.c (virStorageFileGetMetadataFromFD): Clear
"meta" here, not before each of the following callers.
* src/qemu/qemu_driver.c (qemuSetupDiskCgroup): Don't clear "meta" here.
(qemuTeardownDiskCgroup): Likewise.
* src/qemu/qemu_security_dac.c (qemuSecurityDACSetSecurityImageLabel):
Likewise.
* src/security/security_selinux.c (SELinuxSetSecurityImageLabel):
Likewise.
* src/security/virt-aa-helper.c (get_files): Likewise.
Approximately 60 messages were marked. Since these diagnostics are
intended solely for developers and maintainers, encouraging translation
is deemed to be counterproductive:
http://thread.gmane.org/gmane.comp.emulators.libvirt/25050/focus=25052
Run this command:
git grep -l VIR_WARN|xargs perl -pi -e \
's/(VIR_WARN0?)\s*\(_\((".*?")\)/$1($2/'
There were three very similar uses of qemuMonitorAddDrive.
This change makes the three 17-line sequences identical.
* src/qemu/qemu_driver.c (qemudDomainAttachPciDiskDevice): Detect
failure. Add VIR_WARN and braces.
(qemudDomainAttachSCSIDisk): Add VIR_WARN and braces.
(qemudDomainAttachUsbMassstorageDevice): Likewise.
History has shown that there are frequent bugs in the QEMU driver
code leading to the monitor being invoked with a NULL pointer.
Although the QEMU driver code should always report an error in
this case before invoking the monitor, as a safety net put in a
generic check in the monitor code entry points.
* src/qemu/qemu_monitor.c: Safety net to check for NULL monitor
object
Any method which intends to invoke a monitor command must have
a check for virDomainObjIsActive() before using the monitor to
ensure that priv->mon != NULL.
There is one subtle edge case in this though. If a method invokes
multiple monitor commands, and calls qemuDomainObjExitMonitor()
in between two of these commands then there is no guarentee that
priv->mon != NULL anymore. This is because the QEMU process may
exit or die at any time, and because qemuDomainObjEnterMonitor()
releases the lock on virDomainObj, it is possible for the background
thread to close the monitor handle and thus qemuDomainObjExitMonitor
will release the last reference allowing priv->mon to become NULL.
This affects several methods, most notably migration but also some
hotplug methods. This patch takes a variety of approaches to solve
the problem, depending on the particular usage scenario. Generally
though it suffices to add an extra virDomainObjIsActive() check
if qemuDomainObjExitMonitor() was called during the method.
* src/qemu/qemu_driver.c: Fix multiple potential NULL pointer flaws
in usage of the monitor
* src/qemu/qemu_driver.c (qemudDomainSetVcpus): Upon look-up failure,
i.e., vm==NULL, goto cleanup, rather than to "endjob", superficially
since the latter would dereference vm, but more fundamentally because
we certainly don't want to call qemuDomainObjEndJob before we've
even attempted qemuDomainObjBeginJob.
(gdb) p/x QEMUD_CMD_FLAG_VNET_HOST
$7 = 0xffffffff80000000
Oops - that meant we were incorrectly setting QEMU_CMD_FLAG_RTC_TD_HACK
for qemu-kvm-0.12.3 (and probably botching a few other settings as well).
Fixes Red Hat BZ#592070
* src/qemu/qemu_conf.h (QEMUD_CMD_FLAG_VNET_HOST): Avoid sign
extension.
* tests/qemuhelpdata/qemu-kvm-0.12.3: New file.
* tests/qemuhelptest.c (mymain): Add another case.
A fedora translator filed:
https://bugzilla.redhat.com/show_bug.cgi?id=580816
Pointing out these two error messages as unclear: "write save" sounds
like a typo without context, and lack of a colon made the second message
difficult to parse.
virFileResolveLink was returning a positive value on error,
thus confusing callers that assumed failure was < 0. The
confusion is further evidenced by callers that would have
ended up calling virReportSystemError with a negative value
instead of a valid errno.
Fixes Red Hat BZ #591363.
* src/util/util.c (virFileResolveLink): Live up to documentation.
* src/qemu/qemu_security_dac.c
(qemuSecurityDACRestoreSecurityFileLabel): Adjust callers.
* src/security/security_selinux.c
(SELinuxRestoreSecurityFileLabel): Likewise.
* src/storage/storage_backend_disk.c
(virStorageBackendDiskDeleteVol): Likewise.
Fix the cygwin regression introduced in commit 48445ccff, but
without repeating the fresh build regression of commit
2d550542e.
* src/Makefile.am (libvirt_test_la_LIBADD): Split out subset of
locally-built libraries...
(libvirt_test_la_BUILT_LIBADD): ...into new variable.
(libvirt_test_la_DEPENDENCIES): Depend only on the subset that
automake would have given us for free if we didn't have to add our
own extra file.
Matthias noted that the line:
virt_aa_helper_LDFLAGS = $(WARN_CFLAGS)
looks inconsistent, so I did an audit.
Currently, the set of compiler warning flags passed to gcc as $CC are
equally permitted as the set of linker flags passed to gcc as $LD, so
there was no problem with that usage. But if we ever get in a
situation where $CC and $LD treat particular flags differently, using
the right variable form will make it easier.
In the process, I spotted a couple of typos that were omitting useful
flags, as well as specifying a -l under the wrong variable.
* acinclude.m4 (LIBVIRT_COMPILE_WARNINGS): Define WARN_LDFLAGS as
an alias for WARN_CFLAGS.
* tools/Makefile.am (virsh_LDFLAGS): Use more canonical spelling.
* proxy/Makefile.am (libvirt_proxy_LDFLAGS): Likewise. Move
library...
(libvirt_proxy_LDADD): ...here.
* src/Makefile.am (virt_aa_helper_LDFLAGS): Use more canonical
spelling of WARN_LDFLAGS.
(libvirt_parthelper_LDFLAGS, libvirt_lxc_LDFLAGS): Likewise. Use
correct spelling of COVERAGE_LDFLAGS.
Reported by Matthias Bolte.
* configure.ac: Check for <linux/magic.h>.
* src/util/storage_file.c: Include <linux/magic.h> only if present.
Linux kernels prior to 2.6.19 lacked it.
[__linux__] (NFS_SUPER_MAGIC): Define if not already defined.
qemuReadLogOutput early VM death detection is racy and won't always work.
Startup then errors when connecting to the VM monitor. This won't report
the emulator cmdline output which is typically the most useful diagnostic.
Check if the VM has died at the very end of the monitor connection step,
and if so, report the cmdline output.
See also: https://bugzilla.redhat.com/show_bug.cgi?id=581381
* src/qemu/qemu_driver.c (qemudDomainSetVcpus): Avoid NULL-deref
upon unknown UUID. Call qemuDomainObjBeginJob(vm) only after
ensuring that vm != NULL, not before. This potential NULL-deref
was introduced by commit 2c555d87b0.
This reverts commit 2d550542ee.
The patch worked for incremental builds, but broke fresh
builds, because it interfered with automake's automatic
dependency generation. Until I figure out how to make
automake do what we want, I'd rather leave cygwin broken
but fresh Linux builds working.
make[3]: *** No rule to make target `-lxml2', needed by `libvirt.la'. Stop.
Due to treating the wrong string as a dependency.
* src/Makefile.am (libvirt_la_DEPENDENCIES): Depend only on
locally-built file, not on strings that might resolve as '-lxml2'.
The code specifies driver->cacheDir as the format string,
but it usually doesn't contain '%s', so the subsequent
argument, "/qemu.mem.XXXXXX", is always ignored.
The patch fixes the misuse.
Setting dynamic_ownership=0 in /etc/libvirt/qemu.conf prevents
libvirt's DAC security driver from setting uid/gid on disk
files when starting/stopping QEMU, allowing the admin to manage
this manually. As a side effect it also stopped setting of
uid/gid when saving guests to a file, which completely breaks
save when QEMU is running non-root. Thus saved state labelling
code must ignore the dynamic_ownership parameter
* src/qemu/qemu_security_dac.c: Ignore dynamic_ownership=0 when
doing save/restore image labelling
When QEMU runs with its disk on NFS, and as a non-root user, the
disk is chownd to that non-root user. When migration completes
the last step is shutting down the QEMU on the source host. THis
normally resets user/group/security label. This is bad when the
VM was just migrated because the file is still in use on the dest
host. It is thus neccessary to skip the reset step for any files
found to be on a shared filesystem
* src/libvirt_private.syms: Export virStorageFileIsSharedFS
* src/util/storage_file.c, src/util/storage_file.h: Add a new
method virStorageFileIsSharedFS() to determine if a file is
on a shared filesystem (NFS, GFS, OCFS2, etc)
* src/qemu/qemu_driver.c: Tell security driver not to reset
disk labels on migration completion
* src/qemu/qemu_security_dac.c, src/qemu/qemu_security_stacked.c,
src/security/security_selinux.c, src/security/security_driver.h,
src/security/security_apparmor.c: Add ability to skip disk
restore step for files on shared filesystems.
The cgroups ACL code was only allowing the primary disk image.
It is possible to chain images together, so we need to search
for backing stores and add them to the ACL too. Since the ACL
only handles block devices, we ignore the EINVAL we get from
plain files. In addition it was missing code to teardown the
cgroup when hot-unplugging a disk
* src/qemu/qemu_driver.c: Allow backing stores in cgroup ACLs
and add missing teardown code in unplug path
If the IO error event does not include a reason, then there
is a possible crash dispatching the event
* src/conf/domain_event.c: Missing check for a NULL reason before
strduping allows for a crash
QEMU is gaining a new monitor command netdev_add for hotplugging
NICs using the netdev backend code. We already support this on
the command this, though it is disabled. This adds support for
hotplug too, also to remain disabled until 0.13 QEMU is released
* src/qemu/qemu_driver.c: Support netdev hotplug for NICs
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
support for netdev_add and netdev_remove commands
When closing a monitor using qemuMonitorClose(), we are aware of
the possibility the monitor is still being used somewhere:
/* NB: ordinarily one might immediately set mon->watch to -1
* and mon->fd to -1, but there may be a callback active
* that is still relying on these fields being valid. So
* we merely close them, but not clear their values and
* use this explicit 'closed' flag to track this state */
but since we call virEventAddHandle() on that monitor without increasing
its ref counter, the monitor is still freed which makes possible users
of it quite unhappy. The unhappiness can lead to a hang if qemuMonitorIO
tries to lock mutex which no longer exists.
First calling REMOTE_PROC_CLOSE and then removing watches might lead to
a hang as HANGUP event can be triggered before the watches are actually
removed but after virConnectPtr is already freed. As a result of that
remoteDomainEventFired() would try to lock uninitialized mutex, which
would hang for ever.
Allow debugging of GNUTLS interactions by setting
LIBVIRT_GNUTLS_DEBUG=10 LIBVIRT_DEBUG=1 virsh
* src/remote/remote_driver.c: Use LIBVIRT_GNUTLS_DEBUG to
enable gnutls debugging
Some shells warn about missing programs before redirection;
the idiomatic way to silence them is to run the program check
inside a subshell, with the redirections outside the subshell.
But a subshell is only needed in places where it is reasonable
to expect the use of such a noisy shell in the first place.
* src/Makefile.am (remote_protocol-structs): Use subshell, for
FreeBSD 8.0 /bin/sh.
* cfg.mk (sc_preprocessor_indentation): Avoid subshell, since the
only users running cfg.mk can be assumed to have decent tools.
For printf("%*s",foo,bar), clang complains if foo is not int:
warning: field width should have type 'int', but argument has
type 'unsigned int' [-Wformat]
* src/conf/storage_encryption_conf.c
(virStorageEncryptionSecretFormat, virStorageEncryptionFormat):
Use correct type.
* src/conf/storage_encryption_conf.h (virStorageEncryptionFormat):
Likewise.
Now, if you update remote_protocol.x without also updating
remote_protocol-structs to match, then "make check" will fail.
* src/Makefile.am (remote_protocol-structs): Extract list of
structs and member names from remote_protocol.o.
(check-local): Depend on it.
* src/remote_protocol-structs: New file.
This reverts commit b5b8a6db69.
That commit was not necessary. The problem is fixed by commit
0e9b3a269b, but I didn't rebuild
it properly after pulling in the commit and didn't notice it.
Per automake, LDFLAGS is used early in the line, and LIBADD
(libraries) or LDADD (programs) is used late. On platforms like
cygwin, without lazy linking, this order matters. Therefore, libtool
commands, -L, and similar should be in LDFLAGS, but -l should be in
L*ADD.
* src/Makefile.am (*_LDFLAGS): Move libraries...
(*_LIBADD): ...to their LIBADD counterpart.
Change 965466c1 added a new field to struct remote_error, which broke
the RPC protocol. Fortunately the new field is unused, so this change
simply removes it again.
* src/remote/remote_protocol.(c|h|x): Remove remote_nwfilter from struct
remote_error
With the introduction of the generic qemu device model, unplugging
SCSI disks works like a charm, so support it in libvirt.
* src/qemu/qemu_driver.c: Add qemudDomainDetachSCSIDiskDevice() to do the
unplugging, extend qemudDomainDetachDeviceAdd().
Signed-off-by: Wolfgang Mauerer <wolfgang.mauerer@siemens.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Gnulib can guarantee that pthread.h exists, but for now, it is a dummy
header with no support for most pthread_* functions. Modify our
use of pthread to use function checks, rather than header checks,
to determine how much pthread support is present.
* bootstrap.conf (gnulib_modules): Add pthread.
* configure.ac: Drop all pthread.h checks. Optimize function
checks. Add check for pthread functions.
* src/Makefile.am (libvirt_lxc_LDADD): Ensure proper link.
* src/remote/remote_driver.c (remoteIOEventLoop): Depend on
pthread_sigmask, now that gnulib guarantees pthread.h.
* src/util/util.c (virFork): Likewise.
* src/util/threads.c (threads-pthread.c): Depend on
pthread_mutexattr_init, as a witness of full pthread support.
* src/util/threads.h (threads-pthread.h): Likewise.
Detected by clang. POSIX requires that the second argument to
va_start be the name of the last variable; and in some implementations,
passing *path instead of path would dereference bogus memory instead
of pulling arguments off the stack.
* src/util/util.c (virBuildPathInternal): Use correct argument to
va_start.
Support for live migration between hosts that do not share storage was
added to qemu-kvm release 0.12.1.
It supports two flags:
-b migration without shared storage with full disk copy
-i migration without shared storage with incremental copy (same base image
shared between source and destination).
I tested the live migration without shared storage (both flags) for native
and p2p with and without tunnelling. I also verified that the fix doesn't
affect normal migration with shared storage.
Add an empty body for virCondWaitUntil and move virPipeReadUntilEOF
out of the '#ifndef WIN32' block, because it compiles fine with MinGW
in combination with gnulib.
Necessary on cygwin, where uid_t and gid_t are 4-byte long rather
than int, causing gcc -Wformat warnings.
* src/util/util.c (virFileOperationNoFork, virDirCreateNoFork)
(virFileOperation, virDirCreate, virGetUserEnt): Cast uid_t and
gid_t before passing to printf.
* .gitignore: Ignore Windows executables.
When a filter is updated, only those interfaces must have their old
rules cleared that either reference the filter directly or indirectly
through another filter. Remember between the different steps of the
instantiation of the filters which interfaces must be skipped. I am
using a hash map to remember the names of the interfaces and store a
bogus pointer to ~0 into it that need not be freed.
For the decision on whether to instantiate the rules, the check for a
pending IP address learn request is not sufficient since then only the
thread could instantiate the rules. So, a boolean needs to be passed
when the thread instantiates the filter rules late and the IP address
learn request is still pending in order to override the check for the
pending learn request. If the rules are to be updated while the thread
is active, this will not be done immediately but the thread will do that
later on.
WIN32 is always defined when __MINGW32__ is defined, but the
converse is not true. WIN32 is more generic, if someone were
to ever attempt porting to a microsoft compiler. This does
not affect Cygwin, which intentionally does not define WIN32.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Use more
generic flag macro.
* src/storage/storage_backend.c
(virStorageBackendUpdateVolTargetInfoFD)
(virStorageBackendRunProgRegex): Likewise.
* tools/console.h (vshRunConsole): Likewise.
[Error message]
error: Failed to start domain lxc_test1
error: internal error Failed to create veth device pair: 512
The reason of the failure is that lxc driver unexpectedly re-uses
an auto-assigned veth name and tries to create the created veth
again. The failure will happen when a domain has multiple network
interfaces and the names of those are not specified in XML.
The patch fixes the problem by resetting buffers of veth names
in every iteration of creating veth.
* src/lxc/lxc_driver.c: prevent re-using auto-assigned veth name
Reported by Kumar L Srikanth-B22348.
<hostdev> address parsing previously attempted to detect the number
base: currently it is hardcoded to base 16, which can break PCI
assignment via virt-manager. Revert to the previous behavior.
* src/conf/domain_conf.c: virDomainDevicePCIAddressParseXML, switch to
virStrToLong_ui(bus, NULL, 0, ...) to autodetect base
This introduces a new event type
VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON
This event is the same as the previous VIR_DOMAIN_ID_IO_ERROR
event, but also includes a string describing the cause of
the event.
Thus there is a new callback definition for this event type
typedef void (*virConnectDomainEventIOErrorReasonCallback)(virConnectPtr conn,
virDomainPtr dom,
const char *srcPath,
const char *devAlias,
int action,
const char *reason,
void *opaque);
This is currently wired up to the QEMU block IO error events
* daemon/remote.c: Dispatch IO error events to client
* examples/domain-events/events-c/event-test.c: Watch for
IO error events
* include/libvirt/libvirt.h.in: Define new IO error event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle IO error events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for block IO errors and emit a libvirt IO error event
* src/remote/remote_driver.c: Receive and dispatch IO error
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
IO error events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for BLOCK_IO_ERROR event
from QEMU monitor
Introduce a function to notify the IP address learning
thread to terminate and thus release the lock on the interface.
Notify the thread before grabbing the lock on the interface
and tearing down the rules. This prevents a 'virsh destroy' to
tear down the rules that the IP address learning thread has
applied.
The functions invoked by the IP address learning thread
that apply some basic filtering rules did not clean up
any previous filtering rules that may still be there
(due to a libvirt restart for example). With the
patch below all the rules are cleaned up first.
Also, I am introducing a function to drop all traffic
in case the IP address learning thread could not apply
the rules.
The local DHCP server on virtbr0 sends DHCP ACK messages when a VM is
started and requests an IP address while the initial DHCP lease on the
VM's MAC address hasn't expired. So, also pick the IP address of the VM
if that type of message is seen.
Thanks to Gerhard Stenzel for providing a test case for this.
Changes from V1 to V2:
- cleanup: replacing DHCP option numbers through constants
When using -device syntax, the IO event will have a different
prefix, 'drive-' that needs to be skipped over before matching
against the libvirt disk alias
* src/qemu/qemu_driver.c: Skip QEMU_DRIVE_HOST_PREFIX in IO event
* daemon/remote.c: Server side dispatcher
* daemon/remote_dispatch_args.h, daemon/remote_dispatch_prototypes.h,
daemon/remote_dispatch_ret.h, daemon/remote_dispatch_table.h: Update
with new API
* src/remote/remote_driver.c: Client side dispatcher
* src/remote/remote_protocol.c, src/remote/remote_protocol.h: Update
* src/remote/remote_protocol.x: Define new wire protocol
This defines the internal driver API and stubs out each driver
* src/driver.h: Define virDrvDomainGetBlockInfo signature
* src/libvirt.c, src/libvirt_public.syms: Glue public API to drivers
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c,
src/test/test_driver.c, src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c, src/xenapi/xenapi_driver.c: Stub out driver
qemuDomainPCIAddressSetFree was freeing up the hash
table for the pci addresses, but not freeing up the addr
structure. Looking over the callers of this function, it
seems like they expect it to also free up the structure,
so do that here.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
We were over-writing a pointer without freeing it in
case of a disk device, leading to a memory leak.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When building on Ubuntu with make -j3 (or more), it would always
fail when trying to build virt-aa-helper. I'm not an expert in
automake by any means, but I think the entry for virt-aa-helper
is mis-using LDADD; it shouldn't be putting direct paths to
libvirt_conf.la and libvirt_util.la, but instead referencing those
names. With this patch in place, I'm able to successfully build
on Ubuntu 9.04 with make -j3.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The previous commit changes a goto from 'endjob' to 'cleanup',
leaving the endjob label unused. Remove it to avoid compile
warning.
* src/qemu/qemu_driver.c: Remove 'endjob' label
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): When setting
"vm" to NULL, jump over vm-dereferencing code to "cleanup".
(qemuDomainRevertToSnapshot): Likewise.
use /var/lib/libvirt/dnsmasq since /var/lib/libvirt/network is
unreadable by the dnsmasq binary
* src/network/bridge_driver.c: update DNSMASQ_STATE_DIR
* src/Makefile.am: create it on make install
* libvirt.spec.in: take the new directory into account
In cases where the security driver failed to restore a label after a
guest has saved, we mistakenly jumped to the error cleanup paths.
This is not good, because the operation has in fact completed and
cannot be rolled back completely. Label restore is non-critical, so
just log the problem instead. Also add a missing restore call in
the error cleanup path
* src/qemu/qemu_driver.c: Fix handling of security driver
restore failures in QEMU domain save
When cgroups is enabled, access to block devices is likely to be
restricted to a whitelist. Prior to saving a guest to a block device,
it is necessary to add the block device to the whitelist. This is
not required upon restore, since QEMU reads from stdin
* src/qemu/qemu_driver.c: Add block device to cgroups whitelist
if neccessary during domain save.
The save process was relying on use of the shell >> append
operator to ensure the save data was placed after the libvirt
header + XML. This doesn't work for block devices though.
Replace this code with use of 'dd' and its 'seek' parameter.
This means that we need to pad the header + XML out to a
multiple of dd block size (in this case we choose 512).
The qemuMonitorMigateToCommand() monitor API is used for both
save/coredump, and migration via UNIX socket. We can't simply
switch this to use 'dd' since this causes problems with the
migration usage. Thus, create a dedicated qemuMonitorMigateToFile
which can accept an filename + offset, and remove the filename
from the current qemuMonitorMigateToCommand() API
* src/qemu/qemu_driver.c: Switch to qemuMonitorMigateToFile
for save and core dump
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Create
a new qemuMonitorMigateToFile, separate from the existing
qemuMonitorMigateToCommand to allow handling file offsets
It is possible to use block devices with domain save/restore. Upon
failure QEMU unlinks the path being saved to. This isn't good when
it is a block device !
* src/qemu/qemu_driver.c: Don't unlink block devices if save fails
If a transient QEMU crashes during save attempt, then the virDomainPtr
object may be freed. If a persistent QEMU crashes during save, then
the 'priv->mon' field is no longer valid since it will be inactive.
* src/qemu/qemu_driver.c: Fix two crashes when QEMU exits
during a save attempt
In particular I was forgetting to take the qemuMonitorPrivatePtr
lock (via qemuDomainObjBeginJob), which would cause problems
if two users tried to access the same domain at the same time.
This patch also fixes a problem where I was forgetting to remove
a transient domain from the list of domains.
Thanks to Stephen Shaw for pointing out the problem and testing
out the initial patch.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
This patch adds support for the RARP protocol. This may be needed due to
qemu sending out a RARP packet (at least that's what it seems to want to
do even though the protocol id is wrong) when migration finishes and
we'd need a rule to let the packets pass.
Unfortunately my installation of ebtables does not understand -p RARP
and also seems to otherwise depend on strings in /etc/ethertype
translated to protocol identifiers. Therefore I need to pass -p 0x8035
for RARP. To generally get rid of the dependency of that file I switch
all so far supported protocols to use their protocol identifier in the
-p parameter rather than the string.
I am also extending the schema and added a test case.
changes from v1 to v2:
- added test case into patch
With JSON qemu monitor, we get a STOP event from qemu whenever qemu
stops guests CPUs. The downside of it is that vm->state is changed to
PAUSED and a new generic paused event is send to applications. However,
when we ask qemu to stop the CPUs we are not really interested in qemu
event and we usually want to issue a more specific event.
By setting vm->status to PAUSED before actually sending the request to
qemu (and resetting it back if the request fails) we can ignore the
event since the event handler does nothing when the guest is already
paused. This solution is quite hacky but unfortunately it's the best
solution which I was able to come up with and it doesn't introduce a
race condition.
* virStorageEncryptionFormat is called from both
virDomainDiskDefFormat and virStorageVolTargetDefFormat. The proper
indentation in the generated XML depends on the caller. My earlier
patch to fix the incorrect indentation for the domain XML broke the
indentation for the storage XML. This patch adopts Laine's
suggestion of requring the caller of virStorageEncryptionFormat to
provide an unsigned int with the number of spaces the output should
be indented. The patch modifies both callers to provide the
additional argument.
* Add a regression test for the domain XML
* src/conf/domain_conf.c src/conf/storage_conf.c
src/conf/storage_encryption_conf.c src/conf/storage_encryption_conf.h:
change the indentation code
* tests/qemuxml2xmltest.c
tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.args
tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.xml: add a regression test
Cygwin's XDR implementation defines xdr_u_int64_t instead of
xdr_uint64_t and lacks IXDR_PUT_INT32/IXDR_GET_INT32.
Alter the IXDR_GET_LONG regex in rpcgen_fix.pl so it doesn't destroy
the #define IXDR_GET_INT32 IXDR_GET_LONG in remote_protocol.x.
Also fix the remote_protocol.h regex in rpcgen_fix.pl.
With this patch I want to enable hex number inputs in the filter XML. A
number that was entered as hex is also printed as hex unless a string
representing the meaning can be found.
I am also extending the schema and adding a test case. A problem with
the DSCP value is fixed on the way as well.
Changes from V1 to V2:
- using asHex boolean in all printf type of functions to select the
output format in hex or decimal format
This patch makes libvirtd start the dnsmasq daemon with a
--dhcp-hostsfile option instead of --dhcp-host options for each
'//ip/dhcp/host' entries defined in network xml file.
the dnsmasq host file is stored into /var/lib/libvirt/network
* src/network/bridge_driver.c: define the directory for the hostfiles
and save/delete them to be used by dnsmasq
* po/POTFILES.in: the new module contains translatable strings
* src/Makefile.am: include the files in the utils set
* src/libvirt_private.syms: exports the symbols internally
It implements an idea to save dhcp hosts' macaddr vs. ipaddr mappings to
static file and make dnsmasq loading it with "--dhcp-hostsfile" option,
originally suggested by Dan, and can address the problem that too
many "--dhcp-host" args hitting ARG_MAX limit
* src/util/dnsmasq.h src/util/dnsmasq.c: adds the 2 new files
While doing some testing of the snapshot code I noticed that
if qemuDomainSnapshotLoad failed, it would print a NULL as
part of the error. That's not desirable, so leave the
full_path variable around until after we are done printing
errors.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
We were freeing the virDomainSnapshotDefPtr, but not
the virDomainSnapshotObjPtr in virDomainSnapshotObjFree.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
It's not needed and is currently ignored, but this is a bug.
It will get fixed soon and QMP will return an error for keys
it doesn't know about, this will break libvirt.
* src/qemu/qemu_monitor_json.c: remove qemuMonitorJSONCommandAddTimestamp()
and the place where it's invoked in qemuMonitorJSONMakeCommand()
The nwfilterDriverActive() could de-reference a NULL pointer
if it hadn't be started at the point it was called. It was
also not thread safe, since it lacked locking around data
accesses.
* src/nwfilter/nwfilter_driver.c: Fix locking & NULL checks
in nwfilterDriverActive()
The user probably doesn't care what the gai error numbers are, as
much as what the failed conversion IP address was.
* src/remote/remote_driver.c (addrToString): Mention which address
could not be converted.
* daemon/remote.c (addrToString): Likewise.
* The error messages coming from qemu's DAC support contain strings
from the original SELinux security driver code. This just removes
references to "security context" and other SELinux-isms from the DAC
code.
Signed-off-by: Spencer Shimko <sshimko@tresys.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
- using INT_BUFSIZE_BOUND() to determine the length of the buffersize
for printing and integer into
- not explicitly initializing static var threadsTerminate to false
anymore, since that's done automatically
Changes after V2:
- removed while looks in case of OOM error
- removed on ifaceDown() call
- preceding one ifaceDown() call with an ifaceCheck() call
Since the name of an interface can be the same between stops and starts
of different VMs I have to switch the IP address learning thread to use
the index of the interface to determine whether an interface is still
available or not - in the case of macvtap the thread needs to listen for
traffic on the physical interface, thus having to time out periodically
to check whether the VM's macvtap device is still there as an indication
that the VM is still alive. Previously the following sequence of 2 VMs
with macvtap device
virsh start testvm1; virsh destroy testvm1 ; virsh start testvm2
would not terminate the thread upon testvm1's destroy since the name of
the interface on the host could be the same (i.e, macvtap0) on testvm1
and testvm2, thus it was easily race-able. The thread would then
determine the IP address parameter for testvm2 but apply the rule set
for testvm1. :-(
I am also introducing a lock for the interface (by name) that the thread
must hold while it listens for the traffic and releases when it
terminates upon VM termination or 0.5 second thereafter. Thus, the new
thread for a newly started VM with the same interface name will not
start while the old one still holds the lock. The only other code that I
see that also needs to grab the lock to serialize operation is the one
that tears down the firewall that were established on behalf of an
interface.
I am moving the code applying the 'basic' firewall rules during the IP
address learning phase inside the thread but won't start the thread
unless it is ensured that the firewall driver has the ability to apply
the 'basic' firewall rules.
The hang fix in d376b7d63e was incomplete
since it left quite a few {Enter,Exit}Monitor calls which require driver
to be unlocked. Since the driver is locked throughout the whole
function, {Enter,Exit}MonitorWithDriver need to be used instead to
ensure driver is not locked when issuing monitor commands.
The comment in qemuDomainWaitForMigrationComplete says we are polling
every 50ms but the code sleeps only for 50us. This was already discussed
during review but apparently forgotten when the series was pushed.
The text monitor code was checking for a '\n' prefix on several
places. Previously this would work, but since the monitor code
re-write the '\n' is already stripped off, so mustn't be checked
for.
* src/qemu/qemu_monitor_text.c: Fix monitor error checking
Probably as a result of a merge error, the CPU hotplug command
names were completely wrong.
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_text.c: Fix
the CPU hotplug command names
Adds ability to provide a preferred CPU model for CPUID data decoding.
Such model would be considered as the best possible model (if it's
supported by hypervisor) regardless on number of features which have to
be added or removed for describing required CPU.
So far, when CPUID data were converted into CPU model and features, the
features can only be added to the model. As a result, when a guest asked
for something like "qemu64,-svm" it would get a qemu32 plus a bunch of
additional features instead.
This patch adds support for removing feature from the base model.
Selection algorithm remains the same: the best CPU model is the model
which requires lowest number of features to be added/removed from it.
Qemu committed a patch which list some CPU names in [] when asked for
supported CPUs (qemu -cpu ?). Yet, it needs such CPUs to be passed
without those square braces. When probing for supported CPU models, we
can just strip the square braces and pretend we have never seen them.
First, inital VCPU pinning is set correctly but then it is reset by
assigning qemu process to a new cgroup (which contains all CPUs). It's
easily fixed by swapping these two actions.
An empty root snapshot list was considered as error condition. Creating a
new snapshot would fail if the domain didn't have snapshots yet, because
the snapshot-create function tries to lookup the list of existing snapshots
in order to verify that the snapshot name is unique. This fails if the
domain doesn't have snapshots yet.
Removing the NULL check from esxVI_LookupRootSnapshotTreeList fixes this.
I am moving some of the eb/iptables related functions into the interface
of the firewall driver and am making them only accessible via the driver's
interface. Otherwise exsiting code is adapted where needed. I am adding one
new function to the interface that checks whether the 'basic' rules can be
applied, which will then be used by a subsequent patch.
According to GCC, ATTRIBUTE_UNUSED means that an attribute _might_
be unused, not _must_ be unused. Therefore, it is easier to
blindly mark a variable, than to try and do preprocessor limiting
of when we know it is unused.
* src/remote/remote_driver.c (remoteAuthenticate): Mark attribute
as potentially unused.
Reported by Gustovo Morozowski.
The initial boot of VMs uses -device for NICs where available. The
corresponding monitor command is device_add, but the network hotplug
code was still using device_del by mistake.
* src/qemu/qemu_driver.c: Use device_add for NIC hotplug where
available
If either of the getfd or host_net_add monitor commands return
any text, this indicates an error condition. Don't ignore this!
* src/qemu/qemu_monitor_text.c: Report errors for getfd and
host_net_add
The 'device_del' command expects a parameter called 'id' but we
were passing 'config'.
* src/qemu/qemu_monitor_json.c: Fix device_del command parameter
The idea is that every API implementation in driver which has flags
parameter should first call virCheckFlags() macro to check the function
was called with supported flags:
virCheckFlags(VIR_SUPPORTED_FLAG_1 |
VIR_SUPPORTED_FLAG_2 |
VIR_ANOTHER_SUPPORTED_FLAG, -1);
The error massage which is printed when unsupported flags are passed
looks like:
invalid argument in virFooBar: unsupported flags (0x2)
Where the unsupported flags part only prints those flags which were
passed but are not supported rather than all flags passed.
Based on a warning from coverity. The safe* functions
guarantee complete transactions on success, but don't guarantee
freedom from failure.
* src/util/util.h (saferead, safewrite, safezero): Add
ATTRIBUTE_RETURN_CHECK.
* src/remote/remote_driver.c (remoteIO, remoteIOEventLoop): Ignore
some failures.
(remoteIOReadBuffer): Adjust error messages on read failure.
* daemon/event.c (virEventHandleWakeup): Ignore read failure.
Disk devices in QEMU have two parts, the guest device and the host
backend driver. Historically these two parts have had the same
"unique" name. With the switch to using -device though, they now
have separate names. Thus when changing CDROM media, for guests
using -device syntax, we need to prepend the QEMU_DRIVE_HOST_PREFIX
constant
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Add helper function
qemuDeviceDriveHostAlias() for building a host backend alias
* src/qemu/qemu_driver.c: Use qemuDeviceDriveHostAlias() to determine
the host backend alias for performing eject/change commands in the
monitor
The device_add command was added in JSON mode in a way I didn't
expect. Instead of passing the normal device string to the JSON
command:
{ "execute": "device_add", "arguments": { "device": "ne2k_pci,id=nic.1,netdev=net.1" } }
We need to split up the device string into a full JSON object
{ "execute": "device_add", "arguments": { "driver": "ne2k_pci", "id": "nic.1", "netdev": "net.1" } }
* src/qemu/qemu_conf.h, src/qemu/qemu_conf.c: Rename the
qemuCommandLineParseKeywords method to qemuParseKeywords
and export it to monitor
* src/qemu/qemu_monitor_json.c: Split up device string into
a JSON object for device_add command
The parameter for the qemuMonitorDeviceDel() is a device alias,
not a device config string. Rename the parameter reflect this
and avoid confusion to readers.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Rename devicestr to devalias in qemuMonitorDeviceDel()
The QEMU developers have stated that they will not be porting
the commands 'pci_add', 'pci_del', 'usb_add', 'usb_del' to the
JSON mode monitor, since they're obsoleted by 'device_add'
and 'device_del'. libvirt has (untested) code that would have
supported those commands in theory, but since we already use
device_add/del where available, there's no need to keep the
legacy stuff anymore.
The text mode monitor keeps support for all commands for sake
of historical compatability.
* src/qemu/qemu_monitor_json.c: Remove 'pci_add', 'pci_del',
'usb_add', 'usb_del' commands
The QEMU driver is mistakenly calling directly into the text
mode monitor for the domain memory stats query.
* src/qemu/qemu_driver.c: Replace qemuMonitorTextGetMemoryStats with
qemuMonitorGetMemoryStats
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add the new
wrapper for qemuMonitorGetMemoryStats
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Add
qemuMonitorJSONGetMemoryStats implementation
Instead of reporting VIR_ERR_INTERNAL_ERROR use the more specific
VIR_ERR_CONFIG_UNSUPPORTED
* src/qemu/qemu_conf.c: Report VIR_ERR_CONFIG_UNSUPPORTED for
unsupported video adapters
To avoid race-conditions, the tear down of a filter has to happen before
the tap interface disappears and another tap interface with the same
name can re-appear. This patch tries to fix this. In one place, where
communication with the qemu monitor may fail, I am only tearing the
filters down after knowing that the function did not fail.
I am also moving the tear down functions into an include file for other
drivers to reuse.
* This patch implements a memory allocator to obtain memory for
structures whose last member is a variable length array. C99 refers
to these variable length objects as structs containing flexible
array members.
* Fixed macro parentheses per Eric Blake
* src/xen/xend_internal.c (xend_parse_sexp_desc_char): Add three
uses of sa_assert, each preceding a strchr(value,... to assure
clang that "value" is non-NULL.
* src/qemu/qemu_driver.c (qemudDomainAttachSCSIDisk):
Initialize "cont" to NULL, so clang knows it's set.
Add an sa_assert so it knows it's non-NULL when dereferenced.
* src/nwfilter/nwfilter_ebiptables_driver.c (ebiptablesApplyNewRules):
Don't dereference a NULL or uninitialized pointer when given
an empty list of rules. Add an sa_assert(inst) in each loop to
tell clang that the uses of "inst[i]" are valid.
Among some here, there is a strong aversion to the use of "assert", yet
some others think it is essential (when applied judiciously) even --
perhaps "especially" -- at the heart of libraries and core hypervisor-
related code.
Here is a compromise that lets us make assertions about the code (e.g.,
to tell static analyzers about invariants) without even a hint of risk
of an abort.
* src/internal.h [STATIC_ANALYSIS]: Include <assert.h>.
(sa_assert): Define. A no-op most of the time, but equivalent
to classical assert when STATIC_ANALYSIS is nonzero.
* src/storage/storage_backend_fs.c (virStorageBackendFileSystemMount):
Clang was not smart enough, and mistakenly reported that "options"
could be used uninitialized. Initialize it.
Somehow the backend of this function was never implemented in
libvirt's netcf driver, and nobody noticed until now. (The required
netcf function was already in place, so nothing needs to change
there.)
* src/interface/netcf_driver.c: add in the backend function, and point
to it from the table of driver functions.
* src/openvz/openvz_driver.c (openvzGetProcessInfo): Reorganize
so that unexpected /proc/vz/vestat content cannot make us use
uninitialized variables. Without this change, an input line with
a matching "readvps", but fewer than 4 numbers would result in our
using at least "systime" uninitialized.
I am getting rid of determining the path to necessary CLI tools at
compile time. Instead, now the firewall driver has an initialization
function that uses virFindFileInPath() to determine the path to
necessary CLI tools and a shutdown function to free allocated memory.
The rest of the patch mostly deals with availability of the CLI tools
and to not call certain code blocks if a tool is not available and that
strings now have to be built slightly differently.
Generate almost all SOAP method mapping code.
Update the driver code to use the complete paramater list of some methods
that had parameters skipped before.
Improve the ESX_VI__METHOD marco to do automatic output deserialization
based on output occurrence. Also incorporate automatic _this binding and
output pointer check.
* src/esx/esx_vmx.c (esxVMX_GatherSCSIControllers): Do not dereference
a NULL disk->driverName. We already detect this condition in another
case. Check for it here, too.
When building libvirt on RHEL-5, I saw this error:
cc1: warnings being treated as errors
openvz/openvz_conf.c: In function 'openvzGetVPSUUID':
openvz/openvz_conf.c:835: warning: 'saveptr' may be used uninitialized in this function
make[3]: *** [libvirt_driver_openvz_la-openvz_conf.lo] Error 1
gcc in RHEL-5 gets upset about this usage of strtok_r (even though
it is perfectly valid). Just set *saveptr to NULL at the
start to quiet it down.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Changes from v1 to v2:
- changed function name prefixes to 'iface' from previous 'Iface'
- Further to make make syntax-check pass:
- indentation fix in interface.h
- added entry to POTFILES.in
I am consolidating network interface related functions used in nwfilter
and macvtap code in utils/interface.c. All function names are prefixed
with 'Iface'. The following functions are now available through
interface.h:
int ifaceCtrl(const char *name, bool up);
int ifaceUp(const char *name);
int ifaceDown(const char *name);
int ifaceCheck(bool reportError, const char *ifname,
const unsigned char *macaddr, int ifindex);
int ifaceGetIndex(bool reportError, const char *ifname, int *ifindex);
I added 'int ifindex' as parameter to ifaceCheck to the original
function and modified the code accordingly.
* configure.ac docs/news.html.in libvirt.spec.in src/libvirt_public.syms:
updates for release of 0.8.0
* po/*.po po/libvirt.pot: updated a lar set of localizations, and merge
the messages
I mistakenly took the op field in the DHCP message as the DHCP_OFFER
type. Rather than basing the decision to read the VM's IP address on
that field, process the appended DHCP options where option 53 indicates
the actual type of the packet. I am also reading the broadcast address
of the VM, but don't use it so far.
In a couple of cases typos meant we were firing the wrong type
of event. In the python code my previous commit accidentally
missed some chunks of the code.
* python/libvirt-override-virConnect.py: Add missing python glue
accidentally left out of previous commit
* src/conf/domain_event.c, src/qemu/qemu_monitor_json.c: Fix typos
in event name / method name to invoke
Currently when we attempt to change the cdrom in a qemu VM the monitor
doesn't generate an error if the target filename doesn't exist. I've
submitted a patch[1] for this. This patch is the libvirt qemu-driver
side which catches the error message from the monitor and reportes the
error to libvirt. This means that virsh attach-disk cdrom commands
won't appear to succeed when qemu change command actually failed.
* src/qemu/qemu_monitor_text.c: in qemuMonitorTextChangeMedia() look
for failure to access the new data
Fix invalid code generating in esx_vi_generator.py regarding deep copy
types that contain enum properties.
Add strptime and timegm to bootstrap.conf. Both are used to convert a
xsd:dateTime to calendar time.
Add a testcase of the xsd:dateTime conversion.
The network filter / snapshot / hooks code introduced some
non-portable pices that broke the win32 build
* configure.ac: Check for net/ethernet.h required by nwfile config
parsing code
* src/conf/nwfilter_conf.c: Define ethernet protocol constants
if net/ethernet.h is missing
* src/util/hooks.c: Disable hooks build on Win32 since it lacks
fork/exec/pipe
* src/util/threads-win32.c: Fix unchecked return value
* tools/virsh.c: Disable SIGPIPE on Win32 since it doesn't exist.
Fix non-portable strftime() formats
Changes from V1 to V2 of this patch
- I had reversed the logic thinking that icmp type 0 is a echo
request,but it's reply -- needed to reverse the logic
- Found that ebtables takes the --ip-tos argument only as a hex number
This patch enables the skipping of some of the ICMP traffic rules on the
iptables level under certain circumstances so that the following filter
properly enables unidirectional pings:
<filter name='testcase'>
<uuid>d6b1a2af-def6-2898-9f8d-4a74e3c39558</uuid>
<!-- allow incoming ICMP Echo Request -->
<rule action='accept' direction='in' priority='500'>
<icmp type='8'/>
</rule>
<!-- allow outgoing ICMP Echo Reply -->
<rule action='accept' direction='out' priority='500'>
<icmp type='0'/>
</rule>
<!-- drop all other ICMP traffic -->
<rule action='drop' direction='inout' priority='600'>
<icmp/>
</rule>
</filter>
Extend tests to cover all SCSI controller types and document the
new type.
The lsisas1068 SCSI controller type was added in ESX 4.0. The VMX
parser reports an error when this controller type is present. This
makes virsh dumpxml fail for every domain that uses this controller
type.
This patch fixes this and adds lsisas1068 to the list of accepted
SCSI controller types.
Reported by Jonathan Kelley.
This patch implements support for learning a VM's IP address. It uses
the pcap library to listen on the VM's backend network interface (tap)
or the physical ethernet device (macvtap) and tries to capture packets
with source or destination MAC address of the VM and learn from DHCP
Offers, ARP traffic, or first-sent IPv4 packet what the IP address of
the VM's interface is. This then allows to instantiate the network
traffic filtering rules without the user having to provide the IP
parameter somewhere in the filter description or in the interface
description as a parameter. This only supports to detect the parameter
IP, which is for the assumed single IPv4 address of a VM. There is not
support for interfaces that may have multiple IP addresses (IP
aliasing) or IPv6 that may then require more than one valid IP address
to be detected. A VM can have multiple independent interfaces that each
uses a different IP address and in that case it will be attempted to
detect each one of the address independently.
So, when for example an interface description in the domain XML has
looked like this up to now:
<interface type='bridge'>
<source bridge='mybridge'/>
<model type='virtio'/>
<filterref filter='clean-traffic'>
<parameter name='IP' value='10.2.3.4'/>
</filterref>
</interface>
you may omit the IP parameter:
<interface type='bridge'>
<source bridge='mybridge'/>
<model type='virtio'/>
<filterref filter='clean-traffic'/>
</interface>
Internally I am walking the 'tree' of a VM's referenced network filters
and determine with the given variables which variables are missing. Now,
the above IP parameter may be missing and this causes a libvirt-internal
thread to be started that uses the pcap library's API to listen to the
backend interface (in case of macvtap to the physical interface) in an
attempt to determine the missing IP parameter. If the backend interface
disappears the thread terminates assuming the VM was brought down. In
case of a macvtap device a timeout is being used to wait for packets
from the given VM (filtering by VM's interface MAC address). If the VM's
macvtap device disappeared the thread also terminates. In all other
cases it tries to determine the IP address of the VM and will then apply
the rules late on the given interface, which would have happened
immediately if the IP parameter had been explicitly given. In case an
error happens while the firewall rules are applied, the VM's backend
interface is 'down'ed preventing it to communicate. Reasons for failure
for applying the network firewall rules may that an ebtables/iptables
command failes or OOM errors. Essentially the same failure reasons may
occur as when the firewall rules are applied immediately on VM start,
except that due to the late application of the filtering rules the VM
now is already running and cannot be hindered anymore from starting.
Bringing down the whole VM would probably be considered too drastic.
While a VM's IP address is attempted to be determined only limited
updates to network filters are allowed. In particular it is prevented
that filters are modified in such a way that they would introduce new
variables.
A caveat: The algorithm does not know which one is the appropriate IP
address of a VM. If the VM spoofs an IP address in its first ARP traffic
or IPv4 packets its filtering rules will be instantiated for this IP
address, thus 'locking' it to the found IP address. So, it's still
'safer' to explicitly provide the IP address of a VM's interface in the
filter description if it is known beforehand.
* configure.ac: detect libpcap
* libvirt.spec.in: require libpcap[-devel] if qemu is built
* src/internal.h: add the new ATTRIBUTE_PACKED define
* src/Makefile.am src/libvirt_private.syms: add the new modules and symbols
* src/nwfilter/nwfilter_learnipaddr.[ch]: new module being added
* src/nwfilter/nwfilter_driver.c src/conf/nwfilter_conf.[ch]
src/nwfilter/nwfilter_ebiptables_driver.[ch]
src/nwfilter/nwfilter_gentech_driver.[ch]: plu the new functionality in
* tests/nwfilterxml2xmltest: extend testing
When comparing a CPU to host CPU, the result would be
VIR_CPU_COMPARE_SUPERSET (or even VIR_CPU_COMPARE_INCOMPATIBLE if strict
match was required) even though the two CPUs were identical.
When qemu libvirt driver doesn't support guest CPU selection with given
qemu binary, guests requiring specific CPU should fail to start instead
of being silently supplied with a default CPU.
git grep found 12 of the former but 100 of the latter in src/.
* src/remote/remote_driver.c (initialise_gnutls): Rename...
(initialize_gnutls): ...to this.
(doRemoteOpen): Adjust caller.
* src/xen/xen_driver.c (xenUnifiedOpen): Adjust output string.
* src/util/network.c: Adjust comments.
Suggested by Matthias Bolte.
* src/conf/domain_event.c (virDomainEventGraphicsNewFromDom):
Return NULL when handling out-of-memory error, rather than
falling through with ev=NULL and then assigning to ev->member.
(virDomainEventGraphicsNewFromObj): Likewise.
The attached patch fixes a problem due to the mac match in iptables only
supporting --mac-source and no --mac-destination, thus it not being
symmetric. Therefore a rule like this one
<rule action='drop' direction='out'>
<all match='no' srcmacaddr='$MAC'/>
</rule>
should only have the MAC match on traffic leaving the VM and not test
for the same source MAC address on traffic that the VM receives.
* src/qemu/qemu_driver.c (qemudStartVMDaemon): Initialize "logfile"
to ensure that we don't use it uninitialized -- thus closing an
arbitrary file descriptor -- in the cleanup block.
* src/security/virt-aa-helper.c: adjust virt-aa-helper to handle pci
devices. Update valid_path() to have an override array to check against,
and add "/sys/devices/pci" to it. Then rename file_iterate_cb() to
file_iterate_hostdev_cb() and create file_iterate_pci_cb() based on it
To avoid an error when hitting the <seclabel...> definition
* src/security/virt-aa-helper.c: add VIR_DOMAIN_XML_INACTIVE flag
to virDomainDefParseString
Don't exit with error if the user unloaded the profile outside of
libvirt
* src/security/virt-aa-helper.c: check the exit error from apparmor_parser
before exiting with a failure
The calls to virExec() in security_apparmor.c when
invoking virt-aa-helper use VIR_EXEC_CLEAR_CAPS. When compiled without
libcap-ng, this is not a problem (it's effectively a no-op) but with
libcap-ng this causes MAC_ADMIN to be cleared. MAC_ADMIN is needed by
virt-aa-helper to manipulate apparmor profiles and without it VMs will
not start[1]. This patch calls virExec with the default VIR_EXEC_NONE
instead.
* src/security/security_apparmor.c: fallback to VIR_EXEC_NONE flags for
virExec of virt_aa_helper
Also define ESX_ERROR and ESX_VI_ERROR in a central place, instead of
defining them in each source file.
Add ESX_ERROR and ESX_VI_ERROR to the msg_gen_function list in cfg.mk.
Update po/POTFILES.in accordingly.
With Eric Blake's suggestions applied.
The following rule for direction 'in'
<rule direction='in' action='drop'>
<mac srcmacaddr='1:2:3:4:5:6'/>
</rule>
drops all traffic from the given mac address.
The following rule for direction 'out'
<rule direction='out' action='drop'>
<mac dstmacaddr='1:2:3:4:5:6'/>
</rule>
drops all traffic to the given mac address.
The following rule in direction 'inout'
<rule direction='inout' action='drop'>
<mac srcmacaddr='1:2:3:4:5:6'/>
</rule>
now drops all traffic from and to the given MAC address.
So far it would have dropped traffic from the given MAC address
and outgoing traffic with the given source MAC address, which is not useful
since the packets will always have the VM's MAC address as source
MAC address. The attached patch fixes this.
This is the last bug I currently know of and want to fix.
When starting up qemu VNC autoport guests, we were
only looking through ports 5900 to 6000, meaning we
were limited to 100 total clients. Increase that
limit to 65535 (the last available port), so we can
have up to 59635 VNC autoport guests.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
While playing around with def/newDef with the qemu code,
I noticed that newDef was *always* getting set to a value,
even when I didn't redefine the domain. I think the problem
is the virDomainLoadConfig is always doing virDomainAssignDef
regardless of whether the domain already exists in the hashtable.
In turn, virDomainAssignDef is assigning the definition (which
is actually a duplicate) to newDef. Fix this so that newDef stays
NULL until we actually have a new def.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
values. Rather use the strspn() function. Along with this cleanup the
initialization function for the code that used the regular expression
can also be removed.
The images are saved in /var/lib/libvirt/qemu/save/
and named $domainname.save . The directory is created appropriately
at daemon startup. When a domain is started while a saved image is
available, libvirt will try to load this saved image, and start the
domain as usual in case of failure. In any case the saved image is
discarded once the domain is created.
* src/qemu/qemu_conf.h: adds an extra save path to the driver config
* src/qemu/qemu_driver.c: implement the 3 new operations and handling
of the image directory
* src/remote/remote_protocol.x src/remote/remote_protocol.h
src/remote/remote_protocol.c src/remote/remote_driver.c: add the entry
points in the remote driver
* daemon/remote.c daemon/remote_dispatch_args.h
daemon/remote_dispatch_prototypes.h daemon/remote_dispatch_table.h:
and implement the daemon counterpart
virDomainManagedSave() is to be run on a running domain. Once the call
complete, as in virDomainSave() the domain is stopped upon completion,
but there is no restore counterpart as any order to start the domain
from the API would load the state from the managed file, similary if
the domain is autostarted when libvirtd starts.
Once a domain has restarted his managed save image is destroyed,
basically managed save image can only exist for a stopped domain,
for a running domain that would be by definition outdated data.
* include/libvirt/libvirt.h.in src/libvirt.c src/libvirt_public.syms:
adds the new entry points virDomainManagedSave(),
virDomainHasManagedSaveImage() and virDomainManagedSaveRemove()
* src/driver.h src/esx/esx_driver.c src/lxc/lxc_driver.c
src/opennebula/one_driver.c src/openvz/openvz_driver.c
src/phyp/phyp_driver.c src/qemu/qemu_driver.c src/vbox/vbox_tmpl.c
src/remote/remote_driver.c src/test/test_driver.c src/uml/uml_driver.c
src/xen/xen_driver.c: add corresponding new internal drivers entry
points
- ebtables requires that some of the command line parameters are passed as hex numbers; so have those attributes call a function that prints 16 and 8 bit integers as hex nunbers.
- ip6tables requires '--icmpv6-type' rather than '--icmp-type'
- ebtables complains about protocol identifiers lower than 0x600, so already discard anything lower than 0x600 in the parser
- make the protocol entry types more readable using a #define for its entries
- continue parsing a filtering rule even if a faulty entry is encountered; return an error value at the end and let the caller decide what to do with the rule's object
- fix an error message
The clock timer XML is being updated in the following ways (based on
further off-list discussion that was missed during the initial
implementation):
1) 'wallclock' is changed to 'track', and the possible values are 'boot'
(corresponds to old 'host'), 'guest', and 'wall'.
2) 'mode' has an additional value 'smpsafe'
3) when tickpolicy='catchup', there can be an optional sub-element of
timer called 'catchup':
<catchup threshold=123 slew=120 limit=10000/>
Those three values are all longs, always optional, and if they are present,
they are positive. Internally, 0 indicates "unspecified".
* docs/schemas/domain.rng: updated RNG definition to account for changes
* src/conf/domain_conf.h: change the C struct and enums to match changes.
* src/conf/domain_conf.c: timer parse and format functions changed to
handle the new selections and new element.
* src/libvirt_private.syms: *TimerWallclock* changes to *TimerTrack*
* src/qemu/qemu_conf.c: again, account for Wallclock --> Track change.
(suggested by Daniel Berrange, tested by Dan Kenigsberg)
virStorageFileGetMetadata will fail for disk images that are stored on
a root-squash NFS share that isn't world-readable.
SELinuxSetSecurityImageLabel is called during the startup of every
domain (as long as security_driver != "none"), and it will propogate
the error from virStorageFileGetMetadata, causing the domain startup
to fail. This is, however, a common scenario when qemu is run as a
non-root user and the disk image is stored on NFS.
Ignoring this failure (which doesn't matter in this case, since the
next thing done by SELinuxSetSecurityImageLabel - setting the file
context - will also fail (and that function already ignores failures
due to root-squash NFS) will allow us to continue bringing up the
domain. The result is that we don't need to disable the entire
security driver just because a domain's disk image is stored on
root-squashed NFS.
virFileReadLimFD is a poor fit for reading the header
of the restore file. The problem is that virFileReadLimFD
returns an error when there is more data after the amount
you ask to read, but that is *expected* in this case.
This patch is essentially a revert of
1a4d5c9543, but I don't think
that commit does what it says anyway. It purports to prevent
an unwarranted OOM error, but since virFileReadLimFD will
allocate memory up to the maximum anyway, the upper limit
on the total amount of memory allocated is the same for either
the old version or the new version. Since the old saferead
actually works and virFileReadLimFD does not, revert to
using saferead.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Received report of user crashing libvirtd with
virsh capabilities > capabilities.xml
virsh cpu-compare capabilities.xml
While user has been informed about proper usage of cpu-compare,
segfaulting libvirt should be avoided.
Do not parse CPU definition in virCPUDefParseXML() if XML is not
a 'cpu' node.
When a watchdog/IO error occurs, one of the possible actions that
QEMU might take is to pause the guest. In this scenario libvirt
needs to update its internal state for the VM, and emit a
lifecycle event:
VIR_DOMAIN_EVENT_SUSPENDED
with a detail being one of:
VIR_DOMAIN_EVENT_SUSPENDED_IOERROR
VIR_DOMAIN_EVENT_SUSPENDED_WATCHDOG
To future proof against possible QEMU support for multiple monitor
consoles, this patch also hooks into the 'STOPPED' event in QEMU
and emits a generic VIR_DOMAIN_EVENT_SUSPENDED_PAUSED event
* include/libvirt/libvirt.h.in: Add VIR_DOMAIN_EVENT_SUSPENDED_IOERROR
* src/qemu/qemu_driver.c: Update VM state to paused when IO error
or watchdog events occurrs
* src/qemu/qemu_monitor_json.c: Fix typo in disk IO event name
This also fixes a problem with MinGW's GCC on Windows. GCC complains
about the L modifier being unknown.
Parsing in pciIterDevices is stricter now and doesn't accept trailing
characters after the actual <domain>:<bus>:<slot>.<function> sequence
anymore.
Parsing in pciWaitForDeviceCleanup is also stricter now and expects
the <start>-<end> : <domain>:<bus>:<slot>.<function> sequence to be
terminated by \n.
Change domain from unsigned long long to unsigned int in
pciWaitForDeviceCleanup, because everywhere else domain is handled as
unsigned int too.
Parsing is stricter now and doesn't accept trailing characters
after the actual value or non-number strings anymore. atoi just
returns 0 in case it cannot parse a number from the given string.
Now an error is reported for such a string.
The switch from %lli to %lld in virCgroupGetValueI64 is intended,
as virCgroupGetValueU64 uses base 10 too, and virCgroupSetValueI64
uses %lld to format the number to string.
Parsing is stricter now and doesn't accept trailing characters
after the actual value anymore.
virParseVersionString uses virStrToLong_ui instead of sscanf.
This also fixes a bug in the UML driver, that always returned 0
as version number.
Introduce STRSKIP to check if a string has a certain prefix and
to skip this prefix.
found some cases where the output ended up not looking as expected. So
the following changes are in the patch below:
- if the protocol ID in the MAC header is an integer, just write it into
the datastructure without trying to find a corresponding string for it
and if none is found failing
- when writing the protocol ID as string, simply write it as integer if
no corresponding string can be found
- same changes for arpOpcode parsing and printing
- same changes for protocol ID in an IP packet
- DSCP value needs to be written into the data structure
- IP protocol version number is redundant at this level, so remove it
- parse the protocol ID found inside an IP packet not only as string but
also as uint8
- arrange the display of the src and destination masks to be shown after
the src and destination ip address respectively in the XML
- the existing libvirt IP address parser accepts for example '25' as an
IP address. I want this to be parsed as a CIDR type netmask. So try to
parse it as an integer first (CIDR netmask) and if that doesn't work as
a dotted IP address style netmask.
- instantiation of rules with MAC masks didn't work because they weren't
printed into a buffer, yet.
domain_conf.c:494: undefined reference to 'virNWFilterHashTableFree'
domain_conf.c:5107: undefined reference to 'virNWFilterFormatParamAttributes'
Add missing source to the proxy and disable XML parsing code in
nwfilter_params.c for a proxy build.
virStrToLong* guarantees (via strtol) that the end pointer will be set
to the point at which parsing stopped (even on failure, this point is
the start of the input string).
* src/esx/esx_driver.c (esxGetVersion): Remove pointless
conditional.
* src/qemu/qemu_conf.c (qemuParseCommandLinePCI)
(qemuParseCommandLineUSB, qemuParseCommandLineSmp): Likewise.
* src/qemu/qemu_monitor_text.c
(qemuMonitorTextGetMigrationStatus): Likewise.
Check that interface names only contain valid characters. Blank them out
otherwise.
Valid characters in this code are currently a-z,A-Z,0-9, '-' and '_'.
The Python script generates the mappings based on the type descriptions
in the esx_vi_generator.input file.
This also improves the inheritance handling and allows to get rid of the
ugly, inflexible, and error prone _base/_super approach. Now every struct
that represents a SOAP type contains a _type member, that allows to
recreate C++-like dynamic dispatch for "method" calls in C.
* src/Makefile.am: adds a few missing header files in the associated
file variables, it's needed otherwise the missing headers breaks
compilation from a distribution tarball
This patch changes the network filtering code to use libvirt's existing
IPv4 and IPv6 address parsers/printers rather than my self-written ones.
I am introducing a new function in network.c that counts the number of
bits in a netmask and ensures that the given address is indeed a netmask,
return -1 on error or values of 0-32 for IPv4 addresses and 0-128 for
IPv6 addresses. I then based the function checking for valid netmask
on invoking this function.
This patch adds IPv6 filtering support for the following protocols:
- tcp-ipv6
- udp-ipv6
- udplite-ipv6
- esp-ipv6
- ah-ipv6
- sctp-ipv6
- all-ipv6
- icmpv6
Many of the IPv4 data structure could be re-used for IPv6 support.
Since ip6tables also supports pretty much the same command line parameters
as iptables does, also much of the code could be re-used and now
command lines are invoked with the ip(6)tables tool parameter passed
through the functions as a parameter.
This patch removes the driver dependency from nwfilter_conf.c and moves
a callback function calling into the driver into
nwfilter_gentech_driver.c and passes a pointer to that callback function
upon initialization of nwfilter_conf.c.
Since the timers are defined to cover all possible config cases for
several different hypervisors, many of these possibilities generate an
error on qemu. Here is what is currently supported:
RTC: If the -rtc commandline option is available, allow setting
"clock=host"
or "clock=vm" based on the rtc timer clock='host|guest' value. Also
add "driftfix=slew" if the tickpolicy is 'catchup', or add nothing
if
tickpolicy is 'delay'. (Other tickpolicies will raise an error).
If -rtc isn't available, but -rtc-td-hack is, add that option
if the tickpolicy is 'catchup', add -rtc-td-hack, if it is 'delay'
add nothing, and if it's anything else, raise an error.
PIT: If -no-kvm-pit-reinjection is available, and tickpolicy is
'delay', add that option. if tickpolicy is 'catchup', do
nothing. Anything else --> raise an error.
If -no-kvm-pit-reinjection *isn't* available, but -tdf is, when
tickpolicy is 'catchup' add -tdf. If it's 'delay', do
nothing. Anything else --> raise an error.
If neither of those commandline options is available, and
tickpolicy is anything other than 'delay' (or unspecified), raise
an error.
HPET: If -no-hpet flag is available and present='no', add -no-hpet.
If -no-hpet is not available, and present='yes', raise an error.
If present is unspecified, the default is to do whatever this
particular qemu does by default, so don't raise an error.
All other timer types are unsupported by QEMU, so they will raise an
error.
* src/qemu/qemu_conf.c: extend qemuBuildClockArgStr() to generate the
command line arguments for the new options
* src/qemu/qemu_conf.h: define 4 new flags
* src/qemu/qemu_conf.c: check the help text of qemu for presence of
features indicated by each flag.
* tests/qemuhelptest.c: add appropriate flags into the masks for each test
This extension is described in
http://www.redhat.com/archives/libvir-list/2010-March/msg00304.html
Currently all attributes are optional, except name.
* src/conf/domain_conf.h: add data definition for virDomainTimerDef
and add a list of them to virDomainClockDef
* src/conf/domain_conf.c: XML parser and formatter for a timer inside a clock
* src/libvirt_private.syms: add new Timer enum helper functions to symbols
The QEMU cpu affinity is used in NUMA scenarios to ensure that
guest memory is allocated from a specific node. Normally memory
is allocate on demand in vCPU threads, but when using hugepages
the initial thread leader allocates memory upfront. libvirt was
not setting affinity of the thread leader, or I/O threads. This
patch changes the code to set the process affinity in between
the fork()/exec() of QEMU. This ensures that every single QEMU
thread gets the affinity
* src/qemu/qemu_driver.c: Set affinity on entire QEMU process
at startup
This patch fixes the 'make check' runs for me which, under certain
circumstances and login configurations, did invoke popups requesting
authentication. I removed the parameter conn from being passed into the
error reporting function.
* src/conf/nwfilter_conf.h src/conf/nwfilter_conf.c: remove conn from
error reporting parameters.
Right now this implements only 2 basic hooks:
- before the lxc control process is being launched
- after the lxc control process is terminated
the XML description of the domain is passed to the hook script stdin
/etc/libvirt/hook/lxc
* src/lxc/lxc_driver.c: implement synchronous script hooks for LXC
at domain startup and end
Right now this implements only 2 basic hooks:
- before the qemu process is being launched
- after the qemu process is terminated
the XML description of the domain is passed to the hook script stdin
/etc/libvirt/hook/qemu
* src/qemu/qemu_driver.c: implement synchronous script hooks for QEmu
at domain startup and end
This exports 3 basic routines:
- virHookInitialize() initializing the hook support by looking for
scripts availability
- virHookPresent() used to test if there is a hook for a given driver
- virHookCall() which actually calls a synchronous script hook with
the needed parameters
Note that this doesn't expose any public API except for the locations
and arguments passed to the scripts
* src/Makefile.am: add the 2 new files
* src/util/hooks.h src/util/hooks.c: implements the 3 functions
* src/libvirt_private.syms: export the 3 symbols internally
* po/POTFILES.in: add src/util/hooks.c to translatables modules
used to read the data from virExec stdout/err file descriptors
* src/util/util.c src/util/util.h: not static anymore and export it
* src/libvirt_private.syms: allow access internally
This flag is used in migration prepare step to send updated XML
definition of a guest.
Also ``virsh dumpxml --update-cpu [--inactive] guest'' command can be
used to see the updated CPU requirements.
Useful mainly for migration. cpuUpdate changes guest CPU requirements in
the following way:
- match == "strict" || match == "exact"
- optional features which are supported by host CPU are changed into
required features
- optional features which are not supported by host CPU are disabled
- all other features remain untouched
- match == "minimum"
- match is changed into "exact"
- optional features and all features not mentioned in guest CPU
specification which are supported by host CPU become required
features
- other optional features are disabled
- all other features remain untouched
This ensures that no feature will suddenly disappear from the guest
after migration.
When a domain is defined on host1, migrated to host2 and then migrated
back to host1, its current configuration would overwrite the libvirtd's
in-memory copy of persistent configuration of that domain. This is not
desired as we want to preserve the persistent configuration untouched.
This patch introduces new 'live' parameter to virDomainAssignDef.
Passing 'true' for 'live' means the configuration passed to
virDomainAssignDef describes a configuration of live instance of the
domain. This applies for saved domains which are being restored or for
incoming domains during migration.
All callers have been changed to pass the appropriate value.
* Fixes per feedback from Dan and Daniel
* Added test datafiles
* Re-disabled JSON flags
* Added code to print the error policy attribute when generating XML
* Re-add empty tag
This patch adds support for L3/L4 filtering using iptables. This adds
support for 'tcp', 'udp', 'icmp', 'igmp', 'sctp' etc. filtering.
As mentioned in the introduction, a .c file provided by this patch
is #include'd into a .c file. This will need work, but should be alright
for review.
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
This patch adds IPv6 support for the ebtables layer. Since the parser
etc. are all parameterized, it was fairly easy to add this...
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
Add support for Qemu to have firewall rules applied and removed on VM
startup and shutdown respectively. This patch also provides support for
the updating of a filter that causes all VMs that reference the filter
to have their ebtables/iptables rules updated.
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
This patch implements the core driver and provides
- management functionality for managing the filter XMLs
- compiling the internal filter representation into ebtables rules
- applying ebtables rules on a network (tap,macvtap) interface
- tearing down ebtables rules that were applied on behalf of an
interface
- updating of filters while VMs are running and causing the firewalls to
be rebuilt
- other bits and pieces
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
This patch adds XML processing for the network filter schema
and extends the domain XML processing to parse the top level
referenced filter along with potentially provided parameters
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
Signed-off-by: Gerhard Stenzel <gerhard.stenzel@de.ibm.com>
This patch adds the definition of the wire format for RPC calls
and implementation of the RPC client & server code
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
This patch adds the implementation of the public API for the network
filtering (ACL) extensions to libvirt.c .
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
This patch adds recursive locks necessary due to the processing of
network filter XML that can reference other network filters, including
references that cause looks. Loops in the XML are prevented but their
detection requires recursive locks.
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
To find out where the net type 'direct' needs to be handled I introduced
the 'enum virDomainNetType' in the virDomainNetDef structure and let the
compiler tell me where the case statement is missing. Then I added the
unhandled device statement to the UML driver.
* src/conf/domain_conf.h: change _virDomainNetDef type from int to
virDomainNetType enum
* src/conf/domain_conf.c src/lxc/lxc_driver.c src/qemu/qemu_conf.c
src/uml/uml_conf.c: make sure all enum cases are properly handled
in switches
Use the new virDomainUpdateDeviceFlags API to allow the VNC password
to be changed on the fly
* src/internal.h: Define STREQ_NULLABLE() which is like STREQ()
but does not crash if either argument is NULL, and treats two
NULLs as equal.
* src/libvirt_private.syms: Export virDomainGraphicsTypeToString
* src/qemu/qemu_driver.c: Support VNC password change on a live
machine
* src/qemu/qemu_monitor.c: Disable crazy debugging info. Treat a
NULL password as "" (empty string), allowing passwords to be
disabled in the monitor
Expand the parser for the standalone <device> XML format to
allow inclusion of the <graphics> device type
* src/conf/domain_conf.h: Add virDomainGraphicsDef to
the virDomainDeviceDef struct
* src/conf/domain_conf.c: Wire up parser for virDomainGraphicsDef
to virDomainDeviceDefParse method
To allow the new virDomainUpdateDeviceFlags() API to be universally
used with all drivers, this patch adds an impl to all the current
drivers which support CDROM or Floppy disk media change via the
current virDomainAttachDeviceFlags API
* src/qemu/qemu_driver.c, src/vbox/vbox_tmpl.c,
src/xen/proxy_internal.c, src/xen/xen_driver.c,
src/xen/xend_internal.c: Implement media change via the
virDomainUpdateDeviceFlags API
* src/xen/xen_driver.h, src/xen/xen_hypervisor.c,
src/xen/xen_inotify.c, src/xen/xm_internal.c,
src/xen/xs_internal.c: Stubs for Xen driver entry points
This defines the wire format for the new virDomainUpdateDeviceFlags()
API, and implements the server & client side of the marshalling code.
* daemon/remote.c: Server side dispatch for virDomainUpdateDeviceFlags
* src/remote/remote_driver.c: Client side serialization for
virDomainUpdateDeviceFlags
* src/remote/remote_protocol.x: Define wire format for
virDomainUpdateDeviceFlags
* daemon/remote_dispatch_args.h, daemon/remote_dispatch_prototypes.h,
daemon/remote_dispatch_table.h, src/remote/remote_protocol.c,
src/remote/remote_protocol.h: Re-generate code
The current virDomainAttachDevice API can be (ab)used to change
the media of an existing CDROM/Floppy device. Going forward there
will be more devices that can be configured on the fly and overloading
virDomainAttachDevice for this is not too pleasant. This patch adds
a new virDomainUpdateDeviceFlags() explicitly just for modifying
existing devices.
* include/libvirt/libvirt.h.in: Add virDomainUpdateDeviceFlags
* src/driver.h: Internal API for virDomainUpdateDeviceFlags
* src/libvirt.c, src/libvirt_public.syms: Glue public API to
driver API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c, src/uml/uml_driver.c,
src/vbox/vbox_tmpl.c, src/xen/xen_driver.c, src/xenapi/xenapi_driver.c: Add
stubs for new driver entry point
This introduces a new event type
VIR_DOMAIN_EVENT_ID_GRAPHICS
The same event can be emitted in 3 scenarios
typedef enum {
VIR_DOMAIN_EVENT_GRAPHICS_CONNECT = 0,
VIR_DOMAIN_EVENT_GRAPHICS_INITIALIZE,
VIR_DOMAIN_EVENT_GRAPHICS_DISCONNECT,
} virDomainEventGraphicsPhase;
Connect/disconnect are triggered at socket accept/close.
The initialize phase is immediately after the protocol
setup and authentication has completed. ie when the
client is authorized and about to start interacting with
the graphical desktop
This event comes with *a lot* of potential information
- IP address, port & address family of client
- IP address, port & address family of server
- Authentication scheme (arbitrary string)
- Authenticated subject identity. A subject may have
multiple identities with some authentication schemes.
For example, vencrypt+sasl results in a x509dname
and saslUsername identities.
This results in a very complicated callback :-(
typedef enum {
VIR_DOMAIN_EVENT_GRAPHICS_ADDRESS_IPV4,
VIR_DOMAIN_EVENT_GRAPHICS_ADDRESS_IPV6,
} virDomainEventGraphicsAddressType;
struct _virDomainEventGraphicsAddress {
int family;
const char *node;
const char *service;
};
typedef struct _virDomainEventGraphicsAddress virDomainEventGraphicsAddress;
typedef virDomainEventGraphicsAddress *virDomainEventGraphicsAddressPtr;
struct _virDomainEventGraphicsSubject {
int nidentity;
struct {
const char *type;
const char *name;
} *identities;
};
typedef struct _virDomainEventGraphicsSubject virDomainEventGraphicsSubject;
typedef virDomainEventGraphicsSubject *virDomainEventGraphicsSubjectPtr;
typedef void (*virConnectDomainEventGraphicsCallback)(virConnectPtr conn,
virDomainPtr dom,
int phase,
virDomainEventGraphicsAddressPtr local,
virDomainEventGraphicsAddressPtr remote,
const char *authScheme,
virDomainEventGraphicsSubjectPtr subject,
void *opaque);
The wire protocol is similarly complex
struct remote_domain_event_graphics_address {
int family;
remote_nonnull_string node;
remote_nonnull_string service;
};
const REMOTE_DOMAIN_EVENT_GRAPHICS_IDENTITY_MAX = 20;
struct remote_domain_event_graphics_identity {
remote_nonnull_string type;
remote_nonnull_string name;
};
struct remote_domain_event_graphics_msg {
remote_nonnull_domain dom;
int phase;
remote_domain_event_graphics_address local;
remote_domain_event_graphics_address remote;
remote_nonnull_string authScheme;
remote_domain_event_graphics_identity subject<REMOTE_DOMAIN_EVENT_GRAPHICS_IDENTITY_MAX>;
};
This is currently implemented in QEMU for the VNC graphics
protocol, but designed to be usable with SPICE graphics in
the future too.
* daemon/remote.c: Dispatch graphics events to client
* examples/domain-events/events-c/event-test.c: Watch for
graphics events
* include/libvirt/libvirt.h.in: Define new graphics event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle graphics events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for VNC events and emit a libvirt graphics event
* src/remote/remote_driver.c: Receive and dispatch graphics
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
graphics events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for VNC_CONNECTED,
VNC_INITIALIZED & VNC_DISCONNETED events from QEMU monitor
This introduces a new event type
VIR_DOMAIN_EVENT_ID_IO_ERROR
This event includes the action that is about to be taken
as a result of the watchdog triggering
typedef enum {
VIR_DOMAIN_EVENT_IO_ERROR_NONE = 0,
VIR_DOMAIN_EVENT_IO_ERROR_PAUSE,
VIR_DOMAIN_EVENT_IO_ERROR_REPORT,
} virDomainEventIOErrorAction;
In addition it has the source path of the disk that had the
error and its unique device alias. It does not include the
target device name (/dev/sda), since this would preclude
triggering IO errors from other file backed devices (eg
serial ports connected to a file)
Thus there is a new callback definition for this event type
typedef void (*virConnectDomainEventIOErrorCallback)(virConnectPtr conn,
virDomainPtr dom,
const char *srcPath,
const char *devAlias,
int action,
void *opaque);
This is currently wired up to the QEMU block IO error events
* daemon/remote.c: Dispatch IO error events to client
* examples/domain-events/events-c/event-test.c: Watch for
IO error events
* include/libvirt/libvirt.h.in: Define new IO error event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle IO error events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for block IO errors and emit a libvirt IO error event
* src/remote/remote_driver.c: Receive and dispatch IO error
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
IO error events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for BLOCK_IO_ERROR event
from QEMU monitor
This introduces a new event type
VIR_DOMAIN_EVENT_ID_WATCHDOG
This event includes the action that is about to be taken
as a result of the watchdog triggering
typedef enum {
VIR_DOMAIN_EVENT_WATCHDOG_NONE = 0,
VIR_DOMAIN_EVENT_WATCHDOG_PAUSE,
VIR_DOMAIN_EVENT_WATCHDOG_RESET,
VIR_DOMAIN_EVENT_WATCHDOG_POWEROFF,
VIR_DOMAIN_EVENT_WATCHDOG_SHUTDOWN,
VIR_DOMAIN_EVENT_WATCHDOG_DEBUG,
} virDomainEventWatchdogAction;
Thus there is a new callback definition for this event type
typedef void (*virConnectDomainEventWatchdogCallback)(virConnectPtr conn,
virDomainPtr dom,
int action,
void *opaque);
* daemon/remote.c: Dispatch watchdog events to client
* examples/domain-events/events-c/event-test.c: Watch for
watchdog events
* include/libvirt/libvirt.h.in: Define new watchdg event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle watchdog events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for watchdogs and emit a libvirt watchdog event
* src/remote/remote_driver.c: Receive and dispatch watchdog
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
watchdog events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for WATCHDOG event
from QEMU monitor
This introduces a new event type
VIR_DOMAIN_EVENT_ID_RTC_CHANGE
This event includes the new UTC offset measured in seconds.
Thus there is a new callback definition for this event type
typedef void (*virConnectDomainEventRTCChangeCallback)(virConnectPtr conn,
virDomainPtr dom,
long long utcoffset,
void *opaque);
If the guest XML configuration for the <clock> is set to
offset='variable', then the XML will automatically be
updated with the new UTC offset value. This ensures that
during migration/save/restore the new offset is preserved.
* daemon/remote.c: Dispatch RTC change events to client
* examples/domain-events/events-c/event-test.c: Watch for
RTC change events
* include/libvirt/libvirt.h.in: Define new RTC change event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle RTC change events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for RTC changes and emit a libvirt RTC change event
* src/remote/remote_driver.c: Receive and dispatch RTC change
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
RTC change events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for RTC_CHANGE event
from QEMU monitor
The reboot event is not a normal lifecycle event, since the
virtual machine on the host does not change state. Rather the
guest OS is resetting the virtual CPUs. ie, the QEMU process
does not restart. Thus, this does not belong in the current
lifecycle events callback.
This introduces a new event type
VIR_DOMAIN_EVENT_ID_REBOOT
It takes no parameters, besides the virDomainPtr, so it can
use the generic callback signature.
* daemon/remote.c: Dispatch reboot events to client
* examples/domain-events/events-c/event-test.c: Watch for
reboot events
* include/libvirt/libvirt.h.in: Define new reboot event ID
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle reboot events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for reboots and emit a libvirt reboot event
* src/remote/remote_driver.c: Receive and dispatch reboot
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
reboot events
To avoid confusion, rename the current REMOTE_PROC_DOMAIN_EVENT
message to REMOTE_PROC_DOMAIN_EVENT_LIFECYCLE. This does not
cause ABI problems, since the names are only relevant at the source
code level. On the wire they encoding is a plain integer whose
value does not change
* src/remote/remote_protocol.x: Rename REMOTE_PROC_DOMAIN_EVENT
to REMOTE_PROC_DOMAIN_EVENT_LIFECYCLE.
* daemon/remote.c, src/remote/remote_driver.c: Update code for
renamed event
This wires up the remote driver to handle the new events APIs.
The public API allows an application to request a callback filters
events to a specific domain object, and register multiple callbacks
for the same event type. On the wire there are two strategies for
this
- Register multiple callbacks with the remote daemon, each
with filtering as needed
- Register only one callback per event type, with no filtering
Both approaches have potential inefficiency. In the first scheme,
the same event gets sent over the wire many times if multiple
callbacks are registered. With the second scheme, unneccessary
events get sent over the wire if a per-domain filter is set on
the client. The second scheme is far easier to implement though,
so this patch takes that approach.
* daemon/dispatch.h: Don't export remoteRelayDomainEvent since it
is no longer needed for unregistering callbacks, instead the
unique callback ID is used
* daemon/libvirtd.c, daemon/libvirtd.h: Track and unregister
callbacks based on callback ID, instead of function pointer
* daemon/remote.c: Switch over to using virConnectDomainEventRegisterAny
instead of legacy virConnectDomainEventRegister function. Refactor
remoteDispatchDomainEventSend() to cope with arbitrary event types
* src/driver.h, src/driver.c: Move verify() call into source file
instead of header, to avoid polluting the global namespace with
the verify function name
* src/remote/remote_driver.c: Implement new APIs for event
registration. Refactor processCallDispatchMessage() to cope
with arbitrary incoming event types. Merge remoteDomainQueueEvent()
into processCallDispatchMessage() to avoid duplication of code.
Rename remoteDomainReadEvent() to remoteDomainReadEventLifecycle()
* src/remote/remote_protocol.x: Define wire format for the new
virConnectDomainEventRegisterAny and virConnectDomainEventDeregisterAny
functions
The libvirtd daemon impl will need to switch over to using the
new event APIs. To make this simpler, ensure all drivers currently
providing events support both the new APIs and old APIs.
* src/lxc/lxc_driver.c, src/qemu/qemu_driver.c, src/test/test_driver.c,
src/vbox/vbox_tmpl.c, src/xen/xen_driver.c: Implement the new
virConnectDomainEvent(Dereg|Reg)isterAny driver entry points
The current internal domain events API tracks callbacks based on
the function pointer, and only supports lifecycle events. This
adds new internal APIs for registering callbacks for other event
types. These new APIs are postfixed with the word 'ID' to indicate
that they operated based on event ID, instead of hardcoded to
lifecycle events
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Add new APIs for handling callbacks
for non-lifecycle events
The internal domain events APIs are designed to handle the lifecycle
events. This needs to be refactored to allow arbitrary new event
types to be handled.
* The signature of virDomainEventDispatchFunc changes to use
virConnectDomainEventGenericCallback instead of the lifecycle
event specific virConnectDomainEventCallback
* Every registered callback gains a unique ID to allow its
removal based on ID, instead of function pointer
* Every registered callback gains an 'eventID' to allow callbacks
for different types of events to be distinguished
* virDomainEventDispatch is adapted to filter out callbacks
whose eventID does not match the eventID of the event being
dispatched
* virDomainEventDispatch is adapted to filter based on the
domain name and uuid, if this filter is set for a callback.
* virDomainEvent type/detail fields are moved into a union to
allow different data fields for other types of events to be
added later
* src/conf/domain_event.h, src/conf/domain_event.c: Refactor
to allow handling of different types of events
* src/lxc/lxc_driver.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c,
src/xen/xen_driver.c: Change dispatch function signature
to use virConnectDomainEventGenericCallback
The virtual box driver was directly accesing the domain events
structs instead of using the APIs provided. To prevent this kind
of abuse, make the struct definitions private, forcing use of the
internal APIs. This requires adding one extra internal API.
* src/conf/domain_event.h, src/conf/domain_event.c: Move
virDomainEventCallback and virDomainEvent structs into
the source file instead of header
* src/vbox/vbox_tmpl.c: Use official APIs for dispatching domain
events instead of accessing structs directly.
The current API for domain events has a number of problems
- Only allows for domain lifecycle change events
- Does not allow the same callback to be registered multiple times
- Does not allow filtering of events to a specific domain
This introduces a new more general purpose domain events API
typedef enum {
VIR_DOMAIN_EVENT_ID_LIFECYCLE = 0, /* virConnectDomainEventCallback */
...more events later..
}
int virConnectDomainEventRegisterAny(virConnectPtr conn,
virDomainPtr dom, /* Optional, to filter */
int eventID,
virConnectDomainEventGenericCallback cb,
void *opaque,
virFreeCallback freecb);
int virConnectDomainEventDeregisterAny(virConnectPtr conn,
int callbackID);
Since different event types can received different data in the callback,
the API is defined with a generic callback. Specific events will each
have a custom signature for their callback. Thus when registering an
event it is neccessary to cast the callback to the generic signature
eg
int myDomainEventCallback(virConnectPtr conn,
virDomainPtr dom,
int event,
int detail,
void *opaque)
{
...
}
virConnectDomainEventRegisterAny(conn, NULL,
VIR_DOMAIN_EVENT_ID_LIFECYCLE,
VIR_DOMAIN_EVENT_CALLBACK(myDomainEventCallback)
NULL, NULL);
The VIR_DOMAIN_EVENT_CALLBACK() macro simply does a "bad" cast
to the generic signature
* include/libvirt/libvirt.h.in: Define new APIs for registering
domain events
* src/driver.h: Internal driver entry points for new events APIs
* src/libvirt.c: Wire up public API to driver API for events APIs
* src/libvirt_public.syms: Export new APIs
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/test/test_driver.c, src/uml/uml_driver.c,
src/vbox/vbox_tmpl.c, src/xen/xen_driver.c,
src/xenapi/xenapi_driver.c: Stub out new API entries
The keys of entries in a VMX file are case insensitive. Both scsi0:1.fileName
and scsi0:1.filename are valid. Therefore, make the conf parser compare names
case insensitive in VMX mode to accept every capitalization variation.
Also add test cases for this.
<source file=''/> results in def->disks[i]->src == NULL. But
vboxDomainDefineXML and vboxDomainAttachDevice didn't check
def->disks[i]->src for NULL and expected it to be a valid string.
Add checks for def->disks[i]->src != NULL to fix the segfault.
* src/Makefile.am (augeas-check): New target, just to give the existing
rule a name. At the same time, prefix the commands with $(AM_V_GEN),
to avoid unexpected build output with V=0 which is the default.
Before, this function would blindly accept an invalid def->dst
and then abuse the idx=-1 it would get from virDiskNameToIndex,
when passing it invalid strings like "xvda:disk" and "sda1".
Now, this function returns -1 upon failure.
* src/conf/domain_conf.c (virDomainDiskDefAssignAddress): as above.
Update callers.
* src/conf/domain_conf.h: Update prototype.
* src/qemu/qemu_conf.c: Update callers.
Even if gnulib can provide stubs, it won't help that much. So just
replace affected util functions (virFileOperation and virDirCreate)
with stubs on Windows. Both functions aren't used on libvirt's
client side, so this is fine for MinGW builds.
This symbol is conditional, it would need to be exported conditional to
work properly with MinGW. So just remove it, as no other driver register
function is listed in the symbols files.
If esxVI_String_DeepCopyValue or esxVI_SelectionSpec_AppendToList fail
then selectionSpec would leak. Add a free call in the failure path to
fix the leak.
This is actually a consequence of the reworked required parameter
checking: Unify the required parameter check into a Validate function
instead of doing it separately im the (de)serialization part.
The required parameter checking for the mapped methods parameter was
done in the (de)serialize functions before. Now it's explicitly done
in the mapped method itself.
Invoking virDomainSetMemory() on lxc driver results in libvirtd
segfault when cgroups has not been configured on the host.
Ensure driver->cgroup is non-null before invoking
virCgroupForDomain(). To prevent similar segfaults in the future,
ensure driver parameter to virCgroupForDomain() is non-null before
dereferencing.
"virsh dominfo <vm>" crashes if there's no primary security driver set
since we only intialize the secmodel.model and secmodel.doi if we have
one. Attached patch checks for securityPrimaryDriver instead of
securityDriver since the later is always set in qemudSecurityInit().
Closes: http://bugs.debian.org/574359
Attempt to turn on vhost-net mode for devices of type NETWORK, BRIDGE,
and DIRECT (macvtap).
* src/qemu/qemu_conf.h: add vhostfd to qemuBuildHostNetStr prototype
add qemudOpenVhostNet prototype new flag to set when :,vhost=" found in
qemu help
* src/qemu/qemu_conf.c: * set QEMUD_CMD_FLAG_VNET_HOST is ",vhost=" found
in qemu help
- qemudOpenVhostNet - opens /dev/vhost-net to pass to qemu if everything
is in place to use it.
- qemuBuildHostNetStr - add vhostfd to commandline if it's not empty
(higher levels decide whether or not to fill it in)
- qemudBuildCommandLine - if /dev/vhost-net is successfully opened, add
its fd to tapfds array so it isn't closed on qemu exec, and populate
vhostfd_name to be passed in to commandline builder.
* src/qemu/qemu_driver.c: add filler 0 for new arg to qemuBuildHostNetStr,
along with a note that this must be implemented in order for hot-plug of
vhost-net virtio devices to work properly (once qemu "netdev_add" monitor
command is implemented).
POSIX states that creation of a mutex with default attributes
is unspecified whether the mutex is recursive or non-recursive.
We specifically want non-recursive (deadlock is desirable in
flushing out coding bugs that used our mutex incorrectly).
* src/util/threads-pthread.c (virMutexInit): Specifically request
non-recursive mutex, rather than relying on unspecified default.
Currently no command can be sent to a qemu process while another job is
active. This patch adds support for signaling long-running jobs (such as
migration) so that other threads may request predefined operations to be
done during such jobs. Two signals are defined so far:
- QEMU_JOB_SIGNAL_CANCEL
- QEMU_JOB_SIGNAL_SUSPEND
The first one is used by qemuDomainAbortJob.
The second one is used by qemudDomainSuspend for suspending a domain
during migration, which allows for changing live migration into offline
migration. However, there is a small issue in the way qemudDomainSuspend
is currently implemented for migrating domains. The API calls returns
immediately after signaling migration job which means it is asynchronous
in this specific case.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
* src/qemu/qemu_driver.c (qemudDomainAttachSCSIDisk): The ".controller"
member is an index, and *may* be 0. As such, the commit that we're
reverting broke SCSI disk hot-plug on controller 0.
Reported by Wolfgang Mauerer.
We need to call PrepareHostdevs to determine the USB device path before
any security calls. PrepareHostUSBDevices was also incorrectly skipping
all USB devices.
* src/qemu/qemu_conf.c: add the ",readonly=on" for read-only disks
and also parse it back in qemuParseCommandLineDisk()
* tests/qemuxml2argvtest.c
tests/qemuxml2argvdata/qemuxml2argv-disk-drive-readonly-disk.args
tests/qemuxml2argvdata/qemuxml2argv-disk-drive-readonly-disk.xml:
add a specific regression test
The nodeGetInfo code was always assuming that machine had a
single NUMA node, which is not correct. The good news is that
libnuma gives us this information pretty easily, so let's
properly report it.
NOTE: With recent hardware starting to support CPU hot-add
and hot-remove, both this code and the nodeCapsInitNUMA()
code are quickly going to become obsolete. We'll have to
think of a more dynamic solution for dealing with NUMA
nodes and CPUs that can come and go at will.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Currently if you dump the core of a qemu guest with
qemudDomainCoreDump, subsequent commands will hang
up libvirtd. This is because qemudDomainCoreDump
uses qemuDomainWaitForMigrationComplete, which expects
the qemuDriverLock to be held when it's called. This
patch does the simple thing and moves the qemuDriveUnlock
to the end of the qemudDomainCoreDump so that the driver
lock is held for the entirety of the call (as it is done
in qemudDomainSave). We will probably want to make the
lock more fine-grained than that in the future, but
we can fix both qemudDomainCoreDump and qemudDomainSave
at the same time.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The code to add job support into libvirtd caused a problem
in qemudDomainSetVcpus. In particular, a qemuDomainObjEndJob()
call was added at the end of the function, but a
corresponding qemuDomainObjBeginJob() was not. Additionally,
a call to qemuDomainObj{Enter,Exit}Monitor() was also missed
in qemudDomainHotplugVcpus(). These missing calls conspired to
cause a hang in the libvirtd process after the command was
finished. Fix this by adding the missing calls.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
As previously discussed[1], this patch removes the
qemudDomainSetMaxMemory() function, since it doesn't
work. This means that instead of getting somewhat
cryptic errors, you will now get:
error: Unable to change MaxMemorySize
error: this function is not supported by the hypervisor: virDomainSetMaxMemory
Which describes the situation perfectly.
[1] https://www.redhat.com/archives/libvir-list/2010-February/msg00928.html
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When using the JSON monitor, qemuMonitorJSONExtractCPUInfo
was returning 0 on success. Unfortunately, higher levels of
the cpuinfo code expect that it returns the number of CPUs
it found on success. This one-line patch fixes it so that
it returns the correct number. This makes "virsh vcpuinfo <domain>"
work when using the JSON monitor.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
As pointed out by eblake, I made a real hash of the
nodeinfo code with commit
aa2f6f96dd. This patch
cleans it up:
1) Do more work at compile time instead of runtime (minor)
2) Properly handle the hex digits that come from
/sys/devices/system/cpu/cpu*/topology/thread_siblings
3) Fix up some error paths that could cause SEGV
4) Used unsigned's for the cpu numbers (cpu -1 doesn't
make any sense)
Along with the recent patch from jdenemar to zero out
the nodeinfo structure, I've re-tested this on the
machines having the problems, and it seems to be good.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
It is a bad idea to call gettext on an already-translated
string. In cases where a string must be translated separately
from where it is exposed to xgettext, the gettext manual
recommends the idiom of N_() wrapping gettext_noop for
marking the string.
* src/internal.h (N_): Fix definition to match gettext manual.
* tools/virsh.c: (cmdHelp, cmdList, cmdDomstate, cmdDominfo)
(cmdVcpuinfo, vshUsage): Replace incorrect use of N_ with _.
(vshCmddefHelp): Likewise. Mark C format strings appropriately.
The nodeinfo structure wasn't initialized in qemu driver and with the
recent change in CPU topology parsing, old value of nodeinfo->sockets
could be used and incremented giving totally bogus results.
Let's just wipe the structure completely.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
A few more non-literal format strings in error log messages have crept
in. Fix them in the standard way - turn the format string into "%s"
with the original string as the arg.
If a special cache strategy for a disk has been specified in a domain
definition, but no driverName has been set, virDomainGetXMLDesc would not
include the <driver> tag at all.
* src/conf/domain_conf.c: make sure any <driver> tag setting is
serialized if set.
The current code for "nodeinfo" is pretty naive
about socket and thread information. To determine the
sockets, it just takes the number of cpus and divides
by the number of cores. For the thread count, it always
sets it to 1. With more recent Intel machines, however,
hyperthreading is again an option, meaning that these
heuristics no longer work and give bogus numbers. This
patch goes through /sys to get the additional
information so we properly report it.
Note that I had to edit the tests not to report on
socket and thread counts, since these are determined
dynamically now.
v2: As pointed out by Eric Blake, gnulib provides
count-one-bits (which is LGPLv2+). Use it instead
of a hand-coded popcnt.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When adding domainMemoryStats API support for the qemu driver, I didn't
follow the locking rules exactly. The job condition must be held when
executing monitor commands. This corrects the segfaults I was seeing
when calling domainMemoryStats in a multi-threaded environment.
* src/qemu/qemu_driver.c: in qemudDomainMemoryStats() add missing
calls to qemuDomainObjBeginJob/qemuDomainObjEndJob
doTunnelSendAll function (used by QEMU migration) uses a 64k buffer on
the stack, which could be problematic. This patch replaces that with a
buffer from the heap.
While in the neighborhood, this patch also improves error reporting in
the case that saferead fails - previously, virStreamAbort() was called
(resetting errno) before reporting the error. It's been changed to
report the error first.
* src/qemu/qemu_driver.c: fix doTunnelSendAll() to use a malloc'ed
buffer
* src/qemu/qemu_driver.c (qemudDomainAttachSCSIDisk): Handle
the (theoretical) case of an empty controller list, so that
clang does not think the subsequent dereference of "cont"
would dereference an undefined variable (due to preceding
loop not iterating even once).
* src/qemu/qemu_driver.c (qemudDomainRestore): A corrupt save file
(in particular, a too-large header.xml_len value) would cause an
unwarranted out-of-memory error. Do not trust the just-read
header.xml_len. Instead, merely use that as a hint, and
read/allocate up to that number of bytes from the file.
Also verify that header.xml_len is positive; if it were negative,
passing it to virFileReadLimFD could cause trouble.
to saferead_lim, which interprets it as a size_t.
* src/util/util.c (virFileReadLimFD): Do not malfunction when
maxlen < -1. Return -1,EINVAL in that case. Handle maxlen==0
in the same manner.
* src/xen/proxy_internal.c (xenProxyDomainDumpXML): An invalid packet
could include a too-large "ans.len" value, which would make us allocate
too much memory and then copy data from beyond the end of "ans",
possibly evoking a segfault. Ensure that the value we use is no
larger than the remaining portion of "ans".
Also, change unnecessary memmove to memcpy (src and dest obviously
do not overlap, so no need to use memmove).
(xenProxyDomainGetOSType): Likewise.
(xenProxyGetCapabilities): Likewise.
The code erroneously searched the entire "reply" for a comma, when
its intent was to search only that portion after "balloon: actual="
* src/qemu/qemu_monitor_text.c (qemuMonitorTextGetMemoryStats):
Search for "," only starting *after* the BALLOON_PREFIX string.
Otherwise, we'd be more prone to false positives.
Changeset
commit 5073aa994a
Author: Cole Robinson <crobinso@redhat.com>
Date: Mon Jan 11 11:40:46 2010 -0500
Added support for product/vendor based passthrough, but it only
worked at the security driver layer. The main guest XML config
was not updated with the resolved bus/device ID. When the QEMU
argv refactoring removed use of product/vendor, this then broke
launching guests.
THe solution is to move the product/vendor resolution up a layer
into the QEMU driver. So the first thing QEMU does is resolve
the product/vendor to a bus/device and updates the XML config
with this info. The rest of the code, including security drivers
and QEMU argv generated can now rely on bus/device always being
set.
* src/util/hostusb.c, src/util/hostusb.h: Split vendor/product
resolution code out of usbGetDevice and into usbFindDevice.
Add accessors for bus/device ID
* src/security/virt-aa-helper.c, src/security/security_selinux.c,
src/qemu/qemu_security_dac.c: Remove vendor/product from the
usbGetDevice() calls
* src/qemu/qemu_driver.c: Use usbFindDevice to resolve vendor/product
into a bus/device ID
The pci_del command is not being ported to QMP. Convert all the
QEMU hotplug code over to use device_del whenever it is available
to avoid the pci_del problem
* src/qemu/qemu_driver.c: Convert unplug code to device_del
Previously hot-unplug could not be supported for USB devices
in QEMU, since usb_del required the guest visible address
which libvirt never knows. With 'device_del' command we can
now unplug based on device alias, so support that.
* src/qemu/qemu_driver.c: Use device_del to remove USB devices
Upstart crashes & burns in a heap if $TERM environment variable
is missing. Presumably the kernel always sets this when booting
init on a real machine, so libvirt should set it for containers
too.
To make a typical inittab / mingetty setup happier, we need to
symlink the primary console /dev/pts/0 to /dev/tty1.
Improve logging in certain scenarios to make troubleshooting
easier
* src/lxc/lxc_container.c: Create /dev/tty1 and set $TERM
When using the 'ns' cgroup controller, the moment a process calls
'unshare(CLONE_NEWNS)', it will be given a private cgroup tree
under its current location. This really messages up the LXC
controller process, because it ends up creating the containers'
cgroup in the wrong place. The fix is fairly easy, just move
the cgroup setup before the code which calls unshare(). The
'ns' controller will still create extra undesired cgroups, but
they at least won't break libvirt's setup now.
The patch also adds a missing cgroups allow rule for /dev/tty
device node
When getting the driver/domain cgroup it is possible to specify
whether it should be auto created. If auto-creation was turned
off, libvirt still mistakenly created its own top level cgroup
* src/util/cgroup.c: Honour autocreate flag for top level cgroup
This allows the config to have a setting that means "leave it alone",
eg when building a pool where the directory already exists the user
may want the current uid/gid of the directory left intact. This
actually gets us back to older behavior - before recent changes to the
pool building code, we weren't as insistent about honoring the uid/gid
settings in the XML, and virt-manager was taking advantage of this
behavior.
As a side benefit, removing calls to getuid/getgid from the XML
parsing functions also seems like a good idea. And having a default
that is different from a common/useful value (0 == root) is a good
thing in general, as it removes ambiguity from decisions (at least one
place in the code was checking for (perms.uid == 0) to see if a
special uid was requested).
Note that this will only affect newly created pools and volumes. Due
to the way that the XML is parsed, then formatted for newly created
volumes, all existing pools/volumes already have an explicit uid and
gid set.
src/conf/storage_conf.c: Remove calls to setuid/setgid for default values
of uid/gid, and set them to -1 instead
src/storage/storage_backend.c:
src/storage/storage_backend_fs.c:
Make account for the new default values of perms.uid
and perms.gid.
With the recent changes to the linking defaults in Fedora 13 (namely
enabling --no-add-needed behaviour by default), we have to pass the
dlopen()-providing libraries directly at the link of the module; use the
same AC_SEARCH_LIBS function as used before to look for it and add it to
the Makefile.
QEMU has a monitor command 'set_cpu' which allows a specific
CPU to be toggled between online& offline state. libvirt CPU
hotplug does not work in terms of individual indexes CPUs.
Thus to support this, we iteratively toggle the online state
when the total number of vCPUs is adjusted via libvirt
NB, currently untested since QEMU segvs when running this!
* src/qemu/qemu_driver.c: Toggle online state for CPUs when
doing hotplug
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
monitor API for toggling a CPU's online status via 'set_cpu
The storage backend implementations all presume that the XML parser
is validating correctness of the source specification. The check for
a source device was lost at some point. This allowed for a potential
crash in the disk backend. Re-introduce the sanity check
* src/conf/storage_conf.c: Re-add check for source device
When trying to delete a pool the error message claimed the volume
could not be deleted.
* src/storage/storage_driver.c: Error message referred to
volumes instead of pools
The QMP code was running query-migration instead of query-migrate.
This doesn't work so well
* src/qemu/qemu_monitor_json.c: s/query-migration/query-migrate/
The code to remove the cgroup after QEMU failed to startup could
be obscuring a real error from earlier on. It is not neccessary
to raise an error in this case, so tell cgroups to keep quiet
* src/qemu/qemu_driver.c: Don't raise cgroups error in QEMU start
cleanup code.
The QEMU hotunplug code for PCI devices was looking at host
devices in the guest config without first filtering non
PCI devices. This means it was reading garbage
* src/qemu/qemu_driver.c: Filter out non-PCI devices
Commit 3c12a67b76 added
a dependency on the NFS_SUPER_MAGIC macro, which is
defined in linux/magic.h. Unfortunately linux/magic.h
is not available in RHEL-5, and causes a compile error.
Just define it locally, since this is something that
can't change.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Move *all* file operations related to creation and writing of libvirt
header to the domain save file into a hook function that is called by
virFileOperation. First try to call virFileOperation as root. If that
fails with EACCESS, and (in the case of Linux) statfs says that we're
trying to save the file on an NFS share, rerun virFileOperation,
telling it to fork a child process and setuid to the qemu user. This
is the only way we can successfully create a file on a root-squashed
NFS server.
This patch (along with setting dynamic_ownership=0 in qemu.conf)
makes qemudDomainSave work on root-squashed NFS.
* src/qemu/qemu_driver.c: provide new qemudDomainSaveFileOpHook()
utility, use it in qemudDomainSave() if normal creation of the
file as root failed, and after checking the filesystem type for
the storage is NFS. In that case we also bypass the security
driver, as this would fail on NFS.
If qemudDomainRestore fails to open the domain save file, create a
pipe, then fork a process that does setuid(qemu_user) and opens the
file, then reads this file and stuffs it into the pipe. the parent
libvirtd process will use the other end of the pipe as its fd, then
reap the child process after it's done reading.
This makes domain restore work on a root-squash NFS share that is only
visible to the qemu user.
* src/qemu/qemu_driver.c: add new qemudOpenAsUID() helper function,
and use it in qemudDomainRestore() if reading as root directly failed.
The USB/PCI device hotplug code for the QEMU driver was forgetting
to allocate a unique device alias.
* src/qemu/qemu_driver.c: Fill in device alias for USB/PCI devices
The code assumed that 'device_add' returned an empty string upon
success. This is not true, it sometimes prints random debug info.
THus we need to check for an explicit fail string
* src/qemu/qemu_monitor_text.c: Fix error checking of the device_add
monitor command
Another warning caught by coverity. Continue to perform best-effort
closing and resource release, but warn the caller about the failure.
* src/esx/esx_driver.c (esxClose): Return an error on failure to close.
Various safezero() implementations used either -1, errno or -errno
return values. This patch fixes them all to return -1 and set errno
appropriately.
There was also a bug in size parameter passed to safewrite() which could
result in an attempt to write gigabytes out of a megabyte buffer.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When a VM save attempt failed, the VM would be left in a paused
state. It is neccessary to resume CPU execution upon failure
if it was running originally
* src/qemu/qemu_driver.c: Resume CPUs upon save failure
This supports cancellation of jobs for the QEMU driver against
the virDomainMigrate, virDomainSave and virDomainCoreDump APIs.
It is not yet supported for the virDomainRestore API, although
it is desirable.
* src/qemu/qemu_driver.c: Issue 'migrate_cancel' command if
virDomainAbortJob is issued during a migration operation
* tools/virsh.c: Add a domjobabort command
This defines the wire protocol for the new API
* src/remote/remote_protocol.x: Wire protocol definition
* src/remote/remote_driver.c,daemon/remote.c: Client and server
side implementation
* daemon/remote_dispatch_args.h, daemon/remote_dispatch_prototypes.h,
daemon/remote_dispatch_table.h, src/remote/remote_protocol.c,
src/remote/remote_protocol.h: Re-generate from remote_protocol.x
This provides the internal glue for the driver API
* src/driver.h: Internal API contract
* src/libvirt.c, src/libvirt_public.syms: Connect public API
to driver API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/test/test_driver.c src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c: Stub out entry points
Introduce support for virDomainGetJobInfo in the QEMU driver. This
allows for monitoring of any API that uses the 'info migrate' monitor
command. ie virDomainMigrate, virDomainSave and virDomainCoreDump
Unfortunately QEMU does not provide a way to monitor incoming migration
so we can't wire up virDomainRestore yet.
The virsh tool gets a new command 'domjobinfo' to query status
* src/qemu/qemu_driver.c: Record virDomainJobInfo and start time
in qemuDomainObjPrivatePtr objects. Add generic shared handler
for calling 'info migrate' with all migration based APIs.
* src/qemu/qemu_monitor_text.c: Fix parsing of 'info migration' reply
* tools/virsh.c: add new 'domjobinfo' command to query progress
* src/remote/remote_protocol.x: Define wire protocol format
for virDomainGetJobInfo API
* src/remote/remote_driver.c, daemon/remote.c: Implement client
and server marshalling code for virDomainGetJobInfo()
* daemon/remote_dispatch_args.h, daemon/remote_dispatch_prototypes.h
daemon/remote_dispatch_ret.h, daemon/remote_dispatch_table.h,
src/remote/remote_protocol.c, src/remote/remote_protocol.h: Rebuild
files from src/remote/remote_protocol.x
The internal glue layer for the new pubic API
* src/driver.h: Define internal driver API contract
* src/libvirt.c, src/libvirt_public.syms: Wire up public
API to internal driver API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/test/test_driver.c, src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c: Stub new entry point
All other uses of get_str_prop in this file that ignored
failure explicitly cast to void.
* src/node_device/node_device_hal.c (dev_create): Silence coverity
warning.
A number of the error messages raised when parsing USB devices
refered to PCI devices by mistake
* src/qemu/qemu_conf.c: s/PCI/USB/ in qemuParseCommandLineUSB()
when the underlying qemu supports the drive/device model and the
controller has been added this way.
* src/qemu/qemu_driver.c: use qemuMonitorDelDevice() when detaching
PCI controller and if supported
* src/qemu/qemu_monitor.[ch]: add new qemuMonitorDelDevice() function
* src/qemu/qemu_monitor_json.[ch]: JSON backend for DelDevice command
* src/qemu/qemu_monitor_text.[ch]: Text backend for DelDevice command
* src/qemu/qemu_driver.c: in qemudDomainDetachPciControllerDevice()
when a controller is not present in the system anymore, the PCI
address must be deleted from libvirt's hashtable because it can
be re-used for other purposes.
* src/qemu/qemu_driver.c: in qemudDomainAttachDevice(), one must not
delete the data part when the operation succeeds because it is
required later on. The correct pattern to handlethe parsed
representation of the device information on success
is dev->data.controller = NULL; virDomainDeviceDefFree(dev);,
which leaves the structure pointed at by data in memory.
* src/cpu/cpu_x86.c (x86Decode): Don't dereference NULL when passed
a NULL "models" pointer, or when passed a nonzero "nmodels" value
and a corresponding NULL models[i].
* src/phyp/phyp_driver.c (phypUUIDTable_Push): Move incr/decr
of ptr/nread into the loop where those variables are used.
Also, remove "exit" label and just-preceding "goto".
Allow an arbitrary timezone with QEMU by setting the $TZ environment
variable when launching QEMU
* src/qemu/qemu_conf.c: Set TZ environment variable if a timezone
is requested
* tests/qemuxml2argvtest.c: Add test case for timezones
* tests/qemuxml2argvdata/qemuxml2argv-clock-france.xml,
tests/qemuxml2argvdata/qemuxml2argv-clock-france.args: Data
for timezone tests
This extends the XML to allow for
<clock offset='timezone' timezone='Europe/Paris'/>
This is useful if the admin has not configured any timezone on the
host OS, but still wants to synchronize a guest to a specific one.
* src/conf/domain_conf.h, src/conf/domain_conf.c: Support extra
'timezone' attribute on clock configuration
* docs/schemas/domain.rng: Add 'timezone' attribute
* src/xen/xend_internal.c, src/xen/xm_internal.c: Reject configs
with a configurable timezone
This allows QEMU guests to be started with an arbitrary clock
offset
The test case can't actually be enabled, since QEMU argv expects
an absolute timestring, and this will obviously change every
time the test runs :-( Hopefully QEMU will allow a relative
time offset in the future.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Use the -rtc arg
if available to support variable clock offset mode
* tests/qemuhelptest.c: Add QEMUD_CMD_FLAG_RTC for qemu 0.12.1
* qemuxml2argvdata/qemuxml2argv-clock-variable.args,
qemuxml2argvdata/qemuxml2argv-clock-variable.xml,
qemuxml2argvtest.c: Test case, except we can't actually enable
it yet.
This introduces a third option for clock offset synchronization,
that allows an arbitrary / variable adjustment to be set. In
essence the XML contains the time delta in seconds, relative to
UTC.
<clock offset='variable' adjustment='123465'/>
The difference from 'utc' mode, is that management apps should
track adjustments and preserve them at next reboot.
* docs/schemas/domain.rng: Schema for new clock mode
* src/conf/domain_conf.c, src/conf/domain_conf.h: Parse
new clock time delta
* src/libvirt_private.syms, src/util/xml.c, src/util/xml.h: Add
virXPathLongLong() method
The XML will soon be extended to allow more than just a simple
localtime/utc boolean flag. This change replaces the plain
'int localtime' with a separate struct to prepare for future
extension
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add a new
virDomainClockDef structure
* src/libvirt_private.syms: Export virDomainClockOffsetTypeToString
and virDomainClockOffsetTypeFromString
* src/qemu/qemu_conf.c, src/vbox/vbox_tmpl.c, src/xen/xend_internal.c,
src/xen/xm_internal.c: Updated to use new structure for localtime
While building under RHEL-5, I got a compile warning because
virDomainObjFormat was defined but not used. That came about
because in RHEL-5 we build with "#define PROXY", and
virDomainObjFormat is only used with !PROXY. Move the
define.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
* src/openvz/openvz_conf.c (openvzGetVEID): Always call fclose.
Diagnose parse failure also when vzlist output is empty.
If somehow we read a -1, diagnose that (albeit as a parse failure).
Kind of minor, but it annoys me that the default auth callback
doesn't put a space between the prompt and the input, like a typical
terminal, ssh, etc. This patch changes the current prompt:
Please enter your authentication name:myuser
to
Please enter your authentication name: myuser
In bug 567931 we found that virt-top would exit occasionally
when the terminal window was resized. Tracking this down it
turned out that SIGWINCH was being delivered to the process at
exactly the point where the libvirt remote driver was calling
poll(2) waiting for a reply from libvirtd.
This caused the poll(2) call to be interrupted (returning errno
EINTR). However handling EINTR the same way as EAGAIN was not
the solution to this problem since we found previously that this
would break Ctrl-C handling (commit 47fec8eac2).
The correct solution is to mask out SIGWINCH for the duration
of the poll(2) system call. The per-thread mask is changed and
restored immediately after the call. Since we are using
pthread_sigmask, this should not affect other threads, and
since we restore the signal mask immediately afterwards it should
not affect the current thread visibly either. Other possibly
problematic signals are SIGCHLD and SIGPIPE and these are
masked too.
Note use of ignore_value: It's not fatal if we cannot mask out
SIGWINCH, and in any case pthread_sigmask never fails on Linux
as long as you supply the correct arguments.
I tested this patch and it cures the original problem with
virt-top.
Create the filesystem on the partition used by the pool
* configure.ac: check for mkfs availability
* libvirt.spec.in: add extra require on util-linux for mkfs
* src/storage/storage_backend_fs.c: run mkfs with the expected
fs type when creating a filesystem pool
We were using 'Y' to mean exabyte, when the correct abbreviation would be
'E' ('Y' is yettabyte, which is exabyte * 1024 * 1024). While it isn't
strictly backwards compatible, I highly doubt anyone was actually using
this broken behavior, so I don't see any harm in in dropping 'Y' handling.
Recently we introduced O_DSYNC flag when creating raw storage files to
avoid filling all disk cache with dirty pages. However, the patch got
lost when virStorageBackendCreateRaw was reworked using
virFileOperation. Let's use O_DSYNC again.
* src/xen/xend_internal.c (xenDaemonDomainSetAutostart): Avoid a NULL
dereference upon non-SEXPR_VALUE'd on_xend_start. This bug was
introduced by commit 37ce5600c0.
There were a few operations on the storage volume file that were still
being done as root, which will fail if the file is on a root-squashed
NFS share. The result was that attempts to create a storage volume of
type "raw" on a root-squashed NFS share would fail.
This patch uses the newly introduced "hook" function in
virFileOperation to execute all those file operations in the child
process that's run under the uid that owns the file (and, presumably,
has permission to write to the NFS share)
* src/storage/storage_backend.c: use virFileOperation() in
virStorageBackendCreateRaw, turning virStorageBackendCreateRaw()
into a new createRawFileOpHook() hook
It turns out it is also useful to be able to perform other operations
on a file created while running as a different uid (eg, write things
to that file), and possibly to do this to a file that already
exists. This patch adds an optional hook function to the renamed (for
more accuracy of purpose) virFileOperation; the hook will be called
after the file has been opened (possibly created) and gid/mode
checked/set, before closing it.
As with the other operations on the file, if the VIR_FILE_OP_AS_UID
flag is set, this hook function will be called in the context of a
child process forked from the process that called virFileOperation.
The implication here is that, while all data in memory is available to
this hook function, any modification to that data will not be seen by
the caller - the only indication in memory of what happened in the
hook will be the return value (which the hook should set to 0 on
success, or one of the standard errno values on failure).
Another piece of making the function more flexible was to add an
"openflags" argument. This arg should contain exactly the flags to be
passed to open(2), eg O_RDWR | O_EXCL, etc.
In the process of adding the hook to virFileOperation, I also realized
that the bits to fix up file owner/group/mode settings after creation
were being done in the parent process, which could fail, so I moved
them to the child process where they should be.
* src/util/util.[ch]: rename and rework virFileCreate-->virFileOperation,
and redo flags in virDirCreate
* storage/storage_backend.c, storage/storage_backend_fs.c: update the
calls to virFileOperation/virDirCreate to reflect changes in the API,
but don't yet take advantage of the hook.
Fix multiple veth problem.
NETIF setting was overwritten after first CT because any CT could not be
found by name.
* src/openvz/openvz_conf.c src/openvz/openvz_conf.h: add the
openvzGetVEID lookup function
* src/openvz/openvz_driver.c: use it in openvzDomainSetNetwork()
If the hostname as returned by "gethostname" resolves
to "localhost" (as it does with the broken Fedora-12
installer), then live migration will fail because the
source will try to migrate to itself. Detect this
situation up-front and abort the live migration before
we do any real work.
* src/util/util.h src/util/util.c: add a new virGetHostnameLocalhost
with an optional localhost check, and rewire virGetHostname() to use
it
* src/libvirt_private.syms: expose the new function
* src/qemu/qemu_driver.c: use it in qemudDomainMigratePrepare2()
* src/xen/xend_internal.c (xenDaemonDomainSetAutostart): Rewrite to
avoid dereferencing the result of sexpr_lookup. While in this
particular case, it was guaranteed never to be NULL, due to the
preceding "if sexpr_node(...)" guard, it's cleaner to skip the
sexpr_node call altogether, and also saves a lookup.
This patch sets or unsets the IFF_VNET_HDR flag depending on what device
is used in the VM. The manipulation of the flag is done in the open
function and is only fatal if the IFF_VNET_HDR flag could not be cleared
although it has to be (or if an ioctl generally fails). In that case the
macvtap tap is closed again and the macvtap interface torn.
* src/qemu/qemu_conf.c src/qemu/qemu_conf.h: pass qemuCmdFlags to
qemudPhysIfaceConnect()
* src/util/macvtap.c src/util/macvtap.h: add vnet_hdr boolean to
openMacvtapTap(), and private function configMacvtapTap()
* src/qemu/qemu_driver.c: add extra qemuCmdFlags when calling
qemudPhysIfaceConnect()
For __virExec() this is a semantic NOP except for when fork()
fails. __virExec() would previously forget to restore the signal mask
in this case; virFork() corrects this behavior.
virFileCreate() and virDirCreate() gain the code to reset the logging
and properly deal with the signal handling race condition.
This also removes a log message that had a typo ("cannot fork o create
file '%s'") - this error is now logged in a more generic manner in
virFork() (more generic, but really just as informative, since the
fact that it's forking to create a file is immaterial to the fact that
it simply can't fork)
* src/util/util.c: use the generic virFork() in the 3 functions
virFork() contains bookkeeping that must be done any time a process
forks. Currently this includes:
1) Call virLogLock() prior to fork() and virLogUnlock() just after,
to avoid a deadlock if some other thread happens to hold that lock
during the fork.
2) Reset the logging hooks and send all child process log messages to
stderr.
3) Block all signals prior to fork(), then either a) reset the signal
mask for the parent process, or b) clear the signal mask for the
child process.
Note that the signal mask handling in __virExec erroneously fails to
restore the signal mask when fork() fails. virFork() fixes this
problem.
Other than this, it attempts to behave as closely to fork() as
possible (including preserving errno for the caller), with a couple
exceptions:
1) The return value is 0 (success) or -1 (failure), while the pid is
returned via the pid_t* argument. Like fork(), if pid < 0 there is
no child process, otherwise both the child and the parent will
return to the caller, and both should look at the return value,
which will indicate if some of the extra processing outlined above
encountered an error.
2) If virFork() returns with pid < 0 or with a return value < 0
indicating an error condition, the error has already been
reported. You can log an additional message if you like, but it
isn't necessary, and may be awkwardly extraneous.
Note that virFork()'s child process will *never* call _exit() - if a
child process is created, it will return to the caller.
* util.c util.h: add virFork() function, based on what is currently
done in __virExec().
Support virtio-serial controller and virtio channel in QEMU backend.
Will output
the following for virtio-serial controller:
-device
virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4,max_ports=16,vectors=4
and the following for a virtio channel:
-chardev pty,id=channel0 \
-device
virtserialport,bus=virtio-serial0.0,chardev=channel0,name=org.linux-kvm.port.0
* src/qemu/qemu_conf.c: Add argument output for virtio
* tests/qemuxml2argvtest.c
tests/qemuxml2argvdata/qemuxml2argv-channel-virtio.args: Add test for
QEMU command line generation
Add support for virtio-serial by defining a new 'virtio' channel target type
and a virtio-serial controller. Allows the following to be specified in a
domain:
<controller type='virtio-serial' index='0' ports='16' vectors='4'/>
<channel type='pty'>
<target type='virtio' name='org.linux-kvm.port.0'/>
<address type='virtio-serial' controller='0' bus='0'/>
</channel>
* docs/schemas/domain.rng: Add virtio-serial controller and virtio
channel type.
* src/conf/domain_conf.[ch]: Domain parsing/serialization for
virtio-serial controller and virtio channel.
* tests/qemuxml2xmltest.c
tests/qemuxml2argvdata/qemuxml2argv-channel-virtio.xml: add domain xml
parsing test
* src/libvirt_private.syms src/qemu/qemu_conf.c:
virDomainDefAddDiskControllers() renamed to
virDomainDefAddImplicitControllers()
Remove virDomainDevicePCIAddressEqual and virDomainDeviceDriveAddressEqual,
which are defined but not used anywhere.
* src/conf/domain_conf.[ch] src/libvirt_private.syms: Remove
virDomainDevicePCIAddressEqual and virDomainDeviceDriveAddressEqual.
Currently we just error with ex. 'virbr0: No such device'.
Since we are using public API calls here, we need to ensure that any
raised error is properly saved and restored, since API entry points
always reset messages.
virGetLastError returns NULL if no error has been set, not on
allocation error like virSetError assumed. Use virLastErrorObject
instead. This fixes virSetError when no error is currently stored.
Rework and simplification of teardown of the macvtap device.
Basically all devices with the same MAC address and link device are kept
alive and not attempted to be torn down. If a macvtap device linked to a
physical interface with a certain MAC address 'M' is to be created it
will automatically fail if the interface is 'up'ed and another macvtap
with the same properties (MAC addr 'M', link dev) happens to be 'up'.
This will prevent the VM from starting or the device from being attached
to a running VM. Stale interfaces are assumed to be there for some
reason and not stem from libvirt.
In the VM shutdown path, it's assuming that an interface name is always
available so that if the device type is DIRECT it can be torn down
using its name.
* src/util/macvtap.h src/libvirt_macvtap.syms: change of deleting routine
* src/util/macvtap.c: cleanups and change of deleting routine
* src/qemu/qemu_driver.c: change cleanup on shutdown
* src/qemu/qemu_conf.c: don't delete Macvtap in qemudPhysIfaceConnect()
The QEMU JSON monitor changed balloon commands to return/accept
bytes instead of kilobytes. Update libvirt to cope with this
* src/qemu/qemu_monitor_json.c: Expect/use bytes for ballooning
* src/uml/uml_driver.c (umlMonitorCommand): This function would
sometimes return -1, yet fail to free the "reply" it had allocated.
Hence, no caller would know to free the corresponding argument.
When returning -1, be sure to free all allocated resources.
* src/storage/storage_backend_mpath.c (virStorageBackendIsMultipath):
The result of dm_get_next_target was never used (and isn't needed),
so don't store it.
Similar to the Set*Mem commands, this implementation was bogus and
misleading. Make it clear this is a hotplug only operation, and that the
hotplug piece isn't even implemented.
Also drop the overkill maxvcpus validation: we don't perform this check
at XML define time so clearly no one is missing it, and there is
always the risk that our info will be out of date, possibly preventing
legitimate CPU values.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
SetMem and SetMaxMem are hotplug only APIs, any persistent config
changes are supposed to go via XML definition. The original implementation
of these calls were incorrect and had the nasty side effect of making
a psuedo persistent change that would be lost after libvirtd restart
(I didn't know any better).
Fix these APIs to rightly reject non running domains.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
The plain QEMU tree does not include 'thread_id' in the JSON
output. Thus we need to treat it as non-fatal if missing.
* src/qemu/qemu_monitor_json.c: Treat missing thread_id as non-fatal
A typo in the check for the primary IDE controller could cause
a crash on restore depending on the exact guest config.
* src/qemu/qemu_conf.c: Fix s/video/controller/ typo & slot
number typo
Current error reporting for JSON mode returns the full JSON
command string and full JSON error string. This is not very
user friendly, so this change makes the error report only
contain the basic command name, and friendly error message
description string. The full JSON data is logged instead.
* src/qemu/qemu_monitor_json.c: Always return the 'desc' field from
the JSON error message to users.
When in JSON mode, QEMU requires that 'qmp_capabilities' is run as
the first command in the monitor. This is a no-op when run in the
text mode monitor
* src/qemu/qemu_driver.c: Run capabilities negotiation when
connecting to the monitor
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Add
support for the 'qmp_capabilities' command, no-op in text mode.
This part adds support for qemu making a macvtap tap device available
via file descriptor passed to qemu command line. This also attempts to
tear down the macvtap device when a VM terminates. This includes support
for attachment and detachment to/from running VM.
* src/qemu/qemu_conf.[ch] src/qemu/qemu_driver.c: add support in the
QEmu driver
This part adds the helper code to setup and tear down macvtap devices
using direct communication with the device driver via netlink sockets.
The rather short messages received from the netlink layer are now
written into a dynamically allocated buffer
* src/util/macvtap.h src/util/macvtap.c: provides the new module
* po/POTFILES.in: the module contains translated strings
This part adds support to domain_conf.{c|h} for parsing the new
interface XML of type 'direct'. The parsed mode is now stored as
an int.
* src/conf/domain_conf.c src/conf/domain_conf.h: extend parsing code
* src/util/macvtap.h: empty header to not break compilation
This patch adds build support for libvirt checking for certain contents
of /usr/include/linux/if_link.h to see whether macvtap support is
compilable on that system. One can disable macvtap support in libvirt
via --without-macvtap passed to configure.
* configure.ac src/Makefile.am: new build support
* src/libvirt_macvtap.syms: list of exported symbols
* src/util/macvtap.c: empty module to not break compilation
The virRaiseError macro inside of virSecurityReportError expands to
virRaiseErrorFull and includes the __FILE__, __FUNCTION__ and __LINE__
information. But this three values are always the same for every call
to virSecurityReportError and do not reflect the actual error context.
Converting virSecurityReportError into a macro results in getting the
correct __FILE__, __FUNCTION__ and __LINE__ information.
Current PCI addresses are allocated at time of VM startup.
To make them truely persistent, it is neccessary to do this
at time of virDomainDefine/virDomainCreate. The code in
qemuStartVMDaemon still remains in order to cope with upgrades
from older libvirt releases
* src/qemu/qemu_driver.c: Rename existing qemuAssignPCIAddresses
to qemuDetectPCIAddresses. Add new qemuAssignPCIAddresses which
does auto-allocation upfront. Call qemuAssignPCIAddresses from
qemuDomainDefine and qemuDomainCreate to assign PCI addresses that
can then be persisted. Don't clear PCI addresses at shutdown if
they are intended to be persistent
The old text mode monitor prompts for a password when disks are
encrypted. This interactive approach doesn't work for JSON mode
monitor. Thus there is a new 'block_passwd' command that can be
used.
* src/qemu/qemu_driver.c: Split out code for looking up a disk
secret from findVolumeQcowPassphrase, into a new method
getVolumeQcowPassphrase. Enhance qemuInitPasswords() to also
set the disk encryption password via the monitor
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
support for the 'block_passwd' monitor command.
Since c26cb9234f, the dname
parameter has been ignored by these two functions. Use it.
* src/qemu/qemu_driver.c (qemudDomainMigratePrepareTunnel): Honor dname
parameter once again.
(qemudDomainMigratePrepare2): Likewise.
All other libvirt functions use array first and then number of elements
in that array. Let's make cpuDecode follow this rule.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
With QEMU >= 0.12 the host and guest side of disks no longer have
the same naming convention. Specifically the host side will now
get a 'drive-' prefix added to its name. The 'info blockstats'
monitor command returns the host side name, so it is neccessary
to strip this off when looking up stats since libvirt stores the
guest side name !
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Move 'drive-' prefix
string to a defined constant
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_text.c: Strip
off 'drive-' prefix (if found) when looking up disk stats
Currently the timeout for reading startup output is 3 seconds. If the
host is under any sort of load, we can easily trigger this. Lets bump
it to 30 seconds.
Since the polling loop checks to see if the process has died, we shouldn't
erroneously hit this timeout if qemu bombs (only if it is stuck in some
infinite loop).
Use the ATTRIBUTE_NONNULL annotation to mark some virConnectPtr
args as mandatory non-null so the compiler can warn of mistakes
* src/conf/domain_event.h: All virConnectPtr args must be non-null
* src/qemu/qemu_conf.h: qemudBuildCommandLine and
qemudNetworkIfaceConnect() must be given non-null connection
* tests/qemuxml2argvtest.c: Provide a non-null (dummy) connection to
qemudBuildCommandLine()
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in secret_conf.{h,c} and update all callers to
match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in interface_conf.{h,c} and update all callers to
match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in cpu_conf.{h,c} and update all callers to
match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in storage_conf.{h,c} and storage_encryption_conf.{h,c}
and update all callers to match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in node_device_conf.{h,c} and update all callers to
match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in network_conf.{h,c} and update all callers to
match
All callers now pass a NULL virConnectPtr into the USB/PCi device
iterator functions. Therefore the virConnectPtr arg can now be
removed from these functions
* src/util/hostusb.h, src/util/hostusb.c: Remove virConnectPtr
from usbDeviceFileIterate
* src/util/pci.c, src/util/pci.h: Remove virConnectPtr arg from
pciDeviceFileIterate
* src/qemu/qemu_security_dac.c, src/security/security_selinux.c: Update
to drop redundant virConnectPtr arg
The QEMU flags are commonly stored as a signed or unsigned int,
allowing only 31 flags. This limit is rather close, so to aid
future patches, change it to a 64-bit int
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h, src/qemu/qemu_driver.c,
tests/qemuargv2xmltest.c, tests/qemuhelptest.c, tests/qemuxml2argvtest.c:
Use 'unsigned long long' for QEMU flags
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in security_driver.{h,c} and update all callers to
match
The security driver was mistakenly initialized before the QEMU
config file was loaded. This prevents it being turned off again.
The capabilities XML was also getting the wrong security driver
name, due to the stacked driver arrangement.
* src/qemu/qemu_driver.c: Fix initialization order and capabilities
model name
* src/util/util.h (virAsprintf): Remove ATTRIBUTE_RETURN_CHECK, since
it is perfectly fine to ignore the return value, now that the pointer
is guaranteed to be set to NULL upon failure.
* src/util/storage_file.c (absolutePathFromBaseFile): Remove now-
unnecessary use of ignore_value.
* src/util/storage_file.c (absolutePathFromBaseFile): While this use
of virAsprintf is slightly cleaner than using stpncpy(stpcpy(...,
it does impose an artificial limitation on the length of the base_file
name. Rather than asserting that it does not exceed INT_MAX, return
NULL when it does.
When creating preallocated large raw files opening them with O_DSYNC
prevents long delays in reading because cache pages can be immediately
reused without writing them on a disk first.
virDomain{Attach,Detach}Device is now only permitted on active
domains. Explicitly state this restriction in the API
documentation.
V2: Only change doc, dropping the hunk that forced the restriction
in libvirt frontend.
When configured with --enable-gcc-warnings, it didn't even compile.
* src/util/storage_file.c: Include <assert.h>.
(absolutePathFromBaseFile): Assert that converting size_t to int is valid.
Reverse length/string args to match "%.*s".
Explicitly ignore the return value of virAsprintf.
* src/util/storage_file.c: Include "dirname.h".
(absolutePathFromBaseFile): Rewrite not to leak, and to require
fewer allocations.
* bootstrap (modules): Add dirname-lgpl.
* .gnulib: Update submodule to the latest.
When restoring from a saved guest image, the XML would already
contain the PCI slot ID of the IDE controller & video card.
The attempt to explicitly reserve this upfront would thus fail
everytime.
* src/qemu/qemu_conf.c: Reserve IDE controller / video card
slot at time of need, rather than upfront
Similar fix as previous one but for fork() usage when creating
a file or directory
* src/util/util.c: virLogLock() and virLogUnlock() around fork()
in virFileCreate() and virDirCreateSimple()
Ad pointed out by Dan Berrange:
So if some thread in libvirtd is currently executing a logging call,
while another thread calls virExec(), that other thread no longer
exists in the child, but its lock is never released. So when the
child then does virLogReset() it deadlocks.
The only way I see to address this, is for the parent process to call
virLogLock(), immediately before fork(), and then virLogUnlock()
afterwards in both parent & child. This will ensure that no other
thread
can be holding the lock across fork().
* src/util/logging.[ch] src/libvirt_private.syms: export virLogLock() and
virLogUnlock()
* src/util/util.c: lock just before forking and unlock just after - in
both parent and child.
The original udev node device backend neglected to lock the driverState
struct containing the device list when adding and removing devices
* src/node_device/node_device_udev.c: add necessary locks in
udevRemoveOneDevice() and udevAddOneDevice()
* src/xen/xs_internal.c (xenStoreDomainIntroduced): Don't use -1
as an allocation size upon xenStoreNumOfDomains failure.
(xenStoreDomainReleased): Likewise.
If the primary security driver (SELinux/AppArmour) was disabled
then the secondary QEMU DAC security driver was also disabled.
This is mistaken, because the latter must be active at all times
* src/qemu/qemu_driver.c: Ensure DAC driver is always active
* src/xen/xen_hypervisor.c: Remove all "domain == NULL" tests.
* src/xen/xen_hypervisor.h: Instead, use ATTRIBUTE_NONNULL to
mark each "domain" parameter as "known always to be non-NULL".
When attaching a USB host device based on vendor/product, libvirt
will resolve the vendor/product into a device/bus pair. This means
that when printing XML we should allow device/bus info to be printed
at any time if present
* src/conf/domain_conf.c, docs/schemas/domain.rng: Allow USB device
bus info alongside vendor/product
To allow devices to be hot(un-)plugged it is neccessary to ensure
they all have a unique device aliases. This fixes the hotplug
methods to assign device aliases before invoking the monitor
commands which need them
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Expose methods
for assigning device aliases for disks, host devices and
controllers
* src/qemu/qemu_driver.c: Assign device aliases when hotplugging
all types of device
* tests/qemuxml2argvdata/qemuxml2argv-hostdev-pci-address-device.args,
tests/qemuxml2argvdata/qemuxml2argv-hostdev-usb-address-device.args:
Update for changed hostdev naming scheme
This patch re-arranges the QEMU device alias assignment code to
make it easier to call into the same codeblock when performing
device hotplug. The new code has the ability to skip over already
assigned names to facilitate hotplug
* src/qemu/qemu_driver.c: Call qemuAssignDeviceNetAlias()
instead of qemuAssignNetNames
* src/qemu/qemu_conf.h: Export qemuAssignDeviceNetAlias()
instead of qemuAssignNetNames
* src/qemu/qemu_driver.c: Merge the legacy disk/network alias
assignment code into the main methods
The current way of assigning names to the host network backend and
NIC device in QEMU was over complicated, by varying naming scheme
based on the NIC model and backend type. This simplifies the naming
to simply be 'net0' and 'hostnet0', allowing code to easily determine
the host network name and vlan based off the primary device alias
name 'net0'. This in turn allows removal of alot of QEMU specific
code from the XML parser, and makes it easier to assign new unique
names for NICs that are hotplugged
* src/conf/domain_conf.c, src/conf/domain_conf.h: Remove hostnet_name
and vlan fields from virNetworkDefPtr
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h, src/qemu/qemu_driver.c:
Use a single network alias naming scheme regardless of NIC type
or backend type. Determine VLANs from the alias name.
* tests/qemuxml2argvdata/qemuxml2argv-net-eth-names.args,
tests/qemuxml2argvdata/qemuxml2argv-net-virtio-device.args,
tests/qemuxml2argvdata/qemuxml2argv-net-virtio-netdev.args: Update
for new simpler naming scheme
The QEMU 0.12.x tree has the -netdev command line argument, but not
corresponding monitor command. We can't enable the former, without
the latter since it will break hotplug/unplug.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Disable -netdev usage
until 0.13 at earliest
* tests/qemuxml2argvtest.c: Add test for -netdev syntax
* tests/qemuxml2argvdata/qemuxml2argv-net-virtio-netdev.args,
tests/qemuxml2argvdata/qemuxml2argv-net-virtio-netdev.xml: Test
data files for -netdev syntax
PCI disk, disk controllers, net devices and host devices need to
have PCI addresses assigned before they are hot-plugged
* src/qemu/qemu_conf.c: Add APIs for ensuring a device has an
address and releasing unused addresses
* src/qemu/qemu_driver.c: Ensure all devices have addresses
when hotplugging.
The current QEMU code allocates PCI addresses incrementally starting
at 4. This is not satisfactory because the user may have given some
addresses in their XML config, which need to be skipped over when
allocating addresses to remaining devices.
It is thus neccessary to maintain a list of already allocated PCI
addresses and then only allocate ones that remain unused. This is
also required for domain device hotplug to work properly later.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Add APIs for creating
list of existing PCI addresses, and allocating new addresses.
Refactor address assignment to use this code
* src/qemu/qemu_driver.c: Pull PCI address assignment up into the
qemuStartVMDaemon() method, as a prelude to moving it into the
'define' method. Update list of allocated addresses when connecting
to a running VM at daemon startup.
* tests/qemuxml2argvtest.c, tests/qemuargv2xmltest.c,
tests/qemuxml2xmltest.c: Remove USB product test since all
passthrough is done based on address
* tests/qemuxml2argvdata/qemuxml2argv-hostdev-usb-product.args,
tests/qemuxml2argvdata/qemuxml2argv-hostdev-usb-product.xml: Kil
unused data files
The virDomainDeviceInfoIterate() function will provide a
convenient way to iterate over all devices in a domain.
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/libvirt_private.syms: Add virDomainDeviceInfoIterate()
function.
Since QEMU startup uses the new -device argument, the hotplug
code needs todo the same. This converts disk, network and
host device hotplug to use the device_add command
* src/qemu/qemu_driver.c: Use new device_add monitor APIs
whereever possible
The way QEMU is started has been changed to use '-device' and
the new style '-drive' syntax. This needs to be mirrored in
the hotplug code, requiring addition of two new APIs.
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Define APIs
qemuMonitorAddDevice() and qemuMonitorAddDrive()
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Implement the new monitor APIs
To allow for better code reuse from hotplug methods, the code for
generating PCI/USB hostdev arg values is split out into separate
methods
* qemu/qemu_conf.h, qemu/qemu_conf.c: Introduce new APis for
qemuBuildPCIHostdevPCIDevStr, qemuBuildUSBHostdevUsbDevStr
and qemuBuildUSBHostdevDevStr
All the helper functions for building command line arguments
now return a 'char *', instead of acepting a 'char **' or
virBufferPtr argument
* qemu/qemu_conf.c: Standardize syntax for building args
* qemu/qemu_conf.h: Export all functions for building args
* qemu/qemu_driver.c: Update for changed syntax for building
NIC/hostnet args
udevGetUintProperty was called with base set to 0 for busnum and devnum.
With base 0 strtoul parses the number as octal if it start with a 0. But
busnum and devnum are decimal and udev returns them padded with leading
zeros. So strtoul parses them as octal. This works for certain decimal
values like 001-007, but fails for values like 008.
Change udevProcessUSBDevice to use base 10 for busnum and devnum.
* src/util/util.c (virGetUserID, virGetGroupID): In the unlikely event
that sysconf(_SC_GETPW_R_SIZE_MAX) fails, don't use -1 as the size in
the subsequent allocation.
Similar to the race fixed by
be34c3c7ef, make sure
to wait around for KVM to release the resources from
a hot-detached PCI device before attempting to
rebind that device to the host driver.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The QEMU driver contained code to generate a -device string for piix4-ide, but
wasn't using it. This change removes this string generation. It also adds a
comment explaining why IDE and FDC controllers don't generate -device strings.
The change also generates an error if a sata controller is specified for a QEMU
domain, as this isn't supported.
* src/qemu/qemu_conf.c: Remove VIR_DOMAIN_CONTROLLER_TYPE_IDE handler in
qemuBuildControllerDevStr(). Ignore IDE and FDC controllers. Error if
SATA controller is discovered. Add comments.
On RHEL-5 the qemu-kvm binary is located in /usr/libexec.
To reduce confusion for people trying to run upstream libvirt
on RHEL-5 machines, make the qemu driver look in /usr/libexec
for the qemu-kvm binary.
To make this work, I modified virFindFileInPath to handle an
absolute path correctly. I also ran into an issue where
NULL was sometimes being passed for the file parameter
to virFindFileInPath; it didn't crash prior to this patch
since it was building paths like /usr/bin/(null). This
is non-standard behavior, though, so I added a NULL
check at the beginning.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
* src/util/util.c (virGetUserEnt): In the unlikely event that
sysconf(_SC_GETPW_R_SIZE_MAX) fails, don't use -1 as the size in
the subsequent allocation.
xen-unstable c/s 20762 bumped XEN_SYSCTL_INTERFACE_VERSION to 7. The
interface change does not affect libvirt, other than xenHypervisorInit()
failing since version 7 is not tried.
The attached patch accommodates the upcoming Xen 4.0 release by checking
for XEN_SYSCTL_INTERFACE_VERSION 7. If found, it sets
XEN_DOMCTL_INTERFACE_VERSION to 6, which is also new to Xen 4.0.
it causes a NULL-dereference on some systems like Solaris 10.
* src/node_device/node_device_linux_sysfs.c. Include <stdlib.h>.
(get_sriov_function): Use canonicalize_file_name, not realpath.
* bootstrap (modules): Add canonicalize-lgpl.
This fixes a segfault when the event handler is called after shutdown
when the global driver state is NULL again.
Also fix a locking issue in an error path.
virFileMakePath is a recursive function that was creates a buffer
PATH_MAX bytes long for each recursion (one recursion for each element
in the path). This changes it to have no buffers on the stack, and to
allocate just one buffer total, no matter how many elements are in the
path. Because the modified algorithm requires a char* to be passed in
rather than const char *, it is now 2 functions - a toplevel API
function that remains identical in function, and a 2nd helper function
called for the recursions, which 1) doesn't allocate anything, and 2)
takes a char* arg, so it can modify the contents.
* src/util/util.c: rewrite virFileMakePath
This reverts commit cdc42d0a48.
As DanB pointed out, this patch is actually wrong. The real
bug that was causing me to see this problem is a bug
introduced in a RHEL-5 libvirt snapshot, and I'm going to
fix the real bug there.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
If you shutdown libvirtd while a domain with PCI
devices is running, then try to restart libvirtd,
libvirtd will crash.
This happens because qemuUpdateActivePciHostdevs() is calling
pciDeviceListSteal() with a dev of 0x0 (NULL), and then trying
to dereference it. This patch fixes it up so that
qemuUpdateActivePciHostdevs() steals the devices after first
Get()'ting them, avoiding the crash.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
* src/qemu/qemu_monitor_text.c (qemuMonitorTextAttachDrive): Most other
failures in this function would "goto cleanup", but one mistakenly
returned directly, skipping the cleanup and resulting in a leak.
In addition, iterating the "try_command" loop would clobber, and
thus leak, the "cmd" allocated on the first iteration,
so be careful to free it in addition to "reply" beforehand.
The KVM build of QEMU includs the thread ID of each vCPU in the
'query-cpus' output. This is required for pinning guests to
particular host CPUs
* src/qemu/qemu_monitor_json.c: Extract 'thread_id' from CPU info
* src/util/json.c, src/util/json.h: Declare returned strings
to be const
* src/qemu/qemu_monitor.c: Wire up JSON mode for qemuMonitorGetPtyPaths
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Fix
const correctness. Add missing error message in the function
qemuMonitorJSONGetAllPCIAddresses. Add implementation of the
qemuMonitorGetPtyPaths function calling 'query-chardev'.
Two files were using functions from <sys/stat.h> but not including
in. Most of the time they got this automatically via another header,
but certain build flag combinations can reveal the problem
* src/lxc/lxc_container.c, src/node_device/node_device_linux_sysfs.c:
Add <sys/stat.h>
The <console> tag is supposed to result in addition of a single
<serial> device for HVM guests. The 'targetType' attribute was
missing though causing the compatibility code to add a second
<console> device
* src/conf/domain_conf.c: Set targetType for serial device
When libvirtd shuts down, it places a <state/> tag in the XML
state file it writes out for guests with PCI passthrough
devices. For devices that are attached at bootup time, the
state tag is empty. However, at libvirtd startup time, it
ignores anything with a <state/> tag in the XML, effectively
hiding the guest.
This patch remove the check for VIR_DOMAIN_XML_INTERNAL_STATUS
when parsing the XML.
* src/conf/domain_conf.c: remove VIR_DOMAIN_XML_INTERNAL_STATUS
flag check in virDomainHostdevSubsysPciDefParseXML()
Certain hypervisors (like qemu/kvm) map the PCI bar(s) on
the host when doing device passthrough. This can lead to a race
condition where the hypervisor is still cleaning up the device while
libvirt is trying to re-attach it to the host device driver. To avoid
this situation, we look through /proc/iomem, and if the hypervisor is
still holding onto the bar (denoted by the string in the matcher variable),
then we can wait around a bit for that to clear up.
v2: Thanks to review by DV, make sure we wait the full timeout per-device
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The patches to add ACS checking to PCI device passthrough
introduced a bug. With the current code, if you try to
passthrough a device on the root bus (i.e. bus 0), then
it denies the passthrough. This is because the code in
pciDeviceIsBehindSwitchLackingACS() to check for a parent
device doesn't take into account the possibility of the
root bus. If we are on the root bus, it means we
legitimately can't find a parent, and it also means that
we don't have to worry about whether ACS is enabled.
Therefore return 0 (indicating we don't lack ACS) from
pciDeviceIsBehindSwitchLackingACS().
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Fix a small problem with the qemu memory stats parsing algorithm. If qemu
reports a stat that libvirt does not recognize, skip past it so parsing can
continue. This corrects a potential infinite loop in the parsing code that can
only be triggered if new statistics are added to qemu.
* src/qemu/qemu_monitor_text.c: qemuMonitorParseExtraBalloonInfo add a
skip for extra ','
The loop looking for the controller associated with a SCI drive had
an off by one, causing it to miss the last controller.
* src/qemu/qemu_driver.c: Fix off-by-1 in searching for SCSI
drive hotplug
The hotplug code in QEMU was leaking memory because although the
inner device object was being moved into the main virDomainDefPtr
config object, the outer container virDomainDeviceDefPtr was not.
* src/qemu/qemu_driver.c: Clarify code to show that the inner
device object is owned by the main domain config upon
successfull attach.
Add the ability to turn off dynamic management of file permissions
for libvirt guests.
* qemu/libvirtd_qemu.aug: Support 'dynamic_ownership' flag
* qemu/qemu.conf: Document 'dynamic_ownership' flag.
* qemu/qemu_conf.c: Load 'dynamic_ownership' flag
* qemu/test_libvirtd_qemu.aug: Test 'dynamic_ownership' flag
The hotplug code was not correctly invoking the security driver
in error paths. If a hotplug attempt failed, the device would
be left with VM permissions applied, rather than restored to the
original permissions. Also, a CDROM media that is ejected was
not restored to original permissions. Finally there was a bogus
call to set hostdev permissions in the hostdev unplug code
* qemu/qemu_driver.c: Fix security driver usage in hotplug/unplug
If there is a problem with VM startup, PCI devices may be left
assigned to pci-stub / pci-back. Adding a call to reattach
host devices in the cleanup path is required.
* qemu/qemu_driver.c: qemuDomainReAttachHostDevices() when
VM startup fails
Remove all the QEMU driver calls for setting file ownership and
process uid/gid. Instead wire in the QEMU DAC security driver,
stacking it ontop of the primary SELinux/AppArmour driver.
* qemu/qemu_driver.c: Switch over to new DAC security driver
This new security driver is responsible for managing UID/GID changes
to the QEMU process, and any files/disks/devices assigned to it.
* qemu/qemu_conf.h: Add flag for disabling automatic file permission
changes
* qemu/qemu_security_dac.h, qemu/qemu_security_dac.c: New DAC driver
for QEMU guests
* Makefile.am: Add new files
Pulling the disk labelling code out of the exec hook, and into
libvirtd will allow it to access shared state in the daemon. It
will also make debugging & error reporting easier / more reliable.
* qemu/qemu_driver.c: Move initial disk labelling calls up into
libvirtd. Add cleanup of disk labels upon failure
If a VM fails to start, we can't simply free the security label
strings, we must call the domainReleaseSecurityLabel() method
otherwise the reserved 'mcs' level will be leaked in SElinux
* src/qemu/qemu_driver.c: Invoke domainReleaseSecurityLabel()
when domain fails to start
The current security driver architecture has the following
split of logic
* domainGenSecurityLabel
Allocate the unique label for the domain about to be started
* domainGetSecurityLabel
Retrieve the current live security label for a process
* domainSetSecurityLabel
Apply the previously allocated label to the current process
Setup all disk image / device labelling
* domainRestoreSecurityLabel
Restore the original disk image / device labelling.
Release the unique label for the domain
The 'domainSetSecurityLabel' method is special because it runs
in the context of the child process between the fork + exec.
This is require in order to set the process label. It is not
required in order to label disks/devices though. Having the
disk labelling code run in the child process limits what it
can do.
In particularly libvirtd would like to remember the current
disk image label, and only change shared image labels for the
first VM to start. This requires use & update of global state
in the libvirtd daemon, and thus cannot run in the child
process context.
The solution is to split domainSetSecurityLabel into two parts,
one applies process label, and the other handles disk image
labelling. At the same time domainRestoreSecurityLabel is
similarly split, just so that it matches the style. Thus the
previous 4 methods are replaced by the following 6 new methods
* domainGenSecurityLabel
Allocate the unique label for the domain about to be started
No actual change here.
* domainReleaseSecurityLabel
Release the unique label for the domain
* domainGetSecurityProcessLabel
Retrieve the current live security label for a process
Merely renamed for clarity.
* domainSetSecurityProcessLabel
Apply the previously allocated label to the current process
* domainRestoreSecurityAllLabel
Restore the original disk image / device labelling.
* domainSetSecurityAllLabel
Setup all disk image / device labelling
The SELinux and AppArmour drivers are then updated to comply with
this new spec. Notice that the AppArmour driver was actually a
little different. It was creating its profile for the disk image
and device labels in the 'domainGenSecurityLabel' method, where as
the SELinux driver did it in 'domainSetSecurityLabel'. With the
new method split, we can have consistency, with both drivers doing
that in the domainSetSecurityAllLabel method.
NB, the AppArmour changes here haven't been compiled so may not
build.
The QEMU driver is doing 90% of the calls to check for static vs
dynamic labelling. Except it is forgetting todo so in many places,
in particular hotplug is mistakenly assigning disk labels. Move
all this logic into the security drivers themselves, so the HV
drivers don't have to think about it.
* src/security/security_driver.h: Add virDomainObjPtr parameter
to virSecurityDomainRestoreHostdevLabel and to
virSecurityDomainRestoreSavedStateLabel
* src/security/security_selinux.c, src/security/security_apparmor.c:
Add explicit checks for VIR_DOMAIN_SECLABEL_STATIC and skip all
chcon() code in those cases
* src/qemu/qemu_driver.c: Remove all checks for VIR_DOMAIN_SECLABEL_STATIC
or VIR_DOMAIN_SECLABEL_DYNAMIC. Add missing checks for possibly NULL
driver entry points.
Allows the initiator to use a variety of IQNs rather than just the
system IQN when creating iSCSI pools.
* docs/schemas/storagepool.rng: extends the syntax with <iqn name="..."/>
* src/conf/storage_conf.[ch]: read and stores the iqn name
* src/storage/storage_backend_iscsi.[ch]: implement the IQN selection
when detected
* src/lxc/lxc_container.c src/lxc/lxc_controller.c src/lxc/lxc_driver.c
src/network/bridge_driver.c src/qemu/qemu_driver.c
src/uml/uml_driver.c: virFileMakePath returns 0 for success, or the
value of errno on failure, so error checking should be to test
if non-zero, not if lower than 0
Previously the uid/gid/mode in the xml was ignored when creating new
storage pool directories. This commit attempts to honor the requested
permissions, and spits out an error if it can't.
Note that when creating the directory, the rest of the path leading up
to the final element is created using current uid/gid/mode, and the
final element gets the settings from xml. It is NOT an error for the
directory to already exist; in this case, the perms for the existing
directory are just set (if necessary).
* src/storage/storage_backend_fs.c: update the virStorageBackendFileSystemBuild
function to check the directory hierarchy separately then create the
leaf directory with the right attributes
In order to avoid problems trying to chown files that were created by
root on a root-squashing nfs server, fork a new process that setuid's
to the desired uid before creating the file. (It's only done this way
if the pool containing the new volume is of type 'netfs', otherwise
the old method of creating the file followed by chown() is used.)
This changes the semantics of the "create_func" slightly - previously
it was assumed that this function just created the file, then the
caller would chown it to the desired uid. Now, create_func does both
operations.
There are multiple functions that can take on the role of create_func:
createFileDir - previously called mkdir(), now calls virDirCreate().
virStorageBackendCreateRaw - previously called open(),
now calls virFileCreate().
virStorageBackendCreateQemuImg - use virRunWithHook() to setuid/gid.
virStorageBackendCreateQcowCreate - same.
virStorageBackendCreateBlockFrom - preserve old behavior (but attempt
chown when necessary even if not root)
* src/storage/storage_backend.[ch] src/storage/storage_backend_disk.c
src/storage/storage_backend_fs.c src/storage/storage_backend_logical.c
src/storage/storage_driver.c: change the create_func implementations,
also propagate the pool information to be able to detect NETFS ones.
These functions create a new file or directory with the given
uid/gid. If the flag VIR_FILE_CREATE_AS_UID is given, they do this by
forking a new process, calling setuid/setgid in the new process, and
then creating the file. This is better than simply calling open then
fchown, because in the latter case, a root-squashing nfs server would
create the new file as user nobody, then refuse to allow fchown.
If VIR_FILE_CREATE_AS_UID is not specified, the simpler tactic of
creating the file/dir, then chowning is is used. This gives better
results in cases where the parent directory isn't on a root-squashing
NFS server, but doesn't give permission for the specified uid/gid to
create files. (Note that if the fork/setuid method fails to create the
file due to access privileges, the parent process will make a second
attempt using this simpler method.)
If the bit VIR_FILE_CREATE_ALLOW_EXIST is set in the flags, an
existing file/directory will not cause an error; in this case, the
function will simply set the permissions of the file/directory to
those requested. If VIR_FILE_CREATE_ALLOW_EXIST is not specified, an
existing file/directory is considered (and reported as) an error.
Return from both of these functions is 0 on success, or the value of
errno if there was a failure.
* src/util/util.[ch]: add the 2 new util functions
The test expected all environment variables copied in qemudBuildCommandLine
to have known values. So all of them have to be either set to a known value
or be unset. SDL_VIDEODRIVER and QEMU_AUDIO_DRV are not handled at all but
should be handled. Unset both, otherwise the test will fail if they are set
in the testing environment.
* src/qemu/qemu_conf.c: add a comment about copied environment variables
and qemuxml2argvtest
* tests/qemuxml2argvtest.c: unset SDL_VIDEODRIVER and QEMU_AUDIO_DRV
* src/conf/domain_conf.c (virDomainChrDefFormat): Plug a leak on
an error path, and at the same time, eliminate the need for a
"cleanup:" block. Before, the "return -1" after the switch
would leak an "addr" string. Now, by reversing the port,addr-
getting blocks we can free "addr" immediately and skip the goto.
The 'int virInterfaceIsActive()' method was directly returning the
value of the 'int active:1' bitfield in virIntefaceDefPtr. A bitfield
with a signed integer, will hold the values 0 and -1, not 0 and +1
as might be expected. This meant that virInterfaceIsActive() was
always returning -1 when the interface was active, not +1 & thus all
callers thought an error had occurred. To protect against this kind
of mistake again, change all bitfields to be unsigned ints
* daemon/libvirtd.h, src/conf/domain_conf.h, src/conf/interface_conf.h,
src/conf/network_conf.h: Change bitfields to unsigned int.
Invoking the virConnectGetCapabilities() method causes the QEMU
driver to rebuild its internal capabilities object. Unfortunately
it was forgetting to register the custom domain status XML hooks
again.
To avoid this kind of error in the future, the code which builds
capabilities is refactored into one single method, which can be
called from all locations, ensuring reliable rebuilds.
* src/qemu/qemu_driver.c: Fix rebuilding of capabilities XML and
guarentee it is always consistent
* src/util/logging.c (virLogMessage): Include "ignore-value.h".
Use it to ignore the return value of safewrite.
Use STDERR_FILENO, rather than "2".
* bootstrap (modules): Add ignore-value.
* gnulib: Update to latest, for ignore-value that is now LGPLv2+.
This was accomplished in xml parsing by doing away with the
stripped-down virInterfaceBareDef object, and just always using
virInterfaceDef, but with restrictions in certain places (eg, the type
of subordinate interface allowed in parsing depends on the parent
interface).
xml formatting was similarly adjusted. In addition, the formatting
functions keep track of the level of interface nesting, and insert
extra leading spaces on each line accordingly (using %*s).
The only change in formatted xml from previous (aside frmo supporting
new combinations of interface types) is that the subordinate ethernet
interfaces take up 2 lines rather than one, eg:
<interface type='ethernet' name='eth0'>
</interface>
instead of:
<interface type='ethernet' name='eth0'/>
I noticed some debug messages are printed with an empty lines after
them. This patch removes these empty lines from all invocations of the
following macros:
VIR_DEBUG
VIR_DEBUG0
VIR_ERROR
VIR_ERROR0
VIR_INFO
VIR_WARN
VIR_WARN0
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
New pciDeviceIsAssignable() function for checking whether a given PCI
device can be assigned to a guest was added. Currently it only checks
for ACS being enabled on all PCIe switches between root and the PCI
device. In the future, it could be the right place to check whether a
device is unbound or bound to a stub driver.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Currently CPU topology may only be specified together with CPU model:
<cpu match='exact'>
<model>name</model>
<topology sockets='1' cores='2' threads='3'/>
</cpu>
This patch allows for CPU topology specification without the need for
also specifying CPU model:
<cpu>
<topology sockets='1' cores='2' threads='3'/>
</cpu>
'match' attribute and 'model' element are made optional with the
restriction that 'match' attribute has to be set when 'model' is
present.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When comparing x86 CPUs, features with 'disabled' policy were mistakenly
required to be supported by the host CPU.
Likewise, features with 'force' policy which were supported by host CPU
would make CPUs incompatible if 'strict' match was used by guest CPU.
This patch fixes both issues.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
QEMU's command line equivalent for the following domain XML fragment
<vcpus>2</vcpus>
<cpu ...>
...
<topology sockets='1' cores='2', threads='1'/>
</cpu>
is
-smp 2,sockets=1,cores=2,threads=1
This syntax was introduced in QEMU-0.12.
Version 2 changes:
- -smp argument build split into a separate function
- always add ",sockets=S,cores=C,threads=T" to -smp if qemu supports it
- use qemuParseCommandLineKeywords for command line parsing
Version 3 changes:
- ADD_ARG_LIT => ADD_ARG and line reordering in qemudBuildCommandLine
- rebased
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Current version expects name=value,... list and when an incorrect string
such as "a,b,c=d" would be parsed as "a,b,c" keyword with "d" value
without reporting any error, which is probably not the expected
behavior.
This patch adds an extra argument called allowEmptyValue, which if
non-zero will permit keywords with no value; "a,b=c,,d=" will be parsed
as follows:
keyword value
"a" NULL
"b" "c"
"" NULL
"d" ""
In case allowEmptyValue is zero, the string is required to contain
name=value pairs only; retvalues is guaranteed to contain non-NULL
pointers. Now, "a,b,c=d" will result in an error.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Replace
-balloon virtio
With
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3
This allows it to get correct assigned PCI address as declared in
previous patch
* src/qemu/qemu_conf.c: Convert Virtio ballon to -device and
give it an explicit PCI address
* tests/qemuxml2argvdata/qemuxml2argv-*args: Add in virtio balloon
where appropriate
Instead of relying on QEMU to assign PCI addresses and then querying
them with 'info pci', manually assign all PCI addresses before starting
the guest. These addresses are not stable across reboots. That will
come in a later patch
NB, the PIIX3 (IDE, FDC, ISA-Bridge) will always have slot 1 and
VGA will always have slot 2. We declare the Virtio Balloon gets
slot 3, and then all remaining slots are for configured devices.
* src/qemu/qemu_conf.c: If -device is supported, then assign all PCI
addresses when building the command line
* src/qemu/qemu_driver.c: Don't query monitor for PCI addresses if
they have already been assigned
* tests/qemuxml2argvdata/qemuxml2argv-hostdev-pci-address-device.args,
tests/qemuxml2argvdata/qemuxml2argv-net-virtio-device.args,
tests/qemuxml2argvdata/qemuxml2argv-sound-device.args,
tests/qemuxml2argvdata/qemuxml2argv-watchdog-device.args: Update
to include PCI slot/bus information
QEMU always configures a VGA card. If no video card is included in
the libvirt XML, it is neccessary to explicitly turn off the default
using -vga none
* src/qemu/qemu_conf.c: Pass -vga none if no video card is configured
* tests/qemuargv2xmltest.c, tests/qemuxml2argvtest.c: Test for
handling -vga none.
* tests/qemuxml2argvdata/qemuxml2argv-nographics-vga.args,
tests/qemuxml2argvdata/qemuxml2argv-nographics-vga.xml: Test
data files
Not all QEMU builds default to SDL graphics for their display.
Newer QEMU now has an explicit -sdl flag, which we can use to
explicitly request SDL intead of relying on the default. This
protects libvirt against unexpected changes in graphics default
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Probe for -sdl
flag and use it if it is found
* tests/qemuhelptest.c: Add SDL flag to tests
The old syntax was
-chardev SOMECONFIG
-nic user,guestfwd=tcp:IP:PORT-chardev:CHARDEV
The new syntax is
-chardev SOMECONFIG
-netdev user,guestfwd=tcp:IP:PORT,chardev=ID,id=user-ID
The old syntax was
-usbdevice host:PRODUCT:VENDOR
Or
-usbdevice host:BUS.DEV
The new syntax is
-device usb-host,product=PRODUCT,vendor=VENDOR
Or
-device usb-host,hostbus=BUS,hostaddr=DEV
The previous syntax was severely limited in its options
-usbdevice disk:/home/berrange/output.img
The new syntax is the same as for other disk types
-drive file=/home/berrange/output.img,if=none,id=usb-1,index=1
-device usb-storage,drive=usb-1
Again, the index= arg is wrong here, and will be removed in a
later merge
The current syntax uses a pair of args
-net nic,macaddr=52:54:00:56:6c:55,vlan=3,model=pcnet,name=pcnet.0
-net user,vlan=3,name=user.0
The new syntax does not need the vlan craziness anymore, and
so has a simplified pair of args
-netdev user,id=user.0
-device pcnet,netdev=user.0,id=pcnet.0,mac=52:54:00:56:6c:55,addr=<PCI SLOT>
The current preferred syntax for disk drives uses
-drive file=/vms/plain.qcow,if=virtio,index=0,boot=on,format=qcow
The new syntax splits this up into a pair of linked args
-drive file=/vms/plain.qcow,if=none,id=drive-virtio-0,format=qcow2
-device virtio-blk-pci,drive=drive-virtio-0,id=virtio-0,addr=<PCI SLOT>
SCSI/IDE devices also get a bus property linking them to the
controller
-device scsi-disk,drive=drive-scsi0-0-0,id=scsi0-0-0,bus=scsi0.0,scsi-id=0
-device ide-drive,drive=drive-ide0-0-0,id=ide0-0-0,bus=ide0,unit=0
The current syntax for audio devices is a horrible multiplexed
arg
-soundhw sb16,pcspk,ac97
The new syntax is
-device sb16,id=sound0
or
-device AC97,id=sound1,addr=<PCI SLOT>
NB, pcspk still uses the old -soundhw syntax
The current character device syntax uses either
-serial tty,path=/dev/ttyS2
Or
-chardev tty,id=serial0,path=/dev/ttyS2 -serial chardev:serial0
With the new -device support, we now prefer
-chardev file,id=serial0,path=/tmp/serial.log -device isa-serial,chardev=serial0
This patch changes the existing -chardev syntax to use this new
scheme, and fallbacks to the old plain -serial syntax for old
QEMU.
The monitor device changes to
-chardev socket,id=monitor,path=/tmp/test-monitor,server,nowait -mon chardev=monitor
In addition, this patch adds --nodefaults, which kills off the
default serial, parallel, vga and nic devices. THis avoids the
need for us to explicitly turn each off
When starting a guest, give every device a unique alias. This will
be used for the 'id' parameter in -device args in later patches.
It can also be used to uniquely identify devices in the monitor
For old QEMU without -device, assign disk names based on QEMU's
historical naming scheme.
* src/qemu/qemu_conf.c: Assign unique device aliases
* src/qemu/qemu_driver.c: Remove obsolete qemudDiskDeviceName
and use the device alias in eject & blockstats commands
* src/storage/storage_backend_fs.c (virStorageBackendFileSystemRefresh):
Correct parentheses. The documented intent is to ignore non-regular
files, yet due to a parenthesization error all errors were handled
that way.
Probe for the new -device flag and if available set the -nodefaults
flag, instead of using -net none, -serial none or -parallel none.
Other device types will be converted to use -device in later patches.
The -nodefaults flag will help avoid unwelcome surprises from future
QEMU releases
* src/qemu/qemu_conf.c: Probe for -device. Add -nodefaults flag.
Remove -net none, -serial none or -parallel none
* src/qemu/qemu_conf.h: Define QEMU_CMD_FLAG_DEVICE
* tests/qemuhelpdata/qemu-0.12.1: New data file for 0.12.1 QEMU
* tests/qemuhelptest.c: Test feature extraction from 0.12.1 QEMU
Although the serial, parallel, chanel, input & fs devices do
not have PCI address info, they can all have device aliases.
Thus it neccessary to associate the virDomainDeviceInfo data
with them all.
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add hooks for
parsing / formatting device info for serial, parallel, channel
input and fs devices.
* docs/schemas/domain.rng: Associate device info with character
devices, input & fs device
This patch introduces the support for giving all devices a short,
unique name, henceforth known as a 'device alias'. These aliases
are not set by the end user, instead being assigned by the hypervisor
if it decides it want to support this concept.
The QEMU driver sets them whenever using the -device arg syntax
and uses them for improved hotplug/hotunplug. it is the intent
that other APIs (block / interface stats & device hotplug) be
able to accept device alias names in the future.
The XML syntax is
<alias name="video0"/>
This may appear in any type of device that supports device info.
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add a 'alias'
field to virDomainDeviceInfo struct & parse/format it in XML
* src/libvirt_private.syms: Export virDomainDefClearDeviceAliases
* src/qemu/qemu_conf.c: Replace use of "nic_name" field with the
standard device alias
* src/qemu/qemu_driver.c: Clear device aliases at shutdown
The PCI device addresses are only valid while the VM is running,
since they are auto-assigned by QEMU. After shutdown they must
all be cleared. Future QEMU driver enhancement will allow for
persistent PCI address assignment
* src/conf/domain_conf.h, src/conf/domain_conf.c, src/libvirt_private.syms
Add virDomainDefClearPCIAddresses() method for wiping out auto assigned
PCI addresses
* src/qemu/qemu_driver.c: Clear PCI addresses at VM shutdown
Existing applications using libvirt are not aware of the disk
controller concept. Thus, after parsing the <disk> definitions
in the XML, it is neccessary to create <controller> elements
to satisfy all requested disks, as per their defined drive
addresses
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/libvirt_private.syms: Add virDomainDefAddDiskControllers()
method for populating disk controllers, and call it after
parsing disk definitions.
* src/qemu/qemu_conf.c: Call virDomainDefAddDiskControllers()
when doing ARGV -> XML conversion
* tests/qemuxml2argvdata/qemuxml2argv*.xml: Add disk controller
data to all data files which don't have it already
It is perfectly acceptable to have multiple sound devices of
same type in guest configuration. If the underlying hypervisor
does not like this, it is its job to complain, not the XML
parser's
* src/conf/domain_conf.c: Remove hack which deleted duplicated
sound device models.
* tests/xml2sexprdata/xml2sexpr-fv-sound.xml: Remove duplicate
models
Hotunplug of devices requires that we know their PCI address. Even
hotplug of SCSI drives, required that we know the PCI address of
the SCSI controller to attach the drive to. We can find this out
by running 'info pci' and then correlating the vendor/product IDs
with the devices we booted with.
Although this approach is somewhat fragile, it is the only viable
option with QEMU < 0.12, since there is no way for libvirto set
explicit PCI addresses when creating devices in the first place.
For QEMU > 0.12, this code will not be used.
* src/qemu/qemu_driver.c: Assign all dynamic PCI addresses on
startup of QEMU VM, matching vendor/product IDs
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
API for fetching PCI device address mapping
The current SCSI hotplug support attaches a brand new SCSI controller
for every disk. This is broken because the semantics differ from those
used when starting the VM initially. In the latter case, each SCSI
controller is filled before a new one is added.
If the user specifies an high drive index (sdazz) then at initial
startup, many intermediate SCSI controllers may be added with no
drives.
This patch changes SCSI hotplug so that it exactly matches the
behaviour of initial startup. First the SCSI controller number is
determined for the drive to be hotplugged. If any controller upto
and including that controller number is not yet present, it is
attached. Then finally the drive is attached to the last controller.
NB, this breaks SCSI hotunplug, because there is no 'drive_del'
command in current QEMU. Previous SCSI hotunplug was broken in
any case because it was unplugging the entire controller, not
just the drive in question.
A future QEMU will allow proper SCSI hotunplug of a drive.
This patch is derived from work done by Wolfgang Mauerer on disk
controllers.
* src/qemu/qemu_driver.c: Fix SCSI hotplug to add a drive to
the correct controller, instead of just attaching a new
controller.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
support for 'drive_add' command
This patch allows for explicit hotplug/unplug of SCSI controllers.
Ordinarily this is not required, since QEMU/libvirt will attach
a new SCSI controller whenever one is required. Allowing explicit
hotplug of controllers though, enables the caller to specify a
static PCI address, instead of auto-assigning the next available
PCI slot. Or it will when we have static PCI addressing.
This patch is derived from Wolfgang Mauerer's disk controller
patch series.
* src/qemu/qemu_driver.c: Support hotplug & unplug of SCSI
controllers
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
new API for attaching PCI SCSI controllers
The latter is not officially "wrong", but *is* terribly anachronistic.
I think automake documentation or comments call that syntax obsolescent.
* cfg.mk (_makefile_at_at_check_exceptions): Exempt @SCHEMADIR@
and @SYSCONFDIR@ uses -- there are no Makefile variables for those.
* docs/Makefile.am: Use $(INSTALL), not @INSTALL@.
* examples/dominfo/Makefile.am: Similar.
* examples/domsuspend/Makefile.am: Similar.
* proxy/Makefile.am: Similar.
* python/Makefile.am: Similar.
* python/tests/Makefile.am: Similar.
* src/Makefile.am: Similar.
* tests/Makefile.am: Similar.
Until recently, some gnulib-generated replacement headers
included *other* headers that were not strictly necessary,
thus masking the need in this file for an explicit <stdlib.h>.
* src/util/util.c: Include <stdlib.h> for declarations of e.g.,
strtol, random_r, getenv, etc.
Current implementation of x86Decode() used for CPUID -> model+features
translation does not always select the closest CPU model. When walking
through all models from cpu_map.xml the function considers a new
candidate as a better choice than a previously selected candidate only
if the new one is a superset of the old one. In case the new candidate
is closer to host CPU but lacks some feature comparing to the old
candidate, the function does not choose well.
This patch changes the algorithm so that the closest model is always
selected. That is, the model which requires the lowest number of
additional features to describe host CPU.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemudFindCharDevicePTYsMonitor reports an error if 'info chardev' didn't
provide information for a requested device, even if the log output parsing
had found the pty path for that device. This makes pty assignment fail for
older QEMU/KVM versions. For example KVM 72 on Debian doesn't support
'info chardev', so qemuMonitorTextGetPtyPaths cannot parse any useful
information and the hash for device-id-to-pty-path mapping stays empty.
Make qemudFindCharDevicePTYsMonitor report an error only if the log output
parsing and the 'info chardev' parsing failed to provide the pty path.
* src/conf/domain_conf.c: add defaults for the video device
* src/esx/esx_vmx.[ch]: add VNC support to the VMX handling
* tests/vmx2xmltest.c, tests/xml2vmxtest.c: add tests for the VNC support
The current code for using -drive simply sets the -drive 'index'
parameter. QEMU internally converts this to bus/unit depending
on the type of drive. This does not give us precise control over
the bus/unit assignment though. This change switches over to make
libvirt explicitly calculate the bus/unit number.
In addition bus/unit/index are actually irrelevant for VirtIO
disks, since each virtio disk is a separate PCI device. No disk
controller is involved.
Doing the conversion to bus/unit in libvirt allows us to correctly
attach SCSI controllers when required.
* src/qemu/qemu_conf.c: Specify bus/unit instead of index for
disks
* tests/qemuxml2argvdata/qemuxml2argv-disk*.args: Switch over from
using index=NNNN, to bus=NN, unit=NN for SCSI/IDE/Floppy disks
To enable it to be called from multiple locations, split out
the code for building the -drive arg string. This will be needed
by later patches which do drive hotplug, the conversion to use
-device, and the conversion to controller/bus/unit addressing
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Add qemuBuildDriveStr
for building -drive arg string
Convert the QEMU monitor APIs over to use virDomainDeviceAddress
structs for passing addresses in/out, instead of individual bits.
This makes the number of parameters smaller & easier to deal with.
No functional change
* src/qemu/qemu_driver.c, src/qemu/qemu_monitor.c,
src/qemu/qemu_monitor.h, src/qemu/qemu_monitor_text.c,
src/qemu/qemu_monitor_text.h: Change monitor hotplug APIs to
take an explicit address ptr for all host/guest addresses
This augments virDomainDevice with a <controller> element
that is used to represent disk controllers (e.g., scsi
controllers). The XML format is given by
<controller type="scsi" index="<num>">
<address type="pci" domain="0xNUM" bus="0xNUM" slot="0xNUM"/>
</controller>
where type denotes the disk interface (scsi, ide,...), index
is an integer that identifies the controller for association
with disks, and the <address> element specifies the controller
address on the PCI bus as described in previous commits
The address element can be omitted; in this case, an address
will be assigned automatically.
Most of the code in this patch is from Wolfgang Mauerer's
previous disk controller series
* docs/schemas/domain.rng: Define syntax for <controller>
XML element
* src/conf/domain_conf.c, src/conf/domain_conf.h: Define
virDomainControllerDef struct, and routines for parsing
and formatting XML
* src/libvirt_private.syms: Add virDomainControllerInsert
and virDomainControllerDefFree
When parsing the <disk> element specification, if no <address>
is provided for the disk, then automatically assign one based on
the <target dev='sdXX'/> device name. This provides for backwards
compatability with existing applications using libvirt, while also
allowing new apps to have complete fine grained control.
* src/conf/domain_conf.h, src/conf/domain_conf.c,
src/libvirt_private.syms: Add virDomainDiskDefAssignAddress()
for assigning a controller/bus/unit address based on disk target
* src/qemu/qemu_conf.c: Call virDomainDiskDefAssignAddress() after
generating XML from ARGV
* tests/qemuxml2argvdata/*.xml: Add in drive address information
to all XML files
Add the virDomainDeviceAddress information to the sound, video
and watchdog devices. This means all of them gain the new XML
element
<address .... />
This brings them upto par with disk/net/hostdev devices which
already have address info
* src/conf/domain_conf.h: Add virDomainDeviceAddress to sound,
video & watchdog device struts.
* src/conf/domain_conf.c: Hook up parsing/formatting for
virDomainDeviceAddress in sound, video & watchdog devices
* docs/schemas/domain.rng: Associate device address info
with sound, video & watchdog
Introduce a new structure
struct _virDomainDeviceDriveAddress {
unsigned int controller;
unsigned int bus;
unsigned int unit;
};
and plug that into virDomainDeviceAddress and generates XML that
looks like
<address type='drive' controller='1' bus='0' unit='5'/>
This syntax will be used by the QEMU driver to explicitly control
how drives are attached to the bus
* src/conf/domain_conf.h, src/conf/domain_conf.c: Parsing and
formatting of drive addresses
* docs/schemas/domain.rng: Define new address format for drives
All guest devices now use a common device address structure
summarized by:
enum virDomainDeviceAddressType {
VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE,
VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI,
};
struct _virDomainDevicePCIAddress {
unsigned int domain;
unsigned int bus;
unsigned int slot;
unsigned int function;
};
struct _virDomainDeviceInfo {
int type;
union {
virDomainDevicePCIAddress pci;
} addr;
};
This replaces the anonymous structs in Disk/Net/Hostdev data
structures. Where available, the address is *always* printed
in the XML file, instead of being hidden in the internal state
file.
<address type='pci' domain='0x0000' bus='0x1e' slot='0x07' function='0x0'/>
The structure definition is based on Wolfgang Mauerer's disk
controller patch series.
* docs/schemas/domain.rng: Define the <address> syntax and
associate it with disk/net/hostdev devices
* src/conf/domain_conf.h, src/conf/domain_conf.c,
src/libvirt_private.syms: APIs for parsing/formatting address
information. Also remove the QEMU specific 'pci_addr' attributes
* src/qemu/qemu_driver.c: Replace use of 'pci_addr' attrs with
new standardized format.
With the introduction virDispatchError, hook function errors are
never sent through the error callback, so users will never see
these messages.
Fix this by calling virDispatchError after hook failure.
Based off how QEMU does it, look through /sys/bus/usb/devices/* for
matching vendor:product info, and if found, use info from the surrounding
files to build the device's /dev/bus/usb path.
This fixes USB device assignment by vendor:product when running qemu
as non-root (well, it should, but for some reason I couldn't reproduce
the failure people are seeing in [1], but it appears to work properly)
[1] https://bugzilla.redhat.com/show_bug.cgi?id=542450
udev doesn't prefix USB product/vendor info with '0x', so the
strtol conversions were wrong for the product field (vendor already
set the correct base). Make the change for PCI product/vendor as
well to be safe.
This fixes USB device assignment via virt-manager.
This allows debug statements and raised errors in hook functions to
actually be logged somewhere (stderr). Users can enable debugging in the
daemon and now see more info in /var/log/libvirt/...
Upstream xen has changed parameters to the migration operation
several times over the past 18 months. Changeset 17553 removed
the resouce parameter, Changesets 17709, 17753, and 20326 added
ssl, node, and change_home_server parameters respectively.
Fortunately, testing has revealed that xend will fail the
operation if a parameter is missing but happily honor it if
unknown parameters are provided. Thus all currently supported
parameters can be provided, satisfying current xend but not
regressing older versions.
The virRaiseErrorFull() may invoke the error handler callback
functions an application has registered. This is not good
because the connection object may not be available at this
point, and the caller may be holding locks. This creates a
problem if the error handler calls back into libvirt.
The solutuon is to move invocation of the handler into the
final cleanup code in the public API entry points, where it
is guarenteed to have safe state.
* src/libvirt.c: Invoke virDispatchError() in all error paths
* src/util/virterror.c: Remove virSetConnError/virSetGlobalError,
replacing with virDispatchError(). Move invocation of the
error callbacks into virDispatchError() instead of the
virRaiseErrorFull function which is not in a safe context
qemudWaitForMonitor calls qemudReadLogOutput with qemudFindCharDevicePTYs
as callback. qemudFindCharDevicePTYs calls qemudExtractTTYPath to assign
a string to chr->data.file.path. Afterwards qemudWaitForMonitor may call
qemudFindCharDevicePTYsMonitor that overwrites chr->data.file.path without
freeing the old value. This results in leaking the memory allocated by
qemudExtractTTYPath.
Report an OOM error if the strdup in qemudFindCharDevicePTYsMonitor fails.
Only use pseudo-random generator for uuid if using /dev/random fails.
* src/util/uuid.c: The original code. would only print the warning
message if using /dev/random failed, but would still go ahead and call
virUUIDGeneratePseudoRandomBytes in all cases anyway.
Currently only the faultcode and faultstring are deserialized, the
detail part is ignored. The implementation of many new SOAP types
would be necessary to deserialize the detail part correctly. As an
intermediate solution the raw response is dumped to the debug log.
The -mem-prealloc flag should be used when using large pages
This ensures qemu tries to allocate all required memory immediately,
rather than when first used. The latter mode will crash qemu
if hugepages aren't available when accessed, while the former
should gracefully fallback to non-hugepages.
* src/qemu/qemu_conf.c: add -mem-prealloc flag to qemu command line
when using large pages
xen-unstable c/s 20685 changed the domctl interface, adding a field to
xen_domctl_getdomaininfo structure. This additional field causes stack
corruption in libvirt. xen-unstable c/s 20711 rightly bumped the domctl
interface version so it is at least possible to handle the new field.
This change accounts for shr_pages field added to xen_domctl_getdomaininfo
structure.
The MAC addresses with 00:50:56 prefix are split into several ranges:
00:50:56:00:00:00 - 00:50:56:3f:ff:ff 'static' range (manually assigned)
00:50:56:80:00:00 - 00:50:56:bf:ff:ff 'vpx' range (assigned by a VI Client)
Erroneously the 'vpx' range was assumed to be larger and to occupy the
remaining addresses of the 00:50:56 prefix that are not part of the 'static'
range.
00:50:56 was used as prefix for generated MAC addresses, this is not possible
anymore, because there are gaps in the allowed ranges. Therefore, change the
prefix to 00:0c:29 which is the prefix for auto generated MAC addresses anyway.
Allow arbitrary MAC addresses to be used and set the checkMACAddress VMX option
to false in case the MAC address doesn't fall into any predefined range.
* docs/drvesx.html.in: update website accordingly
* src/esx/esx_driver.c: set the auto generation prefix to 00:0c:29
* src/esx/esx_vmx.c: fix MAC address range handling and allow arbitrary MAC
addresses
* tests/vmx2xml*, tests/xml2vmx*: add some basic MAC address range tests
The data passed to the callback is not guaranteed to be zero terminated,
take care of that by coping the data and adding a zero terminator.
Also dump the data for other types than CURLINFO_TEXT.
Set CURLOPT_VERBOSE to 1 so the debug callback is called when enabled.
A domain with virtualHW version 4 is allowed on an ESX 4.0 server.
If a domain is migrated from an ESX 3.5 server to an ESX 4.0 server
then the virtualHW version stays the same. So a ESX 4.0 server can
host domains with virtualHW version 4.
This invalid free results in heap corruption. Some symptoms I saw
because of this were libvirtd crashing and virt-manager hanging
while trying to enumerate devices.
The behavior for the qemu balloon device has changed. Formerly, a virtio
balloon device was provided by default. Now, '-balloon virtio' must be
specified on the command line to enable it. This patch causes libvirt to
add '-balloon virtio' to the command line whenever the -balloon option is
available.
* src/qemu/qemu_conf.c src/qemu/qemu_conf.h: check for the new flag and
add "-baloon vitio" to qemu command when needed
* tests/qemuhelptest.c: add the new flag for detection
This patch removes the call to vol update after the volume build completes.
The update call is currently meaningless anyway because the vol build is passed
a copy of the definition, so the update result is thrown away. More
importantly, if the user specified a selinux label for the volume, the update
call results in a double free of the label
* src/storage/storage_backend_fs.c: remove the update call
This change makes the 'info chardev' parser ignore any trailing
whitespace on a line. This fixes a specific problem handling a '\r\n'
line ending.
* src/qemu/qemu_monitor_text.c: Ignore trailing whitespace in
'info chardev' output.
* src/qemu/qemu_driver.c (qemudDomainMigratePrepare2): Remove useless
test of always-non-NULL uri_out parameter. Use ATTRIBUTE_NONNULL to
inform tools.
All other stateful drivers are linked directly to libvirtd
instead of libvirt.so. Link the secret driver to libvirtd too.
* daemon/Makefile.am: link the secret driver to libvirtd
* daemon/libvirtd.c: add #ifdef WITH_SECRETS blocks
* src/Makefile.am: don't link the secret driver to libvirt.so
* src/libvirt_private.syms: remove the secretRegister symbol
If using a remote access, sometimes an RPC entry point is not
available, and currently we just end up with a raw:
error: unknown procedure: xxx
error, while this should be more cleanly reported as an unsupported
entry point like for local access
* src/remote/remote_driver.c: convert missing remote entry points into
the unsupported feature error
When querying about a domain from 0.3.3 (or RHEL 5.3) domain located
on a 0.6.3 (RHEL-5) machine, the errors are not properly reported.
This patch from Olivier Fourdan <ofourdan@redhat.com> , slightly
modified to not change the semantic when the domain os details cannot
be provided
* src/xen/proxy_internal.c src/xen/xen_hypervisor.c: add some missing
error reports
Found while trying to cross-compile libvirt on Fedora 12 for Windows.
gnulib redefines 'close' to 'close_used_without_including_unistd_h'
in sys/socket.h if winsock2.h is present and unistd.h has not been
included before sys/socket.h. Reorder some includes to fix this.
* src/qemu/qemu_conf.h: Remove QEMU_CMD_FLAG_0_12 and just leave
the lone JSON flag
* src/qemu/qemu_conf.c: Enable JSON on QEMU 0.13 or later, but
leave it disabled for now
The Xen code for making HVM VT-d PCI passthrough attach and detach
wasn't working properly:
1) In xenDaemonAttachDevice(), we were always trying to reconfigure
a PCI passthrough device, even the first time we added it. This was
because the code in virDomainXMLDevID() was not checking xenstore for
the existence of the device, and always returning 0 (meaning that
the device already existed).
2) In xenDaemonDetachDevice(), we were trying to use "device_destroy"
to detach a PCI device. While you would think that is the right
method to call, it's actually wrong for PCI devices. In particular,
in upstream Xen (and soon in RHEL-5 Xen), device_configure is actually
used to destroy a PCI device.
To fix the attach
problem I add a lookup into xenstore to see if the device we are
trying to attach already exists. To fix the detach problem I change
it so that for PCI detach (only), we use device_configure with the
appropriate sxpr to do the detachment.
* src/xen/xend_internal.c: don't use device_destroy for PCI devices
and fix the other issues.
* src/xen/xs_internal.c src/xen/xs_internal.h: add
xenStoreDomainGetPCIID()
The XML XPath for detecting JSON in the running VM statefile was
wrong causing all VMs to get JSON mode enabled at libvirtd restart.
In addition if a VM was running a JSON enabled QEMU once, and then
altered to point to a non-JSON enabled QEMU later the 'monJSON'
flag would not get reset to 0.
* src/qemu/qemu_driver.c: Fix setting/detection of JSON mode
The code for connecting to a server tries each socket in turn
until it finds one that connects. Unfortunately for TLS sockets
if it connected, but failed TLS handshake it would treat that
as a failure to connect, and try the next socket. This is bad,
it should have reported the TLS failure immediately.
$ virsh -c qemu://somehost.com/system
error: unable to connect to libvirtd at 'somehost.com': Invalid argument
error: failed to connect to the hypervisor
$ ./tools/virsh -c qemu://somehost.com/system
error: server certificate failed validation: The certificate hasn't got a known issuer.
error: failed to connect to the hypervisor
* src/remote/remote_driver.c: Stop trying to connect if the
TLS handshake fails
Use a dynamically sized xdr_array to pass memory stats on the wire. This
supports the addition of future memory stats and reduces the message size
since only supported statistics are returned.
* src/remote/remote_protocol.x: provide defines for the new entry point
* src/remote/remote_driver.c daemon/remote.c: implement the client and
server side
* daemon/remote_dispatch_args.h daemon/remote_dispatch_prototypes.h
daemon/remote_dispatch_ret.h daemon/remote_dispatch_table.h
src/remote/remote_protocol.c src/remote/remote_protocol.h: generated
stubs
Support for memory statistics reporting is accepted for qemu inclusion.
Statistics are reported via the monitor command 'info balloon' as a comma
seprated list:
(qemu) info balloon
balloon: actual=1024,mem_swapped_in=0,mem_swapped_out=0,major_page_faults=88,minor_page_faults=105535,free_mem=1017065472,total_mem=1045229568
Libvirt, qemu, and the guest operating system may support a subset of the
statistics defined by the virtio spec. Thus, only statistics recognized by
components will be reported.
* src/qemu/qemu_driver.c src/qemu/qemu_monitor_text.[ch]: implement the
new entry point by using info balloon monitor command
Set up the types for the domainMemoryStats function and insert it into the
virDriver structure definition. Because of static initializers, update
every driver and set the new field to NULL.
* include/libvirt/libvirt.h.in: new API
* src/driver.h src/*/*_driver.c src/vbox/vbox_tmpl.c: add the new
entry to the driver structure
* python/generator.py: fix compiler errors, the actual python binding is
implemented later
If a virtual machine is destroyed on a ESX server then immediately
undefining this virtual machine on a vCenter may fail, because the
vCenter has not been informed about the status change yet. Therefore,
destroy a virtual machine on a vCenter if available, so the vCenter
is up-to-date when the virtual machine should be undefined.
Undefining a virtual machine on an ESX server leaves a orphan on the
vCenter behind. So undefine a virtual machine on a vCenter if available
to fix this problem.
If an ESX host is managed by a vCenter, it knows the IP address of the
vCenter. Setting the vCenter query parameter to * allows to connect to the
vCenter known to an ESX host without the need to specify its IP address
or hostname explicitly.
esxDomainLookupByUUID() and esxDomainIsActive() lookup a domain by asking
ESX for all known domains and searching manually for the one with the
matching UUID. This is inefficient. The VI API allows to lookup by UUID
directly: FindByUuid().
* src/esx/esx_driver.c: change esxDomainLookupByUUID() and esxDomainIsActive()
to use esxVI_LookupVirtualMachineByUuid(), also reorder some functions to
keep them in sync with the driver struct
Questions can block tasks, to handle them automatically the driver can answers
them with the default answer. The auto_answer query parameter allows to enable
this automatic question handling.
* src/esx/README: add a detailed explanation for automatic question handling
* src/esx/esx_driver.c: add automatic question handling for all task related
driver functions
* src/esx/esx_util.[ch]: add handling for the auto_answer query parameter
* src/esx/esx_vi.[ch], src/esx/esx_vi_methods.[ch], src/esx/esx_vi_types.[ch]:
add new VI API methods and types and additional helper functions for
automatic question handling
Commit 33a198c1f6 increased the gcrypt
version requirement to 1.4.2 because the GCRY_THREAD_OPTION_VERSION
define was added in this version.
The configure script doesn't check for the gcrypt version. To support
gcrypt versions < 1.4.2 change the virTLSThreadImpl initialization
to use GCRY_THREAD_OPTION_VERSION only if it's defined.
Each driver supporting CPU selection must fill in host CPU capabilities.
When filling them, drivers for hypervisors running on the same node as
libvirtd can use cpuNodeData() to obtain raw CPU data. Other drivers,
such as VMware, need to implement their own way of getting such data.
Raw data can be decoded into virCPUDefPtr using cpuDecode() function.
When implementing virConnectCompareCPU(), a hypervisor driver can just
call cpuCompareXML() function with host CPU capabilities.
For each guest for which a driver supports selecting CPU models, it must
set the appropriate feature in guest's capabilities:
virCapabilitiesAddGuestFeature(guest, "cpuselection", 1, 0)
Actions needed when a domain is being created depend on whether the
hypervisor understands raw CPU data (currently CPUID for i686, x86_64
architectures) or symbolic names has to be used.
Typical use by hypervisors which prefer CPUID (such as VMware and Xen):
- convert guest CPU configuration from domain's XML into a set of raw
data structures each representing one of the feature policies:
cpuEncode(conn, architecture, guest_cpu_config,
&forced_data, &required_data, &optional_data,
&disabled_data, &forbidden_data)
- create a mask or whatever the hypervisor expects to see and pass it
to the hypervisor
Typical use by hypervisors with symbolic model names (such as QEMU):
- get raw CPU data for a computed guest CPU:
cpuGuestData(conn, host_cpu, guest_cpu_config, &data)
- decode raw data into virCPUDefPtr with a possible restriction on
allowed model names:
cpuDecode(conn, guest, data, n_allowed_models, allowed_models)
- pass guest->model and guest->features to the hypervisor
* src/cpu/cpu.c src/cpu/cpu.h src/cpu/cpu_generic.c
src/cpu/cpu_generic.h src/cpu/cpu_map.c src/cpu/cpu_map.h
src/cpu/cpu_x86.c src/cpu/cpu_x86.h src/cpu/cpu_x86_data.h
* configure.in: check for CPUID instruction
* src/Makefile.am: glue the new files in
* src/libvirt_private.syms: add new private symbols
* po/POTFILES.in: add new cpu files containing translatable strings
* src/remote/remote_protocol.x: update with new entry point
* daemon/remote.c: add the new server dispatcher
* daemon/remote_dispatch_args.h daemon/remote_dispatch_prototypes.h
daemon/remote_dispatch_ret.h daemon/remote_dispatch_table.h
src/remote/remote_protocol.c src/remote/remote_protocol.h: regenerated
* src/driver.h: add an extra entry point in the structure
* src/esx/esx_driver.c src/lxc/lxc_driver.c src/opennebula/one_driver.c
src/openvz/openvz_driver.c src/phyp/phyp_driver.c src/qemu/qemu_driver.c
src/remote/remote_driver.c src/test/test_driver.c src/uml/uml_driver.c
src/vbox/vbox_tmpl.c src/xen/xen_driver.c: add NULL entry points for
all drivers
* include/libvirt/virterror.h src/util/virterror.c: add new domain
VIR_FROM_CPU for errors
* src/conf/cpu_conf.c src/conf/cpu_conf.h: new parsing module
* src/Makefile.am proxy/Makefile.am: include new files
* src/conf/capabilities.[ch] src/conf/domain_conf.[ch]: reference
new code
* src/libvirt_private.syms: private export of new entry points
GNUTLS uses gcrypt for its crypto functions. gcrypt requires
that the app/library initializes threading before using it.
We don't want to force apps using libvirt to know about
gcrypt, so we make virInitialize init threading on their
behalf. This location also ensures libvirtd has initialized
it correctly. This initialization is required even if libvirt
itself were only using one thread, since another non-libvirt
library (eg GTK-VNC) could also be using gcrypt from another
thread
* src/libvirt.c: Register thread functions for gcrypt
* configure.in: Add -lgcrypt to linker flags
* src/storage/storage_driver.c: Fix IsPersistent() and IsActivE()
methods on storage pools to use 'storagePrivateData' instead
of 'privateData'. Also fix naming convention of objects
* src/esx/esx_vi.c (esxVI_List_CastFromAnyType): For invalid
inputs, fail right away. Do not "goto failure" where a NULL
input pointer would be dereferenced.
* src/esx/esx_util.c (esxUtil_ParseDatastoreRelatedPath): Return
right away for invalid inputs, rather than using them (which would
dereference NULL pointers) in clean-up code.
The virFileResolveLink utility function relied on the POSIX guarantee
that stat.st_size of a symlink is the length of the value. However,
on some types of file systems, it is invalid, so do not rely on it.
Use gnulib's areadlink module instead.
* bootstrap (modules): Add areadlink.
* src/util/util.c: Include "areadlink.h".
Let areadlink perform the readlink and malloc.
* configure.in (AC_CHECK_FUNCS): Remove readlink. No need,
since it's presence is guaranteed by gnulib.
* src/xen/xm_internal.c (xenXMConfigGetULong): Remove useless and
misleading test (always false) for val->str == NULL before code that
always dereferences val->str. "val" comes from virConfGetValue, and
at that point, val->str is guaranteed to be non-NULL.
(xenXMConfigGetBool): Likewise.
* src/util/conf.c (virConfSetValue): Ensure that vir->str is never NULL,
not even if someone tries to set such a value via virConfSetValue.
* src/qemu/qemu_driver.c (doNonTunnelMigrate): Don't let a
NULL "uri_out" provoke a NULL-dereference in doNativeMigrate:
supply omitted goto-after-qemudReportError.
* src/qemu/qemu_driver.c (qemudDomainMigratePrepareTunnel): Upon an
out of memory error, we would end up with unixfile==NULL and attempt
to unlink(NULL). Skip the unlink when it's NULL.
* src/qemu/qemu_driver.c (qemudDomainMigrateFinish2): Set
"event" to NULL after qemuDomainEventQueue frees it, so a
subsequent free (after endjob label) upon qemuMonitorStartCPUs
failure does not cause a double free.
* src/node_device/node_device_driver.c (update_driver_name): The
previous code would write one byte beyond the end of the 4KiB
stack buffer when presented with a symlink value of exactly that
length (very unlikely). Remove the automatic buffer and use
virFileResolveLink in place of readlink. Suggested by Daniel Veillard.
* src/storage/storage_backend_fs.c: virStorageBackendFileSystemDelete
was incorrectly calling unlink() in an attempt to remove a directory.
It should be calling rmdir() instead.
src/node_device/node_device_udev.c was using a function available only
on the daemon code, fix this and use the function available globally
* src/node_device/node_device_udev.c: replace use of virEventAddHandleImpl
by virEventAddHandle
If there are no references remaining to the object, vm is set to NULL
and vm->persistent cannot be accessed. Fixed by this trivial patch.
* src/qemu/qemu_driver.c (qemudDomainCoreDump): Avoid possible
NULL pointer dereference on --crash dump.
This is trivial for QEMU since you just have to not stop the vm before
starting the dump. And for Xen, you just pass the flag down to xend.
* include/libvirt/libvirt.h.in (virDomainCoreDumpFlags): Add VIR_DUMP_LIVE.
* src/qemu/qemu_driver.c (qemudDomainCoreDump): Support live dumping.
* src/xen/xend_internal.c (xenDaemonDomainCoreDump): Support live dumping.
* tools/virsh.c (opts_dump): Add --live. (cmdDump): Map it to VIR_DUMP_LIVE.
This patch adds the --crash option (already present in "xm dump-core")
to "virsh dump". virDomainCoreDump already has a flags argument, so
the API/ABI is untouched.
* include/libvirt/libvirt.h.in (virDomainCoreDumpFlags): New flag for
CoreDump
* src/test/test_driver.c (testDomainCoreDump): Do not crash
after dump unless VIR_DUMP_CRASH is given.
* src/qemu/qemu_driver.c (qemudDomainCoreDump): Shutdown the domain
instead of restarting it if --crash is passed.
* src/xen/xend_internal.c (xenDaemonDomainCoreDump): Support --crash.
* tools/virsh.c (opts_dump): Add --crash.
(cmdDump): Map it to flags for virDomainCoreDump and pass them.
1) qemuMigrateToCommand uses ">>" so we have to truncate the file
before starting the migration;
2) the command wasn't updated to chown the driver and set/restore
the security lavels;
3) the VM does not have to be resumed if migration fails;
4) the file is not removed when migration fails.
* src/qemu/qemu_driver.c (qemuDomainCoreDump): Truncate file before
dumping, set/restore ownership and security labels for the file.
Those were pointed by DanB in his review but not yet fixed
* src/qemu/qemu_driver.c: qemudWaitForMonitor() use EnterMonitorWithDriver()
and ExitMonitorWithDriver() there
* src/qemu/qemu_monitor_text.c: checking fro strdu failure and hash
table add error in qemuMonitorTextGetPtyPaths()
This change makes the QEMU driver get pty paths from the output of the
monitor 'info chardev' command. This output is structured, and contains
both the name of the device and the path on the same line. This is
considerably more reliable than parsing the startup log output, which
requires the parsing code to know which order QEMU will print pty
information in.
Note that we still need to parse the log output as the monitor itself
may be on a pty. This should be rare, however, and the new code will
replace all pty paths parsed by the log output method once the monitor
is available.
* src/qemu/qemu_monitor.(c|h) src/qemu_monitor_text.(c|h): Implement
qemuMonitorGetPtyPaths().
* src/qemu/qemu_driver.c: Get pty path information using
qemuMonitorGetPtyPaths().
Change -monitor, -serial and -parallel output to use -chardev if it is
available.
* src/qemu/qemu_conf.c: Update qemudBuildCommandLine to use -chardev where
available.
* tests/qemuxml2argvtest.c tests/qemuxml2argvdata/: Add -chardev equivalents
for all current serial and parallel tests.
Even if qemudStartVMDaemon suceeds, an error was logged such as
'qemuRemoveCgroup:1778 : internal error Unable to find cgroup for'.
This is because qemudStartVMDaemon calls qemuRemoveCgroup to
ensure that old cgroup does not remain. This workaround makes
sense but leaving an error message may confuse users.
* src/qemu/qemu_driver.c: a an option to the function to suppress the
error being logged
This patch fixes the bug where paused/running state is not
transmitted during migration. As a result, in the QEMU driver
for example the machine was always started on the destination
end.
In order to do so, just read the state and if it is appropriate and
set the VIR_MIGRATE_PAUSED flag.
* src/libvirt.c (virDomainMigrateVersion1, virDomainMigrateVersion2):
Automatically add VIR_MIGRATE_PAUSED when appropriate.
* src/xen/xend_internal.c (xenDaemonDomainMigratePerform): Give a nicer
error message when migration of paused domains is attempted.
This adds a new flag, VIR_MIGRATE_PAUSED, that mandates pausing
the migrated VM before starting it.
* include/libvirt/libvirt.h.in (virDomainMigrateFlags): Add VIR_MIGRATE_PAUSED.
* src/qemu/qemu_driver.c (qemudDomainMigrateFinish2): Handle VIR_MIGRATE_PAUSED.
* tools/virsh.c (opts_migrate): Add --suspend. (cmdMigrate): Handle it.
* tools/virsh.pod (migrate): Document it.
This makes a small change on the failed-migration path. Up to now,
all VMs that failed non-live migration after the "stop" command
were restarted. This must not be done when the VM was paused in
the first place.
* src/qemu/qemu_driver.c (qemudDomainMigratePerform): Do not restart
a paused VM that fails migration. Set paused state after "stop",
reset it after failure.
We don't use this method of reloading rules anymore, so we can just
kill the code.
This simplifies things a lot because we no longer need to keep a
table of the rules we've added.
* src/util/iptables.c: kill iptablesReloadRules()
Long ago we tried to use Fedora's lokkit utility in order to register
our iptables rules so that 'service iptables restart' would
automatically load our rules.
There was one fatal flaw - if the user had configured iptables without
lokkit, then we would clobber that configuration by running lokkit.
We quickly disabled lokkit support, but never removed it. Let's do
that now.
The 'my virtual network stops working when I restart iptables' still
remains. For all the background on this saga, see:
https://bugzilla.redhat.com/227011
* src/util/iptables.c: remove lokkit support
* configure.in: remove --enable-lokkit
* libvirt.spec.in: remove the dirs used only for saving rules for lokkit
* src/Makefile.am: ditto
* src/libvirt_private.syms, src/network/bridge_driver.c,
src/util/iptables.h: remove references to iptablesSaveRules
This is the expected behaviour, I think - reloading libvirtd should
be a subset of restarting it.
Note, we reload the rules after we've determined which networks
are active (because we only add the rules for active networks)
and before we start autostart networks (to avoid re-adding the
rules).
* src/network/bridge_driver.c: reload iptables rules on startup
Currently, when we add iptables rules, we keep them on a list so that
we can easily reload them on e.g. 'service libvirtd reload'.
However, we don't save this list to disk, so if libvirtd is restarted
we lose the ability to reload the rules.
The fix is simple - just re-add the damn things on reload.
Note, we delete the rules before re-adding them, just like the current
behaviour of iptRulesReload().
* src/network/bridge_driver.c: re-add the iptables rules on reload.
Replace free(virBufferContentAndReset()) with virBufferFreeAndReset().
Update documentation and replace all remaining calls to free() with
calls to VIR_FREE(). Also add missing calls to virBufferFreeAndReset()
and virReportOOMError() in OOM error cases.
xen-unstable changesets 20321 and 20521 added support for
description in xend domain config. This patch extends that
support in xend backend.
* src/xen/xend_internal.c: add parse and output of domain description
The QEMU 0.10.0 release (and possibly other 0.10.x) has a bug where
it sometimes/often forgets to display the initial monitor greeting
line, soley printing a (qemu). This in turn confuses the text
console parsing because it has a '(qemu)' it is not expecting. The
confusion results in a negative malloc. Bad things follow.
This re-writes the text console handling to be more robust. The key
idea is that it should only look for a (qemu), once it has seen the
original command echo'd back. This ensures it'll skip the bogus stray
(qemu) with broken QEMUs.
* src/qemu/qemu_monitor.c: Add some (disabled) debug code
* src/qemu/qemu_monitor_text.c: Re-write way command replies
are detected
Since the monitor I/O is processed out of band from the main
thread(s) invoking monitor commands, the virDomainObj may be
deleted by the I/O thread. The qemuDomainObjBeginJob takes an
extra reference to protect against final deletion, but this
reference is released by the corresponding EndJob call. THus
after the EndJob call it may not be valid to reference the
virDomainObj any more. To allow callers to detect this, the
EndJob call is changed to return the remaining reference count.
* src/conf/domain_conf.c: Make virDomainObjUnref return the
remaining reference count
* src/qemu/qemu_driver.c: Avoid referencing virDomainObjPtr
after qemuDomainObjEndJob if it has been deleted.
Fix this warning, there is no need to use an intermediate,
different array pointer.
network.c: In function 'getIPv6Addr':
network.c:50: warning: dereferencing type-punned pointer will break strict-aliasing rules
* src/util/network.c: avoid an intermediary pointer cast
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add callbacks
for reset, shutdown, poweroff and stop events. Add convenience
methods for emiting those events
With addition of events there will be alot of callbacks.
To avoid having to add many APIs to register callbacks,
provide them all at once in a big table
* src/qemu/qemu_driver.c: Pass in a callback table to QEMU
monitor code
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h Replace
the EOF and disk secret callbacks with a callback table
Initial support for the new QEMU monitor protocol using JSON
as the data encoding format instead of plain text
* po/POTFILES.in: Add src/qemu/qemu_monitor_json.c
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Hack to turn on QMP
mode. Replace with a version number check on >= 0.12 later
* src/qemu/qemu_monitor.c: Delegate to json monitor if enabled
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Add
impl of QMP protocol
* src/Makefile.am: Add src/qemu/qemu_monitor_json.{c,h}
Now that drivers are using a private domain object state blob,
the virDomainObjFormat/Parse methods are no longer able to
directly serialize all neccessary state to/from XML. It is
thus neccessary to introduce a pair of callbacks fo serializing
private state.
The code for serializing vCPU PIDs and the monitor device
config can now move out of domain_conf.c and into the
qemu_driver.c where they belong.
* src/conf/capabilities.h: Add callbacks for serializing private
state to/from XML
* src/conf/domain_conf.c, src/conf/domain_conf.h: Remove the
monitor, monitor_chr, monitorWatch, nvcpupids and vcpupids
fields from virDomainObjPtr. Remove code that serialized
those fields
* src/libvirt_private.syms: Export virXPathBoolean
* src/qemu/qemu_driver.c: Add callbacks for serializing monitor
and vcpupid data to/from XML
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Pass monitor
char device config into qemuMonitorOpen directly.
The code to start CPUs executing has nothing todo with CPU
affinity masks, so pull it out of the qemudInitCpuAffinity()
method and up into qemudStartVMDaemon()
* src/qemu/qemu_driver.c: Pull code to start CPUs executing out
of qemudInitCpuAffinity()
The current QEMU disk media change does not support setting the
disk format. The new JSON monitor will support this, so add an
extra parameter to pass this info in
* src/qemu/qemu_driver.c: Pass in disk format when changing media
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Add a 'format' arg to qemuMonitorChangeMedia()
The qemuMonitorEscape() method, and the VIR_ENUM for migration
status will be needed by the JSON monitor too, so move that code
into the shared qemu_monitor.c file instead of qemu_monitor_text.c
* src/qemu/qemu_monitor.h: Declare qemuMonitorMigrationStatus enum
and qemuMonitorEscapeArg and qemuMonitorEscapeShell methods
* src/qemu/qemu_monitor.c: Implement qemuMonitorMigrationStatus enum
and qemuMonitorEscapeArg and qemuMonitorEscapeShell methods
* src/qemu/qemu_monitor_text.c: Remove above methods/enum
If QEMU shuts down while we're in the middle of processing a
monitor command, the monitor will be freed, and upon cleaning
up we attempt to do qemuMonitorUnlock(priv->mon) when priv->mon
is NULL.
To address this we introduce proper reference counting into
the qemuMonitorPtr object, and hold an extra reference whenever
executing a command.
* src/qemu/qemu_driver.c: Hold a reference on the monitor while
executing commands, and only NULL-ify the priv->mon field when
the last reference is released
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Add reference
counting to handle safe deletion of monitor objects
configure: yajl: no
CC libvirt_util_la-json.lo
util/json.c:32:27: error: yajl/yajl_gen.h: No such file or directory
util/json.c:33:29: error: yajl/yajl_parse.h: No such file or directory
* src/util/json.c: remove the includes if yajl not configured in
This introduces simple API for handling JSON data. There is
an internal data structure 'virJSONValuePtr' which stores a
arbitrary nested JSON value (number, string, array, object,
nul, etc). There are APIs for constructing/querying objects
and APIs for parsing/formatting string formatted JSON data.
This uses the YAJL library for parsing/formatting from
http://lloyd.github.com/yajl/
* src/util/json.h, src/util/json.c: Data structures and APIs
for representing JSON data, and parsing/formatting it
* configure.in: Add check for yajl library
* libvirt.spec.in: Add build requires for yajl
* src/Makefile.am: Add json.c/h
* src/libvirt_private.syms: Export JSON symbols to drivers
Some of the very useful calls for XML parsing provided by util/xml.[ch]
were not exported as private symbols. This patch fixes this.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Xen HVM guests with PV drivers end up with two network interfaces for
each configured interface. One of them being emulated by qemu and the
other one paravirtual. As this might not be desirable, the attached
patch provides a way for users to specify that only paravirtual network
interface should be presented to the guest.
The configuration was inspired by qemu/kvm driver, for which users can
specify model='virtio' to use paravirtual network interface.
The patch adds support for model='netfront' which results in
type=netfront instead of type=ioemu (or nothing for newer xen versions)
in guests native configuration. Xen's qemu ignores interfaces with
type != ioemu and only paravirtual network device will be seen in the
guest.
Four possible configuration scenarios follow:
- no model specified in domain's XML
- libvirt will behave like before this change; it will set
type=ioemu for HVM guests on xen host which is not newer than
XEND_CONFIG_MAX_VERS_NET_TYPE_IOEMU
- covered by existing tests
- PV guest, any model
- no functional change, model is passed as is (and ignored by the
hypervisor)
- covered by existing tests (e.g., *-net-e1000.*)
- HVM guest, model=netfront
- type is set to "netfront", model is not specified
- covered by new *-net-netfront.* tests
- HVM guest, model != netfront
- type is set to "ioemu", model is passed as is
- covered by new *-net-ioemu.* tests
The fourth scenario feels like a regression for xen newer than
XEND_CONFIG_MAX_VERS_NET_TYPE_IOEMU as users who had a model specified
in their guest's configuration won't see a paravirtual interface in
their guests any more. On the other hand, the reason for specifying a
model is most likely the fact that they want to use such model which
implies emulated interface. Users of older xen won't be affected at all
as their xen provides paravirtual interface regardless of the type used.
- src/xen/xend_internal.c: add netfront support for the xend backend
- src/xen/xm_internal.c: add netfront support for the XM serialization too
Also fixed serial port configuration which was broken due to recent
change in virDomainChrDef where targetType was newly added.
* src/Makefile.am: add new files
* src/vbox/vbox_driver.c: add case for version 3.1
* src/vbox/vbox_tmpl.c: refactor common patterns into macros, support for
version 3.1, serial port configuration fix
* src/vbox/vbox_CAPI_v3_1.h, src/vbox/vbox_V3_1.c: generated code
esxVMX_IndexToDiskName handles indices up to 701. This limit comes
from a mapping gap in virDiskNameToIndex:
sdzy -> 700
sdzz -> 701
sdaaa -> 728
sdaab -> 729
This line in virDiskNameToIndex causes this gap:
idx = (idx + i) * 26;
Fixing it by altering this line to:
idx = (idx + (i < 1 ? 0 : 1)) * 26;
Also add a new version of virIndexToDiskName that handles the inverse
mapping for arbitrary indices.
* src/esx/esx_vmx.[ch]: remove esxVMX_IndexToDiskName
* src/util/util.[ch]: add virIndexToDiskName and fix mapping gap
* tests/esxutilstest.c: update test to verify that the gap is fixed
* src/conf/domain_conf.c: don't call virDomainObjUnlock twice
* src/qemu/qemu_driver.c: relock driver lock if an error occurs in
qemuDomainObjBeginJobWithDriver, enter/exit monitor with driver
in qemudDomainSave
The instruction "See Makefile.am" in libvirt.private_syms
always makes me think that this file is autogenerated
and should not be touched manually. This patch spares
every reader of libvirt.private_syms the hassle of
reading Makefile.am before augmenting libvirt.private_syms.
Signed-off-by: Wolfgang Mauerer <wolfgang.mauerer@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Commit 790f0b3057 causes the contents of
the names array to be freed even on success, resulting in no listing of
defined but inactive Xen domains.
Spotted by Jim Fehlig
Introduce a new type="dir" mode for <disks> that allows use of
QEMU's virtual FAT block device driver. eg
<disk type='dir' device='floppy'>
<source dir='/tmp/test'/>
<target dev='fda' bus='fdc'/>
<readonly/>
</disk>
gets turned into
-drive file=fat:floppy:/tmp/test,if=floppy,index=0
Only read-only disks are supported with virtual FAT mode
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add type="dir"
* docs/schemas/domain.rng: Document new disk type
* src/xen/xend_internal.c, src/xen/xm_internal.c: Raise error for
unsupported disk types
* tests/qemuxml2argvdata/qemuxml2argv-disk-cdrom-empty.args: Fix
empty disk file handling
* tests/qemuxml2argvdata/qemuxml2argv-disk-drive-fat.args,
tests/qemuxml2argvdata/qemuxml2argv-disk-drive-fat.xml,
tests/qemuxml2argvdata/qemuxml2argv-floppy-drive-fat.args,
tests/qemuxml2argvdata/qemuxml2argv-floppy-drive-fat.xml
tests/qemuxml2argvtest.c: Test QEMU vitual FAT driver
* src/qemu/qemu_conf.c: Support generating fat:/some/dir type
disk args
* src/security/security_selinux.c: Temporarily skip labelling
of directory based disks
The cpu_set_t type can only cope with NR_CPUS <= 1024, beyond this
it is neccessary to use alternate CPU_SET maps with a dynamically
allocated CPU map
* src/util/processinfo.c: Support new unlimited size CPU set type
* src/Makefile.am: Add processinfo.h/processinfo.c
* src/util/processinfo.c, src/util/processinfo.h: Module providing
APIs for getting/setting process CPU affinity
* src/qemu/qemu_driver.c: Switch over to new APIs for schedular
affinity
* src/libvirt_private.syms: Export virProcessInfoSetAffinity
and virProcessInfoGetAffinity to internal drivers
0.7.3 was broken
* configure.in docs/news.html.in: release of 0.7.4
* configure.in libvirt.spec.in: require netcf >= 0.1.4
* src/Makefile.am: node_device/node_device_udev.h was missing from
NODE_DEVICE_DRIVER_UDEV_SOURCES breaking compilation on platforms with
udev
Recent qemu releases require command option '-enable-qemu' in order
for the kvm functionality be activated. Libvirt needs to pass this flag
to qemu when starting a domain. Note that without the option,
even if both the kernel and qemu support KVM, KVM will not be activated
and VMs will be very slow.
* src/qemu/qemu_conf.h src/qemu/qemu_conf.c: parse the extra command
line option from help and add it when running kvm
* tests/qemuhelptest.c: this modified the flags output for qemu-0.10.5
and qemu-kvm-0.11.0-rc2 regression tests
Erroneously included the sysfs_path and parent_sysfs_path elements in
the node device xml, they were not supposed to show up there
* src/conf/node_device_conf.c: remove the output of the 2 fields
I realized that I inadvertently added a member to the def struct to
contain each device's sysfs path when there was an existing member in the
dev struct for "OS specific path to device metadat, eg sysfs" Since the
udev backend needs to record the sysfs path while it's in the process of
creating the device, before the dev struct gets allocated, I chose to
remove the member from the dev struct.
* src/conf/node_device_conf.c src/conf/node_device_conf.h
src/node_device/node_device_driver.c src/node_device/node_device_hal.c
src/node_device/node_device_udev.c: remove devicePath from the
structure and use def->sysfs_path instead
The qemudStartVMDaemon() and several functions it calls use
the QEMU monitor. The QEMU driver is locked while this function
is executing, so it is rquired to release the driver lock and
reacquire it either side of issuing a monitor command. It
failed todo so, leading to deadlock
* qemu/qemu_driver.c: Release driver when in qemudStartVMDaemon
and things it calls
VMware uses two MAC address prefixes: 00:0c:29 and 00:50:56. The 00:0c:29
prefix is used for ESX server generated addresses. The 00:50:56 prefix is
split into two parts. MAC addresses above 00:50:56:3f:ff:ff are generated
by a vCenter. The rest of the 00:50:56 prefix can be assigned manually.
Any MAC address within the 00:0c:29 and 00:50:56 prefix can be specified
in a domain XML config and the driver will handle the details internally.
* src/esx/esx_vmx.c: fix MAC address formatting
* tests/xml2vmxdata/*: update test files accordingly
* docs/drivers.html.in: list the ESX driver
* docs/drvesx.html.in: the new ESX driver documentation
* docs/hvsupport.html.in: add the ESX driver to the matrix
* docs/index.html.in, docs/sitemap.html.in: list the ESX driver
* src/esx/esx_driver.c: fix and cleanup some comments
* src/xen/xen_hypervisor.c: xen-unstable changeset 19788 removed
MAX_VIRT_CPUS from public headers, breaking compilation of libvirt
on -unstable. Its semanitc was retained with XEN_LEGACY_MAX_VCPUS.
Ensure MAX_VIRT_CPUS is defined accordingly.
The QEMU monitor open method would not take a reference on
the virDomainObjPtr until it had successfully opened the
monitor. The cleanup code upon failure to open though would
call qemuMonitorClose() which would in turn decrement the
reference count. This caused the virDoaminObjPtr to be mistakenly
freed and then the whole driver crashes
* src/qemu/qemu_monitor.c: Fix reference counting in
qemuMonitorOpen
The HAL driver returns a fatal error code in the case where HAL
is not running. This causes the entire libvirtd daemon to quit
which isn't desirable. Instead it should simply disable the HAL
driver
* src/node_device/node_device_hal.c: Quietly disable HAL if it is
not running
Fixes https://launchpad.net/bugs/453335
* src/security/virt-aa-helper.c: suppress confusing and misleading
apparmor denied message when kvm/qemu tries to open a libvirt specified
readonly file (such as a cdrom) with write permissions. libvirt uses
the readonly attribute for the security driver only, and has no way
of telling kvm/qemu that the device should be opened readonly
Fixes https://launchpad.net/bugs/460271
* src/security/virt-aa-helper.c: require absolute path for dynamic added
files. This is required by AppArmor and conveniently prevents adding
tcp consoles to the profile
The wrong variable was being passed in with the LXC event callback
resulting in a later deadlock or crash
* src/lxc/lxc_driver.c: Pass 'vm' instead of 'driver' to event
callback
In the scenario where the cgroups were mounted but the
particular group did not exist, and the caller had not
requested auto-creation, the code would fail to return
an error condition. This caused the lxc_controller to
think the cgroup existed, and it then later failed when
attempting to use it
* src/util/cgroup.c: Raise an error if the cgroup path does not
exist
There is a race condition in HAL driver startup where the callback
can get triggered before we have finished startup. This then causes
a deadlock in the driver.
* src/node_device/node_device_hal.c: RElease driver lock before
registering DBus callbacks
If the virDomainDefPtr object has an 'id' of -1, then forcably
set the VIR_DOMAIN_XML_INACTIVE flag to ensure generated XML
does not include any cruft from the previously running guest
such as console PTY path, or VNC port.
* src/conf/domain_conf.c: Set VIR_DOMAIN_XML_INACTIVE if
def->id is -1. Replace checks for def->id == -1 with
check against flags & VIR_DOMAIN_XML_INACTIVE.
The capng_lock() call sets the SECURE_NO_SETUID_FIXUP and SECURE_NOROOT
bits on the process. This prevents the kernel granting capabilities to
processes with an effective UID of 0, or with setuid programs. This is
not actually what we want in the container init process. It should be
allowed to run setuid processes & keep capabilities when root. All that
is required is masking a handful of dangerous capabilities from the
bounding set.
* src/lxc/lxc_container.c: Remove bogus capng_lock() call.
* src/security/virt-aa-helper.c: get_definition() now calls the new
caps_mockup() function which will parse the XML for os.type,
os.type.arch and then sets the wordsize. These attributes are needed
only to get a valid virCapsPtr for virDomainDefParseString(). The -H
and -b options are now removed from virt-aa-helper (they weren't used
yet anyway).
* tests/virt-aa-helper-test: extend and fixes tests, chmod'ed 755
uses libpciaccess to provide human readable names for PCI vendor and
device IDs
* configure.in: add a requirement for libpciaccess >= 0.10.0
* src/Makefile.am: add the associated compilation flags and link
* src/node_device/node_device_udev.c: lookup the libpciaccess for
vendor name and product name based on their ids
* configure.in src/Makefile.am: remove the configuration check and
build instructions
* src/node_device/node_device_devkit.c: removed the module
* src/node_device/node_device_driver.c src/node_device/node_device_driver.h:
removed references to the old backend
* src/conf/node_device_conf.h src/conf/node_device_conf.c: add specific
support for SCSI target in node device capabilities
* src/node_device/node_device_udev.c: add some extra detection code
when handling udev output
* configure.in: add new --with-udev, disabled by default, and requiring
libudev > 145
* src/node_device/node_device_udev.c src/node_device/node_device_udev.h:
the new node device backend
* src/node_device/node_device_linux_sysfs.c: moved node_device_hal_linux.c
to a better file name
* src/conf/node_device_conf.c src/conf/node_device_conf.h: add a couple
of fields in node device definitions, and an API to look them up,
remove a couple of unused fields from previous patch.
* src/node_device/node_device_driver.c src/node_device/node_device_driver.h:
plug the new driver
* po/POTFILES.in src/Makefile.am src/libvirt_private.syms: add the new
files and symbols
* src/util/util.h src/util/util.c: add a new convenience macro
virBuildPath and virBuildPathInternal() function
There is currently no way to determine the libvirt version of a remote
libvirtd we are connected to. This is a useful piece of data to enable
feature detection.
* src/xen/xen_driver.c: Add support for VIR_MIGRATE_PERSIST_DEST flag
* src/xen/xend_internal.c: Add support for VIR_MIGRATE_UNDEFINE_SOURCE flag
* include/libvirt/virterror.h, src/util/virterror.c: Add new errorcode
VIR_ERR_MIGRATE_PERSIST_FAILED
* src/conf/domain_conf.h src/conf/domain_conf.c: add the new entry in
the enum and lists of virDomainDiskBus
* src/qemu/qemu_conf.c: same for virDomainDiskQEMUBus
The xenstore database sometimes has stale domain IDs which are not
present in the hypervisor anymore. Filter these out to avoid causing
confusion
* src/xen/xs_internal.c: Filter domain IDs against HV's list
* src/xen/xen_hypervisor.h, src/xen/xen_hypervisor.c: Add new
xenHypervisorHasDomain() method for checking ID validity
The xenUnifiedNumOfDomains and xenUnifiedListDomains methods work
together as a pair, so it is critical they both apply the same
logic. With the current mis-matched logic it is possible to sometimes
get into a state when you miss certain active guests.
* src/xen/xen_driver.c: Change xenUnifiedNumOfDomains ordering to
match xenUnifiedListDomains.
When running qemu:///system instance, libvirtd runs as root,
but QEMU may optionally be configured to run non-root. When
then saving a guest to a state file, the file is initially
created as root, and thus QEMU cannot write to it. It is also
missing labelling required to allow access via SELinux.
* src/qemu/qemu_driver.c: Set ownership on save image before
running migrate command in virDomainSave impl. Call out to
security driver to set save image labelling
* src/security/security_driver.h: Add driver APIs for setting
and restoring saved state file labelling
* src/security/security_selinux.c: Implement saved state file
labelling for SELinux
Introduce a number of new APIs to expose some boolean properties
of objects, which cannot otherwise reliably determined, nor are
aspects of the XML configuration.
* virDomainIsActive: Checking virDomainGetID is not reliable
since it is not possible to distinguish between error condition
and inactive domain for ID of -1.
* virDomainIsPersistent: Check whether a persistent config exists
for the domain
* virNetworkIsActive: Check whether the network is active
* virNetworkIsPersistent: Check whether a persistent config exists
for the network
* virStoragePoolIsActive: Check whether the storage pool is active
* virStoragePoolIsPersistent: Check whether a persistent config exists
for the storage pool
* virInterfaceIsActive: Check whether the host interface is active
* virConnectIsSecure: whether the communication channel to the
hypervisor is secure
* virConnectIsEncrypted: whether any network based commnunication
channels are encrypted
NB, a channel can be secure, even if not encrypted, eg if it does
not involve the network, like a UNIX socket, or pipe.
* include/libvirt/libvirt.h.in: Define public API
* src/driver.h: Define internal driver API
* src/libvirt.c: Implement public API entry point
* src/libvirt_public.syms: Export API symbols
* src/esx/esx_driver.c, src/lxc/lxc_driver.c,
src/interface/netcf_driver.c, src/network/bridge_driver.c,
src/opennebula/one_driver.c, src/openvz/openvz_driver.c,
src/phyp/phyp_driver.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c,
src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c: Stub out driver tables
* src/libvirt.c src/lxc/lxc_conf.c src/lxc/lxc_container.c
src/lxc/lxc_controller.c src/node_device/node_device_hal.c
src/openvz/openvz_conf.c src/qemu/qemu_driver.c
src/qemu/qemu_monitor_text.c src/remote/remote_driver.c
src/storage/storage_backend_disk.c src/storage/storage_driver.c
src/util/logging.c src/xen/sexpr.c src/xen/xend_internal.c
src/xen/xm_internal.c: Steve Grubb <sgrubb@redhat.com> sent a code
review and those are the fixes correcting the problems
Some monitor commands may take a very long time to complete. It is
not desirable to block other incoming API calls forever. With this
change, if an existing API call is holding the job lock, additional
API calls will not wait forever. They will time out after a short
period of time, allowing application to retry later.
* include/libvirt/virterror.h, src/util/virterror.c: Add new
VIR_ERR_OPERATION_TIMEOUT error code
* src/qemu/qemu_driver.c: Change to a timed condition variable
wait for acquiring the monitor job lock
QEMU monitor commands may sleep for a prolonged period of time.
If the virDomainObjPtr or qemu driver lock is held this will
needlessly block execution of many other API calls. it also
prevents asynchronous monitor events from being dispatched
while a monitor command is executing, because deadlock will
ensure.
To resolve this, it is neccessary to release all locks while
executing a monitor command. This change introduces a flag
indicating that a monitor job is active, and a condition
variable to synchronize access to this flag. This ensures that
only a single thread can be making a state change or executing
a monitor command at a time, while still allowing other API
calls to be completed without blocking
* src/qemu/qemu_driver.c: Release driver and domain lock when
running monitor commands. Re-add locking to disk passphrase
callback
* src/qemu/THREADS.txt: Document threading rules
Change the QEMU monitor file handle watch to poll for both
read & write events, as well as EOF. All I/O to/from the
QEMU monitor FD is now done in the event callback thread.
When the QEMU driver needs to send a command, it puts the
data to be sent into a qemuMonitorMessagePtr object instance,
queues it for dispatch, and then goes to sleep on a condition
variable. The event thread sends all the data, and then waits
for the reply to arrive, putting the response / error data
back into the qemuMonitorMessagePtr and notifying the condition
variable.
There is a temporary hack in the disk passphrase callback to
avoid acquiring the domain lock. This avoids a deadlock in
the command processing, since the domain lock is still held
when running monitor commands. The next commit will remove
the locking when running commands & thus allow re-introduction
of locking the disk passphrase callback
* src/qemu/qemu_driver.c: Temporarily don't acquire lock in
disk passphrase callback. To be reverted in next commit
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Remove
raw I/O functions, and a generic qemuMonitorSend() for
invoking a command
* src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Remove all low level I/O, and use the new qemuMonitorSend()
API. Provide a qemuMonitorTextIOProcess() method for detecting
command/reply/prompt boundaries in the monitor data stream
Use ssh keyfiles from the current user's home directory instead of trying
to use keyfiles from a hardcoded /home/user directory. Fallback to
username/password authentication if keyfiles are not available or keyfile
authentication failed.
Add reference counting on the virDomainObjPtr objects. With the
forthcoming asynchronous QEMU monitor, it will be neccessary to
release the lock on virDomainObjPtr while waiting for a monitor
command response. It is neccessary to ensure one thread can't
delete a virDomainObjPtr while another is waiting. By introducing
reference counting threads can make sure objects they are using
are not accidentally deleted while unlocked.
* src/conf/domain_conf.h, src/conf/domain_conf.c: Add
virDomainObjRef/Unref APIs, remove virDomainObjFree
* src/openvz/openvz_conf.c: replace call to virDomainObjFree
with virDomainObjUnref
In preparation of the monitor I/O process becoming fully asynchronous,
it is neccessary to ensure all access to internals of the qemuMonitorPtr
object is protected by a mutex lock.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add mutex for locking
monitor.
* src/qemu/qemu_driver.c: Add locking around all monitor commands
Change the QEMU driver to not directly invoke the text mode monitor
APIs. Instead add a generic wrapper layer, which will eventually
invoke either the text or JSON protocol code as needed. Pass an
qemuMonitorPtr object into the monitor APIs instead of virDomainObjPtr
to complete the de-coupling of the monitor impl from virDomainObj
data structures
* src/qemu/qemu_conf.h: Remove qemuDomainObjPrivate definition
* src/qemu/qemu_driver.c: Add qemuDomainObjPrivate definition.
Pass qemuMonitorPtr into all monitor APIs instead of the
virDomainObjPtr instance.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add thin
wrappers for all qemuMonitorXXX command APIs, calling into
qemu_monitor_text.c/h
* src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Rename qemuMonitor -> qemuMonitorText & update to accept
qemuMonitorPtr instead of virDomainObjPtr
Decouple the monitor code from the virDomainDefPtr structure
by moving the disk encryption lookup code back into the
qemu_driver.c file. Instead provide a function callback to
the monitor code which can be invoked to retrieve encryption
data as required.
* src/qemu/qemu_driver.c: Add findDomainDiskEncryption,
and findVolumeQcowPassphrase. Pass address of the method
findVolumeQcowPassphrase into qemuMonitorOpen()
* src/qemu/qemu_monitor.c: Associate a disk
encryption function callback with the qemuMonitorPtr
object.
* src/qemu/qemu_monitor_text.c: Remove findDomainDiskEncryption
and findVolumeQcowPassphrase.
Introduce a new qemuDomainObjPrivate object which is used to store
the private QEMU specific data associated with each virDomainObjPtr
instance. This contains a single member, an instance of the new
qemuMonitorPtr object which encapsulates the QEMU monitor state.
The internals of the latter are private to the qemu_monitor* files,
not to be shown to qemu_driver.c
* src/qemu/qemu_conf.h: Definition of qemuDomainObjPrivate.
* src/qemu/qemu_driver.c: Register a functions for creating
and freeing qemuDomainObjPrivate instances with the domain
capabilities. Remove the qemudDispatchVMEvent() watch since
I/O watches are now handled by the monitor code itself. Pass
a new qemuHandleMonitorEOF() callback into qemuMonitorOpen
to allow notification when the monitor quits.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Introduce
the 'qemuMonitor' object. Temporarily add new APIs
qemuMonitorWrite, qemuMonitorRead, qemuMonitorWaitForInput
to allow text based monitor impl to perform I/O.
* src/qemu/qemu_monitor_text.c: Call APIs for reading/writing
to monitor instead of accessing the file handle directly.
The qemu_driver.c code should not contain any code that interacts
with the QEMU monitor at a low level. A previous commit moved all
the command invocations out. This change moves out the code which
actually opens the monitor device.
* src/qemu/qemu_driver.c: Remove qemudOpenMonitor & methods called
from it.
* src/Makefile.am: Add qemu_monitor.{c,h}
* src/qemu/qemu_monitor.h: Add qemuMonitorOpen()
* src/qemu/qemu_monitor.c: All code for opening the monitor
* src/util/threads-pthread.c: pthreads APIs do not set errno, instead
the return value is the positive errno. Set errno based on the return
value in the wrappers
* src/util/pci.c, src/util/pci.h: Make the pciDeviceList struct
opaque to callers of the API. Add accessor methods for managing
devices in the list
* src/qemu/qemu_driver.c: Update to use APIs instead of directly
accessing pciDeviceList fields
As it was basically unimplemented and more confusing than useful
at the moment.
* src/libvirt_private.syms: remove from internal symbols list
* src/qemu/qemu_bridge_filter.c src/util/ebtables.c: remove code and
one use of the unimplemented function
In case of an error the domains hash and the driver mutex may leak.
* src/opennebula/one_driver.c: free/destroy domains hash and driver
mutex in error cases
- Make reading ID from file working for IDs > 127
- Fix inverse error check for writing ID to file
- Use feof() to distinguish EOF from real error of fread()
- Don't interpret libssh2 error codes as number of bytes
phypNumDomainsGeneric() and phypListDomainsGeneric() return 0 in case
of an error. This makes it impossible to distinguish between an actual
error and no domains being defined on the hypervisor. It also turn the
no domains situation into an error. Return -1 in case of an error to
fix this problem.
* src/network/bridge_driver.c: when exec'ing dnsmaq, if there are
DHCP ranges defined, then compute and pass the --dhcp-lease-max
deriving the maximum number of leases
* src/conf/network_conf.h: extend the structure to store the range
* src/conf/network_conf.c: before adding a range parse the IP addresses
do some checking and keep the size
* src/internal.h (ATTRIBUTE_SENTINEL): New, it's a ggc feature and
protected as such
* src/util/buf.c (virBufferStrcat): Use it.
* src/util/ebtables.c (ebtablesAddRemoveRule): Use it.
* src/util/iptables.c (iptableAddRemoveRule: Use it.
* src/util/qparams.h (new_qparam_set, append_qparams): Use it.
* docs/apibuild.py: avoid breaking the API generator with that new
internal keyword macro
* include/libvirt/virterror.h src/util/virterror.c: add a new error
VIR_ERR_CONFIG_UNSUPPORTED for valid but unsupported configuration options
* src/conf/domain_conf.c: Throw an error if guestfwd address isn't IPv4
and cleanup a number of parsing return error values.
allows the following to be specified in a domain:
<channel type='pipe'>
<source path='/tmp/guestfwd'/>
<target type='guestfwd' address='10.0.2.1' port='4600'/>
</channel>
* proxy/Makefile.am: add network.c as dep of domain_conf.c
* docs/schemas/domain.rng src/conf/domain_conf.[ch]: extend the domain
schemas and the parsing/serialization side for the new construct
QEmu support will add the following on the qemu command line:
-chardev pipe,id=channel0,path=/tmp/guestfwd
-net user,guestfwd=tcp:10.0.2.1:4600-chardev:channel0
* src/qemu/qemu_conf.c: Add argument output for channel
* tests/qemuxml2(argv|xml)test.c: Add test for <channel> domain syntax
A character device's target (it's interface in the guest) had only a
single property: port. This patch is in preparation for adding targets
which require other properties.
Since this changes the conf type for character devices this affects
a number of drivers:
* src/conf/domain_conf.[ch] src/esx/esx_vmx.c src/qemu/qemu_conf.c
src/qemu/qemu_driver.c src/uml/uml_conf.c src/uml/uml_driver.c
src/vbox/vbox_tmpl.c src/xen/xend_internal.c src/xen/xm_internal.c:
target properties are moved into a union in virDomainChrDef, and a
targetType field is added to identify which union member should be
used. All current code which touches a virDomainChrDef is updated both
to use the new union field, and to populate targetType if necessary.
Current implementation of lxc driver creates vethN named
interface(s) in the host and passes as it is to a container.
The reason why it doesn't use ethN is due to the limitation
that one namespace cannot have multiple iterfaces that have
an identical name so that we give up creating ethN named
interface in the host for the container.
However, we should be able to allow the container to have
ethN by changing the name after clone(CLONE_NEWNET).
* src/lxc/lxc_container.c src/lxc/veth.c src/lxc/veth.h: do the clone
and then renames interfaces eth0 ... ethN to keep the interface names
familiar in the domain
* src/lxc/lxc_container.c src/lxc/lxc_controller.c src/lxc/lxc_driver.c
src/lxc/veth.c: most of cleanups are just capitalizing their messages
though, some fixes wrong error messages and awkward indentations, and
improves error messages.
* src/qemu/qemu.conf src/qemu/qemu_conf.c src/qemu/qemu_conf.h: there is
a new config type option for mac filtering
* src/qemu/qemu_bridge_filter.[ch]: new module for the ebtable entry points
* src/qemu/qemu_driver.c: plug the MAC filtering at the right places
in the domain life cycle
* src/Makefile.am po/POTFILES.in: add the new module
* configure.in: look for ebtables binary location if present
* src/Makefile.am: add the new module
* src/util/ebtables.[ch]: new module and internal APIs around
the ebtables binary
* src/libvirt_private.syms: export the symbols only internally
- Don't duplicate SystemError
- Use proper error code in domain_conf
- Fix a broken error call in qemu_conf
- Don't use VIR_ERR_ERROR in security driver (isn't a valid code in this case)
All drivers have copy + pasted inadequate error reporting which wraps
util.c:virGetHostname. Move all error reporting to this function, and improve
what we report.
Changes from v1:
Drop the driver wrappers around virGetHostname. This means we still need
to keep the new conn argument to virGetHostname, but I think it's worth
it.
This patch updates the xml parsing and formatting, and the associated
virInterfaceDef data structure to support IPv6, along the way adding
support for multiple protocols per interface, and multiple IP
addresses per protocol.
* src/conf/interface_conf.[ch]: update the structures, code for parsing
and serialization
This patch adds the flag VIR_INTERFACE_XML_INACTIVE to
virInterfaceGetXMLDesc's flags. When it is*not* set (the default), the
live interface info will be returned in the XML (in particular, the IP
address(es) and netmask(s) will be retrieved by querying the interface
directly, rather than reporting what's in the config file). The
backend of this is in netcf's ncf_if_xml_state() function.
* configure.in libvirt.spec.in: requires netcf >= 0.1.3
* include/libvirt/libvirt.h.in: adds flag VIR_INTERFACE_XML_INACTIVE
* src/conf/interface_conf.c src/interface/netcf_driver.c src/libvirt.c:
update the parsing and backend routines accordingly
* tools/virsh.c: change interface edit to inactive definition and
adds the inactive flag for interface dump
The minimal XML returned from ncf_if_xml_state() doesn't contain this
attribute (which makes no sense in the case of reporting current
status of the interface), and it was preventing it from passing
through the parse/format step.
* src/conf/interface_conf.[ch]: add a new virInterfaceStartMode value
and modify loading/saving accordingly
There are places where an interface will not have a mac address, and netcf
returns this as a NULL pointer rather than a pointer to an empty string.
Rather than checking for this all over the place in libvirt, just save it
in the virInterface object as an empty string.
* src/datatypes.c: allow NULL mac in virGetInterface()
introduced on commit 9231aa7d95
* src/qemu/qemu_driver.c: in qemudRemoveDomainStatus fix a reference
to an undefined variable buf and free up an allocated string
When building with --disable-nls, I got a few messages like this:
storage/storage_backend.c: In function 'virStorageBackendCreateQemuImg':
storage/storage_backend.c:571: warning: format not a string literal and no format arguments
Fix these up.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
qemudShutdownVMDaemon() calls qemudRemoveDomainStatus(), which
then calls virFileDeletePID(). qemudShutdownVMDaemon() then
unnecessarily calls virFileDeletePID() again. Remove this second
usage of it, and also slightly refactor qemudRemoveDomainStatus()
to VIR_WARN appropriate error messages.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The LXC driver was mistakenly returning -1 for lxcStartup()
in scenarios that are not an error. This caused the libvirtd
to quit for unprivileged users. This fixes the return code
of LXC driver, and also adds a "name" field to the virStateDriver
struct and logging to make it easier to find these problems
in the future
* src/driver.h: Add a 'name' field to state driver to allow
easy identification during failures
* src/libvirt.c: Log name of failed driver for virStateInit
failures
* src/lxc/lxc_driver.c: Don't return a failure code for
lxcStartup() if LXC is not available on this host, simply
disable the driver.
* src/network/bridge_driver.c, src/node_device/node_device_devkit.c,
src/node_device/node_device_hal.c, src/opennebula/one_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/secret/secret_driver.c, src/storage/storage_driver.c,
src/uml/uml_driver.c, src/xen/xen_driver.c: Fill in name
field in virStateDriver struct
If an error occurs between the allocation of an item and appending it
to the list, the item leaks. Free such orphaned items in error cases.
* src/esx/esx_vi.c: free orphaned items in error cases
The default transport for the VI API is HTTPS. If the server redirects
from HTTPS to HTTP the driver would silently follow that redirection.
The user assumes to communicate with the server over a secure transport
but isn't.
This patch disables automatical redirection following. The driver reports
an error if the server tries to redirect.
* src/esx/esx_vi.c: refactor the call to curl_easy_perform() into a
function and do error handling there, disable automatical redirection
following for curl
* src/esx/esx_vi.h: change the type of responseCode to int
Unified function naming scheme:
- 'lookup' functions query the ESX or vCenter for information
- 'get' functions return information from a local object
* src/esx/esx_driver.c, src/esx/esx_vi.[ch]: unify function naming
In order to register a new virtual machine the ESX driver needs to upload
a VMX file to a datastore. Try to put this file beside the main VMDK file
of the virtual machine. Change the disk selection for datastore detection
to choose the first file-based harddisk instead of just the first disk.
The first disk may be a CDROM disk and ISO images are normaly not located
in the virtual machine's directory.
* src/esx/esx_driver.c: change disk selection for datastore detection
This allows to use domain-xml-from-native with VMX files that reference
unavailable datastores.
* src/esx/esx_vmx.c: fallback to the preliminary name if the datastore
cannot be found
Rename virDomainIsActive to virDomainObjIsActive, and
virInterfaceIsActive to virInterfaceObjIsActive and finally
virNetworkIsActive to virNetworkObjIsActive.
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/conf/interface_conf.h, src/conf/network_conf.c,
src/conf/network_conf.h, src/lxc/lxc_driver.c,
src/network/bridge_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/qemu/qemu_driver.c,
src/test/test_driver.c, src/uml/uml_driver.c: Update for
renamed APIs.
Nearly all of the methods in src/util/util.h have error codes that
must be checked by the caller to correct detect & report failure.
Add ATTRIBUTE_RETURN_CHECK to ensure compile time validation of
this
* daemon/libvirtd.c: Add explicit check on return value of virAsprintf
* src/conf/domain_conf.c: Add missing check on virParseMacAddr return
value status & report error
* src/network/bridge_driver.c: Add missing OOM check on virAsprintf
and report error
* src/qemu/qemu_conf.c: Add missing check on virParseMacAddr return
value status & report error
* src/security/security_selinux.c: Remove call to virRandomInitialize
that's done in libvirt.c already
* src/storage/storage_backend_logical.c: Add check & log on virRun
return status
* src/util/util.c: Add missing checks on virAsprintf/Run status
* src/util/util.h: Annotate all methods with ATTRIBUTE_RETURN_CHECK
if they return an error status code
* src/vbox/vbox_tmpl.c: Add missing check on virParseMacAddr
* src/xen/xm_internal.c: Add missing checks on virAsprintf
* tests/qemuargv2xmltest.c: Remove bogus call to virRandomInitialize()
The virDomainObjPtr object stores state about a running domain.
This object is shared across all drivers so it is not appropriate
to include driver specific state here. This patch adds the ability
to request a blob of private data per domain object instance. The
driver must provide a allocator & deallocator for this purpose
THis patch abuses the virCapabilitiesPtr structure for storing the
allocator/deallocator callbacks, since it is already being abused
for other internal things relating to parsing. This should be moved
out into a separate object at some point.
* src/conf/capabilities.h: Add privateDataAllocFunc and
privateDataFreeFunc fields
* src/conf/domain_conf.c: Invoke the driver allocators / deallocators
when creating/freeing virDomainObjPtr instances.
* src/conf/domain_conf.h: Pass virCapsPtr into virDomainAssignDef
to allow access to the driver specific allocator function
* src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/qemu/qemu_driver.c,
src/test/test_driver.c, src/uml/uml_driver.c: Update for
change in virDomainAssignDef contract
__in6_u.__u6_addr16 is the private name for this struct member,
s6_addr16 is the public one
* src/util/network.c: dont use the private field, but the public one.
John Levon raised the issue that remoteIOEventLoop() poll call was
reissued after EINTR was caught making it uninterruptible.
* src/remote/remote_driver.c: catch EAGAIN instead as suggested by
Richard Jones
The current virDomainObjListPtr object stores domain objects in
an array. This means that to find a particular objects requires
O(n) time, and more critically acquiring O(n) mutex locks.
The new impl replaces the array with a virHashTable, keyed off
UUID. Finding a object based on UUID is now O(1) time, and only
requires a single mutex lock. Finding by name/id is unchanged
in complexity.
In changing this, all code which iterates over the array had
to be updated to use a hash table iterator function callback.
Several of the functions which were identically duplicating
across all drivers were pulled into domain_conf.c
* src/conf/domain_conf.h, src/conf/domain_conf.c: Change
virDomainObjListPtr to use virHashTable. Add a initializer
method virDomainObjListInit, and rename virDomainObjListFree
to virDomainObjListDeinit, since its not actually freeing
the container, only its contents. Also add some convenient
methods virDomainObjListGetInactiveNames,
virDomainObjListGetActiveIDs and virDomainObjListNumOfDomains
which can be used to implement the correspondingly named
public API entry points in drivers
* src/libvirt_private.syms: Export new methods from domain_conf.h
* src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_conf.c, src/openvz/openvz_driver.c,
src/qemu/qemu_driver.c, src/test/test_driver.c,
src/uml/uml_driver.c, src/vbox/vbox_tmpl.c: Update all code
to deal with hash tables instead of arrays for domains
We need to parse a source XML block for FindPoolSources, so this is a step
in sharing the parsing. The new storage pool XML 2 XML tests cover this area
pretty well to ensure we aren't causing regressions.
This patch adds an optional attribute to the <bootp> tag, that
allows to specify a TFTP server address other than the address of
the DHCP server itself.
This can be used to forward the BOOTP settings of the host down to the
guest. This is something that configurations such as Xen's default
network achieve naturally, but must be done manually for NAT.
* docs/formatnetwork.html.in: Document new attribute.
* docs/schemas/network.rng: Add it to schema.
* src/conf/network_conf.h: Add it to struct.
* src/conf/network_conf.c: Add it to parser and pretty printer.
* src/network/bridge_driver.c: Put it in the dnsmasq command line.
* tests/networkxml2xmlin/netboot-proxy-network.xml
tests/networkxml2xmlout/netboot-proxy-network.xml
tests/networkxml2xmltest.c: add new tests
In xenInotifyXendDomainsDirLookup() the wrong UUID variable is used
to search in the config info list.
In xenInotifyEvent() the event is dispatched if it's NULL.
Both were introduced in bc898df2c7.
We should always be using virGetHostname in place of
gethostname; thus add in a new syntax-check rule to make
sure no new uses creep in.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
We can slightly tighten up the regex's used to detect the use of
nonreentrant functions. We can also check src/util/virterror.c
by modifying a comment; I think it's worth it to get the additional
coverage.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
xenUnifiedDomainEventRegister() calls out to
virDomainEventCallbackListAdd(), which increments the reference
count on the connection. That is fine, but then
xenUnifiedDomainEventRegister() increments the usage count again,
leading to a usage count leak. Remove the increment in the xen
register, and the UnrefConnect in the xen unregister.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
If no matching device was found (cap == NULL) then no strdup() call
was made and *wwnn and *wwpn are untouched. Checking them for NULL
in this situation may result in reporting an false-positive OOM error
because *wwnn and *wwpn may be initialized to NULL by the caller.
Only check *wwnn and *wwpn for NULL if a matching device was found
(cap != NULL) and thus strdup() was called.
* src/conf/node_device_conf.c: only report an OOM error if there
really is one
virXPathNodeSet() could return -1 when doing an evaluation failure
due to xmlXPathEval() from libxml2 behaviour.
* src/util/xml.c: make sure we always return 0 unless the returned
XPath type is of the wrong type (meaning the query passed didn't
evaluate to a node set and code must be fixed)
https://bugzilla.redhat.com/show_bug.cgi?id=528575
virsh -c lxc:/// autostart vm1
was crashing the daemon
* src/lxc/lxc_conf.h src/lxc/lxc_conf.c: initialize the driver
autostartDir to avoid a NULL reference and implement autostart for LXC
Currently MAC address configuration of container veth is just ignored.
This patch implements the missing feature.
* src/lxc/veth.c, src/lxc/veth.h: add setMacAddr
* src/lxc/lxc_driver.c: set macaddr of container veth if specified
Most of the hash iterators need to modify either payload of
data args. The const annotation prevents this.
* src/util/hash.h, src/util/hash.c: Remove const-ness from
virHashForEach/Iterator
* src/xen/xm_internal.c: Remove bogus casts
A cgroup file returns integer value terminated with '\n' and remaining
it has sometimes harmful effects, for example it leads virStrToLong_ull
to fail.
* src/util/cgroup.c: strip out terminating \n when reading a value
If the the qemu and kvm binaries are the same, we don't include machine
types in the kvm domain info.
However, the code which refreshes the machine types info from the
previous capabilities structure first looks at the kvm domain's info,
finds it matches and then copies the empty machine types list over
for the top-level qemu domain.
That doesn't make sense, we shouldn't copy an empty machin types list.
* src/qemu/qemu_conf.c: qemudGetOldMachinesFromInfo(): don't copy an
empty machine types list.
* src/util/buf.c: if virBufferEscapeString was called on a buffer that
had 0 bytes of space, a size of -1 will be passed to snprintf, resulting
in a segmentation fault, this preallocate some space.
* src/conf/storage_conf.c src/conf/storage_conf.h: extend the enums
and values
* docs/schemas/storagepool.rng: add to the list of storage pool type
formats
Normally, when you migrate a domain from host A to host B,
the domain on host A remains defined but shutoff and the domain
on host B remains running but is a "transient". Add a new
flag to virDomainMigrate() to allow the original domain to be
undefined on source host A, and a new flag to virDomainMigrate() to
allow the new domain to be persisted on the destination host B.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When specifying bridge delay via network XML define, we were looking for
the 'delay' attribute, but would dump the value as 'forwardDelay'. Have
the output match the expected input (and schema).
The fread_file_lim() function uses fread() but never handles
EINTR results, causing unexpected failures when reading QEMU
help arg info. It was unneccessarily using FILE * instead
of plain UNIX file handles, which prevented use of saferead()
* src/util/util.c: Switch fread_file_lim over to use saferead
instead of fread, remove FILE * use, and rename
$ sudo virsh pool-start idontexist
10:58:18.716: warning : processCallDispatchReply:7612 : Method call error
error: failed to get pool 'idontexist'
error: Storage pool not found: no pool with matching name 'idontexist'
That warning doesn't server much purpose being printed via a virsh call. So
remove the message.
The logic for running the decompression programs was broken in
commit f238709304, so that for
non-raw formats the decompression program was never run, and
for raw formats, it tried to exec an argv[] with initial NULL
in the program name.
* src/qemu/qemu_driver.c: Fix logic in runing decompression program
If one has e.g.
<guest>
<os_type>hvm</os_type>
<arch name='x86_64'>
<wordsize>64</wordsize>
<emulator>/usr/bin/qemu-system-x86_64</emulator>
<machine>pc-0.11</machine>
<machine canonical='pc-0.11'>pc</machine>
<machine>pc-0.10</machine>
<machine>isapc</machine>
<domain type='qemu'>
</domain>
<domain type='kvm'>
<emulator>/usr/bin/kvm</emulator>
<machine>pc</machine>
<machine>isapc</machine>
</domain>
</arch>
</guest>
and start a guest with:
<domain type='kvm'>
...
<os>
<type arch='x86_64'>hvm</type>
...
</os>
</domain>
then the default machine type should be 'pc' and not 'pc-0.11'
Issue was reported by Anton Protopopov.
* src/capabilities.[ch]: pass the domain type to
virCapabilitiesDefaultGuestArch() and use it to look up the default
machine type from a specific guest domain if needed.
* src/conf/domain_conf.c, src/xen/xm_internal.c: update
* tests/qemuxml2argvdata/qemuxml2argv-machine-aliases2.xml: update
the domain type to 'kvm' and remove the machine type to check
that the default gets looked up correctly
Introduces several new public API options for migration
- VIR_MIGRATE_PEER2PEER: With this flag the client only
invokes the virDomainMigratePerform method, expecting
the source host driver to do whatever is required to
complete the entire migration process.
- VIR_MIGRATE_TUNNELLED: With this flag the actual data
for migration will be tunnelled over the libvirtd RPC
channel. This requires that VIR_MIGRATE_PEER2PEER is
also set.
- virDomainMigrateToURI: This is variant of the existing
virDomainMigrate method which does not require any
virConnectPtr for the destination host. Given suitable
driver support, this allows for all the same modes as
virDomainMigrate()
The URI for VIR_MIGRATE_PEER2PEER must be a valid libvirt
URI. For non-p2p migration a hypervisor specific migration
URI is used.
virDomainMigrateToURI without a PEER2PEER flag is only
support for Xen currently, and it involves XenD talking
directly to XenD, no libvirtd involved at all.
* include/libvirt/libvirt.h.in: Add VIR_MIGRATE_PEER2PEER
flag for migration
* src/libvirt_internal.h: Add feature flags for peer to
peer migration (VIR_FEATURE_MIGRATE_P2P) and direct
migration (VIR_MIGRATE_PEER2PEER mode)
* src/libvirt.c: Implement support for VIR_MIGRATE_PEER2PEER
and virDomainMigrateToURI APIs.
* src/xen/xen_driver.c: Advertise support for DIRECT migration
* src/xen/xend_internal.c: Add TODO item for p2p migration
* src/libvirt_public.syms: Export virDomainMigrateToURI
method
* src/qemu/qemu_driver.c: Add support for PEER2PEER and
migration, and adapt TUNNELLED migration.
* tools/virsh.c: Add --p2p and --direct args and use the
new virDomainMigrateToURI method where possible.
Re-arrange the doTunnelMigrate method putting all non-QEMU local
state setup steps first. This maximises chances of success before
then starting destination QEMU for receiving incoming migration.
Altogether this can reduce the number of goto cleanup labels to
something more managable.
* qemu/qemu_driver.c: Re-order steps in doTunnelMigrate
Simplify the doTunnelMigrate code by pulling out the code for
sending all tunnelled data into separate helper
* qemu/qemu_driver.c: introduce doTunnelSendAll() method
Simplify the doTunnelMigrate() method by pulling out the code
which opens/closes the virConnectPtr object into a parent
method
* qemu/qemu_driver.c: Add doPeer2PeerMigrate which then calls
doTunnelMigrate with dconn & dom_xml
virStreamAbort is needed when the caller wishes to terminate
the stream early, not when virStreamSend fails.
* qemu/qemu_driver.c: Fix calling of virStreamAbort during
tunnelled migration
The code for tunnelled migration was added in a dedicated method,
but the native migration code is still inline in the top level
qemudDomainMigratePerform() API. Move the native code out into
a dedicated method too to make things more maintainable.
* src/qemu/qemu_driver.c: Pull code for performing a native
QEMU migration out into separate method
The code for tunnelled migration wierdly required the app to pass
a NULL 'dconn' parameter, only to have to use virConnectOpen
itself shortly thereafter to get a 'dconn' object. Remove this
bogus check & require the app to always pas 'dconn' as before
* src/libvirt.c: Require 'dconn' for virDomainMigrate calls again
and remove call to virConnectOpen
Since virMigratePrepareTunnel() is used for migration over the
native libvirt connection, there is never any need to pass the
target URI to this method.
* daemon/remote.c, src/driver.h, src/libvirt.c, src/libvirt_internal.h,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/remote/remote_protocol.c, src/remote/remote_protocol.h,
src/remote/remote_protocol.x: Remove 'uri_in' parameter from
virMigratePrepareTunnel() method
Move the VIR_DRV_FEATURE* constants into libvirt_internal.h
since these flags are indicating whether APIs in the
libvirt_internal.h file are supported by a driver
* src/driver.h: Remove VIR_DRV_FEATURE* constants
* src/libvirt_internal.h: Add VIR_DRV_FEATURE* constants, using
an enum instead of #define
* src/internal.h: pull in libvirt_internal.h
* src/lxc/lxc.conf: new configuration file, there is currently one
tunable "log_with_libvirtd" that controls whether an lxc controller will
log only to the container log file, or whether it will honor libvirtd's
log output configuration. This provides a way to have libvirtd and its
children log to a single file. The default is to log to the container
log file.
* src/Makefile.am libvirt.spec.in: add the new file
* src/lxc/lxc_conf.[ch] src/lxc/lxc_driver.c: read the new log value
from the configuration file and pass the log informations when
starting up a container.
* src/lxc/lxc_driver.c src/lxc/lxc_controller.c: before launching the
lxc controller, have the lxc driver query the log settings and setup
envp[]. This provides the advantage of honoring the actual log
configuration instead of only what had been set in the environment.
The lxc controller now simply has to call virLogSetFromEnv().
When configuring logging settings, keep more information about the
output destination. Add accessors to retrieve the filter and output
settings in the original string form; this to be used to set up
environment for a child process that also logs.
* src/util/logging.[ch]: add virLogGetFilters and virLogGetOutputs
accessors and modify the internals (including virLogDefineOutput())
to save the data needed for the accessors
* src/util/util.[ch]: Add virFileAbsPath() function to ensure an
absolute path for a potentially realtive path.
* src/libvirt_private.syms: add it in libvirt private symbols
* configure.in: look for AppArmor and devel
* src/security/security_apparmor.[ch] src/security/security_driver.c
src/Makefile.am: add and plug the new driver
* src/security/virt-aa-helper.c: new binary which is used exclusively by
the AppArmor security driver to manipulate AppArmor.
* po/POTFILES.in: registers the new files
* tests/Makefile.am tests/secaatest.c tests/virt-aa-helper-test:
tests for virt-aa-helper and the security driver, secaatest.c is
identical to seclabeltest.c except it initializes the 'apparmor'
driver instead of 'selinux'
The patch implements the missing memory control APIs for lxc, i.e.,
domainGetMaxMemory, domainSetMaxMemory, domainSetMemory, and improves
domainGetInfo to return proper amount of used memory via cgroup.
* src/libvirt_private.syms: Export virCgroupGetMemoryUsage
and add missing virCgroupSetMemory
* src/lxc/lxc_driver.c: Implement missing memory functions
* src/util/cgroup.c, src/util/cgroup.h: Add the function
to get used memory
When James Morris originally submitted his sVirt patches (as seen in
libvirt 0.6.1), he did not require on disk labelling for
virSecurityDomainRestoreImageLabel. A later commit[2] changed this
behavior to assume on disk labelling, which halts implementations for
path-based MAC systems such as AppArmor and TOMOYO where
vm->def->seclabel is required to obtain the label.
* src/security/security_driver.h src/qemu/qemu_driver.c
src/security/security_selinux.c: adds the 'virDomainObjPtr vm'
argument back to *RestoreImageLabel
Add virNodeDeviceParseFile, and make virNodeDeviceParseNode non-static. These
will be used by the test driver.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Add a simple 'computer' device for the default driver. Only implement
the basic calls, no creation or destroy happening.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Fix migration, broken in two different ways by the QEMU monitor
abstraction. Note that the QEMU console emits a "\r\n" as the
line-ending.
* src/qemu/qemu_monitor_text.c (qemuMonitorGetMigrationStatus):
Fix "info migrate" command and its output's parsing.
Implementation of tunnelled migration, using a Unix Domain Socket
on the qemu backend. Note that this requires very new versions of
qemu (0.10.7 at least) in order to get the appropriate bugfixes.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The upcoming tunnelled migration needs to be able to set
a migration in progress in the background, as well as
be able to cancel a migration when a problem has happened.
This patch allows for both of these to properly work.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
A simple misplaced break out of a switch results in:
libvir: error : Failed to open file '/sys/bus/pci/devices/0000:00:54c./vendor': No such file or directory
libvir: error : Failed to open file '/sys/bus/pci/devices/0000:00:54c./device': No such file or directory
libvir: error : this function is not supported by the hypervisor: Failed to read product/vendor ID for 0000:00:54c.
when trying to passthrough a USB host device to qemu.
* src/security_selinux.c: fix a switch/break thinko
* src/conf/domain_conf.c: a simple typo in an XML domain file could lead
to a crash, because we called STRPREFIX() on the looked up value without
checking it was non-null.
* src/conf/domain_conf.c: when declaring a <interface type="bridge">
tag, <source> needs a "bridge" attribute, but the parser complains
about a missing "dev" attribute.
* docs/schemas/domain.rng: allow one <description> tag in the top level
of the <domain> to store user information as text
* src/conf/domain_conf.c src/conf/domain_conf.h: extend the structure
to store this text, grab it at parse time and save it back when
present after <uuid>
Function comments for virStreamEvent{Add,Update,Remove}Callback() are
missing a trailing ':'. Therefore apibuild.py fails to parse the comment
and warns about the missing ':'.
* docs/libvirt-api.xml, docs/libvirt-refs.xml: updated by apibuild.py
* src/libvirt.c: add missing ':' in function comments
Use virStorageFileGetMetadata() to find any backing stores for images
and re-label them
Without this, qemu cannot access qcow2 backing files, see:
https://bugzilla.redhat.com/497131
* src/security/security_selinux.c: re-label backing store files in
SELinuxSetSecurityImageLabel()
Finally, we get to the point of all this.
Move virStorageGetMetadataFromFD() to virStorageFileGetMetadataFromFD()
and move to src/util/storage_file.[ch]
There's no functional changes in this patch, just code movement
* src/storage/storage_backend_fs.c: move code from here ...
* src/util/storage_file.[ch]: ... to here
* src/libvirt_private.syms: export virStorageFileGetMetadataFromFD()
Introduce a metadata structure and make virStorageGetMetadataFromFD()
fill it in.
* src/util/storage_file.h: add virStorageFileMetadata
* src/backend/storage_backend_fs.c: virStorageGetMetadataFromFD() now
fills in the virStorageFileMetadata structure
Prepare the code probing a file's format and associated metadata for
moving into libvirt_util.
* src/storage/storage_backend_fs.c: re-factor the format and metadata
probing code in preparation for moving it
Rename virStorageVolFormatFileSystem to virStorageFileFormat and
move to src/util/storage_file.[ch]
* src/Makefile.am: add src/util/storage_file.[ch]
* src/conf/storage_conf.[ch]: move enum from here ...
* src/util/storage_file.[ch]: .. to here
* src/libvirt_private.syms: update To/FromString exports
* src/storage/storage_backend.c, src/storage/storage_backend_fs.c,
src/vbox/vbox_tmpl.c: update for above changes
When using VNC for graphics + keyboard + mouse, we shouldn't
then use the host OS for audio. Audio should go back over
VNC.
When using SDL for graphics, we should use the host OS for
audio since that's where the display is. We need to allow
certain QEMU env variables to be passed through to guest
too to allow choice of QEMU audio backend.
* qemud/libvirtd.sysconf: Mention QEMU/SDL audio env vars
* src/qemu_conf.c: Passthrough QEMU/SDL audio env for SDL display,
disable host audio for VNC display
Defines the extensions to the remote protocol for generic
data streams. Adds a bunch of helper code to the libvirtd
daemon for working with data streams.
* daemon/Makefile.am: Add stream.c/stream.h to build
* daemon/stream.c, qemud/stream.h: Generic helper functions for
creating new streams, associating streams with clients, finding
existing streams for a client and removing/deleting streams.
* src/remote/remote_protocol.x: Add a new 'REMOTE_STREAM' constant
for the 'enum remote_message_type' for encoding stream data
in wire messages. Add a new 'REMOTE_CONTINUE' constant to
'enum remote_message_status' to indicate further data stream
messsages are expected to follow. Document how the
remote_message_header is used to encode data streams
* src/remote/remote_protocol.h: Regenerate
* daemon/dispatch.c: Remove assumption that a error message
sent to client is always type=REMOTE_REPLY. It may now
also be type=REMOTE_STREAM. Add convenient method for
sending outgoing stream data packets. Log and ignore
non-filtered incoming stream packets. Add a method for
serializing a stream error message
* daemon/dispatch.h: Add API for serializing stream errors
and sending stream data packets
* daemon/qemud.h: Add struct qemud_client_stream for tracking
active data streams for clients. Tweak filter function
operation so that it accepts a client object too.
* daemon/qemud.c: Refactor code for free'ing message objects
which have been fully transmitted into separate method.
Release all active streams when client shuts down. Change
filter function to be responsible for queueing the message
* include/libvirt/libvirt.h.in: Public API contract for
virStreamPtr object
* src/libvirt_public.syms: Export data stream APIs
* src/libvirt_private.syms: Export internal helper APIs
* src/libvirt.c: Data stream API driver dispatch
* src/datatypes.h, src/datatypes.c: Internal helpers for virStreamPtr
object
* src/driver.h: Define internal driver API for streams
* .x-sc_avoid_write: Ignore src/libvirt.c because it trips
up on comments including write()
* python/Makefile.am: Add libvirt-override-virStream.py
* python/generator.py: Add rules for virStreamPtr class
* python/typewrappers.h, python/typewrappers.c: Wrapper
for virStreamPtr
* docs/libvirt-api.xml, docs/libvirt-refs.xml: Regenerate
with new APIs
* src/qemu/qemu_monitor_text.c: Always print command and reply
in qemuMonitorCommandWithHandler. Print all args in each monitor
command API & remove redundant relpy printing
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Add new
qemuMonitorRemoveHostNetwork() command for removing host
networks
* src/qemu/qemu_driver.c: Convert NIC hotplug methods over
to use qemuMonitorRemoveHostNetwork()
* src/qemu/qemu_conf.h, src/qemu/qemu_conf.c: Remove prefix arg
from qemuBuildHostNetStr which is no longer required
* src/qemu/qemu_driver.c: Refactor to use qemuMonitorAddHostNetwork()
API for adding host network
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new
qemuMonitorAddHostNetwork() method for adding host networks
* src/qemu/qemu_conf.c: Remove separator from qemuBuildNicStr()
args, and remove hardcoded 'nic' prefix. Leave it upto callers
instead
* src/qemu/qemu_driver.c: Switch over to using the new
qemuMonitorAddPCINetwork() method for NIC hotplug
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new
qemuMonitorAddPCINetwork API for PCI network device hotplug
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new
qemuMonitorCloseFileHandle and qemuMonitorSendFileHandle
APIs for processing file handles
* src/qemu/qemu_driver.c: Convert NIC hotplug method over to
use qemuMonitorCloseFileHandle and qemuMonitorSendFileHandle
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new
API qemuMonitorAddPCIDisk()
* src/qemu/qemu_driver.c: Convert over to using the new
qemuMonitorAddPCIDisk() method, and remove now obsolete
qemudEscape() method
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new API
qemuMonitorRemovePCIDevice() for removing PCI device
* src/qemu/qemu_driver.c: Convert all places removing PCI devices
over to new qemuMonitorRemovePCIDevice() API
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new
API qemuMonitorAddPCIHostDevice()
* src/qemu/qemu_driver.c: Switch to using qemuMonitorAddPCIHostDevice()
for PCI host device hotplug
One API adds an exact device based on bus+dev, the other adds
any device matching vendor+product
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new
qemuMonitorAddUSBDeviceExact() and qemuMonitorAddUSBDeviceMatch()
commands.
* src/qemu/qemu_driver.c: Switch over to using the new
qemuMonitorAddUSBDeviceExact() and qemuMonitorAddUSBDeviceMatch()
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new
qemuMonitorAddUSBDisk() API
* src/qemu/qemu_driver.c: Switch USB disk hotplug to the new
src/qemu/qemu_driver.c API.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new
qemuMonitorMigrateToCommand() API
* src/qemu/qemu_driver.c: Switch over to using the
qemuMonitorMigrateToCommand() API for core dumps and save
to file APIs
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new API
qemuMonitorMigrateToHost() for doing TCP migration
* src/qemu/qemu_driver.c: Convert to use qemuMonitorMigrateToHost().
Also handle proper URIs (tcp:// as well as tcp:)
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new
qemuMonitorGetMigrationStatus() command.
* src/qemu/qemu_driver.c: Use new qemuMonitorGetMigrationStatus()
command to check completion status.
* src/qemu/qemu_driver.c: Use new qemuMonitorSetMigrationSpeed()
API during migration
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Add new
qemuMonitorSetMigrationSpeed() API
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add a new
qemuMonitorGetBlockStatsInfo() command
* src/qemu/qemu_driver.c: Remove directly use of blockstats in
favour of calling qemuMonitorGetBlockStatsInfo()
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Add new APIs
qemuMonitorSaveVirtualMemory() and qemuMonitorSavePhysicalMemory()
* src/qemu/qemu_driver.c: Use the new qemuMonitorSaveVirtualMemory()
and qemuMonitorSavePhysicalMemory() APIs
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new APis
qemuMonitorChangeMedia and qemuMonitorEjectMedia. Pull in code
for qemudEscape
* src/qemu/qemu_driver.c: Remove code that directly issues 'eject'
and 'change' commands in favour of API calls.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add new
qemuMonitorSetBalloon() based on existing code in
qemudDomainSetMemoryBalloon
* src/qemu/qemu_driver.c: Remove use of qemudDomainSetMemoryBalloon()
in favour of qemuMonitorSetBalloon(). Fix error code when balloon
is not supported
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Pull old
qemudDomainGetMemoryBalloon() code into a new method called
qemuMonitorGetBalloonInfo()
* src/qemu/qemu_driver.c: Update to call qemuMonitorGetBalloonInfo()
and remove qemudDomainGetMemoryBalloon().
* src/qemu/qemu_driver.c: Remove use of 'system_powerdown'
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Add a new
qemuMonitorSystemPowerdown() api call
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add a new
qemuMonitorStopCPUs() API
* src/qemu/qemu_driver.c: Replace direct monitor commands for 'stop'
with qemuMonitorStopCPUs()
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Rename
Rename qemudMonitorSendCont to qemuMonitorStartCPUs
* src/qemu/qemu_driver.c: Update callers for new name
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add a
new qemuMonitorSetVNCPassword() API
* src/qemu/qemu_driver.c: Refactor qemudInitPasswords to
call qemuMonitorSetVNCPassword()
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Add a new
qemuMonitorGetCPUInfo() command
* src/qemu/qemu_driver.c: Refactor qemudDetectVcpuPIDs to
use qemuMonitorGetCPUInfo()
Pull out all the QEMU monitor interaction code to a separate
file. This will make life easier when we need to drop in a
new implementation for the forthcoming QMP machine friendly
monitor support.
Next step is to add formal APIs for each monitor command,
and remove direct commands for sending/receiving generic
data.
* src/Makefile.am: Add qemu_monitor.c to build
* src/qemu/qemu_driver.c: Remove code for monitor interaction
* src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: New
file for monitor interaction
* po/POTFILES.in: Add src/qemu/qemu_monitor_text.c
* src/conf/interface_conf.c: This was causing subsequent calls to
virXPathxxx() to fail, since ctxt->node was left pointing at the
dhcp node, rather than the protocol node.
* src/util/xml.c: The virXPath... function take extra care to preserve
the XPath context node (ctxt->node) but in the case of virXPathString
and virXPathBoolean they forgot to do this on the error path. This
patch fixes this and move all ctxt->node = relnode instuctions just
after the xmlXPathEval() to make sure this doesn't happen if this code
is modified.
The python method help docs are copied across from the C
funtion comments, but in the process all line breaks and
indentation was being lost. This made the resulting text
and code examples completely unreadable. Both the API
doc extractor and the python generator were destroying
whitespace & this fixes them to preserve it exactly.
* docs/apibuild.py: Preserve all whitespace when extracting
function comments. Print function comment inside a <![CDATA[
section to fully preserve all whitespace. Look for the
word 'returns' to describe return values, instead of 'return'
to avoid getting confused with code examples including the
C 'return' statement.
* python/generator.py: Preserve all whitespace when printing
function help docs
* src/libvirt.c: Change any return parameter indicated by
'return' to be 'returns', to avoid confusing the API extractor
* docs/libvirt-api.xml: Re-build for fixed descriptions
An inaccessible datastore has no valid URL property so don't
access its URI property.
* src/esx/esx_vi.c: esxVI_LookupDatastoreByName(): check if datastore is
accessible before accessing its URL property
* src/esx/esx_vmx.c: update to changed datastore properties
A given domain XML gets converted to a VMX config, uploaded to the host
and registered as new virtual machine.
* src/esx/esx_driver.c: refactor datastore related path parsing into
esxUtil_ParseDatastoreRelatedPath()
* src/esx/esx_util.[ch]: add esxUtil_ParseDatastoreRelatedPath()
* src/esx/esx_vi.[ch]: add esxVI_Context_UploadFile(), add datastores to
the traversal in esxVI_BuildFullTraversalSpecList(), add
esxVI_LookupDatastoreByName()
* src/esx/esx_vi_methods.[ch]: add esxVI_RegisterVM_Task()
* src/esx/esx_vi_types.c: make some error message more verbose
* src/esx/esx_vmx.[ch]: add esxVMX_AbsolutePathToDatastoreRelatedPath()
to convert a path into a datastore related path, add esxVMX_ParseFileName()
to convert from VMX path format to domain XML path format, extend the other
parsing function to be datastore aware, add esxVMX_FormatFileName() to
convert from domain XML path format to VMX path format, fix VMX ethernet
entry formating
* tests/esxutilstest.c: add test for esxUtil_ParseDatastoreRelatedPath()
* tests/vmx2xmldata/*: update domain XML files to use datastore related paths
* tests/xml2vmxdata/*: update domain XML files to use datastore related paths,
update VMX files to use absolute paths
Add esxVI_Occurence enum to describe expected occurence of items
* src/esx/esx_driver.c: update the use of esxVI_LookupVirtualMachineByUuid()
* src/esx/esx_vi.c: add an esxVI_Occurence parameter to
esxVI_LookupVirtualMachineByUuid() and take care if esxVI_FindByUuid()
can't find anything for a given uuid
* src/esx/esx_vi.h: add esxVI_Occurence enum
* src/esx/esx_vi_methods.c: expect null or more items to be returned
from esxVI_FindByUuid()
* src/esx/esx_driver.c: add esxSupportsLongMode() and update esxCapsInit()
* src/esx/esx_vi.[ch]: Add AnyType handling for lists
* src/esx/esx_vi_types.c: bind VI type HostCpuIdInfo
Extend and cleanup the VMX to domain XML mapping. Add the domain XML to
VMX mapping functions.
* src/esx/esx_driver.c: add esxDomainXMLToNative()
* src/esx/esx_vmx.[ch]: add esxVMX_SCSIDiskNameToControllerAndID(),
esxVMX_IDEDiskNameToControllerAndID(), esxVMX_FloppyDiskNameToController(),
esxVMX_GatherSCSIControllers(), add basic handling for the VMX guestOS entry
to distinguish between i686 and x86_64, make SCSI virtualDev VMX entry
optional as it should be, map the VMX networkName entry to the domain XML
interface bridge name, add basic mapping for serial devices in pipe mode,
add several esxVMX_Format*() functions
This enables the auth callback to automatically distinguish between
requests for ESX host and vCenter credentials.
* src/esx/esx_util.[ch]: set challenge for auth callback to hostname
* src/esx/esx_driver.c: add esxNodeGetFreeMemory(), cache IP address
* src/esx/esx_vi.[ch]: refactor resource pool query into esxVI_GetResourcePool()
* src/esx/esx_vi_types.[ch]: bind VI type ResourcePoolResourceUsage
Currently, libvirtd will start a dnsmasq process for the virtual
network, but (aside from killing the dnsmasq process and replacing it),
there's no way to define tftp boot options.
This change introduces the appropriate tags to the dhcp configuration:
<network>
<name>default</name>
<bridge name="virbr%d" />
<forward/>
<ip address="192.168.122.1" netmask="255.255.255.0">
<tftp root="/var/lib/tftproot" />
<dhcp>
<range start="192.168.122.2" end="192.168.122.254" />
<bootp file="pxeboot.img"/>
</dhcp>
</ip>
</network>
When the attributes are present, these are passed to the
arguments to dnsmasq:
dnsmasq [...] --enable-tftp --tftp-root /srv/tftp --dhcp-boot pxeboot.img
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^
from <tftp /> from <bootp />
At present, only local tftp servers are supported (ie, dnsmasq runs as
the tftp server), but we could improve this in future by adding a
server= attribute.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2009-09-21 Paolo Bonzini <pbonzini@redhat.com>
Jeremy Kerr <jk@ozlabs.org>
* docs/formatnetwork.html.in: Document new tags.
* docs/formatnetwork.html: Regenerate.
* docs/schemas/network.rng: Update.
* src/network_conf.c (virNetworkDefFree): Free new fields.
(virNetworkDHCPRangeDefParseXML): Parse <bootp>.
(virNetworkIPParseXML): New, parsing <dhcp> and <tftp>.
(virNetworkDefParseXML): Use virNetworkIPParseXML instead of
virNetworkDHCPRangeDefParseXML.
(virNetworkDefFormat): Pretty print new fields.
* src/network_conf.h (struct _virNetworkDef): Add netboot fields.
* src/network_driver.c (networkBuildDnsmasqArgv): Add
TFTP and BOOTP arguments.
* tests/Makefile.am (EXTRA_DIST): Add networkschemadata.
* tests/networkschematest: Look in networkschemadata.
* tests/networkschemadata/netboot-network.xml: New.
Add the virStrncpy function, which takes a dst string, source string,
the number of bytes to copy and the number of bytes available in the
dest string. If the source string is too large to fit into the
destination string, including the \0 byte, then no data is copied and
the function returns NULL. Otherwise, this function copies n bytes
from source into dst, including the \0, and returns a pointer to the
dst string. This function is intended to replace all unsafe uses
of strncpy in the code base, since strncpy does *not* guarantee that
the buffer terminates with a \0.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Latest upstream QEMU can be built with Xen support, which introduces
a -xen-domid argument. This was mistakenly detected as -domid due
to old Xenner support. Adapt to cope with both syntax. Also only
set domid if the virt type is xen, or the guest type is xen
* src/qemu_conf.c, src/qemu_conf.h: Detect new -xen-domid flag in
preference to -domid.
* tests/qemuxml2argvdata/qemuxml2argv-bootloader.args,
tests/qemuxml2argvdata/qemuxml2argv-input-xen.args: Add missing
-domid param
* tests/qemuxml2argvdata/qemuxml2argv-misc-uuid.args: Remove bogus
-boot param.
* tests/qemuxml2argvtest.c: Add missing QEMUD_CMD_FLAG_DOMID params
The commit cb51aa48a7 "Fix up connection
reference counting." changed the driver closing and virConnectPtr
unref-logic in virConnectClose().
Before this commit virConnectClose() closed all drivers of the given
virConnectPtr and virUnrefConnect()'ed it afterwards. After this
commit the driver-closing is done in virUnrefConnect() if and only if
the ref-count of the virConnectPtr dropped to zero.
This change in execution order leads to a virConnectPtr leak, at least
for connections to Xen.
The relevant call sequences:
virConnectOpen() -> xenUnifiedOpen() ...
... xenInotifyOpen() -> virConnectRef(conn)
... xenStoreOpen() -> xenStoreAddWatch() -> conn->refs++
virConnectClose() -> xenUnifiedClose() ...
... xenInotifyClose() -> virUnrefConnect(conn)
... xenStoreClose() -> xenStoreRemoveWatch() -> virUnrefConnect(conn)
Before the commit this additional virConnectRef/virUnrefConnect calls
where no problem, because virConnectClose() closed the drivers
explicitly and the additional refs added by the Xen subdrivers were
removed properly. After the commit this additional refs result in a
virConnectPtr leak (including a leak of the hypercall file handle;
that's how I noticed this problem), because now the drivers are only
close if and only if the ref-count drops to zero, but this cannot
happen anymore, because the additional refs from the Xen subdrivers
would only be removed if the drivers get closed, but that doesn't
happen because the ref-count cannot drop to zero.
The fix for this problem is simple: remove the
virConnectRef/virUnrefConnect calls from the Xen subdrivers (see
attached patch). Maybe someone could explain why the Xen Inotify and
Xen Store driver do this extra ref-counting, but none of the other Xen
subdrivers. It seems unnecessary to me and can be removed without
problems.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
* src/conf/domain_conf.c: Don't assume all virDomainObjPtr have
a non-NULL monitor_chr field in virDomainObjFormat.
* src/lxc/lxc_driver.c: Implement suspend/resume driver APis
* src/util/cgroup.c, src/util/cgroup.h: Support the 'freezer'
cgroup controller
* src/libvirt_private.syms: Export virCgroupSetFreezerState
and virCgroupGetFreezerState
When making changes to the remote protocol, src/ is always built
first, so rpcgen should live there, to avoid having to run make
in the 'daemon/' directory before building src/
* src/Makefile.am: Add rules for rpcgen, and drop -I../daemon from
remote client build
* daemon/Makefile.am: Add -I../src/remote/ to libvirtd build
and remove rpcgen rules
* daemon/libvirtd.c: Adapt include of remote_driver.h taking
into account new -I flag
* daemon/remote_protocol.c, daemon/remote_protocol.h,
daemon/remote_protocol.x: Move to src/remote/
* daemon/rpcgen_fix.pl: Move to src/remote/rpcgen_fix.pl
* src/capabilities.c, src/capabilities.h, src/domain_conf.c,
src/domain_conf.h, src/domain_event.c, src/domain_event.h,
src/interface_conf.c, src/interface_conf.h,
src/network_conf.c, src/network_conf.h, src/node_device_conf.c,
src/node_device_conf.h, src/secret_conf.c, src/secret_conf.h,
src/storage_conf.c, src/storage_conf.h, src/storage_encryption_conf.c,
src/storage_encryption_conf.h: Move to src/conf/
* src/Makefile.am: Add -Isrc/conf to the individual build targets
which need to use XML config APIs. Remove LIBXML_CFLAGS, LIBSSH2_CFLAGS
and SELINUX_CFLAGS from global INCLUDES and only have them in build
targets which actually need them. Create a libvirt_conf.la
convenience library for all config parsers
* src/hostusb.h: Remove bogus include of domain_conf.h
* tests/Makefile.am: Add -Isrc/conf. Remove bogus -I$builddir/src
since it never has any generated header files
* daemon/Makefile.am: Add -Isrc/conf
* proxy/Makefile.am: Add -Isrc/conf and cope with renamed files
* src/hash.c: Remove bogus include of libxml/threads.h
* daemon/default-network.xml: Move to src/network/default.xml
* daemon/libvirtd_qemu.aug, daemon/test_libvirtd_qemu.aug: Move
to src/qemu/
* src/qemu.conf: Move to src/qemu/qemu.conf
* daemon/Makefile.am: Remove rules for default-nmetwork.xml and
libvirtd_qemu.aug and test_libvirtd_qemu.aug. Fix typo in
uninstall-local that would install polkit again.
* src/Makefile.am: Add rules for installing network/default.xml
and the qemu/*.aug files. Add test case for QEMU augeas files.
Add uninstall-local rule for files/directories created during
install. Rename install-exec-local to install-data-local.
Only install qemu.conf if WITH_QEMU is set.
* tests/networkschematest: Update for XML location move
Move the virsh tool and its man page into the tools directory
* Makefile.am: Remove rules for virsh.1 man page
* virsh.1: Remove auto-generated file
* docs/Makefile.am: Remove rules for virsh.pod man page
* docs/virsh.pod: Move to tools/ directory
* src/Makefile.am, src/.gitignore: Remove rules for virsh
* src/console.c, src/console.h, src/*.ico, src/virsh_win_icon.rc,
src/virsh.c: Move into tools/ directory
* tools/Makefile.am: Add rules for building virsh
* tools/.gitignore: Ignore virsh built files
* tests/virshtest.c, tests/int-overflow: Update for new
virsh location