Changeset
commit 5073aa994a
Author: Cole Robinson <crobinso@redhat.com>
Date: Mon Jan 11 11:40:46 2010 -0500
Added support for product/vendor based passthrough, but it only
worked at the security driver layer. The main guest XML config
was not updated with the resolved bus/device ID. When the QEMU
argv refactoring removed use of product/vendor, this then broke
launching guests.
THe solution is to move the product/vendor resolution up a layer
into the QEMU driver. So the first thing QEMU does is resolve
the product/vendor to a bus/device and updates the XML config
with this info. The rest of the code, including security drivers
and QEMU argv generated can now rely on bus/device always being
set.
* src/util/hostusb.c, src/util/hostusb.h: Split vendor/product
resolution code out of usbGetDevice and into usbFindDevice.
Add accessors for bus/device ID
* src/security/virt-aa-helper.c, src/security/security_selinux.c,
src/qemu/qemu_security_dac.c: Remove vendor/product from the
usbGetDevice() calls
* src/qemu/qemu_driver.c: Use usbFindDevice to resolve vendor/product
into a bus/device ID
The pci_del command is not being ported to QMP. Convert all the
QEMU hotplug code over to use device_del whenever it is available
to avoid the pci_del problem
* src/qemu/qemu_driver.c: Convert unplug code to device_del
Previously hot-unplug could not be supported for USB devices
in QEMU, since usb_del required the guest visible address
which libvirt never knows. With 'device_del' command we can
now unplug based on device alias, so support that.
* src/qemu/qemu_driver.c: Use device_del to remove USB devices
Upstart crashes & burns in a heap if $TERM environment variable
is missing. Presumably the kernel always sets this when booting
init on a real machine, so libvirt should set it for containers
too.
To make a typical inittab / mingetty setup happier, we need to
symlink the primary console /dev/pts/0 to /dev/tty1.
Improve logging in certain scenarios to make troubleshooting
easier
* src/lxc/lxc_container.c: Create /dev/tty1 and set $TERM
When using the 'ns' cgroup controller, the moment a process calls
'unshare(CLONE_NEWNS)', it will be given a private cgroup tree
under its current location. This really messages up the LXC
controller process, because it ends up creating the containers'
cgroup in the wrong place. The fix is fairly easy, just move
the cgroup setup before the code which calls unshare(). The
'ns' controller will still create extra undesired cgroups, but
they at least won't break libvirt's setup now.
The patch also adds a missing cgroups allow rule for /dev/tty
device node
When getting the driver/domain cgroup it is possible to specify
whether it should be auto created. If auto-creation was turned
off, libvirt still mistakenly created its own top level cgroup
* src/util/cgroup.c: Honour autocreate flag for top level cgroup
This allows the config to have a setting that means "leave it alone",
eg when building a pool where the directory already exists the user
may want the current uid/gid of the directory left intact. This
actually gets us back to older behavior - before recent changes to the
pool building code, we weren't as insistent about honoring the uid/gid
settings in the XML, and virt-manager was taking advantage of this
behavior.
As a side benefit, removing calls to getuid/getgid from the XML
parsing functions also seems like a good idea. And having a default
that is different from a common/useful value (0 == root) is a good
thing in general, as it removes ambiguity from decisions (at least one
place in the code was checking for (perms.uid == 0) to see if a
special uid was requested).
Note that this will only affect newly created pools and volumes. Due
to the way that the XML is parsed, then formatted for newly created
volumes, all existing pools/volumes already have an explicit uid and
gid set.
src/conf/storage_conf.c: Remove calls to setuid/setgid for default values
of uid/gid, and set them to -1 instead
src/storage/storage_backend.c:
src/storage/storage_backend_fs.c:
Make account for the new default values of perms.uid
and perms.gid.
With the recent changes to the linking defaults in Fedora 13 (namely
enabling --no-add-needed behaviour by default), we have to pass the
dlopen()-providing libraries directly at the link of the module; use the
same AC_SEARCH_LIBS function as used before to look for it and add it to
the Makefile.
QEMU has a monitor command 'set_cpu' which allows a specific
CPU to be toggled between online& offline state. libvirt CPU
hotplug does not work in terms of individual indexes CPUs.
Thus to support this, we iteratively toggle the online state
when the total number of vCPUs is adjusted via libvirt
NB, currently untested since QEMU segvs when running this!
* src/qemu/qemu_driver.c: Toggle online state for CPUs when
doing hotplug
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
monitor API for toggling a CPU's online status via 'set_cpu
The storage backend implementations all presume that the XML parser
is validating correctness of the source specification. The check for
a source device was lost at some point. This allowed for a potential
crash in the disk backend. Re-introduce the sanity check
* src/conf/storage_conf.c: Re-add check for source device
When trying to delete a pool the error message claimed the volume
could not be deleted.
* src/storage/storage_driver.c: Error message referred to
volumes instead of pools
The QMP code was running query-migration instead of query-migrate.
This doesn't work so well
* src/qemu/qemu_monitor_json.c: s/query-migration/query-migrate/
The code to remove the cgroup after QEMU failed to startup could
be obscuring a real error from earlier on. It is not neccessary
to raise an error in this case, so tell cgroups to keep quiet
* src/qemu/qemu_driver.c: Don't raise cgroups error in QEMU start
cleanup code.
The QEMU hotunplug code for PCI devices was looking at host
devices in the guest config without first filtering non
PCI devices. This means it was reading garbage
* src/qemu/qemu_driver.c: Filter out non-PCI devices
Commit 3c12a67b76 added
a dependency on the NFS_SUPER_MAGIC macro, which is
defined in linux/magic.h. Unfortunately linux/magic.h
is not available in RHEL-5, and causes a compile error.
Just define it locally, since this is something that
can't change.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Move *all* file operations related to creation and writing of libvirt
header to the domain save file into a hook function that is called by
virFileOperation. First try to call virFileOperation as root. If that
fails with EACCESS, and (in the case of Linux) statfs says that we're
trying to save the file on an NFS share, rerun virFileOperation,
telling it to fork a child process and setuid to the qemu user. This
is the only way we can successfully create a file on a root-squashed
NFS server.
This patch (along with setting dynamic_ownership=0 in qemu.conf)
makes qemudDomainSave work on root-squashed NFS.
* src/qemu/qemu_driver.c: provide new qemudDomainSaveFileOpHook()
utility, use it in qemudDomainSave() if normal creation of the
file as root failed, and after checking the filesystem type for
the storage is NFS. In that case we also bypass the security
driver, as this would fail on NFS.
If qemudDomainRestore fails to open the domain save file, create a
pipe, then fork a process that does setuid(qemu_user) and opens the
file, then reads this file and stuffs it into the pipe. the parent
libvirtd process will use the other end of the pipe as its fd, then
reap the child process after it's done reading.
This makes domain restore work on a root-squash NFS share that is only
visible to the qemu user.
* src/qemu/qemu_driver.c: add new qemudOpenAsUID() helper function,
and use it in qemudDomainRestore() if reading as root directly failed.
The USB/PCI device hotplug code for the QEMU driver was forgetting
to allocate a unique device alias.
* src/qemu/qemu_driver.c: Fill in device alias for USB/PCI devices
The code assumed that 'device_add' returned an empty string upon
success. This is not true, it sometimes prints random debug info.
THus we need to check for an explicit fail string
* src/qemu/qemu_monitor_text.c: Fix error checking of the device_add
monitor command
Another warning caught by coverity. Continue to perform best-effort
closing and resource release, but warn the caller about the failure.
* src/esx/esx_driver.c (esxClose): Return an error on failure to close.
Various safezero() implementations used either -1, errno or -errno
return values. This patch fixes them all to return -1 and set errno
appropriately.
There was also a bug in size parameter passed to safewrite() which could
result in an attempt to write gigabytes out of a megabyte buffer.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When a VM save attempt failed, the VM would be left in a paused
state. It is neccessary to resume CPU execution upon failure
if it was running originally
* src/qemu/qemu_driver.c: Resume CPUs upon save failure
This supports cancellation of jobs for the QEMU driver against
the virDomainMigrate, virDomainSave and virDomainCoreDump APIs.
It is not yet supported for the virDomainRestore API, although
it is desirable.
* src/qemu/qemu_driver.c: Issue 'migrate_cancel' command if
virDomainAbortJob is issued during a migration operation
* tools/virsh.c: Add a domjobabort command
This defines the wire protocol for the new API
* src/remote/remote_protocol.x: Wire protocol definition
* src/remote/remote_driver.c,daemon/remote.c: Client and server
side implementation
* daemon/remote_dispatch_args.h, daemon/remote_dispatch_prototypes.h,
daemon/remote_dispatch_table.h, src/remote/remote_protocol.c,
src/remote/remote_protocol.h: Re-generate from remote_protocol.x
This provides the internal glue for the driver API
* src/driver.h: Internal API contract
* src/libvirt.c, src/libvirt_public.syms: Connect public API
to driver API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/test/test_driver.c src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c: Stub out entry points
Introduce support for virDomainGetJobInfo in the QEMU driver. This
allows for monitoring of any API that uses the 'info migrate' monitor
command. ie virDomainMigrate, virDomainSave and virDomainCoreDump
Unfortunately QEMU does not provide a way to monitor incoming migration
so we can't wire up virDomainRestore yet.
The virsh tool gets a new command 'domjobinfo' to query status
* src/qemu/qemu_driver.c: Record virDomainJobInfo and start time
in qemuDomainObjPrivatePtr objects. Add generic shared handler
for calling 'info migrate' with all migration based APIs.
* src/qemu/qemu_monitor_text.c: Fix parsing of 'info migration' reply
* tools/virsh.c: add new 'domjobinfo' command to query progress
* src/remote/remote_protocol.x: Define wire protocol format
for virDomainGetJobInfo API
* src/remote/remote_driver.c, daemon/remote.c: Implement client
and server marshalling code for virDomainGetJobInfo()
* daemon/remote_dispatch_args.h, daemon/remote_dispatch_prototypes.h
daemon/remote_dispatch_ret.h, daemon/remote_dispatch_table.h,
src/remote/remote_protocol.c, src/remote/remote_protocol.h: Rebuild
files from src/remote/remote_protocol.x
The internal glue layer for the new pubic API
* src/driver.h: Define internal driver API contract
* src/libvirt.c, src/libvirt_public.syms: Wire up public
API to internal driver API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/test/test_driver.c, src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c: Stub new entry point
All other uses of get_str_prop in this file that ignored
failure explicitly cast to void.
* src/node_device/node_device_hal.c (dev_create): Silence coverity
warning.
A number of the error messages raised when parsing USB devices
refered to PCI devices by mistake
* src/qemu/qemu_conf.c: s/PCI/USB/ in qemuParseCommandLineUSB()
when the underlying qemu supports the drive/device model and the
controller has been added this way.
* src/qemu/qemu_driver.c: use qemuMonitorDelDevice() when detaching
PCI controller and if supported
* src/qemu/qemu_monitor.[ch]: add new qemuMonitorDelDevice() function
* src/qemu/qemu_monitor_json.[ch]: JSON backend for DelDevice command
* src/qemu/qemu_monitor_text.[ch]: Text backend for DelDevice command
* src/qemu/qemu_driver.c: in qemudDomainDetachPciControllerDevice()
when a controller is not present in the system anymore, the PCI
address must be deleted from libvirt's hashtable because it can
be re-used for other purposes.
* src/qemu/qemu_driver.c: in qemudDomainAttachDevice(), one must not
delete the data part when the operation succeeds because it is
required later on. The correct pattern to handlethe parsed
representation of the device information on success
is dev->data.controller = NULL; virDomainDeviceDefFree(dev);,
which leaves the structure pointed at by data in memory.
* src/cpu/cpu_x86.c (x86Decode): Don't dereference NULL when passed
a NULL "models" pointer, or when passed a nonzero "nmodels" value
and a corresponding NULL models[i].
* src/phyp/phyp_driver.c (phypUUIDTable_Push): Move incr/decr
of ptr/nread into the loop where those variables are used.
Also, remove "exit" label and just-preceding "goto".
Allow an arbitrary timezone with QEMU by setting the $TZ environment
variable when launching QEMU
* src/qemu/qemu_conf.c: Set TZ environment variable if a timezone
is requested
* tests/qemuxml2argvtest.c: Add test case for timezones
* tests/qemuxml2argvdata/qemuxml2argv-clock-france.xml,
tests/qemuxml2argvdata/qemuxml2argv-clock-france.args: Data
for timezone tests
This extends the XML to allow for
<clock offset='timezone' timezone='Europe/Paris'/>
This is useful if the admin has not configured any timezone on the
host OS, but still wants to synchronize a guest to a specific one.
* src/conf/domain_conf.h, src/conf/domain_conf.c: Support extra
'timezone' attribute on clock configuration
* docs/schemas/domain.rng: Add 'timezone' attribute
* src/xen/xend_internal.c, src/xen/xm_internal.c: Reject configs
with a configurable timezone
This allows QEMU guests to be started with an arbitrary clock
offset
The test case can't actually be enabled, since QEMU argv expects
an absolute timestring, and this will obviously change every
time the test runs :-( Hopefully QEMU will allow a relative
time offset in the future.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Use the -rtc arg
if available to support variable clock offset mode
* tests/qemuhelptest.c: Add QEMUD_CMD_FLAG_RTC for qemu 0.12.1
* qemuxml2argvdata/qemuxml2argv-clock-variable.args,
qemuxml2argvdata/qemuxml2argv-clock-variable.xml,
qemuxml2argvtest.c: Test case, except we can't actually enable
it yet.
This introduces a third option for clock offset synchronization,
that allows an arbitrary / variable adjustment to be set. In
essence the XML contains the time delta in seconds, relative to
UTC.
<clock offset='variable' adjustment='123465'/>
The difference from 'utc' mode, is that management apps should
track adjustments and preserve them at next reboot.
* docs/schemas/domain.rng: Schema for new clock mode
* src/conf/domain_conf.c, src/conf/domain_conf.h: Parse
new clock time delta
* src/libvirt_private.syms, src/util/xml.c, src/util/xml.h: Add
virXPathLongLong() method
The XML will soon be extended to allow more than just a simple
localtime/utc boolean flag. This change replaces the plain
'int localtime' with a separate struct to prepare for future
extension
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add a new
virDomainClockDef structure
* src/libvirt_private.syms: Export virDomainClockOffsetTypeToString
and virDomainClockOffsetTypeFromString
* src/qemu/qemu_conf.c, src/vbox/vbox_tmpl.c, src/xen/xend_internal.c,
src/xen/xm_internal.c: Updated to use new structure for localtime
While building under RHEL-5, I got a compile warning because
virDomainObjFormat was defined but not used. That came about
because in RHEL-5 we build with "#define PROXY", and
virDomainObjFormat is only used with !PROXY. Move the
define.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
* src/openvz/openvz_conf.c (openvzGetVEID): Always call fclose.
Diagnose parse failure also when vzlist output is empty.
If somehow we read a -1, diagnose that (albeit as a parse failure).
Kind of minor, but it annoys me that the default auth callback
doesn't put a space between the prompt and the input, like a typical
terminal, ssh, etc. This patch changes the current prompt:
Please enter your authentication name:myuser
to
Please enter your authentication name: myuser
In bug 567931 we found that virt-top would exit occasionally
when the terminal window was resized. Tracking this down it
turned out that SIGWINCH was being delivered to the process at
exactly the point where the libvirt remote driver was calling
poll(2) waiting for a reply from libvirtd.
This caused the poll(2) call to be interrupted (returning errno
EINTR). However handling EINTR the same way as EAGAIN was not
the solution to this problem since we found previously that this
would break Ctrl-C handling (commit 47fec8eac2).
The correct solution is to mask out SIGWINCH for the duration
of the poll(2) system call. The per-thread mask is changed and
restored immediately after the call. Since we are using
pthread_sigmask, this should not affect other threads, and
since we restore the signal mask immediately afterwards it should
not affect the current thread visibly either. Other possibly
problematic signals are SIGCHLD and SIGPIPE and these are
masked too.
Note use of ignore_value: It's not fatal if we cannot mask out
SIGWINCH, and in any case pthread_sigmask never fails on Linux
as long as you supply the correct arguments.
I tested this patch and it cures the original problem with
virt-top.
Create the filesystem on the partition used by the pool
* configure.ac: check for mkfs availability
* libvirt.spec.in: add extra require on util-linux for mkfs
* src/storage/storage_backend_fs.c: run mkfs with the expected
fs type when creating a filesystem pool
We were using 'Y' to mean exabyte, when the correct abbreviation would be
'E' ('Y' is yettabyte, which is exabyte * 1024 * 1024). While it isn't
strictly backwards compatible, I highly doubt anyone was actually using
this broken behavior, so I don't see any harm in in dropping 'Y' handling.
Recently we introduced O_DSYNC flag when creating raw storage files to
avoid filling all disk cache with dirty pages. However, the patch got
lost when virStorageBackendCreateRaw was reworked using
virFileOperation. Let's use O_DSYNC again.
* src/xen/xend_internal.c (xenDaemonDomainSetAutostart): Avoid a NULL
dereference upon non-SEXPR_VALUE'd on_xend_start. This bug was
introduced by commit 37ce5600c0.
There were a few operations on the storage volume file that were still
being done as root, which will fail if the file is on a root-squashed
NFS share. The result was that attempts to create a storage volume of
type "raw" on a root-squashed NFS share would fail.
This patch uses the newly introduced "hook" function in
virFileOperation to execute all those file operations in the child
process that's run under the uid that owns the file (and, presumably,
has permission to write to the NFS share)
* src/storage/storage_backend.c: use virFileOperation() in
virStorageBackendCreateRaw, turning virStorageBackendCreateRaw()
into a new createRawFileOpHook() hook
It turns out it is also useful to be able to perform other operations
on a file created while running as a different uid (eg, write things
to that file), and possibly to do this to a file that already
exists. This patch adds an optional hook function to the renamed (for
more accuracy of purpose) virFileOperation; the hook will be called
after the file has been opened (possibly created) and gid/mode
checked/set, before closing it.
As with the other operations on the file, if the VIR_FILE_OP_AS_UID
flag is set, this hook function will be called in the context of a
child process forked from the process that called virFileOperation.
The implication here is that, while all data in memory is available to
this hook function, any modification to that data will not be seen by
the caller - the only indication in memory of what happened in the
hook will be the return value (which the hook should set to 0 on
success, or one of the standard errno values on failure).
Another piece of making the function more flexible was to add an
"openflags" argument. This arg should contain exactly the flags to be
passed to open(2), eg O_RDWR | O_EXCL, etc.
In the process of adding the hook to virFileOperation, I also realized
that the bits to fix up file owner/group/mode settings after creation
were being done in the parent process, which could fail, so I moved
them to the child process where they should be.
* src/util/util.[ch]: rename and rework virFileCreate-->virFileOperation,
and redo flags in virDirCreate
* storage/storage_backend.c, storage/storage_backend_fs.c: update the
calls to virFileOperation/virDirCreate to reflect changes in the API,
but don't yet take advantage of the hook.
Fix multiple veth problem.
NETIF setting was overwritten after first CT because any CT could not be
found by name.
* src/openvz/openvz_conf.c src/openvz/openvz_conf.h: add the
openvzGetVEID lookup function
* src/openvz/openvz_driver.c: use it in openvzDomainSetNetwork()
If the hostname as returned by "gethostname" resolves
to "localhost" (as it does with the broken Fedora-12
installer), then live migration will fail because the
source will try to migrate to itself. Detect this
situation up-front and abort the live migration before
we do any real work.
* src/util/util.h src/util/util.c: add a new virGetHostnameLocalhost
with an optional localhost check, and rewire virGetHostname() to use
it
* src/libvirt_private.syms: expose the new function
* src/qemu/qemu_driver.c: use it in qemudDomainMigratePrepare2()
* src/xen/xend_internal.c (xenDaemonDomainSetAutostart): Rewrite to
avoid dereferencing the result of sexpr_lookup. While in this
particular case, it was guaranteed never to be NULL, due to the
preceding "if sexpr_node(...)" guard, it's cleaner to skip the
sexpr_node call altogether, and also saves a lookup.
This patch sets or unsets the IFF_VNET_HDR flag depending on what device
is used in the VM. The manipulation of the flag is done in the open
function and is only fatal if the IFF_VNET_HDR flag could not be cleared
although it has to be (or if an ioctl generally fails). In that case the
macvtap tap is closed again and the macvtap interface torn.
* src/qemu/qemu_conf.c src/qemu/qemu_conf.h: pass qemuCmdFlags to
qemudPhysIfaceConnect()
* src/util/macvtap.c src/util/macvtap.h: add vnet_hdr boolean to
openMacvtapTap(), and private function configMacvtapTap()
* src/qemu/qemu_driver.c: add extra qemuCmdFlags when calling
qemudPhysIfaceConnect()
For __virExec() this is a semantic NOP except for when fork()
fails. __virExec() would previously forget to restore the signal mask
in this case; virFork() corrects this behavior.
virFileCreate() and virDirCreate() gain the code to reset the logging
and properly deal with the signal handling race condition.
This also removes a log message that had a typo ("cannot fork o create
file '%s'") - this error is now logged in a more generic manner in
virFork() (more generic, but really just as informative, since the
fact that it's forking to create a file is immaterial to the fact that
it simply can't fork)
* src/util/util.c: use the generic virFork() in the 3 functions
virFork() contains bookkeeping that must be done any time a process
forks. Currently this includes:
1) Call virLogLock() prior to fork() and virLogUnlock() just after,
to avoid a deadlock if some other thread happens to hold that lock
during the fork.
2) Reset the logging hooks and send all child process log messages to
stderr.
3) Block all signals prior to fork(), then either a) reset the signal
mask for the parent process, or b) clear the signal mask for the
child process.
Note that the signal mask handling in __virExec erroneously fails to
restore the signal mask when fork() fails. virFork() fixes this
problem.
Other than this, it attempts to behave as closely to fork() as
possible (including preserving errno for the caller), with a couple
exceptions:
1) The return value is 0 (success) or -1 (failure), while the pid is
returned via the pid_t* argument. Like fork(), if pid < 0 there is
no child process, otherwise both the child and the parent will
return to the caller, and both should look at the return value,
which will indicate if some of the extra processing outlined above
encountered an error.
2) If virFork() returns with pid < 0 or with a return value < 0
indicating an error condition, the error has already been
reported. You can log an additional message if you like, but it
isn't necessary, and may be awkwardly extraneous.
Note that virFork()'s child process will *never* call _exit() - if a
child process is created, it will return to the caller.
* util.c util.h: add virFork() function, based on what is currently
done in __virExec().
Support virtio-serial controller and virtio channel in QEMU backend.
Will output
the following for virtio-serial controller:
-device
virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4,max_ports=16,vectors=4
and the following for a virtio channel:
-chardev pty,id=channel0 \
-device
virtserialport,bus=virtio-serial0.0,chardev=channel0,name=org.linux-kvm.port.0
* src/qemu/qemu_conf.c: Add argument output for virtio
* tests/qemuxml2argvtest.c
tests/qemuxml2argvdata/qemuxml2argv-channel-virtio.args: Add test for
QEMU command line generation
Add support for virtio-serial by defining a new 'virtio' channel target type
and a virtio-serial controller. Allows the following to be specified in a
domain:
<controller type='virtio-serial' index='0' ports='16' vectors='4'/>
<channel type='pty'>
<target type='virtio' name='org.linux-kvm.port.0'/>
<address type='virtio-serial' controller='0' bus='0'/>
</channel>
* docs/schemas/domain.rng: Add virtio-serial controller and virtio
channel type.
* src/conf/domain_conf.[ch]: Domain parsing/serialization for
virtio-serial controller and virtio channel.
* tests/qemuxml2xmltest.c
tests/qemuxml2argvdata/qemuxml2argv-channel-virtio.xml: add domain xml
parsing test
* src/libvirt_private.syms src/qemu/qemu_conf.c:
virDomainDefAddDiskControllers() renamed to
virDomainDefAddImplicitControllers()
Remove virDomainDevicePCIAddressEqual and virDomainDeviceDriveAddressEqual,
which are defined but not used anywhere.
* src/conf/domain_conf.[ch] src/libvirt_private.syms: Remove
virDomainDevicePCIAddressEqual and virDomainDeviceDriveAddressEqual.
Currently we just error with ex. 'virbr0: No such device'.
Since we are using public API calls here, we need to ensure that any
raised error is properly saved and restored, since API entry points
always reset messages.
virGetLastError returns NULL if no error has been set, not on
allocation error like virSetError assumed. Use virLastErrorObject
instead. This fixes virSetError when no error is currently stored.
Rework and simplification of teardown of the macvtap device.
Basically all devices with the same MAC address and link device are kept
alive and not attempted to be torn down. If a macvtap device linked to a
physical interface with a certain MAC address 'M' is to be created it
will automatically fail if the interface is 'up'ed and another macvtap
with the same properties (MAC addr 'M', link dev) happens to be 'up'.
This will prevent the VM from starting or the device from being attached
to a running VM. Stale interfaces are assumed to be there for some
reason and not stem from libvirt.
In the VM shutdown path, it's assuming that an interface name is always
available so that if the device type is DIRECT it can be torn down
using its name.
* src/util/macvtap.h src/libvirt_macvtap.syms: change of deleting routine
* src/util/macvtap.c: cleanups and change of deleting routine
* src/qemu/qemu_driver.c: change cleanup on shutdown
* src/qemu/qemu_conf.c: don't delete Macvtap in qemudPhysIfaceConnect()
The QEMU JSON monitor changed balloon commands to return/accept
bytes instead of kilobytes. Update libvirt to cope with this
* src/qemu/qemu_monitor_json.c: Expect/use bytes for ballooning
* src/uml/uml_driver.c (umlMonitorCommand): This function would
sometimes return -1, yet fail to free the "reply" it had allocated.
Hence, no caller would know to free the corresponding argument.
When returning -1, be sure to free all allocated resources.
* src/storage/storage_backend_mpath.c (virStorageBackendIsMultipath):
The result of dm_get_next_target was never used (and isn't needed),
so don't store it.
Similar to the Set*Mem commands, this implementation was bogus and
misleading. Make it clear this is a hotplug only operation, and that the
hotplug piece isn't even implemented.
Also drop the overkill maxvcpus validation: we don't perform this check
at XML define time so clearly no one is missing it, and there is
always the risk that our info will be out of date, possibly preventing
legitimate CPU values.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
SetMem and SetMaxMem are hotplug only APIs, any persistent config
changes are supposed to go via XML definition. The original implementation
of these calls were incorrect and had the nasty side effect of making
a psuedo persistent change that would be lost after libvirtd restart
(I didn't know any better).
Fix these APIs to rightly reject non running domains.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
The plain QEMU tree does not include 'thread_id' in the JSON
output. Thus we need to treat it as non-fatal if missing.
* src/qemu/qemu_monitor_json.c: Treat missing thread_id as non-fatal
A typo in the check for the primary IDE controller could cause
a crash on restore depending on the exact guest config.
* src/qemu/qemu_conf.c: Fix s/video/controller/ typo & slot
number typo
Current error reporting for JSON mode returns the full JSON
command string and full JSON error string. This is not very
user friendly, so this change makes the error report only
contain the basic command name, and friendly error message
description string. The full JSON data is logged instead.
* src/qemu/qemu_monitor_json.c: Always return the 'desc' field from
the JSON error message to users.
When in JSON mode, QEMU requires that 'qmp_capabilities' is run as
the first command in the monitor. This is a no-op when run in the
text mode monitor
* src/qemu/qemu_driver.c: Run capabilities negotiation when
connecting to the monitor
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Add
support for the 'qmp_capabilities' command, no-op in text mode.
This part adds support for qemu making a macvtap tap device available
via file descriptor passed to qemu command line. This also attempts to
tear down the macvtap device when a VM terminates. This includes support
for attachment and detachment to/from running VM.
* src/qemu/qemu_conf.[ch] src/qemu/qemu_driver.c: add support in the
QEmu driver
This part adds the helper code to setup and tear down macvtap devices
using direct communication with the device driver via netlink sockets.
The rather short messages received from the netlink layer are now
written into a dynamically allocated buffer
* src/util/macvtap.h src/util/macvtap.c: provides the new module
* po/POTFILES.in: the module contains translated strings
This part adds support to domain_conf.{c|h} for parsing the new
interface XML of type 'direct'. The parsed mode is now stored as
an int.
* src/conf/domain_conf.c src/conf/domain_conf.h: extend parsing code
* src/util/macvtap.h: empty header to not break compilation
This patch adds build support for libvirt checking for certain contents
of /usr/include/linux/if_link.h to see whether macvtap support is
compilable on that system. One can disable macvtap support in libvirt
via --without-macvtap passed to configure.
* configure.ac src/Makefile.am: new build support
* src/libvirt_macvtap.syms: list of exported symbols
* src/util/macvtap.c: empty module to not break compilation
The virRaiseError macro inside of virSecurityReportError expands to
virRaiseErrorFull and includes the __FILE__, __FUNCTION__ and __LINE__
information. But this three values are always the same for every call
to virSecurityReportError and do not reflect the actual error context.
Converting virSecurityReportError into a macro results in getting the
correct __FILE__, __FUNCTION__ and __LINE__ information.
Current PCI addresses are allocated at time of VM startup.
To make them truely persistent, it is neccessary to do this
at time of virDomainDefine/virDomainCreate. The code in
qemuStartVMDaemon still remains in order to cope with upgrades
from older libvirt releases
* src/qemu/qemu_driver.c: Rename existing qemuAssignPCIAddresses
to qemuDetectPCIAddresses. Add new qemuAssignPCIAddresses which
does auto-allocation upfront. Call qemuAssignPCIAddresses from
qemuDomainDefine and qemuDomainCreate to assign PCI addresses that
can then be persisted. Don't clear PCI addresses at shutdown if
they are intended to be persistent
The old text mode monitor prompts for a password when disks are
encrypted. This interactive approach doesn't work for JSON mode
monitor. Thus there is a new 'block_passwd' command that can be
used.
* src/qemu/qemu_driver.c: Split out code for looking up a disk
secret from findVolumeQcowPassphrase, into a new method
getVolumeQcowPassphrase. Enhance qemuInitPasswords() to also
set the disk encryption password via the monitor
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
support for the 'block_passwd' monitor command.
Since c26cb9234f, the dname
parameter has been ignored by these two functions. Use it.
* src/qemu/qemu_driver.c (qemudDomainMigratePrepareTunnel): Honor dname
parameter once again.
(qemudDomainMigratePrepare2): Likewise.
All other libvirt functions use array first and then number of elements
in that array. Let's make cpuDecode follow this rule.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
With QEMU >= 0.12 the host and guest side of disks no longer have
the same naming convention. Specifically the host side will now
get a 'drive-' prefix added to its name. The 'info blockstats'
monitor command returns the host side name, so it is neccessary
to strip this off when looking up stats since libvirt stores the
guest side name !
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Move 'drive-' prefix
string to a defined constant
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_text.c: Strip
off 'drive-' prefix (if found) when looking up disk stats
Currently the timeout for reading startup output is 3 seconds. If the
host is under any sort of load, we can easily trigger this. Lets bump
it to 30 seconds.
Since the polling loop checks to see if the process has died, we shouldn't
erroneously hit this timeout if qemu bombs (only if it is stuck in some
infinite loop).
Use the ATTRIBUTE_NONNULL annotation to mark some virConnectPtr
args as mandatory non-null so the compiler can warn of mistakes
* src/conf/domain_event.h: All virConnectPtr args must be non-null
* src/qemu/qemu_conf.h: qemudBuildCommandLine and
qemudNetworkIfaceConnect() must be given non-null connection
* tests/qemuxml2argvtest.c: Provide a non-null (dummy) connection to
qemudBuildCommandLine()
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in secret_conf.{h,c} and update all callers to
match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in interface_conf.{h,c} and update all callers to
match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in cpu_conf.{h,c} and update all callers to
match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in storage_conf.{h,c} and storage_encryption_conf.{h,c}
and update all callers to match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in node_device_conf.{h,c} and update all callers to
match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in network_conf.{h,c} and update all callers to
match
All callers now pass a NULL virConnectPtr into the USB/PCi device
iterator functions. Therefore the virConnectPtr arg can now be
removed from these functions
* src/util/hostusb.h, src/util/hostusb.c: Remove virConnectPtr
from usbDeviceFileIterate
* src/util/pci.c, src/util/pci.h: Remove virConnectPtr arg from
pciDeviceFileIterate
* src/qemu/qemu_security_dac.c, src/security/security_selinux.c: Update
to drop redundant virConnectPtr arg
The QEMU flags are commonly stored as a signed or unsigned int,
allowing only 31 flags. This limit is rather close, so to aid
future patches, change it to a 64-bit int
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h, src/qemu/qemu_driver.c,
tests/qemuargv2xmltest.c, tests/qemuhelptest.c, tests/qemuxml2argvtest.c:
Use 'unsigned long long' for QEMU flags
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in security_driver.{h,c} and update all callers to
match
The security driver was mistakenly initialized before the QEMU
config file was loaded. This prevents it being turned off again.
The capabilities XML was also getting the wrong security driver
name, due to the stacked driver arrangement.
* src/qemu/qemu_driver.c: Fix initialization order and capabilities
model name
* src/util/util.h (virAsprintf): Remove ATTRIBUTE_RETURN_CHECK, since
it is perfectly fine to ignore the return value, now that the pointer
is guaranteed to be set to NULL upon failure.
* src/util/storage_file.c (absolutePathFromBaseFile): Remove now-
unnecessary use of ignore_value.
* src/util/storage_file.c (absolutePathFromBaseFile): While this use
of virAsprintf is slightly cleaner than using stpncpy(stpcpy(...,
it does impose an artificial limitation on the length of the base_file
name. Rather than asserting that it does not exceed INT_MAX, return
NULL when it does.
When creating preallocated large raw files opening them with O_DSYNC
prevents long delays in reading because cache pages can be immediately
reused without writing them on a disk first.
virDomain{Attach,Detach}Device is now only permitted on active
domains. Explicitly state this restriction in the API
documentation.
V2: Only change doc, dropping the hunk that forced the restriction
in libvirt frontend.
When configured with --enable-gcc-warnings, it didn't even compile.
* src/util/storage_file.c: Include <assert.h>.
(absolutePathFromBaseFile): Assert that converting size_t to int is valid.
Reverse length/string args to match "%.*s".
Explicitly ignore the return value of virAsprintf.
* src/util/storage_file.c: Include "dirname.h".
(absolutePathFromBaseFile): Rewrite not to leak, and to require
fewer allocations.
* bootstrap (modules): Add dirname-lgpl.
* .gnulib: Update submodule to the latest.
When restoring from a saved guest image, the XML would already
contain the PCI slot ID of the IDE controller & video card.
The attempt to explicitly reserve this upfront would thus fail
everytime.
* src/qemu/qemu_conf.c: Reserve IDE controller / video card
slot at time of need, rather than upfront
Similar fix as previous one but for fork() usage when creating
a file or directory
* src/util/util.c: virLogLock() and virLogUnlock() around fork()
in virFileCreate() and virDirCreateSimple()
Ad pointed out by Dan Berrange:
So if some thread in libvirtd is currently executing a logging call,
while another thread calls virExec(), that other thread no longer
exists in the child, but its lock is never released. So when the
child then does virLogReset() it deadlocks.
The only way I see to address this, is for the parent process to call
virLogLock(), immediately before fork(), and then virLogUnlock()
afterwards in both parent & child. This will ensure that no other
thread
can be holding the lock across fork().
* src/util/logging.[ch] src/libvirt_private.syms: export virLogLock() and
virLogUnlock()
* src/util/util.c: lock just before forking and unlock just after - in
both parent and child.
The original udev node device backend neglected to lock the driverState
struct containing the device list when adding and removing devices
* src/node_device/node_device_udev.c: add necessary locks in
udevRemoveOneDevice() and udevAddOneDevice()
* src/xen/xs_internal.c (xenStoreDomainIntroduced): Don't use -1
as an allocation size upon xenStoreNumOfDomains failure.
(xenStoreDomainReleased): Likewise.
If the primary security driver (SELinux/AppArmour) was disabled
then the secondary QEMU DAC security driver was also disabled.
This is mistaken, because the latter must be active at all times
* src/qemu/qemu_driver.c: Ensure DAC driver is always active
* src/xen/xen_hypervisor.c: Remove all "domain == NULL" tests.
* src/xen/xen_hypervisor.h: Instead, use ATTRIBUTE_NONNULL to
mark each "domain" parameter as "known always to be non-NULL".
When attaching a USB host device based on vendor/product, libvirt
will resolve the vendor/product into a device/bus pair. This means
that when printing XML we should allow device/bus info to be printed
at any time if present
* src/conf/domain_conf.c, docs/schemas/domain.rng: Allow USB device
bus info alongside vendor/product
To allow devices to be hot(un-)plugged it is neccessary to ensure
they all have a unique device aliases. This fixes the hotplug
methods to assign device aliases before invoking the monitor
commands which need them
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Expose methods
for assigning device aliases for disks, host devices and
controllers
* src/qemu/qemu_driver.c: Assign device aliases when hotplugging
all types of device
* tests/qemuxml2argvdata/qemuxml2argv-hostdev-pci-address-device.args,
tests/qemuxml2argvdata/qemuxml2argv-hostdev-usb-address-device.args:
Update for changed hostdev naming scheme
This patch re-arranges the QEMU device alias assignment code to
make it easier to call into the same codeblock when performing
device hotplug. The new code has the ability to skip over already
assigned names to facilitate hotplug
* src/qemu/qemu_driver.c: Call qemuAssignDeviceNetAlias()
instead of qemuAssignNetNames
* src/qemu/qemu_conf.h: Export qemuAssignDeviceNetAlias()
instead of qemuAssignNetNames
* src/qemu/qemu_driver.c: Merge the legacy disk/network alias
assignment code into the main methods
The current way of assigning names to the host network backend and
NIC device in QEMU was over complicated, by varying naming scheme
based on the NIC model and backend type. This simplifies the naming
to simply be 'net0' and 'hostnet0', allowing code to easily determine
the host network name and vlan based off the primary device alias
name 'net0'. This in turn allows removal of alot of QEMU specific
code from the XML parser, and makes it easier to assign new unique
names for NICs that are hotplugged
* src/conf/domain_conf.c, src/conf/domain_conf.h: Remove hostnet_name
and vlan fields from virNetworkDefPtr
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h, src/qemu/qemu_driver.c:
Use a single network alias naming scheme regardless of NIC type
or backend type. Determine VLANs from the alias name.
* tests/qemuxml2argvdata/qemuxml2argv-net-eth-names.args,
tests/qemuxml2argvdata/qemuxml2argv-net-virtio-device.args,
tests/qemuxml2argvdata/qemuxml2argv-net-virtio-netdev.args: Update
for new simpler naming scheme
The QEMU 0.12.x tree has the -netdev command line argument, but not
corresponding monitor command. We can't enable the former, without
the latter since it will break hotplug/unplug.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Disable -netdev usage
until 0.13 at earliest
* tests/qemuxml2argvtest.c: Add test for -netdev syntax
* tests/qemuxml2argvdata/qemuxml2argv-net-virtio-netdev.args,
tests/qemuxml2argvdata/qemuxml2argv-net-virtio-netdev.xml: Test
data files for -netdev syntax
PCI disk, disk controllers, net devices and host devices need to
have PCI addresses assigned before they are hot-plugged
* src/qemu/qemu_conf.c: Add APIs for ensuring a device has an
address and releasing unused addresses
* src/qemu/qemu_driver.c: Ensure all devices have addresses
when hotplugging.
The current QEMU code allocates PCI addresses incrementally starting
at 4. This is not satisfactory because the user may have given some
addresses in their XML config, which need to be skipped over when
allocating addresses to remaining devices.
It is thus neccessary to maintain a list of already allocated PCI
addresses and then only allocate ones that remain unused. This is
also required for domain device hotplug to work properly later.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Add APIs for creating
list of existing PCI addresses, and allocating new addresses.
Refactor address assignment to use this code
* src/qemu/qemu_driver.c: Pull PCI address assignment up into the
qemuStartVMDaemon() method, as a prelude to moving it into the
'define' method. Update list of allocated addresses when connecting
to a running VM at daemon startup.
* tests/qemuxml2argvtest.c, tests/qemuargv2xmltest.c,
tests/qemuxml2xmltest.c: Remove USB product test since all
passthrough is done based on address
* tests/qemuxml2argvdata/qemuxml2argv-hostdev-usb-product.args,
tests/qemuxml2argvdata/qemuxml2argv-hostdev-usb-product.xml: Kil
unused data files
The virDomainDeviceInfoIterate() function will provide a
convenient way to iterate over all devices in a domain.
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/libvirt_private.syms: Add virDomainDeviceInfoIterate()
function.
Since QEMU startup uses the new -device argument, the hotplug
code needs todo the same. This converts disk, network and
host device hotplug to use the device_add command
* src/qemu/qemu_driver.c: Use new device_add monitor APIs
whereever possible
The way QEMU is started has been changed to use '-device' and
the new style '-drive' syntax. This needs to be mirrored in
the hotplug code, requiring addition of two new APIs.
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Define APIs
qemuMonitorAddDevice() and qemuMonitorAddDrive()
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Implement the new monitor APIs
To allow for better code reuse from hotplug methods, the code for
generating PCI/USB hostdev arg values is split out into separate
methods
* qemu/qemu_conf.h, qemu/qemu_conf.c: Introduce new APis for
qemuBuildPCIHostdevPCIDevStr, qemuBuildUSBHostdevUsbDevStr
and qemuBuildUSBHostdevDevStr
All the helper functions for building command line arguments
now return a 'char *', instead of acepting a 'char **' or
virBufferPtr argument
* qemu/qemu_conf.c: Standardize syntax for building args
* qemu/qemu_conf.h: Export all functions for building args
* qemu/qemu_driver.c: Update for changed syntax for building
NIC/hostnet args
udevGetUintProperty was called with base set to 0 for busnum and devnum.
With base 0 strtoul parses the number as octal if it start with a 0. But
busnum and devnum are decimal and udev returns them padded with leading
zeros. So strtoul parses them as octal. This works for certain decimal
values like 001-007, but fails for values like 008.
Change udevProcessUSBDevice to use base 10 for busnum and devnum.
* src/util/util.c (virGetUserID, virGetGroupID): In the unlikely event
that sysconf(_SC_GETPW_R_SIZE_MAX) fails, don't use -1 as the size in
the subsequent allocation.
Similar to the race fixed by
be34c3c7ef, make sure
to wait around for KVM to release the resources from
a hot-detached PCI device before attempting to
rebind that device to the host driver.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The QEMU driver contained code to generate a -device string for piix4-ide, but
wasn't using it. This change removes this string generation. It also adds a
comment explaining why IDE and FDC controllers don't generate -device strings.
The change also generates an error if a sata controller is specified for a QEMU
domain, as this isn't supported.
* src/qemu/qemu_conf.c: Remove VIR_DOMAIN_CONTROLLER_TYPE_IDE handler in
qemuBuildControllerDevStr(). Ignore IDE and FDC controllers. Error if
SATA controller is discovered. Add comments.
On RHEL-5 the qemu-kvm binary is located in /usr/libexec.
To reduce confusion for people trying to run upstream libvirt
on RHEL-5 machines, make the qemu driver look in /usr/libexec
for the qemu-kvm binary.
To make this work, I modified virFindFileInPath to handle an
absolute path correctly. I also ran into an issue where
NULL was sometimes being passed for the file parameter
to virFindFileInPath; it didn't crash prior to this patch
since it was building paths like /usr/bin/(null). This
is non-standard behavior, though, so I added a NULL
check at the beginning.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
* src/util/util.c (virGetUserEnt): In the unlikely event that
sysconf(_SC_GETPW_R_SIZE_MAX) fails, don't use -1 as the size in
the subsequent allocation.
xen-unstable c/s 20762 bumped XEN_SYSCTL_INTERFACE_VERSION to 7. The
interface change does not affect libvirt, other than xenHypervisorInit()
failing since version 7 is not tried.
The attached patch accommodates the upcoming Xen 4.0 release by checking
for XEN_SYSCTL_INTERFACE_VERSION 7. If found, it sets
XEN_DOMCTL_INTERFACE_VERSION to 6, which is also new to Xen 4.0.
it causes a NULL-dereference on some systems like Solaris 10.
* src/node_device/node_device_linux_sysfs.c. Include <stdlib.h>.
(get_sriov_function): Use canonicalize_file_name, not realpath.
* bootstrap (modules): Add canonicalize-lgpl.
This fixes a segfault when the event handler is called after shutdown
when the global driver state is NULL again.
Also fix a locking issue in an error path.
virFileMakePath is a recursive function that was creates a buffer
PATH_MAX bytes long for each recursion (one recursion for each element
in the path). This changes it to have no buffers on the stack, and to
allocate just one buffer total, no matter how many elements are in the
path. Because the modified algorithm requires a char* to be passed in
rather than const char *, it is now 2 functions - a toplevel API
function that remains identical in function, and a 2nd helper function
called for the recursions, which 1) doesn't allocate anything, and 2)
takes a char* arg, so it can modify the contents.
* src/util/util.c: rewrite virFileMakePath
This reverts commit cdc42d0a48.
As DanB pointed out, this patch is actually wrong. The real
bug that was causing me to see this problem is a bug
introduced in a RHEL-5 libvirt snapshot, and I'm going to
fix the real bug there.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
If you shutdown libvirtd while a domain with PCI
devices is running, then try to restart libvirtd,
libvirtd will crash.
This happens because qemuUpdateActivePciHostdevs() is calling
pciDeviceListSteal() with a dev of 0x0 (NULL), and then trying
to dereference it. This patch fixes it up so that
qemuUpdateActivePciHostdevs() steals the devices after first
Get()'ting them, avoiding the crash.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
* src/qemu/qemu_monitor_text.c (qemuMonitorTextAttachDrive): Most other
failures in this function would "goto cleanup", but one mistakenly
returned directly, skipping the cleanup and resulting in a leak.
In addition, iterating the "try_command" loop would clobber, and
thus leak, the "cmd" allocated on the first iteration,
so be careful to free it in addition to "reply" beforehand.
The KVM build of QEMU includs the thread ID of each vCPU in the
'query-cpus' output. This is required for pinning guests to
particular host CPUs
* src/qemu/qemu_monitor_json.c: Extract 'thread_id' from CPU info
* src/util/json.c, src/util/json.h: Declare returned strings
to be const
* src/qemu/qemu_monitor.c: Wire up JSON mode for qemuMonitorGetPtyPaths
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Fix
const correctness. Add missing error message in the function
qemuMonitorJSONGetAllPCIAddresses. Add implementation of the
qemuMonitorGetPtyPaths function calling 'query-chardev'.
Two files were using functions from <sys/stat.h> but not including
in. Most of the time they got this automatically via another header,
but certain build flag combinations can reveal the problem
* src/lxc/lxc_container.c, src/node_device/node_device_linux_sysfs.c:
Add <sys/stat.h>
The <console> tag is supposed to result in addition of a single
<serial> device for HVM guests. The 'targetType' attribute was
missing though causing the compatibility code to add a second
<console> device
* src/conf/domain_conf.c: Set targetType for serial device
When libvirtd shuts down, it places a <state/> tag in the XML
state file it writes out for guests with PCI passthrough
devices. For devices that are attached at bootup time, the
state tag is empty. However, at libvirtd startup time, it
ignores anything with a <state/> tag in the XML, effectively
hiding the guest.
This patch remove the check for VIR_DOMAIN_XML_INTERNAL_STATUS
when parsing the XML.
* src/conf/domain_conf.c: remove VIR_DOMAIN_XML_INTERNAL_STATUS
flag check in virDomainHostdevSubsysPciDefParseXML()
Certain hypervisors (like qemu/kvm) map the PCI bar(s) on
the host when doing device passthrough. This can lead to a race
condition where the hypervisor is still cleaning up the device while
libvirt is trying to re-attach it to the host device driver. To avoid
this situation, we look through /proc/iomem, and if the hypervisor is
still holding onto the bar (denoted by the string in the matcher variable),
then we can wait around a bit for that to clear up.
v2: Thanks to review by DV, make sure we wait the full timeout per-device
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The patches to add ACS checking to PCI device passthrough
introduced a bug. With the current code, if you try to
passthrough a device on the root bus (i.e. bus 0), then
it denies the passthrough. This is because the code in
pciDeviceIsBehindSwitchLackingACS() to check for a parent
device doesn't take into account the possibility of the
root bus. If we are on the root bus, it means we
legitimately can't find a parent, and it also means that
we don't have to worry about whether ACS is enabled.
Therefore return 0 (indicating we don't lack ACS) from
pciDeviceIsBehindSwitchLackingACS().
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Fix a small problem with the qemu memory stats parsing algorithm. If qemu
reports a stat that libvirt does not recognize, skip past it so parsing can
continue. This corrects a potential infinite loop in the parsing code that can
only be triggered if new statistics are added to qemu.
* src/qemu/qemu_monitor_text.c: qemuMonitorParseExtraBalloonInfo add a
skip for extra ','
The loop looking for the controller associated with a SCI drive had
an off by one, causing it to miss the last controller.
* src/qemu/qemu_driver.c: Fix off-by-1 in searching for SCSI
drive hotplug
The hotplug code in QEMU was leaking memory because although the
inner device object was being moved into the main virDomainDefPtr
config object, the outer container virDomainDeviceDefPtr was not.
* src/qemu/qemu_driver.c: Clarify code to show that the inner
device object is owned by the main domain config upon
successfull attach.
Add the ability to turn off dynamic management of file permissions
for libvirt guests.
* qemu/libvirtd_qemu.aug: Support 'dynamic_ownership' flag
* qemu/qemu.conf: Document 'dynamic_ownership' flag.
* qemu/qemu_conf.c: Load 'dynamic_ownership' flag
* qemu/test_libvirtd_qemu.aug: Test 'dynamic_ownership' flag
The hotplug code was not correctly invoking the security driver
in error paths. If a hotplug attempt failed, the device would
be left with VM permissions applied, rather than restored to the
original permissions. Also, a CDROM media that is ejected was
not restored to original permissions. Finally there was a bogus
call to set hostdev permissions in the hostdev unplug code
* qemu/qemu_driver.c: Fix security driver usage in hotplug/unplug
If there is a problem with VM startup, PCI devices may be left
assigned to pci-stub / pci-back. Adding a call to reattach
host devices in the cleanup path is required.
* qemu/qemu_driver.c: qemuDomainReAttachHostDevices() when
VM startup fails
Remove all the QEMU driver calls for setting file ownership and
process uid/gid. Instead wire in the QEMU DAC security driver,
stacking it ontop of the primary SELinux/AppArmour driver.
* qemu/qemu_driver.c: Switch over to new DAC security driver
This new security driver is responsible for managing UID/GID changes
to the QEMU process, and any files/disks/devices assigned to it.
* qemu/qemu_conf.h: Add flag for disabling automatic file permission
changes
* qemu/qemu_security_dac.h, qemu/qemu_security_dac.c: New DAC driver
for QEMU guests
* Makefile.am: Add new files
Pulling the disk labelling code out of the exec hook, and into
libvirtd will allow it to access shared state in the daemon. It
will also make debugging & error reporting easier / more reliable.
* qemu/qemu_driver.c: Move initial disk labelling calls up into
libvirtd. Add cleanup of disk labels upon failure
If a VM fails to start, we can't simply free the security label
strings, we must call the domainReleaseSecurityLabel() method
otherwise the reserved 'mcs' level will be leaked in SElinux
* src/qemu/qemu_driver.c: Invoke domainReleaseSecurityLabel()
when domain fails to start
The current security driver architecture has the following
split of logic
* domainGenSecurityLabel
Allocate the unique label for the domain about to be started
* domainGetSecurityLabel
Retrieve the current live security label for a process
* domainSetSecurityLabel
Apply the previously allocated label to the current process
Setup all disk image / device labelling
* domainRestoreSecurityLabel
Restore the original disk image / device labelling.
Release the unique label for the domain
The 'domainSetSecurityLabel' method is special because it runs
in the context of the child process between the fork + exec.
This is require in order to set the process label. It is not
required in order to label disks/devices though. Having the
disk labelling code run in the child process limits what it
can do.
In particularly libvirtd would like to remember the current
disk image label, and only change shared image labels for the
first VM to start. This requires use & update of global state
in the libvirtd daemon, and thus cannot run in the child
process context.
The solution is to split domainSetSecurityLabel into two parts,
one applies process label, and the other handles disk image
labelling. At the same time domainRestoreSecurityLabel is
similarly split, just so that it matches the style. Thus the
previous 4 methods are replaced by the following 6 new methods
* domainGenSecurityLabel
Allocate the unique label for the domain about to be started
No actual change here.
* domainReleaseSecurityLabel
Release the unique label for the domain
* domainGetSecurityProcessLabel
Retrieve the current live security label for a process
Merely renamed for clarity.
* domainSetSecurityProcessLabel
Apply the previously allocated label to the current process
* domainRestoreSecurityAllLabel
Restore the original disk image / device labelling.
* domainSetSecurityAllLabel
Setup all disk image / device labelling
The SELinux and AppArmour drivers are then updated to comply with
this new spec. Notice that the AppArmour driver was actually a
little different. It was creating its profile for the disk image
and device labels in the 'domainGenSecurityLabel' method, where as
the SELinux driver did it in 'domainSetSecurityLabel'. With the
new method split, we can have consistency, with both drivers doing
that in the domainSetSecurityAllLabel method.
NB, the AppArmour changes here haven't been compiled so may not
build.
The QEMU driver is doing 90% of the calls to check for static vs
dynamic labelling. Except it is forgetting todo so in many places,
in particular hotplug is mistakenly assigning disk labels. Move
all this logic into the security drivers themselves, so the HV
drivers don't have to think about it.
* src/security/security_driver.h: Add virDomainObjPtr parameter
to virSecurityDomainRestoreHostdevLabel and to
virSecurityDomainRestoreSavedStateLabel
* src/security/security_selinux.c, src/security/security_apparmor.c:
Add explicit checks for VIR_DOMAIN_SECLABEL_STATIC and skip all
chcon() code in those cases
* src/qemu/qemu_driver.c: Remove all checks for VIR_DOMAIN_SECLABEL_STATIC
or VIR_DOMAIN_SECLABEL_DYNAMIC. Add missing checks for possibly NULL
driver entry points.
Allows the initiator to use a variety of IQNs rather than just the
system IQN when creating iSCSI pools.
* docs/schemas/storagepool.rng: extends the syntax with <iqn name="..."/>
* src/conf/storage_conf.[ch]: read and stores the iqn name
* src/storage/storage_backend_iscsi.[ch]: implement the IQN selection
when detected
* src/lxc/lxc_container.c src/lxc/lxc_controller.c src/lxc/lxc_driver.c
src/network/bridge_driver.c src/qemu/qemu_driver.c
src/uml/uml_driver.c: virFileMakePath returns 0 for success, or the
value of errno on failure, so error checking should be to test
if non-zero, not if lower than 0
Previously the uid/gid/mode in the xml was ignored when creating new
storage pool directories. This commit attempts to honor the requested
permissions, and spits out an error if it can't.
Note that when creating the directory, the rest of the path leading up
to the final element is created using current uid/gid/mode, and the
final element gets the settings from xml. It is NOT an error for the
directory to already exist; in this case, the perms for the existing
directory are just set (if necessary).
* src/storage/storage_backend_fs.c: update the virStorageBackendFileSystemBuild
function to check the directory hierarchy separately then create the
leaf directory with the right attributes
In order to avoid problems trying to chown files that were created by
root on a root-squashing nfs server, fork a new process that setuid's
to the desired uid before creating the file. (It's only done this way
if the pool containing the new volume is of type 'netfs', otherwise
the old method of creating the file followed by chown() is used.)
This changes the semantics of the "create_func" slightly - previously
it was assumed that this function just created the file, then the
caller would chown it to the desired uid. Now, create_func does both
operations.
There are multiple functions that can take on the role of create_func:
createFileDir - previously called mkdir(), now calls virDirCreate().
virStorageBackendCreateRaw - previously called open(),
now calls virFileCreate().
virStorageBackendCreateQemuImg - use virRunWithHook() to setuid/gid.
virStorageBackendCreateQcowCreate - same.
virStorageBackendCreateBlockFrom - preserve old behavior (but attempt
chown when necessary even if not root)
* src/storage/storage_backend.[ch] src/storage/storage_backend_disk.c
src/storage/storage_backend_fs.c src/storage/storage_backend_logical.c
src/storage/storage_driver.c: change the create_func implementations,
also propagate the pool information to be able to detect NETFS ones.
These functions create a new file or directory with the given
uid/gid. If the flag VIR_FILE_CREATE_AS_UID is given, they do this by
forking a new process, calling setuid/setgid in the new process, and
then creating the file. This is better than simply calling open then
fchown, because in the latter case, a root-squashing nfs server would
create the new file as user nobody, then refuse to allow fchown.
If VIR_FILE_CREATE_AS_UID is not specified, the simpler tactic of
creating the file/dir, then chowning is is used. This gives better
results in cases where the parent directory isn't on a root-squashing
NFS server, but doesn't give permission for the specified uid/gid to
create files. (Note that if the fork/setuid method fails to create the
file due to access privileges, the parent process will make a second
attempt using this simpler method.)
If the bit VIR_FILE_CREATE_ALLOW_EXIST is set in the flags, an
existing file/directory will not cause an error; in this case, the
function will simply set the permissions of the file/directory to
those requested. If VIR_FILE_CREATE_ALLOW_EXIST is not specified, an
existing file/directory is considered (and reported as) an error.
Return from both of these functions is 0 on success, or the value of
errno if there was a failure.
* src/util/util.[ch]: add the 2 new util functions
The test expected all environment variables copied in qemudBuildCommandLine
to have known values. So all of them have to be either set to a known value
or be unset. SDL_VIDEODRIVER and QEMU_AUDIO_DRV are not handled at all but
should be handled. Unset both, otherwise the test will fail if they are set
in the testing environment.
* src/qemu/qemu_conf.c: add a comment about copied environment variables
and qemuxml2argvtest
* tests/qemuxml2argvtest.c: unset SDL_VIDEODRIVER and QEMU_AUDIO_DRV
* src/conf/domain_conf.c (virDomainChrDefFormat): Plug a leak on
an error path, and at the same time, eliminate the need for a
"cleanup:" block. Before, the "return -1" after the switch
would leak an "addr" string. Now, by reversing the port,addr-
getting blocks we can free "addr" immediately and skip the goto.
The 'int virInterfaceIsActive()' method was directly returning the
value of the 'int active:1' bitfield in virIntefaceDefPtr. A bitfield
with a signed integer, will hold the values 0 and -1, not 0 and +1
as might be expected. This meant that virInterfaceIsActive() was
always returning -1 when the interface was active, not +1 & thus all
callers thought an error had occurred. To protect against this kind
of mistake again, change all bitfields to be unsigned ints
* daemon/libvirtd.h, src/conf/domain_conf.h, src/conf/interface_conf.h,
src/conf/network_conf.h: Change bitfields to unsigned int.
Invoking the virConnectGetCapabilities() method causes the QEMU
driver to rebuild its internal capabilities object. Unfortunately
it was forgetting to register the custom domain status XML hooks
again.
To avoid this kind of error in the future, the code which builds
capabilities is refactored into one single method, which can be
called from all locations, ensuring reliable rebuilds.
* src/qemu/qemu_driver.c: Fix rebuilding of capabilities XML and
guarentee it is always consistent
* src/util/logging.c (virLogMessage): Include "ignore-value.h".
Use it to ignore the return value of safewrite.
Use STDERR_FILENO, rather than "2".
* bootstrap (modules): Add ignore-value.
* gnulib: Update to latest, for ignore-value that is now LGPLv2+.
This was accomplished in xml parsing by doing away with the
stripped-down virInterfaceBareDef object, and just always using
virInterfaceDef, but with restrictions in certain places (eg, the type
of subordinate interface allowed in parsing depends on the parent
interface).
xml formatting was similarly adjusted. In addition, the formatting
functions keep track of the level of interface nesting, and insert
extra leading spaces on each line accordingly (using %*s).
The only change in formatted xml from previous (aside frmo supporting
new combinations of interface types) is that the subordinate ethernet
interfaces take up 2 lines rather than one, eg:
<interface type='ethernet' name='eth0'>
</interface>
instead of:
<interface type='ethernet' name='eth0'/>
I noticed some debug messages are printed with an empty lines after
them. This patch removes these empty lines from all invocations of the
following macros:
VIR_DEBUG
VIR_DEBUG0
VIR_ERROR
VIR_ERROR0
VIR_INFO
VIR_WARN
VIR_WARN0
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
New pciDeviceIsAssignable() function for checking whether a given PCI
device can be assigned to a guest was added. Currently it only checks
for ACS being enabled on all PCIe switches between root and the PCI
device. In the future, it could be the right place to check whether a
device is unbound or bound to a stub driver.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Currently CPU topology may only be specified together with CPU model:
<cpu match='exact'>
<model>name</model>
<topology sockets='1' cores='2' threads='3'/>
</cpu>
This patch allows for CPU topology specification without the need for
also specifying CPU model:
<cpu>
<topology sockets='1' cores='2' threads='3'/>
</cpu>
'match' attribute and 'model' element are made optional with the
restriction that 'match' attribute has to be set when 'model' is
present.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When comparing x86 CPUs, features with 'disabled' policy were mistakenly
required to be supported by the host CPU.
Likewise, features with 'force' policy which were supported by host CPU
would make CPUs incompatible if 'strict' match was used by guest CPU.
This patch fixes both issues.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
QEMU's command line equivalent for the following domain XML fragment
<vcpus>2</vcpus>
<cpu ...>
...
<topology sockets='1' cores='2', threads='1'/>
</cpu>
is
-smp 2,sockets=1,cores=2,threads=1
This syntax was introduced in QEMU-0.12.
Version 2 changes:
- -smp argument build split into a separate function
- always add ",sockets=S,cores=C,threads=T" to -smp if qemu supports it
- use qemuParseCommandLineKeywords for command line parsing
Version 3 changes:
- ADD_ARG_LIT => ADD_ARG and line reordering in qemudBuildCommandLine
- rebased
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Current version expects name=value,... list and when an incorrect string
such as "a,b,c=d" would be parsed as "a,b,c" keyword with "d" value
without reporting any error, which is probably not the expected
behavior.
This patch adds an extra argument called allowEmptyValue, which if
non-zero will permit keywords with no value; "a,b=c,,d=" will be parsed
as follows:
keyword value
"a" NULL
"b" "c"
"" NULL
"d" ""
In case allowEmptyValue is zero, the string is required to contain
name=value pairs only; retvalues is guaranteed to contain non-NULL
pointers. Now, "a,b,c=d" will result in an error.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Replace
-balloon virtio
With
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3
This allows it to get correct assigned PCI address as declared in
previous patch
* src/qemu/qemu_conf.c: Convert Virtio ballon to -device and
give it an explicit PCI address
* tests/qemuxml2argvdata/qemuxml2argv-*args: Add in virtio balloon
where appropriate
Instead of relying on QEMU to assign PCI addresses and then querying
them with 'info pci', manually assign all PCI addresses before starting
the guest. These addresses are not stable across reboots. That will
come in a later patch
NB, the PIIX3 (IDE, FDC, ISA-Bridge) will always have slot 1 and
VGA will always have slot 2. We declare the Virtio Balloon gets
slot 3, and then all remaining slots are for configured devices.
* src/qemu/qemu_conf.c: If -device is supported, then assign all PCI
addresses when building the command line
* src/qemu/qemu_driver.c: Don't query monitor for PCI addresses if
they have already been assigned
* tests/qemuxml2argvdata/qemuxml2argv-hostdev-pci-address-device.args,
tests/qemuxml2argvdata/qemuxml2argv-net-virtio-device.args,
tests/qemuxml2argvdata/qemuxml2argv-sound-device.args,
tests/qemuxml2argvdata/qemuxml2argv-watchdog-device.args: Update
to include PCI slot/bus information
QEMU always configures a VGA card. If no video card is included in
the libvirt XML, it is neccessary to explicitly turn off the default
using -vga none
* src/qemu/qemu_conf.c: Pass -vga none if no video card is configured
* tests/qemuargv2xmltest.c, tests/qemuxml2argvtest.c: Test for
handling -vga none.
* tests/qemuxml2argvdata/qemuxml2argv-nographics-vga.args,
tests/qemuxml2argvdata/qemuxml2argv-nographics-vga.xml: Test
data files
Not all QEMU builds default to SDL graphics for their display.
Newer QEMU now has an explicit -sdl flag, which we can use to
explicitly request SDL intead of relying on the default. This
protects libvirt against unexpected changes in graphics default
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Probe for -sdl
flag and use it if it is found
* tests/qemuhelptest.c: Add SDL flag to tests
The old syntax was
-chardev SOMECONFIG
-nic user,guestfwd=tcp:IP:PORT-chardev:CHARDEV
The new syntax is
-chardev SOMECONFIG
-netdev user,guestfwd=tcp:IP:PORT,chardev=ID,id=user-ID
The old syntax was
-usbdevice host:PRODUCT:VENDOR
Or
-usbdevice host:BUS.DEV
The new syntax is
-device usb-host,product=PRODUCT,vendor=VENDOR
Or
-device usb-host,hostbus=BUS,hostaddr=DEV
The previous syntax was severely limited in its options
-usbdevice disk:/home/berrange/output.img
The new syntax is the same as for other disk types
-drive file=/home/berrange/output.img,if=none,id=usb-1,index=1
-device usb-storage,drive=usb-1
Again, the index= arg is wrong here, and will be removed in a
later merge
The current syntax uses a pair of args
-net nic,macaddr=52:54:00:56:6c:55,vlan=3,model=pcnet,name=pcnet.0
-net user,vlan=3,name=user.0
The new syntax does not need the vlan craziness anymore, and
so has a simplified pair of args
-netdev user,id=user.0
-device pcnet,netdev=user.0,id=pcnet.0,mac=52:54:00:56:6c:55,addr=<PCI SLOT>
The current preferred syntax for disk drives uses
-drive file=/vms/plain.qcow,if=virtio,index=0,boot=on,format=qcow
The new syntax splits this up into a pair of linked args
-drive file=/vms/plain.qcow,if=none,id=drive-virtio-0,format=qcow2
-device virtio-blk-pci,drive=drive-virtio-0,id=virtio-0,addr=<PCI SLOT>
SCSI/IDE devices also get a bus property linking them to the
controller
-device scsi-disk,drive=drive-scsi0-0-0,id=scsi0-0-0,bus=scsi0.0,scsi-id=0
-device ide-drive,drive=drive-ide0-0-0,id=ide0-0-0,bus=ide0,unit=0
The current syntax for audio devices is a horrible multiplexed
arg
-soundhw sb16,pcspk,ac97
The new syntax is
-device sb16,id=sound0
or
-device AC97,id=sound1,addr=<PCI SLOT>
NB, pcspk still uses the old -soundhw syntax
The current character device syntax uses either
-serial tty,path=/dev/ttyS2
Or
-chardev tty,id=serial0,path=/dev/ttyS2 -serial chardev:serial0
With the new -device support, we now prefer
-chardev file,id=serial0,path=/tmp/serial.log -device isa-serial,chardev=serial0
This patch changes the existing -chardev syntax to use this new
scheme, and fallbacks to the old plain -serial syntax for old
QEMU.
The monitor device changes to
-chardev socket,id=monitor,path=/tmp/test-monitor,server,nowait -mon chardev=monitor
In addition, this patch adds --nodefaults, which kills off the
default serial, parallel, vga and nic devices. THis avoids the
need for us to explicitly turn each off
When starting a guest, give every device a unique alias. This will
be used for the 'id' parameter in -device args in later patches.
It can also be used to uniquely identify devices in the monitor
For old QEMU without -device, assign disk names based on QEMU's
historical naming scheme.
* src/qemu/qemu_conf.c: Assign unique device aliases
* src/qemu/qemu_driver.c: Remove obsolete qemudDiskDeviceName
and use the device alias in eject & blockstats commands
* src/storage/storage_backend_fs.c (virStorageBackendFileSystemRefresh):
Correct parentheses. The documented intent is to ignore non-regular
files, yet due to a parenthesization error all errors were handled
that way.
Probe for the new -device flag and if available set the -nodefaults
flag, instead of using -net none, -serial none or -parallel none.
Other device types will be converted to use -device in later patches.
The -nodefaults flag will help avoid unwelcome surprises from future
QEMU releases
* src/qemu/qemu_conf.c: Probe for -device. Add -nodefaults flag.
Remove -net none, -serial none or -parallel none
* src/qemu/qemu_conf.h: Define QEMU_CMD_FLAG_DEVICE
* tests/qemuhelpdata/qemu-0.12.1: New data file for 0.12.1 QEMU
* tests/qemuhelptest.c: Test feature extraction from 0.12.1 QEMU
Although the serial, parallel, chanel, input & fs devices do
not have PCI address info, they can all have device aliases.
Thus it neccessary to associate the virDomainDeviceInfo data
with them all.
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add hooks for
parsing / formatting device info for serial, parallel, channel
input and fs devices.
* docs/schemas/domain.rng: Associate device info with character
devices, input & fs device
This patch introduces the support for giving all devices a short,
unique name, henceforth known as a 'device alias'. These aliases
are not set by the end user, instead being assigned by the hypervisor
if it decides it want to support this concept.
The QEMU driver sets them whenever using the -device arg syntax
and uses them for improved hotplug/hotunplug. it is the intent
that other APIs (block / interface stats & device hotplug) be
able to accept device alias names in the future.
The XML syntax is
<alias name="video0"/>
This may appear in any type of device that supports device info.
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add a 'alias'
field to virDomainDeviceInfo struct & parse/format it in XML
* src/libvirt_private.syms: Export virDomainDefClearDeviceAliases
* src/qemu/qemu_conf.c: Replace use of "nic_name" field with the
standard device alias
* src/qemu/qemu_driver.c: Clear device aliases at shutdown
The PCI device addresses are only valid while the VM is running,
since they are auto-assigned by QEMU. After shutdown they must
all be cleared. Future QEMU driver enhancement will allow for
persistent PCI address assignment
* src/conf/domain_conf.h, src/conf/domain_conf.c, src/libvirt_private.syms
Add virDomainDefClearPCIAddresses() method for wiping out auto assigned
PCI addresses
* src/qemu/qemu_driver.c: Clear PCI addresses at VM shutdown
Existing applications using libvirt are not aware of the disk
controller concept. Thus, after parsing the <disk> definitions
in the XML, it is neccessary to create <controller> elements
to satisfy all requested disks, as per their defined drive
addresses
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/libvirt_private.syms: Add virDomainDefAddDiskControllers()
method for populating disk controllers, and call it after
parsing disk definitions.
* src/qemu/qemu_conf.c: Call virDomainDefAddDiskControllers()
when doing ARGV -> XML conversion
* tests/qemuxml2argvdata/qemuxml2argv*.xml: Add disk controller
data to all data files which don't have it already
It is perfectly acceptable to have multiple sound devices of
same type in guest configuration. If the underlying hypervisor
does not like this, it is its job to complain, not the XML
parser's
* src/conf/domain_conf.c: Remove hack which deleted duplicated
sound device models.
* tests/xml2sexprdata/xml2sexpr-fv-sound.xml: Remove duplicate
models
Hotunplug of devices requires that we know their PCI address. Even
hotplug of SCSI drives, required that we know the PCI address of
the SCSI controller to attach the drive to. We can find this out
by running 'info pci' and then correlating the vendor/product IDs
with the devices we booted with.
Although this approach is somewhat fragile, it is the only viable
option with QEMU < 0.12, since there is no way for libvirto set
explicit PCI addresses when creating devices in the first place.
For QEMU > 0.12, this code will not be used.
* src/qemu/qemu_driver.c: Assign all dynamic PCI addresses on
startup of QEMU VM, matching vendor/product IDs
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
API for fetching PCI device address mapping
The current SCSI hotplug support attaches a brand new SCSI controller
for every disk. This is broken because the semantics differ from those
used when starting the VM initially. In the latter case, each SCSI
controller is filled before a new one is added.
If the user specifies an high drive index (sdazz) then at initial
startup, many intermediate SCSI controllers may be added with no
drives.
This patch changes SCSI hotplug so that it exactly matches the
behaviour of initial startup. First the SCSI controller number is
determined for the drive to be hotplugged. If any controller upto
and including that controller number is not yet present, it is
attached. Then finally the drive is attached to the last controller.
NB, this breaks SCSI hotunplug, because there is no 'drive_del'
command in current QEMU. Previous SCSI hotunplug was broken in
any case because it was unplugging the entire controller, not
just the drive in question.
A future QEMU will allow proper SCSI hotunplug of a drive.
This patch is derived from work done by Wolfgang Mauerer on disk
controllers.
* src/qemu/qemu_driver.c: Fix SCSI hotplug to add a drive to
the correct controller, instead of just attaching a new
controller.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
support for 'drive_add' command
This patch allows for explicit hotplug/unplug of SCSI controllers.
Ordinarily this is not required, since QEMU/libvirt will attach
a new SCSI controller whenever one is required. Allowing explicit
hotplug of controllers though, enables the caller to specify a
static PCI address, instead of auto-assigning the next available
PCI slot. Or it will when we have static PCI addressing.
This patch is derived from Wolfgang Mauerer's disk controller
patch series.
* src/qemu/qemu_driver.c: Support hotplug & unplug of SCSI
controllers
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
new API for attaching PCI SCSI controllers
The latter is not officially "wrong", but *is* terribly anachronistic.
I think automake documentation or comments call that syntax obsolescent.
* cfg.mk (_makefile_at_at_check_exceptions): Exempt @SCHEMADIR@
and @SYSCONFDIR@ uses -- there are no Makefile variables for those.
* docs/Makefile.am: Use $(INSTALL), not @INSTALL@.
* examples/dominfo/Makefile.am: Similar.
* examples/domsuspend/Makefile.am: Similar.
* proxy/Makefile.am: Similar.
* python/Makefile.am: Similar.
* python/tests/Makefile.am: Similar.
* src/Makefile.am: Similar.
* tests/Makefile.am: Similar.
Until recently, some gnulib-generated replacement headers
included *other* headers that were not strictly necessary,
thus masking the need in this file for an explicit <stdlib.h>.
* src/util/util.c: Include <stdlib.h> for declarations of e.g.,
strtol, random_r, getenv, etc.
Current implementation of x86Decode() used for CPUID -> model+features
translation does not always select the closest CPU model. When walking
through all models from cpu_map.xml the function considers a new
candidate as a better choice than a previously selected candidate only
if the new one is a superset of the old one. In case the new candidate
is closer to host CPU but lacks some feature comparing to the old
candidate, the function does not choose well.
This patch changes the algorithm so that the closest model is always
selected. That is, the model which requires the lowest number of
additional features to describe host CPU.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemudFindCharDevicePTYsMonitor reports an error if 'info chardev' didn't
provide information for a requested device, even if the log output parsing
had found the pty path for that device. This makes pty assignment fail for
older QEMU/KVM versions. For example KVM 72 on Debian doesn't support
'info chardev', so qemuMonitorTextGetPtyPaths cannot parse any useful
information and the hash for device-id-to-pty-path mapping stays empty.
Make qemudFindCharDevicePTYsMonitor report an error only if the log output
parsing and the 'info chardev' parsing failed to provide the pty path.
* src/conf/domain_conf.c: add defaults for the video device
* src/esx/esx_vmx.[ch]: add VNC support to the VMX handling
* tests/vmx2xmltest.c, tests/xml2vmxtest.c: add tests for the VNC support
The current code for using -drive simply sets the -drive 'index'
parameter. QEMU internally converts this to bus/unit depending
on the type of drive. This does not give us precise control over
the bus/unit assignment though. This change switches over to make
libvirt explicitly calculate the bus/unit number.
In addition bus/unit/index are actually irrelevant for VirtIO
disks, since each virtio disk is a separate PCI device. No disk
controller is involved.
Doing the conversion to bus/unit in libvirt allows us to correctly
attach SCSI controllers when required.
* src/qemu/qemu_conf.c: Specify bus/unit instead of index for
disks
* tests/qemuxml2argvdata/qemuxml2argv-disk*.args: Switch over from
using index=NNNN, to bus=NN, unit=NN for SCSI/IDE/Floppy disks
To enable it to be called from multiple locations, split out
the code for building the -drive arg string. This will be needed
by later patches which do drive hotplug, the conversion to use
-device, and the conversion to controller/bus/unit addressing
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Add qemuBuildDriveStr
for building -drive arg string
Convert the QEMU monitor APIs over to use virDomainDeviceAddress
structs for passing addresses in/out, instead of individual bits.
This makes the number of parameters smaller & easier to deal with.
No functional change
* src/qemu/qemu_driver.c, src/qemu/qemu_monitor.c,
src/qemu/qemu_monitor.h, src/qemu/qemu_monitor_text.c,
src/qemu/qemu_monitor_text.h: Change monitor hotplug APIs to
take an explicit address ptr for all host/guest addresses
This augments virDomainDevice with a <controller> element
that is used to represent disk controllers (e.g., scsi
controllers). The XML format is given by
<controller type="scsi" index="<num>">
<address type="pci" domain="0xNUM" bus="0xNUM" slot="0xNUM"/>
</controller>
where type denotes the disk interface (scsi, ide,...), index
is an integer that identifies the controller for association
with disks, and the <address> element specifies the controller
address on the PCI bus as described in previous commits
The address element can be omitted; in this case, an address
will be assigned automatically.
Most of the code in this patch is from Wolfgang Mauerer's
previous disk controller series
* docs/schemas/domain.rng: Define syntax for <controller>
XML element
* src/conf/domain_conf.c, src/conf/domain_conf.h: Define
virDomainControllerDef struct, and routines for parsing
and formatting XML
* src/libvirt_private.syms: Add virDomainControllerInsert
and virDomainControllerDefFree
When parsing the <disk> element specification, if no <address>
is provided for the disk, then automatically assign one based on
the <target dev='sdXX'/> device name. This provides for backwards
compatability with existing applications using libvirt, while also
allowing new apps to have complete fine grained control.
* src/conf/domain_conf.h, src/conf/domain_conf.c,
src/libvirt_private.syms: Add virDomainDiskDefAssignAddress()
for assigning a controller/bus/unit address based on disk target
* src/qemu/qemu_conf.c: Call virDomainDiskDefAssignAddress() after
generating XML from ARGV
* tests/qemuxml2argvdata/*.xml: Add in drive address information
to all XML files
Add the virDomainDeviceAddress information to the sound, video
and watchdog devices. This means all of them gain the new XML
element
<address .... />
This brings them upto par with disk/net/hostdev devices which
already have address info
* src/conf/domain_conf.h: Add virDomainDeviceAddress to sound,
video & watchdog device struts.
* src/conf/domain_conf.c: Hook up parsing/formatting for
virDomainDeviceAddress in sound, video & watchdog devices
* docs/schemas/domain.rng: Associate device address info
with sound, video & watchdog
Introduce a new structure
struct _virDomainDeviceDriveAddress {
unsigned int controller;
unsigned int bus;
unsigned int unit;
};
and plug that into virDomainDeviceAddress and generates XML that
looks like
<address type='drive' controller='1' bus='0' unit='5'/>
This syntax will be used by the QEMU driver to explicitly control
how drives are attached to the bus
* src/conf/domain_conf.h, src/conf/domain_conf.c: Parsing and
formatting of drive addresses
* docs/schemas/domain.rng: Define new address format for drives
All guest devices now use a common device address structure
summarized by:
enum virDomainDeviceAddressType {
VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE,
VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI,
};
struct _virDomainDevicePCIAddress {
unsigned int domain;
unsigned int bus;
unsigned int slot;
unsigned int function;
};
struct _virDomainDeviceInfo {
int type;
union {
virDomainDevicePCIAddress pci;
} addr;
};
This replaces the anonymous structs in Disk/Net/Hostdev data
structures. Where available, the address is *always* printed
in the XML file, instead of being hidden in the internal state
file.
<address type='pci' domain='0x0000' bus='0x1e' slot='0x07' function='0x0'/>
The structure definition is based on Wolfgang Mauerer's disk
controller patch series.
* docs/schemas/domain.rng: Define the <address> syntax and
associate it with disk/net/hostdev devices
* src/conf/domain_conf.h, src/conf/domain_conf.c,
src/libvirt_private.syms: APIs for parsing/formatting address
information. Also remove the QEMU specific 'pci_addr' attributes
* src/qemu/qemu_driver.c: Replace use of 'pci_addr' attrs with
new standardized format.
With the introduction virDispatchError, hook function errors are
never sent through the error callback, so users will never see
these messages.
Fix this by calling virDispatchError after hook failure.
Based off how QEMU does it, look through /sys/bus/usb/devices/* for
matching vendor:product info, and if found, use info from the surrounding
files to build the device's /dev/bus/usb path.
This fixes USB device assignment by vendor:product when running qemu
as non-root (well, it should, but for some reason I couldn't reproduce
the failure people are seeing in [1], but it appears to work properly)
[1] https://bugzilla.redhat.com/show_bug.cgi?id=542450
udev doesn't prefix USB product/vendor info with '0x', so the
strtol conversions were wrong for the product field (vendor already
set the correct base). Make the change for PCI product/vendor as
well to be safe.
This fixes USB device assignment via virt-manager.
This allows debug statements and raised errors in hook functions to
actually be logged somewhere (stderr). Users can enable debugging in the
daemon and now see more info in /var/log/libvirt/...
Upstream xen has changed parameters to the migration operation
several times over the past 18 months. Changeset 17553 removed
the resouce parameter, Changesets 17709, 17753, and 20326 added
ssl, node, and change_home_server parameters respectively.
Fortunately, testing has revealed that xend will fail the
operation if a parameter is missing but happily honor it if
unknown parameters are provided. Thus all currently supported
parameters can be provided, satisfying current xend but not
regressing older versions.
The virRaiseErrorFull() may invoke the error handler callback
functions an application has registered. This is not good
because the connection object may not be available at this
point, and the caller may be holding locks. This creates a
problem if the error handler calls back into libvirt.
The solutuon is to move invocation of the handler into the
final cleanup code in the public API entry points, where it
is guarenteed to have safe state.
* src/libvirt.c: Invoke virDispatchError() in all error paths
* src/util/virterror.c: Remove virSetConnError/virSetGlobalError,
replacing with virDispatchError(). Move invocation of the
error callbacks into virDispatchError() instead of the
virRaiseErrorFull function which is not in a safe context
qemudWaitForMonitor calls qemudReadLogOutput with qemudFindCharDevicePTYs
as callback. qemudFindCharDevicePTYs calls qemudExtractTTYPath to assign
a string to chr->data.file.path. Afterwards qemudWaitForMonitor may call
qemudFindCharDevicePTYsMonitor that overwrites chr->data.file.path without
freeing the old value. This results in leaking the memory allocated by
qemudExtractTTYPath.
Report an OOM error if the strdup in qemudFindCharDevicePTYsMonitor fails.
Only use pseudo-random generator for uuid if using /dev/random fails.
* src/util/uuid.c: The original code. would only print the warning
message if using /dev/random failed, but would still go ahead and call
virUUIDGeneratePseudoRandomBytes in all cases anyway.
Currently only the faultcode and faultstring are deserialized, the
detail part is ignored. The implementation of many new SOAP types
would be necessary to deserialize the detail part correctly. As an
intermediate solution the raw response is dumped to the debug log.
The -mem-prealloc flag should be used when using large pages
This ensures qemu tries to allocate all required memory immediately,
rather than when first used. The latter mode will crash qemu
if hugepages aren't available when accessed, while the former
should gracefully fallback to non-hugepages.
* src/qemu/qemu_conf.c: add -mem-prealloc flag to qemu command line
when using large pages
xen-unstable c/s 20685 changed the domctl interface, adding a field to
xen_domctl_getdomaininfo structure. This additional field causes stack
corruption in libvirt. xen-unstable c/s 20711 rightly bumped the domctl
interface version so it is at least possible to handle the new field.
This change accounts for shr_pages field added to xen_domctl_getdomaininfo
structure.
The MAC addresses with 00:50:56 prefix are split into several ranges:
00:50:56:00:00:00 - 00:50:56:3f:ff:ff 'static' range (manually assigned)
00:50:56:80:00:00 - 00:50:56:bf:ff:ff 'vpx' range (assigned by a VI Client)
Erroneously the 'vpx' range was assumed to be larger and to occupy the
remaining addresses of the 00:50:56 prefix that are not part of the 'static'
range.
00:50:56 was used as prefix for generated MAC addresses, this is not possible
anymore, because there are gaps in the allowed ranges. Therefore, change the
prefix to 00:0c:29 which is the prefix for auto generated MAC addresses anyway.
Allow arbitrary MAC addresses to be used and set the checkMACAddress VMX option
to false in case the MAC address doesn't fall into any predefined range.
* docs/drvesx.html.in: update website accordingly
* src/esx/esx_driver.c: set the auto generation prefix to 00:0c:29
* src/esx/esx_vmx.c: fix MAC address range handling and allow arbitrary MAC
addresses
* tests/vmx2xml*, tests/xml2vmx*: add some basic MAC address range tests
The data passed to the callback is not guaranteed to be zero terminated,
take care of that by coping the data and adding a zero terminator.
Also dump the data for other types than CURLINFO_TEXT.
Set CURLOPT_VERBOSE to 1 so the debug callback is called when enabled.
A domain with virtualHW version 4 is allowed on an ESX 4.0 server.
If a domain is migrated from an ESX 3.5 server to an ESX 4.0 server
then the virtualHW version stays the same. So a ESX 4.0 server can
host domains with virtualHW version 4.
This invalid free results in heap corruption. Some symptoms I saw
because of this were libvirtd crashing and virt-manager hanging
while trying to enumerate devices.
The behavior for the qemu balloon device has changed. Formerly, a virtio
balloon device was provided by default. Now, '-balloon virtio' must be
specified on the command line to enable it. This patch causes libvirt to
add '-balloon virtio' to the command line whenever the -balloon option is
available.
* src/qemu/qemu_conf.c src/qemu/qemu_conf.h: check for the new flag and
add "-baloon vitio" to qemu command when needed
* tests/qemuhelptest.c: add the new flag for detection
This patch removes the call to vol update after the volume build completes.
The update call is currently meaningless anyway because the vol build is passed
a copy of the definition, so the update result is thrown away. More
importantly, if the user specified a selinux label for the volume, the update
call results in a double free of the label
* src/storage/storage_backend_fs.c: remove the update call
This change makes the 'info chardev' parser ignore any trailing
whitespace on a line. This fixes a specific problem handling a '\r\n'
line ending.
* src/qemu/qemu_monitor_text.c: Ignore trailing whitespace in
'info chardev' output.
* src/qemu/qemu_driver.c (qemudDomainMigratePrepare2): Remove useless
test of always-non-NULL uri_out parameter. Use ATTRIBUTE_NONNULL to
inform tools.
All other stateful drivers are linked directly to libvirtd
instead of libvirt.so. Link the secret driver to libvirtd too.
* daemon/Makefile.am: link the secret driver to libvirtd
* daemon/libvirtd.c: add #ifdef WITH_SECRETS blocks
* src/Makefile.am: don't link the secret driver to libvirt.so
* src/libvirt_private.syms: remove the secretRegister symbol
If using a remote access, sometimes an RPC entry point is not
available, and currently we just end up with a raw:
error: unknown procedure: xxx
error, while this should be more cleanly reported as an unsupported
entry point like for local access
* src/remote/remote_driver.c: convert missing remote entry points into
the unsupported feature error
When querying about a domain from 0.3.3 (or RHEL 5.3) domain located
on a 0.6.3 (RHEL-5) machine, the errors are not properly reported.
This patch from Olivier Fourdan <ofourdan@redhat.com> , slightly
modified to not change the semantic when the domain os details cannot
be provided
* src/xen/proxy_internal.c src/xen/xen_hypervisor.c: add some missing
error reports
Found while trying to cross-compile libvirt on Fedora 12 for Windows.
gnulib redefines 'close' to 'close_used_without_including_unistd_h'
in sys/socket.h if winsock2.h is present and unistd.h has not been
included before sys/socket.h. Reorder some includes to fix this.
* src/qemu/qemu_conf.h: Remove QEMU_CMD_FLAG_0_12 and just leave
the lone JSON flag
* src/qemu/qemu_conf.c: Enable JSON on QEMU 0.13 or later, but
leave it disabled for now
The Xen code for making HVM VT-d PCI passthrough attach and detach
wasn't working properly:
1) In xenDaemonAttachDevice(), we were always trying to reconfigure
a PCI passthrough device, even the first time we added it. This was
because the code in virDomainXMLDevID() was not checking xenstore for
the existence of the device, and always returning 0 (meaning that
the device already existed).
2) In xenDaemonDetachDevice(), we were trying to use "device_destroy"
to detach a PCI device. While you would think that is the right
method to call, it's actually wrong for PCI devices. In particular,
in upstream Xen (and soon in RHEL-5 Xen), device_configure is actually
used to destroy a PCI device.
To fix the attach
problem I add a lookup into xenstore to see if the device we are
trying to attach already exists. To fix the detach problem I change
it so that for PCI detach (only), we use device_configure with the
appropriate sxpr to do the detachment.
* src/xen/xend_internal.c: don't use device_destroy for PCI devices
and fix the other issues.
* src/xen/xs_internal.c src/xen/xs_internal.h: add
xenStoreDomainGetPCIID()
The XML XPath for detecting JSON in the running VM statefile was
wrong causing all VMs to get JSON mode enabled at libvirtd restart.
In addition if a VM was running a JSON enabled QEMU once, and then
altered to point to a non-JSON enabled QEMU later the 'monJSON'
flag would not get reset to 0.
* src/qemu/qemu_driver.c: Fix setting/detection of JSON mode
The code for connecting to a server tries each socket in turn
until it finds one that connects. Unfortunately for TLS sockets
if it connected, but failed TLS handshake it would treat that
as a failure to connect, and try the next socket. This is bad,
it should have reported the TLS failure immediately.
$ virsh -c qemu://somehost.com/system
error: unable to connect to libvirtd at 'somehost.com': Invalid argument
error: failed to connect to the hypervisor
$ ./tools/virsh -c qemu://somehost.com/system
error: server certificate failed validation: The certificate hasn't got a known issuer.
error: failed to connect to the hypervisor
* src/remote/remote_driver.c: Stop trying to connect if the
TLS handshake fails
Use a dynamically sized xdr_array to pass memory stats on the wire. This
supports the addition of future memory stats and reduces the message size
since only supported statistics are returned.
* src/remote/remote_protocol.x: provide defines for the new entry point
* src/remote/remote_driver.c daemon/remote.c: implement the client and
server side
* daemon/remote_dispatch_args.h daemon/remote_dispatch_prototypes.h
daemon/remote_dispatch_ret.h daemon/remote_dispatch_table.h
src/remote/remote_protocol.c src/remote/remote_protocol.h: generated
stubs
Support for memory statistics reporting is accepted for qemu inclusion.
Statistics are reported via the monitor command 'info balloon' as a comma
seprated list:
(qemu) info balloon
balloon: actual=1024,mem_swapped_in=0,mem_swapped_out=0,major_page_faults=88,minor_page_faults=105535,free_mem=1017065472,total_mem=1045229568
Libvirt, qemu, and the guest operating system may support a subset of the
statistics defined by the virtio spec. Thus, only statistics recognized by
components will be reported.
* src/qemu/qemu_driver.c src/qemu/qemu_monitor_text.[ch]: implement the
new entry point by using info balloon monitor command
Set up the types for the domainMemoryStats function and insert it into the
virDriver structure definition. Because of static initializers, update
every driver and set the new field to NULL.
* include/libvirt/libvirt.h.in: new API
* src/driver.h src/*/*_driver.c src/vbox/vbox_tmpl.c: add the new
entry to the driver structure
* python/generator.py: fix compiler errors, the actual python binding is
implemented later
If a virtual machine is destroyed on a ESX server then immediately
undefining this virtual machine on a vCenter may fail, because the
vCenter has not been informed about the status change yet. Therefore,
destroy a virtual machine on a vCenter if available, so the vCenter
is up-to-date when the virtual machine should be undefined.
Undefining a virtual machine on an ESX server leaves a orphan on the
vCenter behind. So undefine a virtual machine on a vCenter if available
to fix this problem.
If an ESX host is managed by a vCenter, it knows the IP address of the
vCenter. Setting the vCenter query parameter to * allows to connect to the
vCenter known to an ESX host without the need to specify its IP address
or hostname explicitly.
esxDomainLookupByUUID() and esxDomainIsActive() lookup a domain by asking
ESX for all known domains and searching manually for the one with the
matching UUID. This is inefficient. The VI API allows to lookup by UUID
directly: FindByUuid().
* src/esx/esx_driver.c: change esxDomainLookupByUUID() and esxDomainIsActive()
to use esxVI_LookupVirtualMachineByUuid(), also reorder some functions to
keep them in sync with the driver struct
Questions can block tasks, to handle them automatically the driver can answers
them with the default answer. The auto_answer query parameter allows to enable
this automatic question handling.
* src/esx/README: add a detailed explanation for automatic question handling
* src/esx/esx_driver.c: add automatic question handling for all task related
driver functions
* src/esx/esx_util.[ch]: add handling for the auto_answer query parameter
* src/esx/esx_vi.[ch], src/esx/esx_vi_methods.[ch], src/esx/esx_vi_types.[ch]:
add new VI API methods and types and additional helper functions for
automatic question handling
Commit 33a198c1f6 increased the gcrypt
version requirement to 1.4.2 because the GCRY_THREAD_OPTION_VERSION
define was added in this version.
The configure script doesn't check for the gcrypt version. To support
gcrypt versions < 1.4.2 change the virTLSThreadImpl initialization
to use GCRY_THREAD_OPTION_VERSION only if it's defined.
Each driver supporting CPU selection must fill in host CPU capabilities.
When filling them, drivers for hypervisors running on the same node as
libvirtd can use cpuNodeData() to obtain raw CPU data. Other drivers,
such as VMware, need to implement their own way of getting such data.
Raw data can be decoded into virCPUDefPtr using cpuDecode() function.
When implementing virConnectCompareCPU(), a hypervisor driver can just
call cpuCompareXML() function with host CPU capabilities.
For each guest for which a driver supports selecting CPU models, it must
set the appropriate feature in guest's capabilities:
virCapabilitiesAddGuestFeature(guest, "cpuselection", 1, 0)
Actions needed when a domain is being created depend on whether the
hypervisor understands raw CPU data (currently CPUID for i686, x86_64
architectures) or symbolic names has to be used.
Typical use by hypervisors which prefer CPUID (such as VMware and Xen):
- convert guest CPU configuration from domain's XML into a set of raw
data structures each representing one of the feature policies:
cpuEncode(conn, architecture, guest_cpu_config,
&forced_data, &required_data, &optional_data,
&disabled_data, &forbidden_data)
- create a mask or whatever the hypervisor expects to see and pass it
to the hypervisor
Typical use by hypervisors with symbolic model names (such as QEMU):
- get raw CPU data for a computed guest CPU:
cpuGuestData(conn, host_cpu, guest_cpu_config, &data)
- decode raw data into virCPUDefPtr with a possible restriction on
allowed model names:
cpuDecode(conn, guest, data, n_allowed_models, allowed_models)
- pass guest->model and guest->features to the hypervisor
* src/cpu/cpu.c src/cpu/cpu.h src/cpu/cpu_generic.c
src/cpu/cpu_generic.h src/cpu/cpu_map.c src/cpu/cpu_map.h
src/cpu/cpu_x86.c src/cpu/cpu_x86.h src/cpu/cpu_x86_data.h
* configure.in: check for CPUID instruction
* src/Makefile.am: glue the new files in
* src/libvirt_private.syms: add new private symbols
* po/POTFILES.in: add new cpu files containing translatable strings
* src/remote/remote_protocol.x: update with new entry point
* daemon/remote.c: add the new server dispatcher
* daemon/remote_dispatch_args.h daemon/remote_dispatch_prototypes.h
daemon/remote_dispatch_ret.h daemon/remote_dispatch_table.h
src/remote/remote_protocol.c src/remote/remote_protocol.h: regenerated
* src/driver.h: add an extra entry point in the structure
* src/esx/esx_driver.c src/lxc/lxc_driver.c src/opennebula/one_driver.c
src/openvz/openvz_driver.c src/phyp/phyp_driver.c src/qemu/qemu_driver.c
src/remote/remote_driver.c src/test/test_driver.c src/uml/uml_driver.c
src/vbox/vbox_tmpl.c src/xen/xen_driver.c: add NULL entry points for
all drivers
* include/libvirt/virterror.h src/util/virterror.c: add new domain
VIR_FROM_CPU for errors
* src/conf/cpu_conf.c src/conf/cpu_conf.h: new parsing module
* src/Makefile.am proxy/Makefile.am: include new files
* src/conf/capabilities.[ch] src/conf/domain_conf.[ch]: reference
new code
* src/libvirt_private.syms: private export of new entry points
GNUTLS uses gcrypt for its crypto functions. gcrypt requires
that the app/library initializes threading before using it.
We don't want to force apps using libvirt to know about
gcrypt, so we make virInitialize init threading on their
behalf. This location also ensures libvirtd has initialized
it correctly. This initialization is required even if libvirt
itself were only using one thread, since another non-libvirt
library (eg GTK-VNC) could also be using gcrypt from another
thread
* src/libvirt.c: Register thread functions for gcrypt
* configure.in: Add -lgcrypt to linker flags
* src/storage/storage_driver.c: Fix IsPersistent() and IsActivE()
methods on storage pools to use 'storagePrivateData' instead
of 'privateData'. Also fix naming convention of objects
* src/esx/esx_vi.c (esxVI_List_CastFromAnyType): For invalid
inputs, fail right away. Do not "goto failure" where a NULL
input pointer would be dereferenced.
* src/esx/esx_util.c (esxUtil_ParseDatastoreRelatedPath): Return
right away for invalid inputs, rather than using them (which would
dereference NULL pointers) in clean-up code.
The virFileResolveLink utility function relied on the POSIX guarantee
that stat.st_size of a symlink is the length of the value. However,
on some types of file systems, it is invalid, so do not rely on it.
Use gnulib's areadlink module instead.
* bootstrap (modules): Add areadlink.
* src/util/util.c: Include "areadlink.h".
Let areadlink perform the readlink and malloc.
* configure.in (AC_CHECK_FUNCS): Remove readlink. No need,
since it's presence is guaranteed by gnulib.
* src/xen/xm_internal.c (xenXMConfigGetULong): Remove useless and
misleading test (always false) for val->str == NULL before code that
always dereferences val->str. "val" comes from virConfGetValue, and
at that point, val->str is guaranteed to be non-NULL.
(xenXMConfigGetBool): Likewise.
* src/util/conf.c (virConfSetValue): Ensure that vir->str is never NULL,
not even if someone tries to set such a value via virConfSetValue.
* src/qemu/qemu_driver.c (doNonTunnelMigrate): Don't let a
NULL "uri_out" provoke a NULL-dereference in doNativeMigrate:
supply omitted goto-after-qemudReportError.
* src/qemu/qemu_driver.c (qemudDomainMigratePrepareTunnel): Upon an
out of memory error, we would end up with unixfile==NULL and attempt
to unlink(NULL). Skip the unlink when it's NULL.
* src/qemu/qemu_driver.c (qemudDomainMigrateFinish2): Set
"event" to NULL after qemuDomainEventQueue frees it, so a
subsequent free (after endjob label) upon qemuMonitorStartCPUs
failure does not cause a double free.
* src/node_device/node_device_driver.c (update_driver_name): The
previous code would write one byte beyond the end of the 4KiB
stack buffer when presented with a symlink value of exactly that
length (very unlikely). Remove the automatic buffer and use
virFileResolveLink in place of readlink. Suggested by Daniel Veillard.
* src/storage/storage_backend_fs.c: virStorageBackendFileSystemDelete
was incorrectly calling unlink() in an attempt to remove a directory.
It should be calling rmdir() instead.
src/node_device/node_device_udev.c was using a function available only
on the daemon code, fix this and use the function available globally
* src/node_device/node_device_udev.c: replace use of virEventAddHandleImpl
by virEventAddHandle
If there are no references remaining to the object, vm is set to NULL
and vm->persistent cannot be accessed. Fixed by this trivial patch.
* src/qemu/qemu_driver.c (qemudDomainCoreDump): Avoid possible
NULL pointer dereference on --crash dump.
This is trivial for QEMU since you just have to not stop the vm before
starting the dump. And for Xen, you just pass the flag down to xend.
* include/libvirt/libvirt.h.in (virDomainCoreDumpFlags): Add VIR_DUMP_LIVE.
* src/qemu/qemu_driver.c (qemudDomainCoreDump): Support live dumping.
* src/xen/xend_internal.c (xenDaemonDomainCoreDump): Support live dumping.
* tools/virsh.c (opts_dump): Add --live. (cmdDump): Map it to VIR_DUMP_LIVE.
This patch adds the --crash option (already present in "xm dump-core")
to "virsh dump". virDomainCoreDump already has a flags argument, so
the API/ABI is untouched.
* include/libvirt/libvirt.h.in (virDomainCoreDumpFlags): New flag for
CoreDump
* src/test/test_driver.c (testDomainCoreDump): Do not crash
after dump unless VIR_DUMP_CRASH is given.
* src/qemu/qemu_driver.c (qemudDomainCoreDump): Shutdown the domain
instead of restarting it if --crash is passed.
* src/xen/xend_internal.c (xenDaemonDomainCoreDump): Support --crash.
* tools/virsh.c (opts_dump): Add --crash.
(cmdDump): Map it to flags for virDomainCoreDump and pass them.
1) qemuMigrateToCommand uses ">>" so we have to truncate the file
before starting the migration;
2) the command wasn't updated to chown the driver and set/restore
the security lavels;
3) the VM does not have to be resumed if migration fails;
4) the file is not removed when migration fails.
* src/qemu/qemu_driver.c (qemuDomainCoreDump): Truncate file before
dumping, set/restore ownership and security labels for the file.
Those were pointed by DanB in his review but not yet fixed
* src/qemu/qemu_driver.c: qemudWaitForMonitor() use EnterMonitorWithDriver()
and ExitMonitorWithDriver() there
* src/qemu/qemu_monitor_text.c: checking fro strdu failure and hash
table add error in qemuMonitorTextGetPtyPaths()
This change makes the QEMU driver get pty paths from the output of the
monitor 'info chardev' command. This output is structured, and contains
both the name of the device and the path on the same line. This is
considerably more reliable than parsing the startup log output, which
requires the parsing code to know which order QEMU will print pty
information in.
Note that we still need to parse the log output as the monitor itself
may be on a pty. This should be rare, however, and the new code will
replace all pty paths parsed by the log output method once the monitor
is available.
* src/qemu/qemu_monitor.(c|h) src/qemu_monitor_text.(c|h): Implement
qemuMonitorGetPtyPaths().
* src/qemu/qemu_driver.c: Get pty path information using
qemuMonitorGetPtyPaths().
Change -monitor, -serial and -parallel output to use -chardev if it is
available.
* src/qemu/qemu_conf.c: Update qemudBuildCommandLine to use -chardev where
available.
* tests/qemuxml2argvtest.c tests/qemuxml2argvdata/: Add -chardev equivalents
for all current serial and parallel tests.
Even if qemudStartVMDaemon suceeds, an error was logged such as
'qemuRemoveCgroup:1778 : internal error Unable to find cgroup for'.
This is because qemudStartVMDaemon calls qemuRemoveCgroup to
ensure that old cgroup does not remain. This workaround makes
sense but leaving an error message may confuse users.
* src/qemu/qemu_driver.c: a an option to the function to suppress the
error being logged
This patch fixes the bug where paused/running state is not
transmitted during migration. As a result, in the QEMU driver
for example the machine was always started on the destination
end.
In order to do so, just read the state and if it is appropriate and
set the VIR_MIGRATE_PAUSED flag.
* src/libvirt.c (virDomainMigrateVersion1, virDomainMigrateVersion2):
Automatically add VIR_MIGRATE_PAUSED when appropriate.
* src/xen/xend_internal.c (xenDaemonDomainMigratePerform): Give a nicer
error message when migration of paused domains is attempted.
This adds a new flag, VIR_MIGRATE_PAUSED, that mandates pausing
the migrated VM before starting it.
* include/libvirt/libvirt.h.in (virDomainMigrateFlags): Add VIR_MIGRATE_PAUSED.
* src/qemu/qemu_driver.c (qemudDomainMigrateFinish2): Handle VIR_MIGRATE_PAUSED.
* tools/virsh.c (opts_migrate): Add --suspend. (cmdMigrate): Handle it.
* tools/virsh.pod (migrate): Document it.
This makes a small change on the failed-migration path. Up to now,
all VMs that failed non-live migration after the "stop" command
were restarted. This must not be done when the VM was paused in
the first place.
* src/qemu/qemu_driver.c (qemudDomainMigratePerform): Do not restart
a paused VM that fails migration. Set paused state after "stop",
reset it after failure.
We don't use this method of reloading rules anymore, so we can just
kill the code.
This simplifies things a lot because we no longer need to keep a
table of the rules we've added.
* src/util/iptables.c: kill iptablesReloadRules()
Long ago we tried to use Fedora's lokkit utility in order to register
our iptables rules so that 'service iptables restart' would
automatically load our rules.
There was one fatal flaw - if the user had configured iptables without
lokkit, then we would clobber that configuration by running lokkit.
We quickly disabled lokkit support, but never removed it. Let's do
that now.
The 'my virtual network stops working when I restart iptables' still
remains. For all the background on this saga, see:
https://bugzilla.redhat.com/227011
* src/util/iptables.c: remove lokkit support
* configure.in: remove --enable-lokkit
* libvirt.spec.in: remove the dirs used only for saving rules for lokkit
* src/Makefile.am: ditto
* src/libvirt_private.syms, src/network/bridge_driver.c,
src/util/iptables.h: remove references to iptablesSaveRules
This is the expected behaviour, I think - reloading libvirtd should
be a subset of restarting it.
Note, we reload the rules after we've determined which networks
are active (because we only add the rules for active networks)
and before we start autostart networks (to avoid re-adding the
rules).
* src/network/bridge_driver.c: reload iptables rules on startup
Currently, when we add iptables rules, we keep them on a list so that
we can easily reload them on e.g. 'service libvirtd reload'.
However, we don't save this list to disk, so if libvirtd is restarted
we lose the ability to reload the rules.
The fix is simple - just re-add the damn things on reload.
Note, we delete the rules before re-adding them, just like the current
behaviour of iptRulesReload().
* src/network/bridge_driver.c: re-add the iptables rules on reload.
Replace free(virBufferContentAndReset()) with virBufferFreeAndReset().
Update documentation and replace all remaining calls to free() with
calls to VIR_FREE(). Also add missing calls to virBufferFreeAndReset()
and virReportOOMError() in OOM error cases.
xen-unstable changesets 20321 and 20521 added support for
description in xend domain config. This patch extends that
support in xend backend.
* src/xen/xend_internal.c: add parse and output of domain description
The QEMU 0.10.0 release (and possibly other 0.10.x) has a bug where
it sometimes/often forgets to display the initial monitor greeting
line, soley printing a (qemu). This in turn confuses the text
console parsing because it has a '(qemu)' it is not expecting. The
confusion results in a negative malloc. Bad things follow.
This re-writes the text console handling to be more robust. The key
idea is that it should only look for a (qemu), once it has seen the
original command echo'd back. This ensures it'll skip the bogus stray
(qemu) with broken QEMUs.
* src/qemu/qemu_monitor.c: Add some (disabled) debug code
* src/qemu/qemu_monitor_text.c: Re-write way command replies
are detected
Since the monitor I/O is processed out of band from the main
thread(s) invoking monitor commands, the virDomainObj may be
deleted by the I/O thread. The qemuDomainObjBeginJob takes an
extra reference to protect against final deletion, but this
reference is released by the corresponding EndJob call. THus
after the EndJob call it may not be valid to reference the
virDomainObj any more. To allow callers to detect this, the
EndJob call is changed to return the remaining reference count.
* src/conf/domain_conf.c: Make virDomainObjUnref return the
remaining reference count
* src/qemu/qemu_driver.c: Avoid referencing virDomainObjPtr
after qemuDomainObjEndJob if it has been deleted.
Fix this warning, there is no need to use an intermediate,
different array pointer.
network.c: In function 'getIPv6Addr':
network.c:50: warning: dereferencing type-punned pointer will break strict-aliasing rules
* src/util/network.c: avoid an intermediary pointer cast
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add callbacks
for reset, shutdown, poweroff and stop events. Add convenience
methods for emiting those events
With addition of events there will be alot of callbacks.
To avoid having to add many APIs to register callbacks,
provide them all at once in a big table
* src/qemu/qemu_driver.c: Pass in a callback table to QEMU
monitor code
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h Replace
the EOF and disk secret callbacks with a callback table
Initial support for the new QEMU monitor protocol using JSON
as the data encoding format instead of plain text
* po/POTFILES.in: Add src/qemu/qemu_monitor_json.c
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Hack to turn on QMP
mode. Replace with a version number check on >= 0.12 later
* src/qemu/qemu_monitor.c: Delegate to json monitor if enabled
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Add
impl of QMP protocol
* src/Makefile.am: Add src/qemu/qemu_monitor_json.{c,h}
Now that drivers are using a private domain object state blob,
the virDomainObjFormat/Parse methods are no longer able to
directly serialize all neccessary state to/from XML. It is
thus neccessary to introduce a pair of callbacks fo serializing
private state.
The code for serializing vCPU PIDs and the monitor device
config can now move out of domain_conf.c and into the
qemu_driver.c where they belong.
* src/conf/capabilities.h: Add callbacks for serializing private
state to/from XML
* src/conf/domain_conf.c, src/conf/domain_conf.h: Remove the
monitor, monitor_chr, monitorWatch, nvcpupids and vcpupids
fields from virDomainObjPtr. Remove code that serialized
those fields
* src/libvirt_private.syms: Export virXPathBoolean
* src/qemu/qemu_driver.c: Add callbacks for serializing monitor
and vcpupid data to/from XML
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Pass monitor
char device config into qemuMonitorOpen directly.
The code to start CPUs executing has nothing todo with CPU
affinity masks, so pull it out of the qemudInitCpuAffinity()
method and up into qemudStartVMDaemon()
* src/qemu/qemu_driver.c: Pull code to start CPUs executing out
of qemudInitCpuAffinity()
The current QEMU disk media change does not support setting the
disk format. The new JSON monitor will support this, so add an
extra parameter to pass this info in
* src/qemu/qemu_driver.c: Pass in disk format when changing media
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Add a 'format' arg to qemuMonitorChangeMedia()
The qemuMonitorEscape() method, and the VIR_ENUM for migration
status will be needed by the JSON monitor too, so move that code
into the shared qemu_monitor.c file instead of qemu_monitor_text.c
* src/qemu/qemu_monitor.h: Declare qemuMonitorMigrationStatus enum
and qemuMonitorEscapeArg and qemuMonitorEscapeShell methods
* src/qemu/qemu_monitor.c: Implement qemuMonitorMigrationStatus enum
and qemuMonitorEscapeArg and qemuMonitorEscapeShell methods
* src/qemu/qemu_monitor_text.c: Remove above methods/enum
If QEMU shuts down while we're in the middle of processing a
monitor command, the monitor will be freed, and upon cleaning
up we attempt to do qemuMonitorUnlock(priv->mon) when priv->mon
is NULL.
To address this we introduce proper reference counting into
the qemuMonitorPtr object, and hold an extra reference whenever
executing a command.
* src/qemu/qemu_driver.c: Hold a reference on the monitor while
executing commands, and only NULL-ify the priv->mon field when
the last reference is released
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Add reference
counting to handle safe deletion of monitor objects
configure: yajl: no
CC libvirt_util_la-json.lo
util/json.c:32:27: error: yajl/yajl_gen.h: No such file or directory
util/json.c:33:29: error: yajl/yajl_parse.h: No such file or directory
* src/util/json.c: remove the includes if yajl not configured in
This introduces simple API for handling JSON data. There is
an internal data structure 'virJSONValuePtr' which stores a
arbitrary nested JSON value (number, string, array, object,
nul, etc). There are APIs for constructing/querying objects
and APIs for parsing/formatting string formatted JSON data.
This uses the YAJL library for parsing/formatting from
http://lloyd.github.com/yajl/
* src/util/json.h, src/util/json.c: Data structures and APIs
for representing JSON data, and parsing/formatting it
* configure.in: Add check for yajl library
* libvirt.spec.in: Add build requires for yajl
* src/Makefile.am: Add json.c/h
* src/libvirt_private.syms: Export JSON symbols to drivers
Some of the very useful calls for XML parsing provided by util/xml.[ch]
were not exported as private symbols. This patch fixes this.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Xen HVM guests with PV drivers end up with two network interfaces for
each configured interface. One of them being emulated by qemu and the
other one paravirtual. As this might not be desirable, the attached
patch provides a way for users to specify that only paravirtual network
interface should be presented to the guest.
The configuration was inspired by qemu/kvm driver, for which users can
specify model='virtio' to use paravirtual network interface.
The patch adds support for model='netfront' which results in
type=netfront instead of type=ioemu (or nothing for newer xen versions)
in guests native configuration. Xen's qemu ignores interfaces with
type != ioemu and only paravirtual network device will be seen in the
guest.
Four possible configuration scenarios follow:
- no model specified in domain's XML
- libvirt will behave like before this change; it will set
type=ioemu for HVM guests on xen host which is not newer than
XEND_CONFIG_MAX_VERS_NET_TYPE_IOEMU
- covered by existing tests
- PV guest, any model
- no functional change, model is passed as is (and ignored by the
hypervisor)
- covered by existing tests (e.g., *-net-e1000.*)
- HVM guest, model=netfront
- type is set to "netfront", model is not specified
- covered by new *-net-netfront.* tests
- HVM guest, model != netfront
- type is set to "ioemu", model is passed as is
- covered by new *-net-ioemu.* tests
The fourth scenario feels like a regression for xen newer than
XEND_CONFIG_MAX_VERS_NET_TYPE_IOEMU as users who had a model specified
in their guest's configuration won't see a paravirtual interface in
their guests any more. On the other hand, the reason for specifying a
model is most likely the fact that they want to use such model which
implies emulated interface. Users of older xen won't be affected at all
as their xen provides paravirtual interface regardless of the type used.
- src/xen/xend_internal.c: add netfront support for the xend backend
- src/xen/xm_internal.c: add netfront support for the XM serialization too
Also fixed serial port configuration which was broken due to recent
change in virDomainChrDef where targetType was newly added.
* src/Makefile.am: add new files
* src/vbox/vbox_driver.c: add case for version 3.1
* src/vbox/vbox_tmpl.c: refactor common patterns into macros, support for
version 3.1, serial port configuration fix
* src/vbox/vbox_CAPI_v3_1.h, src/vbox/vbox_V3_1.c: generated code
esxVMX_IndexToDiskName handles indices up to 701. This limit comes
from a mapping gap in virDiskNameToIndex:
sdzy -> 700
sdzz -> 701
sdaaa -> 728
sdaab -> 729
This line in virDiskNameToIndex causes this gap:
idx = (idx + i) * 26;
Fixing it by altering this line to:
idx = (idx + (i < 1 ? 0 : 1)) * 26;
Also add a new version of virIndexToDiskName that handles the inverse
mapping for arbitrary indices.
* src/esx/esx_vmx.[ch]: remove esxVMX_IndexToDiskName
* src/util/util.[ch]: add virIndexToDiskName and fix mapping gap
* tests/esxutilstest.c: update test to verify that the gap is fixed
* src/conf/domain_conf.c: don't call virDomainObjUnlock twice
* src/qemu/qemu_driver.c: relock driver lock if an error occurs in
qemuDomainObjBeginJobWithDriver, enter/exit monitor with driver
in qemudDomainSave
The instruction "See Makefile.am" in libvirt.private_syms
always makes me think that this file is autogenerated
and should not be touched manually. This patch spares
every reader of libvirt.private_syms the hassle of
reading Makefile.am before augmenting libvirt.private_syms.
Signed-off-by: Wolfgang Mauerer <wolfgang.mauerer@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Commit 790f0b3057 causes the contents of
the names array to be freed even on success, resulting in no listing of
defined but inactive Xen domains.
Spotted by Jim Fehlig
Introduce a new type="dir" mode for <disks> that allows use of
QEMU's virtual FAT block device driver. eg
<disk type='dir' device='floppy'>
<source dir='/tmp/test'/>
<target dev='fda' bus='fdc'/>
<readonly/>
</disk>
gets turned into
-drive file=fat:floppy:/tmp/test,if=floppy,index=0
Only read-only disks are supported with virtual FAT mode
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add type="dir"
* docs/schemas/domain.rng: Document new disk type
* src/xen/xend_internal.c, src/xen/xm_internal.c: Raise error for
unsupported disk types
* tests/qemuxml2argvdata/qemuxml2argv-disk-cdrom-empty.args: Fix
empty disk file handling
* tests/qemuxml2argvdata/qemuxml2argv-disk-drive-fat.args,
tests/qemuxml2argvdata/qemuxml2argv-disk-drive-fat.xml,
tests/qemuxml2argvdata/qemuxml2argv-floppy-drive-fat.args,
tests/qemuxml2argvdata/qemuxml2argv-floppy-drive-fat.xml
tests/qemuxml2argvtest.c: Test QEMU vitual FAT driver
* src/qemu/qemu_conf.c: Support generating fat:/some/dir type
disk args
* src/security/security_selinux.c: Temporarily skip labelling
of directory based disks
The cpu_set_t type can only cope with NR_CPUS <= 1024, beyond this
it is neccessary to use alternate CPU_SET maps with a dynamically
allocated CPU map
* src/util/processinfo.c: Support new unlimited size CPU set type
* src/Makefile.am: Add processinfo.h/processinfo.c
* src/util/processinfo.c, src/util/processinfo.h: Module providing
APIs for getting/setting process CPU affinity
* src/qemu/qemu_driver.c: Switch over to new APIs for schedular
affinity
* src/libvirt_private.syms: Export virProcessInfoSetAffinity
and virProcessInfoGetAffinity to internal drivers
0.7.3 was broken
* configure.in docs/news.html.in: release of 0.7.4
* configure.in libvirt.spec.in: require netcf >= 0.1.4
* src/Makefile.am: node_device/node_device_udev.h was missing from
NODE_DEVICE_DRIVER_UDEV_SOURCES breaking compilation on platforms with
udev
Recent qemu releases require command option '-enable-qemu' in order
for the kvm functionality be activated. Libvirt needs to pass this flag
to qemu when starting a domain. Note that without the option,
even if both the kernel and qemu support KVM, KVM will not be activated
and VMs will be very slow.
* src/qemu/qemu_conf.h src/qemu/qemu_conf.c: parse the extra command
line option from help and add it when running kvm
* tests/qemuhelptest.c: this modified the flags output for qemu-0.10.5
and qemu-kvm-0.11.0-rc2 regression tests
Erroneously included the sysfs_path and parent_sysfs_path elements in
the node device xml, they were not supposed to show up there
* src/conf/node_device_conf.c: remove the output of the 2 fields
I realized that I inadvertently added a member to the def struct to
contain each device's sysfs path when there was an existing member in the
dev struct for "OS specific path to device metadat, eg sysfs" Since the
udev backend needs to record the sysfs path while it's in the process of
creating the device, before the dev struct gets allocated, I chose to
remove the member from the dev struct.
* src/conf/node_device_conf.c src/conf/node_device_conf.h
src/node_device/node_device_driver.c src/node_device/node_device_hal.c
src/node_device/node_device_udev.c: remove devicePath from the
structure and use def->sysfs_path instead
The qemudStartVMDaemon() and several functions it calls use
the QEMU monitor. The QEMU driver is locked while this function
is executing, so it is rquired to release the driver lock and
reacquire it either side of issuing a monitor command. It
failed todo so, leading to deadlock
* qemu/qemu_driver.c: Release driver when in qemudStartVMDaemon
and things it calls
VMware uses two MAC address prefixes: 00:0c:29 and 00:50:56. The 00:0c:29
prefix is used for ESX server generated addresses. The 00:50:56 prefix is
split into two parts. MAC addresses above 00:50:56:3f:ff:ff are generated
by a vCenter. The rest of the 00:50:56 prefix can be assigned manually.
Any MAC address within the 00:0c:29 and 00:50:56 prefix can be specified
in a domain XML config and the driver will handle the details internally.
* src/esx/esx_vmx.c: fix MAC address formatting
* tests/xml2vmxdata/*: update test files accordingly
* docs/drivers.html.in: list the ESX driver
* docs/drvesx.html.in: the new ESX driver documentation
* docs/hvsupport.html.in: add the ESX driver to the matrix
* docs/index.html.in, docs/sitemap.html.in: list the ESX driver
* src/esx/esx_driver.c: fix and cleanup some comments
* src/xen/xen_hypervisor.c: xen-unstable changeset 19788 removed
MAX_VIRT_CPUS from public headers, breaking compilation of libvirt
on -unstable. Its semanitc was retained with XEN_LEGACY_MAX_VCPUS.
Ensure MAX_VIRT_CPUS is defined accordingly.
The QEMU monitor open method would not take a reference on
the virDomainObjPtr until it had successfully opened the
monitor. The cleanup code upon failure to open though would
call qemuMonitorClose() which would in turn decrement the
reference count. This caused the virDoaminObjPtr to be mistakenly
freed and then the whole driver crashes
* src/qemu/qemu_monitor.c: Fix reference counting in
qemuMonitorOpen
The HAL driver returns a fatal error code in the case where HAL
is not running. This causes the entire libvirtd daemon to quit
which isn't desirable. Instead it should simply disable the HAL
driver
* src/node_device/node_device_hal.c: Quietly disable HAL if it is
not running
Fixes https://launchpad.net/bugs/453335
* src/security/virt-aa-helper.c: suppress confusing and misleading
apparmor denied message when kvm/qemu tries to open a libvirt specified
readonly file (such as a cdrom) with write permissions. libvirt uses
the readonly attribute for the security driver only, and has no way
of telling kvm/qemu that the device should be opened readonly
Fixes https://launchpad.net/bugs/460271
* src/security/virt-aa-helper.c: require absolute path for dynamic added
files. This is required by AppArmor and conveniently prevents adding
tcp consoles to the profile
The wrong variable was being passed in with the LXC event callback
resulting in a later deadlock or crash
* src/lxc/lxc_driver.c: Pass 'vm' instead of 'driver' to event
callback
In the scenario where the cgroups were mounted but the
particular group did not exist, and the caller had not
requested auto-creation, the code would fail to return
an error condition. This caused the lxc_controller to
think the cgroup existed, and it then later failed when
attempting to use it
* src/util/cgroup.c: Raise an error if the cgroup path does not
exist
There is a race condition in HAL driver startup where the callback
can get triggered before we have finished startup. This then causes
a deadlock in the driver.
* src/node_device/node_device_hal.c: RElease driver lock before
registering DBus callbacks
If the virDomainDefPtr object has an 'id' of -1, then forcably
set the VIR_DOMAIN_XML_INACTIVE flag to ensure generated XML
does not include any cruft from the previously running guest
such as console PTY path, or VNC port.
* src/conf/domain_conf.c: Set VIR_DOMAIN_XML_INACTIVE if
def->id is -1. Replace checks for def->id == -1 with
check against flags & VIR_DOMAIN_XML_INACTIVE.
The capng_lock() call sets the SECURE_NO_SETUID_FIXUP and SECURE_NOROOT
bits on the process. This prevents the kernel granting capabilities to
processes with an effective UID of 0, or with setuid programs. This is
not actually what we want in the container init process. It should be
allowed to run setuid processes & keep capabilities when root. All that
is required is masking a handful of dangerous capabilities from the
bounding set.
* src/lxc/lxc_container.c: Remove bogus capng_lock() call.
* src/security/virt-aa-helper.c: get_definition() now calls the new
caps_mockup() function which will parse the XML for os.type,
os.type.arch and then sets the wordsize. These attributes are needed
only to get a valid virCapsPtr for virDomainDefParseString(). The -H
and -b options are now removed from virt-aa-helper (they weren't used
yet anyway).
* tests/virt-aa-helper-test: extend and fixes tests, chmod'ed 755
uses libpciaccess to provide human readable names for PCI vendor and
device IDs
* configure.in: add a requirement for libpciaccess >= 0.10.0
* src/Makefile.am: add the associated compilation flags and link
* src/node_device/node_device_udev.c: lookup the libpciaccess for
vendor name and product name based on their ids
* configure.in src/Makefile.am: remove the configuration check and
build instructions
* src/node_device/node_device_devkit.c: removed the module
* src/node_device/node_device_driver.c src/node_device/node_device_driver.h:
removed references to the old backend
* src/conf/node_device_conf.h src/conf/node_device_conf.c: add specific
support for SCSI target in node device capabilities
* src/node_device/node_device_udev.c: add some extra detection code
when handling udev output
* configure.in: add new --with-udev, disabled by default, and requiring
libudev > 145
* src/node_device/node_device_udev.c src/node_device/node_device_udev.h:
the new node device backend
* src/node_device/node_device_linux_sysfs.c: moved node_device_hal_linux.c
to a better file name
* src/conf/node_device_conf.c src/conf/node_device_conf.h: add a couple
of fields in node device definitions, and an API to look them up,
remove a couple of unused fields from previous patch.
* src/node_device/node_device_driver.c src/node_device/node_device_driver.h:
plug the new driver
* po/POTFILES.in src/Makefile.am src/libvirt_private.syms: add the new
files and symbols
* src/util/util.h src/util/util.c: add a new convenience macro
virBuildPath and virBuildPathInternal() function
There is currently no way to determine the libvirt version of a remote
libvirtd we are connected to. This is a useful piece of data to enable
feature detection.
* src/xen/xen_driver.c: Add support for VIR_MIGRATE_PERSIST_DEST flag
* src/xen/xend_internal.c: Add support for VIR_MIGRATE_UNDEFINE_SOURCE flag
* include/libvirt/virterror.h, src/util/virterror.c: Add new errorcode
VIR_ERR_MIGRATE_PERSIST_FAILED
* src/conf/domain_conf.h src/conf/domain_conf.c: add the new entry in
the enum and lists of virDomainDiskBus
* src/qemu/qemu_conf.c: same for virDomainDiskQEMUBus
The xenstore database sometimes has stale domain IDs which are not
present in the hypervisor anymore. Filter these out to avoid causing
confusion
* src/xen/xs_internal.c: Filter domain IDs against HV's list
* src/xen/xen_hypervisor.h, src/xen/xen_hypervisor.c: Add new
xenHypervisorHasDomain() method for checking ID validity
The xenUnifiedNumOfDomains and xenUnifiedListDomains methods work
together as a pair, so it is critical they both apply the same
logic. With the current mis-matched logic it is possible to sometimes
get into a state when you miss certain active guests.
* src/xen/xen_driver.c: Change xenUnifiedNumOfDomains ordering to
match xenUnifiedListDomains.
When running qemu:///system instance, libvirtd runs as root,
but QEMU may optionally be configured to run non-root. When
then saving a guest to a state file, the file is initially
created as root, and thus QEMU cannot write to it. It is also
missing labelling required to allow access via SELinux.
* src/qemu/qemu_driver.c: Set ownership on save image before
running migrate command in virDomainSave impl. Call out to
security driver to set save image labelling
* src/security/security_driver.h: Add driver APIs for setting
and restoring saved state file labelling
* src/security/security_selinux.c: Implement saved state file
labelling for SELinux
Introduce a number of new APIs to expose some boolean properties
of objects, which cannot otherwise reliably determined, nor are
aspects of the XML configuration.
* virDomainIsActive: Checking virDomainGetID is not reliable
since it is not possible to distinguish between error condition
and inactive domain for ID of -1.
* virDomainIsPersistent: Check whether a persistent config exists
for the domain
* virNetworkIsActive: Check whether the network is active
* virNetworkIsPersistent: Check whether a persistent config exists
for the network
* virStoragePoolIsActive: Check whether the storage pool is active
* virStoragePoolIsPersistent: Check whether a persistent config exists
for the storage pool
* virInterfaceIsActive: Check whether the host interface is active
* virConnectIsSecure: whether the communication channel to the
hypervisor is secure
* virConnectIsEncrypted: whether any network based commnunication
channels are encrypted
NB, a channel can be secure, even if not encrypted, eg if it does
not involve the network, like a UNIX socket, or pipe.
* include/libvirt/libvirt.h.in: Define public API
* src/driver.h: Define internal driver API
* src/libvirt.c: Implement public API entry point
* src/libvirt_public.syms: Export API symbols
* src/esx/esx_driver.c, src/lxc/lxc_driver.c,
src/interface/netcf_driver.c, src/network/bridge_driver.c,
src/opennebula/one_driver.c, src/openvz/openvz_driver.c,
src/phyp/phyp_driver.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c,
src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c: Stub out driver tables
* src/libvirt.c src/lxc/lxc_conf.c src/lxc/lxc_container.c
src/lxc/lxc_controller.c src/node_device/node_device_hal.c
src/openvz/openvz_conf.c src/qemu/qemu_driver.c
src/qemu/qemu_monitor_text.c src/remote/remote_driver.c
src/storage/storage_backend_disk.c src/storage/storage_driver.c
src/util/logging.c src/xen/sexpr.c src/xen/xend_internal.c
src/xen/xm_internal.c: Steve Grubb <sgrubb@redhat.com> sent a code
review and those are the fixes correcting the problems
Some monitor commands may take a very long time to complete. It is
not desirable to block other incoming API calls forever. With this
change, if an existing API call is holding the job lock, additional
API calls will not wait forever. They will time out after a short
period of time, allowing application to retry later.
* include/libvirt/virterror.h, src/util/virterror.c: Add new
VIR_ERR_OPERATION_TIMEOUT error code
* src/qemu/qemu_driver.c: Change to a timed condition variable
wait for acquiring the monitor job lock
QEMU monitor commands may sleep for a prolonged period of time.
If the virDomainObjPtr or qemu driver lock is held this will
needlessly block execution of many other API calls. it also
prevents asynchronous monitor events from being dispatched
while a monitor command is executing, because deadlock will
ensure.
To resolve this, it is neccessary to release all locks while
executing a monitor command. This change introduces a flag
indicating that a monitor job is active, and a condition
variable to synchronize access to this flag. This ensures that
only a single thread can be making a state change or executing
a monitor command at a time, while still allowing other API
calls to be completed without blocking
* src/qemu/qemu_driver.c: Release driver and domain lock when
running monitor commands. Re-add locking to disk passphrase
callback
* src/qemu/THREADS.txt: Document threading rules
Change the QEMU monitor file handle watch to poll for both
read & write events, as well as EOF. All I/O to/from the
QEMU monitor FD is now done in the event callback thread.
When the QEMU driver needs to send a command, it puts the
data to be sent into a qemuMonitorMessagePtr object instance,
queues it for dispatch, and then goes to sleep on a condition
variable. The event thread sends all the data, and then waits
for the reply to arrive, putting the response / error data
back into the qemuMonitorMessagePtr and notifying the condition
variable.
There is a temporary hack in the disk passphrase callback to
avoid acquiring the domain lock. This avoids a deadlock in
the command processing, since the domain lock is still held
when running monitor commands. The next commit will remove
the locking when running commands & thus allow re-introduction
of locking the disk passphrase callback
* src/qemu/qemu_driver.c: Temporarily don't acquire lock in
disk passphrase callback. To be reverted in next commit
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Remove
raw I/O functions, and a generic qemuMonitorSend() for
invoking a command
* src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Remove all low level I/O, and use the new qemuMonitorSend()
API. Provide a qemuMonitorTextIOProcess() method for detecting
command/reply/prompt boundaries in the monitor data stream
Use ssh keyfiles from the current user's home directory instead of trying
to use keyfiles from a hardcoded /home/user directory. Fallback to
username/password authentication if keyfiles are not available or keyfile
authentication failed.
Add reference counting on the virDomainObjPtr objects. With the
forthcoming asynchronous QEMU monitor, it will be neccessary to
release the lock on virDomainObjPtr while waiting for a monitor
command response. It is neccessary to ensure one thread can't
delete a virDomainObjPtr while another is waiting. By introducing
reference counting threads can make sure objects they are using
are not accidentally deleted while unlocked.
* src/conf/domain_conf.h, src/conf/domain_conf.c: Add
virDomainObjRef/Unref APIs, remove virDomainObjFree
* src/openvz/openvz_conf.c: replace call to virDomainObjFree
with virDomainObjUnref
In preparation of the monitor I/O process becoming fully asynchronous,
it is neccessary to ensure all access to internals of the qemuMonitorPtr
object is protected by a mutex lock.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add mutex for locking
monitor.
* src/qemu/qemu_driver.c: Add locking around all monitor commands
Change the QEMU driver to not directly invoke the text mode monitor
APIs. Instead add a generic wrapper layer, which will eventually
invoke either the text or JSON protocol code as needed. Pass an
qemuMonitorPtr object into the monitor APIs instead of virDomainObjPtr
to complete the de-coupling of the monitor impl from virDomainObj
data structures
* src/qemu/qemu_conf.h: Remove qemuDomainObjPrivate definition
* src/qemu/qemu_driver.c: Add qemuDomainObjPrivate definition.
Pass qemuMonitorPtr into all monitor APIs instead of the
virDomainObjPtr instance.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add thin
wrappers for all qemuMonitorXXX command APIs, calling into
qemu_monitor_text.c/h
* src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Rename qemuMonitor -> qemuMonitorText & update to accept
qemuMonitorPtr instead of virDomainObjPtr
Decouple the monitor code from the virDomainDefPtr structure
by moving the disk encryption lookup code back into the
qemu_driver.c file. Instead provide a function callback to
the monitor code which can be invoked to retrieve encryption
data as required.
* src/qemu/qemu_driver.c: Add findDomainDiskEncryption,
and findVolumeQcowPassphrase. Pass address of the method
findVolumeQcowPassphrase into qemuMonitorOpen()
* src/qemu/qemu_monitor.c: Associate a disk
encryption function callback with the qemuMonitorPtr
object.
* src/qemu/qemu_monitor_text.c: Remove findDomainDiskEncryption
and findVolumeQcowPassphrase.
Introduce a new qemuDomainObjPrivate object which is used to store
the private QEMU specific data associated with each virDomainObjPtr
instance. This contains a single member, an instance of the new
qemuMonitorPtr object which encapsulates the QEMU monitor state.
The internals of the latter are private to the qemu_monitor* files,
not to be shown to qemu_driver.c
* src/qemu/qemu_conf.h: Definition of qemuDomainObjPrivate.
* src/qemu/qemu_driver.c: Register a functions for creating
and freeing qemuDomainObjPrivate instances with the domain
capabilities. Remove the qemudDispatchVMEvent() watch since
I/O watches are now handled by the monitor code itself. Pass
a new qemuHandleMonitorEOF() callback into qemuMonitorOpen
to allow notification when the monitor quits.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Introduce
the 'qemuMonitor' object. Temporarily add new APIs
qemuMonitorWrite, qemuMonitorRead, qemuMonitorWaitForInput
to allow text based monitor impl to perform I/O.
* src/qemu/qemu_monitor_text.c: Call APIs for reading/writing
to monitor instead of accessing the file handle directly.
The qemu_driver.c code should not contain any code that interacts
with the QEMU monitor at a low level. A previous commit moved all
the command invocations out. This change moves out the code which
actually opens the monitor device.
* src/qemu/qemu_driver.c: Remove qemudOpenMonitor & methods called
from it.
* src/Makefile.am: Add qemu_monitor.{c,h}
* src/qemu/qemu_monitor.h: Add qemuMonitorOpen()
* src/qemu/qemu_monitor.c: All code for opening the monitor
* src/util/threads-pthread.c: pthreads APIs do not set errno, instead
the return value is the positive errno. Set errno based on the return
value in the wrappers
* src/util/pci.c, src/util/pci.h: Make the pciDeviceList struct
opaque to callers of the API. Add accessor methods for managing
devices in the list
* src/qemu/qemu_driver.c: Update to use APIs instead of directly
accessing pciDeviceList fields
As it was basically unimplemented and more confusing than useful
at the moment.
* src/libvirt_private.syms: remove from internal symbols list
* src/qemu/qemu_bridge_filter.c src/util/ebtables.c: remove code and
one use of the unimplemented function
In case of an error the domains hash and the driver mutex may leak.
* src/opennebula/one_driver.c: free/destroy domains hash and driver
mutex in error cases
- Make reading ID from file working for IDs > 127
- Fix inverse error check for writing ID to file
- Use feof() to distinguish EOF from real error of fread()
- Don't interpret libssh2 error codes as number of bytes
phypNumDomainsGeneric() and phypListDomainsGeneric() return 0 in case
of an error. This makes it impossible to distinguish between an actual
error and no domains being defined on the hypervisor. It also turn the
no domains situation into an error. Return -1 in case of an error to
fix this problem.
* src/network/bridge_driver.c: when exec'ing dnsmaq, if there are
DHCP ranges defined, then compute and pass the --dhcp-lease-max
deriving the maximum number of leases
* src/conf/network_conf.h: extend the structure to store the range
* src/conf/network_conf.c: before adding a range parse the IP addresses
do some checking and keep the size
* src/internal.h (ATTRIBUTE_SENTINEL): New, it's a ggc feature and
protected as such
* src/util/buf.c (virBufferStrcat): Use it.
* src/util/ebtables.c (ebtablesAddRemoveRule): Use it.
* src/util/iptables.c (iptableAddRemoveRule: Use it.
* src/util/qparams.h (new_qparam_set, append_qparams): Use it.
* docs/apibuild.py: avoid breaking the API generator with that new
internal keyword macro
* include/libvirt/virterror.h src/util/virterror.c: add a new error
VIR_ERR_CONFIG_UNSUPPORTED for valid but unsupported configuration options
* src/conf/domain_conf.c: Throw an error if guestfwd address isn't IPv4
and cleanup a number of parsing return error values.
allows the following to be specified in a domain:
<channel type='pipe'>
<source path='/tmp/guestfwd'/>
<target type='guestfwd' address='10.0.2.1' port='4600'/>
</channel>
* proxy/Makefile.am: add network.c as dep of domain_conf.c
* docs/schemas/domain.rng src/conf/domain_conf.[ch]: extend the domain
schemas and the parsing/serialization side for the new construct
QEmu support will add the following on the qemu command line:
-chardev pipe,id=channel0,path=/tmp/guestfwd
-net user,guestfwd=tcp:10.0.2.1:4600-chardev:channel0
* src/qemu/qemu_conf.c: Add argument output for channel
* tests/qemuxml2(argv|xml)test.c: Add test for <channel> domain syntax
A character device's target (it's interface in the guest) had only a
single property: port. This patch is in preparation for adding targets
which require other properties.
Since this changes the conf type for character devices this affects
a number of drivers:
* src/conf/domain_conf.[ch] src/esx/esx_vmx.c src/qemu/qemu_conf.c
src/qemu/qemu_driver.c src/uml/uml_conf.c src/uml/uml_driver.c
src/vbox/vbox_tmpl.c src/xen/xend_internal.c src/xen/xm_internal.c:
target properties are moved into a union in virDomainChrDef, and a
targetType field is added to identify which union member should be
used. All current code which touches a virDomainChrDef is updated both
to use the new union field, and to populate targetType if necessary.
Current implementation of lxc driver creates vethN named
interface(s) in the host and passes as it is to a container.
The reason why it doesn't use ethN is due to the limitation
that one namespace cannot have multiple iterfaces that have
an identical name so that we give up creating ethN named
interface in the host for the container.
However, we should be able to allow the container to have
ethN by changing the name after clone(CLONE_NEWNET).
* src/lxc/lxc_container.c src/lxc/veth.c src/lxc/veth.h: do the clone
and then renames interfaces eth0 ... ethN to keep the interface names
familiar in the domain