The current virDomainAttachDevice API can be (ab)used to change
the media of an existing CDROM/Floppy device. Going forward there
will be more devices that can be configured on the fly and overloading
virDomainAttachDevice for this is not too pleasant. This patch adds
a new virDomainUpdateDeviceFlags() explicitly just for modifying
existing devices.
* include/libvirt/libvirt.h.in: Add virDomainUpdateDeviceFlags
* src/driver.h: Internal API for virDomainUpdateDeviceFlags
* src/libvirt.c, src/libvirt_public.syms: Glue public API to
driver API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c, src/uml/uml_driver.c,
src/vbox/vbox_tmpl.c, src/xen/xen_driver.c, src/xenapi/xenapi_driver.c: Add
stubs for new driver entry point
This introduces a new event type
VIR_DOMAIN_EVENT_ID_GRAPHICS
The same event can be emitted in 3 scenarios
typedef enum {
VIR_DOMAIN_EVENT_GRAPHICS_CONNECT = 0,
VIR_DOMAIN_EVENT_GRAPHICS_INITIALIZE,
VIR_DOMAIN_EVENT_GRAPHICS_DISCONNECT,
} virDomainEventGraphicsPhase;
Connect/disconnect are triggered at socket accept/close.
The initialize phase is immediately after the protocol
setup and authentication has completed. ie when the
client is authorized and about to start interacting with
the graphical desktop
This event comes with *a lot* of potential information
- IP address, port & address family of client
- IP address, port & address family of server
- Authentication scheme (arbitrary string)
- Authenticated subject identity. A subject may have
multiple identities with some authentication schemes.
For example, vencrypt+sasl results in a x509dname
and saslUsername identities.
This results in a very complicated callback :-(
typedef enum {
VIR_DOMAIN_EVENT_GRAPHICS_ADDRESS_IPV4,
VIR_DOMAIN_EVENT_GRAPHICS_ADDRESS_IPV6,
} virDomainEventGraphicsAddressType;
struct _virDomainEventGraphicsAddress {
int family;
const char *node;
const char *service;
};
typedef struct _virDomainEventGraphicsAddress virDomainEventGraphicsAddress;
typedef virDomainEventGraphicsAddress *virDomainEventGraphicsAddressPtr;
struct _virDomainEventGraphicsSubject {
int nidentity;
struct {
const char *type;
const char *name;
} *identities;
};
typedef struct _virDomainEventGraphicsSubject virDomainEventGraphicsSubject;
typedef virDomainEventGraphicsSubject *virDomainEventGraphicsSubjectPtr;
typedef void (*virConnectDomainEventGraphicsCallback)(virConnectPtr conn,
virDomainPtr dom,
int phase,
virDomainEventGraphicsAddressPtr local,
virDomainEventGraphicsAddressPtr remote,
const char *authScheme,
virDomainEventGraphicsSubjectPtr subject,
void *opaque);
The wire protocol is similarly complex
struct remote_domain_event_graphics_address {
int family;
remote_nonnull_string node;
remote_nonnull_string service;
};
const REMOTE_DOMAIN_EVENT_GRAPHICS_IDENTITY_MAX = 20;
struct remote_domain_event_graphics_identity {
remote_nonnull_string type;
remote_nonnull_string name;
};
struct remote_domain_event_graphics_msg {
remote_nonnull_domain dom;
int phase;
remote_domain_event_graphics_address local;
remote_domain_event_graphics_address remote;
remote_nonnull_string authScheme;
remote_domain_event_graphics_identity subject<REMOTE_DOMAIN_EVENT_GRAPHICS_IDENTITY_MAX>;
};
This is currently implemented in QEMU for the VNC graphics
protocol, but designed to be usable with SPICE graphics in
the future too.
* daemon/remote.c: Dispatch graphics events to client
* examples/domain-events/events-c/event-test.c: Watch for
graphics events
* include/libvirt/libvirt.h.in: Define new graphics event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle graphics events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for VNC events and emit a libvirt graphics event
* src/remote/remote_driver.c: Receive and dispatch graphics
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
graphics events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for VNC_CONNECTED,
VNC_INITIALIZED & VNC_DISCONNETED events from QEMU monitor
This introduces a new event type
VIR_DOMAIN_EVENT_ID_IO_ERROR
This event includes the action that is about to be taken
as a result of the watchdog triggering
typedef enum {
VIR_DOMAIN_EVENT_IO_ERROR_NONE = 0,
VIR_DOMAIN_EVENT_IO_ERROR_PAUSE,
VIR_DOMAIN_EVENT_IO_ERROR_REPORT,
} virDomainEventIOErrorAction;
In addition it has the source path of the disk that had the
error and its unique device alias. It does not include the
target device name (/dev/sda), since this would preclude
triggering IO errors from other file backed devices (eg
serial ports connected to a file)
Thus there is a new callback definition for this event type
typedef void (*virConnectDomainEventIOErrorCallback)(virConnectPtr conn,
virDomainPtr dom,
const char *srcPath,
const char *devAlias,
int action,
void *opaque);
This is currently wired up to the QEMU block IO error events
* daemon/remote.c: Dispatch IO error events to client
* examples/domain-events/events-c/event-test.c: Watch for
IO error events
* include/libvirt/libvirt.h.in: Define new IO error event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle IO error events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for block IO errors and emit a libvirt IO error event
* src/remote/remote_driver.c: Receive and dispatch IO error
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
IO error events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for BLOCK_IO_ERROR event
from QEMU monitor
This introduces a new event type
VIR_DOMAIN_EVENT_ID_WATCHDOG
This event includes the action that is about to be taken
as a result of the watchdog triggering
typedef enum {
VIR_DOMAIN_EVENT_WATCHDOG_NONE = 0,
VIR_DOMAIN_EVENT_WATCHDOG_PAUSE,
VIR_DOMAIN_EVENT_WATCHDOG_RESET,
VIR_DOMAIN_EVENT_WATCHDOG_POWEROFF,
VIR_DOMAIN_EVENT_WATCHDOG_SHUTDOWN,
VIR_DOMAIN_EVENT_WATCHDOG_DEBUG,
} virDomainEventWatchdogAction;
Thus there is a new callback definition for this event type
typedef void (*virConnectDomainEventWatchdogCallback)(virConnectPtr conn,
virDomainPtr dom,
int action,
void *opaque);
* daemon/remote.c: Dispatch watchdog events to client
* examples/domain-events/events-c/event-test.c: Watch for
watchdog events
* include/libvirt/libvirt.h.in: Define new watchdg event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle watchdog events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for watchdogs and emit a libvirt watchdog event
* src/remote/remote_driver.c: Receive and dispatch watchdog
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
watchdog events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for WATCHDOG event
from QEMU monitor
This introduces a new event type
VIR_DOMAIN_EVENT_ID_RTC_CHANGE
This event includes the new UTC offset measured in seconds.
Thus there is a new callback definition for this event type
typedef void (*virConnectDomainEventRTCChangeCallback)(virConnectPtr conn,
virDomainPtr dom,
long long utcoffset,
void *opaque);
If the guest XML configuration for the <clock> is set to
offset='variable', then the XML will automatically be
updated with the new UTC offset value. This ensures that
during migration/save/restore the new offset is preserved.
* daemon/remote.c: Dispatch RTC change events to client
* examples/domain-events/events-c/event-test.c: Watch for
RTC change events
* include/libvirt/libvirt.h.in: Define new RTC change event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle RTC change events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for RTC changes and emit a libvirt RTC change event
* src/remote/remote_driver.c: Receive and dispatch RTC change
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
RTC change events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for RTC_CHANGE event
from QEMU monitor
The reboot event is not a normal lifecycle event, since the
virtual machine on the host does not change state. Rather the
guest OS is resetting the virtual CPUs. ie, the QEMU process
does not restart. Thus, this does not belong in the current
lifecycle events callback.
This introduces a new event type
VIR_DOMAIN_EVENT_ID_REBOOT
It takes no parameters, besides the virDomainPtr, so it can
use the generic callback signature.
* daemon/remote.c: Dispatch reboot events to client
* examples/domain-events/events-c/event-test.c: Watch for
reboot events
* include/libvirt/libvirt.h.in: Define new reboot event ID
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle reboot events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for reboots and emit a libvirt reboot event
* src/remote/remote_driver.c: Receive and dispatch reboot
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
reboot events
The libvirtd daemon impl will need to switch over to using the
new event APIs. To make this simpler, ensure all drivers currently
providing events support both the new APIs and old APIs.
* src/lxc/lxc_driver.c, src/qemu/qemu_driver.c, src/test/test_driver.c,
src/vbox/vbox_tmpl.c, src/xen/xen_driver.c: Implement the new
virConnectDomainEvent(Dereg|Reg)isterAny driver entry points
The internal domain events APIs are designed to handle the lifecycle
events. This needs to be refactored to allow arbitrary new event
types to be handled.
* The signature of virDomainEventDispatchFunc changes to use
virConnectDomainEventGenericCallback instead of the lifecycle
event specific virConnectDomainEventCallback
* Every registered callback gains a unique ID to allow its
removal based on ID, instead of function pointer
* Every registered callback gains an 'eventID' to allow callbacks
for different types of events to be distinguished
* virDomainEventDispatch is adapted to filter out callbacks
whose eventID does not match the eventID of the event being
dispatched
* virDomainEventDispatch is adapted to filter based on the
domain name and uuid, if this filter is set for a callback.
* virDomainEvent type/detail fields are moved into a union to
allow different data fields for other types of events to be
added later
* src/conf/domain_event.h, src/conf/domain_event.c: Refactor
to allow handling of different types of events
* src/lxc/lxc_driver.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c,
src/xen/xen_driver.c: Change dispatch function signature
to use virConnectDomainEventGenericCallback
The current API for domain events has a number of problems
- Only allows for domain lifecycle change events
- Does not allow the same callback to be registered multiple times
- Does not allow filtering of events to a specific domain
This introduces a new more general purpose domain events API
typedef enum {
VIR_DOMAIN_EVENT_ID_LIFECYCLE = 0, /* virConnectDomainEventCallback */
...more events later..
}
int virConnectDomainEventRegisterAny(virConnectPtr conn,
virDomainPtr dom, /* Optional, to filter */
int eventID,
virConnectDomainEventGenericCallback cb,
void *opaque,
virFreeCallback freecb);
int virConnectDomainEventDeregisterAny(virConnectPtr conn,
int callbackID);
Since different event types can received different data in the callback,
the API is defined with a generic callback. Specific events will each
have a custom signature for their callback. Thus when registering an
event it is neccessary to cast the callback to the generic signature
eg
int myDomainEventCallback(virConnectPtr conn,
virDomainPtr dom,
int event,
int detail,
void *opaque)
{
...
}
virConnectDomainEventRegisterAny(conn, NULL,
VIR_DOMAIN_EVENT_ID_LIFECYCLE,
VIR_DOMAIN_EVENT_CALLBACK(myDomainEventCallback)
NULL, NULL);
The VIR_DOMAIN_EVENT_CALLBACK() macro simply does a "bad" cast
to the generic signature
* include/libvirt/libvirt.h.in: Define new APIs for registering
domain events
* src/driver.h: Internal driver entry points for new events APIs
* src/libvirt.c: Wire up public API to driver API for events APIs
* src/libvirt_public.syms: Export new APIs
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/test/test_driver.c, src/uml/uml_driver.c,
src/vbox/vbox_tmpl.c, src/xen/xen_driver.c,
src/xenapi/xenapi_driver.c: Stub out new API entries
"virsh dominfo <vm>" crashes if there's no primary security driver set
since we only intialize the secmodel.model and secmodel.doi if we have
one. Attached patch checks for securityPrimaryDriver instead of
securityDriver since the later is always set in qemudSecurityInit().
Closes: http://bugs.debian.org/574359
Attempt to turn on vhost-net mode for devices of type NETWORK, BRIDGE,
and DIRECT (macvtap).
* src/qemu/qemu_conf.h: add vhostfd to qemuBuildHostNetStr prototype
add qemudOpenVhostNet prototype new flag to set when :,vhost=" found in
qemu help
* src/qemu/qemu_conf.c: * set QEMUD_CMD_FLAG_VNET_HOST is ",vhost=" found
in qemu help
- qemudOpenVhostNet - opens /dev/vhost-net to pass to qemu if everything
is in place to use it.
- qemuBuildHostNetStr - add vhostfd to commandline if it's not empty
(higher levels decide whether or not to fill it in)
- qemudBuildCommandLine - if /dev/vhost-net is successfully opened, add
its fd to tapfds array so it isn't closed on qemu exec, and populate
vhostfd_name to be passed in to commandline builder.
* src/qemu/qemu_driver.c: add filler 0 for new arg to qemuBuildHostNetStr,
along with a note that this must be implemented in order for hot-plug of
vhost-net virtio devices to work properly (once qemu "netdev_add" monitor
command is implemented).
Currently no command can be sent to a qemu process while another job is
active. This patch adds support for signaling long-running jobs (such as
migration) so that other threads may request predefined operations to be
done during such jobs. Two signals are defined so far:
- QEMU_JOB_SIGNAL_CANCEL
- QEMU_JOB_SIGNAL_SUSPEND
The first one is used by qemuDomainAbortJob.
The second one is used by qemudDomainSuspend for suspending a domain
during migration, which allows for changing live migration into offline
migration. However, there is a small issue in the way qemudDomainSuspend
is currently implemented for migrating domains. The API calls returns
immediately after signaling migration job which means it is asynchronous
in this specific case.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
* src/qemu/qemu_driver.c (qemudDomainAttachSCSIDisk): The ".controller"
member is an index, and *may* be 0. As such, the commit that we're
reverting broke SCSI disk hot-plug on controller 0.
Reported by Wolfgang Mauerer.
We need to call PrepareHostdevs to determine the USB device path before
any security calls. PrepareHostUSBDevices was also incorrectly skipping
all USB devices.
Currently if you dump the core of a qemu guest with
qemudDomainCoreDump, subsequent commands will hang
up libvirtd. This is because qemudDomainCoreDump
uses qemuDomainWaitForMigrationComplete, which expects
the qemuDriverLock to be held when it's called. This
patch does the simple thing and moves the qemuDriveUnlock
to the end of the qemudDomainCoreDump so that the driver
lock is held for the entirety of the call (as it is done
in qemudDomainSave). We will probably want to make the
lock more fine-grained than that in the future, but
we can fix both qemudDomainCoreDump and qemudDomainSave
at the same time.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The code to add job support into libvirtd caused a problem
in qemudDomainSetVcpus. In particular, a qemuDomainObjEndJob()
call was added at the end of the function, but a
corresponding qemuDomainObjBeginJob() was not. Additionally,
a call to qemuDomainObj{Enter,Exit}Monitor() was also missed
in qemudDomainHotplugVcpus(). These missing calls conspired to
cause a hang in the libvirtd process after the command was
finished. Fix this by adding the missing calls.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
As previously discussed[1], this patch removes the
qemudDomainSetMaxMemory() function, since it doesn't
work. This means that instead of getting somewhat
cryptic errors, you will now get:
error: Unable to change MaxMemorySize
error: this function is not supported by the hypervisor: virDomainSetMaxMemory
Which describes the situation perfectly.
[1] https://www.redhat.com/archives/libvir-list/2010-February/msg00928.html
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When adding domainMemoryStats API support for the qemu driver, I didn't
follow the locking rules exactly. The job condition must be held when
executing monitor commands. This corrects the segfaults I was seeing
when calling domainMemoryStats in a multi-threaded environment.
* src/qemu/qemu_driver.c: in qemudDomainMemoryStats() add missing
calls to qemuDomainObjBeginJob/qemuDomainObjEndJob
doTunnelSendAll function (used by QEMU migration) uses a 64k buffer on
the stack, which could be problematic. This patch replaces that with a
buffer from the heap.
While in the neighborhood, this patch also improves error reporting in
the case that saferead fails - previously, virStreamAbort() was called
(resetting errno) before reporting the error. It's been changed to
report the error first.
* src/qemu/qemu_driver.c: fix doTunnelSendAll() to use a malloc'ed
buffer
* src/qemu/qemu_driver.c (qemudDomainAttachSCSIDisk): Handle
the (theoretical) case of an empty controller list, so that
clang does not think the subsequent dereference of "cont"
would dereference an undefined variable (due to preceding
loop not iterating even once).
* src/qemu/qemu_driver.c (qemudDomainRestore): A corrupt save file
(in particular, a too-large header.xml_len value) would cause an
unwarranted out-of-memory error. Do not trust the just-read
header.xml_len. Instead, merely use that as a hint, and
read/allocate up to that number of bytes from the file.
Also verify that header.xml_len is positive; if it were negative,
passing it to virFileReadLimFD could cause trouble.
Changeset
commit 5073aa994a
Author: Cole Robinson <crobinso@redhat.com>
Date: Mon Jan 11 11:40:46 2010 -0500
Added support for product/vendor based passthrough, but it only
worked at the security driver layer. The main guest XML config
was not updated with the resolved bus/device ID. When the QEMU
argv refactoring removed use of product/vendor, this then broke
launching guests.
THe solution is to move the product/vendor resolution up a layer
into the QEMU driver. So the first thing QEMU does is resolve
the product/vendor to a bus/device and updates the XML config
with this info. The rest of the code, including security drivers
and QEMU argv generated can now rely on bus/device always being
set.
* src/util/hostusb.c, src/util/hostusb.h: Split vendor/product
resolution code out of usbGetDevice and into usbFindDevice.
Add accessors for bus/device ID
* src/security/virt-aa-helper.c, src/security/security_selinux.c,
src/qemu/qemu_security_dac.c: Remove vendor/product from the
usbGetDevice() calls
* src/qemu/qemu_driver.c: Use usbFindDevice to resolve vendor/product
into a bus/device ID
The pci_del command is not being ported to QMP. Convert all the
QEMU hotplug code over to use device_del whenever it is available
to avoid the pci_del problem
* src/qemu/qemu_driver.c: Convert unplug code to device_del
Previously hot-unplug could not be supported for USB devices
in QEMU, since usb_del required the guest visible address
which libvirt never knows. With 'device_del' command we can
now unplug based on device alias, so support that.
* src/qemu/qemu_driver.c: Use device_del to remove USB devices
QEMU has a monitor command 'set_cpu' which allows a specific
CPU to be toggled between online& offline state. libvirt CPU
hotplug does not work in terms of individual indexes CPUs.
Thus to support this, we iteratively toggle the online state
when the total number of vCPUs is adjusted via libvirt
NB, currently untested since QEMU segvs when running this!
* src/qemu/qemu_driver.c: Toggle online state for CPUs when
doing hotplug
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
monitor API for toggling a CPU's online status via 'set_cpu
The code to remove the cgroup after QEMU failed to startup could
be obscuring a real error from earlier on. It is not neccessary
to raise an error in this case, so tell cgroups to keep quiet
* src/qemu/qemu_driver.c: Don't raise cgroups error in QEMU start
cleanup code.
The QEMU hotunplug code for PCI devices was looking at host
devices in the guest config without first filtering non
PCI devices. This means it was reading garbage
* src/qemu/qemu_driver.c: Filter out non-PCI devices
Commit 3c12a67b76 added
a dependency on the NFS_SUPER_MAGIC macro, which is
defined in linux/magic.h. Unfortunately linux/magic.h
is not available in RHEL-5, and causes a compile error.
Just define it locally, since this is something that
can't change.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Move *all* file operations related to creation and writing of libvirt
header to the domain save file into a hook function that is called by
virFileOperation. First try to call virFileOperation as root. If that
fails with EACCESS, and (in the case of Linux) statfs says that we're
trying to save the file on an NFS share, rerun virFileOperation,
telling it to fork a child process and setuid to the qemu user. This
is the only way we can successfully create a file on a root-squashed
NFS server.
This patch (along with setting dynamic_ownership=0 in qemu.conf)
makes qemudDomainSave work on root-squashed NFS.
* src/qemu/qemu_driver.c: provide new qemudDomainSaveFileOpHook()
utility, use it in qemudDomainSave() if normal creation of the
file as root failed, and after checking the filesystem type for
the storage is NFS. In that case we also bypass the security
driver, as this would fail on NFS.
If qemudDomainRestore fails to open the domain save file, create a
pipe, then fork a process that does setuid(qemu_user) and opens the
file, then reads this file and stuffs it into the pipe. the parent
libvirtd process will use the other end of the pipe as its fd, then
reap the child process after it's done reading.
This makes domain restore work on a root-squash NFS share that is only
visible to the qemu user.
* src/qemu/qemu_driver.c: add new qemudOpenAsUID() helper function,
and use it in qemudDomainRestore() if reading as root directly failed.
The USB/PCI device hotplug code for the QEMU driver was forgetting
to allocate a unique device alias.
* src/qemu/qemu_driver.c: Fill in device alias for USB/PCI devices
When a VM save attempt failed, the VM would be left in a paused
state. It is neccessary to resume CPU execution upon failure
if it was running originally
* src/qemu/qemu_driver.c: Resume CPUs upon save failure
This supports cancellation of jobs for the QEMU driver against
the virDomainMigrate, virDomainSave and virDomainCoreDump APIs.
It is not yet supported for the virDomainRestore API, although
it is desirable.
* src/qemu/qemu_driver.c: Issue 'migrate_cancel' command if
virDomainAbortJob is issued during a migration operation
* tools/virsh.c: Add a domjobabort command
This provides the internal glue for the driver API
* src/driver.h: Internal API contract
* src/libvirt.c, src/libvirt_public.syms: Connect public API
to driver API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/test/test_driver.c src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c: Stub out entry points
Introduce support for virDomainGetJobInfo in the QEMU driver. This
allows for monitoring of any API that uses the 'info migrate' monitor
command. ie virDomainMigrate, virDomainSave and virDomainCoreDump
Unfortunately QEMU does not provide a way to monitor incoming migration
so we can't wire up virDomainRestore yet.
The virsh tool gets a new command 'domjobinfo' to query status
* src/qemu/qemu_driver.c: Record virDomainJobInfo and start time
in qemuDomainObjPrivatePtr objects. Add generic shared handler
for calling 'info migrate' with all migration based APIs.
* src/qemu/qemu_monitor_text.c: Fix parsing of 'info migration' reply
* tools/virsh.c: add new 'domjobinfo' command to query progress
The internal glue layer for the new pubic API
* src/driver.h: Define internal driver API contract
* src/libvirt.c, src/libvirt_public.syms: Wire up public
API to internal driver API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/test/test_driver.c, src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c: Stub new entry point
when the underlying qemu supports the drive/device model and the
controller has been added this way.
* src/qemu/qemu_driver.c: use qemuMonitorDelDevice() when detaching
PCI controller and if supported
* src/qemu/qemu_monitor.[ch]: add new qemuMonitorDelDevice() function
* src/qemu/qemu_monitor_json.[ch]: JSON backend for DelDevice command
* src/qemu/qemu_monitor_text.[ch]: Text backend for DelDevice command
* src/qemu/qemu_driver.c: in qemudDomainDetachPciControllerDevice()
when a controller is not present in the system anymore, the PCI
address must be deleted from libvirt's hashtable because it can
be re-used for other purposes.
* src/qemu/qemu_driver.c: in qemudDomainAttachDevice(), one must not
delete the data part when the operation succeeds because it is
required later on. The correct pattern to handlethe parsed
representation of the device information on success
is dev->data.controller = NULL; virDomainDeviceDefFree(dev);,
which leaves the structure pointed at by data in memory.
If the hostname as returned by "gethostname" resolves
to "localhost" (as it does with the broken Fedora-12
installer), then live migration will fail because the
source will try to migrate to itself. Detect this
situation up-front and abort the live migration before
we do any real work.
* src/util/util.h src/util/util.c: add a new virGetHostnameLocalhost
with an optional localhost check, and rewire virGetHostname() to use
it
* src/libvirt_private.syms: expose the new function
* src/qemu/qemu_driver.c: use it in qemudDomainMigratePrepare2()
This patch sets or unsets the IFF_VNET_HDR flag depending on what device
is used in the VM. The manipulation of the flag is done in the open
function and is only fatal if the IFF_VNET_HDR flag could not be cleared
although it has to be (or if an ioctl generally fails). In that case the
macvtap tap is closed again and the macvtap interface torn.
* src/qemu/qemu_conf.c src/qemu/qemu_conf.h: pass qemuCmdFlags to
qemudPhysIfaceConnect()
* src/util/macvtap.c src/util/macvtap.h: add vnet_hdr boolean to
openMacvtapTap(), and private function configMacvtapTap()
* src/qemu/qemu_driver.c: add extra qemuCmdFlags when calling
qemudPhysIfaceConnect()
Rework and simplification of teardown of the macvtap device.
Basically all devices with the same MAC address and link device are kept
alive and not attempted to be torn down. If a macvtap device linked to a
physical interface with a certain MAC address 'M' is to be created it
will automatically fail if the interface is 'up'ed and another macvtap
with the same properties (MAC addr 'M', link dev) happens to be 'up'.
This will prevent the VM from starting or the device from being attached
to a running VM. Stale interfaces are assumed to be there for some
reason and not stem from libvirt.
In the VM shutdown path, it's assuming that an interface name is always
available so that if the device type is DIRECT it can be torn down
using its name.
* src/util/macvtap.h src/libvirt_macvtap.syms: change of deleting routine
* src/util/macvtap.c: cleanups and change of deleting routine
* src/qemu/qemu_driver.c: change cleanup on shutdown
* src/qemu/qemu_conf.c: don't delete Macvtap in qemudPhysIfaceConnect()
Similar to the Set*Mem commands, this implementation was bogus and
misleading. Make it clear this is a hotplug only operation, and that the
hotplug piece isn't even implemented.
Also drop the overkill maxvcpus validation: we don't perform this check
at XML define time so clearly no one is missing it, and there is
always the risk that our info will be out of date, possibly preventing
legitimate CPU values.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
SetMem and SetMaxMem are hotplug only APIs, any persistent config
changes are supposed to go via XML definition. The original implementation
of these calls were incorrect and had the nasty side effect of making
a psuedo persistent change that would be lost after libvirtd restart
(I didn't know any better).
Fix these APIs to rightly reject non running domains.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
When in JSON mode, QEMU requires that 'qmp_capabilities' is run as
the first command in the monitor. This is a no-op when run in the
text mode monitor
* src/qemu/qemu_driver.c: Run capabilities negotiation when
connecting to the monitor
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Add
support for the 'qmp_capabilities' command, no-op in text mode.
This part adds support for qemu making a macvtap tap device available
via file descriptor passed to qemu command line. This also attempts to
tear down the macvtap device when a VM terminates. This includes support
for attachment and detachment to/from running VM.
* src/qemu/qemu_conf.[ch] src/qemu/qemu_driver.c: add support in the
QEmu driver
Current PCI addresses are allocated at time of VM startup.
To make them truely persistent, it is neccessary to do this
at time of virDomainDefine/virDomainCreate. The code in
qemuStartVMDaemon still remains in order to cope with upgrades
from older libvirt releases
* src/qemu/qemu_driver.c: Rename existing qemuAssignPCIAddresses
to qemuDetectPCIAddresses. Add new qemuAssignPCIAddresses which
does auto-allocation upfront. Call qemuAssignPCIAddresses from
qemuDomainDefine and qemuDomainCreate to assign PCI addresses that
can then be persisted. Don't clear PCI addresses at shutdown if
they are intended to be persistent
The old text mode monitor prompts for a password when disks are
encrypted. This interactive approach doesn't work for JSON mode
monitor. Thus there is a new 'block_passwd' command that can be
used.
* src/qemu/qemu_driver.c: Split out code for looking up a disk
secret from findVolumeQcowPassphrase, into a new method
getVolumeQcowPassphrase. Enhance qemuInitPasswords() to also
set the disk encryption password via the monitor
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
support for the 'block_passwd' monitor command.
Since c26cb9234f, the dname
parameter has been ignored by these two functions. Use it.
* src/qemu/qemu_driver.c (qemudDomainMigratePrepareTunnel): Honor dname
parameter once again.
(qemudDomainMigratePrepare2): Likewise.
Currently the timeout for reading startup output is 3 seconds. If the
host is under any sort of load, we can easily trigger this. Lets bump
it to 30 seconds.
Since the polling loop checks to see if the process has died, we shouldn't
erroneously hit this timeout if qemu bombs (only if it is stuck in some
infinite loop).
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in cpu_conf.{h,c} and update all callers to
match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in node_device_conf.{h,c} and update all callers to
match
The QEMU flags are commonly stored as a signed or unsigned int,
allowing only 31 flags. This limit is rather close, so to aid
future patches, change it to a 64-bit int
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h, src/qemu/qemu_driver.c,
tests/qemuargv2xmltest.c, tests/qemuhelptest.c, tests/qemuxml2argvtest.c:
Use 'unsigned long long' for QEMU flags
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in security_driver.{h,c} and update all callers to
match
The security driver was mistakenly initialized before the QEMU
config file was loaded. This prevents it being turned off again.
The capabilities XML was also getting the wrong security driver
name, due to the stacked driver arrangement.
* src/qemu/qemu_driver.c: Fix initialization order and capabilities
model name
If the primary security driver (SELinux/AppArmour) was disabled
then the secondary QEMU DAC security driver was also disabled.
This is mistaken, because the latter must be active at all times
* src/qemu/qemu_driver.c: Ensure DAC driver is always active
To allow devices to be hot(un-)plugged it is neccessary to ensure
they all have a unique device aliases. This fixes the hotplug
methods to assign device aliases before invoking the monitor
commands which need them
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Expose methods
for assigning device aliases for disks, host devices and
controllers
* src/qemu/qemu_driver.c: Assign device aliases when hotplugging
all types of device
* tests/qemuxml2argvdata/qemuxml2argv-hostdev-pci-address-device.args,
tests/qemuxml2argvdata/qemuxml2argv-hostdev-usb-address-device.args:
Update for changed hostdev naming scheme
This patch re-arranges the QEMU device alias assignment code to
make it easier to call into the same codeblock when performing
device hotplug. The new code has the ability to skip over already
assigned names to facilitate hotplug
* src/qemu/qemu_driver.c: Call qemuAssignDeviceNetAlias()
instead of qemuAssignNetNames
* src/qemu/qemu_conf.h: Export qemuAssignDeviceNetAlias()
instead of qemuAssignNetNames
* src/qemu/qemu_driver.c: Merge the legacy disk/network alias
assignment code into the main methods
The current way of assigning names to the host network backend and
NIC device in QEMU was over complicated, by varying naming scheme
based on the NIC model and backend type. This simplifies the naming
to simply be 'net0' and 'hostnet0', allowing code to easily determine
the host network name and vlan based off the primary device alias
name 'net0'. This in turn allows removal of alot of QEMU specific
code from the XML parser, and makes it easier to assign new unique
names for NICs that are hotplugged
* src/conf/domain_conf.c, src/conf/domain_conf.h: Remove hostnet_name
and vlan fields from virNetworkDefPtr
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h, src/qemu/qemu_driver.c:
Use a single network alias naming scheme regardless of NIC type
or backend type. Determine VLANs from the alias name.
* tests/qemuxml2argvdata/qemuxml2argv-net-eth-names.args,
tests/qemuxml2argvdata/qemuxml2argv-net-virtio-device.args,
tests/qemuxml2argvdata/qemuxml2argv-net-virtio-netdev.args: Update
for new simpler naming scheme
PCI disk, disk controllers, net devices and host devices need to
have PCI addresses assigned before they are hot-plugged
* src/qemu/qemu_conf.c: Add APIs for ensuring a device has an
address and releasing unused addresses
* src/qemu/qemu_driver.c: Ensure all devices have addresses
when hotplugging.
The current QEMU code allocates PCI addresses incrementally starting
at 4. This is not satisfactory because the user may have given some
addresses in their XML config, which need to be skipped over when
allocating addresses to remaining devices.
It is thus neccessary to maintain a list of already allocated PCI
addresses and then only allocate ones that remain unused. This is
also required for domain device hotplug to work properly later.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Add APIs for creating
list of existing PCI addresses, and allocating new addresses.
Refactor address assignment to use this code
* src/qemu/qemu_driver.c: Pull PCI address assignment up into the
qemuStartVMDaemon() method, as a prelude to moving it into the
'define' method. Update list of allocated addresses when connecting
to a running VM at daemon startup.
* tests/qemuxml2argvtest.c, tests/qemuargv2xmltest.c,
tests/qemuxml2xmltest.c: Remove USB product test since all
passthrough is done based on address
* tests/qemuxml2argvdata/qemuxml2argv-hostdev-usb-product.args,
tests/qemuxml2argvdata/qemuxml2argv-hostdev-usb-product.xml: Kil
unused data files
Since QEMU startup uses the new -device argument, the hotplug
code needs todo the same. This converts disk, network and
host device hotplug to use the device_add command
* src/qemu/qemu_driver.c: Use new device_add monitor APIs
whereever possible
All the helper functions for building command line arguments
now return a 'char *', instead of acepting a 'char **' or
virBufferPtr argument
* qemu/qemu_conf.c: Standardize syntax for building args
* qemu/qemu_conf.h: Export all functions for building args
* qemu/qemu_driver.c: Update for changed syntax for building
NIC/hostnet args
Similar to the race fixed by
be34c3c7ef, make sure
to wait around for KVM to release the resources from
a hot-detached PCI device before attempting to
rebind that device to the host driver.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
If you shutdown libvirtd while a domain with PCI
devices is running, then try to restart libvirtd,
libvirtd will crash.
This happens because qemuUpdateActivePciHostdevs() is calling
pciDeviceListSteal() with a dev of 0x0 (NULL), and then trying
to dereference it. This patch fixes it up so that
qemuUpdateActivePciHostdevs() steals the devices after first
Get()'ting them, avoiding the crash.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Certain hypervisors (like qemu/kvm) map the PCI bar(s) on
the host when doing device passthrough. This can lead to a race
condition where the hypervisor is still cleaning up the device while
libvirt is trying to re-attach it to the host device driver. To avoid
this situation, we look through /proc/iomem, and if the hypervisor is
still holding onto the bar (denoted by the string in the matcher variable),
then we can wait around a bit for that to clear up.
v2: Thanks to review by DV, make sure we wait the full timeout per-device
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The loop looking for the controller associated with a SCI drive had
an off by one, causing it to miss the last controller.
* src/qemu/qemu_driver.c: Fix off-by-1 in searching for SCSI
drive hotplug
The hotplug code in QEMU was leaking memory because although the
inner device object was being moved into the main virDomainDefPtr
config object, the outer container virDomainDeviceDefPtr was not.
* src/qemu/qemu_driver.c: Clarify code to show that the inner
device object is owned by the main domain config upon
successfull attach.
The hotplug code was not correctly invoking the security driver
in error paths. If a hotplug attempt failed, the device would
be left with VM permissions applied, rather than restored to the
original permissions. Also, a CDROM media that is ejected was
not restored to original permissions. Finally there was a bogus
call to set hostdev permissions in the hostdev unplug code
* qemu/qemu_driver.c: Fix security driver usage in hotplug/unplug
If there is a problem with VM startup, PCI devices may be left
assigned to pci-stub / pci-back. Adding a call to reattach
host devices in the cleanup path is required.
* qemu/qemu_driver.c: qemuDomainReAttachHostDevices() when
VM startup fails
Remove all the QEMU driver calls for setting file ownership and
process uid/gid. Instead wire in the QEMU DAC security driver,
stacking it ontop of the primary SELinux/AppArmour driver.
* qemu/qemu_driver.c: Switch over to new DAC security driver
Pulling the disk labelling code out of the exec hook, and into
libvirtd will allow it to access shared state in the daemon. It
will also make debugging & error reporting easier / more reliable.
* qemu/qemu_driver.c: Move initial disk labelling calls up into
libvirtd. Add cleanup of disk labels upon failure
If a VM fails to start, we can't simply free the security label
strings, we must call the domainReleaseSecurityLabel() method
otherwise the reserved 'mcs' level will be leaked in SElinux
* src/qemu/qemu_driver.c: Invoke domainReleaseSecurityLabel()
when domain fails to start
The current security driver architecture has the following
split of logic
* domainGenSecurityLabel
Allocate the unique label for the domain about to be started
* domainGetSecurityLabel
Retrieve the current live security label for a process
* domainSetSecurityLabel
Apply the previously allocated label to the current process
Setup all disk image / device labelling
* domainRestoreSecurityLabel
Restore the original disk image / device labelling.
Release the unique label for the domain
The 'domainSetSecurityLabel' method is special because it runs
in the context of the child process between the fork + exec.
This is require in order to set the process label. It is not
required in order to label disks/devices though. Having the
disk labelling code run in the child process limits what it
can do.
In particularly libvirtd would like to remember the current
disk image label, and only change shared image labels for the
first VM to start. This requires use & update of global state
in the libvirtd daemon, and thus cannot run in the child
process context.
The solution is to split domainSetSecurityLabel into two parts,
one applies process label, and the other handles disk image
labelling. At the same time domainRestoreSecurityLabel is
similarly split, just so that it matches the style. Thus the
previous 4 methods are replaced by the following 6 new methods
* domainGenSecurityLabel
Allocate the unique label for the domain about to be started
No actual change here.
* domainReleaseSecurityLabel
Release the unique label for the domain
* domainGetSecurityProcessLabel
Retrieve the current live security label for a process
Merely renamed for clarity.
* domainSetSecurityProcessLabel
Apply the previously allocated label to the current process
* domainRestoreSecurityAllLabel
Restore the original disk image / device labelling.
* domainSetSecurityAllLabel
Setup all disk image / device labelling
The SELinux and AppArmour drivers are then updated to comply with
this new spec. Notice that the AppArmour driver was actually a
little different. It was creating its profile for the disk image
and device labels in the 'domainGenSecurityLabel' method, where as
the SELinux driver did it in 'domainSetSecurityLabel'. With the
new method split, we can have consistency, with both drivers doing
that in the domainSetSecurityAllLabel method.
NB, the AppArmour changes here haven't been compiled so may not
build.
The QEMU driver is doing 90% of the calls to check for static vs
dynamic labelling. Except it is forgetting todo so in many places,
in particular hotplug is mistakenly assigning disk labels. Move
all this logic into the security drivers themselves, so the HV
drivers don't have to think about it.
* src/security/security_driver.h: Add virDomainObjPtr parameter
to virSecurityDomainRestoreHostdevLabel and to
virSecurityDomainRestoreSavedStateLabel
* src/security/security_selinux.c, src/security/security_apparmor.c:
Add explicit checks for VIR_DOMAIN_SECLABEL_STATIC and skip all
chcon() code in those cases
* src/qemu/qemu_driver.c: Remove all checks for VIR_DOMAIN_SECLABEL_STATIC
or VIR_DOMAIN_SECLABEL_DYNAMIC. Add missing checks for possibly NULL
driver entry points.
* src/lxc/lxc_container.c src/lxc/lxc_controller.c src/lxc/lxc_driver.c
src/network/bridge_driver.c src/qemu/qemu_driver.c
src/uml/uml_driver.c: virFileMakePath returns 0 for success, or the
value of errno on failure, so error checking should be to test
if non-zero, not if lower than 0
Invoking the virConnectGetCapabilities() method causes the QEMU
driver to rebuild its internal capabilities object. Unfortunately
it was forgetting to register the custom domain status XML hooks
again.
To avoid this kind of error in the future, the code which builds
capabilities is refactored into one single method, which can be
called from all locations, ensuring reliable rebuilds.
* src/qemu/qemu_driver.c: Fix rebuilding of capabilities XML and
guarentee it is always consistent