https://bugzilla.redhat.com/show_bug.cgi?id=1038363
If a domain has a different maximum for persistent and live maxmem
or max vcpus, then it is possible to hit cases where libvirt
refuses to adjust the current values or gets halfway through
the adjustment before failing. Better is to determine up front
if the change is possible for all requested flags.
Based on an idea by Geoff Franks.
* src/qemu/qemu_driver.c (qemuDomainSetMemoryFlags): Compute
correct maximum if both live and config are being set.
(qemuDomainSetVcpusFlags): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The virDomainGetRootFilesystem method can be generalized to allow
any filesystem path to be obtained.
While doing this, start a new test case for purpose of testing various
helper methods in the domain_conf.{c,h} files, such as this one.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupXXX APIs' return value must be checked for
being less than 0, not equal to 0.
An VIR_ERR_OPERATION_INVALID error must also be raised
when the VM is not running to prevent a crash on NULL
priv->cgroup field.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
And provide domain summary stat in that case, for lxc backend.
Use case is a container inheriting all devices from the host,
e.g. when doing application containerization.
Destroying a suspended domain needs special action.
We cannot simply terminate all process because they are frozen.
Do deal with that we send them SIGKILL and thaw them.
Upon wakeup the process sees the pending signal and dies immediately.
Signed-off-by: Richard Weinberger <richard@nod.at>
IN6ADDR_ANY_INIT does not seem to be working as expected on MinGW:
error: missing braces around initializer [-Werror=missing-braces]
.sin6_addr = IN6ADDR_ANY_INIT,
Use the in6addr_any variable instead.
Reported by Daniel P. Berrange.
Currently, networkRunHook() is called in networkAllocateActualDevice and
friends. These functions, however, doesn't necessarily work on networks,
For example, if domain's interface is defined in this fashion:
<interface type='bridge'>
<mac address='52:54:00:0b:3b:16'/>
<source bridge='virbr1'/>
<model type='rtl8139'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</interface>
The networkAllocateActualDevice jumps directly onto 'validate' label as
the interface is not type of 'network'. Hence, @network is left
initialized to NULL and networkRunHook(network, ...) is called. One of
the things that the hook function does is dereference @network. Soupir.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Dumping a domain's core can take considerable time. Use the
recently added job functions and unlock the virDomainObj while
dumping core.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Saving domain memory and cpu state can take considerable time.
Use the recently added job functions and unlock the virDomainObj
while saving the domain.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
When explicitly destroying a domain (libxlDomainDestroyFlags), or
handling an out-of-band domain shutdown event, cleanup the domain
in the context of a job. Introduce libxlVmCleanupJob to wrap
libxlVmCleanup in a job block.
Large balloon operation can be time consuming. Use the recently
added job functions and unlock the virDomainObj while ballooning.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Creating a large domain could potentially be time consuming. Use the
recently added job functions and unlock the virDomainObj while
the create operation is in progress.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
This function, which only has five call sites, simply calls
libxl_domain_destroy and libxlVmCleanup. Call those functions
directly at the call sites, allowing more control over how a
domain is destroyed and cleaned up. This patch maintains the
existing semantic, leaving changes to a subsequent patch.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
This patch changes network device type used by default from rtl8139
to virtio when architecture type is aarch64 and machine type is virt.
Qemu doesn't support any other machine types for aarch64 right now and
we can't make any other aarch64-specific tuning in this function yet.
Signed-off-by: Oleg Strikov <oleg.strikov@canonical.com>
At this point it has a limited functionality and is highly
experimental. Supported domain operations are:
* define
* start
* destroy
* dumpxml
* dominfo
It's only possible to have only one disk device and only one
network, which should be of type bridge.
There is no keyboard working on PPC64 and PS2 mouse is only for X86
when graphics are enabled.
Add a USB keyboard and USB mouse for PPC64 when graphics are enabled.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Format qemu command line for USB keyboard
and add test cases for it.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
PS2 devices only work on X86 platform, other platforms may need
USB devices instead. Athough it doesn't influence the QEMU command line,
it's not right to add PS2 mouse/keyboard for non-X86 platform.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
There is no keyboard support currently in libvirt.
For some platforms (PPC64 QEMU) this makes graphics unusable,
since the keyboard is not implicit and it can't be added via libvirt.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
The networkNotifyActualDevice function is accepting two arguments, not
one:
qemu/qemu_process.c: In function 'qemuProcessNotifyNets':
qemu/qemu_process.c:2776:47: error: macro "networkNotifyActualDevice" passed 2 arguments, but takes just 1
if (networkNotifyActualDevice(def, net) < 0)
^
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
aebbcdd didn't change the non-linux definition of the function,
breaking the build on FreeBSD:
../../src/util/virinitctl.c:164: error: conflicting types for
'virInitctlSetRunLevel'
../../src/util/virinitctl.h:40: error: previous declaration of
'virInitctlSetRunLevel' was here
Basically, the idea is copied from domain code, where tainting
exists for a while. Currently, only one taint reason exists -
VIR_NETWORK_TAINT_HOOK to mark those networks which caused invoking
of hook script.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There might be some use cases, where user wants to prepare the host or
its environment prior to starting a network and do some cleanup after
the network has been shut down. Consider all the functionality that
libvirt doesn't currently have as an example what a hook script can
possibly do.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In the next patch I'm going to need the network format function that
takes virBuffer as argument. However, slightly change of name is more
appropriate then: virNetworkDefFormatBuf to match the rest of functions
that format an object to buffer.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Rewrite multiple hotunplug functions to to use the
virProcessRunInMountNamespace helper. This avoids
risk of a malicious guest replacing /dev with an absolute
symlink, tricking the driver into changing the host OS
filesystem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Rewrite lxcDomainAttachDeviceHostdevMiscLive function
to use the virProcessRunInMountNamespace helper. This avoids
risk of a malicious guest replacing /dev with a absolute
symlink, tricking the driver into changing the host OS
filesystem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Rewrite lxcDomainAttachDeviceHostdevStorageLive function
to use the virProcessRunInMountNamespace helper. This avoids
risk of a malicious guest replacing /dev with a absolute
symlink, tricking the driver into changing the host OS
filesystem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Rewrite lxcDomainAttachDeviceHostdevSubsysUSBLive function
to use the virProcessRunInMountNamespace helper. This avoids
risk of a malicious guest replacing /dev with a absolute
symlink, tricking the driver into changing the host OS
filesystem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Rewrite lxcDomainAttachDeviceDiskLive function to use the
virProcessRunInMountNamespace helper. This avoids risk of
a malicious guest replacing /dev with a absolute symlink,
tricking the driver into changing the host OS filesystem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Use helper virProcessRunInMountNamespace in lxcDomainShutdownFlags and
lxcDomainReboot. Otherwise, a malicious guest could use symlinks
to force the host to manipulate the wrong file in the host's namespace.
Idea by Dan Berrange, based on an initial report by Reco
<recoverym4n@gmail.com> at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=732394
Signed-off-by: Eric Blake <eblake@redhat.com>
Implement virProcessRunInMountNamespace, which runs callback of type
virProcessNamespaceCallback in a container namespace. This uses a
child process to run the callback, since you can't change the mount
namespace of a thread. This implies that callbacks have to be careful
about what code they run due to async safety rules.
Idea by Dan Berrange, based on an initial report by Reco
<recoverym4n@gmail.com> at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=732394
Signed-off-by: Daniel Berrange <berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Add a helper function which takes a file path and ensures
that all directory components leading up to the file exist.
IOW, it strips the filename part of the path and passes
the result to virFileMakePath.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The check for whether the cgroup devices ACL is available is
done quite late during LXC hotplug - in fact after the device
node is already created in the container in some cases. Better
to do it upfront so we fail immediately.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The LXC disk hotplug code was allowing block or character devices
to be given as disk. A disk is always a block device.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When detaching a USB device from an LXC guest we must remove
the device from the cgroup ACL. Unfortunately we were telling
the cgroup code to use the guest /dev path, not the host /dev
path, and the guest device node had already been unlinked.
This was, however, fortunate since the code passed &priv->cgroup
instead of priv->cgroup, so would have crash if the device node
were accessible.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
After hotplugging a USB device, the LXC driver forgot
to add the device def to the virDomainDefPtr.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The LXC code missed the 'usb' component out of the path
/dev/bus/usb/$BUSNUM/$DEVNUM, so it failed to actually
setup cgroups for the device. This was in fact lucky
because the call to virLXCSetupHostUsbDeviceCgroup
was also mistakenly passing '&priv->cgroup' instead of
just 'priv->cgroup'. So once the path is fixed, libvirtd
would then crash trying to access the bogus virCgroupPtr
pointer. This would have been a security issue, were it
not for the bogus path preventing the pointer reference
being reached.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
virDomainDefCompatibleDevice blocks use of USB if no USB
controller is present. This is not correct for containers
since devices can be assigned directly regardless of any
controllers.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently, there's just one place where we care if hook script is
changing the domain XML: migration hook for incoming migration. In
all other places where a hook script is executed, we don't read the
XML back from the script.
Anyway, the hook script can alter domain XML and hence we should taint
it if the script did.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This new flag is to be used for tainting domains which
XML definition was altered at runtime by a hook script.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The internal pools were an idea in one of the first iterations of the
gluster series, which we decided not to use. Somehow the patch still
got pushed. Remove it as the internal flag isn't needed.
This reverts commit 362da8209d.
Also try to bind on IPv6 to check if the port is occupied.
Change the mocked bind in the test to return EADDRINUSE
for some ports only for the IPv4/IPv6 socket if we're testing
on a host with IPv6 compiled in.
Also mock socket() to make it fail with EAFNOTSUPPORTED
if LIBVIRT_TEST_IPV4ONLY is set in the environment, to
simulate a host without IPv6 support in the kernel. The
tests are repeated again with this variable set.
https://bugzilla.redhat.com/show_bug.cgi?id=1025407
In a44b7b87bc I've introduced a function
that initializes a storage file wrapper object on gluster based volumes.
The initialization function leaks the private data pointer in case of
failure. This patch fixes it.
Reported by John Ferlan.
In commit e32268184b I accidentally added
twice a typedef for virStorageFileBackend when I moved it between files
across patch iterations. The double declaration breaks build on older
compilers in RHEL5 and FreeBSD.
Remove the spurious definition.
Add support for gluster backed images as sources for snapshots in the
qemu driver. This will also simplify adding further network backed
volumes as sources for snapshot in case qemu will support them.
Use the new storage driver APIs to delete snapshot backing files in case
of failure instead of directly relying on "unlink". This will help us in
the future when we will be adding network based storage without local
representation in the host.
Add APIs that will allow to use the storage driver to assist in
operations on files even for remote filesystems without native
representation as files in the host.
All the data for getting the actual type is present in the snapshot
config. There is no need to have this function private to the qemu
driver and it will be re-used later in other parts of libvirt
All the data for getting the actual type is present in the domain
config. There is no need to have this function private to the qemu
driver and it will be re-used later in other parts of libvirt
The problem with VLAN is that the user still has to manually create the
vlan interface on the host. Then the generated configuration will use
it as a nerwork hostdev device. So the generated configurations of the
following two fragments are equivalent (see rhbz#1059637).
lxc.network.type = phys
lxc.network.link = eth0.5
lxc.network.type = vlan
lxc.network.link = eth0
lxc.network.vlan.id = 5
Some of the LXC configuration properties aren't migrated since they
would only cause problems in libvirt-lxc:
* lxc.network.ipv[46]: LXC driver doesn't setup IP address of guests,
see rhbz#1059624
* lxc.network.name, see rhbz#1059630
If no network configuration is provided, LXC only provides the loopback
interface. To match this, we need to use the privnet feature. LXC will
also define a 'none' network type in its 1.0.0 version that fits
libvirt LXC driver's default.
LXC rootfs can be either a directory or a block device or an image
file. The first two types have been implemented, but the image file is
still to be done since LXC auto-guesses the file format at mount time
and the LXC driver doesn't support the 'auto' format.
This function aims at converting LXC configuration into a libvirt
domain XML description to help users migrate from LXC to libvirt.
Here is an example of how the lxc configuration works:
virsh -c lxc:/// domxml-from-native lxc-tools /var/lib/lxc/migrate_test/config
It is possible that some parts couldn't be properly mapped into a
domain XML fragment, so users should carefully review the result
before creating the domain.
fstab files in lxc.mount lines will need to be merged into the
configuration file as lxc.mount.entry.
As we can't know the amount of memory of the host, we have to set a
default value for max_balloon that users will probably want to adjust.
virConf now honours a VIR_CONF_FLAG_LXC_FORMAT flag to handle LXC
configuration files. The differences are that property names can
contain '.' character and values are all strings without any bounding
quotes.
Provide a new virConfWalk function calling a handler on all non-comment
values. This function will be used by the LXC conversion code to loop
over LXC configuration lines.
Commit 57ddcc23 (v0.9.11) introduced the pmwakeup event, with
an optional 'reason' field reserved for possible future expansion.
But it failed to wire the field through RPC, so even if we do
add a reason in the future, we will be unable to get it back
to the user.
Worse, commit 7ba5defb (v1.0.0) repeated the same mistake with
the pmsuspend_disk event.
As long as we are adding new RPC calls, we might as well fix
the events to actually match the signature so that we don't have
to add yet another RPC in the future if we do decide to start
using the reason field.
* src/remote/remote_protocol.x
(remote_domain_event_callback_pmwakeup_msg)
(remote_domain_event_callback_pmsuspend_msg)
(remote_domain_event_callback_pmsuspend_disk_msg): Add reason
field.
* daemon/remote.c (remoteRelayDomainEventPMWakeup)
(remoteRelayDomainEventPMSuspend)
(remoteRelayDomainEventPMSuspendDisk): Pass reason to client.
* src/conf/domain_event.h (virDomainEventPMWakeupNewFromDom)
(virDomainEventPMSuspendNewFromDom)
(virDomainEventPMSuspendDiskNewFromDom): Require additional
parameter.
* src/conf/domain_event.c (virDomainEventPMClass): New class.
(virDomainEventPMDispose): New function.
(virDomainEventPMWakeupNew*, virDomainEventPMSuspendNew*)
(virDomainEventPMSuspendDiskNew*)
(virDomainEventDispatchDefaultFunc): Use new class.
* src/remote/remote_driver.c (remoteDomainBuildEvent*PM*): Pass
reason through.
* src/remote_protocol-structs: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
Following the patterns established by lifecycle events, this
creates all the new RPC calls needed to pass callback IDs
for every domain event, and changes the limits in client and
server codes to use modern style when possible.
I've tested all combinations: both 'old client and new server'
and 'new client and old server' continue to work with the old
RPCs, and 'new client and new server' benefit from server-side
filtering with the new RPCs.
* src/remote/remote_protocol.x (REMOTE_PROC_DOMAIN_EVENT_*): Add
REMOTE_PROC_DOMAIN_EVENT_CALLBACK_* counterparts.
* daemon/remote.c (remoteRelayDomainEvent*): Send callbackID via
newer RPC when used with new-style registration.
(remoteDispatchConnectDomainEventCallbackRegisterAny): Extend to
cover all domain events.
* src/remote/remote_driver.c (remoteDomainBuildEvent*): Add new
Callback and Helper functions.
(remoteEvents): Match order of RPC numbers, register new handlers.
(remoteConnectDomainEventRegisterAny)
(remoteConnectDomainEventDeregisterAny): Extend to cover all
domain events.
* src/remote_protocol-structs: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
The counterpart to the server RPC additions; here, a single
function can serve both old and new calls, while incoming
events must be serviced by two different functions. Again,
some wise choices in our XDR made it easier to share code
managing similar events.
While this only supports lifecycle events, it covers the
harder part of how Register and RegisterAny interact; the
remaining 15 events will be a mechanical change in a later
patch. For Register, we now have a callbackID locally for
more efficient cleanup if the RPC fails; we also prefer to
use the newer RPC where we know it is supported (the older
RPC must be used if we don't know if RegisterAny is
supported).
* src/remote/remote_driver.c (remoteEvents): Register new RPC
event handler.
(remoteDomainBuildEventLifecycle): Move guts...
(remoteDomainBuildEventLifecycleHelper): ...here.
(remoteDomainBuildEventCallbackLifecycle): New function.
(remoteConnectDomainEventRegister)
(remoteConnectDomainEventDeregister)
(remoteConnectDomainEventRegisterAny)
(remoteConnectDomainEventDeregisterAny): Use new RPC when supported.
We want to convert over to server-side events, even for older
APIs. To do that, the client side of the remote driver wants
to distinguish between legacy virConnectDomainEventRegister and
normal virConnectDomainEventRegisterAny, while knowing the
client callbackID and the server's serverID for both types of
registration. The client also needs to probe whether the
server supports server-side filtering. However, for ease of
review, we don't actually use the new RPCs until a later patch.
* src/conf/object_event_private.h (virObjectEventStateCallbackID):
Add parameter.
* src/conf/object_event.c (virObjectEventCallbackListAddID)
(virObjectEventStateRegisterID): Separate legacy from callbackID.
(virObjectEventStateCallbackID): Pass through parameter.
(virObjectEventCallbackLookup): Let legacy and global domain
lifecycle events share a common remoteID.
* src/conf/network_event.c (virNetworkEventStateRegisterID):
Update caller.
* src/conf/domain_event.c (virDomainEventStateRegister)
(virDomainEventStateRegisterID, virDomainEventStateDeregister):
Likewise.
(virDomainEventStateRegisterClient)
(virDomainEventStateCallbackID): Implement new functions.
* src/conf/domain_event.h (virDomainEventStateRegisterClient)
(virDomainEventStateCallbackID): New prototypes.
* src/remote/remote_driver.c (private_data): Add field.
(doRemoteOpen): Probe server feature.
(remoteConnectDomainEventRegister)
(remoteConnectDomainEventRegisterAny): Use new function.
Signed-off-by: Eric Blake <eblake@redhat.com>
This patch adds some new RPC call numbers, but for ease of review,
they sit idle until a later patch adds the client counterpart to
drive the new RPCs. Also for ease of review, I limited this patch
to just the lifecycle event; although converting the remaining
15 domain events will be quite mechanical. On the server side,
we have to have a function per RPC call, largely with duplicated
bodies (the key difference being that we store in our callback
opaque pointer whether events should be fired with old or new
style); meanwhile, a single function can drive multiple RPC
messages. With a strategic choice of XDR struct layout, we can
make the event generation code for both styles fairly compact.
I debated about adding a tri-state witness variable per
connection (values 'unknown', 'legacy', 'modern'). It would start
as 'unknown', move to 'legacy' if any RPC call is made to a legacy
event call, and move to 'modern' if the feature probe is made;
then the event code could issue an error if the witness state is
incorrect (a legacy RPC call while in 'modern', a modern RPC call
while in 'unknown' or 'legacy', and a feature probe while in
'legacy' or 'modern'). But while it might prevent odd behavior
caused by protocol fuzzing, I don't see that it would prevent
any security holes, so I considered it bloat.
Note that sticking @acl markers on the new RPCs generates unused
functions in access/viraccessapicheck.c, because there is no new
API call that needs to use the new checks; however, having a
consistent .x file is worth the dead code.
* src/libvirt_internal.h (VIR_DRV_FEATURE_REMOTE_EVENT_CALLBACK):
New feature.
* src/remote/remote_protocol.x
(REMOTE_PROC_CONNECT_DOMAIN_EVENT_CALLBACK_REGISTER_ANY)
(REMOTE_PROC_CONNECT_DOMAIN_EVENT_CALLBACK_DEREGISTER_ANY)
(REMOTE_PROC_DOMAIN_EVENT_CALLBACK_LIFECYCLE): New RPCs.
* daemon/remote.c (daemonClientCallback): Add field.
(remoteDispatchConnectDomainEventCallbackRegisterAny)
(remoteDispatchConnectDomainEventCallbackDeregisterAny): New
functions.
(remoteDispatchConnectDomainEventRegisterAny)
(remoteDispatchConnectDomainEventDeregisterAny): Mark legacy use.
(remoteRelayDomainEventLifecycle): Change message based on legacy
or new use.
(remoteDispatchConnectSupportsFeature): Advertise new feature.
* src/remote_protocol-structs: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
virGetStorageVol can return NULL on out-of-memory. If it does, cleanly
abort the volume clone operation.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
This reverts commit 67ccf91bf2.
We only generate the volume key after we've built it, but the storage
driver expects it to be filled after createVol finishes.
Squash the volume building back with creating to fulfill this
expectation.
Openstack Nova calls virConnectBaselineCPU() during initialization
of the instance to get a full list of CPU features.
This patch adds a stub to aarch64-specific code to handle
this request (no actual work is done). That's enough to have
this stub with limited functionality because qemu/kvm backend
supports only 'host-passthrough' cpu mode on aarch64.
Signed-off-by: Oleg Strikov <oleg.strikov@canonical.com>
A small fix for the possiblitiy of jumping to an error path before
registering for domain events, preventing receiving important ones
like shutdown and death.
This shadows the index function on some systems (RHEL-6.4, FreeBSD 9):
../../src/conf/capabilities.c: In function 'virCapabilitiesGetCpusForNode':
../../src/conf/capabilities.c:1005: warning: declaration of'index'
shadows a global declaration [-Wshadow]
/usr/include/strings.h:57: warning: shadowed declaration is here [-Wshadow]
On some platforms like IBM PowerNV the NUMA node numbers can be
non-sequential. For eg. numactl --hardware o/p from such a machine looks
as given below
node distances:
node 0 1 16 17
0: 10 40 40 40
1: 40 10 40 40
16: 40 40 10 40
17: 40 40 40 10
The NUMA nodes are 0,1,16,17
Libvirt uses sequential index as NUMA node numbers and this can
result in crash or incorrect results.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
Add a new backend for any character device. This backend uses channel
in spice connection. This channel is similar to spicevmc, but
all-purpose in contrast to spicevmc.
Apart from spicevmc, spiceport-backed chardev will not be formatted
into the command-line if there is no spice to use (with test for that
as well). For this I moved the def->graphics counting to the start
of the function so its results can be used in rest of the code even in
the future.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This patch is here just to ease the code review and make related
changes look more sensible. Apart from removing the condition this is
merely a whitespace (indentation) change.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Limiting ourselves to qemu without QEMU_CAPS_DEVICE capability, we
used '-serial none' only if there was no serial device defined in the
domain XML. This means that if we want to have a possibility of the
device being defined in XML, but not used in the command-line
(e.g. when it's pointless), we'll fail to attach '-serial none' to the
command-line (when skipping the device's command-line building and the
device being the only one).
Since there is no such device, this patch doesn't actually do
anything, but enables easier future additions in this manner.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Add a new character device backend called 'spiceport' that uses
spice's channel for communications and apart from spicevmc can be used
as a backend for any character device from libvirt's point of view.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This new RBD format supports snapshotting and cloning. By having
libvirt create images in format 2 end-users of the created images
can benefit from the new RBD format.
Older versions of libvirt can work with this new RBD format as long
as librbd supports format 2. RBD format is supported by librbd since
version 0.56 (Ceph Bobtail).
Signed-off-by: Wido den Hollander <wido@widodh.nl>
When restarting sheepdog pool, all volumes are missing.
This patch add automatically all volume from the added pool.
Adding last Daniel P. Berrange's syntaxes correction.
Adding vol on separeted function 'inspired' from parallels_storage :
parallelsAddDiskVolume
In order to make a client-only build successful on RHEL4 (yes, you
read that correctly!), commit 3ed2e54 modified src/util/virnetdev.c so
that the functional version of virNetDevGetVLanID() was only compiled
if GET_VLAN_VID_CMD was defined. However, it is *never* defined, but
is only an enum value, so the proper version was no longer compiled
even on platforms that support it. This resulted in the vlan tag not
being properly set for guest traffic on VEPA mode guest macvtap
interfaces that were bound to a vlan interface (that's the only place
that libvirt currently uses virNetDevGetVLanID)
Since there is no way to compile conditionally based on the presence
of an enum value, this patch modifies configure.ac to check for said
enum value with AC_CHECK_DECLS(), which #defines
HAVE_DECL_GET_VLAN_VID_CMD to 1 if it's successful compiling a test
program that uses GET_VLAN_VID_CMD (and still #defines it, but to 0,
if it's not successful). We can then make the compilation of
virNetDevGetVLanID() conditional on the value of
HAVE_DECL_GET_VLAN_VID_CMD.
Reset line numbering on each input file in check-aclrules.pl. Otherwise
it reports wrong line numbers in its error messages.
Signed-off-by: Yuri Myasoedov <ymyasoedov@yandex.ru>
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
In the network status XML we may have the <floor/> element with the
'sum' attribute. The attribute represents sum of all 'floor'-s of
computed over each interface connected to the network (this is needed to
guarantee certain bandwidth for certain domain). The sum is therefore a
number. However, if the number was mangled (e.g. by an user's
interference to network status file), we've just ignored it without
refusing to parse such file. This was all due to 'goto error' missing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The code took into account only the global permissions. The domains now
support per-vm DAC labels and per-image DAC labels. Use the most
specific label available.
The lack of debug printings might be frustrating in the future.
Moreover, this function doesn't follow the usual pattern we have in the
rest of the code:
int ret = -1;
/* do some work */
ret = 0;
cleanup:
/* some cleanup work */
return ret;
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add a new <timer> for the HyperV reference time counter enlightenment
and the iTSC reference page for Windows guests.
This feature provides a paravirtual approach to track timer events for
the guest (similar to kvmclock) with the option to use real hardware
clock on systems with a iTSC with compensation across various hosts.
According to the documentation various timer options are only supported
by certain timer types. Add a post parse check to verify that the user
didn't specify invalid options.
Also fix the qemu command line parsing function to set correct default
values for the kvmclock timer so that it passes the new check.
According to the documentation describing various tunables for domain
timers not all the fields are supported by all the driver types. Express
these in the RNG:
- rtc, platform: Only these support the "track" attribute.
- tsc: only one to support "frequency" and "mode" attributes
- hpet, pit: tickpolicy/catchup attribute/element
- kvmclock: no extra attributes are supported
Additionally the attributes of the <catchup> element for
tickpolicy='catchup' are optional according to the parsing code. Express
this in the XML and fix a spurious space added while formatting the
<catchup> element and add tests for it.
Coverity complains about "USE_AFTER_FREE" due to how virPCIDeviceSetStubDriver
"could" return either -1, 0, or 1 from the VIR_STRDUP() and then possibly makes
a call to virPCIDeviceDetach().
The only way this could happen is if NULL were passed as the "driver" name
and virStrdup() returned 0. Since the calling functions check < 0 on the
initial function call, the 0 possibility causes Coverity to complain.
To fix this - enforce that the second parameter is not NULL using
ATTRIBUTE_NONNULL(2) for the function prototype, then in virPCIDeviceDetach
add an sa_assert(dev->stubDriver). This will result in Coverity not complaining
any more.
Couple of codepaths shared the same code which can be moved out to a
function and on one of such places, qemuMigrationConfirmPhase(), the
domain was resumed even if it wasn't running before the migration
started.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1057407
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
libxlDomainRestoreFlags acquires the driver lock while reading the
domain config from the save file and adding it to
libxlDriverPrivatePtr->domains. But virDomainObjList provides
self-locking APIs, so remove the needless driver locking.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
If available, let libxl handle reaping any children it creates by
specifying libxl_sigchld_owner_libxl_always_selective_reap. This
feature was added to improve subprocess handling in libxl when used
in an application that does not install a SIGCHLD handler like
libvirt
http://lists.xen.org/archives/html/xen-devel/2014-01/msg01555.html
Prior to this patch, it is possible to hit asserts in libxl when
reaping subprocesses, particularly during simultaneous operations
on multiple domains. With this patch, and the corresponding changes
to libxl, I no longer see the asserts. Note that the libxl changes
will be included in Xen 4.4.0. Previous Xen versions will be
susceptible to hitting the asserts even with this patch applied to
the libvirt libxl driver.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Handling the domain shutdown event within the event handler seems
a bit unfair to libxl's event machinery. Domain "shutdown" could
take considerable time. E.g. if the shutdown reason is reboot,
the domain must be reaped and then started again.
Spawn a shutdown handler thread to do this work, allowing libxl's
event machinery to go about its business.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Due to some misunderstanding of requirements libxl places on timer
handling, I introduced the half-brained idea of maintaining a list
of timeouts that the driver could force to expire before freeing a
libxlDomainObjPrivate (and hence libxl_ctx). But testing all
the latest versions of Xen supported by the libxl driver (4.2.3,
4.3.1, 4.4.0 RC3), I see that libxl will handle this just fine and
there is no need to force expiration behind libxl's back. Indeed it
may be harmful to do so.
This patch removes the timer list, allowing libxl to handle cleanup
of its timer registrations.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
When libxl registers an FD with the libxl driver, the refcnt of the
associated libxlDomainObjPrivate object is incremented. The refcnt
is decremented when libxl deregisters the FD. But some FDs are only
deregistered when their libxl ctx is freed, which unfortunately is
done in the libxlDomainObjPrivate dispose function. With references
held by the FDs, libxlDomainObjPrivate is never disposed.
I added the ref/unref in FD registration/deregistration when adding
the same in timer registration/deregistration. For timers, this
is a simple approach to ensuring the libxlDomainObjPrivate is not
disposed prior to their expirtation, which libxl guarantees will
occur. It is not needed for FDs, and only causes
libxlDomainObjPrivate to leak.
This patch removes the reference on libxlDomainObjPrivate for FD
registrations, but retains them for timer registrations. Tested on
the latest releases of Xen supported by the libxl driver: 4.2.3,
4.3.1, and 4.4.0 RC3.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
This commit allows to attach/detach a <filesystem> device in qemu. For
this purpose I'm introducing two new functions: virDomainFSInsert() and
virDomainFSRemove() and adding necessary code in the qemu driver. It
compares filesystems based on their "destination" folder. So if two
filesystems share the same destination, they are considered equal and
the qemu driver would reject the insertion.
Signed-off-by: Matthieu Coudron <mattator@gmail.com>
If virDomainMemoryStats was run on a domain with virtio balloon driver
running on an old qemu which supports QMP but does not support qom-list
QMP command, libvirtd would crash. The reason is we did not check if
qemuMonitorJSONGetObjectListPaths failed and moreover we even stored its
result in an unsigned integer type.
When attempting a blockcommit from the top layer, the base argument
passed is NULL. This will be dereferenced when attempting a commit with
an empty image chain. Output the real volume path instead:
virsh blockcommit --verbose --path vda --domain DOMNAME --wait
error: invalid argument: top '/path/somefile' in chain for 'vda' has no backing file
instead of:
error: invalid argument: top '(null)' in chain for 'vda' has no backing file
Eric Blake suggested to change this message to be different from the
glibc's NULL deref protection message in printf to be able to
differentiate errors.
https://bugzilla.redhat.com/show_bug.cgi?id=1046192
Commit b8bf79a, which adds clock='variable', forgets to check
localtime basis in qemuBuildClockArgStr(). So that localtime
basis could not be used.
Reported-by: Jincheng Miao <jmiao@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit 2ce63c1 added imagelabel generation when relabeling is turned
off. But we weren't filling out the sensitivity for type 'none' labels,
resulting in an invalid label:
$ virsh managedsave domain
error: unable to set security context 'system_u:object_r:svirt_image_t'
on fd 28: Invalid argument
Noticed a misuse of 'to' while testing my event regression under
polkit ACLs, and decided to review the entire conf files for
other legibility bugs.
* daemon/libvirtd.conf: Use correct grammar.
* src/qemu/qemu.conf: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1058839
Commit f9f56340 for CVE-2014-0028 almost had the right idea - we
need to check the ACL rules to filter which events to send. But
it overlooked one thing: the event dispatch queue is running in
the main loop thread, and therefore does not normally have a
current virIdentityPtr. But filter checks can be based on current
identity, so when libvirtd.conf contains access_drivers=["polkit"],
we ended up rejecting access for EVERY event due to failure to
look up the current identity, even if it should have been allowed.
Furthermore, even for events that are triggered by API calls, it
is important to remember that the point of events is that they can
be copied across multiple connections, which may have separate
identities and permissions. So even if events were dispatched
from a context where we have an identity, we must change to the
correct identity of the connection that will be receiving the
event, rather than basing a decision on the context that triggered
the event, when deciding whether to filter an event to a
particular connection.
If there were an easy way to get from virConnectPtr to the
appropriate virIdentityPtr, then object_event.c could adjust the
identity prior to checking whether to dispatch an event. But
setting up that back-reference is a bit invasive. Instead, it
is easier to delay the filtering check until lower down the
stack, at the point where we have direct access to the RPC
client object that owns an identity. As such, this patch ends
up reverting a large portion of the framework of commit f9f56340.
We also have to teach 'make check' to special-case the fact that
the event registration filtering is done at the point of dispatch,
rather than the point of registration. Note that even though we
don't actually use virConnectDomainEventRegisterCheckACL (because
the RegisterAny variant is sufficient), we still generate the
function for the purposes of documenting that the filtering
takes place.
Also note that I did not entirely delete the notion of a filter
from object_event.c; I still plan on using that for my upcoming
patch series for qemu monitor events in libvirt-qemu.so. In
other words, while this patch changes ACL filtering to live in
remote.c and therefore we have no current client of the filtering
in object_event.c, the notion of filtering in object_event.c is
still useful down the road.
* src/check-aclrules.pl: Exempt event registration from having to
pass checkACL filter down call stack.
* daemon/remote.c (remoteRelayDomainEventCheckACL)
(remoteRelayNetworkEventCheckACL): New functions.
(remoteRelay*Event*): Use new functions.
* src/conf/domain_event.h (virDomainEventStateRegister)
(virDomainEventStateRegisterID): Drop unused parameter.
* src/conf/network_event.h (virNetworkEventStateRegisterID):
Likewise.
* src/conf/domain_event.c (virDomainEventFilter): Delete unused
function.
* src/conf/network_event.c (virNetworkEventFilter): Likewise.
* src/libxl/libxl_driver.c: Adjust caller.
* src/lxc/lxc_driver.c: Likewise.
* src/network/bridge_driver.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/remote/remote_driver.c: Likewise.
* src/test/test_driver.c: Likewise.
* src/uml/uml_driver.c: Likewise.
* src/vbox/vbox_tmpl.c: Likewise.
* src/xen/xen_driver.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1057321
pointed out that we weren't honoring the <bandwidth> element in
libvirt networks using <forward mode='bridge'/>. In fact, these
networks are just a method of giving a libvirt network name to an
existing Linux host bridge on the system, and libvirt doesn't have
enough information to know where to set such limits. We are working on
a method of supporting network bandwidths for some specific cases of
<forward mode='bridge'/>, but currently libvirt doesn't support it. So
the proper thing to do now is just log an error when someone tries to
put a <bandwidth> element in that type of network. (It's unclear if we
will be able to do proper bandwidth limiting for macvtap networks, and
most definitely we will not be able to support it for hostdev
networks).
While looking through the network XML documentation and comparing it
to the networkValidate function, I noticed that we also ignore the
presence of a mac address in the config in the same cases, rather than
failing so that the user will understand that their desired action has
not been taken.
This patch updates networkValidate() (which is called any time a
persistent network is defined, or a transient network created) to log
an error and fail if it finds either a <bandwidth> or <mac> element
and the network forward mode is anything except 'route'. 'nat', or
nothing. (Yes, neither of those elements is acceptable for any macvtap
mode, nor for a hostdev network).
NB: This does *not* cause failure to start any existing network that
contains one of those elements, so someone might have erroneously
defined such a network in the past, and that network will continue to
function unmodified. I considered it too disruptive to suddenly break
working configs on the next reboot after a libvirt upgrade.
https://bugzilla.redhat.com/show_bug.cgi?id=1045124
When loading modules, libvirt does not honor the modprobe blacklist.
Use the new virKModLoad() API in order to attempt load with blacklist check.
Use the new virKModIsBlacklisted() API to check if the failure to load
was due to the blacklist
Signed-off-by: John Ferlan <jferlan@redhat.com>
virKModConfig() - Return a buffer containing kernel module configuration
virKModLoad() - Load a specific module into the kernel configuration
virKModUnload() - Unload a specific module from the kernel configuration
virKModIsBlacklisted() - Determine whether a module is blacklisted within
the kernel configuration
commit f094aaac changed qemuPrepareHostdevPCIDevices() such that it
may modify the "backend" (vfio vs. legacy kvm) setting in the
virHostdevDef. However, qemuDomainAttachHostPciDevice() (used by
hotplug) copies the backend setting into a local *before* calling
qemuPrepareHostdevPCIDevices(), and then later makes a decision based
on that pre-change value.
The result is that, if the backend had been set to "default" (i.e. not
specified in the config) and was later updated to "VFIO" by
qemuPrepareHostdevPCIDevices(), the qemu process' MacMemLock is not
increased (as is required for VFIO device assignment).
This patch delays making the local copy of backend until after its
potential modification.
The previous patch fixed "forwardPlainNames" so that it really is
doing only what is intended, but left the default to be
"forwardPlainNames='no'". Discussion around the initial version of
that patch led to the decision that the default should instead be
"forwardPlainNames='yes'" (i.e. the original behavior before commit
f3886825). This patch makes that change to the default.
In commit f386825 we began adding the options
--domain-needed
--local=/$mydomain/
to all dnsmasq commandlines with the stated reason of preventing
forwarding of DNS queries for names that weren't fully qualified
domain names ("FQDN", i.e. a name that included some "."s and a domain
name). This was later changed to
domain-needed
local=/$mydomain/
when we moved the options from the dnsmasq commandline to a conf file.
The original patch on the list, and discussion about it, is here:
https://www.redhat.com/archives/libvir-list/2012-August/msg01594.html
When a domain name isn't specified (mydomain == ""), the addition of
"domain-needed local=//" will prevent forwarding of domain-less
requests to the virtualization host's DNS resolver, but if a domain
*is* specified, the addition of "local=/domain/" will prevent
forwarding of any requests for *qualified* names within that domain
that aren't resolvable by libvirt's dnsmasq itself.
An example of the problems this causes - let's say a network is
defined with:
<domain name='example.com'/>
<dhcp>
..
<host mac='52:54:00:11:22:33' ip='1.2.3.4' name='myguest'/>
</dhcp>
This results in "local=/example.com/" being added to the dnsmasq options.
If a guest requests "myguest" or "myguest.example.com", that will be
resolved by dnsmasq. If the guest asks for "www.example.com", dnsmasq
will not know the answer, but instead of forwarding it to the host, it
will return NOT FOUND to the guest. In most cases that isn't the
behavior an admin is looking for.
A later patch (commit 4f595ba) attempted to remedy this by adding a
"forwardPlainNames" attribute to the <dns> element. The idea was that
if forwardPlainNames='yes' (default is 'no'), we would allow
unresolved names to be forwarded. However, that patch was botched, in
that it only removed the "domain-needed" option when
forwardPlainNames='yes', and left the "local=/mydomain/".
Really we should have been just including the option "--domain-needed
--local=//" (note the lack of domain name) regardless of the
configured domain of the network, so that requests for names without a
domain would be treated as "local to dnsmasq" and not forwarded, but
all others (including those in the network's configured domain) would
be forwarded. We also shouldn't include *either* of those options if
forwardPlainNames='yes'. This patch makes those corrections.
This patch doesn't remedy the fact that default behavior was changed
by the addition of this feature. That will be handled in a subsequent
patch.
We support only one spicevmc channel name anyway and the code is
prepared to use the default one, there's only one check missing. It
is also mentioned in the documentation already and helps defining
domains with spice vdagent for people using virsh.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Coverity complains about default: label in libxl_driver.c not be able
to be reached. It's by design for the code and since it's not necessary
in the code nor does it elicit any compiler/make check warnings - just
remove it rather than adding a coverity[dead_error_begin] tag.
While I'm at it, lxc_driver.c and nodeinfo.c have the same design, so I
removed the default labels and the existing coverity tags.
The NWFilter code has as a deadlock race condition between
the virNWFilter{Define,Undefine} APIs and starting of guest
VMs due to mis-matched lock ordering.
In the virNWFilter{Define,Undefine} codepaths the lock ordering
is
1. nwfilter driver lock
2. virt driver lock
3. nwfilter update lock
4. domain object lock
In the VM guest startup paths the lock ordering is
1. virt driver lock
2. domain object lock
3. nwfilter update lock
As can be seen the domain object and nwfilter update locks are
not acquired in a consistent order.
The fix used is to push the nwfilter update lock upto the top
level resulting in a lock ordering for virNWFilter{Define,Undefine}
of
1. nwfilter driver lock
2. nwfilter update lock
3. virt driver lock
4. domain object lock
and VM start using
1. nwfilter update lock
2. virt driver lock
3. domain object lock
This has the effect of serializing VM startup once again, even if
no nwfilters are applied to the guest. There is also the possibility
of deadlock due to a call graph loop via virNWFilterInstantiate
and virNWFilterInstantiateFilterLate.
These two problems mean the lock must be turned into a read/write
lock instead of a plain mutex at the same time. The lock is used to
serialize changes to the "driver->nwfilters" hash, so the write lock
only needs to be held by the define/undefine methods. All other
methods can rely on a read lock which allows good concurrency.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are a number of pthreads impls available on Win32
these days, in particular the mingw64 project has a good
impl. Delete the native windows thread implementation and
rely on using pthreads everywhere.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The check-augeas-lockd test depends on the file
locking/qemu-lockd.conf, so must be skipped when QEMU
is disabled.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 10c9ceff6d intended to introduce new argument for the
testing purpose, but it missed the similar changing of the
device's sg_path. The problem was hidden since my laptop has
the /dev/sg0 and /dev/sg1. A later patch will modify the tests
accordingly.
Signed-off-by: Osier Yang <jyang@redhat.com>
Reported-by: Pavel Hrdina <phrdina@redhat.com>