Two reasons:
1.in none hotplug, we will pass it. We can see from libvirt function
qemuBuildVhostuserCommandLine
2.qemu will use this vetcor num to init msix table. If we don't pass, qemu
will use default value, this will cause VM can only use default value
interrupts at most.
Signed-off-by: gaohaifeng <gaohaifeng.gao@huawei.com>
Consider the following XML snippets:
$ cat scsicontroller.xml
<controller type='scsi' model='virtio-scsi' index='0'/>
$ cat scsihostdev.xml
<hostdev mode='subsystem' type='scsi'>
<source>
<adapter name='scsi_host0'/>
<address bus='0' target='8' unit='1074151456'/>
</source>
</hostdev>
If we create a guest that includes the contents of scsihostdev.xml,
but forget the virtio-scsi controller described in scsicontroller.xml,
one is silently created for us. The same holds true when attaching
a hostdev before the matching virtio-scsi controller.
(See qemuDomainFindOrCreateSCSIDiskController for context.)
Detaching the hostdev, followed by the controller, works well and the
guest behaves appropriately.
If we detach the virtio-scsi controller device first, any associated
hostdevs are detached for us by the underlying virtio-scsi code (this
is fine, since the connection is broken). But all is not well, as the
guest is unable to receive new virtio-scsi devices (the attach commands
succeed, but devices never appear within the guest), nor even be
shutdown, after this point.
While this is not libvirt's problem, we can prevent falling into this
scenario by checking if a controller is being used by any hostdev
devices. The same is already done for disk elements today.
Applying this patch and then using the XML snippets from earlier:
$ virsh detach-device guest_01 scsicontroller.xml
error: Failed to detach device from scsicontroller.xml
error: operation failed: device cannot be detached: device is busy
$ virsh detach-device guest_01 scsihostdev.xml
Device detached successfully
$ virsh detach-device guest_01 scsicontroller.xml
Device detached successfully
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Although nearly all host devices that are assigned to guests using
VFIO ("<hostdev>" devices in libvirt) are physically PCI Express
devices, until now libvirt's PCI address assignment has always
assigned them addresses on legacy PCI controllers in the guest, even
if the guest's machinetype has a PCIe root bus (e.g. q35 and
aarch64/virt).
This patch tries to assign them to an address on a PCIe controller
instead, when appropriate. First we do some preliminary checks that
might allow setting the flags without doing any extra work, and if
those conditions aren't met (and if libvirt is running privileged so
that it has proper permissions), we perform the (relatively) time
consuming task of reading the device's PCI config to see if it is an
Express device. If this is successful, the connect flags are set based
on the result, but if we aren't able to read the PCI config (most
likely due to the device not being present on the system at the time
of the check) we assume it is (or will be) an Express device, since
that is almost always the case anyway.
If libvirtd is running unprivileged, it can open a device's PCI config
data in sysfs, but can only read the first 64 bytes. But as part of
determining whether a device is Express or legacy PCI,
qemuDomainDeviceCalculatePCIConnectFlags() will be updated in a future
patch to call virPCIDeviceIsPCIExpress(), which tries to read beyond
the first 64 bytes of the PCI config data and fails with an error log
if the read is unsuccessful.
In order to avoid creating a parallel "quiet" version of
virPCIDeviceIsPCIExpress(), this patch passes a virQEMUDriverPtr down
through all the call chains that initialize the
qemuDomainFillDevicePCIConnectFlagsIterData, and saves the driver
pointer with the rest of the iterdata so that it can be used by
qemuDomainDeviceCalculatePCIConnectFlags(). This pointer isn't used
yet, but will be used in an upcoming patch (that detects Express vs
legacy PCI for VFIO assigned devices) to examine driver->privileged.
The path to the config file for a PCI device is conventiently stored
in a virPCIDevice object, but that object's contents aren't directly
visible outside of virpci.c, so we need to have an accessor function
for it if anyone needs to look at it.
This new function just calls fstat() (if provided with a valid fd) or
stat() (if fd is -1) and returns st_size (or -1 if there is an
error). We may decide we want this function to be more complex, and
handle things like block devices - this is a placeholder (that works)
for any more complicated function.
We can't change feature names for compatibility reasons even if they
contain typos or other software uses different names for the same
features. By adding alternative spellings in our CPU map we at least
allow anyone to grep for them and find the correct libvirt's name.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When virt-aa-helper parses xml content it can fail on security labels.
It fails by requiring to parse active domain content on seclabels that
are not yet filled in.
Testcase with virt-aa-helper on a minimal xml:
$ cat << EOF > /tmp/test.xml
<domain type='kvm'>
<name>test-seclabel</name>
<uuid>12345678-9abc-def1-2345-6789abcdef00</uuid>
<memory unit='KiB'>1</memory>
<os><type arch='x86_64'>hvm</type></os>
<seclabel type='dynamic' model='apparmor' relabel='yes'/>
<seclabel type='dynamic' model='dac' relabel='yes'/>
</domain>
EOF
$ /usr/lib/libvirt/virt-aa-helper -d -r -p 0 \
-u libvirt-12345678-9abc-def1-2345-6789abcdef00 < /tmp/test.xml
Current Result:
virt-aa-helper: error: could not parse XML
virt-aa-helper: error: could not get VM definition
Expected Result is a valid apparmor profile
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Only the latest APIs are fully documented and the documentation of the
older variants (which are just limited versions of the new APIs anyway)
points to the newest APIs.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Restarting libvirtd on the source host at the end of migration when a
domain is already running on the destination would cause image labels to
be reset effectively killing the domain. Commit e8d0166e1d fixed similar
issue on the destination host, but kept the source always resetting the
labels, which was mostly correct except for the specific case handled by
this patch.
https://bugzilla.redhat.com/show_bug.cgi?id=1343858
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Post-copy migration needs bi-directional communication between the
source and the destination QEMU processes, which is not supported by
tunnelled migration.
https://bugzilla.redhat.com/show_bug.cgi?id=1371358
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We had a lot of rados_conf_set and check works.
Use helper virStorageBackendRBDRADOSConfSet for them.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Thanks to the complex capability caching code virQEMUCapsProbeQMP was
never called when we were starting a new qemu VM. On the other hand,
when we are reconnecting to the qemu process we reload the capability
list from the status XML file. This means that the flag preventing the
function being called was not set and thus we partially reprobed some of
the capabilities.
The recent addition of CPU hotplug clears the
QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS if the machine does not support it.
The partial re-probe on reconnect results into attempting to call the
unsupported command and then killing the VM.
Remove the partial reprobe and depend on the stored capabilities. If it
will be necessary to reprobe the capabilities in the future, we should
do a full reprobe rather than this partial one.
QEMU 2.8.0 adds support for unavailable-features in
query-cpu-definitions reply. The unavailable-features array lists CPU
features which prevent a corresponding CPU model from being usable on
current host. It can only be used when all the unavailable features are
disabled. Empty array means the CPU model can be used without
modifications.
We can use unavailable-features for providing CPU model usability info
in domain capabilities XML:
<domainCapabilities>
...
<cpu>
<mode name='host-passthrough' supported='yes'/>
<mode name='host-model' supported='yes'>
<model fallback='allow'>Skylake-Client</model>
...
</mode>
<mode name='custom' supported='yes'>
<model usable='yes'>qemu64</model>
<model usable='yes'>qemu32</model>
<model usable='no'>phenom</model>
<model usable='yes'>pentium3</model>
<model usable='yes'>pentium2</model>
<model usable='yes'>pentium</model>
<model usable='yes'>n270</model>
<model usable='yes'>kvm64</model>
<model usable='yes'>kvm32</model>
<model usable='yes'>coreduo</model>
<model usable='yes'>core2duo</model>
<model usable='no'>athlon</model>
<model usable='yes'>Westmere</model>
<model usable='yes'>Skylake-Client</model>
...
</mode>
</cpu>
...
</domainCapabilities>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
"host" CPU model is supported by a special host-passthrough CPU mode and
users is not allowed to specify this model directly with custom mode.
Thus we should not advertise "host" CPU model in domain capabilities.
This worked well on architectures for which libvirt provides a list of
supported CPU models in cpu_map.xml (since "host" is not in the list).
But we need to explicitly filter "host" model out for all other
architectures.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
CPU models (and especially some additional details which we will start
probing for later) differ depending on the accelerator. Thus we need to
call query-cpu-definitions in both KVM and TCG mode to get all data we
want.
Tests in tests/domaincapstest.c are temporarily switched to TCG to avoid
having to squash even more stuff into this single patch. They will all
be switched back later in separate commits.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch moves the CPU models formatting code from
virQEMUCapsFormatCache into a separate function.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function just returned cached capabilities without checking whether
they are still valid. We should check that and refresh the capabilities
to make sure we don't return stale data. In other words, we should do
what all other lookup functions do.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function is made a little bit more readable and the code which
refreshes cached capabilities if they are not valid any more was moved
into a separate function (virQEMUCapsCacheValidate) so that it can be
reused in other places.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
If a user asked for a KVM domain capabilities when KVM is not available,
we would happily return data we got when probing through TCG and
pretended they were relevant for KVM. Let's just report KVM is not
supported to avoid confusion.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When domain capabilities were introduced we did not have enough data to
decide whether KVM works on the host or not and thus working legacy/VFIO
device assignment was used as a witness. Now that we know whether KVM
was enabled when probing QEMU capabilities (and thus we know it's
working), we can use this knowledge to provide better default value for
virttype.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Since some may depend on the accelerator used when probing QEMU the
cache becomes invalid when KVM becomes available or if it is not
available anymore.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
CPU related capabilities may differ depending on accelerator used when
probing. Let's use KVM if available when probing QEMU and fall back to
TCG. The created capabilities already contain all we need to distinguish
whether KVM or TCG was used:
- KVM was used when probing capabilities:
QEMU_CAPS_KVM is set
QEMU_CAPS_ENABLE_KVM is not set
- TCG was used and QEMU supports KVM, but it failed (e.g., missing
kernel module or wrong /dev/kvm permissions)
QEMU_CAPS_KVM is not set
QEMU_CAPS_ENABLE_KVM is set
- KVM was not used and QEMU does not support it
QEMU_CAPS_KVM is not set
QEMU_CAPS_ENABLE_KVM is not set
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Let's set QEMU_CAPS_KVM and QEMU_CAPS_ENABLE_KVM early so that the rest
of the probing code can use these capabilities to handle KVM/TCG replies
differently.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Using -machine instead of -M for QMP probing is safe because any QEMU
binary which is capable of QMP probing supports -machine.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The code that runs a new QEMU process to be used for probing
capabilities is separated into four reusable functions so that any code
that wants to probe a QEMU process may just follow a few simple steps:
cmd = virQEMUCapsInitQMPCommandNew(...);
virQEMUCapsInitQMPCommandRun(cmd);
/* talk to the running QEMU process using its QMP monitor */
if (reprobeIsRequired) {
virQEMUCapsInitQMPCommandAbort(cmd, ...);
virQEMUCapsInitQMPCommandRun(cmd);
/* talk to the running QEMU process again */
}
virQEMUCapsInitQMPCommandFree(cmd);
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This reverts commit 3a6cf6fc16.
Mistakenly this commit was pushed because I thought I missed the
corret one b880ff42dd while in fact I didn't.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
We have couple of functions that operate over NULL terminated
lits of strings. However, our naming sucks:
virStringJoin
virStringFreeList
virStringFreeListCount
virStringArrayHasString
virStringGetFirstWithPrefix
We can do better:
virStringListJoin
virStringListFree
virStringListFreeCount
virStringListHasString
virStringListGetFirstWithPrefix
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If libvirt is compiled without NUMACTL support starting libvirtd
reports a libvirt internal error "NUMA isn't available on this host"
without checking if NUMA support is compiled into the libvirt binaries.
This patch adds the missing NUMA support check to prevent the internal error.
It also includes a check if the cgroup controller cpuset is available before
using it.
The error was noticed when libvirtd was restarted with running domains and
on libvirtd start the qemuConnectCgroup gets called during qemuProcessReconnect.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
With the QEMU components in place, provide the XML parsing to
invoke that code when given the following XML snippet:
<hostdev mode='subsystem' type='scsi_host'>
<source protocol='vhost' wwpn='naa.501234567890abcd'/>
</hostdev>
An optional address element can be specified within the hostdev
(pick CCW or PCI as necessary):
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0625'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
Add basic vhost-scsi tests which were cloned from hostdev-scsi-virtio-scsi
in both xml2argv and xml2xml. Added ones for both vhost-scsi-ccw and
vhost-scsi-pci since the syntaxes are slightly different between them.
Also adjusted the docs to describe the changes.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Adjust the device string that is built for vhost-scsi devices so that it
can be invoked from hotplug.
From the QEMU command line, the file descriptors are expect to be numeric only.
However, for hotplug, the file descriptors are expected to begin with at least
one alphabetic character else this error occurs:
# virsh attach-device guest_0001 ~/vhost.xml
error: Failed to attach device from /root/vhost.xml
error: internal error: unable to execute QEMU command 'getfd':
Parameter 'fdname' expects a name not starting with a digit
We also close the file descriptor in this case, so that shutting down the
guest cleans up the host cgroup entries and allows future guests to use
vhost-scsi devices. (Otherwise the guest will silently end.)
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Open /dev/vhost-scsi, and record the resulting file descriptor, so that
the guest has access to the host device outside of the libvirt daemon.
Pass this information, along with data parsed from the XML file, to build
a device string for the qemu command line. That device string will be
for either a vhost-scsi-ccw device in the case of an s390 machine, or
vhost-scsi-pci for any others.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
For a new hostdev type='scsi_host' we have a number of
required functions for managing, adding, and removing the
host device to/from guests. Provide the basic infrastructure
for these tasks.
The name "SCSIVHost" (and its variants) is chosen to avoid
conflicts with existing code named "SCSIHost" to refer to
a hostdev type='scsi' protcol='none'.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
We already have a "scsi" hostdev subsys type, which refers to a single
LUN that is passed through to a guest. But what of things where
multiple LUNs are passed through via a single SCSI HBA, such as with
the vhost-scsi target? Create a new hostdev subsys type that will
carry this.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Do all the stuff for the vhost-scsi capability in QEMU,
so it's in place for our checks later.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
* add vboxDriver object to serve as a singleton global object that
holds references to IVirtualBox and ISession to be shared among
multiple connections. The vbox_driver is instantiated only once in
the first call vboxGetDriverConnection function that is guarded by
a mutex.
* call vbox API initialize only when the first connection is
established, and likewise uninitialize when last connection
disconnects. The prevents each subsequent connection from overwriting
IVirtualBox/ISession instances of any other active connection that
led to libvirtd segfaults. The virConnectOpen and virConnectClose
implementations are guarded by mutex on the global vbox_driver_lock
where the global vbox_driver object counts connectios and decides
when it's safe to call vbox's init/uninit routines.
* add IVirutalBoxClient to vboxDriver and use it to in tandem with newer
pfnClientInitialize/pfnClientUninitalize APIs for vbox versions that
support it, to avoid usage of the old pfnComInitialize/Uninitialize.
Removed the comment 'Set the migration source' as it isn't valid anymore
and 'start it up' isn't useful as qemuProcessStart() is already a
speaking name.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
The path prefixes for sysfs trees are always prepended by paths
beginning with a slash, making the trailing slash in the prefix
redundant.
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Only generate a warning if there is something to report.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Just like in the previous commit, we are not updating CGroups on
chardev hot(un-)plug and thus leaving qemu unable to access any
non-default device users are trying to hotplug.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If users try to hotplug RNG device with a backend different to
/dev/random or /dev/urandom the whole operation fails as qemu is
unable to access the device. The problem is we don't update
device CGroups during the operation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
qemuDomainObjExitAgent is unsafe.
First it accesses domain object without domain lock.
Second it uses outdated logic that goes back to commit 79533da1 of
year 2009 when code was quite different. (unref function
instead of unreferencing only unlocked and disposed object
in case of last reference and leaved unlocking to the caller otherwise).
Nowadays this logic may lead to disposing locked object
i guess.
Another problem is that the callers of qemuDomainObjEnterAgent
use domain object again (namely priv->agent) without domain lock.
This patch address these two problems.
qemuDomainGetAgent is dropped as unused.
Sometimes after domain restart agent is unavailabe even
if it is up and running in guest. Diagnostic message is
"QEMU guest agent is not available due to an error"
that is 'priv->agentError' is set. Investiagion shows that
'priv->agent' is not NULL, so error flag is set probably
during domain shutdown process and not cleaned up eventually.
The patch is quite simple - just clean up error flag unconditionally
upon domain stop.
Other hunks address other cases when error flag is not cleaned up.
1. processSerialChangedEvent. We need to clean error flag
unconditionally here too. For example if upon first 'connected' event we
fail to connect and set error flag and then connect on second
'connected' event then error flag will remain set erroneously
and make agent unavailable.
2. qemuProcessHandleAgentEOF. If error flag is set and we get
EOF we need to change state (and diagnostic) from 'error' to
'not connected'.
qemuConnectAgent return -1 or -2 in case of different errors.
A. -1 is a case of unsuccessuful connection to guest agent.
B. -2 is a case of destoyed domain during connection attempt.
All qemuConnectAgent callers handle the first error the same way
so let's move this logic into qemuConnectAgent itself. Patched
function returns 0 in case A and -1 in case B.
It is already discussed in "[RFC] daemon: remove hardcode dep on libvirt-guests" [1].
Mgmt can use means to save/restore domains on system shutdown/boot other than
libvirt-guests.service. Thus we need to specify appropriate ordering dependency between
libvirtd, domains and save/restore service. This patch takes approach suggested
in RFC and introduces a systemd target, so that ordering can be built next way:
libvirtd -> domain -> virt-guest-shutdown.target -> save-restore.service.
This way domains are decoupled from specific shutdown service via intermediate
target.
[1] https://www.redhat.com/archives/libvir-list/2016-September/msg01353.html
Use the util function virHostdevIsSCSIDevice() to simplify if
statements.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Add the function virHostdevIsSCSIDevice() which detects whether a
hostdev is a SCSI device or not.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Add missing checks if a hostdev is a subsystem/SCSI device before access
the union member 'subsys'/'scsi'. Also fix indentation and simplify
qemuDomainObjCheckHostdevTaint().
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
New line character in name of storagepool is now forbidden because it
mess virsh output and can be confusing for users.
Validation of name is done in driver, after parsing XML to avoid
problems with dissappeared pools which was already created with
new-line char in name.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
New line character in name of domain is now forbidden because it
mess virsh output and can be confusing for users.
Validation of name is done in drivers, after parsing XML to avoid
problems with dissappeared domains which was already created with
new-line char in name.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit 3f71c79768 added 'qemu_id' field to track the id of the cpu
as reported by query-cpus. The patch did not include changes necessary
to propagate the id through the functions matching the data to the
libvirt cpu structures and thus all vcpus had id 0.
Don't use qemuMonitorGetCPUInfo which does a lot of matching to get the
full picture which is not necessary and would be mostly discarded.
Refresh only the vcpu halted state using data from query-cpus.
We don't need to call qemuMonitorGetCPUInfo which is very inefficient to
get data required to update the vcpu 'halted' state.
Add a monitor helper that will retrieve the halted state and return it
in a bitmap so that it can be indexed easily.
Whenever qemuMonitorJSONCheckError returns 0, the "return" object is
guaranteed to exist. Thus virJSONValueObjectGetObject will never fail to
get it. On the other hand, virJSONValueObjectGetArray may fail since the
"return" object may not be an array.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Libvirt's code relies on this fact so don't allow parsing a command line
which would have none.
Libvirtd would crash in the post parse callback on such config.
After a944bd92 we gained support for setting gluster debug level.
However, due to a space we haven't tested whether augeas file
actually works.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
After commit f2bf5fbb04, MinGW strikes again. Simply print pid as any
other place after commit b7d2d4af2b.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit 94cc577807 tried fixing build on systems that did not have
SCHED_BATCH or SCHED_IDLE defined. But instead of changing it to
conditional support, it rather completely disabled the support for
setting any scheduler. Since then, such old systems are not
supported, but rather than reverting that commit, let's change that to
the conditional support. That way any addition to the list of
schedulers can follow the same style so that we're consistent in the
future.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1366460
When using the --overwrite switch on a pool-build or pool-create, the
The mkfs.ext{2|3|4} commands use mke2fs which requires using the '-F' switch
in order to force overwriting the current filesystem on the whole disk.
Likewise, the mkfs.vfat command uses mkfs.fat which requires using the '-I'
switch in order to force overwriting the current filesystem on the whole disk.
Old GCC on CentOS 6 thinks vendor and vendor_id might be used
uninitialized in virCPUDefStealModel. The compiler is wrong, though.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Implement in virtNetClient and VirNetSocket the needed functions to
expose a new libssh transport, providing all the options that the
libssh2 transport supports.
Implement a new libssh transport, which uses libssh to communicate with
remote hosts, and add all the build system stuff (search of libssh,
private symbols, etc) to built it.
This new transport supports all the common ssh authentication methods,
making use of libvirt's auth callbacks for interaction with the user.
Add a couple of helper functions to check whether one of the default
names of SSH keys (as documented in ssh-keygen(1)) exists, and use them
to specify a key for the libssh2 transport if none was passed.
Add an internal variable to mark the FD as "not owned" by the
virNetSocket, in case the internal implementation takes the actual
ownership of the descriptor; this avoids a warning when closing the
socket, as the FD would be invalid.
Guest CPU definitions with mode='custom' and missing <vendor> are
expected to run on a host CPU from any vendor as long as the required
CPU model can be used as a guest CPU on the host. But even though no CPU
vendor was explicitly requested we would sometimes force it due to a bug
in virCPUUpdate and virCPUTranslate.
The bug would effectively forbid cross vendor migrations even if they
were previously working just fine.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
PPC driver needs to convert POWERx_v* legacy CPU model names into POWERx
to maintain backward compatibility with existing domains. This patch
adds a new step into the guest CPU configuration work flow which CPU
drivers can use to convert legacy CPU definitions.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
It was introduced by commit 7a51d9ebb, which started to use
monitor commands without job acquiring, which is unsafe and leads
to simultaneous access to vm->mon structure by different threads.
Crash backtrace is the following (shortened):
Program received signal SIGSEGV, Segmentation fault.
qemuMonitorSend (mon=mon@entry=0x7f4ef4000d20, msg=msg@entry=0x7f4f18e78640) at qemu/qemu_monitor.c:1011
1011 while (!mon->msg->finished) {
0 qemuMonitorSend () at qemu/qemu_monitor.c:1011
1 0x00007f691abdc720 in qemuMonitorJSONCommandWithFd () at qemu/qemu_monitor_json.c:298
2 0x00007f691abde64a in qemuMonitorJSONCommand at qemu/qemu_monitor_json.c:328
3 qemuMonitorJSONQueryCPUs at qemu/qemu_monitor_json.c:1408
4 0x00007f691abcaebd in qemuMonitorGetCPUInfo g@entry=false) at qemu/qemu_monitor.c:1931
5 0x00007f691ab96863 in qemuDomainRefreshVcpuHalted at qemu/qemu_domain.c:6309
6 0x00007f691ac0af99 in qemuDomainGetStatsVcpu at qemu/qemu_driver.c:18945
7 0x00007f691abef921 in qemuDomainGetStats at qemu/qemu_driver.c:19469
8 qemuConnectGetAllDomainStats at qemu/qemu_driver.c:19559
9 0x00007f693382e806 in virConnectGetAllDomainStats at libvirt-domain.c:11546
10 0x00007f6934470c40 in remoteDispatchConnectGetAllDomainStats at remote.c:6267
(gdb) p mon->msg
$1 = (qemuMonitorMessagePtr) 0x0
This change fixes it by calling qemuDomainRefreshVcpuHalted only when job is acquired.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
For machinetypes with a pci-root bus (all legacy PCI), libvirt will
make a "fake" reservation for one extra slot prior to assigning
addresses to unaddressed PCI endpoint devices in the domain. This will
trigger auto-adding of a pci-bridge for the final device to be
assigned an address *if that device would have otherwise instead been
the last device on the last available pci-bridge*; thus it assures
that there will always be at least one slot left open in the domain's
bus topology for expansion (which is important both for hotplug (since
a new pci-bridge can't be added while the guest is running) as well as
for offline additions to the config (since adding a new device might
otherwise in some cases require re-addressing existing devices, which
we want to avoid)).
It's important to note that for the above case (legacy PCI), we must
check for the special case of all slots on all buses being occupied
*prior to assigning any addresses*, and avoid attempting to reserve
the extra address in that case, because there is no free address in
the existing topology, so no place to auto-add a pci-bridge for
expansion (i.e. it would always fail anyway). Since that condition can
only be reached by manual intervention, this is acceptable.
For machinetypes with pcie-root (Q35, aarch64 virt), libvirt's
methodology for automatically expanding the bus topology is different
- pcie-root-ports are plugged into slots (soon to be functions) of
pcie-root as needed, and the new endpoint devices are assigned to the
single slot in each pcie-root-port. This is done so that the devices
are, by default, hotpluggable (the slots of pcie-root don't support
hotplug, but the single slot of the pcie-root-port does). Since
pcie-root-ports can only be plugged into pcie-root, and we don't
auto-assign endpoint devices to the pcie-root slots, this means
topology expansion doesn't compete with endpoint devices for slots, so
we don't need to worry about checking for all "useful" slots being
free *prior* to assigning addresses to new endpoint devices - as a
matter of fact, if we attempt to reserve the open slots before the
used slots, it can lead to errors.
Instead this patch just reserves one slot for a "future potential"
PCIe device after doing the assignment for actual devices, but only
if the only PCI controller defined prior to starting address
assignment was pcie-root, and only if we auto-added at least one PCI
controller during address assignment. This assures two things:
1) that reserving the open slots will only be done when the domain is
initially defined, never at any time after, and
2) that if the user understands enough about PCI controllers that they
are adding them manually, that we don't mess up their plan by
adding extras - if they know enough to add one pcie-root-port, or
to manually assign addresses such that no pcie-root-ports are
needed, they know enough to add extra pcie-root-ports if they want
them (this could be called the "libguestfs clause", since
libguestfs needs to be able to create domains with as few
devices/controllers as possible).
This is set to reserve a single free port for now, but could be
increased in the future if public sentiment goes in that direction
(it's easy to increase later, but essentially impossible to decrease)
Real Q35 hardware has an ICH9 chip that includes several integrated
devices at particular addresses (see the file docs/q35-chipset.cfg in
the qemu source). libvirt already attempts to put the first two sets
of ich9 USB2 controllers it finds at 00:1D.* and 00:1A.* to match the
real hardware. This patch does the same for the ich9 "HD audio"
device.
The main inspiration for this patch is that currently the *only*
device in a reasonable "workstation" type virtual machine config that
requires a legacy PCI slot is the audio device, Without this patch,
the standard Q35 machine created by virt-manager will have a
dmi-to-pci-bridge and a pci-bridge just for the sound device; with the
patch (and if you change the sound device model from the default
"ich6" to "ich9"), the machine definition constructed by virt-manager
has absolutely no legacy PCI controllers - any legacy PCI devices
(e.g. video and sound) are on pcie-root as integrated devices.
Previously we added a set of EHCI+UHCI controllers to Q35 machines to
mimic real hardware as closely as possible, but recent discussions
have pointed out that the nec-usb-xhci (USB3) controller is much more
virtualization-friendly (uses less CPU), so this patch switches the
default for Q35 machinetypes to add an XHCI instead (if it's
supported, which it of course *will* be).
Since none of the existing test cases left out USB controllers in the
input XML, a new Q35 test case was added which has *no* devices, so
ends up with only the defaults always put in by qemu, plus those added
by libvirt.
Now the a dmi-to-pci-bridge is automatically added just as it's needed
(when a pci-bridge is being added), we no longer have any need to
force-add one to every single Q35 domain.
Previously libvirt would only add pci-bridge devices automatically
when an address was requested for a device that required a legacy PCI
slot and none was available. This patch expands that support to
dmi-to-pci-bridge (which is needed in order to add a pci-bridge on a
machine with a pcie-root), and pcie-root-port (which is needed to add
a hotpluggable PCIe device). It does *not* automatically add
pcie-switch-upstream-ports or pcie-switch-downstream-ports (and
currently there are no plans for that).
Given the existing code to auto-add pci-bridge devices, automatically
adding pcie-root-ports is fairly straightforward. The
dmi-to-pci-bridge support is a bit tricky though, for a few reasons:
1) Although the only reason to add a dmi-to-pci-bridge is so that
there is a reasonable place to plug in a pci-bridge controller,
most of the time it's not the presence of a pci-bridge *in the
config* that triggers the requirement to add a dmi-to-pci-bridge.
Rather, it is the presence of a legacy-PCI device in the config,
which triggers auto-add of a pci-bridge, which triggers auto-add of
a dmi-to-pci-bridge (this is handled in
virDomainPCIAddressSetGrow() - if there's a request to add a
pci-bridge we'll check if there is a suitable bus to plug it into;
if not, we first add a dmi-to-pci-bridge).
2) Once there is already a single dmi-to-pci-bridge on the system,
there won't be a need for any more, even if it's full, as long as
there is a pci-bridge with an open slot - you can also plug
pci-bridges into existing pci-bridges. So we have to make sure we
don't add a dmi-to-pci-bridge unless there aren't any
dmi-to-pci-bridges *or* any pci-bridges.
3) Although it is strongly discouraged, it is legal for a pci-bridge
to be directly plugged into pcie-root, and we don't want to
auto-add a dmi-to-pci-bridge if there is already a pci-bridge
that's been forced directly into pcie-root.
Although libvirt will now automatically create a dmi-to-pci-bridge
when it's needed, the code still remains for now that forces a
dmi-to-pci-bridge on all domains with pcie-root (in
qemuDomainDefAddDefaultDevices()). That will be removed in a future
patch.
For now, the pcie-root-ports are added one to a slot, which is a bit
wasteful and means it will fail after 31 total PCIe devices (30 if
there are also some PCI devices), but helps keep the changeset down
for this patch. A future patch will have 8 pcie-root-ports sharing the
functions on a single slot.
Andrea had the right idea when he disabled the "reserve an extra
unused slot" bit for aarch64/virt. For *any* PCI Express-based
machine, it is pointless since 1) an extra legacy-PCI slot can't be
used for hotplug, since hotplug into legacy PCI slots doesn't work on
PCI Express machinetypes, and 2) even for "coldplug" expansion,
everybody will want to expand using Express controllers, not legacy
PCI.
This patch eliminates the extra slot reserve unless the system has a
pci-root (i.e. legacy PCI)
The nec-usb-xhci device (which is a USB3 controller) has always
presented itself as a PCI device when plugged into a legacy PCI slot,
and a PCIe device when plugged into a PCIe slot, but libvirt has
always auto-assigned it to a legacy PCI slot.
This patch changes that behavior to auto-assign to a PCIe slot on
systems that have pcie-root (e.g. Q35 and aarch64/virt).
Since we don't yet auto-create pcie-*-port controllers on demand, this
means a config with an nec-xhci USB controller that has no PCI address
assigned will also need to have an otherwise-unused pcie-*-port
controller specified:
<controller type='pci' model='pcie-root-port'/>
<controller type='usb' model='nec-xhci'/>
(this assumes there is an otherwise-unused slot on pcie-root to accept
the pcie-root-port)
The e1000e is an emulated network device based on the Intel 82574,
present in qemu 2.7.0 and later. Among other differences from the
e1000, it presents itself as a PCIe device rather than legacy PCI. In
order to get it assigned to a PCIe controller, this patch updates the
flags setting for network devices when the model name is "e1000e".
(Note that for some reason libvirt has never validated the network
device model names other than to check that there are no dangerous
characters in them. That should probably change, but is the subject of
another patch.)
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1343094
libvirt previously assigned nearly all devices to a "hotpluggable"
legacy PCI slot even on machines with a PCIe root bus (and even though
most such machines don't even support hotplug on legacy PCI slots!)
Forcing all devices onto legacy PCI slots means that the domain will
need a dmi-to-pci-bridge (to convert from PCIe to legacy PCI) and a
pci-bridge (to provide hotpluggable legacy PCI slots which, again,
usually aren't hotpluggable anyway).
To help reduce the need for these legacy controllers, this patch tries
to assign virtio-1.0-capable devices to PCIe slots whenever possible,
by setting appropriate connectFlags in
virDomainCalculateDevicePCIConnectFlags(). Happily, when that function
was written (just a few commits ago) it was created with a
"virtioFlags" argument, set by both of its callers, which is the
proper connectFlags to set for any virtio-*-pci device - depending on
the arch/machinetype of the domain, and whether or not the qemu binary
supports virtio-1.0, that flag will have either been set to PCI or
PCIe. This patch merely enables the functionality by setting the flags
for the device to whatever is in virtioFlags if the device is a
virtio-*-pci device.
NB: the first virtio video device will be placed directly on bus 0
slot 1 rather than on a pcie-root-port due to the override for primary
video devices in qemuDomainValidateDevicePCISlotsQ35(). Whether or not
to change that is a topic of discussion, but this patch doesn't change
that particular behavior.
NB2: since the slot must be hotpluggable, and pcie-root (the PCIe root
complex) does *not* support hotplug, this means that suitable
controllers must also be in the config (i.e. either pcie-root-port, or
pcie-downstream-port). For now, libvirt doesn't add those
automatically, so if you put virtio devices in a config for a qemu
that has PCIe-capable virtio devices, you'll need to add extra
pcie-root-ports yourself. That requirement will be eliminated in a
future patch, but for now, it's simple to do this:
<controller type='pci' model='pcie-root-port'/>
<controller type='pci' model='pcie-root-port'/>
<controller type='pci' model='pcie-root-port'/>
...
Partially Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1330024
This patch cleans up the connect flags for certain types/models of
devices that aren't PCI to return 0. In the future that may be used as
an indicator to the caller about whether or not a device needs a PCI
address. For now it's just ignored, except for in
virDomainPCIAddressEnsureAddr() - called during device hotplug - (and
in some cases actually needs to be re-set to PCI|HOTPLUGGABLE just in
case someone (in some old config) has manually set a PCI address for a
device that isn't PCI.
Before now, all the qemu hotplug functions assumed that all devices to
be hotplugged were legacy PCI endpoint devices
(VIR_PCI_CONNECT_TYPE_PCI_DEVICE). This worked out "okay", because all
devices *are* legacy PCI endpoint devices on x86/440fx machinetypes,
and hotplug didn't work properly on machinetypes using PCIe anyway
(hotplugging onto a legacy PCI slot doesn't work, and until commit
b87703cf any attempt to manually specify a PCIe address for a
hotplugged device would be erroneously rejected).
This patch makes all qemu hotplug operations honor the pciConnectFlags
set by the single all-knowing function
qemuDomainDeviceCalculatePCIConnectFlags(). This is done in 3 steps,
but in a single commit since we would have to touch the other points
at each step anyway:
1) add a flags argument to the hypervisor-agnostic
virDomainPCIAddressEnsureAddr() (previously it hardcoded
..._PCI_DEVICE)
2) add a new qemu-specific function qemuDomainEnsurePCIAddress() which
gets the correct pciConnectFlags for the device from
qemuDomainDeviceConnectFlags(), then calls
virDomainPCIAddressEnsureAddr().
3) in qemu_hotplug.c replace all calls to
virDomainPCIAddressEnsureAddr() with calls to
qemuDomainEnsurePCIAddress()
So in effect, we're putting a "shim" on top of all calls to
virDomainPCIAddressEnsureAddr() that sets the right pciConnectFlags.
Set pciConnectFlags in each device's DeviceInfo and then use those
flags later when validating existing addresses in
qemuDomainCollectPCIAddress() and when assigning new addresses with
qemuDomainPCIAddressReserveNextAddr() (rather than scattering the
logic about which devices need which type of slot all over the place).
Note that the exact flags set by
qemuDomainDeviceCalculatePCIConnectFlags() are different from the
flags previously set manually in qemuDomainCollectPCIAddress(), but
this doesn't matter because all validation of addresses in that case
ignores the setting of the HOTPLUGGABLE flag, and treats PCIE_DEVICE
and PCI_DEVICE the same (this lax checking was done on purpose,
because there are some things that we want to allow the user to
specify manually, e.g. assigning a PCIe device to a PCI slot, that we
*don't* ever want libvirt to do automatically. The flag settings that
we *really* want to match are 1) the old flag settings in
qemuDomainAssignDevicePCISlots() (which is HOTPLUGGABLE | PCI_DEVICE
for everything except PCI controllers) and 2) the new flag settings
done by qemuDomainDeviceCalculatePCIConnectFlags() (which are
currently exactly that - HOTPLUGGABLE | PCI_DEVICE for everything
except PCI controllers).
The lowest level function of this trio
(qemuDomainDeviceCalculatePCIConnectFlags()) aims to be the single
authority for the virDomainPCIConnectFlags to use for any given device
using a particular arch/machinetype/qemu-binary.
qemuDomainFillDevicePCIConnectFlags() sets info->pciConnectFlags in a
single device (unless it has no virDomainDeviceInfo, in which case
it's a NOP).
qemuDomainFillAllPCIConnectFlags() sets info->pciConnectFlags in all
devices that have a virDomainDeviceInfo
The latter two functions aren't called anywhere yet. This commit is
just making them available. Later patches will replace all the current
hodge-podge of flag settings with calls to this single authority.
dom xml generated on begin step should be passed
to perform step in VIR_MIGRATE_PARAM_DEST_XML parameter.
Otherwise 'XML error: failed to parse xml document' is
raised on destination host as dom xml is NULL.
Signed-off-by: Pavel Glushchak <pglushchak@virtuozzo.com>
Coverity identified that this variable might be leaked. And it's
right. If an error occurred and we have to roll back the control
jumps to try_remove label where we save the current error (see
0e82fa4c34 for more info). However, inside the code a jump onto
other label is possible thus leaking the error object.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Use the newly introduced close callback helpers to make the code look just a
bit cleaner and more importantly, to fix the following memleak regarding a
dangling virAdmConnect object reference caused by assigning NULL to the close
callback data once the catch-disconnect routine used the callback followed
by a comparison of NULL to the originally defined close callback (which at that
moment had already been NULL'd by remoteAdminClientCloseFunc) in
virAdmConnectCloseCallbackUnregister.
717 (88 direct, 629 indirect) bytes in 1 blocks are definitely lost record
110 of 141
at 0x4C2A988: calloc (vg_replace_malloc.c:711)
by 0x530696F: virAllocVar (viralloc.c:560)
by 0x53689E6: virObjectNew (virobject.c:193)
by 0x5368B5E: virObjectLockableNew (virobject.c:219)
by 0x4E3E7EE: virAdmConnectNew (datatypes.c:900)
by 0x4E398BB: virAdmConnectOpen (libvirt-admin.c:220)
by 0x10D3E3: vshAdmConnect (virt-admin.c:161)
by 0x10D624: vshAdmReconnect (virt-admin.c:215)
by 0x10DB0A: cmdConnect (virt-admin.c:353)
by 0x11288F: vshCommandRun (vsh.c:1313)
by 0x10FDB6: main (virt-admin.c:1439)
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1357358
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Well, there were three different spots where closeCallback->freeCallback was
called, not looking the same --> potential for bugs - and there indeed is a bug
with refcounting of the @conn object. So this patch partially follows the path
set by commit 24dbb69f by introducing some close callback helpers both to
replace all the spots where we call clean the close callback data with a
dedicated function and to be able to fix the refcounting bug causing a memleak.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The only place we change the @conn object is actually virAdmConnectOpen
routine, thus at the moment we don't really need to lock it, given the fact that
what we're trying to do here is to change the closeCallback object which is a
lockable object itself, so that should be enough to avoid races.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
As was suggested in an earlier review comment[1], we can
catch some additional code points by cleaning up how we use the
hostdev subsystem type in some switch statements.
[1] End of https://www.redhat.com/archives/libvir-list/2016-September/msg00399.html
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
The memory device alias needs to be treated as machine ABI as qemu is
using it in the migration stream for section labels. To simplify this
generate the alias from the slot number unless an existing broken
configuration is detected.
With this patch the aliases are predictable and even certain
configurations which would not be migratable previously are fixed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1359135
As with other devices assign the slot number right away when adding the
device. This will make the slot numbers static as we do with other
addressing elements and it will ultimately simplify allocation of the
alias in a static way which does not break with qemu.
Detect on reconnect to a running qemu VM whether the alias of a
hotpluggable memory device (dimm) does not match the dimm slot number
where it's connected to. This is necessary as qemu is actually
considering the alias as machine ABI used to connect the backend object
to the dimm device.
This will require us to keep them consistent so that we can reliably
restore them on migration. In some situations it was currently possible
to create a mismatched configuration and qemu would refuse to restore
the migration stream.
To avoid breaking existing VMs we'll need to keep the old algorithm
though.
Simplify handling of the 'dimm' address element by allowing to specify
the slot number only. This will allow libvirt to allocate slot numbers
before starting qemu.
We dropped support for RHEL-5 vintage Xen a while ago,
but forgot to remove some of the hacks for it.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1386976
We have everything ready. Actually the only limitation was our
check that denied hotplug of vhost-user.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If there is an error hotpluging a net device (for whatever
reason) a rollback operation is performed. However, whilst doing
so various helper functions that are called report errors on
their own. This results in the original error to be overwritten
and thus misleading the user.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Even though using /dev/shm/asdf as the backend, we still need to make
the mapping shared. The original patch forgot to add that parameter.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1392031
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
../../src/qemu/qemu_capabilities.c:3757: error: declaration of
'basename' shadows a global declaration [-Wshadow]
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Function qemuDomainAttachShmemDevice() steals the device data if the
hotplug was successful, but the condition checked for unsuccessful
execution otherwise.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Propagate the selected or default level to qemu if it's supported.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1376009
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This helps in selecting log level of the gluster gfapi, output to stderr.
The option is 'gluster_debug_level', can be tuned by editing
'/etc/libvirt/qemu.conf'
Debug levels ranges 0-9, with 9 being the most verbose, and 0
representing no debugging output. The default is the same as it was
before, which is a level of 4. The current logging levels defined in
the gluster gfapi are:
0 - None
1 - Emergency
2 - Alert
3 - Critical
4 - Error
5 - Warning
6 - Notice
7 - Info
8 - Debug
9 - Trace
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Allow detecting capabilities according to the qemu QMP schema. This is
necessary as sometimes the availability of certain options depends on
the presence of a field in the schema.
This patch adds support for loading the QMP schema when detecting qemu
capabilities and adds a very simple query language to allow traversing
the schema and selecting a certain element from it.
The infrastructure in this patch uses a query path to set a specific
capability flag according to the availability of the given element in
the schema.
Some operations like reboot, save, coreDump, blockStats,
ifaceStats make sense iff domain is running. While it is
technically possible for our test driver to return success
regardless of domain state, we should copy constraints from
other drivers and thus deny these operations over inactive
domains.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1379196
Add check in qemuCheckDiskConfig for an invalid combination
of using the 'scsi' bus for a block 'lun' device and any disk
source format other than 'raw'.
Fixes the behavior when destroying a domain more than once.
VIR_ERR_OPERATION_INVALID should be raised when destroying an
already destroyed domain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit c29e6d4805 cause build failure on RHEL-6:
../../src/qemu/qemu_capabilities.c: In function 'virQEMUCapsIsValid':
../../src/qemu/qemu_capabilities.c:4085: error: declaration of 'ctime'
shadows a global declaration [-Wshadow]
/usr/include/time.h:258: error: shadowed declaration is here [-Wshadow]
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Let's keep all run time validation of cached QEMU capabilities in
virQEMUCapsIsValid and call it whenever we access the cache.
virQEMUCapsInitCached should keep only the checks which do not make
sense once the cache is loaded in memory.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
virQEMUCapsLoadCache loads QEMU capabilities from a file, but strangely
enough it returns the loaded QEMU binary ctime in qemuctime parameter
instead of storing it in qemuCaps.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This is needed in order to migrate a domain with shmem devices as that
is not allowed to migrate.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
QEMU added support for ivshmem-plain and ivshmem-doorbell. Those are
reworked varians of legacy ivshmem that are compatible from the guest
POV, but not from host's POV and have sane specification and handling.
Details about the newer device type can be found in qemu's commit
5400c02b90bb:
http://git.qemu.org/?p=qemu.git;a=commit;h=5400c02b90bb
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We're keeping some things at default and that's not something we want to
do intentionally. Let's save some sensible defaults upfront in order to
avoid having problems later. The details for the defaults (of the newer
implementation) can be found in qemu's commit 5400c02b90bb:
http://git.qemu.org/?p=qemu.git;a=commit;h=5400c02b90bb
Since we are merely saving the defaults it will not change the guest ABI
and thanks to the fact that we're doing it in the PostParse callback it
will not break the ABI stability checks.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The old ivshmem is deprecated in QEMU, so let's use the better
ivshmem-{plain,doorbell} variants instead.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Unlike other migration capabilities, post-copy is also set on the
destination host which means it doesn't disappear once domain is
migrated. As a result of that other functionality which internally uses
migration to a file (virDomainManagedSave, virDomainSave,
virDomainCoreDump) may fail after migration because the post-copy
capability is still set.
https://bugzilla.redhat.com/show_bug.cgi?id=1374718
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
commit 9065cfaa added the ability to disable DNS services for a
libvirt virtual network. If neither DNS nor DHCP is needed for a
network, then we don't need to start dnsmasq, so code was added to
check for this.
Unfortunately, it was written with a great lack of attention to detail
(I can say that, because I was the author), and the loop that checked
if DHCP is needed for the network would never end if the network had
multiple IP addresses and the first <ip> had no <dhcp> subelement
(which would have contained a <range> or <host> subelement, thus
requiring DHCP services).
This patch rewrites the check to be more compact and (more
importantly) finite.
This bug was present in release 2.2.0 and 2.3.0, so will need to be
backported to any relevant maintainence branches.
Reported here:
https://www.redhat.com/archives/libvirt-users/2016-October/msg00032.htmlhttps://www.redhat.com/archives/libvirt-users/2016-October/msg00045.html
If we failed to unlink old dom cfg file, we goto rollback.
But inside rollback, we fogot to unlink the new dom cfg file.
This patch fixes this issue.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Whilst working on another issue, I've noticed that in some
functions we have a local @driver variable among with access to
global @qemu_driver variable. This makes no sense.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Rather than waiting until we've free'd up all the resources, cause the
'workerPool' thread pool to flush as soon as possible during stateCleanup.
Otherwise, it's possible something waiting to run will SEGV such as is the
case during race conditions of simultaneous exiting libvirtd and qemu process.
Resolves the following crash:
[1] crash backtrace: (bt is shortened a bit):
0 0x00007ffff7282f2b in virClassIsDerivedFrom
(klass=0xdeadbeef, parent=0x55555581d650) at util/virobject.c:169
1 0x00007ffff72835fd in virObjectIsClass
(anyobj=0x7fffd024f580, klass=0x55555581d650) at util/virobject.c:365
2 0x00007ffff7283498 in virObjectLock
(anyobj=0x7fffd024f580) at util/virobject.c:317
3 0x00007ffff722f0a3 in virCloseCallbacksUnset
(closeCallbacks=0x7fffd024f580, vm=0x7fffd0194db0,
cb=0x7fffdf1af765 <qemuProcessAutoDestroy>)
at util/virclosecallbacks.c:164
4 0x00007fffdf1afa7b in qemuProcessAutoDestroyRemove
(driver=0x7fffd00f3a60, vm=0x7fffd0194db0) at qemu/qemu_process.c:6365
5 0x00007fffdf1adff1 in qemuProcessStop
(driver=0x7fffd00f3a60, vm=0x7fffd0194db0, reason=VIR_DOMAIN_SHUTOFF_CRASHED,
asyncJob=QEMU_ASYNC_JOB_NONE, flags=0)
at qemu/qemu_process.c:5877
6 0x00007fffdf1f711c in processMonitorEOFEvent
(driver=0x7fffd00f3a60, vm=0x7fffd0194db0) at qemu/qemu_driver.c:4545
7 0x00007fffdf1f7313 in qemuProcessEventHandler
(data=0x555555832710, opaque=0x7fffd00f3a60) at qemu/qemu_driver.c:4589
8 0x00007ffff72a84c4 in virThreadPoolWorker
(opaque=0x555555805da0) at util/virthreadpool.c:167
Thread 1 (Thread 0x7ffff7fb1880 (LWP 494472)):
1 0x00007ffff72a7898 in virCondWait
(c=0x7fffd01c21f8, m=0x7fffd01c21a0) at util/virthread.c:154
2 0x00007ffff72a8a22 in virThreadPoolFree
(pool=0x7fffd01c2160) at util/virthreadpool.c:290
3 0x00007fffdf1edd44 in qemuStateCleanup ()
at qemu/qemu_driver.c:1102
4 0x00007ffff736570a in virStateCleanup ()
at libvirt.c:807
5 0x000055555556f991 in main (argc=1, argv=0x7fffffffe458) at libvirtd.c:1660
Recently, libprlsdk got a separate flag PNA_BRIDGE corresponding to
type=bridge libvirt network interfaces. Let's use it and get rid of
all workarounds previously added to support it.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
We don't support cpu pinning for TCG domains because QEMU runs them in
one thread only. But vcpupin command was able to set them, which
resulted in a failed startup, so make sure that doesn't happen.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
When starting a new domain, we allocate the USB addresses and keep
an address cache in the domain object's private data.
However this data is lost on libvirtd restart.
Also generate the address cache if all the addresses have been
specified, so that devices hotplugged after libvirtd restart
also get theirs assigned.
https://bugzilla.redhat.com/show_bug.cgi?id=1387666
Return 0 instead of 1, so that qemuDomainAttachChrDevice does not
assume the address neeeds to be released on error.
No functional change, since qemuDomainReleaseDeviceAddress has been a noop
for virtio serial addresses since the address cache was removed
in commit 19a148b.
This time do not require an address cache as a parameter.
Simplify qemuDomainAttachChrDeviceAssignAddr to not generate
the virtio serial address cache for devices of other types.
Partially reverts commit 925fa4b.
Commit 19a148b dropped the cache from QEMU's private domain object.
Assume the callers do not have the cache by default and use
a longer name for the internal ones that do.
This makes the shorter 'virDomainVirtioSerialAddrAutoAssign'
name availabe for a function that will not require the cache.
When user tries to resume already running domain (Qemu or LXC)
VIR_ERR_OPERATION_INVALID error should be raised with message that
domain is already running.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1009008
Support for virtio disks was added in commit id 'fceeeda', but not for
SCSI drives. Add the secret for the server when hotplugging a SCSI drive.
No need to make any adjustments for unplug since that's handled during
the qemuDomainDetachDiskDevice call to qemuDomainRemoveDiskDevice in
the qemuDomainDetachDeviceDiskLive switch.
Added a test to/for the command line processing to show the command line
options when adding a SCSI drive for the guest.
https://bugzilla.redhat.com/show_bug.cgi?id=1300776
Complete the implementation of support for TLS encryption on
chardev TCP transports by adding the hotplug ability of a secret
to generate the passwordid for the TLS object for chrdev, RNG,
and redirdev.
Fix up the order of object removal on failure to be the inverse
of the attempted attach (for redirdev, chr, rng) - for each the
tls object was being removed before the chardev backend.
Likewise, add the ability to hot unplug that secret object as well
and be sure the order of unplug matches that inverse order of plug.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add the secret object so the 'passwordid=' can be added if the command line
if there's a secret defined in/on the host for TCP chardev TLS objects.
Preparation for the secret involves adding the secinfo to the char source
device prior to command line processing. There are multiple possibilities
for TCP chardev source backend usage.
Add test for at least a serial chardev as an example.
Need to remove the drive first, then the secobj and/or encobj if they exist.
This is because the drive has a dependency on secobj (or the secret for
the networked storage server) and/or the encobj (or the secret for the
LUKS encrypted volume). Deleting either object first leaves an drive
without it's respective objects.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add in the block I/O throttling length/duration parameter to the command
line if supported. If not supported, fail command creation.
Add the xml2argvtest for testing.
Add support for a duration/length for the bps/iops and friends.
Modify the API in order to add the "blkdeviotune." specific definitions
for the iotune throttling duration/length options
total_bytes_sec_max_length
write_bytes_sec_max_length
read_bytes_sec_max_length
total_iops_sec_max_length
write_iops_sec_max_length
read_iops_sec_max_length
Create a helper to set the bytes/iops iotune default values based on
the current qemu setting for both the live and persistent definitions.
NB: This also fixes an unreported bug where the persistent values for
*_max and size_iops_sec would be set back to 0 if unrelated persistent
values were set.
This patch will also adjust the qemuMonitorJSONSetBlockIoThrottle error
procession so that rather than returning/displaying:
"error: internal error: Unexpected error"
Fetch the actual error message from qemu and display that
Create a macros to hide all the comparisons for each of the fields.
Add a 'continue;' for a compiler hint that we only need to find one
this should be similar enough to the if - elseif - elseif logic.
Commit id '2c32237' added the TLS object removal to the DetachChrDevice
all when it should have been added to the RemoveChrDevice since that's
the norm for similar processing (e.g. disk)
Signed-off-by: John Ferlan <jferlan@redhat.com>
After succesfully reading an outdated caps cache from disk,
calling virQEMUCapsReset did not properly clear out the calculated
host CPU model. This lead to a memory leak when the host CPU model
pointer was overwritten later in virQEMUCapsNewForBinaryInternal.
Introduced by commit 68c70118.
Although the migration port is immediately released in the
finish phase of migration, it was never set in the domain
private object when allocated in the prepare phase. So
libxlDomainMigrationFinish() always released a 0-initialized
migrationPort, leaking any allocated port. After enough
migrations to exhaust the migration port pool, migration would
fail with
error: internal error: Unable to find an unused port in range
'migration' (49152-49216)
Fix it by setting libxlDomainObjPrivate->migrationPort to the
port allocated in the prepare phase. While at it, also fix
leaking an allocated port if the prepare phase fails.
Extended qemuDomainGetStatsVcpu to include the per vcpu halted
indicator if reported by QEMU. The key for new boolean value
has the format "vcpu.<n>.halted".
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Adding a field to the domain's private vcpu object to hold the halted
state information.
Adding two functions in support of the halted state:
- qemuDomainGetVcpuHalted: retrieve the halted state from a
private vcpu object
- qemuDomainRefreshVcpuHalted: obtain the per-vcpu halted states
via qemu monitor and store the results in the private vcpu objects
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Hao QingFeng <haoqf@linux.vnet.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Extended the qemuMonitorCPUInfo with a halted flag. Extract the halted
flag for both text and JSON monitor.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
An upcoming commit will remove the "flag" argument from all the calls
to reserve the next available address|slot, but I don't want to change
the arguments in the hypervisor-agnostic
virDomainPCIAddressReserveNext*() functions, so this patch places a
simple qemu-specific wrapper around those functions - the new
functions don't take a flags arg, but grab it from the device's
info->pciConnectFlags.
There is an existing virDomainPCIAddressReserveNextSlot() which will
reserve all functions of the next available PCI slot. One place in the
qemu PCI address assignment code requires reserving a *single*
function of the next available PCI slot. This patch modifies and
renames virDomainPCIAddressReserveNextSlot() so that it can fulfill
both the original purpose and the need to reserve a single function.
(This is being done so that the abovementioned code in qemu can have
its "kind of open coded" solution replaced with a call to this new
function).
Since TLS was introduced hostwide for libvirt 2.3.0 and a domain
configurable haveTLS was implemented for libvirt 2.4.0, we have to
modify the migratable XML for specific case where the 'tls' attribute
is based on setting from qemu.conf.
The "tlsFromConfig" is libvirt internal attribute and is stored only in
status XML to ensure that when libvirtd is restarted this internal flag
is not lost by the restart.
That flag is used to decide whether we should put *tls* attribute to
migratable XML or not.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Add an optional "tls='yes|no'" attribute for a TCP chardev.
For QEMU, this will allow for disabling the host config setting of the
'chardev_tls' for a domain chardev channel by setting the value to "no" or
to attempt to use a host TLS environment when setting the value to "yes"
when the host config 'chardev_tls' setting is disabled, but a TLS environment
is configured via either the host config 'chardev_tls_x509_cert_dir' or
'default_tls_x509_cert_dir'
Signed-off-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Currently the union has only one member so remove that union. If there
is a need to add a new type of source for new bus in the future this
will force the author to add a union and properly check bus type before
any access to union member.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Commit id '2c322378' missed the nuance that the rng backend could be
using a TCP chardev and if TLS is enabled on the host, thus will need
to have the TLS object added.
Commit id '2c322378' missed the nuance that the redirdev backend could
be using a TCP chardev and if TLS is enabled on the host, thus will need
to have the TLS object added.
Rather than VIR_ALLOC() the data, use virDomainChrSourceDefNew in order
to get the private data if necessary.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Use a pointer and the virDomainChrSourceDefNew() function in order to
allocate the structure for _virDomainRedirdevDef.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Use a pointer and the virDomainChrSourceDefNew() function in order to
allocate the structure for _virDomainSmartcardDef.
Signed-off-by: John Ferlan <jferlan@redhat.com>
instead of:
virBufferAdd(buf, "arg1,");
virBufferAdd(buf, "arg2");
lets have:
virBufferAdd(buf, "arg1");
virBufferAdd(buf, ",arg2");
Because it's better. Consider we want to add conditionally arg3.
With this change, it's simple:
if (cond)
virBufferAdd(buf, ",arg3");
with current code there might be a comma hanging at EOL.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
from virDomainDefPtr to virDomainObjPtr so that the function has
access to the other parts of the virDomainObjPtr. Take advantage of
this by removing the "priv" arg and retrieving it from the
virDomainObjPtr instead.
No functional change.
For some reason the values of memballoon model are set using an
anonymous enum, making it impossible to perform nice tricks like
demanding there are cases for all possible values in a switch. This
patch turns the anonymous enum into virDomainMemballoonModel.
More occurences of repeatedly dereferencing the same pointer stored in
an array are replaced with the definition of a temporary pointer that
is then used directly. No functional change.
Commit id '5f2a132786' should have placed the data in the host source
def structure since that's also used by smartcard, redirdev, and rng in
order to provide a backend tcp channel. The data in the private structure
will be necessary in order to provide the secret properly.
This also renames the previous names from "Chardev" to "ChrSource" for
the private data structures and API's
Change the virDomainChrDef to use a pointer to 'source' and allocate
that pointer during virDomainChrDefNew.
This has tremendous "fallout" in the rest of the code which mainly
has to change source.$field to source->$field.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When hotplugging networks with ancient QEMUs not supporting
QEMU_CAPS_NETDEV, we use space instead of a comma as the separator
between the network type and other options.
Except for "user", all the network types pass other options
and use up the first separator by the time we get to the section
that adds the alias (or vlan for QEMUs without CAPS_NETDEV).
Since the alias/vlan is mandatory, convert all preceding code to add
the separator at the end, removing the need to rewrite type_sep for
all types but NET_TYPE_USER.
Absent driver name attribute is invalid xml. Which in turn makes
unusable 'virsh edit' for example. The value does not make
much sense and ignored on input so nobody will hurt.
vz sdk supports setting serial number only for disk devices.
Getting serial upon cdrom(for example) is error however
setting is just ignored. Let's check for disk device
explicitly for clarity in both cases.
Setting serial number for other devices is ignored
with an info note just as before.
We need usual conversion from "" to NULL in direction
vz sdk -> libvirt, because "" is not valid for libvirt
and "" means unspecifiend in vz sdk which is NULL for libvirt.
New line character in name of network is now forbidden because it
mess virsh output and can be confusing for users. Validation of
name is done in network driver, after parsing XML to avoid
problems with disappeared network which was already created with
new-line char in name.
Closes-Bug: https://bugzilla.redhat.com/show_bug.cgi?id=818064
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
New util function virXMLCheckIllegalChars is now used to test if
parsed network contains illegal char '/' in it's name.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This new function can be used to check if e.g. name of XML
node don't contains forbidden chars like "/" or "\n".
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Modeled after the qemuDomainHostdevPrivatePtr (commit id '27726d8c'),
create a privateData pointer in the _virDomainChardevDef to allow storage
of private data for a hypervisor in order to at least temporarily store
secret data for usage during qemuBuildCommandLine.
NB: Since the qemu_parse_command (qemuParseCommandLine) code is not
expecting to restore the secret data, there's no need to add code
code to handle this new structure there.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a new qemu.conf variables to store the UUID for the secret that could
be used to present credentials to access the TLS chardev. Since this will
be a server level and it's possible to use some sort of default, introduce
both the default and chardev logic at the same time making the setting of
the chardev check for it's own value, then if not present checking whether
the default value had been set.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When converting a domain xml containing a CDROM device without
any attached source, don't add a target=(null) to the libxl config
disk definition: xen doesn't like it at all and would fail to start
the domain.
There was inconsistency between alias used to create tls-creds-x509
object and alias used to link that object to chardev while hotpluging.
Hotplug ends with this error:
error: Failed to detach device from channel-tcp.xml
error: internal error: unable to execute QEMU command 'chardev-add':
No TLS credentials with id 'objcharchannel3_tls0'
In XML we have for example alias "serial0", but on qemu command line we
generate "charserial0".
The issue was that code, that creates QMP command to hotplug chardev
devices uses only the second alias "charserial0" and that alias is also
used to link the tls-creds-x509 object.
This patch unifies the aliases for tls-creds-x509 to be always generated
from "charserial0".
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Instead of typing the prefix every time we want to append parameters
to qemu command line use a variable that contains prefixed alias.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
We need to make sure that the chardev is TCP. Without this check we
may access different part of union and corrupt pointers.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The code is entirely correct, but it still managed to trip me
up when I first ran into it because I did not realize right away
that VIR_PCI_CONNECT_TYPES_ENDPOINT was not a single flag, but
rather a mask including both VIR_PCI_CONNECT_TYPE_PCI_DEVICE and
VIR_PCI_CONNECT_TYPE_PCIE_DEVICE.
In order to save the next distracted traveler in PCI Address Land
some time, document this fact with a comment. Add a test case for
the behavior as well.
A pci-bridge has *almost* the same rules as a legacy PCI endpoint
device for where it can be automatically connected, and until now both
had been considered identical. There is one pairing that is okay when
specifically requested by the user (i.e. manual assignment), but we
want to avoid it when auto-assigning addresses - plugging a pci-bridge
directly into pcie-root (it is cleaner to plug in a dmi-to-pci-bridge,
then plug the pci-bridge into that).
In order to allow that difference, this patch makes a separate
CONNECT_TYPE for pci-bridge, and uses it to restrict auto-assigned
addresses for pci-bridges to be only on pci-root, pci-expander-bus,
dmi-to-pci-bridge, or on another pci-bridge.
NB: As with other discouraged-but-seem-to-work configurations
(e.g. plugging a legacy PCI device into a pcie-root-port) if someone
*really* wants to, they can still force a pci-bridge to be plugged
into pcie-root (by manually specifying its PCI address.)
https://bugzilla.redhat.com/show_bug.cgi?id=1357416
Rather than return a 0 or -1 and the *result string, return just the result
string to the caller. Alter all the callers to handle the different return.
As a side effect or result of this, it's much clearer that we cannot just
assign the returned string into the scsi_host wwnn, wwpn, and fabric_wwn
fields - rather we should fetch a temporary string, then as long as our
fetch was good, VIR_FREE what may have been there, and STEAL what we just got.
This fixes a memory leak in the virNodeDeviceCreateXML code path through
find_new_device and nodeDeviceLookupSCSIHostByWWN which will continually
call nodeDeviceSysfsGetSCSIHostCaps until the expected wwnn/wwpn is found
in the device object capabilities.
Signed-off-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1366108
There are couple of things that needs to be done in order to
allow vhost-user hotplug. Firstly, vhost-user requires a chardev
which is connected to vhost-user bridge and through which qemu
communicates with the bridge (no acutal guest traffic is sent
through there, just some metadata). In order to generate proper
chardev alias, we must assign device alias way sooner.
Then, because we are plugging the chardev first, we need to do
the proper undo if something fails - that is remove netdev too.
We don't want anything to be left over in case attach fails at
some point.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1366505
So far, this function lacked support for
VIR_DOMAIN_NET_TYPE_VHOSTUSER leaving callers to hack around the
problem by constructing the command line on their own. This is
not ideal as it blocks hot plug support.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently, what we do for vhost-user network is generate the
following part of command line:
-netdev type=vhost-user,id=hostnet0,chardev=charnet0
There's no need for 'type=' it is the default. Drop it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There's no need to reinvent the wheel here. We already have a
function to format virDomainChrSourceDefPtr. It's called
qemuBuildChrChardevStr(). Use that instead of some dummy
virBufferAsprintf().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This alone makes not much sense. But the aim is to reuse this
function in qemuBuildVhostuserCommandLine() where 'nowait' is not
supported for vhost-user devices.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We tend to prevent using 'default' in switches. And it is for a
good reason - control may end up in paths we wouldn't want for
new values. In this specific case, if qemuBuildHostNetStr is
called over VIR_DOMAIN_NET_TYPE_VHOSTUSER it would produce
meaningless output. Fortunately, there no such call yet.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Instead of blindly claim support for hot-plugging of every
interface type out there we should copy approach we have for
device types: white listing supported types and explicitly error
out on unsupported ones.
For instance, trying to hotplug vhostuser interface results in
nothing usable from guest currently. vhostuser typed interfaces
require additional work on our side.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The idea is to have function that does some checking at its
beginning and then have one big switch for all the interface
types it supports.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The idea is to have function that does some checking of the
arguments at its beginning and then have one big switch for all
the interface types it supports. Each one of them generating the
corresponding part of the command line.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The idea is to have function that does some checking of the
arguments at its beginning and then have one big switch for all
the interface types it supports. Each one of them generating the
corresponding part of the command line.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This function for some weird reason returns integer instead of
virDomainNetType type. It is important to return the correct type
so that we know what values we can expect.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Not every system out there has syslog, that's why we check for it
in our configure script. However, in 640b58abdf while fixing
another issue, some variables and functions are called that are
defined only when syslog.h is present. But these function
calls/variables were not guarded by #ifdef-s.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
qemuBuildSmbiosBiosStr and qemuBuildSmbiosSystemStr return NULL if
there's nothing to format on the commandline. Reporting errors from
buffer creation doesn't make sense since it would be ignored.
This initially started as a fix of some debug printing in
virCgroupDetect. However it turned out that other places suffer
from the similar problem. While dealing with pids, esp. in cases
where we cannot use pid_t for ABI stability reasons, we often
chose an unsigned integer type. This makes no sense as pid_t is
signed.
Also, new syntax-check rule is introduced so we won't repeat this
mistake.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There are two video devices with models without VGA compatibility mode.
They are primary used as secondary video devices, but in some cases it
is required to use them also as primary video devices.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This improves commit 706b5b6277 in a way that we check qemu capabilities
instead of what architecture we are running on to detect whether we can
use *virtio-vga* model or not. This is not a case only for arm/aarch64.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Commit 21373feb added support for primary virtio-vga device but it was
checking for virtio-gpu. Let's check for existence of virtio-vga if we
want to use it.
Virtio video device is currently represented by three different models
*virtio-gpu-device*, *virtio-gpu-pci* and *virtio-vga*. The first two
models are tied together and if virtio video devices is compiled in they
both exist. However, the *virtio-vga* model doesn't have to exist on
some architectures even if the first two models exist. So we cannot
group all three together.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Before this patch we've checked qemu capabilities for video devices
only while constructing qemu command line using "-device" option.
Since we support qemu only if "-device" option is present we can use
the same capabilities to check also video devices while using "-vga"
option to construct qemu command line.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
All definition validation that doesn't depend on qemu capabilities
and was allowed previously as valid definition should be placed into
qemuDomainDefValidate.
The check whether video type is supported or not was based on an enum
that translates type into model. Use switch to ensure that if new
video type is added, it will be properly handled.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
We generally uses QEMU_CAPS_DEVICE_$NAME to probe for existence of some
device and QEMU_CAPS_$NAME_$PROP to probe for existence of some property
of that device.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
If QEMU in question supports QMP, this capability is set if
QEMU_CAPS_DEVICE_QXL was set based on existence of "-device qxl". If
libvirt needs to parse *help*, because there is no QMP support, it
checks for existence of "-vga qxl", but it also parses output of
"-device ?" and sets QEMU_CAPS_DEVICE_QXL too.
Now that libvirt supports only QEMU that has "-device" implemented it's
safe to drop this capability and stop using it.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This patch simplifies QEMU capabilities for QXL video device. QEMU
exposes this device as *qxl-vga* and *qxl* and they are both the same
device with the same set of parameters, the only difference is that
*qxl-vga* includes VGA compatibility.
Based on QEMU code they are tied together so it's safe to check only for
presence of only one of them.
This patch also removes an invalid test case "video-qxl-sec-nodevice"
where there is only *qxl-vga* device and *qxl* device is not present.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Qemu supports *xen* video device only with XEN and this code was part
of xenner code. We dropped support for xenner in commit de9be0a.
Before this patch if you used 'xen' video type you ended up with
domain without any video device at all. Now we don't allow to start
such domain.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
It was never safe anyway and as such shouldn't have been enabled in the
first place. Future patches will allow hot-(un)pluging of some ivshmem
devices as a workaround.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Due to the switch of parameters in a call to virDomainShmemDefEquals()
no device was found when looking for device with all the information
except address. Also fix the indentation.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If the last event callback is unregistered while the event loop is
dispatching, it is only marked as deleted, but not removed. The number
of callbacks is more than zero in that case, so the timer is not
removed. Because it can be removed in this function now (but also
accessed afterwards so that we set 'isDispatching = false' and have it
locked), we need to temporarily increase the reference counter of the
state for the duration of this function.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is a repeating pattern of code that removes the timer if it's not
needed. So let's move it to a new function. We'll also use it later.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There should be one more reference because it is being kept in the list
of callbacks as an opaque. We also unref it properly using
virObjectFreeCallback.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Currently Libvirt allows attempts to migrate read only disks. Qemu
cannot handle this as read only disks cannot be written to on the
destination system. The end result is a cryptic error message and a
failed migration.
This patch causes migration to fail earlier and provides a meaningful
error message stating that migrating read only disks is not supported.
Signed-off-by: Corey S. McQuay <csmcquay@linux.vnet.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Make sure that the topology results into a sane number of cpus (up to
UINT_MAX) so that it can be sanely compared to the vcpu count of the VM.
Additionally the helper added in this patch allows to fetch the total
number the topology results to so that it does not have to be
reimplemented later.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1378290
In both virLogParseOutput and virLogParseFilter, rather than returning
NULL, goto cleanup since it's possible that for each the first condition
passes, but the || condition doesn't and thus we leak memory.
The dnsmasq man page recommends that dhcp-authoritative "should be
set when dnsmasq is definitely the only DHCP server on a network".
This is the case for libvirt-managed virtual networks.
The effect of this is that VMs that fail to renew their DHCP lease
in time (e.g. if the VM or host is suspended) will be able to
re-acquire the lease even if it's expired, unless the IP address has
been taken by some other host. This avoids various annoyances caused
by changing VM IP addresses.
Sometimes virObjectEventStateFlush can be called without timer (if the
last event was unregistered right when the timer fired). There is a
check for timer == -1, but that triggers warning and other log messages,
which is unnecessary.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Handling of outputs and filters has been changed in a way that splits
parsing and defining. Do the same thing for logging priority as well, this
however, doesn't need much of a preparation.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This is mainly virLogAddOutputTo* which were replaced by virLogNewOutputTo* and
the previously poorly named ones virLogParseAndDefine* functions. All of these
are unnecessary now, since all the original callers were transparently switched
to the new model of separate parsing and defining logic.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Similar to outputs, parser should do parsing only, thus the 'define' logic
is going to be stripped from virLogParseAndDefineFilters by replacing calls to
this method to virLogSetFilters instead.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since virLogParseAndDefineOutputs is going to be stripped from 'output defining'
logic, replace all relevant occurrences with virLogSetOutputs call to make the
change transparent to all original callers (daemons mostly).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This method will eventually replace virLogParseAndDefineFilters which
currently does both parsing and defining.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This API is the entry point to output modification of the logger. Currently,
everything is done by virLogParseAndDefineOutputs. Parsing and defining will be
split into two operations both handled by this method transparently.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Abstraction added over parsing a single filter. The method parses potentially a
set of logging filters, while adding each filter logging object to a
caller-provided array.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Another abstraction added on the top of parsing a single logging output. This
method takes and parses the whole set of outputs, adding each single output
that has already been parsed into a caller-provided array. If the user-supplied
string contained duplicate outputs, only the last occurrence is taken into
account (all the others are removed from the list), so we silently avoid
duplicate logs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Same as for outputs, introduce a new method, that is basically the same as
virLogParseAndDefineFilter with the difference that it does not define the
filter. It rather returns a newly created object that needs to be inserted into
a list and then defined separately.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Introduce a method to parse an individual logging output. The difference
compared to the virLogParseAndDefineOutput is that this method does not define
the output, instead it makes use of the virLogNewOutputTo* methods introduced
in the previous patch and just returns the virLogOutput object that has to be
added to a list of object which then can be defined as a whole via
virLogDefineOutputs. The idea remains still the same - split parsing and
defining of the logging primitives (outputs, filters).
Additionally, since virLogNewOutputTo* methods are now finally used,
ATTRIBUTE_UNUSED can be successfully removed from the methods' definitions,
since that was just to avoid compiler complaints about unused static functions.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Now that we're in the critical section, syslog connection can be re-opened
by issuing openlog, which is something that cannot be done beforehand, since
syslog keeps its file descriptor private and changing the tag earlier might
introduce a log inconsistency if something went wrong with preparing a new set
of logging outputs in order to replace the existing one.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Continuing with the effort to split output parsing and defining, these new
functions return a logging object reference instead of defining the output.
Eventually, these functions will replace the existing ones (virLogAddOutputTo*)
which will then be dropped.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Prepare a method that only defines a set of filters. It takes a list of
filters, preferably created by virLogParseFilters. The original set of filters
is reset and replaced by the new user-provided set of filters.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Prepare a method that only defines a set of outputs. It takes a list of
outputs, preferably created by virLogParseOutputs. The original set of outputs
is reset and replaced by the new user-provided set of outputs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Outputs are a bit trickier than filters, since the user(config)-specified
set of outputs can contain duplicates. That would lead to logging the same
message twice. For compatibility reasons, we cannot just error out and forbid
the daemon to start if we find duplicate outputs which do not make sense.
Instead, we could silently take into account only the last occurrence of the
duplicate output and remove all the previous ones, so that the logger will not
try to use them when it is looping over all of its registered outputs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In order to later split output parsing and output defining, introduce a new
function which will create a new virLogOutput object which the parser will
insert into a list with the list being eventually defined.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
There is really no reason why we could not keep journald's fd within the
journald output object the same way as we do for regular file-based outputs.
By doing this we later won't have to special case the journald-based output
(due to the fd being globally shared) when replacing the existing set of outputs
with a new one. Additionally, by making this change, we don't need the
virLogCloseJournald routine anymore, plain virLogCloseFd will suffice.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Right now virLogParse* functions are doing both parsing and defining of filters
and outputs which should be two separate operations. Since the naming is
apparently a bit poor this patch renames these functions to
virLogParseAndDefine* which eventually will be replaced by virLogSet*.
Additionally, virLogParse{Filter,Output} will be later (after the split) reused,
so that these functions do exactly what the their name suggests.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
During first stage of virlog.c refactor, commit 0b231195 forgot to remove the
macro definition along with its usage.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
If the event is already disabled, then don't bother with setting it
disabled again. Causes unnecessary error on systems that don't support
the feature anyway.
The intel-iommu device has existed since QEMU 2.2.0, but
it was only possible to create it with -device since
QEMU 2.7.0, thanks to:
commit 621d983a1f9051f4cfc3f402569b46b77d8449fc
Author: Marcel Apfelbaum <marcel@redhat.com>
Date: Mon Jun 27 18:38:34 2016 +0300
hw/iommu: enable iommu with -device
Use the standard '-device intel-iommu' to create the IOMMU device.
The legacy '-machine,iommu=on' can still be used.
The libvirt capability check & command line formatting code
is thus broken for all QEMU versions 2.2.0 -> 2.6.0 inclusive.
This fixes it to use iommu=on instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since introduction of chardev hotplug the code was wrong for the UDP
case and basically created a TCP socket instead. Use proper objects and
type for UDP.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1377602
Rather than copy-paste - use a macro
Unfortunately due to how the RNG schema was written keeping the 'value'
and 'value'_max next to each other in the XML causes a schema failure,
so the FORMAT has to write out singly rather than optimizing to write
out both values at once
Signed-off-by: John Ferlan <jferlan@redhat.com>
We're about to add more options, let's avoid having multiple if-then-else
which each try to set up the qemuMonitorJSONMakeCommand call with all the
parameters it knows about.
Instead, use the fact that when a NULL is found in the argument list that
processing of the remaining arguments stops and just have call.
Signed-off-by: John Ferlan <jferlan@redhat.com>
We're about to add 6 new options and it appears (from testing) one cannot
utilize both the shorthand (alias) and (much) longer names for the arguments.
So modify the command builder to use the longer name and of course alter the
test output .args to have the similarly innocuous long name.
Also utilize a macro to build that name makes it so much more visually
appealing and saves a few characters or potential cut-n-paste issues.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When I added support for the pcie-expander-bus controller in commit
bc07251f, I incorrectly thought that it only had a single slot
available. Actually it has 32 slots, just like the root complex aka
pcie-root (the part that I *did* get correct is that unlike pcie-root
a pcie-expander-bus doesn't allow any integrated endpoint devices -
only pcie-root-ports and dmi-to-pci-controllers are allowed).
Reduce some cut-n-paste code by creating common helper. Make use of the
recently added virJSONValueObjectStealArray to grab the devices list as
part of the common code (we we can Free the reply) and return devices for
each of the callers to continue to parse.
NB: This also adds error checking to qemuMonitorJSONDiskNameLookup
Signed-off-by: John Ferlan <jferlan@redhat.com>
Provide the Steal API for any code paths that will desire to grab the
object array and then free it afterwards rather than relying to freeing
the whole chain from the reply.
Rather than use stack allocated state context pointers, let's allocate and
free the state context pointer. In doing so, we'll shrink the code a bit
since many routines perform the same initialization sequence.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since none of the callers check the status, let's just alter it to
a static void.
While we're at it - scrap the local runtime variable and just do the
math in the VIR_DEBUG directly.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If attaching to a qemu process fails after opening the monitor socket
libvirt does not clean up the monitor. As the monitor also holds a
reference to the domain object the qemu attach API basically leaks it.
QEMU also does not interact on a second monitor connection and thus a
further attempt to attach to it would lock up.
Prevent libvirt from leaking the monitor by explicitly closing it.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1378401
Attaching to a existing qemu process allows to get us into a situation
when qemu is new enough to have JSON monitor and new vCPU hotplug but
the json monitor is not used. The vCPU detection code would require it
though. This broke attaching to qemu processes.
Make the condition less strict and just skip the vCPU hotplug detection
if JSON monitor is not available.
Resolves one of the symptoms in:
https://bugzilla.redhat.com/show_bug.cgi?id=1378401
Libvirt, on its own, shouldn't decide whether an expired lease should
stay in the custom leases database or not. It should rather rely on
the 'DEL' event from dnsmasq.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We are about to add 6 new values to fetch. This will put us over the
current limit of 16 (we're at 13 now).
Once there are more than 16 parameters, this will affect existing clients
that attempt to fetch blockiotune config values for the domain from the
remote host since the server side has no mechanism to determine whether
the capability for the emulator exists and thus would attempt to return
all known values from the persistentDef. If attempting to fetch the
blockiotune values from a running domain, the code will check the emulator
capabilities and set maxparams (in qemuDomainGetBlockIoTune) appropriately.
On the client side of the remote connection, it uses this constant in
xdr_remote_domain_get_block_io_tune_ret and virTypedParamsDeserialize
calls, so if a remote server returns more than 16 parameters, then the
client will fail with "Unable to decode message payload".
Signed-off-by: John Ferlan <jferlan@redhat.com>
The REMOTE_DOMAIN_MEMORY_PARAMETERS_MAX was erroneously used in the
remoteDomainBlockStatsFlags and remoteDomainGetBlockIoTune calls. Change
the constant to be the right one.
Fortunately, all 3 are defined as 16.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This breaks vCPU hotplug, because when starting a domain, we
create a copy of domain definition (which becomes live XML) and
during the post parse callbacks we might adjust some tunings so
that vCPU hotplug is possible.
This reverts commit 581b7756af.
This breaks vCPU hotplug, because when starting a domain, we
create a copy of domain definition (which becomes live XML) and
during the post parse callbacks we might adjust some tunings so
that vCPU hotplug is possible.
This reverts commit c0f90799bc.
Certain operations may make the vcpu order information invalid. Since
the order is primarily used to ensure migration compatibility and has
basically no other user benefits, clear the order prior to certain
operations and document that it may be cleared.
All the operations that would clear the order can still be properly
executed by defining a new domain configuration rather than using the
helper APIs.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1370357
virDomainDefSetVcpus was not designed to handle coldplug of vcpus now
that we can set state of vcpus individually.
Introduce qemuDomainSetVcpusConfig that properly handles state changes
of vcpus when coldplugging so that invalid configurations are not
created.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375939
The current code that validates duplicate vcpu order would not work
properly if the order would exceed def->maxvcpus. Limit the order to the
interval described.
The bitmap indexes for the order duplicate check are shifted to 0 since
vcpu order 0 is not allowed. The error message doesn't need such
treating though.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1370360
https://bugzilla.redhat.com/show_bug.cgi?id=1292984
Hold on to your hats, because this is gonna be wild.
In bd3e16a3 I've tried to expose sanlock io_timeout. What I had
not realized (because there is like no documentation for sanlock
at all) was very unusual way their APIs work. Basically, what we
do currently is:
sanlock_add_lockspace_timeout(&ls, io_timeout);
which adds a lockspace to sanlock daemon. One would expect that
io_timeout sets the io_timeout for it. Nah! That's where you are
completely off the tracks. It sets timeout for next lockspace you
will probably add later. Therefore:
sanlock_add_lockspace_timeout(&ls, io_timeout = 10);
/* adds new lockspace with default io_timeout */
sanlock_add_lockspace_timeout(&ls, io_timeout = 20);
/* adds new lockspace with io_timeout = 10 */
sanlock_add_lockspace_timeout(&ls, io_timeout = 40);
/* adds new lockspace with io_timeout = 20 */
And so on. You get the picture.
Fortunately, we don't allow setting io_timeout per domain or per
domain disk. So we just need to set the default used in the very
first step and hope for the best (as all the io_timeout-s used
later will have the same value).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently, we are checking for sanlock_add_lockspace_timeout
which is good for now. But in a subsequent patch we are going to
use sanlock_write_lockspace (which sets an initial value for io
timeout for sanlock). Now, there is no reason to check for both
functions in sanlock library as the sanlock_write_lockspace was
introduced in 2.7 release and the one we are currently checking
for in the 2.5 release. Therefore it is safe to assume presence
of sanlock_add_lockspace_timeout when sanlock_write_lockspace
is detected.
Moreover, the macro for conditional compilation is renamed to
HAVE_SANLOCK_IO_TIMEOUT (as it now encapsulates two functions).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If this reminds you of a commit message from around a year ago, it's
41c2aa729f and yes, we're dealing with
"the same thing" again. Or f309db1f4d and
it's similar.
There is a logic in place that if there is no real need for
memory-backend-file, qemuBuildMemoryBackendStr() returns 0. However
that wasn't the case with hugepage backing. The reason for that was
that we abused the 'pagesize' variable for storing that information, but
we should rather have a separate one that specifies whether we really
need the new object for hugepage backing. And that variable should be
set only if this particular NUMA cell needs special treatment WRT
hugepages.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1372153
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Due to a copy and paste error, the scheduler 'cap' parameter
was over-writing the 'weight' parameter when preparing the
return parameters in libxlDomainGetSchedulerParametersFlags.
As a result, the scheduler weight was never shown when getting
schedinfo and setting the weight failed as well
virsh schedinfo testvm
Scheduler : credit
cap : 0
virsh schedinfo testvm --cap 50 --weight 500
Scheduler : credit
error: invalid scheduler option: weight
The obvious fix is to assign the 'caps' parameter to the correct
item in the parameter list.
Reported-by: Volo M. <vm@vovs.net>
Add support for formating/parsing libxl channels.
Syntax on xen libxl goes as following:
channel=["connection=pty|socket,path=/path/to/socket,name=XXX",...]
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Acked-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
And allow libxl to handle channel element which creates a Xen
console visible to the guest as a low-bandwitdh communication
channel. If type is PTY we also fetch the tty after boot using
libxl_channel_getinfo to fetch the tty path. On socket case,
we autogenerate a path if not specified in the XML. Path autogenerated
is slightly different from qemu driver: qemu stores also on
"channels/target" but it creates then a directory per domain with
each channel target name. libxl doesn't appear to have a clear
definition of private files associated with each domain, so for
simplicity we do it slightly different. On qemu each autogenerated
channel goes like:
channels/target/<domain-name>/<target name>
Whereas for libxl:
channels/target/<domain-name>-<target name>
Should note that if path is not specified it won't persist,
existing only on live XML, unless user had initially specified it.
Since support for libxl channels only came on Xen >= 4.5 we therefore
need to conditionally compile it with LIBXL_HAVE_DEVICE_CHANNEL.
After this patch and having a qemu guest agent:
$ cat domain.xml | grep -a1 channel | head -n 5 | tail -n 4
<channel type='unix'>
<source mode='bind' path='/tmp/channel'/>
<target type='xen' name='org.qemu.guest_agent.0'/>
</channel>
$ virsh create domain.xml
$ echo '{"execute":"guest-network-get-interfaces"}' | socat
stdio,ignoreeof unix-connect:/tmp/channel
{"execute":"guest-network-get-interfaces"}
{"return": [{"name": "lo", "ip-addresses": [{"ip-address-type": "ipv4",
"ip-address": "127.0.0.1", "prefix": 8}, {"ip-address-type": "ipv6",
"ip-address": "::1", "prefix": 128}], "hardware-address":
"00:00:00:00:00:00"}, {"name": "eth0", "ip-addresses":
[{"ip-address-type": "ipv4", "ip-address": "10.100.0.6", "prefix": 24},
{"ip-address-type": "ipv6", "ip-address": "fe80::216:3eff:fe40:88eb",
"prefix": 64}], "hardware-address": "00:16:3e:40:88:eb"}, {"name":
"sit0"}]}
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
So far only guestfwd and virtio were supported. Add an additional
for Xen as libxl channels create a Xen console visible to the guest.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
The qemucapsprobe helper calls virQEMUCapsNewForBinaryInternal with
caps == NULL, causing the following crash:
Program received signal SIGSEGV, Segmentation fault.
#0 0x00007ffff788775f in virQEMUCapsInitHostCPUModel
(qemuCaps=qemuCaps@entry=0x649680, host=host@entry=0x10) at
src/qemu/qemu_capabilities.c:2969
#1 0x00007ffff7889dbf in virQEMUCapsNewForBinaryInternal
(caps=caps@entry=0x0, binary=<optimized out>,
libDir=libDir@entry=0x4033f6 "/tmp", cacheDir=cacheDir@entry=0x0,
runUid=runUid@entry=4294967295, runGid=runGid@entry=4294967295,
qmpOnly=true) at src/qemu/qemu_capabilities.c:4039
#2 0x0000000000401702 in main (argc=2, argv=0x7fffffffd968) at
tests/qemucapsprobe.c:73
Caused by v2.2.0-182-g68c7011.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1368417
So far, when it comes to 'virsh update-device --config' of disks
we are limiting ourselves for just the disk source update and
just for CDROMs and floppies. This makes no sense. Especially if
you look around and see that we already allow full update to
graphics and net devices. So let's just take whatever XML user
wants to have there and replace our internal definition with it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
libxl events are delivered to libvirt via the libxlDomainEventHandler
callback registered with libxl. Documenation in
$xensrc/tools/libxl/libxl_event.h states that the callback "may occur
on any thread in which the application calls libxl". This can result
in deadlock since many of the libvirt callees of libxl hold a lock on
the virDomainObj they are working on. When the callback is invoked, it
attempts to find a virDomainObj corresponding to the domain ID provided
by libxl. Searching the domain obj list results in locking each obj
before checking if it is active, and its ID equals the requested ID.
Deadlock is possible when attempting to lock an obj that is already
locked further up the call stack. Indeed, Max Ustermann reported an
instance of this deadlock
https://www.redhat.com/archives/libvir-list/2015-November/msg00130.html
Guido Rossmueller also recently stumbled across it
https://www.redhat.com/archives/libvir-list/2016-September/msg00287.html
Fix the deadlock by moving the lookup of virDomainObj to the
libxlDomainShutdownThread. After this patch, libxl events are
enqueued on the libvirt side and processed by dedicated thread,
avoiding the described deadlock.
Reported-by: Max Ustermann <ustermann78@web.de>
Reported-by: Guido Rossmueller <Guido.Rossmueller@gdata.de>
enum types are unsigned and the qemuGetCompressionProgram
function can return -1 on error. It is therefore inappropriate
to return an enum type. This fixes a build error where the
internal 'ret' variable was used in a comparison with -1
../../src/qemu/qemu_driver.c: In function 'qemuGetCompressionProgram':
../../src/qemu/qemu_driver.c:3280:5: error: comparison of unsigned expression < 0 is always false [-Werror=type-limits]
../../src/qemu/qemu_driver.c:3289:5: error: comparison of unsigned expression < 0 is always false [-Werror=type-limits]
cc1: all warnings being treated as errors
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When creating a copy of virDomainDef we save ourselves the
trouble of writing deep-copy functions and just format and parse
back domain/device XML. However, the XML we are parsing was
already fully formatted - there is no reason to run post parse
callbacks (which fill in blanks - there are none!).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is an internal flag that prevents our two entry points to
XML parsing (virDomainDefParse and virDomainDeviceDefParse) from
running post parse callbacks. This is expected to be used in
cases when we already have full domain/device XML and we are just
parsing it back (i.e. virDomainDefCopy or virDomainDeviceDefCopy)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Just like we did two commits ago, don't try to fetch capabilities
for non-existing binary. Re-use the ones we have for running
domain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Just like we did two commits ago, don't try to fetch capabilities
for non-existing binary. Re-use the ones we have for running
domain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We can't rely on def->emulator path. It may be provided by user
as we give them opportunity to provide their own XML for
migration. Therefore the path may point to just whatever binary
(or even to a non-existent file). Moreover, this path is meant
for destination, but the capabilities lookup is done on source.
What we can do is to assume same capabilities for post parse
callbacks as the running domain has. They will be used just to
add some default models/controllers/devices/... anyway.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Just like virDomainDefPostParseCallback has gained new
parseOpaque argument, we need to follow the logic with
virDomainDeviceDefPostParse.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We want to pass the proper opaque pointer instead of NULL to
virDomainDefParse and subsequently virDomainDefParseNode too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We want to pass the proper opaque pointer instead of NULL to
virDomainDefParseXML and subsequently virDomainDefPostParse too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Some callers might want to pass yet another pointer to opaque
data to post parse callbacks. The driver generic one is not
enough because two threads executing post parse callback might
want to see different data (e.g. domain object pointer that
domain def belongs to).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Based upon a patch from Chen Hanxiao <chenhanxiao@gmail.com>, rather than
need to call virFindFileInPath twice, let's just save the path and pass it
along with the compressed type. (NB: the second call would be in virExec as
called from virCommandRunAsync which is called from qemuMigrationToFile
using the argument 'compressor' which up to this point would be the string
from the cfg file that isn't the fully qualified path).
Since we now have the path, we can remove qemuCompressProgramName which
would return NULL or the string representation of the compress type.
There's only one caller and the code is duplicitous just converting the
recently converted cfg image name back into it's string value in order to
get/find the path to the image. A subsequent patch can return this path.
Let's do some more code reuse - there are 3 other callers that care to
check/get the compress program. Each of those though cares whether the
requested cfg image is valid and exists. So, add a parameter to handle
those cases.
NB: We won't need to initialize the returned value in the case where
the cfg image doesn't exist since the called program will handle that.
Add a new parameter 'styleFormat' to be used when printing the
warning message so that it's "clearer" what style of compression
call caused the error. Add that style to both messages as a paremter.
Also a VIR_WARN error message doesn't need to be translated
(e.g. inside _()), so remove the need for the translation.
There's only one caller now anyway... Besides it's just a shell for
getting the compress type. Subsequent patches will return the path
to the compression program.
Split out the guts of getCompressionType to perform the same functionality
in the new helper program with a subsequent patch goal to be reusable for
other callers making similar checks/calls to ensure the compression type
is valid and that the compression program cannot be found.
If passing an empty usbdevice_list to libxl, qemu will always get an
-usb parameter for HVM guests with only non-USB input devices. This
causes qemu to crash when passing pvusb device on HVM guests.
The solution is to allocate the list only when an item to put in it
is found.
Both cpuCompare* APIs are renamed to virCPUCompare*. And they should now
work for any guest CPU definition, i.e., even for host-passthrough
(trivial) and host-model CPUs. The implementation in x86 driver is
enhanced to provide a hint about -noTSX Broadwell and Haswell models
when appropriate.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function is similar to virCPUDataCheckFeature, but it works directly
on CPU definition rather than requiring it to be transformed into CPU
data first.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The API is supposed to make sure the provided CPU definition does not
use a CPU model which is not supported by the hypervisor (if at all
possible, of course).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Keeping nfeatures_max set to 0 while nfeatures > 0 and some features are
already stored in features array is just asking for problems once we
want to add a new feature into the array.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The reworked API is now called virCPUUpdate and it should change the
provided CPU definition into a one which can be consumed by the QEMU
command line builder:
- host-passthrough remains unchanged
- host-model is turned into custom CPU with a model and features
copied from host
- custom CPU with minimum match is converted similarly to host-model
- optional features are updated according to host's CPU
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
x86ModelFromCPU is used to provide CPUID data for features matching
@policy. This patch allows callers to set @policy to -1 to get combined
CPUID for all CPU features (including those implicitly provided a CPU
model) specified in CPU def.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The domain capabilities XML is capable of showing whether each guest CPU
mode is supported or not with a possibility to provide additional
details. This patch enhances host-model capability to advertise the
exact CPU model which will be used as a host-model:
<cpu>
...
<mode name='host-model' supported='yes'>
<model fallback='allow'>Broadwell</model>
<vendor>Intel</vendor>
<feature policy='disable' name='aes'/>
<feature policy='require' name='vmx'/>
</mode>
...
</cpu>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The ARM CPU driver wrongly reported host CPU model as "host", which made
host-model to be just an alias for host-passthrough. Let's drop this
insanity.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Host capabilities provide libvirt's view of the host CPU, but for a
useful support for host-model CPUs we really need a hypervisor's view of
the CPU. And since the view can be differ with emulator, qemu
capabilities is the best place to store the host CPU model.
This patch just copies the CPU model from host capabilities, but this
will change in the future.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function filters all CPU features through a given callback while
copying CPU model related parts of a CPU definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function moves CPU model related parts from one CPU definition to
another. It can be used to avoid unnecessary copies from a temporary CPU
definitions which will be freed anyway.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Useful for copying a CPU definition without model related parts (i.e.,
without model name, feature list, vendor).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In case a hypervisor is able to tell us a list of supported CPU models
and whether each CPU models can be used on the current host, we can
propagate this to domain capabilities. This is a better alternative
to calling virConnectCompareCPU for each supported CPU model.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Listing all CPU models supported by QEMU in domain capabilities makes
little sense when libvirt will refuse any model it doesn't know about.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Some CPU drivers (such as arm) do not provide list of CPUs libvirt
supports and just pass any CPU model from domain XML directly to QEMU.
Such driver need to return models == NULL and success from cpuGetModels.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemu_command.c should deal with translating our domain definition into a
QEMU command line and nothing else.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The list of supported CPU models in domain capabilities is stored in
virDomainCapsCPUModels. Let's use the same object for storing CPU models
in QEMU capabilities.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The patch adds <cpu> element to domain capabilities XML:
<cpu>
<mode name='host-passthrough' supported='yes'/>
<mode name='host-model' supported='yes'/>
<mode name='custom' supported='yes'>
<model>Broadwell</model>
<model>Broadwell-noTSX</model>
...
</mode>
</cpu>
Applications can use it to inspect what CPU configuration modes are
supported for a specific combination of domain type, emulator binary,
guest architecture and machine type.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Our internal APIs mostly use virArch rather than strings. Switching
cpuGetModels to virArch will save us from unnecessary conversions in the
future.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
By default, virt-manager (and likely other libvirt-based apps) sets
the VIR_MIGRATE_PERSIST_DEST flag when invoking the migrate API, which
fails in a Xen setup since the libxl driver does not support the flag.
Persisting a domain is a trivial task in the grand scheme of migration,
so be nice to libvirt apps and add support for VIR_MIGRATE_PERSIST_DEST
in the libxl driver.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
We have a few of senarios that libvirtd would invoke qemuProcessStop
and leave a "shutting down" in /var/log/libvirt/qemu/$DOMAIN.log.
The shutoff reason showing in debug log is also very important
for us to know why VM shutting down in domain log,
as we seldom enable debug log of libvirtd.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1322717
During offline migration, no storage is copied. Nor disks, nor
NVRAM file, nor anything. We use qemu for that and because domain
is not running there's nobody to copy that for us.
We should document this to avoid confusing users.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Calling virDomainGetEmulatorPinInfo on a live VM with automatic NUMA
pinning and VIR_DOMAIN_AFFECT_CONFIG would return the automatic pinning
data in some cases which is bogus. Use the autoCpuset property only when
called on a live definition.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1365779
Calling virDomainGetVcpuPinInfo on a live VM with automatic NUMA pinning
and VIR_DOMAIN_AFFECT_CONFIG would return the automatic pinning data
in some cases which is bogus. Use the autoCpuset property only when
called on a live definition.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1365779
Sometimes adding a separate variable to access vm->privateData is not
necessary. Add a macro that will do the typecasting rather than having
to add a temp variable to force the compiler to typecast it.
Old libvirt represents
<graphics type='spice'>
<listen type='none'/>
</graphics>
as
<graphics type='spice' autoport='no'/>
In this mode, QEMU doesn't listen for SPICE connection anywhere and
clients have to use virDomainOpenGraphics* APIs to attach to the domain.
That is, the client has to run on the same host where the domains runs
and it's impossible to tell the client to reconnect to the destination
QEMU during migration (unless there is some kind of proxy on the host).
While current libvirt correctly ignores such graphics devices when
creating graphics migration cookie, old libvirt just sends
<graphics type='spice' port='0' listen='0.0.0.0' tlsPort='-1'/>
in the cookie. After seeing this cookie, we happily would call
client_migrate_info QMP command and wait for SPICE_MIGRATE_COMPLETED
event, which is quite pointless since the doesn't know where to connecti
anyway. We should just ignore such cookies.
https://bugzilla.redhat.com/show_bug.cgi?id=1376083
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Checking if a domain's definition or if it is active before we got a job
is pointless since the domain might have changed in the meantime.
Luckily libvirtd didn't crash when the API tried to talk to an inactive
domain:
debug : qemuDomainObjBeginJobInternal:2914 : Started job: modify
(async=none vm=0x7f8f340140c0 name=ble)
debug : qemuDomainObjEnterMonitorInternal:3137 : Entering monitor
(mon=(nil) vm=0x7f8f340140c0 name=ble)
warning : virObjectLock:319 : Object (nil) ((unknown)) is not a
virObjectLockable instance
debug : qemuMonitorOpenGraphics:3505 : protocol=spice fd=27
fdname=graphicsfd skipauth=1
error : qemuMonitorOpenGraphics:3508 : invalid argument: monitor must
not be NULL
debug : qemuDomainObjExitMonitorInternal:3160 : Exited monitor
(mon=(nil) vm=0x7f8f340140c0 name=ble)
debug : qemuDomainObjEndJob:3068 : Stopping job: modify (async=none
vm=0x7f8f340140c0 name=ble)
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We can receive NULL as sync reply in two situations. First
is garbage sync reply and this situation is handled by
resending sync message. Second is different cases
of rebooting guest, destroing domain etc and we can
give more meaningful error message. Actually we have
this error message in qemuAgentCommand already which checks
for the same sitatuion. AFAIK case with mon->running
is just to be safe on adding some future(?) cases of
returning NULL reply.
We can easily handle receiving garbage on sync. We don't
have to make client deal with this situation. We just
need to resend sync command but this time garbage is
not be possible.
When we wait for sync reply we can receive delayed
reply to syncs or commands that were sent erlier. We can
safely skip them until we receive sync reply with correct id.
There is no much sense report this situation to client.
Actually with a bit of "luck" if we involve client into
this the play can go on forever: send sync 0, receive
sync reply -1, send sync 1, receive reply 0 ...
After sync is sent we can receive garbare and this is not error.
Consider next regular case:
1. libvirtd sends sync
2. qga sends partial sync reply and die
3. libvirtd sends sync
4. qga sends sync reply
5. libvirtd receives garbage
(half of first reply and second reply together)
We should handle this situation as it is recoverable.
Next sync can succeed. Let's report reply is NULL,
it will be converted to the VIR_ERR_AGENT_UNSYNCED
which signals client to retry.
Errors in qemuAgentIOProcessLine stop agent IO processing just
like any regular IO error, however some of current errors
that this functions spawns are false positives. Consider
next case for example:
1. send sync (unsynced state)
2. receive sync reply (sync established)
3. command send, but timeout occured (unsynced state)
4. receive command reply
Last IO triggers error because current code ignores
only delayed syncs when unsynced
We should not treat any delayed reply as error in unsynced
state. Until client and qga are not in sync delayed reply to any
command is possible. msg == NULL is the exact criterion
that we are not in sync.
Put it into qemuDomainPrepareShmemChardev() so it can be used later.
Also don't fill in the path unless the server option is enabled.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Some checks will need to be performed for newer device types as well, so
let's not duplicate them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit 839a060 tied the lifecycle of virtlogd more
closely to that of libvirtd. Unfortunately, while starting
virtlogd when libvirtd is started is definitely a good idea,
restarting virtlogd or shutting it down at any time outside
of system poweroff is not.
Revert part of that commit by removing the PartOf= lines,
meaning that only startup requests will be propagated from
libvirtd to virtlogd.
Resolves: https://bugzilla.redhat.com/1372576
Now that we have two same implementations for getting path for
huge pages backed guest memory, lets merge them into one function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When trying to migrate a huge page enabled guest, I've noticed
the following crash. Apparently, if no specific hugepages are
requested:
<memoryBacking>
<hugepages/>
</memoryBacking>
and there are no hugepages configured on the destination, we try
to dereference a NULL pointer.
Program received signal SIGSEGV, Segmentation fault.
0x00007fcc907fb20e in qemuGetHugepagePath (hugepage=0x0) at qemu/qemu_conf.c:1447
1447 if (virAsprintf(&ret, "%s/libvirt/qemu", hugepage->mnt_dir) < 0)
(gdb) bt
#0 0x00007fcc907fb20e in qemuGetHugepagePath (hugepage=0x0) at qemu/qemu_conf.c:1447
#1 0x00007fcc907fb2f5 in qemuGetDefaultHugepath (hugetlbfs=0x0, nhugetlbfs=0) at qemu/qemu_conf.c:1466
#2 0x00007fcc907b4afa in qemuBuildMemoryBackendStr (size=4194304, pagesize=0, guestNode=0, userNodeset=0x0, autoNodeset=0x0, def=0x7fcc70019070, qemuCaps=0x7fcc70004000, cfg=0x7fcc5c011800, backendType=0x7fcc95087228, backendProps=0x7fcc95087218,
force=false) at qemu/qemu_command.c:3297
#3 0x00007fcc907b4f91 in qemuBuildMemoryCellBackendStr (def=0x7fcc70019070, qemuCaps=0x7fcc70004000, cfg=0x7fcc5c011800, cell=0, auto_nodeset=0x0, backendStr=0x7fcc70020360) at qemu/qemu_command.c:3413
#4 0x00007fcc907c0406 in qemuBuildNumaArgStr (cfg=0x7fcc5c011800, def=0x7fcc70019070, cmd=0x7fcc700040c0, qemuCaps=0x7fcc70004000, auto_nodeset=0x0) at qemu/qemu_command.c:7470
#5 0x00007fcc907c5fdf in qemuBuildCommandLine (driver=0x7fcc5c07b8a0, logManager=0x7fcc70003c00, def=0x7fcc70019070, monitor_chr=0x7fcc70004bb0, monitor_json=true, qemuCaps=0x7fcc70004000, migrateURI=0x7fcc700199c0 "defer", snapshot=0x0,
vmop=VIR_NETDEV_VPORT_PROFILE_OP_MIGRATE_IN_START, standalone=false, enableFips=false, nodeset=0x0, nnicindexes=0x7fcc95087498, nicindexes=0x7fcc950874a0, domainLibDir=0x7fcc700047c0 "/var/lib/libvirt/qemu/domain-1-fedora") at qemu/qemu_command.c:9547
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Both qemu monitor and agent print the same
log on HUANGUP event, which would be confusing
when reading libvirtd log.
This patch will give a different log message to them.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Most of QEMU's PCI display device models, such as:
libvirt video/model/@type QEMU -device
------------------------- ------------
cirrus cirrus-vga
vga VGA
qxl qxl-vga
virtio virtio-vga
come with a linear framebuffer (sometimes called "VGA compatibility
framebuffer"). This linear framebuffer lives in one of the PCI device's
MMIO BARs, and allows guest code (primarily: firmware drivers, and
non-accelerated OS drivers) to display graphics with direct memory access.
Due to architectural reasons on aarch64/KVM hosts, this kind of
framebuffer doesn't / can't work in
qemu-system-(arm|aarch64) -M virt
machines. Cache coherency issues guarantee a corrupted / unusable display.
The problem has been researched by several people, including kvm-arm
maintainers, and it's been decided that the best way (practically the only
way) to have boot time graphics for such guests is to consolidate on
QEMU's "virtio-gpu-pci" device.
>From <https://bugzilla.redhat.com/show_bug.cgi?id=1195176>, libvirt
supports
<devices>
<video>
<model type='virtio'/>
</video>
</devices>
but libvirt unconditionally maps @type='virtio' to QEMU's "virtio-vga"
device model. (See the qemuBuildDeviceVideoStr() function and the
"qemuDeviceVideo" enum impl.)
According to the above, this is not right for the "virt" machine type; the
qemu-system-(arm|aarch64) binaries don't even recognize the "virtio-vga"
device model (justifiedly). Whereas "virtio-gpu-pci", which is a pure
virtio device without a compatibility framebuffer, is available, and works
fine.
(The ArmVirtQemu ("AAVMF") platform of edk2 -- that is, the UEFI firmware
for "virt" -- supports "virtio-gpu-pci", as of upstream commit
3ef3209d3028. See
<https://tianocore.acgmultimedia.com/show_bug.cgi?id=66>.)
Override the default mapping of "virtio", from "virtio-vga" to
"virtio-gpu-pci", if qemuDomainMachineIsVirt() evaluates to true.
Cc: Andrea Bolognani <abologna@redhat.com>
Cc: Drew Jones <drjones@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Martin Kletzander <mkletzan@redhat.com>
Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1372901
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Martin Kletzander <mkletzan@redhat.com>
There is nothing Linux-specific in that function. Also since commit
8c3b5bf481 mingw build is broken due to
the fact that this function is not compiled in the library.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In testOpenDefault we create a virtual computer that is later
presented to user. We also pretend to have NUMA cells and
initialize them somehow. But whilst doing so a magical constant
is used. Drop it.
Signed-off-by: Tomáš Ryšavý <tom.rysavy.0@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We will need this function shortly when implementing
nodeGetCPUStats in the test driver.
Signed-off-by: Tomáš Ryšavý <tom.rysavy.0@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Use the state information (online, hotpluggable) provided by the monitor
code rather than trying to infer it. This fixes an issue where on
architectures that require hotplug of multiple threads at once the
sub-cores would get updated as offline on daemon restart thus creating
an invalid configuration.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375783
Return whether a vcpu entry is hotpluggable or online so that upper
layers don't have to infer the information from other data.
Advantage is that this code can be tested by unit tests.
The algorithm that matches data from query-cpus and
query-hotpluggable-cpus is quite complex. Start using descriptive
iterator names to avoid confusion.
https://bugzilla.redhat.com/show_bug.cgi?id=1372613
Apparently, some management applications use the following code
pattern when waiting for a block job to finish:
while (1) {
virDomainGetBlockJobInfo(dom, disk, info, flags);
if (info.cur == info.end)
break;
sleep(1);
}
Problem with this approach is in its corner cases. In case of
QEMU, libvirt merely pass what has been reported on the monitor.
However, if the block job hasn't started yet, qemu reports cur ==
end == 0 which tricks mgmt apps into thinking job is complete.
The solution is to mangle cur/end values as described here [1].
1: https://www.redhat.com/archives/libvir-list/2016-September/msg00017.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Even though we merely just pass to users whatever qemu provided
on the monitor, we still do some translation. For instance we
turn bytes into mebibytes, or fix job type if needed. However, in
the future there is more fixing to be done so this code deserves
its own function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Name it virNumaGetHostMemoryNodeset and return only NUMA nodes which
have memory installed. This is necessary as the kernel is not very happy
to set the memory cgroup setting for nodes which do not have any memory.
This would break vcpu hotplug with following message on such
configruation:
Invalid value '0,8' for 'cpuset.mems': Invalid argument
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375268
In a full domain config, libvirt allows overriding the normal PCI
vs. PCI Express rules when a device address is explicitly provided
(so, e.g., you can force a legacy PCI device to plug into a PCIe port,
although libvirt would never do that on its own). However, due to a
bug libvirt doesn't give this same leeway when hotplugging devices. On
top of that, current libvirt assumes that *all* devices are legacy
PCI. The result of all this is that it's impossible to hotplug a
device into a PCIe port, even if you manually add the PCI address.
This can all be traced to the function
virDomainPCIAddressEnsureAddr(), and the fact that it calls
virDomainPCIaddressReserveSlot() for manually set addresses, and that
function hardcodes the argument "fromConfig" to false (meaning "this
address was auto-assigned, so it should be subject to stricter
validation").
Since virDomainPCIAddressReserveSlot() is just a one line simple
wrapper around virDomainPCIAddressReserveAddr() (adding in a hardcoded
reserveEntireSlot = true and fromConfig = false), all that's needed to
solve the problem with no unwanted side effects is to replace that
call for virDomainPCIAddressReserveSlot() with a direct call to
virDomainPCIAddressReserveAddr(), but with reserveEntireSlot = true,
fromConfig = true. That's what this patch does.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1337490
virQEMUDriverConfigNew() always initializes the bitmap in its
cgroupControllers member to -1 (i.e. all 1's).
Prior to commit a9331394, if qemu.conf had a line with
"cgroup_controllers", cgroupControllers would get reset to 0 before
going through a loop setting a bit for each named cgroup controller.
commit a9331394 left out the "reset to 0" part, so cgroupControllers
would always be -1; if you didn't want a controller included, there
was no longer a way to make that happen.
This was discovered by users who were using qemu commandline
passthrough to use the "input-linux" method of directing
keyboard/mouse input to a virtual machine:
https://www.redhat.com/archives/vfio-users/2016-April/msg00105.html
Here's the first report I found of the problem encountered after
upgrading libvirt beyond v2.0.0:
https://www.redhat.com/archives/vfio-users/2016-August/msg00053.html
Thanks to sL1pKn07 SpinFlo <sl1pkn07@gmail.com> for bringing the
problem up in IRC, and then taking the time to do a git bisect and
find the patch that started the problem.
previous commit:
commit 2c3223785c
Author: John Ferlan <jferlan@redhat.com>
Date: Mon Jun 13 12:30:34 2016 -0400
qemu: Add the ability to hotplug the TLS X.509 environment
added a parameter "bool listen" in some methods. This
unfortunately clashes with the listen() method, causing
compile failures on certain platforms (RHEL-6 for example)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit id 'a48c7141' altered how to determine if a volume was encrypted
by adding a peek at an offset into the file at a specific buffer location.
Unfortunately, all that was compared was the first "char" of the buffer
against the expect "int" value.
Restore the virReadBufInt32BE to get the complete field in order to
compare against the expected value from the qcow2EncryptionInfo or
qcow1EncryptionInfo "modeValue" field.
This restores the capability to create a volume with encryption, then
refresh the pool, and still find the encryption for the volume.
A LUKS volume uses the volume secret type just like the QCOW2 secret, so
adjust the loading of the default secrets to handle any volume that the
virStorageFileGetMetadataFromBuf code has deemed to be an encrypted volume
to search for the volume's secret. This lookup is done by volume usage
where the usage is expected to be the path to volume.
When a new filter is being defined, the return code is not handled properly,
thus triggering OOM error reporting routine (bug introduced by 51b2606f).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When migration fails, we need to poke QEMU monitor to check for a reason
of the failure. We did this using query-migrate QMP command, which is
not supposed to return any meaningful result on the destination side.
Thus if the monitor was still functional when we detected the migration
failure, parsing the answer from query-migrate always failed with the
following error message:
"info migration reply was missing return status"
This irrelevant message was then used as the reason for the migration
failure replacing any message we might have had.
Let's use harmless query-status for poking the monitor to make sure we
only get an error if the monitor connection is broken.
https://bugzilla.redhat.com/show_bug.cgi?id=1374613
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit id 'b00d7f29' shifted the opening of the /sys/devices/intel_cqm/type
file from event enable to perf event initialization. If the file did not
exist, then an error would be written to the domain log:
2016-09-06 20:51:21.677+0000: 7310: error : virFileReadAll:1360 : Failed to open file '/sys/devices/intel_cqm/type': No such file or directory
Since the error is now handled in virPerfEventEnable by checking if the
event_attr->attrType == 0 for CMT, MBML, and MBMT events - we can just
use the Quiet API in order to not log the error we're going to throw away.
Additionally, rather than using virReportSystemError, use virReportError
and VIR_ERR_ARGUMENT_UNSUPPORTED in order to signify that support isn't there
for that type of perf event - adjust the error message as well.
Akin to previous commit but for "virsh cpu-baseline" which
computes a baseline CPU for a set of host cpu elements.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Implement support for "virsh cpu-compare" so that we can calculate
common cpu element between a pool of hosts, which had a requirement
of providing host cpu description.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Parse libxl_hwcap accounting for versions since Xen 4.4 - Xen 4.7.
libxl_hwcaps is a set of cpuid leaves output that is described in [0] or
[1] in Xen 4.7. This is a collection of CPUID leaves that we version
in libvirt whenever feature words are reordered or added. Thus we keep the
common ones in one struct and others for each version. Since
libxl_hwcaps doesn't appear to have a stable format across all supported
versions thus we need to keep track of changes as a compromise until it's
exported in xen libxl API. We don't fail in initializing the driver in case
parsing of hwcaps failed for that reason. In addition, change the notation
on PAE feature such that is easier to read which bit it corresponds.
[0] xen/include/asm-x86/cpufeature.h
[1] xen/include/public/arch-x86/cpufeatureset.h
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Add support for describing cpu topology in host cpu element. In doing
so, refactor hwcaps part to its own helper namely libxlCapsInitCPU to
handle all host cpu related operations, including topology.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Qemu always opens the tray if forced to. Skip the waiting step in such
case.
This also helps if qemu does not report the tray change event when
opening the cdrom forcibly (the documentation says that the event will
not be sent although qemu in fact does trigger it even if @force is
selceted).
This is a workaround for a qemu issue where qemu does not send the tray
change event in some cases (after migration with empty closed locked
drive) and thus renders the cdrom useless from libvirt's point of view.
Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1368368
When a source image is dropped when missing due to startup policy the
policy needs to be cleared since it was relevant only for the given
storage source. New sources need to update it if needed.
Just like in the previous commit, teach qemu driver to detect
whether qemu supports this configuration knob or not.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add a new secret usage type known as "tls" - it will handle adding the
secret objects for various TLS objects that need to provide some sort
of passphrase in order to access the credentials.
The format is:
<secret ephemeral='no' private='no'>
<description>Sample TLS secret</description>
<usage type='tls'>
<name>mumblyfratz</name>
</usage>
</secret>
Once defined and a passphrase set, future patches will allow the UUID
to be set in the qemu.conf file and thus used as a secret for various
TLS options such as a chardev serial TCP connection, a NBD client/server
connection, and migration.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If the incoming XML defined a path to a TLS X.509 certificate environment,
add the necessary 'tls-creds-x509' object to the VIR_DOMAIN_CHR_TYPE_TCP
character device.
Likewise, if the environment exists the hot unplug needs adjustment as
well. Note that all the return ret were changed to goto cleanup since
the cfg needs to be unref'd
Signed-off-by: John Ferlan <jferlan@redhat.com>
When building a chardev device string for tcp, add the necessary pieces to
access provide the TLS X.509 path to qemu. This includes generating the
'tls-creds-x509' object and then adding the 'tls-creds' parameter to the
VIR_DOMAIN_CHR_TYPE_TCP command line.
Finally add the tests for the qemu command line. This test will make use
of the "new(ish)" /etc/pki/qemu setting for a TLS certificate environment
by *not* "resetting" the chardevTLSx509certdir prior to running the test.
Also use the default "verify" option (which is "no").
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a new TLS X.509 certificate type - "chardev". This will handle the
creation of a TLS certificate capability (and possibly repository) for
properly configured character device TCP backends.
Unlike the vnc and spice there is no "listen" or "passwd" associated. The
credentials eventually will be handled via a libvirt secret provided to
a specific backend.
Make use of the default verify option as well.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than specify perhaps multiple TLS X.509 certificate directories,
let's create a "default" directory which can then be used if the service
(e.g. for now vnc and spice) does not supply a default directory.
Since the default for vnc and spice may have existed before without being
supplied, the default check will first check if the service specific path
exists and if so, set the cfg entry to that; otherwise, the default will
be set to the (now) new defaultTLSx509certdir.
Additionally add a "default_tls_x509_verify" entry which can also be used
to force the peer verification option (for vnc it's a x509verify option).
Add/alter the macro for the option being found in the config file to accept
the default value.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If a migration of a domain which is already defined on the destination
host failed early (before we tried to start QEMU), we would forget to
remove the incoming transient definition. Later on when someone starts
the domain on the destination host, we will use the stale incoming
definition and the persistent definition will just be ignored.
https://bugzilla.redhat.com/show_bug.cgi?id=1368774
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The code for replacing domain's transient definition with the persistent
one is repeated in several places and we'll need to add one more. Let's
make a nice helper for it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
There is an issue with a wrong label inside vah_add_path().
The compilation fails with the error:
make[3]: Entering directory '/tmp/libvirt/src'
CC security/virt_aa_helper-virt-aa-helper.o
security/virt-aa-helper.c: In function 'vah_add_path':
security/virt-aa-helper.c:769:9: error: label 'clean' used but not defined
goto clean;
This patch moves 'clean' label to 'cleanup' label.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
This patch fixes a segfault in virt-aa-helper caused by attempting to
modify a static string literal. It is triggered when a domain has a
<filesystem> with type='mount' configured read-only and libvirt is
using the AppArmor security driver for sVirt confinement. An "R" is
passed into the function and converted to 'r'.
Similarly to vcpu hotplug the emulator thread cgroup numa mapping needs
to be relaxed while hot-adding vcpus so that the threads can allocate
data in the DMA zone.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1370084
When hot-adding vcpus qemu needs to allocate some structures in the DMA
zone which may be outside of the numa pinning. Extract the code doing
this in a set of helpers so that it can be reused.
There is a possibility that qemu driver frees by unreferencing its
closeCallbacks pointer as it has the only reference to the object,
while in fact not all users of CloseCallbacks called thier
virCloseCallbacksUnset.
Backtrace is the following:
Thread #1:
0 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
1 in virCondWait (c=<optimized out>, m=<optimized out>)
at util/virthread.c:154
2 in virThreadPoolFree (pool=0x7f0810110b50)
at util/virthreadpool.c:266
3 in qemuStateCleanup () at qemu/qemu_driver.c:1116
4 in virStateCleanup () at libvirt.c:808
5 in main (argc=<optimized out>, argv=<optimized out>)
at libvirtd.c:1660
Thread #2:
0 in virClassIsDerivedFrom (klass=0xdeadbeef, parent=0x7f0837c694d0) at util/virobject.c:169
1 in virObjectIsClass (anyobj=anyobj@entry=0x7f08101d4760, klass=<optimized out>) at util/virobject.c:365
2 in virObjectLock (anyobj=0x7f08101d4760) at util/virobject.c:317
3 in virCloseCallbacksUnset (closeCallbacks=0x7f08101d4760, vm=vm@entry=0x7f08101d47b0, cb=cb@entry=0x7f081d078fc0 <qemuProcessAutoDestroy>) at util/virclosecallbacks.c:163
4 in qemuProcessAutoDestroyRemove (driver=driver@entry=0x7f081018be50, vm=vm@entry=0x7f08101d47b0) at qemu/qemu_process.c:6368
5 in qemuProcessStop (driver=driver@entry=0x7f081018be50, vm=vm@entry=0x7f08101d47b0, reason=reason@entry=VIR_DOMAIN_SHUTOFF_SHUTDOWN, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE, flags=flags@entry=0) at qemu/qemu_process.c:5854
6 in processMonitorEOFEvent (vm=0x7f08101d47b0, driver=0x7f081018be50) at qemu/qemu_driver.c:4585
7 qemuProcessEventHandler (data=<optimized out>, opaque=0x7f081018be50) at qemu/qemu_driver.c:4629
8 in virThreadPoolWorker (opaque=opaque@entry=0x7f0837c4f820) at util/virthreadpool.c:145
9 in virThreadHelper (data=<optimized out>) at util/virthread.c:206
10 in start_thread () from /lib64/libpthread.so.0
Let's reference CloseCallbacks object in virCloseCallbacksSet and
unreference in virCloseCallbacksUnset.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
In the latest glibc, major() and minor() functions are marked as
deprecated (glibc commit dbab6577):
CC util/libvirt_util_la-vircgroup.lo
util/vircgroup.c: In function 'virCgroupGetBlockDevString':
util/vircgroup.c:768:5: error: '__major_from_sys_types' is deprecated:
In the GNU C Library, `major' is defined by <sys/sysmacros.h>.
For historical compatibility, it is currently defined by
<sys/types.h> as well, but we plan to remove this soon.
To use `major', include <sys/sysmacros.h> directly.
If you did not intend to use a system-defined macro `major',
you should #undef it after including <sys/types.h>.
[-Werror=deprecated-declarations]
if (virAsprintf(&ret, "%d:%d ", major(sb.st_rdev), minor(sb.st_rdev)) < 0)
^~
In file included from /usr/include/features.h:397:0,
from /usr/include/bits/libc-header-start.h:33,
from /usr/include/stdio.h:28,
from ../gnulib/lib/stdio.h:43,
from util/vircgroup.c:26:
/usr/include/sys/sysmacros.h:87:1: note: declared here
__SYSMACROS_DEFINE_MAJOR (__SYSMACROS_FST_IMPL_TEMPL)
^
Moreover, in the glibc commit, there's suggestion to keep
ordering of including of header files as implemented here.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Current implementation uses the dev.cpu.0.freq sysctl that is
provided by the cpufreq(4) framework and returns the actual
CPU frequency. However, there are environments where it's not available,
e.g. when running nested in KVM. In this case fall back to hw.clockrate
that reports CPU frequency at the boot time.
Resolves (hopefully):
https://bugzilla.redhat.com/show_bug.cgi?id=1369964
We already guarantee that virtlogd.socket is enabled/disabled
along with libvirtd.service, but if libvirtd.service has just
been installed and is started before rebooting, then
virtlogd.socket will not be running and guest startup will
fail.
Add Requires=virtlogd.socket to libvirtd.service to make sure
virtlogd.socket is always started along with libvirtd.service,
and add Before=libvirtd.service to both virtlogd.socket and
virtlogd.service so that virtlogd never disappears before
libvirtd has exited.
Also add PartOf=libvirtd.service to both virtlogd.socket and
virtlogd.service, so that virtlogd can be shut down when not
needed.
Resolves: https://bugzilla.redhat.com/1372576
We already have the ability to turn off dumping of guest
RAM via the domain XML. This is not particularly useful
though, as it is under control of the management application.
What is needed is a way for the sysadmin to turn off guest
RAM defaults globally, regardless of whether the mgmt app
provides its own way to set this in the domain XML.
So this adds a 'dump_guest_core' option in /etc/libvirt/qemu.conf
which defaults to false. ie guest RAM will never be included in
the QEMU core dumps by default. This default is different from
historical practice, but is considered to be more suitable as
a default because
a) guest RAM can be huge and so inflicts a DOS on the host
I/O subsystem when dumping core for QEMU crashes
b) guest RAM can contain alot of sensitive data belonging
to the VM owner. This should not generally be copied
around inside QEMU core dumps submitted to vendors for
debugging
c) guest RAM contents are rarely useful in diagnosing
QEMU crashes
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the QEMU processes inherit their core dump rlimit
from libvirtd, which is really suboptimal. This change allows
their limit to be directly controlled from qemu.conf instead.
With current perf framework, this patch adds support and documentation
for more perf events, including cache misses, cache references, cpu cycles,
and instructions.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Introduce a static attr table and refactor virPerfEventEnable() for
general purpose usage.
This patch creates a static table/matrix that converts the VIR_PERF_EVENT_*
events into their respective "attr.type" and "attr.config" so that
virPerfEventEnable doesn't have the switch the calling function passes
by value the 'type'.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Since the domain lock is not held during preparation of an external XML
config, it is possible that the value can change resulting in unexpected
failures during ABI consistency checking for some save and migrate
operations.
This patch adds a new flag to skip the checking of the cur_balloon value
and then sets the destination value to the source value to ensure
subsequent checks without the skip flag will succeed.
This way it is protected from forges and is keeped up to date too.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Add support for multi serial devices, after this patch virsh can be used to
connect different serial devices of running domains. E.g.
vish # console <xxx> --devname serial<xxx>
Note:
This depends on a xen/libxl bug fix to have libxl_console_get_tty(...) correctly
returning the tty path (as opposed to always returning the first one).
[0] https://lists.xen.org/archives/html/xen-devel/2016-08/msg00438.html
Signed-off-by: Bob Liu <bob.liu@oracle.com>
libvirt uses the new_id PCI sysfs interface to bind a PCI stub driver
to a PCI device. The new_id interface is known to be buggy and racey,
hence a more deterministic interface was introduced in the 3.12 kernel:
driver_override. For more details see
https://www.redhat.com/archives/libvir-list/2016-June/msg02124.html
For more details about the driver_override interface and examples of
its usage, see
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/pci/pci-driver.c?h=v3.12&id=782a985d7af26db39e86070d28f987cad21313c0
This patch adds support for the driver_override interface by
- adding new virPCIDevice{BindTo,UnbindFrom}StubWithOverride functions
that use the driver_override interface
- renames the existing virPCIDevice{BindTo,UnbindFrom}Stub functions
to virPCIDevice{BindTo,UnbindFrom}StubWithNewid to perserve existing
behavior on new_id interface
- changes virPCIDevice{BindTo,UnbindFrom}Stub function to call one of
the above depending on availability of driver_override
The patch includes a bit of duplicate code, but allows for easily
dropping the new_id code once support for older kernels is no
longer desired.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
libxl only has API to address the host USB devices by bus/device.
Find the bus/device if the user only provided the vendor/product
of the USB device.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Finding an USB device from the vendor/device values will be needed
by libxl driver to convert from vendor/device to bus/dev addresses.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
The 'multi' element in PCI address struct used as 'virTristateSwitch',
and its default value is 'VIR_TRISTATE_SWITCH_ABSENT'. Current PCI
process use 'false' to initialization 'multi', which is ambiguously
for assignment or comparison. This patch use '{0}' to initialize
the whole PCI address struct, which fix the 'multi' initialization
and makes code more simplify and explicitly.
Signed-off-by: Xian Han Yu <xhyubj@linux.vnet.ibm.com>
The libxl driver has long supported migration V3 but has never
indicated so in the connectSupportsFeature API. As a result, apps
such as virt-manager that use the more generic virDomainMigrate API
fail with
libvirtError: this function is not supported by the connection driver:
virDomainMigrate
Add VIR_DRV_FEATURE_MIGRATION_V3 to the list of features marked as
supported in the connectSupportsFeature API.
Test 12 from objecteventtest (createXML add event) segaults on FreeBSD
with bus error.
At some point it calls testNodeDeviceDestroy() from the test driver. And
it fails when it tries to unlock the device in the "out:" label of this
function.
Unlocking fails because the previous step was a call to
virNodeDeviceObjRemove from conf/node_device_conf.c. This function
removes the given device from the device list and cleans up the object,
including destroying of its mutex. However, it does not nullify the pointer
that was given to it.
As a result, we end up in testNodeDeviceDestroy() here:
out:
if (obj)
virNodeDeviceObjUnlock(obj);
And instead of skipping this, we try to do Unlock and fail because of
malformed mutex.
Change virNodeDeviceObjRemove to use double pointer and set pointer to
NULL.
As bhyve currently doesn't use controller addressing and simply
uses 1 implicit controller for 1 disk device, the scheme looks the
following:
pci addrees -> (implicit controller) -> disk device
So in fact we identify disk devices by pci address of implicit
controller and just pass it this way to bhyve in a form:
-s pci_addr,ahci-(cd|hd),/path/to/disk
Therefore, we cannot use virDeviceInfoPCIAddressWanted() because it
does not expect that disk devices might need PCI address assignment.
As a result, if a disk was specified without address, it will not be
generated and domain will to start.
Until proper controller addressing is implemented in the bhyve
driver, force each disk to have PCI address generated if it was not
specified by user.
Setting vcpu count when cpu topology is specified may result into an
invalid configuration. Since the topology can't be modified, reject the
setting if it doesn't match the requested topology. This will allow
fixing the topology in case it was broken.
Partially fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1370066
Validating the vcpu count is more intricate and doing it in the XML
parser will make previously valid configs (with older qemus) vanish.
Now that we have a very similar check in the qemu domain validation
callback we can do it in a more appropriate place.
This basically reverts commit b54de0830a.
Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1370066
ce43cca0e refactored the helper to prepare it for sparse topologies but
forgot to fix the iterator used to fill the structures. This would
result into a weirdly sparse populated array and possible out of bounds
access and crash once sparse vcpu topologies were allowed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1369988
While dettaching/attaching device in OpenStack, nova
calls vzDomainDettachDevice twice, because the update of the internal
configuration of the ct comes a bit latter than the update event.
As the result, we suffer from the second call to dettach the same device.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
libvirt-python passes parameter bandwidth = 0
by default. This means that bandwidth is unlimited.
VZ driver doesn't support bandwidth rate limiting,
but we still need to handle it and fail if bandwidth > 0.
Signed-off-by: Pavel Glushchak <pglushchak@virtuozzo.com>
* Added VIR_MIGRATE_LIVE, VIR_MIGRATE_UNDEFINE_SOURCE and
VIR_MIGRATE_PERSIST_DEST to supported migration flags
Signed-off-by: Pavel Glushchak <pglushchak@virtuozzo.com>
When support for auto-creating tap devices was added to <interface
type='ethernet'> in commit 9c17d6, the code assumed that
virNetDevTapCreate() would honor the VIR_NETDEV_TAP__CREATE_IFUP flag
that is supported by virNetDevTapCreateInBridgePort(). That isn't the
case - the latter function performs several operations, and one of
them is setting the tap device online. But virNetDevTapCreate() *only*
creates the tap device, and relies on the caller to do everything
else, so qemuInterfaceEthernetConnect() needs to call
virNetDevSetOnline() after the device is successfully created.
The linkstate setting of an <interface> is only meant to change the
online status reported to the guest system by the emulated network
device driver in qemu, but when support for auto-creating tap devices
for <interface type='ethernet'> was added in commit 9717d6, a chunk of
code was also added to qemuDomainChangeNetLinkState() that sets the
online status of the tap device (i.e. the *host* side of the
interface) for type='ethernet'. This was never done for tap devices
used in type='bridge' or type='network' interfaces, nor was it done in
the past for tap devices created by external scripts for
type='ethernet', so we shouldn't be doing it now.
This patch removes the bit of code in qemuDomainChangeNetLinkState()
that modifies online status of the tap device.
The call to virNetDevIPInfoAddToDev() that sets up tap device IP
addresses and routes was somehow incorrectly placed in
qemuInterfaceStopDevice() instead of qemuInterfaceStartDevice() in
commit fe8567f6. This fixes that error by moving the call to
virNetDevIPInfoAddToDev() to qemuInterfaceStartDevice().
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
This patch removes the old vcpu unplug code completely and replaces it
with the new code using device_del. The old hotplug code basically never
worked with any recent qemu and thus is useless.
As the new code is using device_del all the implications of using it
are present. Contrary to the device deletion code, the vcpu deletion
code fails if the unplug request is not executed in time.
To allow unplugging the vcpus, hotplugging of vcpus on platforms which
require to plug multiple logical vcpus at once or plugging them in an
arbitrary order it's necessary to use the new device_add interface for
vcpu hotplug.
This patch adds support for the device_add interface using the old
setvcpus API by implementing an algorithm to select the appropriate
entities to plug in.
Add support for using the new approach to hotplug vcpus using device_add
during startup of qemu to allow sparse vcpu topologies.
There are a few limitations imposed by qemu on the supported
configuration:
- vcpu0 needs to be always present and not hotpluggable
- non-hotpluggable cpus need to be ordered at the beginning
- order of the vcpus needs to be unique for every single hotpluggable
entity
Qemu also doesn't really allow to query the information necessary to
start a VM with the vcpus directly on the commandline. Fortunately they
can be hotplugged during startup.
The new hotplug code uses the following approach:
- non-hotpluggable vcpus are counted and put to the -smp option
- qemu is started
- qemu is queried for the necessary information
- the configuration is checked
- the hotpluggable vcpus are hotplugged
- vcpus are started
This patch adds a lot of checking code and enables the support to
specify the individual vcpu element with qemu.
The vcpu order information is extracted only for hotpluggable entities,
while vcpu definitions belonging to the same hotpluggable entity need
to all share the order information.
We also can't overwrite it right away in the vcpu info detection code as
the order is necessary to add the hotpluggable vcpus enabled on boot in
the correct order.
The helper will store the order information in places where we are
certain that it's necessary.
Introduce a new migration cookie flag that will be used for any
configurations that are not compatible with libvirt that would not
support the specific vcpu hotplug approach. This will make sure that old
libvirt does not fail to reproduce the configuration correctly.
Individual vCPU hotplug requires us to track the state of any vCPU. To
allow this add the following XML:
<domain>
...
<vcpu current='2'>3</vcpu>
<vcpus>
<vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
<vcpu id='1' enabled='yes' hotpluggable='yes' order='2'/>
<vcpu id='1' enabled='no' hotpluggable='yes'/>
</vcpus>
...
The 'enabled' attribute allows to control the state of the vcpu.
'hotpluggable' controls whether given vcpu can be hotplugged and 'order'
allows to specify the order to add the vcpus.
Similarly to devices the guest may allow unplug of the VCPU if libvirt
is down. To avoid problems, refresh the vcpu state on reconnect. Don't
mess with the vcpu state otherwise.
Now that the monitor code gathers all the data we can extract it to
relevant places either in the definition or the private data of a vcpu.
As only thread id is broken for TCG guests we may extract the rest of
the data and just skip assigning of the thread id. In case where qemu
would allow cpu hotplug in TCG mode this will make it work eventually.
For hotplug purposes it's necessary to retrieve data using
query-hotpluggable-cpus while the old query-cpus API report thread IDs
and order of hotplug.
This patch adds code that merges the data using a rather non-trivial
algorithm and fills the data to the qemuMonitorCPUInfo structure for
adding to appropriate place in the domain definition.
Add support for retrieving information regarding hotpluggable cpu units
supported by qemu. Data returned by the command carries information
needed to figure out the granularity of hotplug, the necessary cpu type
name and the topology information.
Note that qemu doesn't specify any particular order of the entries thus
it's necessary sort them by socket_id, core_id and thread_id to the
order libvirt expects.
To allow matching up the data returned by query-cpus to entries in the
query-hotpluggable-cpus reply for CPU hotplug it's necessary to extract
the QOM path as it's the only link between the two.
QEMU reports whether 'query-hotpluggable-cpus' is supported for a given
machine type. Extract and cache the information using the capability
cache.
When copying the capabilities for a new start of qemu, mask out the
presence of QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS if the machine type
doesn't support hotpluggable cpus.
As of qemu commit:
commit a32ef3bfc12c8d0588f43f74dcc5280885bbdb30
Author: Thomas Huth <thuth@redhat.com>
Date: Wed Jul 22 15:59:50 2015 +0200
vl: Add another sanity check to smp_parse() function
v2.4.0-952-ga32ef3b
configuration where the maximum CPU count doesn't match the topology is
rejected. Prior to that only configurations where the topology would
contain more cpus than the maximum count would be rejected.
Use QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS as a relevant recent enough
witness to avoid breaking old configs.
Prepare to extract more data by returning an array of structs rather than
just an array of thread ids. Additionally report fatal errors separately
from qemu not being able to produce data.
vzDomainMigrateConfirm3Params is whitelisted. Otherwise we need to
move removing domain from domain list from perform to confirm
step. This would further imply adding a flag and check that migration
is in progress to prohibit mistakenly (maliciously) removing domains
on confirm step. vz version of p2p also need to be fixed to include confirm step.
One would also need to add means to cleanup pending migration
on client disconnect as now is has state across several API
calls.
On the other hand current version of confirm step is totaly
harmless thus it is easier to whitelist it at the moment.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
This way we make naming consistent to API calls and make subsequent
ACL checks possible (otherwise ACL check would discover name
discrepancies).
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
ACL check on perform step should be in API call itself to make ACL
checking script pass. Thus we need to reorganize code to obtain
domain object in perform API itself. Most of this is straight
forward, the only nuance is dropping locks on lengthy remote
operations.
The other motivation is to have only perform step ACL checks for
p2p migration instead of both begin in perform if we can leave
ACL check in vzDomainMigratePerformStep.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
We need it to prepare the calls for ACL checks otherwise ACL checking
script will fail.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
This action deserves its own function and makes main API call
structure much cleaner.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
The original motivation is to expand API calls like start/stop etc so that
the ACL checks could be added. But this patch has its own befenits.
1. functions like prlsdkStart/Stop use common routine to wait for
job without domain lock. They become more self contained and do
not return intermediate PRL_RESULT.
2. vzDomainManagedSave do not update cache twice.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1367259
Crash occurs because 'secrets' is being dereferenced in call:
if (qemuDomainSecretSetup(conn, priv, secinfo, disk->info.alias,
VIR_SECRET_USAGE_TYPE_VOLUME, NULL,
&src->encryption->secrets[0]->seclookupdef,
true) < 0)
(gdb) p *src->encryption
$1 = {format = 2, nsecrets = 0, secrets = 0x0, encinfo = {cipher_size = 0,
cipher_name = 0x0, cipher_mode = 0x0, cipher_hash = 0x0, ivgen_name = 0x0,
ivgen_hash = 0x0}}
(gdb) bt
priv=priv@entry=0x7fffc03be160, disk=disk@entry=0x7fffb4002ae0)
at qemu/qemu_domain.c:1087
disk=0x7fffb4002ae0, vm=0x7fffc03a2580, driver=0x7fffc02ca390,
conn=0x7fffb00009a0) at qemu/qemu_hotplug.c:355
Upon entry to qemuDomainAttachVirtioDiskDevice, src->encryption points
at a valid 'secret' buffer w/ nsecrets == 1; however, the call to
qemuDomainDetermineDiskChain will call virStorageFileGetMetadata
and eventually virStorageFileGetMetadataInternal where the src->encryption
was overwritten when probing the volume.
Commit id 'a48c7141' added code to virStorageFileGetMetadataInternal
to determine if the disk/volume would use/need encryption and allocated
a meta->encryption. This overwrote an existing encryption buffer
already provided by the XML
This patch adds a check for meta->encryption already present before
just allocating and overwriting an existing buffer. It then checks the
existing encryption data to ensure the XML provided format for the
disk matches the expected format read from the disk and errors if there
is a mismatch.
For some unknown reason the original implementation of the <forwarder>
element only took advantage of part of the functionality in the
dnsmasq feature it exposes - it allowed specifying the ip address of a
DNS server which *all* DNS requests would be forwarded to, like this:
<forwarder addr='192.168.123.25'/>
This is a frontend for dnsmasq's "server" option, which also allows
you to specify a domain that must be matched in order for a request to
be forwarded to a particular server. This patch adds support for
specifying the domain. For example:
<forwarder domain='example.com' addr='192.168.1.1'/>
<forwarder domain='www.example.com'/>
<forwarder domain='travesty.org' addr='10.0.0.1'/>
would forward requests for bob.example.com, ftp.example.com and
joe.corp.example.com all to the DNS server at 192.168.1.1, but would
forward requests for travesty.org and www.travesty.org to
10.0.0.1. And due to the second line, requests for www.example.com,
and odd.www.example.com would be resolved by the libvirt network's own
DNS server (i.e. thery wouldn't be immediately forwarded) even though
they also match 'example.com' - the match is given to the entry with
the longest matching domain. DNS requests not matching any of the
entries would be resolved by the libvirt network's own DNS server.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1331796
If you define a libvirt virtual network with one or more IP addresses,
it starts up an instance of dnsmasq. It's always been possible to
avoid dnsmasq's dhcp server (simply don't include a <dhcp> element),
but until now it wasn't possible to avoid having the DNS server
listening; even if the network has no <dns> element, it is started
using default settings.
This patch adds a new attribute to <dns>: enable='yes|no'. For
backward compatibility, it defaults to 'yes', but if you don't want a
DNS server created for the network, you can simply add:
<dns enable='no'/>
to the network configuration, and next time the network is started
there will be no dns server created (if there is dhcp configuration,
dnsmasq will be started with "port=0" which disables the DNS server;
if there is no dhcp configuration, dnsmasq won't be started at all).
The new forward mode 'open' is just like mode='route', except that no
firewall rules are added to assure that any traffic does or doesn't
pass. It is assumed that either they aren't necessary, or they will be
setup outside the scope of libvirt.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=846810
This patch fixes a bug which occurs when we check a bus and unit number
for a new attached disk. We should do this check in ValidadionCallback,
not in PostParse callback. Because in PostParse we have not initialized
disk->info.addr.drive struct yet.
Move part of code from domainPostParseCallback to domainValidateCallback
and part from devicesPostParseCallback to deviceValidateCallback.
PostParse callbacks are for modification data.
ValidateCallbacks are only for checks.
While dettaching/attaching device in OpenStack, nova
calls vzDomainDettachDevice twice, because the update of the internal
configuration of the ct comes a bit latter than the update event.
As the result, we suffer from the second call to dettach the same device.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
If we are going to ignore return value of a functions
that can raise an error, it's not enough to use ignore_value
construction. We should explicitly call virResetLastError
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
First, make function logPrlEventErrorHelper be void and only
print information (if any) from an event.
Second, don't rewrite original error with any errors we get
during parsing event info.
Third, ignore PRL_ERR_NO_DATA at all.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
Check whether the disable-legacy property is present on the following
devices:
virtio-balloon-pci
virtio-blk-pci
virtio-scsi-pci
virtio-serial-pci
virtio-9p-pci
virtio-net-pci
virtio-rng-pci
virtio-gpu-pci
virtio-input-host-pci
virtio-keyboard-pci
virtio-mouse-pci
virtio-tablet-pci
Assuming that if QEMU knows other virtio devices where this property
is applicable, it will have at least one of these devices.
Added in QEMU by:
commit e266d421490e0ae83044bbebb209b2d3650c0ba6
virtio-pci: add flags to enable/disable legacy/modern
https://bugzilla.redhat.com/show_bug.cgi?id=1182074
Since libvirt still uses a legacy qemu arg format to add a disk, the
manner in which the 'password-secret' argument is passed to qemu needs
to change to prepend a 'file.' If in the future, usage of the more
modern disk format, then the prepended 'file.' can be removed.
Fix based on Jim Fehlig <jfehlig@suse.com> posting and subsequent
upstream list followups, see:
http://www.redhat.com/archives/libvir-list/2016-August/msg00777.html
for details. Introduced by commit id 'a1344f70'.
Modify virDomainDefGetVcpuSched to emit an error message if
virDomainDefGetVcpu returns NULL meaning the vcpu could not
be found. Prior to commit id '9cc931f0b' the error message
would have been issued in virDomainDefGetVcpu.
The code that setups listen types may change a listen type from address to
socket based on configuration from qemu.conf. This needs to be done before we
reserve/allocate ports that won't be used.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1364843
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Ports are valid only for listen types 'address' and 'network', other listen
types doesn't use them so we should not try to reserve any ports.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The first argument should be const char ** instead of
char **, because this is a search function and as such it
doesn't, and shouldn't, alter the haystack in any way.
This change means we no longer have to cast arrays of
immutable strings to arrays of mutable strings; we still
have to do the opposite, though, but that's reasonable.
When commit id '6dfb4507' refactored where the iothreadsched data was
stored, the error message for when the virDomainIOThreadIDFind failed
to find an iothreadid ("iothreadsched attribute 'iothreads' uses
undefined iothread ids") was lost. This led to the possibility that
someone would try to use it, but receive the generic message "An error
occurred, but the cause is unknown".
This patch adds the error message back so that someone will know that
they have an invalid configuration.
Signed-off-by: John Ferlan <jferlan@redhat.com>
All other modes of qemuDomainSetVcpusFlags have helpers so finish the
work by splitting the regular code into a new function.
This patch also touches up the coding (spacing) style.
If any of the devices referenced a USB hub that does not exist,
defining the domain would either fail with:
error: An error occurred, but the cause is unknown
(if only the last hub in the path is missing)
or crash.
Return a proper error instead of crashing.
https://bugzilla.redhat.com/show_bug.cgi?id=1367130
Mention whether it was the live or persistent definition which caused an
error reported and explicitly error out in case when attempting to set
maximum vcpu count for a live domain.
<filesystem type='ram' accessmode='passthrough'>
<source usage='524288' units='KiB'/>
<target dir='/dev/shm'/>
</filesystem>
would lead to lxcContainerMountAllFS calling STRPREFIX
on a NLL pointer because it failed to check if fs->src->path
was non-NULL. This is a regression caused by
commit da665fbd48
Author: Olga Krishtal <okrishtal@virtuozzo.com>
Date: Thu Jul 14 16:52:38 2016 +0300
filesystem: adds possibility to use storage pool as fs source
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
<filesystem type='ram' accessmode='passthrough'>
<source usage='524288' units='KiB'/>
<target dir='/dev/shm'/>
</filesystem>
would lead to lxcContainerResolveSymlinks calling
access(NULL) because it failed to check if fs->src->path
was non-NULL. This is a regression caused by
commit da665fbd48
Author: Olga Krishtal <okrishtal@virtuozzo.com>
Date: Thu Jul 14 16:52:38 2016 +0300
filesystem: adds possibility to use storage pool as fs source
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Because of change in caaa1bd357 this macro is no under
#ifdef block. That means it needs to be re-intended correctly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit eee7bd4e introduced two functions: libxlDiskPathToID and
libxlDiskSectorSize.
However, as they're used only by code under #ifdef __linux__,
on non-Linux platforms it results in errors similar to this:
CC libxl/libvirt_driver_libxl_impl_la-libxl_driver.lo
libxl/libxl_driver.c:5263:1: error: unused function 'libxlDiskPathToID' [-Werror,-Wunused-function]
libxlDiskPathToID(const char *virtpath)
^
libxl/libxl_driver.c:5312:1: error: unused function 'libxlDiskSectorSize' [-Werror,-Wunused-function]
libxlDiskSectorSize(int domid, int devno)
^
2 errors generated.
Fix that by moving these functions under the #ifdef __linux__ block.
This event is emitted when a nodedev XML definition is updated,
like when cdrom media is changed in a cdrom block device.
Also includes node device update event implementation for udev
backend, virsh nodedev-event support, and event-test support
Setting heads to 0 in case that *max_outputs* is not supported while building
command line doesn't have any real effect. It only removes *heads* attribute
from live XML, but after restarting libvirt the default value is restored.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
When starting a guest and copying host vendor cpuid to the guest
cpu, libvirtd would crash if the host cpu contained a NULL vendor
field. Avoid the crash by checking for a valid vendor in the host
cpu before copying the cpuid to the guest cpu.
For completeness, here is a backtrace from the crash
(gdb) bt
f0 0x00007ffff739bf33 in x86DataCpuid (cpuid=0x8, cpuid=0x8,
data=data@entry=0x7fffb800ee78) at cpu/cpu_x86.c:287
f1 virCPUx86DataAddCPUID (data=data@entry=0x7fffb800ee78, cpuid=0x8)
at cpu/cpu_x86.c:355
f2 0x00007ffff739ef47 in x86Compute (host=<optimized out>, cpu=0x7fffb8000cc0,
guest=0x7fffecca7348, message=<optimized out>) at cpu/cpu_x86.c:1580
f3 0x00007fffd2b38e53 in qemuBuildCpuModelArgStr (migrating=false,
hasHwVirt=<synthetic pointer>, qemuCaps=0x7fffb8001040, buf=0x7fffecca7360,
def=0x7fffc400ce20, driver=0x1c) at qemu/qemu_command.c:6283
f4 qemuBuildCpuCommandLine (cmd=cmd@entry=0x7fffb8002f60,
driver=driver@entry=0x7fffc80882c0, def=def@entry=0x7fffc400ce20,
qemuCaps=qemuCaps@entry=0x7fffb8001040, migrating=<optimized out>)
at qemu/qemu_command.c:6445
(gdb) f2
(gdb) p *host_model
$23 = {name = 0x7fffb800ec50 "qemu64", vendor = 0x0, signature = 0, data = {
len = 2, data = 0x7fffb800e720}}
Since we now pick the default USB controller model when parsing
the guest XML, we can get rid of some duplicated code so that
the default model selection happens in one place only.
Add some comments as well.
Now that the default USB controller model is explicit rather
than implicit for i440fx machines, we have to tweak the
conditions for dropping it in order to keep migration towards
libvirt <= 0.9.4 working.
When the user doesn't specify any model for a USB controller,
we use an architecture-dependent default, but we don't reflect
it in the guest XML.
Pick the default USB controller model when parsing the guest
XML instead of when creating the QEMU command line, so that
our choice is saved back to disk.
Usually, this variable is used to hold the return value for a
function of ours. Well, this is not the case. Its use does not
match our pattern and therefore it is very misleading. Drop it
and define an alternative @rc variable, but only in that single
block where it is needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This variable is very misleading. We use VIR_FORCE_CLOSE to set
it to -1 and returning it even though it does not refer to a FD
at all. It merely holds 0 or -1. Drop it completely. Also, at the
same time some corner cases are fixed too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1240439
In this function we create a macvtap device and open its tap
device. Possibly multiple times. Now the thing is, if opening the
tap device fails, that is virNetDevMacVLanTapOpen() returns a
negative value, we unroll all the changes BUT return 0 fooling
caller into thinking everything went okay.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since a9331394 (first release v2.1.0), specifying a manual
security_driver setting in qemu.conf causes the daemon to fail to
start, erroring with 'Duplicate security driver X'.
The duplicate checking was incorrectly comparing every entry
against itself, guaranteeing a false positive.
https://bugzilla.redhat.com/show_bug.cgi?id=1365607
More misunderstanding/mistaken assumptions on my part - I had thought
that a pci-expander-bus could be plugged into any legacy PCI slot, and
that pcie-expander-bus could be plugged into any PCIe slot. This isn't
correct - they can both be plugged ontly into their respective root
buses. This patch adds that restriction.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1358712
libvirt had allowed a dmi-to-pci-bridge to be plugged in anywhere a
normal PCIe endpoint can be connected, but this is wrong - it will
only work if it's plugged into pcie-root (the PCIe root complex) or a
pcie-expander-bus (the qemu device pxb-pcie). This patch adjusts the
connection flags accordingly.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1363648
I apparently misunderstood Marcel's description of what could and
couldn't be plugged into qemu's pxb-pcie controller (known as
pcie-expander-bus in libvirt) - I specifically allowed directly
connecting a pcie-switch-upstream-port, and it turns out that causes
the guest kernel to crash.
This patch forbids such a connection, and updates the xml docs
appropriately.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1361172
The virDomainPCIAddressFlagsCompatible() error logs report that a
device required a controller that accepted standard PCI endpoint
devices, or PCI Express endpoint devices, and if hotplug was required
by the configuration but not provided by the selected controller. But
the wording of the error messages was apparently confusing (according
to the bugzilla report referenced below). On top of that, if the
device was something other than an endpoint device (e.g. a
pcie-switch-downstream-port) the error message was a complete punt -
it would just say that the flags were incorrect.
This patch makes the messages for PCI/PCIe endpoint and hotplug
requirements more clear, and also specifically indicates what was the
device type when it is other than an endpoint device.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1363627
Since the introduction of CMT features (commit v1.3.5-461-gf294b83)
starting a domain with host-model CPU on a host which supports CMT fails
because QEMU complains about unknown 'cmt' feature:
qemu-system-x86_64: CPU feature cmt not found
https://bugzilla.redhat.com/show_bug.cgi?id=1355857
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
By removing a non-migratable feature in a for loop we would fail to drop
every second non-migratable feature if the features array contained
several of them in a row.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit 30ce2f0e tried to fix the issue with an incorrect session URI to admin
server but it messed up the checks:
if (geteuid == 0 && VIR_STRDUP(*uristr, "libvirtd:///system") < 0)
return -1;
else if (VIR_STRDUP(*uristr, "libvirtd:///session") < 0)
return -1;
So if a client executed with root privileges tries to connect, its euid is
checked (true) and the correct URI is successfully copied to @uristr (false),
therefore the 'else' branch is taken and @uristr is replaced by the session URI
which for root results in:
Failed to connect socket to '/root/.cache/libvirt/libvirt-admin-sock':
No such file or directory
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Just like we decide on which URI we go with based on EUID for qemu in remote
driver, do a similar thing for admin except we do not spawn a daemon in this
case.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1356858
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Commit b3e4401dc6 introduced a check to ignore an error if the guest
is already terminated. However the check accidentally compared
error.code with VIR_ERR_ERROR, which is an error level, not an error
code. Because of this, almost every error got silently ignored.
Fixes: b3e4401dc6 ("systemd: don't report an error if the guest is
already terminated")
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
When build for architecture that don't use gcc atomic ops but pthread,
it fails to build for armel:
| ../tools/nss/.libs/libnss_libvirt_impl.a(libvirt_nss_la-virobject.o): In function `virClassNew':
| /buildarea2/kkang/builds/qemuarm-Aug03/bitbake_build/tmp/work/armv5e-wrs-linux-gnueabi/libvirt/1.3.5-r0/build/src/../../libvirt-1.3.5/src/util/virobject.c:153: undefined reference to `virAtomicLock'
| ../tools/nss/.libs/libnss_libvirt_impl.a(libvirt_nss_la-virobject.o): In function `virObjectNew':
| /buildarea2/kkang/builds/qemuarm-Aug03/bitbake_build/tmp/work/armv5e-wrs-linux-gnueabi/libvirt/1.3.5-r0/build/src/../../libvirt-1.3.5/src/util/virobject.c:205: undefined reference to `virAtomicLock'
| ../tools/nss/.libs/libnss_libvirt_impl.a(libvirt_nss_la-virobject.o): In function `virObjectUnref':
| /buildarea2/kkang/builds/qemuarm-Aug03/bitbake_build/tmp/work/armv5e-wrs-linux-gnueabi/libvirt/1.3.5-r0/build/src/../../libvirt-1.3.5/src/util/virobject.c:277: undefined reference to `virAtomicLock'
| ../tools/nss/.libs/libnss_libvirt_impl.a(libvirt_nss_la-virobject.o): In function `virObjectRef':
| /buildarea2/kkang/builds/qemuarm-Aug03/bitbake_build/tmp/work/armv5e-wrs-linux-gnueabi/libvirt/1.3.5-r0/build/src/../../libvirt-1.3.5/src/util/virobject.c:298: undefined reference to `virAtomicLock'
| collect2: error: ld returned 1 exit status
It is similar with:
http://libvirt.org/git/?p=libvirt.git;a=commit;h=12dc729
Signed-off-by: Kai Kang <kai.kang@windriver.com>
Unfortunately vz sdk do not provide detail information on migration
progress, only progress percentage. Thus vz driver provides percents
instead of bytes in data fields of virDomainJobInfoPtr.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
The build was failing with:
CCLD lockd.la
libtool: error: can't build i686-pc-cygwin shared library unless -no-undefined is specified
Rather than add yet another $(CYGWIN_EXTRA_LDFLAGS) to all the
impacted *_la_LDFLAGS, it was easier to just pull the extra
flags into ALL libraries via AM_LDFLAGS.
Then, fix lockd_la_LDFLAGS to include AM_LDFLAGS, like all other
libraries.
Signed-off-by: Eric Blake <eblake@redhat.com>
Without XDR_CFLAGS, compilation on Cygwin fails with:
CC libvirt_driver_la-libvirt-stream.lo
In file included from libvirt-stream.c:26:0:
rpc/virnetprotocol.h:9:21: fatal error: rpc/rpc.h: No such file or directory
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1363773
Imagine that you're creating a transient domain, but for some reason,
starting it fails. That is virLXCProcessStart() returns an error. With
current code, in the error handling code the domain object is removed
from the domain object list, @vm is set to NULL and controls jump to
enjob label where virLXCDomainObjEndJob() is called which dereference vm
leading to instant crash.
The fix is to end the job in the error handling code and only after that
remove the domain from the list and jump onto cleanup label instead of
endjob.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1362349
When adding the ability to build the pool during the start pool processing
using the similar flags as buildPool processing would use, the code was
essentially cut-n-pasted from storagePoolCreateXML. However, that included
a call to virStoragePoolObjRemove which shouldn't happen within the
storagePoolCreate path since that'll remove the pool from the list of
pools only to be rediscovered if libvirtd restarts.
So on failure, just fail and return as we should expect
Doing a load, copy, format cycle on all QEMU capabilities XML files
should make sure we don't forget to update virQEMUCapsNewCopy when
adding new elements to QEMU capabilities.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
There was a missing check for vol->target.encryption being NULL
at one particular place (modified by commit a48c71411) which caused a crash
when user attempted to create a raw volume using a non-raw file volume as
source.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1363636
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In qemu, enabling this feature boils down to adding the following
onto the command line:
-global driver=cfi.pflash01,property=secure,value=on
However, there are some constraints resulting from the
implementation. For instance, System Management Mode (SMM) is
required to be enabled, the machine type must be q35-2.4 or
later, and the guest should be x86_64. While technically it is
possible to have 32 bit guests with secure boot, some non-trivial
CPU flags tuning is required (for instance lm and nx flags must
be prohibited). Given complexity of our CPU driver, this is not
trivial. Therefore I've chosen to forbid 32 bit guests for now.
If there's ever need, we can refine the check later.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This element will control secure boot implemented by some
firmwares. If the firmware used in <loader/> does support the
feature we must tell it to the underlying hypervisor. However, we
can't know whether loader does support it or not just by looking
at the file. Therefore we have to have an attribute to the
element where users can tell us whether the firmware is secure
boot enabled or not.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since its release of 2.4.0 qemu is able to enable System
Management Module in the firmware, or disable it. We should
expose this capability in the XML. Unfortunately, there's no good
way to determine whether the binary we are talking to supports
it. I mean, if qemu's run with real machine type, the smm
attribute can be seen in 'qom-list /machine' output. But it's not
there when qemu's run with -M none. Therefore we're stuck with
version based check.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We use 'goto cleanup' for a reason. If a function can exit at
many places but doesn't follow the pattern, it has to copy the
free code in multiple places.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While no leak was observed yet, there might be one if
virObjectEventClass is ever derived from another class. Because
in that case plain VIR_FREE() will not call dispose() from parent
classes possibly leaking some memory.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In the cleanup path, @vm cannot be possibly NULL. If it were so,
we would receive SIGSEGV much earlier. At the beginning of the
function we do libxlDomainObjBeginJob(.., vm, ..); and so on.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The virJSONValueArraySize() function return ssize_t (with
possibly returning -1 if the passed json is not an array).
Storing the return value into size_t is possibly dangerous then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Call the vcpu thread info validation separately to decrease complexity
of returned values by qemuDomainRefreshVcpuInfo.
This function now returns 0 on success and -1 on error. Certain
failures of qemu to report data are still considered as success. Any
error reported now is fatal.
Validate the presence of the thread id according to state of the vCPU
rather than just checking the vCPU count. Additionally put the new
validation code into a separate function so that the information
retrieval can be split from the validation.
Long, long ago before libxl_get_required_shadow_memory() was
made publicly available, its code was copied to the libxl driver
for calculating shadow memory requirements of HVM domains.
Long ago, libxl_get_required_shadow_memory() was exported in
libxl_utils.h and included in xen-devel packages everywhere.
Remove the copied code, which has become stale, and let libxl
provode a proper shadow memory value.
https://bugzilla.redhat.com/show_bug.cgi?id=1356937
Add support for IOThread quota/bandwidth and period parameters for non
session mode. If in session mode, then error out. Uses all the same
places where {vcpu|emulator|global}_{period|quota} are adjusted and
adds the iothread values.
https://bugzilla.redhat.com/show_bug.cgi?id=1356937
Add the definitions to allow for viewing/setting cgroup period and quota
limits for IOThreads.
This is similar to the work done for emulator quota and period by
commit ids 'b65dafa' and 'e051c482'.
Being able to view/set the IOThread specific values is related to more
recent changes adding global period (commmit id '4d92d58f') and global
quota (commit id '55ecdae') definitions and qemu support (commit id
'4e17ff79' and 'fbcbd1b2'). With a global setting though, if somehow
the IOThread value in the cgroup hierarchy was set "outside of libvirt"
to a value that is incompatible with the global value.
Allowing control over IOThread specific values provides the capability
to alter the IOThread values as necessary.
If you invoke virDomainLxcEnterSecurityLabel() on security
model of "none" it will report an error. Logically a "none"
security model should be treated as a no-op, so we should
just return success immediately, instead of an error.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1289391
Rather than pass the whole drive string (which contained the alias),
pass only the alias for the qemuMonitorDriveDel call in the error
path when adding a host device in the monitor fails.
Partial fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=1336225
Similar to the other disk types, add the qemuMonitorDriveDel in the failure
to add/hotplug a USB.
Added a couple of other formatting changes just to have a less cluttered look
Move QEMU_DRIVE_HOST_PREFIX into the qemu_alias.c to dissuade future
callers from using it. Create qemuAliasDiskDriveSkipPrefix in order
to handle the current consumers that desire to check if an alias has
the drive- prefix and "get beyond it" in order to get the disk alias.
Since we already have a function that will generate the drivestr from
the alias, let's use it and remove the qemuDeviceDriveHostAlias.
Move the QEMU_DRIVE_HOST_PREFIX definition into qemu_alias.h
Also alter qemuAliasFromDisk to use the QEMU_DRIVE_HOST_PREFIX instead
of "drive-%s".
Rather than pass the disks[i]->info.alias to qemuMonitorSetDrivePassphrase
and then generate the "drive-%s" alias from that, let's use qemuAliasFromDisk
prior to the call to generate the drive alias and then pass that along
thus removing the need to generate the alias from the monitor code.
Node device lifecycle event API entry points for registering and
deregistering node deivce events, as well as types of events
associated with node device.
These entry points will be used for implementing asynchronous
lifecycle events.
Node device API:
virConnectNodeDeviceEventRegisterAny
virConnectNodeDeviceEventDeregisterAny
virNodeDeviceEventLifecycleType which has events CREATED and DELETED
As commit id 'e2b86f580' notes, when mode=agent possibly setting the
fake reboot flag to true wouldn't be necessary; however, it doesn't
"force" the issue by just ensuring the fake reboot is false, so this
patch adds the explicit setting for the reboot path.
More investigation and details can be found in commit id '8be502fd'
as well as in the archives at:
https://www.redhat.com/archives/libvir-list/2015-April/msg00715.html
Conditional setting of the fake reboot flag should only happen for
the acpi mode shutdown path; however, for the agent mode shutdown,
the fake reboot should be cleared. This patch will essentially revert
commit id '8be502fd', but adds an explicit setting of the flag to false
when using mode=agent while also only conditionally setting the reboot
flag if the guest went away. This also avoids an issue where a shutdown
with reboot semantics is done from agent mode which sets the reboot
flag followed by a shutdown from within the guest which would result
in a reboot due to the fake reboot flag being set. The change will
also properly handle the cases described in the following archive post:
https://www.redhat.com/archives/libvir-list/2015-April/msg00715.html
Commit id '44304c6eb' added the API libxlDomainAttachControllerDevice
inside a conditional LIBXL_HAVE_PVUSB, but called that function outside
the conditional in libxlDomainAttachDeviceLive.
Similarly, the API libxlDomainDetachControllerDevice was added inside a
conditional LIBXL_HAVE_PVUSB, but called outside the conditional in
libxlDomainDetachDeviceLive.
This patch adds the conditional LIBXL_HAVE_PVUSB around those two calls
from within the switch.
Prior to commit 2737aaaf, we allowed every client to connect successfully,
however, if accepting a client would eventually lead to an overcommit of the
limits, we would disconnect it immediately with "Too many active clients,
dropping connection from...". Recent changes refactored the code in a way, that
it is not possible for the client-related callback to be dispatched and the
client to be accepted if the limits wouldn't permit to do so, therefore a check
if a connection should be dropped due to limits violation has become a dead
code that could be removed.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Commit 2737aaaf changed our policy for accepting new clients in a way, that
instead of accepting new clients only to disconnect them immediately, since
that would overcommit the limit, we temporarily disable polling for the
dedicated file descriptor, so any new connection will queue on the socket.
Commit 8b1f0469 then added the possibility to change the limits during runtime
but it didn't re-enable polling for the previously disabled file descriptor,
thus any new connection would still continue to queue on the socket. This patch
forces an update of the services each time the limits were changed in some way.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1357776
Signed-off-by: Erik Skultety <eskultet@redhat.com>
So far, virNetServerCheckLimits was only used to possibly re-enable accepting
new clients that might have previously been disabled due to client limits
violation (max_clients, max_anonymous_clients). This patch refactors
virNetServerAddClient, which is currently the only place where the services get
disabled, in order to use the virNetServerCheckLimits helper instead of
checking the limits by itself.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since virNetServerAddClient checks for the limits in order to temporarily
suspend the services, thus not accepting any more clients, there is no reason
why virNetServerCheckLimits, which is only responsible for re-enabling
previously disabled services according to the limits, could not do both. To be
able to do that however, it needs to be moved up in the file since it's static
(and because it's just a helper and there's only one caller it should remain
static).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In case of error, libxlReconnectDomain may call
virDomainObjListRemoveLocked. However it has no local reference on
the domain object, leading to segfault. Get a reference to the domain
object at the start of the function and release it at the end to avoid
problems.
This commit also factorizes code between the error and normal ends.
To sync with virDomainControllerModelUSB, we add two models
in qemuControllerModelUSB 'qusb1' and 'qusb2', but those
models are not supported in qemu driver. So add check in
device post parse to report errors if 'qusb1' and 'qusb2'
are specified.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
libxl configuration files conversion can now handle USB controllers.
When parting libxl config file, USB controllers with type PV are
ignored as those aren't handled.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
When hotplugging a USB device, check if there is an available controller
and port, if not, automatically create a USB controller of version
2.0 and 8 ports.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Support USB controller hot-plug and hot-unplug.
#virsh attach-device dom usbctrl.xml
#virsh detach-device dom usbctrl.xml
usbctrl.xml example:
<controller type='usb' index='0' model='qusb2'>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
To support USB Controller in xen guest domains, just add
USB controller in domain config xml as following:
<controller type='usb' model='qusb2' ports='4'/>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
According to libxl implementation, it supports pvusb
controller of version 1.1 and version 2.0, and it
supports two types of backend, 'pvusb' (dom0 backend)
and 'qusb' (qemu backend). But currently pvusb backend
is not checked in yet.
To match libxl support, extend usb controller schema
to support two more models: qusb1 (qusb, version 1.1)
and 'qusb2' (qusb version 2.0).
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Let's cleanly differentiate what wiping a volume does for ploop and
other volumes so it's more readable what is done for each one instead of
branching out multiple times in different parts of the same function.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Some functions use volume specification merely to use the target path
from it. Let's change it to pass the path only so that it can be used
for other files than just volumes.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This is done in order to call them in next patches from each other and
definitions would be missing otherwise.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When reset was called from a domain that crashed we didn't change the
crashed state into a paused one which could confuse users.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1269575
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Until now we simply errored out when the translation from pool+volume
failed. However, we should instead check whether that disk is needed or
not since there is an option for that.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1168453
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is an error reset following the function and check for
startupPolicy before that. Let's reflect those things inside that
function so that future code doesn't have to be that complex.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When wiping a volume we just rewrite all the data of the volume, not
only the content. Since format gets overridden, we need to recreate the
volume. However we can't do that for every possible format out there.
Since it was only coded for the ploop volume type, let's document what
might be the consequences instead of forbidding it for every other
format out there.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=868771
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The panic devices with models s390 and pseries are autogenerated.
For backwards compatibility reasons the devices are to be removed
when migrating.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Ever since virDomainCreateWithFlags() was introduced by de3aadaa
[drivers: add virDomainCreateWithFlags if virDomainCreate exists], the
domain ID retrieved with virDomainGetID() was incorrect for several
drivers after virDomainCreateWithFlags() was called. The API consumer
had to look up the domain anew to retrieve the correct ID.
For the ESX driver, this was fixed in 6139b274 [esx: Update ID after
starting a domain]. For the openvz driver, it was fixed in fd81a097
[openvzDomainCreateWithFlags: set domain id to the correct value]. The
test driver, the OpenNebula driver (removed in the meantime) and the
vbox driver were already updating the domain ID correctly in
domainCreate().
Copy over the ID in qemuDomainCreateWithFlags() to fix this for the qemu
driver, too.
Fixes: de3aadaa ("drivers: add virDomainCreateWithFlags if virDomainCreate exists")
Reported-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Tested-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Consider the following XML snippet:
<memory model=''>
<target>
<size unit='KiB'>523264</size>
<node>0</node>
</target>
</memory>
Whats wrong you ask? The @model attribute. This should result in
an error thrown into users faces during virDomainDefine phase.
Except it doesn't. The XML validation catches this error, but if
users chose to ignore that, they will end up with invalid XML.
Well, they won't be able to start the machine - that's when error
is produced currently. But it would be nice if we could catch the
error like this earlier.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The original name 'admin_uri_default' was introduced to our code by commit
dbecb87f. However, at that time we already had a separate config file for
admin library but the commit mentioned above didn't properly adjust the
config's option name. The result is that when we're loading the config, we
check a non-existent config option (there's not much to do with the URIs
anyway, since we only allow local connection). Additionally, virt-admin's man
page documents, that the default URI can be altered by setting
admin_uri_default option. So the fix proposed by this patch leaves the
libvirt-admin.conf as is and adjusts the naming in the code as well as in the
virt-admin's man page.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1356436
Commit id '56057900' altered the discovery of iSCSI node targets by
using the "--op nonpersistent". This caused issues for clean environments
or if by chance a "-m node -o delete" was executed.
Since each iSCSI Storage Pool has the required iSCSI target path, use
that and the virISCSINodeNew API in order to generate the iSCSI node record.
https://bugzilla.redhat.com/show_bug.cgi?id=1356436
According to RFC 3721 (https://www.ietf.org/rfc/rfc3721.txt), there are
two ways to "discover" targets in/for the iSCSI environment. Discovery
is the process which allows the initiator to find the targets to which
it has access and at least one address at which each target may be
accessed.
The method currently implemented in libvirt using the virISCSIScanTargets
API is known as "SendTargets" discovery. This method is more useful when
the target IP Address and TCP port information are available, e.g. in
libvirt terms the "portal". It returns a list of targets for the portal.
From that list, the target can be found. This operation can also fill an
iSCSI node table into which iSCSI logins may occur. Commit id '56057900'
altered that filling by adding the "--op nonpersistent" since it was
not necessarily desired to perform that for non libvirt related targets.
The second method is "Static Configuration". This method not only needs
the IP Address and TCP port (e.g. portal), but also the iSCSI target name.
In libvirt terms this would be the device path field from the iSCSI pool
<source> XML. This patch implements the second methodology using that
required device path as the targetname.
The current LUKS support has a "luks" volume type which has
a "luks" encryption format.
This partially makes sense if you consider the QEMU shorthand
syntax only requires you to specify a format=luks, and it'll
automagically uses "raw" as the next level driver. QEMU will
however let you override the "raw" with any other driver it
supports (vmdk, qcow, rbd, iscsi, etc, etc)
IOW the intention though is that the "luks" encryption format
is applied to all disk formats (whether raw, qcow2, rbd, gluster
or whatever). As such it doesn't make much sense for libvirt
to say the volume type is "luks" - we should be saying that it
is a "raw" file, but with "luks" encryption applied.
IOW, when creating a storage volume we should use this XML
<volume>
<name>demo.raw</name>
<capacity>5368709120</capacity>
<target>
<format type='raw'/>
<encryption format='luks'>
<secret type='passphrase' uuid='0a81f5b2-8403-7b23-c8d6-21ccd2f80d6f'/>
</encryption>
</target>
</volume>
and when configuring a guest disk we should use
<disk type='file' device='disk'>
<driver name='qemu' type='raw'/>
<source file='/home/berrange/VirtualMachines/demo.raw'/>
<target dev='sda' bus='scsi'/>
<encryption format='luks'>
<secret type='passphrase' uuid='0a81f5b2-8403-7b23-c8d6-21ccd2f80d6f'/>
</encryption>
</disk>
This commit thus removes the "luks" storage volume type added
in
commit 318ebb36f1
Author: John Ferlan <jferlan@redhat.com>
Date: Tue Jun 21 12:59:54 2016 -0400
util: Add 'luks' to the FileTypeInfo
The storage file probing code is modified so that it can probe
the actual encryption formats explicitly, rather than merely
probing existance of encryption and letting the storage driver
guess the format.
The rest of the code is then adapted to deal with
VIR_STORAGE_FILE_RAW w/ VIR_STORAGE_ENCRYPTION_FORMAT_LUKS
instead of just VIR_STORAGE_FILE_LUKS.
The commit mentioned above was included in libvirt v2.0.0.
So when querying volume XML this will be a change in behaviour
vs the 2.0.0 release - it'll report 'raw' instead of 'luks'
for the volume format, but still report 'luks' for encryption
format. I think this change is OK because the storage driver
did not include any support for creating volumes, nor starting
guets with luks volumes in v2.0.0 - that only since then.
Clearly if we change this we must do it before v2.1.0 though.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Refactor the virStorageFileMatchesNNN methods so that
they don't take a struct FileFormatInfo parameter, but
instead get the actual raw dat items they needs. This
will facilitate reuse in other contexts.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To collect all balloon statistics for all guests it was necessary to make
several libvirt requests. Now it's possible to get all balloon statiscs via
single connectGetAllDomainStats call.
Signed-off-by: Derbyshev Dmitry <dderbyshev@virtuozzo.com>
To allow using failover with gluster it's necessary to specify multiple
volume hosts. Add support for starting qemu with such configurations.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
To allow richer definitions of disk sources add infrastructure that will
allow to register functionst generating a JSON object based definition.
This infrastructure will then convert the definition to the proper
command line syntax and use it in cases where it's necessary. This will
allow to keep legacy definitions for back-compat when possible and use
the new definitions for the configurations requiring them.
Add support for converting objects nested in arrays with a numbering
discriminator on the command line. This syntax is used for the
object-based specification of disk source properties.
As gluster natively supports multiple hosts for failover reasons we can
easily add the support to the storage driver code in libvirt.
Extract the code setting an individual host into a separate function and
call them in a loop. The new code also tries to keep the debug log
entries sane.
Extract the code so that it can be called from multiple places. This
also removes a tricky fallthrough in the large switch in
qemuBuildNetworkDriveStr.
Add a modular parser that will allow to parse 'json' backing definitions
that are supported by qemu. The initial implementation adds support for
the 'file' driver.
Due to the approach qemu took to implement the JSON backing strings it's
possible to specify them in two approaches.
The object approach:
json:{ "file" : { "driver":"file",
"filename":"/path/to/file"
}
}
And a partially flattened approach:
json:{"file.driver":"file"
"file.filename":"/path/to/file"
}
Both of the above are supported by qemu and by the code added in this
commit. The current implementation de-flattens the first level ('file.')
if possible and required. Other handling may be added later but
currently only one level was possible anyways.
The cur_balloon also increases/decreases with dimm hotplug/unplug.
To be consistent, adjust the value for coldplug too. This was inconsistently
taken care when cur_ballon != memory to begin with. The patch fixes it
irrespective of that.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Since commit c4bdff19, the path to the configuration file has been constructed
in the following manner:
- if no config filename was passed to virConfLoadConfigPath, libvirt.conf was
used as default
- otherwise the filename was concatenated with
"<config_dir>/libvirt/libvirt%s%s.conf" which in admin case resulted in
"libvirt-libvirt-admin.conf.conf". Obviously, this non-existent config led to
ignoring all user settings in libvirt-admin.conf. This patch requires the
config filename to be always provided as an argument with the concatenation
being simplified.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1357364
Signed-off-by: Erik Skultety <eskultet@redhat.com>
For use with memory hotplug virQEMUBuildCommandLineJSONRecurse attempted
to format JSON arrays as bitmap on the command line. Make the formatter
function configurable so that it can be reused with different syntaxes
of arrays such as numbered arrays for use with disk sources.
This patch extracts the code and adds a parameter for the function that
will allow to plug in different formatters.
Until now the JSON->commandline convertor was used only for objects
created by qemu. To allow reusing it with disk formatter we'll need to
escape ',' as usual in qemu commandlines.
Refactor the command line generator by adding a wrapper (with
documentation) that will handle the outermost object iteration.
This patch also renames the functions and tweaks the error message for
nested arrays to be more universal.
The new function is then reused to simplify qemucommandutiltest.
As we already test that the extraction of the backing store string works
well additional tests for the backing store string parser can be made
simpler.
Export virStorageSourceNewFromBackingAbsolute and use it to parse the
backing store strings, format them using virDomainDiskSourceFormat and
match them against expected XMLs.
Nothing in the code path after the removed call has needs/uses the alias
anyway (as would be the case for command line building or talking to monitor).
The alias is VIR_FREE'd in virDomainDeviceInfoClear which is called for any
device that needs/uses an alias via virDomainDeviceDefFree or virDomainDefFree
as well as during virDomainDeviceInfoFree for host devices.
For persistent domains, the domain definition (including aliases) gets
freed a few screens later when it's replaced with newDef.
For transient domains, the definition is freed/unref'd along with the
virDomainObj a few moments later.
Introduce initial support for domainBlockStats API call that
allow us to query block device statistics. OpenStack nova
uses this API call to query block statistics, alongside
virDomainMemoryStats and virDomainInterfaceStats. Note that
this patch only introduces it for VBD for starters. QDisk
would come in a separate patch series.
A new statistics data structure is introduced to fit common
statistics among others specific to the underlying block
backends. For the VBD statistics on linux these are exported
via sysfs on the path:
"/sys/bus/xen-backend/devices/vbd-<domid>-<devid>/statistics"
To calculate the block devno libxlDiskPathToID is introduced.
Each backend implements its own function to extract statistics,
allowing support for multiple backends and different platforms.
VBD stats are exposed in reqs and number of sectors from
blkback, and it's up to us to convert it to sector sizes.
The sector size is gathered through xenstore in the device
backend entry "physical-sector-size".
BlockStatsFlags variant is also implemented which has the
added benefit of getting the number of flush requests.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
QEMU reports timestamp along with other memory statistics, but this information is not saved into domain statistics.
It could be useful to determine if the data reported is fresh or not.
Balloon statistics are not reported in hrf, so no modifications are made in qemu_monitor_text.c.
Signed-off-by: Derbyshev Dmitry <dderbyshev@virtuozzo.com>
'memtotal' in virtio drivers and qemu corresponds to 'available' in libvirt.
Because of that, 'stat-available-memory' is renamed into 'usable'.
Balloon statistics are not reported in hrf, so no modifications are made in qemu_monitor_text.c.
Signed-off-by: Derbyshev Dmitry <dderbyshev@virtuozzo.com>
Dropping the caching of ccw address set.
The cached set is not required anymore, because the set is now being
recalculated from the domain definition on demand, so the cache
can be deleted.
The address sets (pci, ccw, virtio serial) are currently cached
in qemu private data, but all the information required to recreate
these sets is in the domain definition. Therefore I am removing
the redundant data and adding a way to recalculate these sets.
Add a function that calculates the ccw address set
from the domain definition.
Dropping the caching of virtio serial address set.
The cached set is not required anymore, because the set is now being
recalculated from the domain definition on demand, so the cache
can be deleted.
Credit goes to Cole Robinson.
Dropping the caching of virtio serial address set.
Instead of using the cached address set, a function in qemu_hotplug.c
now recalculates it on demand.
Credit goes to Cole Robinson.
The address sets (pci, ccw, virtio serial) are currently cached
in qemu private data, but all the information required to recreate
these sets is in the domain definition. Therefore I am removing
the redundant data and adding a way to recalculate these sets.
Add a function that calculates the virtio serial address set
from the domain definition.
Credit goes to Cole Robinson.
The symbol being missing has been reported as causing build
failures on OS X. If it's not already defined, define it to
zero so that it won't have any effect.
Commit 4a585a88 introduced searching QOM device path by alias, let's use it for
memballoon too. This may speedup the search because in most cases we will find
the correct QOM device path directly by using alias without the need for the
recursion code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Commit ce745914 introduced detection of actual video ram sizes to fix migration
if QEMU decide to modify the values provided by libvirt. This works perfectly
for domains with number of video devices up to two.
If there are more than two video devices in the guest all the secondary devices
in the XML will have the same memory values. This is because our current code
search for QOM device path only by the device type name and all the secondary
video devices has the same name "qxl".
This patch introduces a new search function that will try to search a QOM device
path using also device's alias if the alias is available. After that it will
fallback to the old recursive code if the alias search found no results.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1358728
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Previously, qemuDomainAttachDeviceFlags was doing two things:
handling the job and attaching devices. Now the second part is
in a new function.
This change is required to make it possible to test more complex
device attachment situations, like attaching a device to both
config and live at once.
We want to be able to pass a NULL instead of the connection
and use this function in tests. To achieve this, the virConnectPtr
is passed instead of virDomainPtr, and the driver is a new separate
parameter.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
So commit 306b3a8504 tried mimicking behaviour of commit 540c339a25, but
added a virObjectRef(vm) only after virDomainObjListAdd() in
lxcDomainDefineXMLFlags() and not in lxcDomainCreateXMLWithFiles().
That way undefining a domain that was started with different XML than
defined will leave the domain object in a state with not enough
references to then remove it. Hence any lxcDomainDestroyFlags() called
afterwards crashes the daemon.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1351057
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since return code is checked globally at the end of the function, let's
make sure that we set it correctly at any point.
This fixes a regression introduced in commit 0aa19f35 where the first
command to eject changeable media would fail unconditionally.
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
This patch forces container's init process, to become a session leader,
that is its session ID is made the same as its process ID.
That might seem unnecessary in general, but if we want to checkpoint a
container with CRIU, which is needed for container migration,
we must ensure that the SID of each process inside the container points
to a process that lives in the same PID namespace as the container.
Therefore, we force that the session leader is the init.
Signed-off-by: Katerina Koukiou <k.koukiou@gmail.com>
When parsing a command line with USB devices that have
no address specified, QEMU automatically adds a USB hub
if the device would fill up all the available USB ports.
To help most of the users, add one hub if there are more
USB devices than available ports. For wilder configurations,
expect the user to provide us with more hubs and/or controllers.
Walk through all the usb hubs in the domain definition
that have a USB address specified, create the
corresponding structures in the virDomainUSBAddressSet
and mark the port it occupies as used.
A new type to track USB addresses.
Every <controller type='usb' index='i'/> is represented by an
object of type virDomainUSBAddressHub located at buses[i].
Each of these hubs has up to 'nports' ports.
If a port is occupied, it has the corresponding bit set in
the 'ports' bitmap, e.g. port 1 would have the 0th bit set.
If there is a hub on this port, then hubs[i] will point
to this hub.
Undefine procedure drops domain lock while waiting for detaching
disks vz sdk call. Meanwhile vz sdk event domain-config-changed
arrives, its handler finds domain and is blocked waiting for job
condition. After undefine API call finishes event processing procedes
and tries to refreshes domain config thru existing vz sdk domain handle.
Domain does not exists anymore and event processing fails. Everything
is fine we just don't want to see error message in log for this
particular case.
Fortunately domain has flag that domain is removed from list. This
also imply that vz sdk domain is also undefined. Thus if we check
for this flag right after domain is locked again on accuiring
job condition we gracefully handle this situation.
Actually the race can happen in other situations too. Any
time we wait for job condition in mutualy exclusive job in
time when we acquire it vz sdk domain can cease to exist.
So instead of general internal error we can return domain
not found which is easier to handle. We don't need to patch
other places in mutually exclusive jobs where domain lock
is dropped as if job is started domain can't be undefine
by mutually exclusive undefine job.
The code of this patch is quite similar to qemu driver checks
for is domain is active after acquiring a job. The difference
only while qemu domain is operational while process is active
vz domain is operational while domain exists.
Current vz driver implementation is not usable when it comes to
long runnig operations. Migration or saving a domain blocks all
other operations even query ones which are expecteted to be available.
This patch addresses this problem.
All vz driver API calls fall into next 3 groups:
1. only query domain cache (virDomainObj, vz cache statistic)
examples are vzDomainGetState, vzDomainGetXMLDesc etc.
2. use thread shared sdkdom object
examples are vzDomainSetMemoryFlags, vzDomainAttachDevice etc.
3. use no thread shared sdkdom object nor domain cache
examples are vzDomainSnapshotListNames, vzDomainSnapshotGetXMLDesc etc
API calls from group 1 don't need to be changed as they hold domain lock only
for short period of time. These calls [1] are easily distinguished. They query
domain object thru libvirt common code or query vz sdk statistics handle thru
vz sdk sync operations.
vzDomainInterfaceStats is the only exception. It uses sdkdom object to
convert interface name to its vz sdk stack index which could not be saved in
domain cache. Interface statistics is available thru this stack index as a key
rather than name. As a result we can have accidental 'not known interface'
errors on quering intrerface stats. The reason is that in the process of
updating domain configuration we drop all devices and then recreate them again
in sdkdom object and domain lock can be dropped meanwhile (to remove networks
for existing bridged interfaces and(or) (re)create new ones). We can fix this
by changing the way we support bridged interfaces or by reordering operations
and changing bridged networks beforehand. Anyway this is better than moving
this API call into 2 group and making it an exclusive job.
As to API calls from group 2, first thread shared sdkdom object needs to be
explained. vz sdk has only one handle for a given domain, thus threads need
exclusive access to operate on it. These calls are fixed to drop and reacquire
domain lock on any lengthy operations - namely waiting the result of async vz
sdk operation. As lock is dropped we need to take extra reference to domain
object if it is not taken already as domain object can be deleted from list
while lock is dropped. As this operations use thread shared sdkdom object, the
simplest way to make calls from group 2 be consistent to each other is to make
them mutually exclusive. This is done by taking/releasing job condition thru
calling correspondent job routine. This approach makes group 1 and group
2 calls consistent to each other too. Not all calls of group 2 change the
domain cache but those that do update it thru prlsdkUpdateDomain which holds
the lock thoughout the update.
API calls from group [2] are easily distinguished too. They use
beginEdit/commit to change domain configuration (vzDomainSetMemoryFlags) or/and
update domain cache from sdkdom at the end of operation (vzDomainSuspend).
There is a known issue however. Frankly speaking it was introduced by ealier
patch '[PATCH 6/9] vz: cleanup loading domain code' from a different series.
The patch significantly reduced amount of time when the driver lock is held when
creating domain from API call or as a result of domain added event from vz sdk.
The problem is these two paths race on using thread shared sdkdom as we don't
have libvirt domain object and can not lock on it. However this don't
invalidates the patch as we can't use the former approach of preadding domain
into the list as we need name at least and name is not given by event. Anyway
i'm against adding half baked object into the list. Eventually this race can be
fixed by extra measures. As to current situation races with different
configurations are unlikely and race when adding domain thru vz driver and
simultaneous event from vz sdk is not dangerous as configuration is the same.
The last group [3] is API calls that need only sdkdom object to make vz sdk
call and don't change thread shared sdkdom object or domain cache in any way.
For now these are mostly domain snapshot API calls. The changes are similar to
those of group 2 - they add extra reference and drop/reacquire the lock on waiting
vz async call result. One can simply take the immutable sdkdom object from the
cache and drop the lock for the rest of operations but the chosen approach
makes implementation of these API calls somewhat similar to those of from group
2 and thus a bit futureproof. As calls of group 3 don't need vz driver
domain/vz sdk cache in any way, they are consistent with respect to API calls from
groups 1 and 3.
There is another exception. Calls to make-snapshot/revert-to-snapshot/migrate
are moved to group 2. That is they are made mutually exclusive. The reason
is that libvirt API supports control/query only for one job per domain and
these are jobs that are likely to be queried/aborted.
Appendix.
[1] API calls that only query domain cache.
(marked [*] are included for a different reason)
.domainLookupByID = vzDomainLookupByID, /* 0.10.0 */
.domainLookupByUUID = vzDomainLookupByUUID, /* 0.10.0 */
.domainLookupByName = vzDomainLookupByName, /* 0.10.0 */
.domainGetOSType = vzDomainGetOSType, /* 0.10.0 */
.domainGetInfo = vzDomainGetInfo, /* 0.10.0 */
.domainGetState = vzDomainGetState, /* 0.10.0 */
.domainGetXMLDesc = vzDomainGetXMLDesc, /* 0.10.0 */
.domainIsPersistent = vzDomainIsPersistent, /* 0.10.0 */
.domainGetAutostart = vzDomainGetAutostart, /* 0.10.0 */
.domainGetVcpus = vzDomainGetVcpus, /* 1.2.6 */
.domainIsActive = vzDomainIsActive, /* 1.2.10 */
.domainIsUpdated = vzDomainIsUpdated, /* 1.2.21 */
.domainGetVcpusFlags = vzDomainGetVcpusFlags, /* 1.2.21 */
.domainGetMaxVcpus = vzDomainGetMaxVcpus, /* 1.2.21 */
.domainHasManagedSaveImage = vzDomainHasManagedSaveImage, /* 1.2.13 */
.domainGetMaxMemory = vzDomainGetMaxMemory, /* 1.2.15 */
.domainBlockStats = vzDomainBlockStats, /* 1.2.17 */
.domainBlockStatsFlags = vzDomainBlockStatsFlags, /* 1.2.17 */
.domainInterfaceStats = vzDomainInterfaceStats, /* 1.2.17 */ [*]
.domainMemoryStats = vzDomainMemoryStats, /* 1.2.17 */
.domainMigrateBegin3Params = vzDomainMigrateBegin3Params, /* 1.3.5 */
.domainMigrateConfirm3Params = vzDomainMigrateConfirm3Params, /* 1.3.5 */
[2] API calls that use thread shared sdkdom object
(marked [*] are included for a different reason)
.domainSuspend = vzDomainSuspend, /* 0.10.0 */
.domainResume = vzDomainResume, /* 0.10.0 */
.domainDestroy = vzDomainDestroy, /* 0.10.0 */
.domainShutdown = vzDomainShutdown, /* 0.10.0 */
.domainCreate = vzDomainCreate, /* 0.10.0 */
.domainCreateWithFlags = vzDomainCreateWithFlags, /* 1.2.10 */
.domainReboot = vzDomainReboot, /* 1.3.0 */
.domainDefineXML = vzDomainDefineXML, /* 0.10.0 */
.domainDefineXMLFlags = vzDomainDefineXMLFlags, /* 1.2.12 */ (update part)
.domainUndefine = vzDomainUndefine, /* 1.2.10 */
.domainAttachDevice = vzDomainAttachDevice, /* 1.2.15 */
.domainAttachDeviceFlags = vzDomainAttachDeviceFlags, /* 1.2.15 */
.domainDetachDevice = vzDomainDetachDevice, /* 1.2.15 */
.domainDetachDeviceFlags = vzDomainDetachDeviceFlags, /* 1.2.15 */
.domainSetUserPassword = vzDomainSetUserPassword, /* 1.3.6 */
.domainManagedSave = vzDomainManagedSave, /* 1.2.14 */
.domainSetMemoryFlags = vzDomainSetMemoryFlags, /* 1.3.4 */
.domainSetMemory = vzDomainSetMemory, /* 1.3.4 */
.domainRevertToSnapshot = vzDomainRevertToSnapshot, /* 1.3.5 */ [*]
.domainSnapshotCreateXML = vzDomainSnapshotCreateXML, /* 1.3.5 */ [*]
.domainMigratePerform3Params = vzDomainMigratePerform3Params, /* 1.3.5 */ [*]
.domainUpdateDeviceFlags = vzDomainUpdateDeviceFlags, /* 2.0.0 */
prlsdkHandleVmConfigEvent
[3] API calls that do not use thread shared sdkdom object
.domainManagedSaveRemove = vzDomainManagedSaveRemove, /* 1.2.14 */
.domainSnapshotNum = vzDomainSnapshotNum, /* 1.3.5 */
.domainSnapshotListNames = vzDomainSnapshotListNames, /* 1.3.5 */
.domainListAllSnapshots = vzDomainListAllSnapshots, /* 1.3.5 */
.domainSnapshotGetXMLDesc = vzDomainSnapshotGetXMLDesc, /* 1.3.5 */
.domainSnapshotNumChildren = vzDomainSnapshotNumChildren, /* 1.3.5 */
.domainSnapshotListChildrenNames = vzDomainSnapshotListChildrenNames, /* 1.3.5 */
.domainSnapshotListAllChildren = vzDomainSnapshotListAllChildren, /* 1.3.5 */
.domainSnapshotLookupByName = vzDomainSnapshotLookupByName, /* 1.3.5 */
.domainHasCurrentSnapshot = vzDomainHasCurrentSnapshot, /* 1.3.5 */
.domainSnapshotGetParent = vzDomainSnapshotGetParent, /* 1.3.5 */
.domainSnapshotCurrent = vzDomainSnapshotCurrent, /* 1.3.5 */
.domainSnapshotIsCurrent = vzDomainSnapshotIsCurrent, /* 1.3.5 */
.domainSnapshotHasMetadata = vzDomainSnapshotHasMetadata, /* 1.3.5 */
.domainSnapshotDelete = vzDomainSnapshotDelete, /* 1.3.5 */
[4] Known issues.
1. accidental errors on getting network statistics
2. race with simultaneous use of thread shared domain object on paths
of adding domain thru API and adding domain on vz sdk domain added event.
sdk domain handle is unique per connection so there is
no sense to query it again if we have it in vzDomObjPtr.
Side effect of prlsdkSdkDomainLookupByUUID is refreshing
domain config is of no use too as PrlVm_BeginEdit do it too.
Commit id '5e46d7d6' did not take into account that usage of a luks
volume will require usage of the master key encrypted passphrase for
a QEMU environment. So rather than allow creation of something that
won't be usable, just fail the creation.
Resolves a CI test integration failure with a RHEL6/Centos6 environment.
In order to use a LUKS encrypted device, the design decision was to
generate an encrypted secret based on the master key. However, commit
id 'da86c6c' missed checking for that specifically.
When qemuDomainSecretSetup was implemented, a design decision was made
to "fall back" to a plain text secret setup if the specific cipher was
not available (e.g. virCryptoHaveCipher(VIR_CRYPTO_CIPHER_AES256CBC))
as well as the QEMU_CAPS_OBJECT_SECRET. For the luks encryption setup
there is no fall back to the plaintext secret, thus if that gets set
up by qemuDomainSecretSetup, then we need to fail.
Also, while the qemuxml2argvtest has set the QEMU_CAPS_OBJECT_SECRET
bit, it didn't take into account the second requirement that the
ability to generate the encrypted secret is possible. So modify the
test to not attempt to run the luks-disk if we know we don't have
the encryption algorithm.
Any error happening after the hand shake in the lxc controller
will not result in a failure as errors are checked during the handshake.
Move the handshake after the last possible error.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1301021
Generate the luks command line using the AES secret key to encrypt the
luks secret. A luks secret object will be in addition to a an AES secret.
For hotplug, check if the encinfo exists and if so, add the AES secret
for the passphrase for the secret object used to decrypt the device.
Modify/augment the fakeSecret* in qemuxml2argvtest in order to handle
find a uuid or a volume usage with a specific path prefix in the XML
(corresponds to the already generated XML tests). Add error message
when the 'usageID' is not 'mycluster_myname'. Commit id '1d632c39'
altered the error message generation to rely on the errors from the
secret_driver (or it's faked replacement).
Add the .args output for adding the LUKS disk to the domain
Signed-off-by: John Ferlan <jferlan@redhat.com>
Soon we will be adding luks encryption support. Since a volume could require
both a luks secret and a secret to give to the server to use of the device,
alter the alias generation to create a slightly different alias so that
we don't have two objects with the same alias.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Commit id 'a1344f70a' added AES secret processing for RBD when starting
up a guest. As such, when the hotplug code calls qemuDomainSecretDiskPrepare
an AES secret could be added to the disk about to be hotplugged. If an AES
secret was added, then the hotplug code would need to generate the secret
object because qemuBuildDriveStr would add the "password-secret=" to the
returned 'driveStr' rather than the base64 encoded password.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Partially resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1301021
If the volume xml was looking to create a luks volume take the necessary
steps in order to make that happen.
The processing will be:
1. create a temporary file (virStorageBackendCreateQemuImgSecretPath)
1a. use the storage driver state dir path that uses the pool and
volume name as a base.
2. create a secret object (virStorageBackendCreateQemuImgSecretObject)
2a. use an alias combinding the volume name and "_luks0"
2b. add the file to the object
3. create/add luks options to the commandline (virQEMUBuildLuksOpts)
3a. at the very least a "key-secret=%s" using the secret object alias
3b. if found in the XML the various "cipher" and "ivgen" options
Signed-off-by: John Ferlan <jferlan@redhat.com>
When formatting the graphics data for TYPE_SPICE, check if the glisten
is NULL before blindly referencing
Found by Coverity
Signed-off-by: John Ferlan <jferlan@redhat.com>
A recent adjustment to qemuDomainAttachRNGDevice to properly cleanup
the props object after a qemuMonitorAddObject also would affect this
code. Alter the cleanup to be similar to RNG changes.
Based on recent review comment - rather than have a spate of goto failxxxx,
change to a boolean based model. Ensures that the original error can be
preserved and cleanup is a bit more orderly if more objects are added.
Based on recent review comment - rather than have a spate of goto failxxxx,
change to a boolean based model. Ensures that the original error can be
preserved and cleanup is a bit more orderly if more objects are added.
Based on recent review comment - rather than have a spate of goto failxxxx,
change to a boolean based model. Ensures that the original error can be
preserved and cleanup is a bit more orderly if more objects are added.
Based on recent review comment - rather than have a spate of goto failxxxx,
change to a boolean based model. Ensures that the original error can be
preserved and cleanup is a bit more orderly if more objects are added.
Based on recent review comment - rather than have a spate of goto failxxxx,
change to a boolean based model. Ensures that the original error can be
preserved and cleanup is a bit more orderly if more objects are added.
Commit da665fbd introduced the following condition to virLXCProcessEnsureRootFS
and openvzReadFSConf:
if (!(<some_var> = virDomainFSDefNew()) < 0)
which broke the build on fedora with GCC 5.3.1: "logical not is only applied to
the left hand side of comparison".
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The commit da665fbd introduced virStorageSourcePtr inside the structure
_virDomainFSDef. This is causing an error when libvirt is being compiled.
make[3]: Entering directory `/media/julio/8d65c59c-6ade-4740-9cdc-38016a4cb8ae
/home/julio/Desktop/virt/libvirt/src'
CC security/virt_aa_helper-virt-aa-helper.o
security/virt-aa-helper.c: In function 'get_files':
security/virt-aa-helper.c:1087:13: error: passing argument 2 of 'vah_add_path'
from incompatible pointer type [-Werror]
if (vah_add_path(&buf, fs->src, "rw", true) != 0)
^
security/virt-aa-helper.c:732:1: note: expected 'const char *' but argument is
of type 'virStorageSourcePtr'
vah_add_path(virBufferPtr buf, const char *path, const char *perms, bool
recursive)
^
cc1: all warnings being treated as errors
Adding the attribute "path" from virStorageSourcePtr fixes this issue.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
vz supports only a subset of tcp and udp parameters.
1. tcp type supports only 'raw' protocol.
2. udp type supports only same parameters of 'host' and 'service'
for 'bind' and 'connect'.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
After domain is in the domains list let's keep it there. This
is approach taken by qemu driver and vz vzDomainMigrateFinish3Params too.
It quite reasonable, driver domain object is fully constructed and
can be discovered by client later.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
9c14a9ab introduced vzNewDomain function to enlist libvirt domain
object before actually creating vz sdk domain. Fix should fix
race on same vz sdk domain added event where libvirt domain object is
enlisted too. But later eb5e9c1e added locked checks for
adding livirtd domain object to list on vz sdk domain added event.
Thus now approach of 9c14a9ab is unnecessary complicated.
See we have otherwise unuseful prlsdkGetDomainIds function only
to create minimal domain definition to create libvirt domain object.
Also vzNewDomain is difficult to use as it creates partially
constructed domain object.
Let's move back to original approach where prlsdkLoadDomain do
all the necessary job. Another benefit is that we can now
take driver lock for bare minimum and in single place. Reducing
locking time have small disadvatage of double parsing on race
conditions which is typical if domain is added thru vz driver.
Well we have this double parse inevitably with current vz sdk api
on any domain updates so i would not take it here seriously.
Performance events subscribtion is done before locked check and
therefore could be done twice on races but this is not the problem.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Vz containers are able to use ploop volumes from storage pools
to work upon.
To use filesystem type volume, pool name and volume name should be
specifaed in <source> :
<filesystem type='volume' accessmode='passthrough'>
<driver type='ploop' format='ploop'/>
<source pool='guest_images' volume='TEST_POOL_CT'/>
<target dir='/'/>
</filesystem>
The information about pool and volume is stored in ct dom configuration:
<StorageURL>libvirt://localhost/pool_name/vol_name</StorageURL>
and can be easily obtained via PrlVmDevHd_GetStorageURL sdk call.
The only shorcoming: if storage pool is moved somewhere the ct
should be redefined in order to refresh the information aboot path
to root.hdd
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
We do not need to check domainf fs type there,
because it is done in prlsdkCheckUnsupportedParams.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
New type of <devices> <filesystem type= 'volume'> is introduced.
This patch allows to use volumes for storing the filesystem, that is
accessed from the guest e.g. root directory for container.
To take advantage of volumes as a backend of filesystem volume
and pool names should be specified:
<filesystem type= 'volume'>
<source pool='pool name' volume='volume name'/>
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Adding domain to domain list on preparation step is not correct.
First domain is not fully constructed - domain definition is
missing. Second we can't use VIR_MIGRATE_PARAM_DEST_XML parameter
to parse definition as vz sdk can patch it by itself. Let's add/remove
domain on finish step. This is for synchronization purpose only so domain
is present/absent on destination after migration completion. Actually
domain object will probably be created right after actual vz sdk
migration start by vz sdk domain defined event.
We can not and should not sync domain cache on error path in finish step
of migration. We can not as we really don't know what is the reason of
cancelling and we should not as user should not make assumptions on
state on error path. What we should do is cleaning up temporary migration
state that is induced on prepare step but we don't have one. Thus
cancellation should be noop.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
libvirt domain defined event is issued only on correspondent vz sdk
event. But in case event delivered before domain is added to
domain list we can mistakenly skip this event if prlsdkNewDomainByHandle
return NULL in case of domain is discovered in the list under
the driver lock. Let's return domain object in this case.
Now prlsdkNewDomainByHandle returns NULL only in case of
error which is more convinient.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
The first version of migration cookie was rather dumb resulting
in passing empty or unused fields here and there. Add flags to
specify what to bake to and eat from cookie so we deal only
with meaningful data. However for backwards compatibility
we still need to pass at least some faked fields sometimes.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Older libvirt versions send persistent XML in a migration cookie even
when VIR_MIGRATE_PERSIST_DEST flag is not used, but current libvirt
properly fails if the cookie contains unexpected flags. Thus migration
from old libvirt fails with
internal error: Unsupported migration cookie feature persistent
unless VIR_MIGRATE_PERSIST_DEST flag is set.
https://bugzilla.redhat.com/show_bug.cgi?id=1320500
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
A nodedev device definition like this is required for testing
NodeDeviceCreateXML and NodeDeviceDestroy. So unless it's part
of the stock test:///default set there's no way to actually
invoke those functions for the default URI
Convert the individual XML documents into one big XML document
in the format expected by the non-default test://$PATH URI, and
use the same internal helpers for assembling the driver contents.
virConfGetValueLLong() errors out if the value is too big to
fit into a long long integer, but claims the supported range
to be (0,LLONG_MAX) instead of (LLONG_MIN,LLONG_MAX).
When parsing numeric values, we always store them as unsigned
unless they're negative. We can use this fact to simplify the
logic by removing a bunch of unnecessary checks.
Commit 6381c89f8c changed virConfValue to store long long
integers instead of long integers; however, the temporary variable
used in virConfParseLong() was not updated accordingly, causing
trouble for 32-bit machines.
In preparation to tracking which USB addresses are occupied.
Introduce two helper functions for printing the port path
as a string and appending it to a virBuffer.
We were requiring a USB port path in the schema, but not enforcing it.
Omitting the USB port would lead to libvirt formatting it as (null).
Such domain cannot be started and will disappear after libvirtd restart
(since it cannot parse back the XML).
Only format the port if it has been specified and mark it as optional
in the XML schema.
Migration to an older libvirt (pre v1.3.0-175-g7140807) is broken
because older versions of libvirt generated different channel paths and
they didn't drop the default paths when parsing domain XMLs. We'd get
such a nice error message:
internal error: process exited while connecting to monitor:
2016-07-08T15:28:02.665706Z qemu-kvm: -chardev socket,
id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/
domain-3-nest/org.qemu.guest_agent.0,server,nowait: Failed to bind
socket to /var/lib/libvirt/qemu/channel/target/domain-3-nest/
org.qemu.guest_agent.0: No such file or directory
That said, we should not even format the default paths when generating a
migratable XML.
https://bugzilla.redhat.com/show_bug.cgi?id=1320470
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Playing directly with our live definition, updating it, and reverting it
back once we are done is very nice and it's quite dangerous too. Let's
just make a copy of the domain definition if needed and do all tricks on
the copy.
https://bugzilla.redhat.com/show_bug.cgi?id=1320470
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
If size_t is the same size as long long, then we can skip
some of the range checks. This avoids triggering some
bogus compiler warning messages.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virConf 'l' field is a 'signed long long', so whenever
the 'type' field is VIR_CONF_ULONG, we should explicitly cast
'l' to a 'unsigned long long' before doing range checks.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This function tries to get a ssize_t value from a config file.
But before returning it, it checks whether the value would fit in
ssize_t and if not an error is printed out among with the range
for the ssize_t type. However, on some platforms SSIZE_MAX may
actually be a signed long type:
util/virconf.c: In function 'virConfGetValueSSizeT':
util/virconf.c:1268:9: error: format '%zd' expects argument of type 'signed size_t', but argument 9 has type 'long int' [-Werror=format=]
virReportError(VIR_ERR_INTERNAL_ERROR,
^
$ grep -r SSIZE_MAX /usr/include/
/usr/include/bits/posix1_lim.h:#ifndef SSIZE_MAX
/usr/include/bits/posix1_lim.h:# define SSIZE_MAX LONG_MAX
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
MinGW complained that we might be dereferencing a NULL pointer. While
that can't be true, the logic certainly allows for that.
../../src/conf/domain_conf.c: In function 'virDomainDefPostParse':
../../src/conf/domain_conf.c:4224:18: error: potential null pointer dereference [-Werror=null-dereference]
if (!vcpu->online && vcpu->cpumask) {
~~~~^~~~~~~~
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
MinGW complained that we might be dereferencing a NULL pointer. While
that can't be true, the logic certainly allows for that.
src/conf/domain_conf.c: In function 'virDomainDefGetVcpuPinInfoHelper':
src/conf/domain_conf.c:1545:17: error: potential null pointer dereference [-Werror=null-dereference]
if (vcpu->cpumask)
~~~~^~~~~~~~~
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
IPv6 RA always contains an implicit default route via
the link-local address of the source of RA. This forces
the guest to install a route via isolated network, which
may disturb the guest's networking in case of multiple interfaces.
More info in 013427e6e7.
The validity of this route is controlled by "default [route] lifetime"
field of RA. If the lifetime is set to 0 seconds, then no route
is installed by receiver.
dnsmasq 2.67+ supports "ra-param=<interface>,<RA interval>,<default
lifetime>" option. We pass "ra-param=*,0,0"
(here, RA_interval=0 means default) to disable default gateway in RA
for isolated networks.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1354238
So we spend some time and effort constructing perfect file name
for an automatic coredump of a domain, but then just leak it and
use the domain name anyway. This is probably due to a silly
mistake that slipped even through review.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
At least with systemd v210, NOTIFY_SOCKET is abstact, e.g.
@/org/freedesktop/systemd1/notify. sendmsg() fails on such a socket
with "Connection refused". The unix(7) man page contains the following
details wrt abstract socket addresses
abstract: an abstract socket address is distinguished (from a
pathname socket) by the fact that sun_path[0] is a null byte
('\0'). The socket's address in this namespace is given by the
additional bytes in sun_path that are covered by the specified
length of the address structure. (Null bytes in the name have
no special significance.)
So we need to be more precise about the address length, setting it to
the sizeof sa_family_t + length of address copied to sun_path instead
of setting it to the sizeof the entire sockaddr_un struct.
Resolves: https://bugzilla.opensuse.org/show_bug.cgi?id=987668
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
When fetching domains with virConnectListAllDomains() and when filtering
by snapshot existence is requested the ESX driver first lists all the
domains and then check one-by-one for snapshot existence. This process
takes unnecessarily long time.
To significantly improve the time necessary to finish the query we can
request the snapshot related info directly when querying the list of
domains from VMware.
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
In an unlikely event of execve() failing, the virCommandExec()
function does not report any error, even though checks that are
at the beginning of the function are verbose when failing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The modification of .volWipe callback wipes ploop volume using one of
given wiping algorithm: dod, nnsa, etc.
However, in case of ploop volume we need to reinitialize root.hds and DiskDescriptor.xml.
v2:
- added check on ploop tools presens
- virCommandAddArgFormat changed to virCommandAddArg
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Check whether QEMU supports -device intel-iommu
Note that the presence of this option does not mean that it's
usable because of a bug in earlier QEMU versions, but it's
better than nothing.
https://bugzilla.redhat.com/show_bug.cgi?id=1235580
Currently many users of virConf APIs are defining the same
macros for calling virConfValue() and then doing type
checking. To remove this repeated code, add a set of
typesafe accessor methods.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the config file does not end with a \n, the parser will append
one. When re-allocating the array though, it is mistakenly assuming
that 'len' is the length including the trailing NUL, but it does
not. So we must add 2 to len, when reallocating, not 1.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This one's a bit more complicated. In qemuProcessPrepareDomain()
a master key for encrypting secret for ciphered disks is created.
This object lives within qemuDomainObjPrivate object. It is freed
in qemuProcessStop(), but if nobody calls it (for instance like
our qemuxml2argvtest does), the key object leaks.
==17078== 32 bytes in 1 blocks are definitely lost in loss record 633 of 707
==17078== at 0x4C2C070: calloc (vg_replace_malloc.c:623)
==17078== by 0xAD924DF: virAllocN (viralloc.c:191)
==17078== by 0x5050BA6: virCryptoGenerateRandom (qemuxml2argvmock.c:166)
==17078== by 0x453DC8: qemuDomainMasterKeyCreate (qemu_domain.c:678)
==17078== by 0x47A36B: qemuProcessPrepareDomain (qemu_process.c:4913)
==17078== by 0x47C728: qemuProcessCreatePretendCmd (qemu_process.c:5542)
==17078== by 0x433698: testCompareXMLToArgvFiles (qemuxml2argvtest.c:332)
==17078== by 0x4339AC: testCompareXMLToArgvHelper (qemuxml2argvtest.c:413)
==17078== by 0x446E7A: virTestRun (testutils.c:179)
==17078== by 0x445BD9: mymain (qemuxml2argvtest.c:2022)
==17078== by 0x44886F: virTestMain (testutils.c:969)
==17078== by 0x445D9B: main (qemuxml2argvtest.c:2036)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Just like every other qemuBuild*CommandLine() function, this uses
a buffer to hold partial cmd line strings too. However, if
there's an error, the control jumps to 'cleanup' label leaving
the buffer behind and thus leaking it.
==2013== 1,006 bytes in 1 blocks are definitely lost in loss record 701 of 711
==2013== at 0x4C29F80: malloc (vg_replace_malloc.c:296)
==2013== by 0x4C2C32F: realloc (vg_replace_malloc.c:692)
==2013== by 0xAD925A8: virReallocN (viralloc.c:245)
==2013== by 0xAD95EA8: virBufferGrow (virbuffer.c:130)
==2013== by 0xAD95F78: virBufferAdd (virbuffer.c:165)
==2013== by 0x5097F5: qemuBuildCpuModelArgStr (qemu_command.c:6339)
==2013== by 0x509CC3: qemuBuildCpuCommandLine (qemu_command.c:6437)
==2013== by 0x51142C: qemuBuildCommandLine (qemu_command.c:9174)
==2013== by 0x47CA3A: qemuProcessCreatePretendCmd (qemu_process.c:5546)
==2013== by 0x433698: testCompareXMLToArgvFiles (qemuxml2argvtest.c:332)
==2013== by 0x4339AC: testCompareXMLToArgvHelper (qemuxml2argvtest.c:413)
==2013== by 0x446E7A: virTestRun (testutils.c:179)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When storage secret is parsed in virStorageEncryptionSecretParse(),
virSecretLookupParseSecret() which allocates some memory. This is
however never freed.
==21711== 134 bytes in 6 blocks are definitely lost in loss record 70 of 85
==21711== at 0x4C29F80: malloc (vg_replace_malloc.c:296)
==21711== by 0xBCA0356: xmlStrndup (in /usr/lib64/libxml2.so.2.9.4)
==21711== by 0xA9F432E: virXMLPropString (virxml.c:479)
==21711== by 0xA9D25B0: virSecretLookupParseSecret (virsecret.c:70)
==21711== by 0xA9D616E: virStorageEncryptionSecretParse (virstorageencryption.c:172)
==21711== by 0xA9D66B2: virStorageEncryptionParseXML (virstorageencryption.c:281)
==21711== by 0xA9D68DF: virStorageEncryptionParseNode (virstorageencryption.c:338)
==21711== by 0xAA12575: virDomainDiskDefParseXML (domain_conf.c:7606)
==21711== by 0xAA2CAC6: virDomainDefParseXML (domain_conf.c:16658)
==21711== by 0xAA2FC75: virDomainDefParseNode (domain_conf.c:17472)
==21711== by 0xAA2FAE4: virDomainDefParse (domain_conf.c:17419)
==21711== by 0xAA2FB72: virDomainDefParseFile (domain_conf.c:17443)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Setting up cgroups and other things for all kinds of threads (the
emulator thread, vCPU threads, I/O threads) was copy-pasted every time
new thing was added. Over time each one of those functions changed a
bit differently. So create one function that does all that setup and
start using it, starting with I/O thread setup. That will shave some
duplicated code and maybe fix some bugs as well.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The code in qemuDomainObjPrivateXMLParseVcpu for parsing
the 'idstr' string was comparing the overall boolean
result against 0 which was always true
qemu/qemu_domain.c: In function 'qemuDomainObjPrivateXMLParseVcpu':
qemu/qemu_domain.c:1482:59: error: comparison of constant '0' with boolean expression is always false [-Werror=bool-compare]
if ((idstr && virStrToLong_uip(idstr, NULL, 10, &idx)) < 0 ||
^
It was further performing two distinct error checks in
the same conditional and reporting a single error message,
which was misleading in one of the two cases.
This splits the conditional check into two parts with
distinct error messages and fixes the logic error.
Fixes the bug in
commit 5184f398b4
Author: Peter Krempa <pkrempa@redhat.com>
Date: Fri Jul 1 14:56:14 2016 +0200
qemu: Store vCPU thread ids in vcpu private data objects
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
An error in virHostCPUGetKVMMaxVCPUs() means we've been unable
to access /dev/kvm, or we're running on a platform that doesn't
support KVM in the first place.
If that's the case, we shouldn't ignore the error and report
domcapabilities even though we know the user won't be able to
start any KVM guest.
If we don't HAVE_LINUX_KVM_H, we can't query /dev/kvm to discover
the limits on the number of vCPUs, so we report an error and
return a negative value instead.
Allow to store driver specific data on a per-vcpu basis.
Move of the virDomainDef*Vcpus* functions was necessary as
virDomainXMLOptionPtr was declared below this block and I didn't want to
split the function headers.
Most callers make sure that it's never called with an out of range vCPU.
Every other caller reports a different error explicitly. Drop the error
reporting and clean up some dead code paths.
A simple getopt-based argument parser is added for the /usr/sbin/bhyveload
command, loosely based on its argument parser.
The boot disk is guessed by iterating over all
disks and matching their sources. If any non-default arguments are found,
def->os.bootloaderArgs is set accordingly, and the bootloader is treated as a
custom bootloader.
Custom bootloader are supported by setting the def->os.bootloader and
def->os.bootloaderArgs accordingly
grub-bhyve is also treated as a custom bootloader. Since we don't get the
device map in the native format anyways, we can't reconstruct the complete
boot order. While it is possible to check what type the grub boot disk is by
checking if the --root argument is "cd" or "hd0,msdos1", and then just use the
first disk found, implementing the grub-bhyve argument parser as-is in the
grub-bhyve source would mean adding a dependency to argp or duplicating lots
of the code of argp. Therefore it's not really worth implementing that now.
Signed-off-by: Fabian Freyer <fabian.freyer@physik.tu-berlin.de>
A simpe getopt-based argument parser is added for the /usr/sbin/bhyve command,
loosely based on its argument parser, which reads the following from the bhyve
command line string:
* vm name
* number of vcpus
* memory size
* the time offset (UTC or localtime)
* features:
* acpi
* ioapic: While this flag is deprecated in FreeBSD r257423, keep checking for
it for backwards compatibiility.
* the domain UUID; if not explicitely given, one will be generated.
* lpc devices: for now only the com1 and com2 are supported. It is required for
these to be /dev/nmdm[\d+][AB], and the slave devices are automatically
inferred from these to be the corresponding end of the virtual null-modem
cable: /dev/nmdm<N>A <-> /dev/nmdm<N>B
* PCI devices:
* Disks: these are numbered in the order they are found, for virtio and ahci
disks separately. The destination is set to sdX or vdX with X='a'+index;
therefore only 'z'-'a' disks are supported.
Disks are considered to be block devices if the path
starts with /dev, otherwise they are considered to be files.
* Networks: only tap devices are supported. Since it isn't possible to tell
the type of the network, VIR_DOMAIN_NET_TYPE_ETHERNET is assumed, since it
is the most generic. If no mac is specified, one will be generated.
Signed-off-by: Fabian Freyer <fabian.freyer@physik.tu-berlin.de>
First, remove escaped newlines and split up the string into an argv-list for
the bhyve and loader commands, respectively. This is done by iterating over the
string splitting it by newlines, and then re-iterating over each line,
splitting it by spaces.
Since this code reuses part of the code of qemu_parse_command.c
(in bhyveCommandLine2argv), add the appropriate copyright notices.
Signed-off-by: Fabian Freyer <fabian.freyer@physik.tu-berlin.de>
As there is an explicit constructor for the special case of empty
bitmaps, we should mention that the generic constructors rejects the
creation of empty bitmaps.
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Before the variable 'bits' was initialized with 0 (commit
3470cd860d), the following bug was
possible.
A function call with an empty bitmap leads to undefined
behavior. Because if 'bitmap->map_len == 0' 'unusedBits' will be <= 0
and 'sz == 1'. So the non global and non static variable 'bits' would
have never been set. Consequently the check 'bits == 0' results in
undefined behavior.
This patch clarifies the current version of the function by handling the
empty bitmap explicitly. Also, for an empty bitmap there is obviously no
bit set so we can just return -1 (indicating no bit set) right away. The
explicit check for 'bits == 0' after the loop is unnecessary because we
only get to this point if no set bit was found.
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Otherwise migration during which we didn't send client_migrate_info QMP
command will get stuck waiting for SPICE migration to finish if libvirtd
sent the QMP command in a previous migration attempt.
Broken by bd7c8a69.
https://bugzilla.redhat.com/show_bug.cgi?id=1151723
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
People debugging guest OS boot processes and reported that
the default 128 KB size is too small to capture an entire
boot up sequence. Increase the default size to 2 MB which
should allow capturing a full boot up even with verbose
debugging.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently virtlogd has a hardcoded max file size of 128kb
and max of 3 backups. This adds two new config parameters
to /etc/libvirt/virtlogd.conf to let these be customized.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
After 27726d8c21 a privateData is allocated in
virDomainHostdevDefAlloc(). However, the counter part - freeing
them in Free() is missing which leads to the following memory
leak:
==6489== 24 bytes in 1 blocks are definitely lost in loss record 684 of 1,003
==6489== at 0x4C2C070: calloc (vg_replace_malloc.c:623)
==6489== by 0x54B7C94: virAllocVar (viralloc.c:560)
==6489== by 0x5517BE6: virObjectNew (virobject.c:193)
==6489== by 0x1B400121: qemuDomainHostdevPrivateNew (qemu_domain.c:798)
==6489== by 0x5557B24: virDomainHostdevDefAlloc (domain_conf.c:2152)
==6489== by 0x5575578: virDomainHostdevDefParseXML (domain_conf.c:12709)
==6489== by 0x5582292: virDomainDefParseXML (domain_conf.c:16995)
==6489== by 0x5583C98: virDomainDefParseNode (domain_conf.c:17470)
==6489== by 0x5583B07: virDomainDefParse (domain_conf.c:17417)
==6489== by 0x5583B95: virDomainDefParseFile (domain_conf.c:17441)
==6489== by 0x55A3F24: virDomainObjListLoadConfig (virdomainobjlist.c:465)
==6489== by 0x55A43E6: virDomainObjListLoadAllConfigs (virdomainobjlist.c:596)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is preferrable to -nographic which (in addition to disabling
graphics output) redirects the serial port to stdio and on OpenBIOS
enables the firmware's serial console.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Libxl is the last user and I don't have the toolchain prepared to
compile the libxl driver. Move it to the libxl driver to avoid having to
refactor the code.
This is just a convenience method for discarding a list of filters instead of
using a 'for' loop everywhere. It is safe to pass -1 as the number of elements
in the list as well as passing NULL as list reference.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Provide a separate method to free a logging filter object. This will come handy
once a method to create an individual logging filter object is introduced.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This is just a convenience method for discarding a list of outputs instead of
using a 'for' loop everywhere. It is safe to pass -1 as the number of elements
in the list as well as passing NULL as list reference.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Provide a separate method to free a logging output object. This will come handy
once a method to create an individual logging output object is introduced.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Same as with outputs; since the operations will be further divided into smaller
tasks, creating a filter will become a separate operation that will return
a reference to a newly created filter.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Right now, we define outputs one after another. However, the correct flow
should be to define a set of outputs as a whole unit. Therefore each output
should be first created, placed into an array/list and the list will be
defined. Output creation should be a separate operation, so an output will be
returned by a reference. From that perspective, it makes perfect sense to
only store pointers to actual outputs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In this particular case, reset is meant as clearing the whole list of
outputs/filters, not resetting it to a predefined default setting. Looking at
it from that perspective, returning the number of records removed doesn't help
the caller in any way (not that any of the callers would actually check for
it). Well, callers could detect an error from the number of successfully
removed records, but the only thing that can fail in virLogReset is force
closing a file descriptor in which case the error isn't propagated back to
virLogReset anyway. Conclusion: there is no practical use for having a return
type of 'int' rather than 'void' in this case.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Due to the way the hardware works, KVM on ppc64 always requires
memory locking; however, that is not the case for non-KVM ppc64
guests, eg. ppc64 guests that are running on x86_64 with TCG.
Only require memory locking for ppc64 guests if they are using
KVM or, as it's the case for all architectures, they have host
devices assigned using VFIO.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1350772
For type='ethernet' interfaces only.
(This patch had been pushed earlier in
commit 0b4645a7e0, but was reverted in
commit 84d47a3cce because it had been
accidentally pushed during the freeze for release 2.0.0)
(This patch had been pushed earlier in
commit cd5c9f21de, but was reverted in
commit 1549f16832 because it had been
accidentally pushed during the freeze for release 2.0.0)
This will apply to any IP address setting that uses
virNetDevIPInfoAddToDev() (which so far is only the guest-side of LXC
type='ethernet' interfaces).
(This patch had been pushed earlier in
commit cb20f989df, but was reverted in
commit cba06aea8d because it had been
accidentally pushed during the freeze for release 2.0.0)
This is place as a sub-element of <source>, where other aspects of the
host-side connection to the network device are located (network or
bridge name, udp listen port, etc). It's a bit odd that the interface
we're configuring with this info is itself named in <target dev='x'/>,
but that ship sailed long ago:
<interface type='ethernet'>
<mac address='00:16:3e:0f:ef:8a'/>
<source>
<ip address='192.168.122.12' family='ipv4'
prefix='24' peer='192.168.122.1'/>
<ip address='192.168.122.13' family='ipv4' prefix='24'/>
<route family='ipv4' address='0.0.0.0'
gateway='192.168.122.1'/>
<route family='ipv4' address='192.168.124.0' prefix='24'
gateway='192.168.124.1'/>
</source>
</interface>
In practice, this will likely only be useful for type='ethernet', so
its presence in any other type of interface is currently forbidden in
the generic device Validate function (but it's been put into the
general population of virDomainNetDef rather than the
ethernet-specific union member so that 1) we can more easily add the
capability to other types if needed, and 2) we can retain the info
when set to an invalid interface type all the way through to
validation and report a proper error, rather than just ignoring it
(which is currently what happens for many other type-specific
settings).
(NB: The already-existing configuration of IP info for the guest-side
of interfaces is in subelements directly under <interface>, and the
name of the guest-side interface (when configurable) is in <guest
dev='x'/>).
(This patch had been pushed earlier in
commit fe6a77898a, but was reverted in
commit d658456530 because it had been
accidentally pushed during the freeze for release 2.0.0)
The peer attribute is used to set the property of the same name in the
interface IP info:
<interface type='ethernet'>
...
<ip family='ipv4' address='192.168.122.5'
prefix='32' peer='192.168.122.6'/>
...
</interface>
Note that this element is used to set the IP information on the
*guest* side interface, not the host side interface - that will be
supported in an upcoming patch.
(This patch now has quite a history: it was originally pushed in
commit 690969af, which was subsequently reverted in commit 1d14b13f,
then reworked and pushed (along with a lot of other related/supporting
patches) in commit 93135abf1; however *that* commit had been
accidentally pushed during dev. freeze for release 2.0.0, so it was
again reverted in commit f6acf039f0).
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Laine Stump <laine@laine.org>
This patch takes the code out of
lxcContainerRenameAndEnableInterfaces() that adds all IP addresses and
IP routes to the interface, and puts it into a utility function
virNetDevIPInfoAddToDev() in virnetdevip.c so that it can be used by
anyone.
One small change in functionality -
lxcContainerRenameAndEnableInterfaces() previously would add all IP
addresses to the interface while it was still offline, then set the
interface online, and then add the routes. Because I don't want the
utility function to set the interface online, I've moved this up so
the interface is first set online, then IP addresses and routes are
added. This is the same order that the network service from
initscripts (in ifup-ether) does it, so it shouldn't pose any problem
(and hasn't, in the tests that I've run).
(This patch had been pushed earlier in commit
f1e0d0da11, but was reverted in commit
05eab47559 because it had been
accidentally pushed during the freeze for release 2.0.0)
Introduce a helper to help determine if a disk src could be possibly used
for a disk secret... Going to need this for hot unplug.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In order to use more common code and set up for a future type, modify the
encryption secret to allow the "usage" attribute or the "uuid" attribute
to define the secret. The "usage" in the case of a volume secret would be
the path to the volume as dictated by the backwards compatibility brought
on by virStorageGenerateQcowEncryption where it set up the usage field as
the vol->target.path and didn't allow someone to provide it. This carries
into virSecretObjListFindByUsageLocked which takes the secret usage attribute
value from from the domain disk definition and compares it against the
usage type from the secret definition. Since none of the code dealing
with qcow/qcow2 encryption secrets uses usage for lookup, it's a mostly
cosmetic change. The real usage comes in a future path where the encryption
is expanded to be a luks volume and the secret will allow definition of
the usage field.
This code will make use of the virSecretLookup{Parse|Format}Secret common code.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a new secret type known as "passphrase" - it will handle adding the
secret objects that need a passphrase without a specific username.
The format is:
<secret ...>
<uuid>...</uuid>
...
<usage type='passphrase'>
<name>mumblyfratz</name>
</usage>
</secret>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since the virSecretDefParseUsage ensures each of the fields is present,
no need to check during virSecretDefFormatUsage (also virBufferEscapeString
is a no-op with a NULL argument).
Signed-off-by: John Ferlan <jferlan@redhat.com>
A helper that will execute a callback on every USB device
in the domain definition.
With an ability to skip USB hubs, since we will want to treat
them differently in some cases.
I'm not sure why our code claimed "-boot menu=on" cannot be used in
combination with per-device bootindex, but it was proved wrong about
four years ago by commit 8c952908. Let's always use bootindex when QEMU
supports it.
https://bugzilla.redhat.com/show_bug.cgi?id=1323085
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Pretending (partial) support for something we don't understand is risky.
Reporting a failure is much better.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
virTypedParameterAssign steals the string rather than copying it into
the typed parameter and thus freeing it leads to a crash when attempting
to serialize the results.
This was introduced in commit 9f50f6e2 and later made an universal
helper in 32e6339c.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1351473
Some code paths already assume that it is allocated since it was always
allocated by virDomainPerfDefParseXML. Make it member of virDomainDef
directly so that we don't have to allocate it all the time.
This fixes crash when attempting to connect to an existing process via
virDomainQemuAttach since we would not allocate it in that code path.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1350688
Ensure that the given controller and all controllers with a smaller
index exist; there must not be any missing index in between.
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
The commit "qemu: hot-plug: Assume support for -device in
qemuDomainAttachSCSIDisk" dropped the code for the automatic SCSI
controller creation used in SCSI disk hot-plugging. If we are
hot-plugging a SCSI disk to a domain and there is no proper SCSI
controller defined, it results in an "error: internal error: Could not
find scsi controller with index X required for device" error.
For that reason reverting a hunk of the commit
d4d32005d6.
This patch also adds an extra comment to the code to clarify the
loop.
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
CVE-2016-5008
Setting an empty graphics password is documented as a way to disable
VNC/SPICE access, but QEMU does not always behaves like that. VNC would
happily accept the empty password. Let's enforce the behavior by setting
password expiration to "now".
https://bugzilla.redhat.com/show_bug.cgi?id=1180092
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Similarly to what virsh virt-login-shell do, call virAdmInitialize prior to
initializing an event loop and initializing the error handler. Commit 97973ebb7
described and fixed an identical issue for libvirt_lxc.
Since virAdmInitialize becomes a public API after applying this patch,
the symbol is also added to public syms and the doc string of the method is
slightly enhanced analogically to virInitialize.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
virNetServerClientGetInfo returns the client's remote address
as a string, which is a part of the client object.
Use VIR_STRDUP to make a copy which can be freely accessed
even after the virNetServerClient object is unlocked.
To reproduce, put a sleep between virObjectUnlock in
virNetServerClientGetInfo and virTypedParamsAddString in
adminClientGetInfo, then close the queried connection during
that sleep.
https://bugzilla.redhat.com/show_bug.cgi?id=1316370
Consider the following disk for a domain:
<disk type='volume' device='cdrom'>
<driver name='qemu' type='raw'/>
<auth username='libvirt'>
<secret type='iscsi' usage='libvirtiscsi'/>
</auth>
<source pool='iscsi-secret-pool' volume='unit:0:0:0' mode='direct' startupPolicy='optional'/>
<target dev='sda' bus='scsi'/>
<readonly/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
Now, startupPolicy is currently not allowed for iscsi disks, so
one would expect an error message to be thrown. But what a
surprise is waiting for users if they try to start up such
domain:
==15724== Invalid free() / delete / delete[] / realloc()
==15724== at 0x4C2B1F0: free (vg_replace_malloc.c:473)
==15724== by 0x54B7A69: virFree (viralloc.c:582)
==15724== by 0x552DC90: virStorageAuthDefFree (virstoragefile.c:1549)
==15724== by 0x552F023: virStorageSourceClear (virstoragefile.c:2055)
==15724== by 0x552F054: virStorageSourceFree (virstoragefile.c:2067)
==15724== by 0x55556AA: virDomainDiskDefFree (domain_conf.c:1562)
==15724== by 0x5557ABE: virDomainDefFree (domain_conf.c:2547)
==15724== by 0x1B43CC42: qemuProcessStop (qemu_process.c:5918)
==15724== by 0x1B43BA2E: qemuProcessStart (qemu_process.c:5511)
==15724== by 0x1B48993E: qemuDomainObjStart (qemu_driver.c:7050)
==15724== by 0x1B489B9A: qemuDomainCreateWithFlags (qemu_driver.c:7104)
==15724== by 0x1B489C01: qemuDomainCreate (qemu_driver.c:7122)
==15724== Address 0x21cfbb90 is 0 bytes inside a block of size 48 free'd
==15724== at 0x4C2B1F0: free (vg_replace_malloc.c:473)
==15724== by 0x54B7A69: virFree (viralloc.c:582)
==15724== by 0x552DC90: virStorageAuthDefFree (virstoragefile.c:1549)
==15724== by 0x12D1C8D4: virStorageTranslateDiskSourcePool (storage_driver.c:3475)
==15724== by 0x1B4396E4: qemuProcessPrepareDomain (qemu_process.c:4896)
==15724== by 0x1B43B880: qemuProcessStart (qemu_process.c:5466)
==15724== by 0x1B48993E: qemuDomainObjStart (qemu_driver.c:7050)
==15724== by 0x1B489B9A: qemuDomainCreateWithFlags (qemu_driver.c:7104)
==15724== by 0x1B489C01: qemuDomainCreate (qemu_driver.c:7122)
==15724== by 0x561CA97: virDomainCreate (libvirt-domain.c:6787)
==15724== by 0x12B6FD: remoteDispatchDomainCreate (remote_dispatch.h:4116)
==15724== by 0x12B61A: remoteDispatchDomainCreateHelper (remote_dispatch.h:4092)
The problem is, in virStorageTranslateDiskSourcePool disk
def->src->auth is freed, but the pointer is not set to NULL. So
later, when qemuProcessStop starts to free the domain definition,
virStorageAuthDefFree() tries to free the memory again, instead
of jumping out immediately.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit cf0568b0af moved a bunch of functions from virNetDev
to the more specific virNetDevIP; however, not all of the
existing uses were moved properly, causing build failures on
FreeBSD.
Complete the transition to the new names and drop the
obsolete declarations from the header file while at it.
Not including the header causes
util/virnetdevip.c:520:5: error:
unknown type name 'virCommandPtr'; did you mean 'virCondPtr'?
virCommandPtr cmd = NULL;
^~~~~~~~~~~~~
and plenty more similar failures when compiling on FreeBSD.
This is place as a sub-element of <source>, where other aspects of the
host-side connection to the network device are located (network or
bridge name, udp listen port, etc). It's a bit odd that the interface
we're configuring with this info is itself named in <target dev='x'/>,
but that ship sailed long ago:
<interface type='ethernet'>
<mac address='00:16:3e:0f:ef:8a'/>
<source>
<ip address='192.168.122.12' family='ipv4'
prefix='24' peer='192.168.122.1'/>
<ip address='192.168.122.13' family='ipv4' prefix='24'/>
<route family='ipv4' address='0.0.0.0'
gateway='192.168.122.1'/>
<route family='ipv4' address='192.168.124.0' prefix='24'
gateway='192.168.124.1'/>
</source>
</interface>
In practice, this will likely only be useful for type='ethernet', so
its presence in any other type of interface is currently forbidden in
the generic device Validate function (but it's been put into the
general population of virDomainNetDef rather than the
ethernet-specific union member so that 1) we can more easily add the
capability to other types, and 2) we can retain the info when set to
an invalid interface type all the way through to validation and report
a proper error, rather than just ignoring it (which is currently what
happens for many other type-specific settings).
(NB: The already-existing configuration of IP info for the guest-side
of interfaces is in subelements directly under <interface>, and the
name of the guest-side interface (when configurable) is in <guest
dev='x'/>).
The peer attribute is used to set the property of the same name in the
interface IP info:
<interface type='ethernet'>
...
<ip family='ipv4' address='192.168.122.5'
prefix='32' peer='192.168.122.6'/>
...
</interface>
Note that this element is used to set the IP information on the
*guest* side interface, not the host side interface - that will be
supported in an upcoming patch.
(This is an updated *re*-commit of commit 690969af, which was
subsequently reverted in commit 1d14b13f).
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Laine Stump <laine@laine.org>
This patch takes the code out of
lxcContainerRenameAndEnableInterfaces() that adds all IP addresses and
IP routes to the interface, and puts it into a utility function
virNetDevIPInfoAddToDev() in virnetdevip.c so that it can be used by
anyone.
One small change in functionality -
lxcContainerRenameAndEnableInterfaces() previously would add all IP
addresses to the interface while it was still offline, then set the
interface online, and then add the routes. Because I don't want the
utility function to set the interface online, I've moved this up so
the interface is first set online, then IP addresses and routes are
added. This is the same order that the network service from
initscripts (in ifup-ether) does it, so it shouldn't pose any problem
(and hasn't, in the tests that I've run).
It makes more sense to have the logging at the lower level so other
callers can share the goodness.
While removing so much stuff from / touching so many lines in
lxcContainerRenameAndEnableInterfaces() (which used to have this
debug/error logging), label names were changed and it was updated to
use the now-more-common method of initializing ret to -1 (failure),
then setting to 0 right before the cleanup label.
virDomainNetIPInfoParseXML() and virDomainNetIPInfoFormat() are no
longer "unused", so we can now remove the "ATTRIBUTE_UNUSED" from
their definitions, since virDomainNetIPInfoFormat() is now the only
caller of virDomainNetIPsFormat() and virDomainNetRoutesFormat(),
those two functions can simply be subsumed into
virDomainNetIPInfoFormat().
libvirt's qemu driver doesn't have direct access to the config on the
guest side of a network interface, and currently doesn't have any
method in place to even inform the guest of the desired config. In the
future, an unenforceable attempt to set the guest-side IP info could
be made by adding a static host entry to the appropriate dnsmasq
configuration (or changing the default dhcp client address on the qemu
commandline for type='user' interfaces), or enhancing the guest agent
to allow setting an IP address, but for now it can't have any effect,
and we don't want to give the illusion that it does.
To prevent the "disappearance" of any existing configs with ip
address/route info (due to parser failure), this check is added in the
newly implemented qemuDomainDeviceDefValidate(), which is only called
when a domain is defined or started, *not* when it is reread from disk
at libvirtd startup.
a.k.a. <hostdev mode='capabilities' type='net'>.
This replaces the existing nips, ips, nroutes, and routes with a
single virNetDevIPInfo, and simplifies the code by calling that
object's parse/format/clear functions instead of open coding.
There are currently two places in the domain where this combination is
used, and there is about to be another. This patch puts them together
for brevity and uniformity.
As with the newly-renamed virNetDevIPAddr and virNetDevIPRoute
objects, the new virNetDevIPInfo object will need to be accessed by a
utility function that calls low level Netlink functions (so we don't
want it to be in the conf directory) and will be called from multiple
hypervisor drivers (so it can't be in any hypervisor directory); the
most appropriate place is thus once again the util directory.
The parse and format functions are in conf/domain_conf.c because only
the domain XML (i.e. *not* the network XML) has this exact combination
of IP addresses plus routes. Note that virDomainNetIPInfoFormat() will
end up being the only caller to virDomainNetRoutesFormat() and
virDomainNetIPsFormat(), so it will just subsume those functions in a
later patch, but we can't do that until they are no longer called.
(It would have been nice to include the interface name within the
virNetDevIPInfo object (with a slight name change), but that can't
be done cleanly, because in each case the interface name is provided
in a different place in the XML relative to the routes and IP
addresses, so putting it in this object would actually make the code
more confused rather than simpler).
These functions all need to be called from a utility function that
must be located in the util directory, so we move them all into
util/virnetdevip.[ch] now that it exists.
Function and struct names were appropriately changed for the new
location, but all code is unchanged aside from motion and renaming.
This patch splits virnetdev.[ch] into multiple files, with the new
virnetdevip.[ch] containing all the functions related to setting and
retrieving IP-related info for a device (both addresses and routes).
When support for <interface type='ethernet'> was added in commit
9a4b705f back in 2010, it erroneously looked at <source dev='blah'/>
for a user-specified guest-side interface name. This was never
documented though. (that attribute already existed at the time in the
data.ethernet union member of virDomainNetDef, but apparently had no
practical use - it was only used as a storage place for a NetDef's
bridge name during qemuDomainXMLToNative(), but even then that was
never used for anything).
When support for similar guest-side device naming was added to the lxc
driver several years later, it was put in a new subelement <guest
dev='blah'/>.
In the intervening years, since there was no validation that
ethernet.dev was NULL in the other drivers that didn't actually use
it, innocent souls who were adding other features assuming they needed
to account for non-NULL ethernet.dev when really they didn't, so
little bits of the usual pointless cargo-cult code showed up.
This patch not only switches the openvz driver to use the documented
<guest dev='blah'/> notation for naming the guest-side device (just in
case anyone is still using the openvz driver), and logs an error if
anyone tries to set <source dev='blah'/> for a type='ethernet'
interface, it also removes the cargo-cult uses of ethernet.dev and
<source dev='blah'/>, and eliminates if from the RNG and from
virDomainNetDef.
NB: I decided on this course of action after mentioning the
inconsistency here:
https://www.redhat.com/archives/libvir-list/2016-May/msg02038.html
and getting encouragement do eliminate it in a later IRC discussion
with danpb.
in qemuConnectDomainXMLToNative. This function was only accounting for
about 1/10 of all the allocated items in the NetDef prior to memseting
it to all 0's. On top of that, it was going to great pains to learn
the name of the bridge device, but then never doing anything useful
with it (just putting it into data.ethernet.dev, which is *never* used
when building a qemu commandline). (I think this again all started off
as code with good intentions, but it was never completed, and instead
was just Frankensteinically cargo-culted into the odd mish mash we
have today).
The resulting code is much simpler, produces exactly the same output,
and doesn't leak memory.
This patch removes the expanded and duplicated code that all sprung
out of two well-intentioned-but-useless settings of
net->data.(bridge|ethernet).ipaddr.
qemu has never supported even a single IP address in the interface
config, much less a list of them. All of the instances of "clearing
out the IP addresses" that are now in this function originated with
commit d8dbd6 "Basic domain XML conversions for Xen/QEMU drivers" in
May 2009, but even then the single "ipaddr" in the struct for
type='ethernet' and type='bridge' wasn't used in the qemu driver (only
in xen and openvz). Since then anyone who added a new interface type
also tacked on another unnecessary clearing of ipaddr, and when it was
made into a list of IPs (so far supported only by the LXC driver) this
simple setting was turned into a loop (well, multiple loops) to clear
them all.
Commit c9a641 (first appearred in 1.2.12) added support for setting
the guest-side IP address of veth devices in lxc domains.
Unfortunately, it hardcoded the assumption that the proper prefix for
any IP address with no explicit prefix in the config should be "24";
that is only correct for class C IPv4 addresses, but not for any other
IPv4 address, nor for any IPv6 address.
The good news is that there is already a function in libvirt that will
determine the proper default prefix for any IP address. This patch
replaces the use of the ill-fated VIR_SOCKET_ADDR_DEFAULT_PREFIX with
calls to virSocketAddrGetIPPrefix().
lxcContainerRenameAndEnableInterfaces() isn't making a copy of the
interface's ifname_guest (into newname), it's just copying the pointer
to it. This means that when it later calls VIR_FREE(newname), it's
actually freeing up (and fortunately NULLing out, so at least we don't
try to access free'd memory) netDef->ifname_guest.
There are times when we don't have a netmask pointer to give to
virSocketAddrGetIPPrefix() (e.g. the IP addresses in domain interfaces
only have a prefix, no netmask), but it would have caused a segv if we
called it with NULL instead of a pointer to a netmask. This patch
qualifies the code that would use the netmask or address pointers to
check for NULL first.
Rearrange this function to be better organized and more correct:
* the error codes were changed from the incorrect INVALID_ARG to
XML_ERROR
* prefix still isn't required, but if present it must be valid or an
error will be logged.
* don't emit a debug log just because prefix is missing - this
is valid.
* group everything related to setting prefix in one place rather than
scattered through the function.
I'm tired of mistyping this all the time, so let's do it the same all
the time (similar to how we changed all "Pci" to "PCI" awhile back).
(NB: I've left alone some things in the esx and vbox drivers because
I'm unable to compile them and they weren't obviously *not* a part of
some API. I also didn't change a couple of variables named,
e.g. "somethingIptables", because they were derived from the name of
the "iptables" command)
These had been declared in conf/device_conf.h, but then used in
util/virnetdev.c, meaning that we had to #include conf/device_conf.h
in virnetdev.c (which we have for a long time said shouldn't be done.
This caused a bigger problem when I tried to #include util/virnetdev.h
in a file in src/conf (which is allowed) - for some reason the
"device_conf.h: File not found" error.
The solution is to move the data types and functions used in util
sources from conf to util. Some names were adjusted during the move
("virInterface" --> "virNetDevIf", and "VIR_INTERFACE" -->
"VIR_NETDEV_IF")
virNetDevLinkDump should have been in virnetlink.c, but that file
didn't exist yet when the function was created. It didn't really
matter until now - I found that having virnetlink.h included by
virnetdev.h caused build problems when trying to #include virnetdev.h
in a .c file in src/conf (due to missing directory in -I). Rather than
fix that to further institutionalize the incorrect placement of this
one function, this patch moves the function.
This patch enables admin socket creation in daemon's code, bumps the library
version in libvirt_admin_public.syms, and performs all necessary modifications
to our makefiles so that admin API can finally be included in the tarball,
and eventually become part of an rpm package (a patch later in this series).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
We need this because apply graphics functions is used on
update too. Also in case of NULL address resolve it to default
instead of error.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Move graphic device config to post parse. This way we
detect error on early stage and leverage checking on detach too.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
First we need to always set value to vz sdk parameter so
we can leverage setting code for device updates. This patch
resolves tristate default to off implicitly. This is easier
then extract default value from vz sdk itself. First current
default is off too, second this approach is already taken
for 'net->linkstate'.
Second dump this option in domain xml.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Current code that pass gateways to vz sdk is not suitable for
updates. If update has no gateways while we had them before
we need to pass "" for vz sdk gateways to reset old value.
The code definitely deserves its own function.
Drop checks that skip setting gateways if network address
is not set. Such a configuration is possible in vz sdk.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>