If a migration of a domain which is already defined on the destination
host failed early (before we tried to start QEMU), we would forget to
remove the incoming transient definition. Later on when someone starts
the domain on the destination host, we will use the stale incoming
definition and the persistent definition will just be ignored.
https://bugzilla.redhat.com/show_bug.cgi?id=1368774
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The code for replacing domain's transient definition with the persistent
one is repeated in several places and we'll need to add one more. Let's
make a nice helper for it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Similarly to vcpu hotplug the emulator thread cgroup numa mapping needs
to be relaxed while hot-adding vcpus so that the threads can allocate
data in the DMA zone.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1370084
When hot-adding vcpus qemu needs to allocate some structures in the DMA
zone which may be outside of the numa pinning. Extract the code doing
this in a set of helpers so that it can be reused.
We already have the ability to turn off dumping of guest
RAM via the domain XML. This is not particularly useful
though, as it is under control of the management application.
What is needed is a way for the sysadmin to turn off guest
RAM defaults globally, regardless of whether the mgmt app
provides its own way to set this in the domain XML.
So this adds a 'dump_guest_core' option in /etc/libvirt/qemu.conf
which defaults to false. ie guest RAM will never be included in
the QEMU core dumps by default. This default is different from
historical practice, but is considered to be more suitable as
a default because
a) guest RAM can be huge and so inflicts a DOS on the host
I/O subsystem when dumping core for QEMU crashes
b) guest RAM can contain alot of sensitive data belonging
to the VM owner. This should not generally be copied
around inside QEMU core dumps submitted to vendors for
debugging
c) guest RAM contents are rarely useful in diagnosing
QEMU crashes
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the QEMU processes inherit their core dump rlimit
from libvirtd, which is really suboptimal. This change allows
their limit to be directly controlled from qemu.conf instead.
With current perf framework, this patch adds support and documentation
for more perf events, including cache misses, cache references, cpu cycles,
and instructions.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Since the domain lock is not held during preparation of an external XML
config, it is possible that the value can change resulting in unexpected
failures during ABI consistency checking for some save and migrate
operations.
This patch adds a new flag to skip the checking of the cur_balloon value
and then sets the destination value to the source value to ensure
subsequent checks without the skip flag will succeed.
This way it is protected from forges and is keeped up to date too.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
The 'multi' element in PCI address struct used as 'virTristateSwitch',
and its default value is 'VIR_TRISTATE_SWITCH_ABSENT'. Current PCI
process use 'false' to initialization 'multi', which is ambiguously
for assignment or comparison. This patch use '{0}' to initialize
the whole PCI address struct, which fix the 'multi' initialization
and makes code more simplify and explicitly.
Signed-off-by: Xian Han Yu <xhyubj@linux.vnet.ibm.com>
Setting vcpu count when cpu topology is specified may result into an
invalid configuration. Since the topology can't be modified, reject the
setting if it doesn't match the requested topology. This will allow
fixing the topology in case it was broken.
Partially fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1370066
ce43cca0e refactored the helper to prepare it for sparse topologies but
forgot to fix the iterator used to fill the structures. This would
result into a weirdly sparse populated array and possible out of bounds
access and crash once sparse vcpu topologies were allowed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1369988
When support for auto-creating tap devices was added to <interface
type='ethernet'> in commit 9c17d6, the code assumed that
virNetDevTapCreate() would honor the VIR_NETDEV_TAP__CREATE_IFUP flag
that is supported by virNetDevTapCreateInBridgePort(). That isn't the
case - the latter function performs several operations, and one of
them is setting the tap device online. But virNetDevTapCreate() *only*
creates the tap device, and relies on the caller to do everything
else, so qemuInterfaceEthernetConnect() needs to call
virNetDevSetOnline() after the device is successfully created.
The linkstate setting of an <interface> is only meant to change the
online status reported to the guest system by the emulated network
device driver in qemu, but when support for auto-creating tap devices
for <interface type='ethernet'> was added in commit 9717d6, a chunk of
code was also added to qemuDomainChangeNetLinkState() that sets the
online status of the tap device (i.e. the *host* side of the
interface) for type='ethernet'. This was never done for tap devices
used in type='bridge' or type='network' interfaces, nor was it done in
the past for tap devices created by external scripts for
type='ethernet', so we shouldn't be doing it now.
This patch removes the bit of code in qemuDomainChangeNetLinkState()
that modifies online status of the tap device.
The call to virNetDevIPInfoAddToDev() that sets up tap device IP
addresses and routes was somehow incorrectly placed in
qemuInterfaceStopDevice() instead of qemuInterfaceStartDevice() in
commit fe8567f6. This fixes that error by moving the call to
virNetDevIPInfoAddToDev() to qemuInterfaceStartDevice().
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
This patch removes the old vcpu unplug code completely and replaces it
with the new code using device_del. The old hotplug code basically never
worked with any recent qemu and thus is useless.
As the new code is using device_del all the implications of using it
are present. Contrary to the device deletion code, the vcpu deletion
code fails if the unplug request is not executed in time.
To allow unplugging the vcpus, hotplugging of vcpus on platforms which
require to plug multiple logical vcpus at once or plugging them in an
arbitrary order it's necessary to use the new device_add interface for
vcpu hotplug.
This patch adds support for the device_add interface using the old
setvcpus API by implementing an algorithm to select the appropriate
entities to plug in.
Add support for using the new approach to hotplug vcpus using device_add
during startup of qemu to allow sparse vcpu topologies.
There are a few limitations imposed by qemu on the supported
configuration:
- vcpu0 needs to be always present and not hotpluggable
- non-hotpluggable cpus need to be ordered at the beginning
- order of the vcpus needs to be unique for every single hotpluggable
entity
Qemu also doesn't really allow to query the information necessary to
start a VM with the vcpus directly on the commandline. Fortunately they
can be hotplugged during startup.
The new hotplug code uses the following approach:
- non-hotpluggable vcpus are counted and put to the -smp option
- qemu is started
- qemu is queried for the necessary information
- the configuration is checked
- the hotpluggable vcpus are hotplugged
- vcpus are started
This patch adds a lot of checking code and enables the support to
specify the individual vcpu element with qemu.
The vcpu order information is extracted only for hotpluggable entities,
while vcpu definitions belonging to the same hotpluggable entity need
to all share the order information.
We also can't overwrite it right away in the vcpu info detection code as
the order is necessary to add the hotpluggable vcpus enabled on boot in
the correct order.
The helper will store the order information in places where we are
certain that it's necessary.
Introduce a new migration cookie flag that will be used for any
configurations that are not compatible with libvirt that would not
support the specific vcpu hotplug approach. This will make sure that old
libvirt does not fail to reproduce the configuration correctly.
Individual vCPU hotplug requires us to track the state of any vCPU. To
allow this add the following XML:
<domain>
...
<vcpu current='2'>3</vcpu>
<vcpus>
<vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
<vcpu id='1' enabled='yes' hotpluggable='yes' order='2'/>
<vcpu id='1' enabled='no' hotpluggable='yes'/>
</vcpus>
...
The 'enabled' attribute allows to control the state of the vcpu.
'hotpluggable' controls whether given vcpu can be hotplugged and 'order'
allows to specify the order to add the vcpus.
Similarly to devices the guest may allow unplug of the VCPU if libvirt
is down. To avoid problems, refresh the vcpu state on reconnect. Don't
mess with the vcpu state otherwise.
Now that the monitor code gathers all the data we can extract it to
relevant places either in the definition or the private data of a vcpu.
As only thread id is broken for TCG guests we may extract the rest of
the data and just skip assigning of the thread id. In case where qemu
would allow cpu hotplug in TCG mode this will make it work eventually.
For hotplug purposes it's necessary to retrieve data using
query-hotpluggable-cpus while the old query-cpus API report thread IDs
and order of hotplug.
This patch adds code that merges the data using a rather non-trivial
algorithm and fills the data to the qemuMonitorCPUInfo structure for
adding to appropriate place in the domain definition.
Add support for retrieving information regarding hotpluggable cpu units
supported by qemu. Data returned by the command carries information
needed to figure out the granularity of hotplug, the necessary cpu type
name and the topology information.
Note that qemu doesn't specify any particular order of the entries thus
it's necessary sort them by socket_id, core_id and thread_id to the
order libvirt expects.
To allow matching up the data returned by query-cpus to entries in the
query-hotpluggable-cpus reply for CPU hotplug it's necessary to extract
the QOM path as it's the only link between the two.
QEMU reports whether 'query-hotpluggable-cpus' is supported for a given
machine type. Extract and cache the information using the capability
cache.
When copying the capabilities for a new start of qemu, mask out the
presence of QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS if the machine type
doesn't support hotpluggable cpus.
As of qemu commit:
commit a32ef3bfc12c8d0588f43f74dcc5280885bbdb30
Author: Thomas Huth <thuth@redhat.com>
Date: Wed Jul 22 15:59:50 2015 +0200
vl: Add another sanity check to smp_parse() function
v2.4.0-952-ga32ef3b
configuration where the maximum CPU count doesn't match the topology is
rejected. Prior to that only configurations where the topology would
contain more cpus than the maximum count would be rejected.
Use QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS as a relevant recent enough
witness to avoid breaking old configs.
Prepare to extract more data by returning an array of structs rather than
just an array of thread ids. Additionally report fatal errors separately
from qemu not being able to produce data.
Check whether the disable-legacy property is present on the following
devices:
virtio-balloon-pci
virtio-blk-pci
virtio-scsi-pci
virtio-serial-pci
virtio-9p-pci
virtio-net-pci
virtio-rng-pci
virtio-gpu-pci
virtio-input-host-pci
virtio-keyboard-pci
virtio-mouse-pci
virtio-tablet-pci
Assuming that if QEMU knows other virtio devices where this property
is applicable, it will have at least one of these devices.
Added in QEMU by:
commit e266d421490e0ae83044bbebb209b2d3650c0ba6
virtio-pci: add flags to enable/disable legacy/modern
https://bugzilla.redhat.com/show_bug.cgi?id=1182074
Since libvirt still uses a legacy qemu arg format to add a disk, the
manner in which the 'password-secret' argument is passed to qemu needs
to change to prepend a 'file.' If in the future, usage of the more
modern disk format, then the prepended 'file.' can be removed.
Fix based on Jim Fehlig <jfehlig@suse.com> posting and subsequent
upstream list followups, see:
http://www.redhat.com/archives/libvir-list/2016-August/msg00777.html
for details. Introduced by commit id 'a1344f70'.
The code that setups listen types may change a listen type from address to
socket based on configuration from qemu.conf. This needs to be done before we
reserve/allocate ports that won't be used.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1364843
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Ports are valid only for listen types 'address' and 'network', other listen
types doesn't use them so we should not try to reserve any ports.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The first argument should be const char ** instead of
char **, because this is a search function and as such it
doesn't, and shouldn't, alter the haystack in any way.
This change means we no longer have to cast arrays of
immutable strings to arrays of mutable strings; we still
have to do the opposite, though, but that's reasonable.
All other modes of qemuDomainSetVcpusFlags have helpers so finish the
work by splitting the regular code into a new function.
This patch also touches up the coding (spacing) style.
Mention whether it was the live or persistent definition which caused an
error reported and explicitly error out in case when attempting to set
maximum vcpu count for a live domain.
Setting heads to 0 in case that *max_outputs* is not supported while building
command line doesn't have any real effect. It only removes *heads* attribute
from live XML, but after restarting libvirt the default value is restored.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Since we now pick the default USB controller model when parsing
the guest XML, we can get rid of some duplicated code so that
the default model selection happens in one place only.
Add some comments as well.
Now that the default USB controller model is explicit rather
than implicit for i440fx machines, we have to tweak the
conditions for dropping it in order to keep migration towards
libvirt <= 0.9.4 working.
When the user doesn't specify any model for a USB controller,
we use an architecture-dependent default, but we don't reflect
it in the guest XML.
Pick the default USB controller model when parsing the guest
XML instead of when creating the QEMU command line, so that
our choice is saved back to disk.
Since a9331394 (first release v2.1.0), specifying a manual
security_driver setting in qemu.conf causes the daemon to fail to
start, erroring with 'Duplicate security driver X'.
The duplicate checking was incorrectly comparing every entry
against itself, guaranteeing a false positive.
https://bugzilla.redhat.com/show_bug.cgi?id=1365607
Doing a load, copy, format cycle on all QEMU capabilities XML files
should make sure we don't forget to update virQEMUCapsNewCopy when
adding new elements to QEMU capabilities.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In qemu, enabling this feature boils down to adding the following
onto the command line:
-global driver=cfi.pflash01,property=secure,value=on
However, there are some constraints resulting from the
implementation. For instance, System Management Mode (SMM) is
required to be enabled, the machine type must be q35-2.4 or
later, and the guest should be x86_64. While technically it is
possible to have 32 bit guests with secure boot, some non-trivial
CPU flags tuning is required (for instance lm and nx flags must
be prohibited). Given complexity of our CPU driver, this is not
trivial. Therefore I've chosen to forbid 32 bit guests for now.
If there's ever need, we can refine the check later.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since its release of 2.4.0 qemu is able to enable System
Management Module in the firmware, or disable it. We should
expose this capability in the XML. Unfortunately, there's no good
way to determine whether the binary we are talking to supports
it. I mean, if qemu's run with real machine type, the smm
attribute can be seen in 'qom-list /machine' output. But it's not
there when qemu's run with -M none. Therefore we're stuck with
version based check.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We use 'goto cleanup' for a reason. If a function can exit at
many places but doesn't follow the pattern, it has to copy the
free code in multiple places.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Call the vcpu thread info validation separately to decrease complexity
of returned values by qemuDomainRefreshVcpuInfo.
This function now returns 0 on success and -1 on error. Certain
failures of qemu to report data are still considered as success. Any
error reported now is fatal.
Validate the presence of the thread id according to state of the vCPU
rather than just checking the vCPU count. Additionally put the new
validation code into a separate function so that the information
retrieval can be split from the validation.
https://bugzilla.redhat.com/show_bug.cgi?id=1356937
Add support for IOThread quota/bandwidth and period parameters for non
session mode. If in session mode, then error out. Uses all the same
places where {vcpu|emulator|global}_{period|quota} are adjusted and
adds the iothread values.
https://bugzilla.redhat.com/show_bug.cgi?id=1289391
Rather than pass the whole drive string (which contained the alias),
pass only the alias for the qemuMonitorDriveDel call in the error
path when adding a host device in the monitor fails.
Partial fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=1336225
Similar to the other disk types, add the qemuMonitorDriveDel in the failure
to add/hotplug a USB.
Added a couple of other formatting changes just to have a less cluttered look
Move QEMU_DRIVE_HOST_PREFIX into the qemu_alias.c to dissuade future
callers from using it. Create qemuAliasDiskDriveSkipPrefix in order
to handle the current consumers that desire to check if an alias has
the drive- prefix and "get beyond it" in order to get the disk alias.
Since we already have a function that will generate the drivestr from
the alias, let's use it and remove the qemuDeviceDriveHostAlias.
Move the QEMU_DRIVE_HOST_PREFIX definition into qemu_alias.h
Also alter qemuAliasFromDisk to use the QEMU_DRIVE_HOST_PREFIX instead
of "drive-%s".
Rather than pass the disks[i]->info.alias to qemuMonitorSetDrivePassphrase
and then generate the "drive-%s" alias from that, let's use qemuAliasFromDisk
prior to the call to generate the drive alias and then pass that along
thus removing the need to generate the alias from the monitor code.
As commit id 'e2b86f580' notes, when mode=agent possibly setting the
fake reboot flag to true wouldn't be necessary; however, it doesn't
"force" the issue by just ensuring the fake reboot is false, so this
patch adds the explicit setting for the reboot path.
More investigation and details can be found in commit id '8be502fd'
as well as in the archives at:
https://www.redhat.com/archives/libvir-list/2015-April/msg00715.html
Conditional setting of the fake reboot flag should only happen for
the acpi mode shutdown path; however, for the agent mode shutdown,
the fake reboot should be cleared. This patch will essentially revert
commit id '8be502fd', but adds an explicit setting of the flag to false
when using mode=agent while also only conditionally setting the reboot
flag if the guest went away. This also avoids an issue where a shutdown
with reboot semantics is done from agent mode which sets the reboot
flag followed by a shutdown from within the guest which would result
in a reboot due to the fake reboot flag being set. The change will
also properly handle the cases described in the following archive post:
https://www.redhat.com/archives/libvir-list/2015-April/msg00715.html
To sync with virDomainControllerModelUSB, we add two models
in qemuControllerModelUSB 'qusb1' and 'qusb2', but those
models are not supported in qemu driver. So add check in
device post parse to report errors if 'qusb1' and 'qusb2'
are specified.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
According to libxl implementation, it supports pvusb
controller of version 1.1 and version 2.0, and it
supports two types of backend, 'pvusb' (dom0 backend)
and 'qusb' (qemu backend). But currently pvusb backend
is not checked in yet.
To match libxl support, extend usb controller schema
to support two more models: qusb1 (qusb, version 1.1)
and 'qusb2' (qusb version 2.0).
Signed-off-by: Chunyan Liu <cyliu@suse.com>
When reset was called from a domain that crashed we didn't change the
crashed state into a paused one which could confuse users.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1269575
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Until now we simply errored out when the translation from pool+volume
failed. However, we should instead check whether that disk is needed or
not since there is an option for that.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1168453
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is an error reset following the function and check for
startupPolicy before that. Let's reflect those things inside that
function so that future code doesn't have to be that complex.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The panic devices with models s390 and pseries are autogenerated.
For backwards compatibility reasons the devices are to be removed
when migrating.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Ever since virDomainCreateWithFlags() was introduced by de3aadaa
[drivers: add virDomainCreateWithFlags if virDomainCreate exists], the
domain ID retrieved with virDomainGetID() was incorrect for several
drivers after virDomainCreateWithFlags() was called. The API consumer
had to look up the domain anew to retrieve the correct ID.
For the ESX driver, this was fixed in 6139b274 [esx: Update ID after
starting a domain]. For the openvz driver, it was fixed in fd81a097
[openvzDomainCreateWithFlags: set domain id to the correct value]. The
test driver, the OpenNebula driver (removed in the meantime) and the
vbox driver were already updating the domain ID correctly in
domainCreate().
Copy over the ID in qemuDomainCreateWithFlags() to fix this for the qemu
driver, too.
Fixes: de3aadaa ("drivers: add virDomainCreateWithFlags if virDomainCreate exists")
Reported-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Tested-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Consider the following XML snippet:
<memory model=''>
<target>
<size unit='KiB'>523264</size>
<node>0</node>
</target>
</memory>
Whats wrong you ask? The @model attribute. This should result in
an error thrown into users faces during virDomainDefine phase.
Except it doesn't. The XML validation catches this error, but if
users chose to ignore that, they will end up with invalid XML.
Well, they won't be able to start the machine - that's when error
is produced currently. But it would be nice if we could catch the
error like this earlier.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>