The code from qemuSetupCgroupCpusetCpus() and virLXCCgroupSetupCpusetTune()
can be centralized in a new helper called virCgroupSetupCpusetCpus().
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virLXCCgroupSetupMemTune() and qemuSetupMemoryCgroup() shares
duplicated code that can be put in a new helper to avoid
code repetition.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There is duplicated code between virt drivers that needs to
be moved to avoid code repetition. In the case of duplicated
code between lxc_cgroup.c and qemu_cgroup.c a common place
would be utils/vircgroup.c. The problem is that this would
introduce /conf related definitions that shouldn't be imported
to vircgroup.c, which is supposed to be a place for utilitary
cgroups functions only. And syntax-check would forbid it anyway
due to cross-directory includes being used.
An alternative would be to overload domain_conf.c, which already
contains all the definitions required. But that file is already
crowded with XML handling code and we wouldn't do any favors to
it by putting more utilitary, non-XML parsing/formatting code
there.
In [1], Cole suggested a 'domain_cgroup' file to host common code
between lxc_cgroup and qemu_cgroup, and Daniel suggested a
'src/hypervisor' dir to host these type of files. This patch
introduces src/hypervisor/domain_cgroup.c and, to get started,
introduces a new virDomainCgroupSetupBlkio() function to host shared
code between virLXCCgroupSetupBlkioTune() and qemuSetupBlkioCgroup().
[1] https://www.redhat.com/archives/libvir-list/2019-December/msg00817.html
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There are code repetition of set() and get() blkio device
parameters across lxc and qemu files. Use the new vircgroup
helpers to trim the repetition a bit.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This setting can be updating very easily on an already active
interface by just changing it in sysfs. If the bridge used for
connection is also changed, there is no need to separately update it,
because the new setting isf done as a part of connecting to the bridge
anyway.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This patch pushes the isolatedPort setting from the <interface> down
all the way to the callers of virNetDevBridgeAddPort(), and sets
BR_ISOLATED on the port (using virNetDevBridgePortSetIsolated()) after
the port has been successfully added to the bridge.
Signed-off-by: Laine Stump <laine@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Not only was the original error code destroyed in the case of
encountering an error during recovery from a failed attach to the
bridge (and then *that* error was destroyed by logging a *second*
error about the failure to recover - virNetDevBridgeAddPort() already
logs an error, so the one about failing to recover was redundant), but
if the recovery was successful, the function would then return success
to the caller even though it had failed.
Fixes: 2711ac8716
(overwritten errors were introduced along with this functionality)
Fixes: 6bde0a1a37
(the wrong return value was introduced by a refactor)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Firstly, the check for disk I/O error can be moved into 'if
(!offline)' section a few lines below.
Secondly, checks for vmstate and slirp should be moved under the
same section because they reflect live state of a domain. For
offline migration no QEMU is involved and thus these restrictions
are not valid.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In two places where virPidFileForceCleanupPath() is called, we
try to unlink() the pidfile again. This is needless because
virPidFileForceCleanupPath() has done just that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuMonitorGetIOThreads returns a NULL-terminated list even when 0
iothreads are present. The caller didn't perform cleanup if there were 0
iothreads leaking the array.
https://bugzilla.redhat.com/show_bug.cgi?id=1804548
Fixes: d1eac92784
Reported-by: Jing Yan <jiyan@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
QoS 'floor' setting is documented to be only supported for interfaces of
type 'network'. Fail with an error message on attempt to set 'floor' on
an interface of any other type.
Signed-off-by: Pavel Mores <pmores@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Implement support for the slice of type 'storage' which allows to set
the offset and size which modifies where qemu should look for the start
of the format container inside the image.
Since slicing is done using the 'raw' driver we need to add another
layer into the blockdev tree if there's any non-raw image format driver
used to access the data.
This patch adds the blockdev integration and setup of the image data so
that we can use the slices for any backing image.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When creating overlay images e.g. for snapshots or when merging
snapshots we often specify the backing store string to use. Make the
formatter aware of backing chain entries which have a <slice>
configured so that we record it properly. Otherwise such images
would not work without the XML (when detecting the backing chain).
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The storage slice will require a specific node name in cases when the
image format is not raw. Store and format them in the status XML.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Specifically creating such images via libvirt during blockjobs would
be much more hassle than it's worth. Just forbid them for now.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We support explicit storage slices only when using blockdev. Storage
slices expressed via the backing store string are left to qemu to
open correctly.
Reject storage slices configured via the XML for non-blockdev usage.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If we have a 'format' type slice for a raw driver we can directly format
the values.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If a domain has a NVMe disk it already has the access configured.
Trying to configure it again on a commit or some other operation
is wrong and condemned to failure.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Its behavior is controlled by a KVM-specific CPU feature.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Its use is limited to certain guest types, and it only supports
a subset of all possible tick policies.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This new timer model will be used to control the behavior of the
virtual timer for KVM ARM/virt guests.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We will use this capability to detect whether the QEMU binary
supports the kvm-no-adjvtime CPU feature.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Make sure we are taking all possible virDomainTimerNameType values
into account. This will make upcoming changes easier.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Libvirt switched to using a UNIX socket for monitors in
2009 for version 0.7.0. It seems unlikely that there is
a running QEMU process that hasn't been restarted for
11 years while also taking a libvirt upgrade. Therefore
we can drop support for opening a PTY for the QEMU
monitor.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
My original implementation was completely broken because it attempted to
use object-add/del instead of blockdev-add/del.
https://bugzilla.redhat.com/show_bug.cgi?id=1798366
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Modify qemuMonitorBlockdevAdd so that it takes a double pointer for the
@props argument so that it's cleared inside the call. This allows
writing cleaner callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Use automatic variable freeing and get rid of the cleanup section.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Use automatic variable freeing and get rid of the cleanup section.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Few switch cases returned failure but didn't report an error. For a
situation when the backingStore type='volume' was not translated the
following error would occur:
$ virsh start VM
error: Failed to start domain VM
error: An error occurred, but the cause is unknown
After this patch:
$ virsh start VM
error: Failed to start domain VM
error: internal error: storage source pool 'tmp' volume 'pull3.qcow2' is not translated
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We call APIs that reset the error in the rollback code.
Preserve the error from the original call that failed.
This turns the boringly cryptic:
error: Unable to set interface parameters
error: An error occurred, but the cause is unknown
to the unexpectedly anarchist:
error: internal error: Child process (/usr/sbin/tc filter add
dev vnet1 parent ffff: protocol all u32 match u32 0 0 police
rate 4294968kbps burst 4294968kb mtu 64kb drop flowid :1)
unexpected exit status 1: Illegal "rate"
Illegal "police"
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: f02e21cb33https://bugzilla.redhat.com/show_bug.cgi?id=1800505
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Otherwise an attempt to set an invalid value:
virsh domiftune rhel8.2 vnet0 --outbound 4294968
on an interface with no bandwidth set crashes.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: f02e21cb33https://bugzilla.redhat.com/show_bug.cgi?id=1800505
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This deletes all trace of gnulib from libvirt. We still
have the keycodemapdb submodule to deal with. The simple
solution taken was to update it when running autogen.sh.
Previously gnulib could auto-trigger refresh when running
'make' too. We could figure out a solution for this, but
with the pending meson rewrite it isn't worth worrying
about, given how infrequently keycodemapdb changes.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Pvpanic device supports bit 1 as crashloaded event, it means that
guest actually panicked and run kexec to handle error by guest side.
Handle crashloaded as a lifecyle event in libvirt.
Test case:
Guest side:
before testing, we need make sure kdump is enabled,
1, build new pvpanic driver (with commit from upstream
e0b9a42735f2672ca2764cfbea6e55a81098d5ba
191941692a3d1b6a9614502b279be062926b70f5)
2, insmod new kmod
3, enable crash_kexec_post_notifiers,
# echo 1 > /sys/module/kernel/parameters/crash_kexec_post_notifiers
4, trigger kernel panic
# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger
Host side:
1, build new qemu with pvpanic patches (with commit from upstream
600d7b47e8f5085919fd1d1157f25950ea8dbc11
7dc58deea79a343ac3adc5cadb97215086054c86)
2, build libvirt with this patch
3, handle lifecycle event and trigger guest side panic
# virsh event stretch --event lifecycle
event 'lifecycle' for domain stretch: Crashed Crashloaded
events received: 1
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Starting a KVM domain on s390 with old machine type (such as
s390-ccw-virtio-2.5) and without any guest CPU model configured fails
with
CPU models are not available: KVM doesn't support CPU models
QEMU error. This is cause by libvirt using host-model CPU as the default
CPU based on QEMU reporting "host" CPU model as being the default one
(see commit v5.9.0-402-g24d8202294: qemu: Use host-model CPU on s390 by
default). However, even though both QEMU and KVM support CPU models on
s390 and QEMU can give us the host-model CPU, we can't use it with old
machine types which only support -cpu host.
https://bugzilla.redhat.com/show_bug.cgi?id=1795651
Reported-by: Christian Ehrhardt <paelzer@gmail.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The usability of a specific CPU mode may depend on machine type, let's
prepare for this by passing it to virQEMUCapsIsCPUModeSupported.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Extend QEMU with tpm-spapr support. Assign a device address to the
vTPM device model.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Extend the QEMU capabilties with tpm-spapr support.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
This patch adds support for the tpm-spapr device model for ppc64. The XML for
this type of TPM looks as follows:
<tpm model='tpm-spapr'>
<backend type='emulator'/>
</tpm>
Extend the documentation.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Introduce VIR_DOMAIN_TPM_MODEL_DEFAULT as a default model which we use
in case the user does not provide a model in the device XML. It has
the TIS's previous value of '0'. In the post parsing function
we change this default value to 'TIS' to have the same model as before.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
All our supported Linux distros now have this header.
It has never existed on FreeBSD / macOS / Mingw.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This addreses portability to Windows and standardizes
error reporting. This fixes a number of places which
failed to set O_CLOEXEC or failed to report errors.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Most code now uses the virProcess / virCommand APIs, so
the need for sys/wait.h is quite limited. Removing this
include removes the dependency on GNULIB providing a
dummy sys/wait.h for Windows.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Use qemuBlockBitmapsHandleBlockcopy to calculate bitmaps to copy over
for a block-copy job.
We copy them when pivoting to the new image as at that point we are
certain that we don't dirty any bitmap unnecessarily.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add a function calculating which bitmaps to copy to the mirror during
a block-copy operation.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add a validator which checks that a bitmap spanning multiple backing
chain members doesn't look broken. The current rules are that no
intermediate birmaps are missing (unfortunately it's hard to know
whether the topmost or bottommost bitmap is missing) and none of the
components is inconsistent.
We can obviously improve it over time.
The validator is also tested against the existing bitmap data we have
for the backup merging test as well as some of the existing broken
bitmap synthetic test cases.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The flags may control important aspects of the block job which may
influence also the termination of the job. Store the 'flags' for all
the block job types.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add a variable which will store the contents of the 'flags' variable as
passed in by the individual block jobs. Since the flags may influence
behaviour of the jobs it's important to preserve them to the
finalization steps.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use the glib allocation function that never returns NULL and remove the
now dead-code checks from all callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Create a wrapper for qemuBlockGetNamedNodeData named
qemuBlockGetNamedNodeData. The purpose of the wrapper is to integrate
the monitor handling functionality and in the future possible
qemuCaps-based flags.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Allow qemu access to modify backing files in case when we want to delete
a checkpoint.
This patch adds tracking of which images need to be relabelled when
calculating the transaction, the code to relabel them and rollback.
To verify that stuff works we also output the list of images to relabel
into the test case output files in qemublocktest.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Allow deleting of checkpoints when snapshots were created along. The
code tracks and modifies the checkpoint list so that backups can still
be taken with such a backing chain. This unfortunately requires to
rename few bitmaps (by copying and deleting them) in some cases.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This requires stealing one cmd pointer before returning it.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Mark eligible declarations as g_autofree and remove
the corresponding VIR_FREE calls.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Always trim the full specified suffix.
All of the callers outside of tests were passing either
strlen or the actual length of the string.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now, that every use of virAtomic was replaced with its g_atomic
equivalent, let's remove the module.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The rewrite to use GLib's atomic ops functions changed the behavior
of virAtomicIntInc - before it returned the pre-increment value.
Most of the callers using its value were adjusted, but the one
in qemuDriverAllocateID was not. If libvirtd would reconnect to
a running domain during startup, the next started domain would get
the same ID:
$ virsh list
Id Name State
--------------------------
1 f28live running
1 f28live1 running
Use the g_atomic_add function directly (as recommended in viratomic.h)
and add 1 to the result.
This also restores the usual numbering from 1 instead of 0.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: 7b9645a7d1
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Starting on commit 1f43393283, qemuDomainFillDeviceIsolationGroup()
returns 0 in all circunstances. Let's turn it to 'void' make it
clearer that the function will not fail. This also spares a
check for < 0 return in qemu_hotplug.c. The
qemuDomainFillDeviceIsolationGroupIter() callback now returns
0 at all times - which is already happening anyway.
Refer to 1f43393283 commit message for more details on why
the function was changed to never return an error.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
qemuDomainChrDefDropDefaultPath() returns an int, but it's
always returning 0. Callers are checking for result < 0 to
run their cleanup code needlessly.
Turn the function to 'void' and adjust the callers.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Avoid some of the virObjectUnref() calls by using g_autoptr.
Aside from the 'cleanup' label in qemuDomainSetFakeReboot(),
all other now deprecated cleanup labels will be removed in
the next patch.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use g_autofree to remove VIR_FREE() calls used for cleanups.
Labels that became deprecated will be removed in a later
patch.
In qemuDomainSetupDisk(), the 'dst' variable is not used at
all and could be removed.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The 'caps' variable in qemuDomainObjPrivateXMLParseAutomaticPlacement()
is set to auto clean via g_autoptr(), but a 'virObjectUnref(caps)' is
being executed in the 'cleanup' label.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
With -blockdev we must look up via the nodename rather than the 'drive'
alias which is not present any more.
This fixes the pre-creation of storage volumes on migration with
non-shared storage.
https://bugzilla.redhat.com/show_bug.cgi?id=1793263
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Swithc to the helper which doesn't require checking of the return value.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The data is gathered only once so we can move the whole block which
fetches the data out of the loop and get rid of the logic which
prevents multiple calls.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Refactor the logic to skip the body of the function if there's nothing
to do.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There are two calls to virHashNew which check the return value. It's not
necessary any more as virHashNew always returns a valid pointer.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Aside from itinerant error (actually warning) messages due to an
unrecognized response from qemu, this isn't even necessary - the
migration proceeds successfully to completion anyway.
(I'm not sure where to see this status reported in the API though - do
we need to add an extra state, or recognition of a new event somewhere?)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Normally a PCI hostdev can't be migrated, so
qemuMigrationSrcIsAllowedHostdev() won't permit it. In the case of a a
hostdev network interface that has <teaming type='transient'/> set,
QEMU will automatically unplug the device prior to migration, and
re-plug a corresponding device on the destination. This patch modifies
qemuMigrationSrcIsAllowedHostdev() to allow domains with those devices
to be migrated.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The QEMU driver uses the <teaming type='persistent|transient'
persistent='blah'/> element to setup a "failover" pair of devices -
the persistent device must be a virtio emulated NIC, with the only
extra configuration being the addition of ",failover=on" to the device
commandline, and the transient device must be a hostdev NIC
(<interface type='hostdev'> or <interface type='network'> with a
network that is a pool of SRIOV VFs) where the extra configuration is
the addition of ",failover_pair_id=$aliasOfVirtio" to the device
commandline. These new options are supported in QEMU 4.2.0 and later.
Extra qemu-specific validation is added to ensure that the device
type/model is appropriate and that the qemu binary supports these
commandline options.
The result of this will be:
1) The virtio device presented to the guest will have an extra bit set
in its PCI capabilities indicating that it can be used as a failover
backup device. The virtio guest driver will need to be equipped to do
something with this information - this is included in the Linux
virtio-net driver in kernel 4.18 and above (and also backported to
some older distro kernels). Unfortunately there is no way for libvirt
to learn whether or not the guest driver supports failover - if it
doesn't then the extra PCI capability will be ignored and the guest OS
will just see two independent devices. (NB: the current virtio guest
driver also requires that the MAC addresses of the two NICs match in
order to pair them into a bond).
2) When a migration is requested, QEMu will automatically unplug the
transient/hostdev NIC from the guest on the source host before
starting migration, and automatically re-plug a similar device after
restarting the guest CPUs on the destination host. While the transient
NIC is unplugged, all network traffic will go through the
persistent/virtio device, but when the hostdev NIC is plugged in, it
will get all the traffic. This means that in normal circumstances the
guest gets the performance advantage of vfio-assigned "real hardware"
networking, but it can still be migrated with the only downside being
a performance penalty (due to using an emulated NIC) during the
migration.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Presence of the virtio-net-pci option called "failover" indicates
support in a qemu binary of a simplistic bonding of a virtio-net
device with another PCI device. This feature allows migration of
guests that have a network device assigned to a guest with VFIO, by
creating a network bond device in the guest consisting of the
VFIO-assigned device and a virtio-net-pci device, then temporarily
(and automatically) unplugging the VFIO net device prior to migration
(and hotplugging an equivalent device on the migration
destination). (The feature is called "failover" because the bond
device uses the vfio-pci netdev for normal guest networking, but
"fails over" to the virtio-net-pci netdev once the vfio-pci device is
unplugged for migration.)
Full functioning of the feature also requires support in the
virtio-net driver in the guest OS (since that is where the bond device
resides), but if the "failover" commandline option is present for the
virtio-net-pci device in qemu, at least the qemu part of the feature
is available, and libvirt can add the proper options to both the
virtio-net-pci and vfio-pci device commandlines to indicate qemu
should attempt doing the failover during migration.
This patch just adds the qemu capabilities flag "virtio-net.failover".
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
There are a large number of different header files that
are related to the sockets APIs. The virsocket.h header
includes all of the relevant headers for Windows and UNIX
in one convenient place. If virsocketaddr.h is already
included, then there's no need for virsocket.h
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This is a simplified variant of gnulib's passfd module
without the portability code that we do not require.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Currently when disk is removed from iotune group (by setting
all tunables to zero) group name is leaved in config. Let's fix
it.
Given iotune defaults are taken from the destination group setting
tunables to zero may require different set of zero settings in API
call. Let's prohibit removing from group while specifying different
group name then current for the sanity sake.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
For example if disk is not in the group and we want to move it
there then it makes sense to specify only the group name in API call.
Currently the destination group iotune settings will be overwritten
with the disk settings which I would say is not what one would expect.
Thus let's get defaults from the group we are moving to.
And if we are moving the brand new group then is makes sense to
copy the current disk iotune settings to the group.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virDomainSetBlockIoTune not simply sets the iotune params given in API
but use current settings for all the omitted params. Unfortunately
it uses current settings for active config when setting inactive
params. Let's fix it.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently it is possible to start a domain which have disks
in same iotune group and at the same time having different iotune
params. Both params set are passed to qemu in command line and the one
that is passed later down command line is get actually set.
Let's prohibit such configurations.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently upon successfull call to qemu's implementation of
virDomainSetBlockIoTune iotune settings are changed only for the
disk given in API if the disk is in iotune group while we need
to change the settings for all disks in the group.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
And introduce virDomainBlockIoTuneInfoHasAny.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
All the callees return either 0 or -1 so there is no need
for propagating the value. And we bail on the first error.
Remove the variable to make the function simpler.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We have a helper variable to make the code more concise,
use it consistently.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now that the cleanup section is empty, eliminate the cleanup
label as well as the 'ret' variable.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Use the g_auto macros wherever possible to eliminate the cleanup
section.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The template still references libvirt-qemu-shim, which was at one
point the name used to refer to what we now know as virt-qemu-run.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
A recent commit added an error check for too-nested backing chains
followed by a return, even though errors above jump to cleanup.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: b168fa88b8
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Remove bogus G_GNUC_UNUSED attribute and add a missing space.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: d600667278
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The algorithm is used in two places to find the parent checkpoint object
which contains given disk and then uses data from the disk. Additionally
the code is written in a very non-obvious way. Factor out the lookup of
the disk into a function which also simplifies the callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>