qemuAliasForSecret is meant as a replacement qemuDomainGetSecretAESAlias
with saner API. The sub-type we are creating the alias for is passed in
as a string rather than the unflexible 'isLuks' boolean.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Replace it by a direct call to qemuDomainSecretAESSetupFromSecret.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Split out the lookup of the secret from the secret driver into
qemuDomainSecretAESSetupFromSecret so that we can also instantiate
secret objects in qemu with data from other sources.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rather than passing in an empty qemuDomainSecretInfoPtr allocate it
in this function and return it. This is done by absorbing the check from
qemuDomainSecretInfoNew and removing the internals of
qemuDomainSecretInfoNew.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use g_autofree for the ciphertext and init vector as they are not
secret and thus don't have to be cleared and use g_new0 to allocate the
iv for parity.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The comment mentioned that the function resets migration params, but
that is not true as of commit eb54cb473a
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use g_autofree instead of VIR_FREE and delete the comment mentioning
possible failure to allocate memory.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Using a double pointer prevents the function from being used as the
automatic cleanup function for the given type.
Remove the double pointer use by replacing the calls with
g_clear_pointer which ensures that the pointer is cleared.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
oVirt used a quirk in the pre-blockdev semantics of drive-mirror which
opened the backing chain of the mirror destination only once
'block-job-complete' was called.
Our introduction of blockdev made qemu open the backing chain images
right at the start of the job. This broke oVirt's usage of this API
because they copy the data into the backing chain during the time the
block copy job is running.
Re-introduce late open of the backing chain if qemu allows us to use
blockdev-snapshot on write-only nodes as it can be used to install the
backing chain even for an existing image now.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The capability is based on qemu's support of using blockdev-snapshot to
install backing chain also for images which are in use by a block-copy
job.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
For a long time we've masked out VIR_DOMAIN_BLOCK_COPY_SHALLOW if
there's no backing chain for the copied disk to simplify the code.
One of the refactors of the block copy code caused that we no longer
update the 'flags' variable just the local copies. This was okay until
in ccd4228aff we started storing the job flags in the block job data.
Given that we modify how we call qemu we also should modify @flags so
that the correct value is recorded in the block job data.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Move the check whether the job is already synchronised to the beginning
of the function so that we don't try to do some of the steps necessary
for pivoting prior to actually wanting to pivot.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
'nfs' variable was set to -1 or -2 on agent failure. Cleanup then tried
to free 'nfs' elements of the array which resulted into a crash.
Make 'nfs' size_t and assign it only on successful agent call.
https://bugzilla.redhat.com/show_bug.cgi?id=1812965
Broken by commit 599ae372d8
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The only caller doesn't check the value and also there are no real
errors to report anyways.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virQEMUCaps structure has usedQMP member which in the past
used to tell if qemu we are dealing with is capable of QMP. Well,
we don't support HMP anymore (minus a few HMP passthrough
commands, which are wrapped into QMP anyways) and the member is
not used really.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
needReply added in [1] looks redundant. Indeed it is set to false only
when mon->await_event is set too (the only exception qemuAgentFSTrim
which is mistaken).
However it fixes the issue when qemuAgentCommand exits on error path and
mon->await_event is not reset. Let's instead reset mon->await_event properly.
Also remove "Woken up by event" debug message as it can be misleading.
We can get it also if monitor is closed due to serial changed event
currently. Anyway both qemuAgentClose and qemuAgentNotifyEvent log
itself.
[1] qemu: make sure agent returns error when required data are missing
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Sync was introduced in [1] to check for ga presence. This
check is racy but in the era before serial events are available
there was not better solution I guess.
In case we have the events the sync function is different. It allows us
to flush stateless ga channel from remnants of previous communications.
But we need to do it only once. Until we get timeout on issued command
channel state is ok.
[1] qemu_agent: Issue guest-sync prior to every command
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If a disk has persistent reservations enabled, qemu-pr-helper
might open not only /dev/mapper/control but also individual
targets of the multipath device. We are already querying for them
in CGroups, but now we have to create them in the namespace too.
This was brought up in [1].
1: https://bugzilla.redhat.com/show_bug.cgi?id=1711045#c61
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Lin Ma <LMa@suse.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
This converts the QEMU agent APIs to use the per-VM
event loop, which involves switching from virEvent APIs
to GMainContext / GSource APIs.
A GSocket is used as a convenient way to create a GSource
for a socket, but is not yet used for actual I/O.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We are dealing with the QEMU agent, not the monitor.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This converts the QEMU monitor APIs to use the per-VM
event loop, which involves switching from virEvent APIs
to GMainContext / GSource APIs.
A GSocket is used as a convenient way to create a GSource
for a socket, but is not yet used for actual I/O.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
In common with regular QEMU guests, the QMP probing
will need an event loop for handling monitor I/O
operations.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The event loop thread will be responsible for handling
any per-domain I/O operations, most notably the QEMU
monitor and agent sockets.
We start this event loop when launching QEMU, but stopping
the event loop is a little more complicated. The obvious
idea is to stop it in qemuProcessStop(), but if we do that
we risk loosing the final events from the QEMU monitor, as
they might not have been read by the event thread at the
time we tell the thread to stop.
The solution is to delay shutdown of the event thread until
we have seen EOF from the QEMU monitor, and thus we know
there are no further events to process.
Note that this assumes that we don't have events to process
from the QEMU agent.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When preparing images for block jobs we modify their seclabels so
that QEMU can open them. However, as mentioned in the previous
commit, secdrivers base some it their decisions whether the image
they are working on is top of of the backing chain. Fortunately,
in places where we call secdrivers we know this and the
information can be passed to secdrivers.
The problem is the following: after the first blockcommit from
the base to one of the parents the XATTRs on the base image are
not cleared and therefore the second attempt to do another
blockcommit fails. This is caused by blockcommit code calling
qemuSecuritySetImageLabel() over the base image, possibly
multiple times (to ensure RW/RO access). A naive fix would be to
call the restore function. But this is not possible, because that
would deny QEMU the access to the base image. Fortunately, we
can use the fact that seclabels are remembered only for the top
of the backing chain and not for the rest of the backing chain.
And thanks to the previous commit we can tell secdrivers which
images are top of the backing chain.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1803551
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Historically threads are given a name based on the C function,
and this name is just used inside libvirt. With OS level thread
naming this name is now visible to debuggers, but also has to
fit in 15 characters on Linux, so function names are too long
in some cases.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The qemuMonitorOpenFD method has not been used since it
was first introduced.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Libvirt has never configured the QEMU agent to support
running on a PTY implicitly. In theory an end user may
have written such an XML config, but this is reasonably
unlikely since when a bare <channel> is provided, libvirt
will auto-expand it to a UNIX socket backend.
With this change a user who has use the PTY backend will
have to switch to the UNIX backend if they wish to use
libvirt APIs for interacting with the agent. This will
not have guest ABI impact.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
qemuMonitorJSONMakeCommandInternal does the full command construction if
you pass in what would become the value of the 'arguments' key. Refactor
the open-coded implementation to use the helper and use modern cleanup
helpers at the same time.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Make it obvious that the function always returns a valid pointer and fix
all callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
I've found that if my virtlogd is socket activated but the daemon
doesn't run yet, then the virt-qemu-run is killed right after it
tries to start the domain. The problem is that because the default
setting is to use virtlogd, the domain create code tries to
connect to virtlogd socket, which in turn tries to detect who is
connecting (virNetSocketGetUNIXIdentity()) and as a part of it,
it will try to open /proc/${PID_OF_SHIM}/stat which is denied by
SELinux:
type=AVC msg=audit(1582903501.927:323): avc: denied { search } for \
pid=1210 comm="virtlogd" name="1843" dev="proc" ino=37224 \
scontext=system_u:system_r:virtlogd_t:s0-s0:c0.c1023 \
tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir \
permissive=0
Virtlogd reacts by closing the connection which the shim sees as
SIGPIPE. Since the default response to the signal is Term, we
don't even get to reporting any error nor to removing the
temporary directory.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
When virt-qemu-run is ran without any root directory specified on
the command line, a temporary directory is made and used instead.
But since we are using g_dir_make_tmp() to create the directory
it is going to have 0700 mode. So even though we create the whole
directory structure under it and label everything, QEMU is very
likely to not have the access. This is because in this case there
is no qemu.conf and thus distro default UID:GID is used to run
QEMU (e.g. qemu:kvm on Fedora). Change the mode of the temporary
directory so that everybody has eXecute permission.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Libvirt tries to forbid migration onto the same host and it does
that by checking if local and remote hostnames are the same and
whether local and remote UUIDs are the same. Well, the latter
makes sense but the former doesn't really because libvirtd can be
running inside an UTS namespace and hostnames can appear the same
on both sides of migration. On the other hand, host UUIDs are
unique, so rely on them when trying to prevent migration onto the
same host.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1639596
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Use the 'flat' flag for 'query-named-block-nodes' if qemu supports
QEMU_CAPS_QMP_QUERY_NAMED_BLOCK_NODES_FLAT in qemuBlockGetNamedNodeData.
We don't need the data so plumb in whether qemu supports the
'flat' output.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Modern qemu allows to skip the nested redundant data in the output of
query-named-block-nodes. Plumb in the support for the argument that
enables it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Replace qemuMonitorBlockGetNamedNodeData by qemuBlockGetNamedNodeData.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Detect the presence of the flag and make it available internally as
QEMU_CAPS_QMP_QUERY_NAMED_BLOCK_NODES_FLAT.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The monitor password callback was removed long time ago but the callback
type and variable were left around. Finish the cleanup.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Format the 'vhost-user-fs' device on the QEMU command line.
This device provides shared file system access using the FUSE protocol
carried over virtio.
The actual file server is implemented in an external vhost-user-fs device
backend process.
https://bugzilla.redhat.com/show_bug.cgi?id=1694166
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Look into /usr/share/qemu/vhost-user to see whether we can find
a suitable virtiofsd binary, in case the user did not provide one
in the domain XML.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Wire up the code to put virtiofsd in the emulator cgroup on domain
startup.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Start virtiofsd for each <filesystem> device using it.
Pre-create the socket for communication with QEMU and pass it
to virtiofsd.
Note that virtiofsd needs to run as root.
https://bugzilla.redhat.com/show_bug.cgi?id=1694166
Introduced by QEMU commit a43efa34c7d7b628cbf1ec0fe60043e5c91043ea
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
This is not yet supported.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Add a 'virtiofsd_debug' option for tuning whether to run virtiofsd
in debug mode.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Introduce a new 'virtiofs' driver type for filesystem.
<filesystem type='mount' accessmode='passthrough'>
<driver type='virtiofs'/>
<source dir='/path'/>
<target dir='mount_tag'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</filesystem>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Introduced by QEMU commit 98fc1ada4cf70af0f1df1a2d7183cf786fc7da05
virtio: add vhost-user-fs base device
Released in QEMU v4.2.0.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Pass logManager to qemuExtDevicesStart for future usage.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
The default memlock limit is 64k which is not enough to start a single
VM. The requirements for one VM are 12k, 8k for eBPF map and 4k for eBPF
program, however, it fails to create eBPF map and program with 64k limit.
By testing I figured out that the minimal limit is 80k to start a single
VM with functional eBPF and if I add 12k I can start another one.
This leads into following calculation:
80k as memlock limit worked to start a VM with eBPF which means there
is 68k of lock memory that I was not able to figure out what was using
it. So to get a number for 4096 VMs:
68 + 12 * 4096 = 49220
If we round it up we will get 64M of memory lock limit to support 4096
VMs with default map size which can hold 64 entries for devices.
This should be good enough as a sane default and users can change it if
the need to.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1807090
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Whenever there is a guest CPU configured in domain XML, we will call
some CPU driver APIs to validate the CPU definition and check its
compatibility with the hypervisor. Thus domains with guest CPU
specification can only be started if the guest architecture is supported
by the CPU driver. But we would add a default CPU to any domain as long
as QEMU reports it causing failures to start any domain on affected
architectures.
https://bugzilla.redhat.com/show_bug.cgi?id=1805755
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
While our code can detect ISO as a separate format, qemu does not use it
as such and just passes it through as raw. Add conversion for detected
parts of the backing chain so that the validation code does not reject
it right away.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Include virutil.h in all files that use it,
instead of relying on it being pulled in somehow.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Historically, this file was a dump for most of our helper
functions and needed almost everywhere.
With the introduction of virfile.h and virstring.h,
and more importantly, virenum.h and the introduction
of GLib, that is no longer true.
Remove its include from C files that don't even use it.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Both callers pass false. Since we frown upon format probing, remove the
unused possibility to do the probing.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The backend name is memory-backend-memfd but we've been checking
for memory-backend-memory.
Reported by GCC on rawhide:
../../../src/internal.h:75:22: error: 'strcmp' of a string of length 21 and
an array of size 21 evaluates to nonzero [-Werror=string-compare]
../../../src/qemu/qemu_command.c:3525:20: note: in expansion of macro 'STREQ'
3525 | } else if (STREQ(backendType, "memory-backend-memory") &&
| ^~~~~
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: 24b74d187c
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Another vircgroup helper to avoid code repetition between
the LXC and QEMU driver.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
lxcDomainSetMemoryParameters() and qemuDomainSetMemoryParameters()
has duplicated chunks of code that can be put in a new
helper.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This new helper avoids more code repetition inside
lxcDomainSetBlkioParameters() and qemuDomainSetBlkioParameters().
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
After the introduction of virDomainDriverMergeBlkioDevice() in a
previous patch, it is now clear that lxcDomainSetBlkioParameters() and
qemuDomainSetBlkioParameters() uses the same loop to set cgroup
blkio parameter of a domain.
Avoid the repetition by adding a new helper called
virDomainCgroupSetupDomainBlkioParameters().
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
lxcDomainParseBlkioDeviceStr() and qemuDomainParseBlkioDeviceStr()
are the same function. Avoid code repetition by putting the code
in a new helper.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
lxcDomainMergeBlkioDevice() and qemuDomainMergeBlkioDevice()
are the same functions. This duplicated code can't be put in
the existing domain_cgroup.c since it's not cgroup related.
This patch introduces a new src/hypervisor/domain_driver.c to
host this more generic code that can be shared between virt
drivers. This new file is then used to create a new helper
called virDomainDeivceMergeBlkioDevice() to eliminate the code
repetition mentioned above. Callers in LXC and QEMU files
were updated.
This change is a preliminary step for more code reduction of
cgroup related code inside lxcDomainSetBlkioParameters() and
qemuDomainSetBlkioParameters().
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuSetupCgroupVcpuBW() and lxcSetVcpuBWLive() shares the
same code to set CPU CFS period and quota. This code can be
moved to a new virCgroupSetupCpuPeriodQuota() helper to
avoid code repetition.
A similar code is also executed in virLXCCgroupSetupCpuTune(),
but without the rollback on error. Use the new helper in this
function as well since the 'period' rollback, if not a
straight improvement for virLXCCgroupSetupCpuTune(), is
benign. And we end up cutting more code repetition.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The code that calls virCgroupSetCpuShares() and virCgroupGetCpuShares()
is repeated in 4 different places. Let's put it in a new
virCgroupSetupCpuShares() to avoid code repetition.
There's a reason of why we execute a Get in the same value we
just executed Set, explained in detail by commit 97814d8ab3.
Let's add a gist of the reasoning behind it as a comment in
this new function as well.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The code from qemuSetupCgroupCpusetCpus() and virLXCCgroupSetupCpusetTune()
can be centralized in a new helper called virCgroupSetupCpusetCpus().
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virLXCCgroupSetupMemTune() and qemuSetupMemoryCgroup() shares
duplicated code that can be put in a new helper to avoid
code repetition.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There is duplicated code between virt drivers that needs to
be moved to avoid code repetition. In the case of duplicated
code between lxc_cgroup.c and qemu_cgroup.c a common place
would be utils/vircgroup.c. The problem is that this would
introduce /conf related definitions that shouldn't be imported
to vircgroup.c, which is supposed to be a place for utilitary
cgroups functions only. And syntax-check would forbid it anyway
due to cross-directory includes being used.
An alternative would be to overload domain_conf.c, which already
contains all the definitions required. But that file is already
crowded with XML handling code and we wouldn't do any favors to
it by putting more utilitary, non-XML parsing/formatting code
there.
In [1], Cole suggested a 'domain_cgroup' file to host common code
between lxc_cgroup and qemu_cgroup, and Daniel suggested a
'src/hypervisor' dir to host these type of files. This patch
introduces src/hypervisor/domain_cgroup.c and, to get started,
introduces a new virDomainCgroupSetupBlkio() function to host shared
code between virLXCCgroupSetupBlkioTune() and qemuSetupBlkioCgroup().
[1] https://www.redhat.com/archives/libvir-list/2019-December/msg00817.html
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There are code repetition of set() and get() blkio device
parameters across lxc and qemu files. Use the new vircgroup
helpers to trim the repetition a bit.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This setting can be updating very easily on an already active
interface by just changing it in sysfs. If the bridge used for
connection is also changed, there is no need to separately update it,
because the new setting isf done as a part of connecting to the bridge
anyway.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This patch pushes the isolatedPort setting from the <interface> down
all the way to the callers of virNetDevBridgeAddPort(), and sets
BR_ISOLATED on the port (using virNetDevBridgePortSetIsolated()) after
the port has been successfully added to the bridge.
Signed-off-by: Laine Stump <laine@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Not only was the original error code destroyed in the case of
encountering an error during recovery from a failed attach to the
bridge (and then *that* error was destroyed by logging a *second*
error about the failure to recover - virNetDevBridgeAddPort() already
logs an error, so the one about failing to recover was redundant), but
if the recovery was successful, the function would then return success
to the caller even though it had failed.
Fixes: 2711ac8716
(overwritten errors were introduced along with this functionality)
Fixes: 6bde0a1a37
(the wrong return value was introduced by a refactor)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Firstly, the check for disk I/O error can be moved into 'if
(!offline)' section a few lines below.
Secondly, checks for vmstate and slirp should be moved under the
same section because they reflect live state of a domain. For
offline migration no QEMU is involved and thus these restrictions
are not valid.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In two places where virPidFileForceCleanupPath() is called, we
try to unlink() the pidfile again. This is needless because
virPidFileForceCleanupPath() has done just that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuMonitorGetIOThreads returns a NULL-terminated list even when 0
iothreads are present. The caller didn't perform cleanup if there were 0
iothreads leaking the array.
https://bugzilla.redhat.com/show_bug.cgi?id=1804548
Fixes: d1eac92784
Reported-by: Jing Yan <jiyan@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
QoS 'floor' setting is documented to be only supported for interfaces of
type 'network'. Fail with an error message on attempt to set 'floor' on
an interface of any other type.
Signed-off-by: Pavel Mores <pmores@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Implement support for the slice of type 'storage' which allows to set
the offset and size which modifies where qemu should look for the start
of the format container inside the image.
Since slicing is done using the 'raw' driver we need to add another
layer into the blockdev tree if there's any non-raw image format driver
used to access the data.
This patch adds the blockdev integration and setup of the image data so
that we can use the slices for any backing image.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When creating overlay images e.g. for snapshots or when merging
snapshots we often specify the backing store string to use. Make the
formatter aware of backing chain entries which have a <slice>
configured so that we record it properly. Otherwise such images
would not work without the XML (when detecting the backing chain).
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The storage slice will require a specific node name in cases when the
image format is not raw. Store and format them in the status XML.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Specifically creating such images via libvirt during blockjobs would
be much more hassle than it's worth. Just forbid them for now.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We support explicit storage slices only when using blockdev. Storage
slices expressed via the backing store string are left to qemu to
open correctly.
Reject storage slices configured via the XML for non-blockdev usage.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If we have a 'format' type slice for a raw driver we can directly format
the values.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If a domain has a NVMe disk it already has the access configured.
Trying to configure it again on a commit or some other operation
is wrong and condemned to failure.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Its behavior is controlled by a KVM-specific CPU feature.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Its use is limited to certain guest types, and it only supports
a subset of all possible tick policies.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This new timer model will be used to control the behavior of the
virtual timer for KVM ARM/virt guests.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We will use this capability to detect whether the QEMU binary
supports the kvm-no-adjvtime CPU feature.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Make sure we are taking all possible virDomainTimerNameType values
into account. This will make upcoming changes easier.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Libvirt switched to using a UNIX socket for monitors in
2009 for version 0.7.0. It seems unlikely that there is
a running QEMU process that hasn't been restarted for
11 years while also taking a libvirt upgrade. Therefore
we can drop support for opening a PTY for the QEMU
monitor.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
My original implementation was completely broken because it attempted to
use object-add/del instead of blockdev-add/del.
https://bugzilla.redhat.com/show_bug.cgi?id=1798366
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Modify qemuMonitorBlockdevAdd so that it takes a double pointer for the
@props argument so that it's cleared inside the call. This allows
writing cleaner callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Use automatic variable freeing and get rid of the cleanup section.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Use automatic variable freeing and get rid of the cleanup section.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Few switch cases returned failure but didn't report an error. For a
situation when the backingStore type='volume' was not translated the
following error would occur:
$ virsh start VM
error: Failed to start domain VM
error: An error occurred, but the cause is unknown
After this patch:
$ virsh start VM
error: Failed to start domain VM
error: internal error: storage source pool 'tmp' volume 'pull3.qcow2' is not translated
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We call APIs that reset the error in the rollback code.
Preserve the error from the original call that failed.
This turns the boringly cryptic:
error: Unable to set interface parameters
error: An error occurred, but the cause is unknown
to the unexpectedly anarchist:
error: internal error: Child process (/usr/sbin/tc filter add
dev vnet1 parent ffff: protocol all u32 match u32 0 0 police
rate 4294968kbps burst 4294968kb mtu 64kb drop flowid :1)
unexpected exit status 1: Illegal "rate"
Illegal "police"
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: f02e21cb33https://bugzilla.redhat.com/show_bug.cgi?id=1800505
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Otherwise an attempt to set an invalid value:
virsh domiftune rhel8.2 vnet0 --outbound 4294968
on an interface with no bandwidth set crashes.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: f02e21cb33https://bugzilla.redhat.com/show_bug.cgi?id=1800505
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This deletes all trace of gnulib from libvirt. We still
have the keycodemapdb submodule to deal with. The simple
solution taken was to update it when running autogen.sh.
Previously gnulib could auto-trigger refresh when running
'make' too. We could figure out a solution for this, but
with the pending meson rewrite it isn't worth worrying
about, given how infrequently keycodemapdb changes.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Pvpanic device supports bit 1 as crashloaded event, it means that
guest actually panicked and run kexec to handle error by guest side.
Handle crashloaded as a lifecyle event in libvirt.
Test case:
Guest side:
before testing, we need make sure kdump is enabled,
1, build new pvpanic driver (with commit from upstream
e0b9a42735f2672ca2764cfbea6e55a81098d5ba
191941692a3d1b6a9614502b279be062926b70f5)
2, insmod new kmod
3, enable crash_kexec_post_notifiers,
# echo 1 > /sys/module/kernel/parameters/crash_kexec_post_notifiers
4, trigger kernel panic
# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger
Host side:
1, build new qemu with pvpanic patches (with commit from upstream
600d7b47e8f5085919fd1d1157f25950ea8dbc11
7dc58deea79a343ac3adc5cadb97215086054c86)
2, build libvirt with this patch
3, handle lifecycle event and trigger guest side panic
# virsh event stretch --event lifecycle
event 'lifecycle' for domain stretch: Crashed Crashloaded
events received: 1
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Starting a KVM domain on s390 with old machine type (such as
s390-ccw-virtio-2.5) and without any guest CPU model configured fails
with
CPU models are not available: KVM doesn't support CPU models
QEMU error. This is cause by libvirt using host-model CPU as the default
CPU based on QEMU reporting "host" CPU model as being the default one
(see commit v5.9.0-402-g24d8202294: qemu: Use host-model CPU on s390 by
default). However, even though both QEMU and KVM support CPU models on
s390 and QEMU can give us the host-model CPU, we can't use it with old
machine types which only support -cpu host.
https://bugzilla.redhat.com/show_bug.cgi?id=1795651
Reported-by: Christian Ehrhardt <paelzer@gmail.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The usability of a specific CPU mode may depend on machine type, let's
prepare for this by passing it to virQEMUCapsIsCPUModeSupported.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Extend QEMU with tpm-spapr support. Assign a device address to the
vTPM device model.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Extend the QEMU capabilties with tpm-spapr support.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
This patch adds support for the tpm-spapr device model for ppc64. The XML for
this type of TPM looks as follows:
<tpm model='tpm-spapr'>
<backend type='emulator'/>
</tpm>
Extend the documentation.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Introduce VIR_DOMAIN_TPM_MODEL_DEFAULT as a default model which we use
in case the user does not provide a model in the device XML. It has
the TIS's previous value of '0'. In the post parsing function
we change this default value to 'TIS' to have the same model as before.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
All our supported Linux distros now have this header.
It has never existed on FreeBSD / macOS / Mingw.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This addreses portability to Windows and standardizes
error reporting. This fixes a number of places which
failed to set O_CLOEXEC or failed to report errors.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Most code now uses the virProcess / virCommand APIs, so
the need for sys/wait.h is quite limited. Removing this
include removes the dependency on GNULIB providing a
dummy sys/wait.h for Windows.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Use qemuBlockBitmapsHandleBlockcopy to calculate bitmaps to copy over
for a block-copy job.
We copy them when pivoting to the new image as at that point we are
certain that we don't dirty any bitmap unnecessarily.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add a function calculating which bitmaps to copy to the mirror during
a block-copy operation.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add a validator which checks that a bitmap spanning multiple backing
chain members doesn't look broken. The current rules are that no
intermediate birmaps are missing (unfortunately it's hard to know
whether the topmost or bottommost bitmap is missing) and none of the
components is inconsistent.
We can obviously improve it over time.
The validator is also tested against the existing bitmap data we have
for the backup merging test as well as some of the existing broken
bitmap synthetic test cases.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The flags may control important aspects of the block job which may
influence also the termination of the job. Store the 'flags' for all
the block job types.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add a variable which will store the contents of the 'flags' variable as
passed in by the individual block jobs. Since the flags may influence
behaviour of the jobs it's important to preserve them to the
finalization steps.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use the glib allocation function that never returns NULL and remove the
now dead-code checks from all callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Create a wrapper for qemuBlockGetNamedNodeData named
qemuBlockGetNamedNodeData. The purpose of the wrapper is to integrate
the monitor handling functionality and in the future possible
qemuCaps-based flags.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Allow qemu access to modify backing files in case when we want to delete
a checkpoint.
This patch adds tracking of which images need to be relabelled when
calculating the transaction, the code to relabel them and rollback.
To verify that stuff works we also output the list of images to relabel
into the test case output files in qemublocktest.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Allow deleting of checkpoints when snapshots were created along. The
code tracks and modifies the checkpoint list so that backups can still
be taken with such a backing chain. This unfortunately requires to
rename few bitmaps (by copying and deleting them) in some cases.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This requires stealing one cmd pointer before returning it.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Mark eligible declarations as g_autofree and remove
the corresponding VIR_FREE calls.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Always trim the full specified suffix.
All of the callers outside of tests were passing either
strlen or the actual length of the string.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now, that every use of virAtomic was replaced with its g_atomic
equivalent, let's remove the module.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The rewrite to use GLib's atomic ops functions changed the behavior
of virAtomicIntInc - before it returned the pre-increment value.
Most of the callers using its value were adjusted, but the one
in qemuDriverAllocateID was not. If libvirtd would reconnect to
a running domain during startup, the next started domain would get
the same ID:
$ virsh list
Id Name State
--------------------------
1 f28live running
1 f28live1 running
Use the g_atomic_add function directly (as recommended in viratomic.h)
and add 1 to the result.
This also restores the usual numbering from 1 instead of 0.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: 7b9645a7d1
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Starting on commit 1f43393283, qemuDomainFillDeviceIsolationGroup()
returns 0 in all circunstances. Let's turn it to 'void' make it
clearer that the function will not fail. This also spares a
check for < 0 return in qemu_hotplug.c. The
qemuDomainFillDeviceIsolationGroupIter() callback now returns
0 at all times - which is already happening anyway.
Refer to 1f43393283 commit message for more details on why
the function was changed to never return an error.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
qemuDomainChrDefDropDefaultPath() returns an int, but it's
always returning 0. Callers are checking for result < 0 to
run their cleanup code needlessly.
Turn the function to 'void' and adjust the callers.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Avoid some of the virObjectUnref() calls by using g_autoptr.
Aside from the 'cleanup' label in qemuDomainSetFakeReboot(),
all other now deprecated cleanup labels will be removed in
the next patch.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use g_autofree to remove VIR_FREE() calls used for cleanups.
Labels that became deprecated will be removed in a later
patch.
In qemuDomainSetupDisk(), the 'dst' variable is not used at
all and could be removed.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The 'caps' variable in qemuDomainObjPrivateXMLParseAutomaticPlacement()
is set to auto clean via g_autoptr(), but a 'virObjectUnref(caps)' is
being executed in the 'cleanup' label.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
With -blockdev we must look up via the nodename rather than the 'drive'
alias which is not present any more.
This fixes the pre-creation of storage volumes on migration with
non-shared storage.
https://bugzilla.redhat.com/show_bug.cgi?id=1793263
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Swithc to the helper which doesn't require checking of the return value.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The data is gathered only once so we can move the whole block which
fetches the data out of the loop and get rid of the logic which
prevents multiple calls.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Refactor the logic to skip the body of the function if there's nothing
to do.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There are two calls to virHashNew which check the return value. It's not
necessary any more as virHashNew always returns a valid pointer.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Aside from itinerant error (actually warning) messages due to an
unrecognized response from qemu, this isn't even necessary - the
migration proceeds successfully to completion anyway.
(I'm not sure where to see this status reported in the API though - do
we need to add an extra state, or recognition of a new event somewhere?)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Normally a PCI hostdev can't be migrated, so
qemuMigrationSrcIsAllowedHostdev() won't permit it. In the case of a a
hostdev network interface that has <teaming type='transient'/> set,
QEMU will automatically unplug the device prior to migration, and
re-plug a corresponding device on the destination. This patch modifies
qemuMigrationSrcIsAllowedHostdev() to allow domains with those devices
to be migrated.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The QEMU driver uses the <teaming type='persistent|transient'
persistent='blah'/> element to setup a "failover" pair of devices -
the persistent device must be a virtio emulated NIC, with the only
extra configuration being the addition of ",failover=on" to the device
commandline, and the transient device must be a hostdev NIC
(<interface type='hostdev'> or <interface type='network'> with a
network that is a pool of SRIOV VFs) where the extra configuration is
the addition of ",failover_pair_id=$aliasOfVirtio" to the device
commandline. These new options are supported in QEMU 4.2.0 and later.
Extra qemu-specific validation is added to ensure that the device
type/model is appropriate and that the qemu binary supports these
commandline options.
The result of this will be:
1) The virtio device presented to the guest will have an extra bit set
in its PCI capabilities indicating that it can be used as a failover
backup device. The virtio guest driver will need to be equipped to do
something with this information - this is included in the Linux
virtio-net driver in kernel 4.18 and above (and also backported to
some older distro kernels). Unfortunately there is no way for libvirt
to learn whether or not the guest driver supports failover - if it
doesn't then the extra PCI capability will be ignored and the guest OS
will just see two independent devices. (NB: the current virtio guest
driver also requires that the MAC addresses of the two NICs match in
order to pair them into a bond).
2) When a migration is requested, QEMu will automatically unplug the
transient/hostdev NIC from the guest on the source host before
starting migration, and automatically re-plug a similar device after
restarting the guest CPUs on the destination host. While the transient
NIC is unplugged, all network traffic will go through the
persistent/virtio device, but when the hostdev NIC is plugged in, it
will get all the traffic. This means that in normal circumstances the
guest gets the performance advantage of vfio-assigned "real hardware"
networking, but it can still be migrated with the only downside being
a performance penalty (due to using an emulated NIC) during the
migration.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Presence of the virtio-net-pci option called "failover" indicates
support in a qemu binary of a simplistic bonding of a virtio-net
device with another PCI device. This feature allows migration of
guests that have a network device assigned to a guest with VFIO, by
creating a network bond device in the guest consisting of the
VFIO-assigned device and a virtio-net-pci device, then temporarily
(and automatically) unplugging the VFIO net device prior to migration
(and hotplugging an equivalent device on the migration
destination). (The feature is called "failover" because the bond
device uses the vfio-pci netdev for normal guest networking, but
"fails over" to the virtio-net-pci netdev once the vfio-pci device is
unplugged for migration.)
Full functioning of the feature also requires support in the
virtio-net driver in the guest OS (since that is where the bond device
resides), but if the "failover" commandline option is present for the
virtio-net-pci device in qemu, at least the qemu part of the feature
is available, and libvirt can add the proper options to both the
virtio-net-pci and vfio-pci device commandlines to indicate qemu
should attempt doing the failover during migration.
This patch just adds the qemu capabilities flag "virtio-net.failover".
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>