Every single caller of the
virNetDevMacVLanDeleteWithVPortProfile() function is calling it
wrapped inside of ignore_value() macro. This is because the
function is annotated as G_GNUC_WARN_UNUSED_RESULT. This makes no
sense. Drop the annotation and the macro envelope.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The hotplug code paths need to be able to pass the FDs to the monitor to
ensure that hotplug works.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
To ensure that we can hot-unplug the disk including the associated fdset
we need to store the fdset ID in the status XML.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Rollback of FD sets passed to qemu is also needed after possible restart
of libvirtd when we need to serialize the data into status XML. For this
purpose we need to access the fdset ID once it was passed to qemu and
potentially re-create a 'qemuFDPass' struct in passed state.
Introduce 'qemuFDPassNewPassed' and 'qemuFDPassIsPassed'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Until now the code didn't expect that we'd want to rollback/detach a FD
passed on the commandline, but whith disk backend FD passing this can
happen.
Properly mark the 'qemuFDPass' object as passed to qemu even when it was
done on the commandline.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Copy the pointer to qemuFDPass into struct qemuBlockStorageSourceAttachData
so that it can be used from qemuBuildBlockStorageSourceAttachDataCommandline
rather than looping again in qemuBuildDiskSourceCommandLineFDs.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Be consistent with other children buffer variable naming scheme.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The iTCO watchdog is part of the q35 machine type since its inception,
we just did not add it implicitly.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2137346
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In order for the iTCO watchdog to be operational we must disable the
noreboot pin strap in qemu. This is the default starting from 8.0
machine types, but desirable for older ones as well. And we can safely
do that since that is not guest-visible.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This is already possible with qemu, and actually already happening with
q35 machines and a specified watchdog since q35 already includes a
watchdog we do not include in the XML. In order to express such
posibility multiple watchdogs need to be supported.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The g_hash_table_unref() function does not accept NULL. Passing
NULL results in a glib warning being triggered. Check whether the
hash table is not NULL and unref it only then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Support virtio-crypto device, also support cryptodev types:
- builtin
- lkcf
Finally, we can launch a VM(QEMU) with one or more crypto devices by
libvirt.
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Changes in this commit:
- docs: formatdomaincaps.rst
- conf: crypto related domain caps
- qemu: crypto related
- tests: crypto related test
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Introduce crypto device like:
<crypto model='virtio' type='qemu'>
<backend model='builtin' queues='1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
</crypto>
<crypto model='virtio' type='qemu'>
<backend model='lkcf'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
</crypto>
Currently, crypto model supports virtio only, type supports qemu only
(vhost-user in the plan). For the qemu type, backend supports modle
builtin/lkcf, and the queues is optional.
Changes in this commit:
- docs: formatdomain.rst
- schemas: domaincommon.rng
- conf: crypto related domain conf
- qemu: crypto related
- tests: crypto related test
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The field is no longer used so we can remove it and the code filling it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
All callers pass 'false' so we no longer need it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Commit b7798a07f9 (in fall of 2016) changed the way we generate aliases
for 'dimm' memory devices as the alias itself is part of the migration
stream section naming and thus must be treated as ABI.
The code added compatibility layer for VMs with memory hotplug started
with the old scheme to prevent from generating wrong aliases. The
compatibility layer broke though later when 'nvdimm' and 'pmem' devices
were introduced as it wrongly detected them as old configuration.
Now rather than attempting to fix the legacy compat layer to treat other
devices properly we'll be better off simply removing it as it's
extremely unlikely that somebody has a VM started in 2016 running with
today's libvirt and attempts to hotplug more memory.
This fixes a corner case when a user hot-adds a 'dimm' into a VM with a
'dimm' and a 'nvdimm' after restart of libvirtd and then attempts to
migrate the VM.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2158701
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
In error case, unref event->vm instead of vm. This makes it
easier for the reader to understand as it is the event struct
that's holding the reference.
Signed-off-by: Shaleen Bathla <shaleen.bathla@oracle.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We have virDomainGetCPUStats() API which offers querying
statistics on host CPU usage by given guest. And it works in two
modes: getting overall stats (@start_cpu == -1, @ncpus == 1) or
getting per host CPU usage.
For the QEMU driver it is implemented by looking into values
stored in corresponding cpuacct CGroup controller. Well, this
works for system instances, where libvirt has permissions to
create CGroups and place QEMU process into them. But it does not
fly for session connection, where no CGroups are set up.
Fortunately, we can do something similar to v8.8.0-rc1~95 and use
virProcessGetStatInfo() to fill the overall stats. Unfortunately,
I haven't found any source of per host CPU usage, so we just
continue throwing an error in that case.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Firstly, the virProcessGetStatInfo() does not fail really. But
even if it did, it sets correct errno only sometimes (and even
that is done in a helper it's calling - virProcessGetStat() and
even there it's the case only in very few error paths).
Therefore, using virReportSystemError() to report errors is very
misleading. Use plain virReportError() instead. Luckily, there
are only two places where the former was used:
chDomainHelperGetVcpus() and qemuDomainHelperGetVcpus() (not a
big surprise since CH driver is heavily inspired by QEMU driver).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
In a recent commit of v9.0.0-rc1~192 I've tried to forbid case
where a TAP device already exists, but at the same time it's
managed by Libvirt (<interface type='ethernet'> <target
dev='tap0' managed='yes'/> </interface>). NB, if @managed
attribute is missing then it's assumed to be managed by Libvirt.
Anyway, I've mistakenly put setting of
VIR_NETDEV_TAP_CREATE_ALLOW_EXISTING flag into managed='yes'
branch instead of managed='no' branch in
qemuInterfaceEthernetConnect().
Move the setting of the flag into the correct branch.
Fixes: a2ae3d299c
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The systemd service files of the qemu and libxl driver currently have a
'Requires' dependency on virtlockd, which is too strong since virtlockd
is not enabled by default in either driver. Change the dependency to a
'Wants' to avoid a package dependency between the driver subpackages and
the new libvirt-daemon-lock subpackage.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This commented-out option was pointed out by jtomko during review, but
I missed taking it out when addressing his comments.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This attribute was added to support setting the --interface option for
passt, but in a post-push/pre-9.0-release review, danpb pointed out
that it would be better to use the existing <source dev='xxx'/>
attribute to set --interface rather than creating a new attribute (in
the wrong place). So we remove backend/upstream, and change the passt
commandline creation to grab the name for --interface from source/dev.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Currently, the ThreadContext object is generated whenever we see
.host-nodes attribute for a memory-backend-* object. The idea was
that when the backend is pinned to a specific set of host NUMA
nodes, then the allocation could be happening on CPUs from those
nodes too. But this may not be always possible.
Users might configure their guests in such way that vCPUs and
corresponding guest NUMA nodes are on different host NUMA nodes
than emulator thread. In this case, ThreadContext won't work,
because ThreadContext objects live in context of the emulator
thread (vCPU threads are moved around by us later, when emulator
thread finished its setup and spawned vCPU threads - see
qemuProcessSetupVcpus()). Therefore, memory allocation is done by
emulator thread which is pinned to a subset of host NUMA nodes,
but tries to create a ThreadContext object with a disjoint subset
of host NUMA nodes, which fails.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2154750
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Jumping to the error label and reading the pidfile does not make sense
until we reached qemuSecurityCommandRun which creates the pidfile.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
The pidfile is guaranteed to be non-NULL (thanks to glib allocation
functions) and it's dereferenced two lines above anyway.
Reported by coverity:
/src/qemu/qemu_passt.c: 278 in qemuPasstStart()
272 return 0;
273
274 error:
275 ignore_value(virPidFileReadPathIfLocked(pidfile, &pid));
276 if (pid != -1)
277 virProcessKillPainfully(pid, true);
>>> CID 404360: Null pointer dereferences (REVERSE_INULL)
>>> Null-checking "pidfile" suggests that it may be null, but it
>>> has already been dereferenced on all paths leading to the check.
278 if (pidfile)
279 unlink(pidfile);
280
281 return -1;
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
To ensure same behaviour when remote driver is or is not used we must
not steal the FDs and array holding them passed to qemuDomainFDAssociate
but rather duplicate them. At the same time the remote driver must close
and free them to prevent leak.
Pointed out by Coverity as FD leak on error path:
*** CID 404348: Resource leaks (RESOURCE_LEAK)
/src/remote/remote_daemon_dispatch.c: 7484 in remoteDispatchDomainFdAssociate()
7478 rv = 0;
7479
7480 cleanup:
7481 if (rv < 0)
7482 virNetMessageSaveError(rerr);
7483 virObjectUnref(dom);
>>> CID 404348: Resource leaks (RESOURCE_LEAK)
>>> Variable "fds" going out of scope leaks the storage it points to.
7484 return rv;
Fixes: abd9025c2f
Fixes: f762f87534
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This consists of (1) adding the necessary args to the qemu commandline
netdev option, and (2) starting a passt process prior to starting
qemu, and making sure that it is terminated when it's no longer
needed. Under normal circumstances, passt will terminate itself as
soon as qemu closes its socket, but in case of some error where qemu
is never started, or fails to startup completely, we need to terminate
passt manually.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
passt support requires "-netdev stream", which was added to QEMU in
qemu-7.2.0.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This fits better with the element containing the value (<driver>), and
allows us to use virDomainNetBackend* for things in the <backend>
element.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Assert support for VIR_DOMAIN_DEF_FEATURE_DISK_FD in the qemu driver
now that all code paths are adapted.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Probing stats and block copy to a FD passed image is not yet supported.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
We assume that FD passed images already exist so all existance checks
are skipped.
For the case that a FD-passed image is passed without a terminated
backing chain (thus forcing us to detect) we attempt to read the header
from the FD.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Prepare the internal data for passing FDs instead of having qemu open
the file internally.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When starting up a VM with FD-passed images we need to look up the
corresponding named FD set and associate it with the virStorageSource
based on the name.
The association is brought into virStorageSource as security labelling
code will need to access the FD to perform selinux labelling.
Similarly when startup is complete in certain cases we no longer need to
keep the copy of FDs and thus can close them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The new helper qemuDomainStartupCleanup is used to perform cleanup after
a startup of a VM (successful or not). The initial implementation just
calls qemuDomainSecretDestroy, which can be un-exported.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Implement passing and storage of FDs for the qemu driver. The FD tuples
are g_object instances stored in a per-domain hash table and are
automatically removed once the connection is closed.
In the future we can consider supporting also to not tie the lifetime of
the passed FDs bound to the connection.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The patch moving the code didn't faithfully represent the typecasting
of the 'bandwidth' variable needed to properly convert from the legacy
'unsigned long' argument which resulted in a build failure on 32 bit
systems:
../src/qemu/qemu_block.c: In function ‘qemuBlockCommit’:
../src/qemu/qemu_block.c:3249:23: error: comparison is always false due to limited range of data type [-Werror=type-limits]
3249 | if (bandwidth > LLONG_MAX >> 20) {
| ^
Fix it by returning the check into qemuDomainBlockCommit as it's needed
only because of the legacy argument type in the old API and use
'unsigned long long' for qemuBlockCommit.
Fixes: f5a77198bf
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Some compilers aren't happy when an automatically freed variable is used
just to free something (thus it's only assigned in the code):
When compiling qemuSnapshotDelete after recent commits they complain:
../src/qemu/qemu_snapshot.c:3153:61: error: variable 'delData' set but not used [-Werror,-Wunused-but-set-variable]
g_autoslist(qemuSnapshotDeleteExternalData) delData = NULL;
^
To work around the issue we can restructure the code which also has the
following semantic implications:
- since qemuSnapshotDeleteExternalPrepare does validation we error out
sooner than attempting to start the VM
- we read the temporary variable at least in one code path
Fixes: 4a4d89a925
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
If the daemon crashes or is restarted while the snapshot delete is in
progress we have to handle it gracefully to not leave any block jobs
active.
For now we will simply abort the snapshot delete operation so user can
start it again. We need to refuse deleting external snapshots if there
is already another active job as we would have to figure out which jobs
we can abort.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When daemon is restarted and libvirt tries to recover domain jobs we
need to know if the snapshot job was a snapshot delete in order to
safely abort running QEMU block jobs.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When deleting external snapshots the operation may fail at any point
which could lead to situation that some disks finished the block commit
operation but for some disks it failed and the libvirt job ends.
In order to make sure that the qcow2 images are in consistent state
introduce new element "<snapshotDeleteInProgress/>" that will mark the
disk in snapshot metadata as invalid until the snapshot delete is
completed successfully.
This will prevent deleting snapshot with the invalid disk and in future
reverting to snapshot with the invalid disk.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
With external snapshots we need to modify the metadata bit more then
what is required for internal snapshots. Mainly the storage source
location changes with every external snapshot.
This means that if we delete non-leaf snapshot we need to update all
children snapshots and modify the disk sources for all affected disks.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When deleting snapshot we are starting block-commit job over all disks
that are part of the snapshot.
This operation may fail as it writes data changes to the backing qcow2
image so we need to wait for all the disks to finish the operation and
wait for correct signal from QEMU. If deleting active snapshot we will
get `ready` signal and for inactive snapshots we need to disable
autofinalize in order to get `pending` signal.
At this point if commit for any disk fails for some reason and we abort
the VM is still in consistent state and user can fix the reason why the
deletion failed.
After that we do `pivot` or `finalize` if it's active snapshot or not to
finish the block job. It still may fail but there is nothing else we can
do about it.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In order to save some CPU cycles we will collect all the necessary data
to delete external snapshot before we even start. They will be later
used by code that deletes the snapshots and updates metadata when
needed.
With external snapshots we need data that libvirt gets from running QEMU
process so if the VM is not running we need to start paused QEMU process
for the snapshot deletion and kill at afterwards.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Deleting external snapshots will require to run it as async domain job,
the same way as we do for snapshot creation.
For internal snapshots modify the job mask in order to forbid any other
job to be started.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Deleting internal snapshot when the currently active disk image is
different than where the internal snapshot was taken doesn't work
correctly.
This applies to a running VM only as we are using QMP command and
talking to the QEMU process that is using different disk.
This works correctly when the VM is shut of as in this case we spawn
qemu-img process to delete the snapshot.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Prepare the validation function for external snapshot delete support.
There is one exception when deleting `children-only` snapshots. If the
snapshot tree is like this example:
snap1 (external)
|
+- snap2 (internal)
|
+- snap3 (internal)
|
+- snap4 (internal)
and user calls `snapshot-delete snap1 --children-only` the current
snapshot is external but all the children snapshots are internal only
and we are able to delete it.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Extract the code deleting external snapshot metadata to separate
function.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Previously the reparent happened before the actual snapshot deletion.
This change moves the code closer to the rest of the code handling
snapshot metadata when deletion happens. This makes the metadate
deletion happen after the data files are deleted.
Following patch will extract it into separate function
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This simplifies the code a bit by reusing existing parts that deletes
a single snapshot.
The drawback of this change is that we will now call the re-parent bits
to keep the metadata in sync for every child even though it will get
deleted as well.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Extract code that deletes children of specific snapshot to separate
function.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Extract code that deletes single snapshot to separate function.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Move code around to make it clear what is called when deleting single
snapshot or children snapshots.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
QEMU emits this signal when the job finished its work and is about to be
finalized. If the job is started with autofinalize disabled the job
waits for user input to finalize the job.
This will be used by snapshot delete code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The created job will be needed by external snapshot delete code so
rework qemuBlockCommit to return that pointer.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
External snapshots will use this to synchronize qemu block jobs.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Deleting external snapshots will require configuring autofinalize to
synchronize the block jobs for disks withing single snapshot in order to
be able safely abort of one of the jobs fails.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Upcoming snapshot deletion code will require that multiple commit jobs
are finished in sync. To allow aborting then if one fails we will need
to use manual finalization of the jobs.
This commit implements the monitor code for `job-finalize`.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This will allow to use it while having async domain job active which we
will use when deleting external snapshots.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This will allow to use it while having async domain job active which we
will use when deleting external snapshots. At the same time we will need
to have the block job started as synchronous.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Move the code for finishing a job in the ready state to qemu_block.c.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Up until commit 629282d884, using mode=restrictive caused
virNumaSetupMemoryPolicy() to be called from qemuProcessHook(),
and that in turn resulted in virNumaNodesetIsAvailable() being
called and the nodeset being validated.
After that change, the only validation for the nodeset is the one
happening in qemuBuildMemoryBackendProps(), which is skipped when
using mode=restrictive.
Make sure virNumaNodesetIsAvailable() is called whenever a
nodeset has been provided by the user, regardless of the mode.
https://bugzilla.redhat.com/show_bug.cgi?id=2156289
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When post-copy migration fails, the domain stays running on the
destination with a VIR_DOMAIN_RUNNING_POSTCOPY_FAILED reason. Both the
state and the reason can later be rewritten in case the domain gets
paused for other reasons (such as an I/O error). Thus we need a separate
place to remember the post-copy migration failed to be able to resume
the migration.
https://bugzilla.redhat.com/show_bug.cgi?id=2111948
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The parameter was only used to select which states correspond to an
active or failed post-copy migration. But these states are either
applicable to both operations or the check would just paper over a code
bug in case of an impossible combination of state and operation. By
dropping the check we can make the code simpler and also reuse existing
virDomainObjIsFailedPostcopy function and only check for active
post-copy states.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The qemu driver uses connection close callbacks in more places requiring
more changes than other drivers, but luckily the changes are very
straightforward. The migration code was written in a way ensuring that
there's just one callback present so this can be preserved directly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The function can't fail so there's no point in returning anything.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
1.clear passwd in debug log
2.alignment
3.use the same variable name for function definition and declaration
Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
In a recent commit I've introduced an umount() call. But the
function where the call lives is compiled on all OSes, not just
Linux. But umount() is Linux specific. Other OSes have unmount
(FreeBSD), or maybe something else. But since namespaces are
Linux specific, we can wrap the call in #ifdef __linux__ and not
care about other OSes.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When calling virConnectGetDomainCapabilities() (exposed as virsh
domcapabilities) users have option to specify whatever sub-set of
{ emulatorbin, arch, machine, virttype } they want. Then we have
a logic (hidden in virQEMUCapsCacheLookupDefault()) that picks
qemuCaps that satisfy values passed by user. And whatever was not
specified is then set to the default value as specified by picked
qemuCaps. For instance: if no machine type was provided but
emulatorbin was, then the machine type is set to the default one
as defined by the emulatorbin.
Or, when just virttype was set then the remaining three values
are set to their respective defaults. Except, we have a crasher
in this case:
# virsh domcapabilities --virttype hvf
error: Disconnected from qemu:///system due to end of file
error: failed to get emulator capabilities
error: End of file while reading data: Input/output error
This is because for 'hvf' virttype (at least my) QEMU does not
have any machine type. Therefore, @machine is set to NULL and the
rest of the code does not expect that.
What we can do about this is to validate all arguments. Well,
except for the emulatorbin which is obtained from passed
qemuCaps. This also fixes the issue when domcapabilities for a
virttype of a different driver are requested, or a different
arch.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When deciding whether to bind mount a path in domain's namespace,
we look at the QEMU mount table (/proc/$pid/mounts) and try to
match prefix of given path with one of mount points. Well, we
do that in a bit clumsy way. For instance, if there's
"/dev/hugepages" already mounted inside the namespace and we are
deciding whether to bind mount "/dev/hugepages1G/..." we decide
to skip over the path and NOT bind mount it. This is because
plain STRPREFIX() is used and yes, the former is prefix of the
latter. What we need to check also is whether the next character
after the prefix is slash.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Our code relies on mount events propagating into the namespace we
create for a domain. However, there's one caveat. In v8.8.0-rc1~8
I've tried to make us detect differences in mount tables between
the namespace in which libvirtd runs and the domain namespace.
This is crucial for any mounts that happen after the domain was
started (for instance new hugetlbfs can be mounted on say
/dev/hugepages1G).
Therefore, we take a look into /proc/$(pgrep qemu)/mounts to see
what filesystems are mounted under /dev. Now, since we don't
umount the original /dev, just mount a tmpfs over it, we get all
the events (e.g. aforementioned hugetlbfs mount on
/dev/hugepages1G), but we are not really able to access it
because of the tmpfs that's placed on top. This then confuses our
algorithm for detecting which filesystems are mounted (the
algorithm is implemented in qemuDomainGetPreservedMounts()).
To break the link between host's and guest's /dev we just need to
umount() the original /dev in the namespace. Just before our
artificially created tmpfs is moved into its place.
Fixes: 46b03819ae
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2151869#c6
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Inside of qemuCaps (for the corresponding accelerator) we have
full host CPU expansion stored, among with supported Hyper-V
Enlightenments. To report them in the domain capabilities, we
just have to pick those starting with "hv-" and see if we know
them.
You may notice that neither of our domaincapsdata test shows any
enlightenment. This is because the test works by parsing
corresponding qemucapabilitiesdata/caps_*.xml file and none of
these store the full host CPU expansion (hostCPU.fullQEMU)
because that is runtime piece of information and not formatted
into virQEMUCaps XML.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1717611
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now that we have qemuMonitorGetCPUModelExpansion() aware of
Hyper-V Enlightenments, we can start querying it. Two conditions
need to be met:
1) KVM is in use,
2) Arch is either x86 or arm.
It may look like modifying the first call to
qemuMonitorGetCPUModelExpansion() inside of
virQEMUCapsProbeQMPHostCPU() would be sufficient but it is not.
We really need to ask QEMU for full expansion and the first call
does not guarantee that.
For the test data, I've just copied whatever
'query-cpu-model-expansion' returned earlier, therefore there are
no hv-* props. But that's okay - the full expansion is not stored
in cache (and thus not formatted in
tests/qemucapabilitiesdata/caps_*.replies files either). This is
purely runtime thing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This continues and finishes propagation of the @hv_passthrough
argument started in the previous commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Apart from setting @migratable prop to the
query-cpu-model-expansion command, we will need @hv-passthrough
so that we can query for expansion of Hyper-V Enlightenments
supported on the current host. The idea is to run:
{
"execute": "query-cpu-model-expansion",
"arguments": {
"type": "full",
"model": {
"name": "host",
"props": {
"hv-passthrough": true
}
}
}
}
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In a recent commit, when ditching virXPathULong() the parsing of
<selfvers/> was changed. But it was changed to virXMLPropUInt()
which is not correct because the value we're interested in is not
in an attribute but element itself.
Fixes: a3c7426839
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The @hash variable inside of virQEMUCapsProbeQMPHostCPU() is used
only within a block, but declared at the beginning of the
function. Bring the variable declaration into the said block.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
After previous cleanup this function is no longer used and thus
can be dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When starting swtpm binary, the qemuSecurityStartTPMEmulator() is
called which sets seclabel on the TPM state and then uses
qemuSecurityCommandRun() to execute the swtpm binary with proper
seclabel. Well, the aim is to ditch
qemuSecurityStartTPMEmulator() because it entangles two distinct
operations. Just call functions for them separately.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
If swtpm binary fails to start after successful exec() (e.g. it
fails to initialize itself), the seclabels set in
qemuSecurityStartTPMEmulator() are not restored. This is due to
lacking qemuSecurityRestoreTPMLabels() call in the error path.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now that we have qemuSecurityRestoreTPMLabels() we might as well
have qemuSecuritySetTPMLabels(). The aim here is to remove
qemuSecurityStartTPMEmulator() which couples two separate things
into a single function call.
Therefore, introduce qemuSecuritySetTPMLabels() which does only
set seclabels on the TPM state.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The qemuSecurityCleanupTPMEmulator() function calls
virSecurityManagerRestoreTPMLabels() and thus the proper name is
qemuSecurityRestoreTPMLabels(). Rename it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Currently, qemuSecurityCleanupTPMEmulator() returns nothing which
means a caller (well, there's only one - qemuExtTPMStop()) can't
produce a warning when restoring seclabels on TPM state failed.
True, qemuSecurityCleanupTPMEmulator() does report a warning
itself, but only in one specific error path.
Make the function return an integer, just like the rest of
qemuSecurity*Restore() functions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
qemu is about to deprecate the '-no-hpet' option in favor of configuring
the timer via '-machine'.
Use the QEMU_CAPS_MACHINE_HPET capability to switch to the new syntax
and mask out the old QEMU_CAPS_NO_HPET capability at the same time to
prevent using the old syntax.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
The capability represents that qemu accepts the configuration of the
HPET timer via -machine hpet=on/off rather than the
soon-to-be-deprecated '-no-hpet' option.
The capability is detected from 'query-command-line-options' which
recently added the 'hpet' option.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Introduce a new backend type 'external' for connecting to a swtpm daemon
not managed by libvirtd.
Mostly in one commit, thanks to -Wswitch and the way we generate
capabilities.
https://bugzilla.redhat.com/show_bug.cgi?id=2063723
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Although the qemuMigrationSrcPerformResume actually got called
indirectly via qemuMigrationSrcPerformNative and the recovery process
worked, wrong job phases were used for the "perform" phase, which could
cause issues when libvirt daemon crashed (or was otherwise restarted)
during post-copy recovery.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
It will need to be called from a place above its current definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When qemuDomainObjReleaseAsyncJob is called when the current async job
is already released we emit quite useless warning which was implemented
to warn about releasing a job owned by another thread.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The function is called even if QEMU reports migration as
postcopy-paused, i.e., it's not migrating anymore. And while changing
the warning, we can drop the part about unattended migration to make the
warning shorter and consistent with qemuMigrationSrcPostcopyFailed.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
There are some cases when the internal state of disks can change
without qemu sending events about it (e.g. a disk can close
during reset). In case this happens, we should emit an event
about the modified disk.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1824722#c20
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When starting a guest with <interface/> which has the target
device name set (i.e. not generated by us), it may happen that
the TAP device already exists. This then may lead to all sorts of
problems. For instance: for <interface type='network'/> the TAP
device is plugged into the network's bridge, but since the TAP
device is persistent it remains plugged there even after the
guest is shut off. We don't have a code that unplugs TAP devices
from the bridge because TAP devices we create are transient, i.e.
are removed automatically when QEMU closes their FD.
The only exception is <interface type='ethernet'/> with <target
managed='no'/> where we specifically want to let users use
pre-created TAP device and basically not touch it at all.
There's another reason for denying to use a pre-created TAP
devices: if we ever have bug in TAP name generation, we may
re-use a TAP device from another domain.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2144738
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Historically, QEMU took screenshots in PPM. While this might use
to be popular format, as of v7.1.0-rc0~125^2~6 it is possible to
take screenshots in PNG. This is more popular and renders almost
everywhere, which is not the case for PPM (for instance, modern
browsers do not render it).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'screendump' command has new argument 'format'. Let's expose
this on our QMP level so that callers can specify the format, if
they wish so.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For some reason, only @file argument is printed into debug logs.
The rest of arguments was left out. Include all arguments.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In its v7.1.0-rc0~125^2~6 commit, QEMU gained support for taking
screenshots in PNG format. Track this capability.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Internal domain state needs to be refreshed after reset from the guest
side because it may be inconsistent with the internal qemu state.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Internal domain state may change during the reset and qemu does
not always send events about it. In case it happens, internal
state of the domain in libvirt would be inconsistent with the
internal state in qemu which could cause additional problems
(e.g. cdrom tray state can change from open to closed). The
solution is to refresh state after a successful reset to query
qemu about the current internal domain state.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1824722
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Paths for external devices (well, so far only vTPM) are not
stored in the status XML. Therefore, we need to regenerate them
after we've been restarted and reconnecting to a running domain.
Otherwise these will remain NULL which may later lead to a NULL
dereference.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2150760
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function is going to be called outside of qemu_extdevice.c.
Expose it to the rest of the driver.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The path generation phase belongs conceptually into domain
preparation phase and not host preparation. Move
qemuExtDevicesInitPaths() call from qemuExtDevicesPrepareHost()
into qemuExtDevicesPrepareDomain().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The domain startup process is split into multiple phases. One of
them is preparing the domain (at that point live) XML, private
data, various paths, etc - see qemuProcessPrepareDomain(); the
other prepares the host - see qemuProcessPrepareHost(). It's
obvious that the domain XML preparation function must be called
before the host preparation function (e.g. the host preparation
might try to create a file which path is generated in the domain
preparation phase). Nevertheless, let's document this
expectation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Remove the 'cleanup' label and 'ret' variable as we can now directly
return form all cases.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virDomainDeviceDefCopy' formats the definition and parses it back.
Since we already are parsing the XML here, we're better off parsing it
twice and save the formatting step.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virDomainDeviceDefCopy' formats the definition and parses it back.
Since we already are parsing the XML here, we're better off parsing it
twice and save the formatting step.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use of qemuDomainValidateVcpuInfo in the helpers for hotplug and unplug
of vCPUs can lead to spurious errors reported such as:
internal error: qemu didn't report thread id for vcpu 'XX'"
The reason for this is that qemuDomainValidateVcpuInfo validates the
state of all vCPUs against the expected state of vCPUs. If an unplug
operation completed before libvirt was unable to process it yet the
expected state could not reflect the current state.
To avoid spurious errors the qemuDomainHotplugAddVcpu and
qemuDomainRemoveVcpu functions are modified to do localized validation
only for the vCPUs they actually modify.
We also now ensure that the cgroups are modified before bailing out on
error for any vCPUs which passed validation.
Additionally in order for qemuDomainRemoveVcpuAlias to be able to find
the unplugged vCPU we must ensure that qemuDomainRefreshVcpuInfo does
not clear out the alias in case when the vCPU is no longer reported by
qemu.
Co-authored-by: Partha Satapathy <partha.satapathy@oracle.com>
Signed-off-by: Shaleen Bathla <shaleen.bathla@oracle.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Recently, the QEMU driver gained support for migration with TPM
state on a shared volume (e.g. NFS). As a part of that, the
destination side avoids setting seclabels on it to avoid cutting
off the source while it is still using it. Makes sense, except
for a wee bit: the secdriver API does a bit more - it also sets
label on the swtpm log file. And this one definitely needs to be
labeled (it lives under /var/log/swtpm/libvirt/qemu/..., i.e. not
on a shared volume).
Previously, qemuSecurityStartTPMEmulator() took care of that. But
during rework to shared volume migration, the code was changed so
now plain qemuSecurityCommandRun() would be run (i.e. no
relabelling).
But after previous commits, we can now chose whether the TPM
state should be relabelled or just the log file.
Fixes: 2e669ec789
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2130192#c7
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is basically just a continuation of the previous commit.
Now that the security driver APIs have a boolean flag that
controls setting/restoring seclabel of either both TPM state and
log files, or just the log file, propagate this boolean into
those APIs that start/stop swtpm emulator. For now, just pass
true. The juicy bits are soon to come.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virSecurityDomainSetTPMLabels() and
virSecurityDomainRestoreTPMLabels() APIs set/restore label on two
files/directories:
1) the TPM state (tpm->data.emulator.storagepath), and
2) the TPM log file (tpm->data.emulator.logfile).
Soon there will be a need to set the label on the log file but
not on the state. Therefore, extend these APIs for a boolean flag
that when set does both, but when unset does only 2).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use virJSONValueObjectGetArray + virJSONValueArrayToStringList instead
so that the ofvirJSONValueObjectGetStringArray wrapper can be removed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In two instances (qemuMonitorJSONGetStringListProperty,
qemuMonitorJSONGetStringArray) the return value is checked by
qemuMonitorJSONCheckReply and extracted by
virJSONValueObjectGetStringArray.
We can use qemuMonitorJSONGetReply which returns it directly and then
virJSONValueArrayToStringList to convert it without the additional
lookup.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Using 'virJSONValueObjectHasKey' when we want to access the value
afterwards is wasteful. Fetch the JSON value right away.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Rather than checking that the object has the correct key and then
fetching it again use fetch the array first and then use
virJSONValueArrayToStringList to directly convert it.
Additionally we can avoid the conversion if there are no members
simplifying the surrounding logic.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The 'dependencies' field in the return data may be missing in some
cases. Historically 'virJSONValueObjectGetStringArray' didn't report
error in such case, but later refactor (commit 043b50b948 ) added
an error in order to use it in other places too.
Unfortunately this results in the error log being spammed with an
irrelevant error in case when qemuAgentGetDisks is invoked on a VM
running windows.
Replace the use of virJSONValueObjectGetStringArray by fetching the
array first and calling virJSONValueArrayToStringList only when we have
an array.
Fixes: 043b50b948
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2149752
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use qemuMonitorJSONGetReply and unify the two blocks with the same
condition.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use qemuMonitorJSONGetReply in cases where qemuMonitorJSONCheckReply
is followed by virJSONValueObjectGet*(reply, "return").
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Replace usage of the following pattern with the new helper:
if (qemuMonitorJSONCheckReply(cmd, reply, VIR_JSON_TYPE_ARRAY) < 0)
return -1;
data = virJSONValueObjectGetArray(reply, "return");
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Replace usage of the following pattern with the new helper:
if (qemuMonitorJSONCheckReply(cmd, reply, VIR_JSON_TYPE_OBJECT) < 0)
return -1;
data = virJSONValueObjectGetObject(reply, "return");
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Rather than simply checking that the 'return' field is of the expected
type we can directly return it as the caller is very likely going to use
it. Extract the code into the new function and add a wrapper to preserve
old functionality.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Don't continue with the historical mistake and fix all internal
functions to use a sane type for flags.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virCommandSetSendBuffer() function consumes passed @buffer,
but takes it only as plain pointer. Switch to a double pointer to
make this obvious. This allows us then to drop all
g_steal_pointer() in callers.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
QEMU capabilities is the only thing we use from priv so we can just pass
that directly.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When an internal API takes a vm pointer, it's usually just after the
driver argument.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The vm object is used inside qemuMigrationCookieParse based on the flags
passed to qemuMigrationCookieParse and the content of the cookie. The
callers should not just blindly guess and pass NULL if they
(incorrectly) think the vm object is not needed. We should always pass
the vm object unless it does not exist yet.
This fixes a bug when statistics of a completed migration reported
"Unknown" operation instead of "Incoming migration" on the destination
host.
https://bugzilla.redhat.com/show_bug.cgi?id=2137298
Fixes: v8.7.0-79-g0150f7a8c1
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Inside of qemuTPMEmulatorBuildCommand() there are two calls to
qemuTPMSetupEncryption() which simply ignore returned error. This
is suboptimal because then we rely on swtpm binary reporting a
generic error (something among invalid command line arguments)
while an error reported by qemuTPMSetupEncryption() is more
specific.
However, since virCommandSetSendBuffer() only sets an error
inside of virCommand structure (the error is then reported in
virCommandRun()), we need to exempt its retval from error
checking. Thus, the signature of qemuTPMSetupEncryption() is
changed a bit so that -1/0 can be returned to indicate error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
When there is no vIOMMU, vfio devices don't need to lock the entire guest
memory per-device, but they still need to lock the entire guest memory to
share between all vfio devices. This memory accounting is not shared
with vDPA devices, so it should be added to the memlock limit separately.
Commit 8d5704e2 added support for multiple vfio/vdpa devices but
calculated the limits incorrectly when there were both vdpa and vfio
devices and no vIOMMU. In this case, the memory lock limit was not
increased separately for the vfio devices.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2143838
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
When post-copy migration is running in Finish phase we already did
everything needed and we're just waiting for all the memory to transfer
to the destination. The domain is already running on there at this
point. Once all data is transferred (QEMU sends a MIGRATION completed
event) we're done. So in this specific post-copy case the source does
not need to care about the result of the Finish call as long as QEMU
says migration completed. The Finish call to the destination daemon may
fail for reasons that do not affect QEMU, e.g., libvirt daemon was
restarted there or the libvirt connection broke.
Currently we just mark the post-copy migration as failed on the source
and keep the domain paused there. But when libvirt daemon is restarted
at this point, it will detect migration finished successfully and kill
the domain as migrated. It make sense to do this even without having to
restart the daemon.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/338
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We need the restored job even in case the migration already finished
even though we will stop it just a few lines below as the functions we
call in between require an existing migration job.
This fixes a crash on reconnect when post-copy migration finished while
the daemon was not running.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When generating memory for main guest memory memory-backend-*
might be used. This means, we may need to generate thread-context
objects too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When generating memory for memory devices memory-backend-* might
be used. This means, we may need to generate thread-context
objects too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When generating memory for guest NUMA memory-backend-* might be
used. This means, we may need to generate thread-context objects
too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
While technically thread-context objects can be reused, we only
use them (well, will use them) to pin memory allocation threads.
Therefore, once we connect to QEMU monitor, all memory (with
prealloc=yes) was allocated and thus these objects are no longer
needed and can be removed. For on demand allocation the TC object
is left behind.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The aim of thread-context object is to set affinity on threads
that allocate memory for a memory-backend-* object. For instance:
-object '{"qom-type":"thread-context","id":"tc-ram-node0","node-affinity":[3]}' \
-object '{"qom-type":"memory-backend-memfd","id":"ram-node0","hugetlb":true,\
"hugetlbsize":2097152,"share":true,"prealloc":true,"prealloc-threads":8,\
"size":15032385536,"host-nodes":[3],"policy":"preferred",\
"prealloc-context":"tc-ram-node0"}' \
allocates 14GiB worth of memory, backed by 2MiB hugepages from
host NUMA node 3, using 8 threads. If it weren't for
thread-context these threads wouldn't have any affinity and thus
theoretically could be scheduled to run on CPUs of different NUMA
node (which is what I saw occasionally).
Therefore, whenever we are pinning memory (IOW setting host-nodes
attribute), we can generate thread-context object with the same
affinity.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
In its commit v7.1.0-1429-g7208429223 QEMU gained new object
thread-context, which allows running specialized tasks with
affinity set to a given subset of host CPUs/NUMA nodes. Even
though only memory allocation task accepts this new object, it's
exactly what we aim to implement in libvirt. Therefore, introduce
a new capability to track whether QEMU is capable of this object.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
In one of recent commits an error message was introduced. In this
message a variable of type ssize_t is being printed out, but the
corresponding format directive is %ld instead of %zd which breaks
on 32bits systems. Switch to proper format.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
According to the result parsing from xml, add the argument of
SGX EPC memory backend into QEMU command line.
$ qemu-system-x86_64 \
...... \
-object '{"qom-type":"memory-backend-epc","id":"memepc0","prealloc":true,"size":67108864,"host-nodes":[0,1],"policy":"bind"}' \
-object '{"qom-type":"memory-backend-epc","id":"memepc1","prealloc":true,"size":16777216,"host-nodes":[2,3],"policy":"bind"}' \
-machine sgx-epc.0.memdev=memepc0,sgx-epc.0.node=0,sgx-epc.1.memdev=memepc1,sgx-epc.1.node=1
Signed-off-by: Lin Yang <lin.a.yang@intel.com>
Signed-off-by: Haibin Huang <haibin.huang@intel.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This is similar to the previous commit. SGX memory backend needs
to access /dev/sgx_vepc and /dev/sgx_provision. Create these
nodes in domain's private /dev when required by domain's config.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Haibin Huang <haibin.huang@intel.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
SGX memory backend needs to access /dev/sgx_vepc (which allows
userspace to allocate "raw" EPC without an associated enclave)
and /dev/sgx_provision (which allows creating provisioning
enclaves). Allow these two devices in CGroups if a domain is
configured so.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Haibin Huang <haibin.huang@intel.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Extend hypervisor capabilities to include sgx feature. When available,
the hypervisor supports launching an VM with SGX on Intel platfrom.
The SGX feature tag privides additional details like section size and
sgx1 or sgx2.
Signed-off-by: Haibin Huang <haibin.huang@intel.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Generate the QMP command for query-sgx-capabilities and the command
return SGX capabilities from QMP.
{"execute":"query-sgx-capabilities"}
the right reply:
{"return":
{
"sgx": true,
"section-size": 197132288,
"flc": true
}
}
the error reply:
{"error":
{"class": "GenericError", "desc": "SGX is not enabled in KVM"}
}
Signed-off-by: Haibin Huang <haibin.huang@intel.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
JSON args for -netdev were added as precursor for adding the 'dgram'
network backend type. Enable the detection and update test cases using
DO_TEST_CAPS_LATEST.
Enabling the capability also ensures that the -netdev argument is
validated against the QAPI schema of 'netdev_add' which was already
implemented but not enabled.
The parser supporting JSON was added by qemu commit f3eedcddba3 and
enabled when adding stream/dgram netdevs in commit 5166fe0ae46.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
All callers pass the equivalent of looking up whether qemu supports
QEMU_CAPS_QMP_QUERY_NAMED_BLOCK_NODES_FLAT. Use
'mon->queryNamedBlockNodesFlat' directly and refactor all callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'query-named-block-nodes' in non-flat mode returns redundantly nested
data under the 'backing-image' field. Fortunately we don't need it when
updating the capacity stats.
This function was unfortunately not fixed originally when the support
for flat mode was added. Use the flat cached in the monitor object to
force flat mode if available.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rather than having callers always pass this flag store it in the
qemuMonitor object. Following patches will convert the code to use this
internal flag.
In the future this will also simplify removal when all supported qemu
versions will support the new mode.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We don't need automatic freeing for 'blockNamedNodeData' and we can
directly return it rather than checking it for NULL-ness first.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemu-6.2 introduced support for the hv-avic enlightenment which allows
to use Hyper-V SynIC with hardware APICv/AVIC enabled.
Implement the libvirt support for it.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/402
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In recent commits migration of TPM on shared storage was
introduced. However, I've only complied it with gcc and thus did
not notice that clang build fails due to missing break; at the
end of some (empty) cases in switch() statements.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Never remove the TPM state on outgoing migration if the storage setup
has shared storage for the TPM state files. Also, do not do the security
cleanup on outgoing migration if shared storage is detected.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When using shared storage there is no need to apply security labels on the
storage since the files have to have been labeled already on the source
side and we must assume that the source and destination side have been
setup to use the same uid and gid for running swtpm as well as share the
same security labels. Whether the security labels can be used at all
depends on the shared storage and whether and how it supports them.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Pass the --migration option to swtpm if swptm supports it (starting
with v0.8) and if the TPM's state is written on shared storage. If this
is the case apply the 'release-lock-outgoing' parameter with this
option and apply the 'incoming' parameter for incoming migration so that
swtpm releases the file lock on the source side when the state is migrated
and locks the file on the destination side when the state is received.
If a started swtpm instance is running with the necessary options of
migrating with share storage then remember this with a flag in the
virDomainTPMPrivateDef.
Report an error if swtpm does not support the --migration option and an
incoming migration across shared storage is requested.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Add support for storing private TPM-related data. The first private data
will be related to the capability of the started swtpm indicating whether
it is capable of migration with a shared storage setup since that requires
support for certain command line flags that were only becoming available
in v0.8.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Do not create storage if the TPM state files are on shared storage and
there's an incoming migration since in this case the storage directory
must already exist. Also do not run swtpm_setup in this case.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
New qemuTPMHasSharedStorage() function is introduced which
returns whether the swtpm state directory is on a shared
filesystem (e.g. NFS).
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that nothing uses this capability, it can be retired.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All supported QEMUs have this capability. Stop detecting it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduced in QEMU's commit of v2.7.0-rc0~32^2~5 the .write-cache
attribute of virtio-blk dvice is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.
The change in some .args is justified, because the qemuxml2argvdatatest
runs these test caseses with very minimalistic set of capabilities,
that's nowhere near real life scenario.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that nothing uses this capability, it can be retired.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All supported QEMUs have this capability. Stop detecting it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduced in QEMU's commit of v2.9.0-rc0~48^2~25 the .share-rw
attribute of virtio-blk device is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.
The change in controller-order.args is justified, because the
qemuxml2argvdatatest runs the test case with very minimalistic
set of capabilities, that's nowhere near real life scenario.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that nothing uses this capability, it can be retired.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All supported QEMUs have this capability. Stop detecting it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduced in QEMU's commit of v2.7.0-rc0~83^2 the .num-queues
attribute of virtio-blk device is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that nothing uses this capability, it can be retired.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All supported QEMUs have this capability. Stop detecting it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduced in QEMU's commit of v0.13.0-rc0~1072 the
.logical_block_size attribute of virtio-blk device is always
available for all QEMU versions we support (4.2.0, currently).
Therefore, we can assume the capability is always set and thus
doesn't need to be checked for.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that nothing uses this capability, it can be retired.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All supported QEMUs have this capability. Stop detecting it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduced in QEMU's commit of v4.2.0-rc0~23^2~4 the .failover
attribute of virtio-net device is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that nothing uses this capability, it can be retired.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All supported QEMUs have this capability. Stop detecting it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduced in QEMU's commit of v2.9.0-rc0~162^2~10 the .host_mtu
attribute of virtio-net device is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that nothing uses this capability, it can be retired.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All supported QEMUs have this capability. Stop detecting it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduced in QEMU's commit of v2.10.0-rc0~95^2~20 the
.tx_queue_size attribute of virtio-net device is always available
for all QEMU versions we support (4.2.0, currently). Therefore,
we can assume the capability is always set and thus doesn't need
to be checked for.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>