If the daemon crashes or is restarted while the snapshot delete is in
progress we have to handle it gracefully to not leave any block jobs
active.
For now we will simply abort the snapshot delete operation so user can
start it again. We need to refuse deleting external snapshots if there
is already another active job as we would have to figure out which jobs
we can abort.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When daemon is restarted and libvirt tries to recover domain jobs we
need to know if the snapshot job was a snapshot delete in order to
safely abort running QEMU block jobs.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When deleting external snapshots the operation may fail at any point
which could lead to situation that some disks finished the block commit
operation but for some disks it failed and the libvirt job ends.
In order to make sure that the qcow2 images are in consistent state
introduce new element "<snapshotDeleteInProgress/>" that will mark the
disk in snapshot metadata as invalid until the snapshot delete is
completed successfully.
This will prevent deleting snapshot with the invalid disk and in future
reverting to snapshot with the invalid disk.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
With external snapshots we need to modify the metadata bit more then
what is required for internal snapshots. Mainly the storage source
location changes with every external snapshot.
This means that if we delete non-leaf snapshot we need to update all
children snapshots and modify the disk sources for all affected disks.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When deleting snapshot we are starting block-commit job over all disks
that are part of the snapshot.
This operation may fail as it writes data changes to the backing qcow2
image so we need to wait for all the disks to finish the operation and
wait for correct signal from QEMU. If deleting active snapshot we will
get `ready` signal and for inactive snapshots we need to disable
autofinalize in order to get `pending` signal.
At this point if commit for any disk fails for some reason and we abort
the VM is still in consistent state and user can fix the reason why the
deletion failed.
After that we do `pivot` or `finalize` if it's active snapshot or not to
finish the block job. It still may fail but there is nothing else we can
do about it.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In order to save some CPU cycles we will collect all the necessary data
to delete external snapshot before we even start. They will be later
used by code that deletes the snapshots and updates metadata when
needed.
With external snapshots we need data that libvirt gets from running QEMU
process so if the VM is not running we need to start paused QEMU process
for the snapshot deletion and kill at afterwards.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Deleting external snapshots will require to run it as async domain job,
the same way as we do for snapshot creation.
For internal snapshots modify the job mask in order to forbid any other
job to be started.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Deleting internal snapshot when the currently active disk image is
different than where the internal snapshot was taken doesn't work
correctly.
This applies to a running VM only as we are using QMP command and
talking to the QEMU process that is using different disk.
This works correctly when the VM is shut of as in this case we spawn
qemu-img process to delete the snapshot.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Prepare the validation function for external snapshot delete support.
There is one exception when deleting `children-only` snapshots. If the
snapshot tree is like this example:
snap1 (external)
|
+- snap2 (internal)
|
+- snap3 (internal)
|
+- snap4 (internal)
and user calls `snapshot-delete snap1 --children-only` the current
snapshot is external but all the children snapshots are internal only
and we are able to delete it.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Extract the code deleting external snapshot metadata to separate
function.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Previously the reparent happened before the actual snapshot deletion.
This change moves the code closer to the rest of the code handling
snapshot metadata when deletion happens. This makes the metadate
deletion happen after the data files are deleted.
Following patch will extract it into separate function
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This simplifies the code a bit by reusing existing parts that deletes
a single snapshot.
The drawback of this change is that we will now call the re-parent bits
to keep the metadata in sync for every child even though it will get
deleted as well.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Extract code that deletes children of specific snapshot to separate
function.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Extract code that deletes single snapshot to separate function.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Move code around to make it clear what is called when deleting single
snapshot or children snapshots.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Looks up disk storage source within storage source chain using storage
source object instead of path to make it work with all disk types.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
QEMU emits this signal when the job finished its work and is about to be
finalized. If the job is started with autofinalize disabled the job
waits for user input to finalize the job.
This will be used by snapshot delete code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The created job will be needed by external snapshot delete code so
rework qemuBlockCommit to return that pointer.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
External snapshots will use this to synchronize qemu block jobs.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Deleting external snapshots will require configuring autofinalize to
synchronize the block jobs for disks withing single snapshot in order to
be able safely abort of one of the jobs fails.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Upcoming snapshot deletion code will require that multiple commit jobs
are finished in sync. To allow aborting then if one fails we will need
to use manual finalization of the jobs.
This commit implements the monitor code for `job-finalize`.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This will allow to use it while having async domain job active which we
will use when deleting external snapshots.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This will allow to use it while having async domain job active which we
will use when deleting external snapshots. At the same time we will need
to have the block job started as synchronous.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Move the code for finishing a job in the ready state to qemu_block.c.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Up until commit 629282d884, using mode=restrictive caused
virNumaSetupMemoryPolicy() to be called from qemuProcessHook(),
and that in turn resulted in virNumaNodesetIsAvailable() being
called and the nodeset being validated.
After that change, the only validation for the nodeset is the one
happening in qemuBuildMemoryBackendProps(), which is skipped when
using mode=restrictive.
Make sure virNumaNodesetIsAvailable() is called whenever a
nodeset has been provided by the user, regardless of the mode.
https://bugzilla.redhat.com/show_bug.cgi?id=2156289
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When post-copy migration fails, the domain stays running on the
destination with a VIR_DOMAIN_RUNNING_POSTCOPY_FAILED reason. Both the
state and the reason can later be rewritten in case the domain gets
paused for other reasons (such as an I/O error). Thus we need a separate
place to remember the post-copy migration failed to be able to resume
the migration.
https://bugzilla.redhat.com/show_bug.cgi?id=2111948
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The parameter was only used to select which states correspond to an
active or failed post-copy migration. But these states are either
applicable to both operations or the check would just paper over a code
bug in case of an impossible combination of state and operation. By
dropping the check we can make the code simpler and also reuse existing
virDomainObjIsFailedPostcopy function and only check for active
post-copy states.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Convert to a switch instead of a bunch of 'if (type == ...).
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The function currently didn't have a return value. Returning the
'virLockGuard' struct allows the callers to use automatic unlocking of
the mutex.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Require check of return value of the ACL checking functions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Now that all code was refactored to use the new version we can remove
the old code.
For now the new close callbacks code has no error messages so
syntax-check forced me to remove the POTFILES entry for
virclosecallbacks.c
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The qemu driver uses connection close callbacks in more places requiring
more changes than other drivers, but luckily the changes are very
straightforward. The migration code was written in a way ensuring that
there's just one callback present so this can be preserved directly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The rewrite is straightforward as bhyve registers only the
'bhyveProcessAutoDestroy' callback which by design doesn't need any
special handling (there's just one caller which can start the VM thus
implicitly there's only one possible registration for that function).
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The rewrite is straightforward as LXC registers only the
'lxcProcessAutoDestroy' callback which by design doesn't need any
special handling (there's just one caller which can start the VM thus
implicitly there's only one possible registration for that function).
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The new APIs store the list of callbacks for a VM inside the
virDomainObj and also allow registering multiple callbacks for a single
domain and also for multiple connections.
For now this code is dormant until each driver using the old APIs is not
refactored to use the new APIs.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The new connect close callbacks for domains will be represented by a
virObject associated with the domain object itself.
To simplify handling the pointer to the close callback data will be done
by an immutable pointer allocated directly when allocating the
corresponding virDomainObj struct.
This patch adds the 'closecallbacks' field to virDomainObj and a
corresponding callback to allocate it into virDomainXMLOption.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The function can't fail so there's no point in returning anything.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Introduce a helper which will return a list of all domain objects inside
of the list without filtering and thus without the need to lock
individual members.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Use the new style which doesn't require re-aligning the argument list
once you change the return type.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Remove extraneous spaces and put comment on a single line.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
'virObjectNew' can't return NULL. If we pre-check the arguments we don't
need a cleanup label.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Coverity scan reports:
"A time_t value is stored in an integer with too few bits to accommodate
it. The expression timeout is cast to unsigned int"
We are already casting and storing time_t timeout variable into unsigned int.
We can use time_t for timeout and cast it to unsigned long (should be big enough)
instead of unsigned int in sscanf, g_strdup_printf as required.
Signed-off-by: Shaleen Bathla <shaleen.bathla@oracle.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
1.clear passwd in debug log
2.alignment
3.use the same variable name for function definition and declaration
Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
These while loops exit directly due to break after entering.
Use if instead of these while loops.
Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Fix a misspelling in the documation of 'daemonCreateClientStream'.
Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In a recent commit I've introduced an umount() call. But the
function where the call lives is compiled on all OSes, not just
Linux. But umount() is Linux specific. Other OSes have unmount
(FreeBSD), or maybe something else. But since namespaces are
Linux specific, we can wrap the call in #ifdef __linux__ and not
care about other OSes.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When calling virConnectGetDomainCapabilities() (exposed as virsh
domcapabilities) users have option to specify whatever sub-set of
{ emulatorbin, arch, machine, virttype } they want. Then we have
a logic (hidden in virQEMUCapsCacheLookupDefault()) that picks
qemuCaps that satisfy values passed by user. And whatever was not
specified is then set to the default value as specified by picked
qemuCaps. For instance: if no machine type was provided but
emulatorbin was, then the machine type is set to the default one
as defined by the emulatorbin.
Or, when just virttype was set then the remaining three values
are set to their respective defaults. Except, we have a crasher
in this case:
# virsh domcapabilities --virttype hvf
error: Disconnected from qemu:///system due to end of file
error: failed to get emulator capabilities
error: End of file while reading data: Input/output error
This is because for 'hvf' virttype (at least my) QEMU does not
have any machine type. Therefore, @machine is set to NULL and the
rest of the code does not expect that.
What we can do about this is to validate all arguments. Well,
except for the emulatorbin which is obtained from passed
qemuCaps. This also fixes the issue when domcapabilities for a
virttype of a different driver are requested, or a different
arch.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When deciding whether to bind mount a path in domain's namespace,
we look at the QEMU mount table (/proc/$pid/mounts) and try to
match prefix of given path with one of mount points. Well, we
do that in a bit clumsy way. For instance, if there's
"/dev/hugepages" already mounted inside the namespace and we are
deciding whether to bind mount "/dev/hugepages1G/..." we decide
to skip over the path and NOT bind mount it. This is because
plain STRPREFIX() is used and yes, the former is prefix of the
latter. What we need to check also is whether the next character
after the prefix is slash.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Our code relies on mount events propagating into the namespace we
create for a domain. However, there's one caveat. In v8.8.0-rc1~8
I've tried to make us detect differences in mount tables between
the namespace in which libvirtd runs and the domain namespace.
This is crucial for any mounts that happen after the domain was
started (for instance new hugetlbfs can be mounted on say
/dev/hugepages1G).
Therefore, we take a look into /proc/$(pgrep qemu)/mounts to see
what filesystems are mounted under /dev. Now, since we don't
umount the original /dev, just mount a tmpfs over it, we get all
the events (e.g. aforementioned hugetlbfs mount on
/dev/hugepages1G), but we are not really able to access it
because of the tmpfs that's placed on top. This then confuses our
algorithm for detecting which filesystems are mounted (the
algorithm is implemented in qemuDomainGetPreservedMounts()).
To break the link between host's and guest's /dev we just need to
umount() the original /dev in the namespace. Just before our
artificially created tmpfs is moved into its place.
Fixes: 46b03819ae
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2151869#c6
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Inside of qemuCaps (for the corresponding accelerator) we have
full host CPU expansion stored, among with supported Hyper-V
Enlightenments. To report them in the domain capabilities, we
just have to pick those starting with "hv-" and see if we know
them.
You may notice that neither of our domaincapsdata test shows any
enlightenment. This is because the test works by parsing
corresponding qemucapabilitiesdata/caps_*.xml file and none of
these store the full host CPU expansion (hostCPU.fullQEMU)
because that is runtime piece of information and not formatted
into virQEMUCaps XML.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1717611
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now that we have qemuMonitorGetCPUModelExpansion() aware of
Hyper-V Enlightenments, we can start querying it. Two conditions
need to be met:
1) KVM is in use,
2) Arch is either x86 or arm.
It may look like modifying the first call to
qemuMonitorGetCPUModelExpansion() inside of
virQEMUCapsProbeQMPHostCPU() would be sufficient but it is not.
We really need to ask QEMU for full expansion and the first call
does not guarantee that.
For the test data, I've just copied whatever
'query-cpu-model-expansion' returned earlier, therefore there are
no hv-* props. But that's okay - the full expansion is not stored
in cache (and thus not formatted in
tests/qemucapabilitiesdata/caps_*.replies files either). This is
purely runtime thing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This continues and finishes propagation of the @hv_passthrough
argument started in the previous commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Apart from setting @migratable prop to the
query-cpu-model-expansion command, we will need @hv-passthrough
so that we can query for expansion of Hyper-V Enlightenments
supported on the current host. The idea is to run:
{
"execute": "query-cpu-model-expansion",
"arguments": {
"type": "full",
"model": {
"name": "host",
"props": {
"hv-passthrough": true
}
}
}
}
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The virDomainCapsEnumFormat() function does not return anything
but zero and none of its callers is interested in the failure
anyways. Switch its return type from integer to void.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We are formatting <enum/> element and its children using
virBufferAddLit(), virBufferAsprintf(), virBufferAdjustIndent(),
etc. Well, we can avoid that when switching to
virXMLFormatElement().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In a recent commit, when ditching virXPathULong() the parsing of
<selfvers/> was changed. But it was changed to virXMLPropUInt()
which is not correct because the value we're interested in is not
in an attribute but element itself.
Fixes: a3c7426839
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Since we really only need to handle key skipping in the top level object
the caller doesn't at this point even pass it to the array formatting
helper function. Remove the unused argument.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Skipping of a specific key is needed only for the top level object to
specially handle the object type. We must not pass it to any recursed
printing of nested objects as skipping keys there might be surprising
and also is unhandlable later when formatting the commandline.
Until now this did not pose a problem but was discovered when adding a
new netdev backend which has a nested config object which also has the
'type' key which was being skipped.
Modern usage will prefer JSON directly but fix the commandline generator
to prevent surprises.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The @hash variable inside of virQEMUCapsProbeQMPHostCPU() is used
only within a block, but declared at the beginning of the
function. Bring the variable declaration into the said block.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
There's nothing qemu specific about
qemuDomainCapsFeatureFormatSimple() and in fact, the function
lives in hypervisor agnostic location and thus mustn't have qemu
prefix.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
After previous cleanup this function is no longer used and thus
can be dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When starting swtpm binary, the qemuSecurityStartTPMEmulator() is
called which sets seclabel on the TPM state and then uses
qemuSecurityCommandRun() to execute the swtpm binary with proper
seclabel. Well, the aim is to ditch
qemuSecurityStartTPMEmulator() because it entangles two distinct
operations. Just call functions for them separately.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
If swtpm binary fails to start after successful exec() (e.g. it
fails to initialize itself), the seclabels set in
qemuSecurityStartTPMEmulator() are not restored. This is due to
lacking qemuSecurityRestoreTPMLabels() call in the error path.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now that we have qemuSecurityRestoreTPMLabels() we might as well
have qemuSecuritySetTPMLabels(). The aim here is to remove
qemuSecurityStartTPMEmulator() which couples two separate things
into a single function call.
Therefore, introduce qemuSecuritySetTPMLabels() which does only
set seclabels on the TPM state.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The qemuSecurityCleanupTPMEmulator() function calls
virSecurityManagerRestoreTPMLabels() and thus the proper name is
qemuSecurityRestoreTPMLabels(). Rename it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Currently, qemuSecurityCleanupTPMEmulator() returns nothing which
means a caller (well, there's only one - qemuExtTPMStop()) can't
produce a warning when restoring seclabels on TPM state failed.
True, qemuSecurityCleanupTPMEmulator() does report a warning
itself, but only in one specific error path.
Make the function return an integer, just like the rest of
qemuSecurity*Restore() functions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
qemu is about to deprecate the '-no-hpet' option in favor of configuring
the timer via '-machine'.
Use the QEMU_CAPS_MACHINE_HPET capability to switch to the new syntax
and mask out the old QEMU_CAPS_NO_HPET capability at the same time to
prevent using the old syntax.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
The capability represents that qemu accepts the configuration of the
HPET timer via -machine hpet=on/off rather than the
soon-to-be-deprecated '-no-hpet' option.
The capability is detected from 'query-command-line-options' which
recently added the 'hpet' option.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
That way it actually fits with what the condition checks for.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Our secret driver divides secrets into two groups: ephemeral
(stored only in memory) and persistent (stored on disk). Now, the
aim of ephemeral secrets is to define them shortly before being
used and then undefine them. But 'shortly before being used' is a
very vague time frame. And since we default to socket activation
and thus pass '--timeout 120' to every daemon it may happen that
just defined ephemeral secret is gone among with the virtsecretd.
This is no problem for persistent secrets as their definition
(and value) is restored when the virtsecretd starts again, but
ephemeral secrets can't be restored.
Therefore, we could view ephemeral secrets as active objects that
the daemon manages and thus inhibit automatic shutdown (just like
hypervisor daemons do when a guest is running).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Xen 4.17 has strict parsing of 'soundhw' option that allows only
specific values (instead of passing through any value directly to
qemu's -soundhw option, it uses -device now). For 'intel-hda' audio
device, it requires "hda" string. "hda" works with older libxl too.
Other supported models are the same as in libvirt XML.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Xen supports only subset of libvirt's sound devices, and starting with
Xen 4.17 it is enforced by libxl. Verify it early.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Automatically free 'sec' and remove the 'cleanup' section and 'ret'
variables.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virStorageBackendRBDRADOSConfSet' logs its arguments but it's also
used to set the RBD secret/key.
All the security theatre with securely erasing the string we do to fetch
the secret would be quite pointless if we log it thus introduce
virStorageBackendRBDRADOSConfSetQuiet and use it to avoid logging the
password.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The initialization vector is not optional thus we also don't need to
check whether the caller passed it in. Additionally we can use c99
initializers for the gnutls_datum_t structs.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'gnutls_datum_t' simply holds pointers to the encryption key and its
length. There's absolutely no point in securely erasing that.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduce a new backend type 'external' for connecting to a swtpm daemon
not managed by libvirtd.
Mostly in one commit, thanks to -Wswitch and the way we generate
capabilities.
https://bugzilla.redhat.com/show_bug.cgi?id=2063723
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In the past, the preferred policy
(VIR_DOMAIN_NUMATUNE_MEM_PREFERRED) required exactly one (host)
NUMA node. This made sense because:
1) the libnuma API - numa_set_preferred() allowed exactly one
node, because
2) corresponding kernel syscall (__NR_set_mempolicy) accepted
exactly one node (for MPOL_PREFERRED mode).
But things have changed since then. Firstly, kernel introduced
new MPOL_PREFERRED_MANY mode (v5.15-rc1~107^2~21) which was then
exposed in libnuma as numa_set_preferred_many() (v2.0.15~24).
Fortunately, libnuma also exposes numa_has_preferred_many() which
returns whether the kernel has support for the new mode (1) or
not (0).
Putting this all together, we can lift our check for sufficiently
new kernel and libnuma.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2151064
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Although the qemuMigrationSrcPerformResume actually got called
indirectly via qemuMigrationSrcPerformNative and the recovery process
worked, wrong job phases were used for the "perform" phase, which could
cause issues when libvirt daemon crashed (or was otherwise restarted)
during post-copy recovery.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
It will need to be called from a place above its current definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When qemuDomainObjReleaseAsyncJob is called when the current async job
is already released we emit quite useless warning which was implemented
to warn about releasing a job owned by another thread.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The function is called even if QEMU reports migration as
postcopy-paused, i.e., it's not migrating anymore. And while changing
the warning, we can drop the part about unattended migration to make the
warning shorter and consistent with qemuMigrationSrcPostcopyFailed.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
There are some cases when the internal state of disks can change
without qemu sending events about it (e.g. a disk can close
during reset). In case this happens, we should emit an event
about the modified disk.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1824722#c20
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
While only a couple of the message types include sensitive data,
the overhead of calling secure erase is not noticable enough
to worry about making the erasure selective per type. Thus it is
simplest to unconditionally securely erase the buffer.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The buffer length refers to the allocated buffer memory size,
while the offset refers to have much of the buffer we have
read/written. After reading the message payload we must thus
update the latter.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This is available on at least FreeBSD and GLibc >= 2.25.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The BPF_CGROUP_DEVICE constant was introduced to Linux in
commit ebc614f687369f9df99828572b1d85a7c2de3d92
Author: Roman Gushchin <roman.gushchin@linux.dev>
Date: Sun Nov 5 08:15:32 2017 -0500
bpf, cgroup: implement eBPF-based device controller for cgroup v2
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The BPF_PROG_QUERY constant was introduced to Linux in
commit defd9c476fa6b01b4eb5450452bfd202138decb7
Author: Alexei Starovoitov <ast@kernel.org>
Date: Mon Oct 2 22:50:26 2017 -0700
libbpf: sync bpf.h
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The VHOST_VSOCK_SET_GUEST_CID constant was introduced to Linux in
commit 433fc58e6bf2c8bd97e57153ed28e64fd78207b8
Author: Asias He <asias@redhat.com>
Date: Thu Jul 28 15:36:34 2016 +0100
VSOCK: Introduce vhost_vsock.ko
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The linux/magic.h header has existed since
commit e18fa700c9a31360bc8f193aa543b7ef7b39a06b
Author: Jeff Garzik <jeff@garzik.org>
Date: Sun Sep 24 11:13:19 2006 -0400
Move several *_SUPER_MAGIC symbols to include/linux/magic.h.
This is old enough that all our supported platforms can be assumed
to have this header.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The DEVLINK_CMD_ESWITCH_GET constant was introduced to Linux in
commit adf200f31c000d707e4afe238ed1d1199e0cce7c
Author: Jiri Pirko <jiri@mellanox.com>
Date: Thu Feb 9 15:54:33 2017 +0100
devlink: fix the name of eswitch commands
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The GET_VLAN_VID_CMD constant has existed since before Linux moved
to git.
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The ETHTOOL_GCOALESCE constant has existed since before Linux moved
to git.
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The ETHTOOL_GFEATURES constant was introduced to Linux in
commit 5455c6998d34dc983a8693500e4dffefc3682dc5
Author: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Date: Tue Feb 15 16:59:17 2011 +0000
net: Introduce new feature setting ops
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The ETH_FLAG_RXHASH constant was introduced to Linux in
commit b00fabb4020d17bda4bea59507e09fadf573088d
Author: stephen hemminger <shemminger@vyatta.com>
Date: Mon Mar 29 14:47:27 2010 +0000
netdev: ethtool RXHASH flag
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The ETH_FLAG_NTUPLE constant was introduced to Linux in
commit 15682bc488d4af8c9bb998844a94281025e0a333
Author: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Date: Wed Feb 10 20:03:05 2010 -0800
ethtool: Introduce n-tuple filter programming support
This is old enough that all our supported platforms can be assumed
to have this feature.
A typo in the existing condition "NTUBLE" instead of "NTUPLE" meant the
code was never enabled in the first place, which is an illustration of
why it is worth eliminating redundant conditional checks.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The ETH_FLAG_LRO constant was introduced to Linux in
commit 3ae7c0b2e3747b50c3a6c63ebb67469e0a6b3203
Author: Jeff Garzik <jeff@garzik.org>
Date: Wed Aug 15 16:00:51 2007 -0700
[ETHTOOL]: Add ETHTOOL_[GS]FLAGS sub-ioctls
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The ETHTOOL_GFLAGS constant was introduced to Linux in
commit 3ae7c0b2e3747b50c3a6c63ebb67469e0a6b3203
Author: Jeff Garzik <jeff@garzik.org>
Date: Wed Aug 15 16:00:51 2007 -0700
[ETHTOOL]: Add ETHTOOL_[GS]FLAGS sub-ioctls
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The ETHTOOL_GGRO constant was introduced to Linux in
commit b240a0e5644eb817c4a397098a40e1ad42a615bc
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon Dec 15 23:44:31 2008 -0800
ethtool: Add GGRO and SGRO ops
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The ETHTOOL_GGSO constant was introduced to Linux in
commit 37c3185a02d4b85fbe134bf5204535405dd2c957
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu Jun 22 03:07:29 2006 -0700
[NET]: Added GSO toggle
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
ethtool is a Linux specific feature that has existed since before Linux
moved to git. Checking against SIOCETHTOOL + WITH_STRUCT_IFREQ is
overkill for our needs.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The LO_FLAGS_AUTOCLEAR constant was introduced to Linux in
commit 96c5865559cee0f9cbc5173f3c949f6ce3525581
Author: David Woodhouse <dwmw2@infradead.org>
Date: Wed Feb 6 01:36:27 2008 -0800
Allow auto-destruction of loop devices
This is old enough that all our supported platforms can be assumed
to have this feature. For added fun this whole meson check was
semantically insane because EPOLL_CLOEXEC is not a valid arg
to unshare().
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The LOOP_CTL_GET_FREE constant was introduced to Linux in
commit 770fe30a46a12b6fb6b63fbe1737654d28e84844
Author: Kay Sievers <kay.sievers@vrfy.org>
Date: Sun Jul 31 22:08:04 2011 +0200
loop: add management interface for on-demand device allocation
This is old enough that all our supported platforms can be assumed
to have this feature. As a plus point, this meson check is going
to start failing with future GCC. It fails to set _GNU_SOURCE, thus
'unshare' is not defined by the header, and its relying on an
implicit function decl. For added fun this whole meson check was
semantically insane because LOOP_CTL_GET_FREE is not a valid arg
to unshare().
Fixes https://fedoraproject.org/wiki/Toolchain/PortingToModernC
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When starting a guest with <interface/> which has the target
device name set (i.e. not generated by us), it may happen that
the TAP device already exists. This then may lead to all sorts of
problems. For instance: for <interface type='network'/> the TAP
device is plugged into the network's bridge, but since the TAP
device is persistent it remains plugged there even after the
guest is shut off. We don't have a code that unplugs TAP devices
from the bridge because TAP devices we create are transient, i.e.
are removed automatically when QEMU closes their FD.
The only exception is <interface type='ethernet'/> with <target
managed='no'/> where we specifically want to let users use
pre-created TAP device and basically not touch it at all.
There's another reason for denying to use a pre-created TAP
devices: if we ever have bug in TAP name generation, we may
re-use a TAP device from another domain.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2144738
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
A caller might be interested in the case when @ifname was already
set and it wasn't a template. In such case the
virNetDevGenerateName() does not touch the @ifname at all and
returns 0 to indicate success. Make it return 1 to distinguish
this case from the other case, in which a new name was generated.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Historically, QEMU took screenshots in PPM. While this might use
to be popular format, as of v7.1.0-rc0~125^2~6 it is possible to
take screenshots in PNG. This is more popular and renders almost
everywhere, which is not the case for PPM (for instance, modern
browsers do not render it).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'screendump' command has new argument 'format'. Let's expose
this on our QMP level so that callers can specify the format, if
they wish so.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For some reason, only @file argument is printed into debug logs.
The rest of arguments was left out. Include all arguments.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In its v7.1.0-rc0~125^2~6 commit, QEMU gained support for taking
screenshots in PNG format. Track this capability.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Internal domain state needs to be refreshed after reset from the guest
side because it may be inconsistent with the internal qemu state.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Internal domain state may change during the reset and qemu does
not always send events about it. In case it happens, internal
state of the domain in libvirt would be inconsistent with the
internal state in qemu which could cause additional problems
(e.g. cdrom tray state can change from open to closed). The
solution is to refresh state after a successful reset to query
qemu about the current internal domain state.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1824722
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Paths for external devices (well, so far only vTPM) are not
stored in the status XML. Therefore, we need to regenerate them
after we've been restarted and reconnecting to a running domain.
Otherwise these will remain NULL which may later lead to a NULL
dereference.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2150760
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function is going to be called outside of qemu_extdevice.c.
Expose it to the rest of the driver.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The path generation phase belongs conceptually into domain
preparation phase and not host preparation. Move
qemuExtDevicesInitPaths() call from qemuExtDevicesPrepareHost()
into qemuExtDevicesPrepareDomain().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The domain startup process is split into multiple phases. One of
them is preparing the domain (at that point live) XML, private
data, various paths, etc - see qemuProcessPrepareDomain(); the
other prepares the host - see qemuProcessPrepareHost(). It's
obvious that the domain XML preparation function must be called
before the host preparation function (e.g. the host preparation
might try to create a file which path is generated in the domain
preparation phase). Nevertheless, let's document this
expectation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function is now unused. Remove it to dissuade anybody from trying to
use it in the future.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virDomainDeviceDefCopy' formats the definition and parses it back.
Since we already are parsing the XML here, we're better off parsing it
twice and save the formatting step.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virDomainDeviceDefCopy' formats the definition and parses it back.
Since we already are parsing the XML here, we're better off parsing it
twice and save the formatting step.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Remove the 'cleanup' label and 'ret' variable as we can now directly
return form all cases.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virDomainDeviceDefCopy' formats the definition and parses it back.
Since we already are parsing the XML here, we're better off parsing it
twice and save the formatting step.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virDomainDeviceDefCopy' formats the definition and parses it back.
Since we already are parsing the XML here, we're better off parsing it
twice and save the formatting step.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use of qemuDomainValidateVcpuInfo in the helpers for hotplug and unplug
of vCPUs can lead to spurious errors reported such as:
internal error: qemu didn't report thread id for vcpu 'XX'"
The reason for this is that qemuDomainValidateVcpuInfo validates the
state of all vCPUs against the expected state of vCPUs. If an unplug
operation completed before libvirt was unable to process it yet the
expected state could not reflect the current state.
To avoid spurious errors the qemuDomainHotplugAddVcpu and
qemuDomainRemoveVcpu functions are modified to do localized validation
only for the vCPUs they actually modify.
We also now ensure that the cgroups are modified before bailing out on
error for any vCPUs which passed validation.
Additionally in order for qemuDomainRemoveVcpuAlias to be able to find
the unplugged vCPU we must ensure that qemuDomainRefreshVcpuInfo does
not clear out the alias in case when the vCPU is no longer reported by
qemu.
Co-authored-by: Partha Satapathy <partha.satapathy@oracle.com>
Signed-off-by: Shaleen Bathla <shaleen.bathla@oracle.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Recently, the QEMU driver gained support for migration with TPM
state on a shared volume (e.g. NFS). As a part of that, the
destination side avoids setting seclabels on it to avoid cutting
off the source while it is still using it. Makes sense, except
for a wee bit: the secdriver API does a bit more - it also sets
label on the swtpm log file. And this one definitely needs to be
labeled (it lives under /var/log/swtpm/libvirt/qemu/..., i.e. not
on a shared volume).
Previously, qemuSecurityStartTPMEmulator() took care of that. But
during rework to shared volume migration, the code was changed so
now plain qemuSecurityCommandRun() would be run (i.e. no
relabelling).
But after previous commits, we can now chose whether the TPM
state should be relabelled or just the log file.
Fixes: 2e669ec789
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2130192#c7
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is basically just a continuation of the previous commit.
Now that the security driver APIs have a boolean flag that
controls setting/restoring seclabel of either both TPM state and
log files, or just the log file, propagate this boolean into
those APIs that start/stop swtpm emulator. For now, just pass
true. The juicy bits are soon to come.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virSecurityDomainSetTPMLabels() and
virSecurityDomainRestoreTPMLabels() APIs set/restore label on two
files/directories:
1) the TPM state (tpm->data.emulator.storagepath), and
2) the TPM log file (tpm->data.emulator.logfile).
Soon there will be a need to set the label on the log file but
not on the state. Therefore, extend these APIs for a boolean flag
that when set does both, but when unset does only 2).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduce a simple helper fetching a sub-element node by name. This is
meant as a simple replacement for either open-coded versions of this or
use of XPath for this trivial lookup.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use virJSONValueObjectGetArray + virJSONValueArrayToStringList instead
so that the ofvirJSONValueObjectGetStringArray wrapper can be removed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In two instances (qemuMonitorJSONGetStringListProperty,
qemuMonitorJSONGetStringArray) the return value is checked by
qemuMonitorJSONCheckReply and extracted by
virJSONValueObjectGetStringArray.
We can use qemuMonitorJSONGetReply which returns it directly and then
virJSONValueArrayToStringList to convert it without the additional
lookup.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Using 'virJSONValueObjectHasKey' when we want to access the value
afterwards is wasteful. Fetch the JSON value right away.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Rather than checking that the object has the correct key and then
fetching it again use fetch the array first and then use
virJSONValueArrayToStringList to directly convert it.
Additionally we can avoid the conversion if there are no members
simplifying the surrounding logic.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The 'dependencies' field in the return data may be missing in some
cases. Historically 'virJSONValueObjectGetStringArray' didn't report
error in such case, but later refactor (commit 043b50b948 ) added
an error in order to use it in other places too.
Unfortunately this results in the error log being spammed with an
irrelevant error in case when qemuAgentGetDisks is invoked on a VM
running windows.
Replace the use of virJSONValueObjectGetStringArray by fetching the
array first and calling virJSONValueArrayToStringList only when we have
an array.
Fixes: 043b50b948
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2149752
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Introduce virJSONValueArrayToStringList which does only the conversion
from an array to a stringlist.
This will allow refactoring the callers to be more careful in case when
they want to handle the existance of the member in the parent object
differently.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use qemuMonitorJSONGetReply and unify the two blocks with the same
condition.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use qemuMonitorJSONGetReply in cases where qemuMonitorJSONCheckReply
is followed by virJSONValueObjectGet*(reply, "return").
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Replace usage of the following pattern with the new helper:
if (qemuMonitorJSONCheckReply(cmd, reply, VIR_JSON_TYPE_ARRAY) < 0)
return -1;
data = virJSONValueObjectGetArray(reply, "return");
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Replace usage of the following pattern with the new helper:
if (qemuMonitorJSONCheckReply(cmd, reply, VIR_JSON_TYPE_OBJECT) < 0)
return -1;
data = virJSONValueObjectGetObject(reply, "return");
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Rather than simply checking that the 'return' field is of the expected
type we can directly return it as the caller is very likely going to use
it. Extract the code into the new function and add a wrapper to preserve
old functionality.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Fix the type for few internal functions. Externally the APIs were
already limiting 'flags' to 'unsigned int'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Don't continue with the historical mistake and fix all internal
functions to use a sane type for flags.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Historically our migration APIs declare 'unsigned long flags'. Since
it's baked into our API we can't change that but we can avoid
compatibility problems by preemptively refusing the extra range on
certain arches to prevent future surprise.
Modify the macro to verify that value passed inside 'flags' doesn't
exceed the range of 'unsigned int'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virCommandSetSendBuffer() function consumes passed @buffer,
but takes it only as plain pointer. Switch to a double pointer to
make this obvious. This allows us then to drop all
g_steal_pointer() in callers.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Back in v1.0.3-rc1~235 when I was adding virCommandDoAsyncIO(),
the main event loop was used to poll() on the pipe to the child
process. But this was promptly changed to a separate thread
handling I/O in v1.0.3-rc1~127. However, the corresponding
comment to virCommandDoAsyncIO() still documents the original
state.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
When virCommandSetSendBuffer() is used over a virCommand that is
(or will be) daemonized, then the command must have
VIR_EXEC_ASYNC_IO flag set no later than at virCommandRunAsync()
phase so that the thread that's doing IO is spawned and thus
buffers can be sent to the process.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Error message reports that the guest has '0' NUMA nodes
configured when trying to attach a memory device to a guest with
no NUMA nodes. This may be a little misleading because '0' can
also be node's id. A more friendly way is to directly report
that the guest has no NUMA nodes.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2142519
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
QEMU capabilities is the only thing we use from priv so we can just pass
that directly.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When an internal API takes a vm pointer, it's usually just after the
driver argument.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The vm object is used inside qemuMigrationCookieParse based on the flags
passed to qemuMigrationCookieParse and the content of the cookie. The
callers should not just blindly guess and pass NULL if they
(incorrectly) think the vm object is not needed. We should always pass
the vm object unless it does not exist yet.
This fixes a bug when statistics of a completed migration reported
"Unknown" operation instead of "Incoming migration" on the destination
host.
https://bugzilla.redhat.com/show_bug.cgi?id=2137298
Fixes: v8.7.0-79-g0150f7a8c1
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The virNodeDeviceGetPCIVPDDynamicCap() function is called from
virNodeDeviceGetPCIDynamicCaps() and therefore has to be a wee
bit more clever about adding VPD capability. Namely, it has to
remove the old one before adding a new one. This is how other
functions called from virNodeDeviceGetPCIDynamicCaps() behave
as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2143235
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
As of [1]. libselinux changed the type of context_str() - it now
returns a const string. Follow this change in our code base.
1: dd98fa3227
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Inside of qemuTPMEmulatorBuildCommand() there are two calls to
qemuTPMSetupEncryption() which simply ignore returned error. This
is suboptimal because then we rely on swtpm binary reporting a
generic error (something among invalid command line arguments)
while an error reported by qemuTPMSetupEncryption() is more
specific.
However, since virCommandSetSendBuffer() only sets an error
inside of virCommand structure (the error is then reported in
virCommandRun()), we need to exempt its retval from error
checking. Thus, the signature of qemuTPMSetupEncryption() is
changed a bit so that -1/0 can be returned to indicate error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
In commit c43718ef67 I've added a disclaimer that the new stats which
are fetched from qemu and passed directly to the user are not guaranteed
by libvirt. I didn't notice that per-vcpu hypervisor specific stats are
also snuck into the VIR_DOMAIN_STATS_VCPU group along with other
pre-existing stats we do guarantee.
Extend the disclaimer for VIR_DOMAIN_STATS_VCPU too.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Guests are allowed to change their MAC addresses. Subsequently,
we may respond to that with tweaking that part of host side
configuration that depends on it. In this particular case: QoS.
Some parts of QoS are in fact set on corresponding bridge, where
overall view on traffic can be seen. Here, TC filters are used to
place incoming packets into qdiscs. These filters match source
MAC address. Therefore, upon guest changing its MAC address, the
corresponding TC filter needs to be updated too. This is done by
simply removing the old one and instantiating a new one, with new
MAC address.
Now, u32 filters (which we use) use a hash table for matching,
internally. And when deleting the old filter, we used to remove
the hash table (ID = 800::) and let the new filter instantiate
new hash table. This used to work, until kernel release 4.20
(specifically commit v4.20-rc1~27^2~131^2~11 and its friends)
where this practice was turned into error.
But that's okay - we can delete the specific filter we are after
and not touch the hash table at all.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
When setting up QoS for a domain <interface/>, or when reporting
its statistics we may need to swap TX/RX values. This is all
explained in comment to virDomainNetTypeSharesHostView().
However, this function claims that VIR_DOMAIN_NET_TYPE_ETHERNET
also shares the 'host view', meaning the TX/RX values must be
swapped. But that's not true.
An easy reproducer is to start a domain with two <interface/>-s:
one type of network, the other of type ethernet and configure the
same <bandwidth/> for both. Reversed setting can then be observed
(e.g. via tc).
Reported-by: Oleg Vasilev <oleg.vasilev@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Our RPC calls can be divided into two groups: regular and high
priority. The latter can be then processed by so called high
priority worker threads. This is our way of defeating a
'deadlock' and allowing some RPCs to be processed even when all
(regular) worker threads are stuck. For instance: if all regular
worker threads get stuck when talking to QEMU on monitor, the
virDomainDestroy() can be processed by a high priority worker
thread(s) and thus unstuck those threads.
Now, this is all fine, except if users want to use virsh
non interactively:
virsh destroy $dom
This does a bit more - it needs to open a connection. And that
consists of multiple RPC calls: AUTH_LIST,
CONNECT_SUPPORTS_FEATURE, CONNECT_OPEN, and finally
CONNECT_REGISTER_CLOSE_CALLBACK. All of them are marked as high
priority except the last one. Therefore, virsh just sits there
with a partially open connection.
There's one requirement for high priority calls though: they can
not get stuck. Hopefully, the reason is obvious by now. And
looking into the server side implementation the
CONNECT_REGISTER_CLOSE_CALLBACK processing can't ever get stuck.
The only driver that implements the callback for public API is
Parallels (vz). And that can't block really.
And for virConnectUnregisterCloseCallback() it's the same story.
Therefore, both can be marked as high priority.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2143840
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Use same style in the 'struct option' as:
struct option opt[] = {
{ a, b },
{ a, b },
...
{ a, b },
};
Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
For the handling of usb we already allow plenty of read access,
but so far /sys/bus/usb/devices only needed read access to the directory
to enumerate the symlinks in there that point to the actual entries via
relative links to ../../../devices/.
But in more recent systemd with updated libraries a program might do
getattr calls on those symlinks. And while symlinks in apparmor usually
do not matter, as it is the effective target of an access that has to be
allowed, here the getattr calls are on the links themselves.
On USB hostdev usage that causes a set of denials like:
apparmor="DENIED" operation="getattr" class="file"
name="/sys/bus/usb/devices/usb1" comm="qemu-system-x86"
requested_mask="r" denied_mask="r" ...
It is safe to read the links, therefore add a rule to allow it to
the block of rules that covers the usb related access.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reviewed-by: Michal Privoznik <mprivozn at redhat.com>
When there is no vIOMMU, vfio devices don't need to lock the entire guest
memory per-device, but they still need to lock the entire guest memory to
share between all vfio devices. This memory accounting is not shared
with vDPA devices, so it should be added to the memlock limit separately.
Commit 8d5704e2 added support for multiple vfio/vdpa devices but
calculated the limits incorrectly when there were both vdpa and vfio
devices and no vIOMMU. In this case, the memory lock limit was not
increased separately for the vfio devices.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2143838
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
When post-copy migration is running in Finish phase we already did
everything needed and we're just waiting for all the memory to transfer
to the destination. The domain is already running on there at this
point. Once all data is transferred (QEMU sends a MIGRATION completed
event) we're done. So in this specific post-copy case the source does
not need to care about the result of the Finish call as long as QEMU
says migration completed. The Finish call to the destination daemon may
fail for reasons that do not affect QEMU, e.g., libvirt daemon was
restarted there or the libvirt connection broke.
Currently we just mark the post-copy migration as failed on the source
and keep the domain paused there. But when libvirt daemon is restarted
at this point, it will detect migration finished successfully and kill
the domain as migrated. It make sense to do this even without having to
restart the daemon.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/338
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We need the restored job even in case the migration already finished
even though we will stop it just a few lines below as the functions we
call in between require an existing migration job.
This fixes a crash on reconnect when post-copy migration finished while
the daemon was not running.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
These format are left unchanged when convert 'unsigned long' to
'unsigned long long', which caused compile warning.
Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
If the running firewalld doesn't support getPolicies() then we fallback
to the "libvirt" zone. Throwing an error log is excessive since we
gracefully fallback.
Avoids these logs:
error : virGDBusCallMethod:242 : error from service: \
GDBus.Error:org.freedesktop.DBus.Error.UnknownMethod
Fixes: ab56f84976 ("util: add virFirewallDGetPolicies()")
Signed-off-by: Eric Garver <eric@garver.life>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Register virDomainMemoryDefFree() to do the cleanup.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
This is new allocator for virDomainMemoryDef struct which also
sets some default values: @model and @targetNode.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
The idea here is that virVMXConfigScanResultsCollector() sets the
networks_max_index to the highest ethernet index seen. Well, the
struct member is signed int, we parse just seen index into uint
and then typecast to compare the two. This is not necessary,
because the maximum number of NICs a vSphere domain can have is
(<drumrolll/>): ten [1]. This will fit into signed int easily
anywhere.
1: https://configmax.esp.vmware.com/guest?vmwareproduct=vSphere&release=vSphere%208.0&categories=1-0
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
Now that we have STRCASESKIP() there's no need to open code it.
Convert virVMXConfigScanResultsCollector() so that it uses this
new macro.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
There is so far one case where STRCASEPREFIX(a, b) && a +
strlen(b) combo is used (in virVMXConfigScanResultsCollector()),
but there will be more. Do what we do usually: introduce a macro.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
When generating memory for main guest memory memory-backend-*
might be used. This means, we may need to generate thread-context
objects too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When generating memory for memory devices memory-backend-* might
be used. This means, we may need to generate thread-context
objects too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When generating memory for guest NUMA memory-backend-* might be
used. This means, we may need to generate thread-context objects
too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>