Some secdrivers (typically SELinux driver) generate unique
dynamic seclabel for each domain (unless a static one is
requested in domain XML). This is achieved by calling
qemuSecurityGenLabel() from qemuProcessPrepareDomain() which
allocates unique seclabel and stores it in domain def->seclabels.
The counterpart is qemuSecurityReleaseLabel() which releases the
label and removes it from def->seclabels. Problem is, that with
current code the qemuProcessStop() may still want to use the
seclabel after it was released, e.g. when it wants to restore the
label of a disk mirror.
What is happening now, is that in qemuProcessStop() the
qemuSecurityReleaseLabel() is called, which removes the SELinux
seclabel from def->seclabels, yada yada yada and eventually
qemuSecurityRestoreImageLabel() is called. This bubbles down to
virSecuritySELinuxRestoreImageLabelSingle() which find no SELinux
seclabel (using virDomainDefGetSecurityLabelDef()) and this
returns early doing nothing.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1751664
Fixes: 8fa0374c5b
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
All these headers are indirectly included provided by virfile.h having
virstoragefile.h which will be removed in the following patch.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Currently, swtpm TPM state file is removed when a transient domain is
powered off or undefined. When we store TPM state on a shared storage
such as NFS and use transient domain, TPM states should be kept as it is.
Add per-TPM emulator option `persistent_sate` for keeping TPM state.
This option only works for the emulator type backend and looks as follows:
<tpm model='tpm-tis'>
<backend type='emulator' persistent_state='yes'/>
</tpm>
Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The kernel refuses to set guest TSC frequency less than a minimum
frequency or greater than maximum frequency (both computed based on the
host TSC frequency). When writing the libvirt code with a reversed logic
(return success when the requested frequency falls within the tolerance
interval) I forgot to include the boundaries.
Fixes: d8e5b45600https://bugzilla.redhat.com/show_bug.cgi?id=1839095
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Refactor in 0316c28a45 used incorrect source variable to initialize
the variable which holds the name of the bitmap which needs to be
deleted after the backup job finishes. This resulted into deleting the
source bitmap of the backup rather than the temporary one.
Use 'dd->incrementalBitmap' which holds the temporary bitmap name
instead of 'dd->backupdisk->incremental' which holds the name of the
source bitmap which is used by the backup.
Fixes: 0316c28a45
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1908647
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When starting a VM with an empty cdrom which has <iotune> configured the
startup fails as qemu is not happy about setting tuning for an empty
drive:
error: internal error: unable to execute 'block_set_io_throttle', unexpected error: 'Device has no medium'
Resolve this by skipping the setting of throttling for empty drives and
updating the throttling when new medium is inserted into the drive.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/111
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Currently, we configure QEMU to prealloc memory almost by
default. Well, by default for NVDIMMs, hugepages and if user
asked us to (via memoryBacking <allocation mode="immediate"/>).
However, when guest's NVDIMM is backed by real life NVDIMM this
approach is not the best. In this case users should put <pmem/>
into the <memory/> device <source/>, like this:
<memory model='nvdimm' access='shared'>
<source>
<path>/dev/pmem0</path>
<pmem/>
</source>
</memory>
Instructing QEMU to do prealloc in this case means that each
page of the NVDIMM is "touched" (the first byte is read and
written back - see QEMU commit v2.9.0-rc1~26^2) which cripples
device wear.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1894053
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
While previously we returned 0 this is not correct. We have to
return a negative value to indicate error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Even though we are getting driver capabilities with
refresh=false (so that it is not expensive), we still should do
ACL check first because there is no point in bothering with the
capabilities if caller doesn't have permissions to call the API.
Also, this way the comment makes more sense.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The code we have there to copy seclabel model or doi can be
replaced by virStrcpy() calls which do exactly the same checks.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In recent patches new mambers to _qemuAgentDiskAddress struct
were introduced to keep optional CCW address sent by the guest
agent. These two members are a struct to store CCW address into
and a boolean to keep track whether the CCW address is valid.
Well, we can hold the same information with a pointer - instead
of storing the CCW address structure let's keep just a pointer to
it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
On s390x, devices are accessed via the channel subsystem by default,
so we need to look up the devices via their CCW address there instead
of using PCI.
This fixes "virsh domfsinfo" on s390x for virtio-block devices (the first
attempt from commit f8333b3b0a did it in the wrong way, reporting the
device name on the guest side instead of the target name on the host side).
Fixes: f8333b3b0a ("qemu: Fix domfsinfo for non-PCI device information ...")
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1858771
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Newer versions of the QEMU guest agent will provide the CCW address
of devices on s390x. Store this information in the qemuAgentDiskInfo
so that we can use this later.
We also map the CSSID 0 from the guest to the value 0xfe on the host,
see https://www.qemu.org/docs/master/system/s390x/css.html for details.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The lower level function virNetDevGenerateName() now understands that
a blank ifname should be replaced with a generated name based on a
template that it knows about itself - there is no need for the higher
level functions to stuff a template name ("vnet%d") into ifname.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It must be used when migration URI uses `unix:` transport because otherwise we
cannot just guess where to connect for disk migration.
https://bugzilla.redhat.com/show_bug.cgi?id=1638889
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Simplify ReserveName/GenerateName for macvlan and macvtap by using
common functions.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Simplify GenerateName/ReserveName for netdevtap by using common
functions.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Extract ReserveName/GenerateName from netdevtap and netdevmacvlan as
common helper functions.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
In v6.8.0-27-g88957116c9 and friends I've switched the way the
default RAM is specified for QEMU (from plain -m to
memory-backend-*). This means, that even if a guest doesn't have
any NUMA nodes configured we can use memory-backend-* attributes
to translate user config requests. For instance, we can allow
memory to be shared (<access mode='shared'/> under
<memoryBacking/>). But what my original commits are missing is
allowing such configuration in our validator.
Fixes: 88957116c9
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1839034#c12
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The next objective is to move virDomainDeviceDefValidate() to
domain_validate.c. First let's move all the static helpers.
The net device validation functions are used across multiple
drivers, so let's move them separately first.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Move virDomainDeviceDefValidate() and all its helper functions to
domain_validate.c.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
virDomainDefValidateAliases() is one of the static functions that
needs to be handled before moving virDomainDefValidateInternal().
Let's move all related validate functions to domain_validate.c
at the same time.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Now that virPCIDeviceIsPCIExpress() checks the length of the file when
the process lacks sufficient privilege to read the entire PCI config
file in sysfs, we can remove the open-coding for that case from its
consumer.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The one instance of a virPCIDevice in
qemuDomainDeviceCalculatePCIConnectFlags() needs to be converted to
use g_autoptr as a prerequisite for a bugfix. It's in this patch by
itself (rather than in a patch converting all virPCIDevice usages to
g_autoptr) to simplify any backport of said bugfix.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Although the function currently only returns errors for PCI addresses,
check it here too, in case that changes in the future.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We decided to not do metadata-less checkpoints and checking whether the
metadata is consistent is done once the data is actually needed. Remove
the comment.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The alignment step is not really necessary once we've done it already
since we fully populate the definition. In case of checkpoints it was a
relic necessary for populating the 'idx' to match checkpoint disk to
definition disk, but that was already removed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Commit 926563dc3a which refactored the function call deleting the
snapshot's on disk state introduced a logic bug, which skips over the
deletion of libvirt metadata after the disk state deletion is done.
To fix it we must not return early.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/109
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
We already calculated the guest area, which is what is subject
to minimum size requirements, a few lines earlier.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The validation callback always fetched a fresh copy of 'qemuCaps' to use
for validation which is wrong in cases when the VM is already running,
such as device hotplug. The newly-fetched qemuCaps may contain flags
which weren't originally in use when starting the VM e.g. on a libvirtd
upgrade.
Since the post-parse/validation machinery has a per-run 'parseOpaque'
field filled with qemuCaps of the actual process we can reuse the caps
in cases when we get them.
The code still fetches a fresh copy if parseOpaque doesn't have a
per-run copy to preserve existing functionality.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The validation callbacks always fetch latest qemuCaps so it won't ever
be NULL. Remove the tautological conditions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virDomainDefPostParse infrastructure has apart from the global opaque
data also per-run data, but this was not duplicated into the validation
callbacks.
This is important when drivers want to use correct run-state for the
validation.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Commit f5e8715a8b added logic which adds some fake job info when qemu
didn't return anything but in such case the job type would not be set.
Since we already have the proper job type recorded in qemuBlockJobDataPtr
which the caller fetched, we can use this it and also remove the lookup
from the disk which was necessary prior to the conversion to
qemuBlockJobDataPtr.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Nodename may be asociated to a disk backup job, add support to looking
up in that chain too. This is specifically useful for the
BLOCK_WRITE_THRESHOLD event which can be registered for any nodename.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Nodename may be asociated to a disk backup job, add support to looking
up in that chain too. This is specifically useful for the
BLOCK_WRITE_THRESHOLD event which can be registered for any nodename.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It's a technical detail in qemu that QCOW2 is needed for a pull-mode
backup.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virStorageFileChainLookup' reports an error when the lookup of the
backing chain entry is unsuccessful. Since we possibly use it multiple
times when looking up backing for 'disk->mirror' the function can report
error which won't be actually reported.
Replace the call to virStorageFileChainLookup by lookup in the chain by
index.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use dummy variable to fill 'src' so that access to it doesn't need to be
conditionalized and use temporary variable for 'disk' rather than
dereferencing the array multiple times.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In v6.0.0-rc1~439 (and friends) we tried to cache NUMA
capabilities because we assumed they are immutable. And to some
extent they are (NUMA hotplug is not a thing, is it). However,
our capabilities contain also some runtime info that can change,
e.g. hugepages pool allocation sizes or total amount of memory
per node (host side memory hotplug might change the value).
Because of the caching we might not be reporting the correct
runtime info in 'virsh capabilities'.
The NUMA caps are used in three places:
1) 'virsh capabilities'
2) domain startup, when parsing numad reply
3) parsing domain private data XML
In cases 2) and 3) we need NUMA caps to construct list of
physical CPUs that belong to NUMA nodes from numad reply. And
while this may seem static, it's not really because of possible
CPU hotplug on physical host.
There are two possible approaches:
1) build a validation mechanism that would invalidate the
cached NUMA caps, or
2) drop the caching and construct NUMA caps from scratch on
each use.
In this commit, the latter approach is implemented, because it's
easier.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1819058
Fixes: 1a1d848694
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
If the job has finished, but we didn't yet process the completion fake
that it's still incomplete so that apps which decided to poll
qemuDomainGetBlockJobInfo rather than use events can be sure that the
XML update was completed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Replace qemuMonitorGetBlockJobInfo by qemuMonitorGetAllBlockJobInfo and
hash table lookup. This basically open-codes qemuMonitorGetBlockJobInfo,
but it will be removed in next patch.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
In recent commit of bf8bd93df0 (and friends) we switched the way
we process queried command line arguments: from string lists to
virJSONValue stored in a hash table. To achieve this
qemuMonitorJSONGetCommandLineOptions() helper was introduced
which executes the "query-command-line-options" monitor command
and then calls virJSONValueArrayForeachSteal() to process the
output. The array process function is also given
qemuMonitorJSONGetCommandLineOptionsWorker() as the callback
which is called over each item of the returned array. This
callback then steals "parameters" attribute of each array iteam
storing it in the hash table, but it leaves behind "option"
attribute (because it's g_strdup()-ed). After all of this, the
callback returns 0 which is a signal to the array processing
function that the callback took ownership of the array item. But
this is not true. While it removed "parameters" it did not take
the rest ("option" for instance). And therefore, it leads to a
memory leak:
5,347 (1,656 direct, 3,691 indirect) bytes in 69 blocks are definitely lost in loss record 2,752 of 2,794
at 0x483BEC5: calloc (vg_replace_malloc.c:760)
by 0x4E25A10: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6400.5)
by 0x4943317: virJSONValueNewObject (virjson.c:569)
by 0x4945692: virJSONParserHandleStartMap (virjson.c:1768)
by 0x5825A86: yajl_do_parse (in /usr/lib64/libyajl.so.2.1.0)
by 0x4945BFA: virJSONValueFromString (virjson.c:1896)
by 0xAF5C115: qemuMonitorJSONIOProcessLine (qemu_monitor_json.c:224)
by 0xAF5C45E: qemuMonitorJSONIOProcess (qemu_monitor_json.c:279)
by 0xAF4BB6C: qemuMonitorIOProcess (qemu_monitor.c:342)
by 0xAF4C444: qemuMonitorIO (qemu_monitor.c:574)
by 0x4FEF846: socket_source_dispatch (in /usr/lib64/libgio-2.0.so.0.6400.5)
by 0x4E1F727: g_main_context_dispatch (in /usr/lib64/libglib-2.0.so.0.6400.5)
The callback must return 1 so that the array item is properly
freed.
Fixes: ebeff6cd57
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Since the function is now only used in qemu_domain.c, move it from
domain_conf.c and rename it.
This reverts the work done in commit ace5931553
(conf, qemu: move qemuDomainNVDimmAlignSizePseries to domain_conf.c).
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>