The configuration of a defined mdev can be modified after the mdev is
started. The defined configuration and the active configuration can
therefore run out of sync. Handle this by storing the modifiable data
which is the mdev type and attributes in two separate active and
defined configurations. mdevctl supports with callout scripts to do an
attribute retrieval of started mdevs which is already implemented in
libvirt.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Refactor attribute handling code into methods for easier reuse.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Create a new structure holding type and attributes as these are
modifiable in a persistent mdev configuration and run out of sync with
the active mdev configuration.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Similar to when actual data is being written to the stream, it is
necessary to acknowledge handling of the client request when a hole is
encountered. This is done later in daemonStreamHandleWrite by sending a
fake zero-length reply if the status variable is set to
VIR_STREAM_CONTINUE. It seems that setting status from the message
header was missed for holes in the introduction of the sparse stream
feature.
Signed-off-by: Vincent Vanlaer <libvirt-e6954efa@volkihar.be>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
As of v9.8.0-rc1~7 we check whether two <memory/> devices don't
overlap (since we allow setting where a <memory/> device should
be mapped to). We do this pretty straightforward, by comparing
start and end address of each <memory/> device combination.
But since only the start address is given (an exposed in the
XML), the end address is computed trivially as:
start + mem->size * 1024
And for majority of memory device types this works. Except for
NVDIMMs. For them the <memory/> device consists of two separate
regions: 1) actual memory device, and 2) label.
Label is where NVDIMM stores some additional information like
namespaces partition and so on. But it's not mapped into the
guest the same way as actual memory device. In fact, mem->size is
a sum of both actual memory device and label sizes. And to make
things a bit worse, both sizes are subject to alignment (either
the alignsize value specified in XML, or system page size if not
specified in XML).
Therefore, to get the size of actual memory device we need to
take mem->size and substract label size rounded up to alignment.
If we don't do this we report there's an overlap between two
NVDIMMs even when in reality there's none.
Fixes: 3fd64fb0e2
Fixes: 91f9a9fb4f
Resolves: https://issues.redhat.com/browse/RHEL-4452?focusedId=23805174#comment-23805174
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
VIR_CLOSE() sets errno on failure so it's better to use
virReportSystemError() than plain virReportError() as the former
reports errno value too.
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The current implementation sets the guest-sync timeout to the
smaller value between the default value (QEMU_AGENT_WAIT_TIME)
and agent->timeout, without considering the timeout passed
via the qga command.
This patch enhances the guest-sync timeout logic to use the
minimum value among the default value, agent->timeout, and
the timeout passed via the qga command.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/590
Signed-off-by: ray <honglei.wang@smartx.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Move the assumption from the code pre-creating the storage to
qemuMigrationDstPrepareStorage where it's checked for other cases.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Migrating into a 'directory' won't ever work as we ask qemu to emulate a
fat filesystem, so restoring of the files won't be possible. Same for
'vhost-user' disks which don't support blockjobs as there's no block
backend used in qemu.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Check the existance of storage per-type rather than trying to come up
with a common "path".
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have a switch statement, the code adding the 'slice' for
block devices of non-equal sizes can be moved to appropriate location.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Automatically free helper variables, remove the 'cleanup' label and
use virBufferCurrentContent() to take the XML from the buffer rather
than extracting it to a separate variable.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Allow storage migration of VDPA devices by properly checking that they
exist on the destionation. Pre-creation is not supported but if the
device exists the migration should be able to succeed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Decrease the likelyhood that addition of a new storage type will be
forgotten.
This patch also unifies the type check to consult the 'actual' type of
the storage in both cases as the NVMe check looked for the XML declared
type while virStorageSourceIsLocalStorage() looks for the
actual/translated type.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Previously, the script would only detect differences between
libvirt's and qemu's list of x86 features, adding those features
to libvirt was a manual and error prone procedure.
Replace with a script that can generate libvirt's feature list
directly from qemu source code.
Usage: sync_qemu_features_i386.py [--output OUTPUT] [qemu]
If not specified otherwise, "output" defaults to x86_features.xml
in the same directory as sync_qemu_features_i386.py. If a checkout
of the qemu source code resides next to the libvirt directory, it
will be found automatically and need not be specified.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Use "0x%08x" as format for all values:
sed \
-e "s/'0x\(..\)'/'0x000000\\1'/g" \
-e "s/'0x\(...\)'/'0x00000\\1'/g"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
commit v9.10.0-129-g8b93d78c83 (first appearing in libvirt-10.0.0) was
supposed to allow forcing a PCI hostdev to be bound to a particular
driver by adding <driver model='blah'/> to the XML for the
device. Unfortunately, a single line was missed during the final
changes to the patch prior to pushing, and the result was that the
driver model could be set to *anything* and it would be accepted but
just ignored.
This patch adds the missing line, which will set the stubDriverName
field of the virPCIDevice object from the hostdev object as the
virPCIDevice is being created. This ends up being used by
virPCIDeviceBindToStub() as the driver that it binds the device to.
Fixes: 8b93d78c83
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This option controls whether the sysctl config for enabling unprivileged
userfaultfd will be installed.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
/dev/userfaultfd device is preferred over userfaultfd syscall for
post-copy migrations. Unless qemu driver is configured to disable mount
namespace or to forbid access to /dev/userfaultfd in cgroup_device_acl,
we will copy it to the limited /dev filesystem QEMU will have access to
and label it appropriately. So in the default configuration post-copy
migration will be allowed even without enabling
vm.unprivileged_userfaultfd sysctl.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Our virSecret XML is still parsed and formatted using old way
(e.g. virXPathString() + virXXXTypeFromString() combo, or
formatting elements using plain virBufferAsprintf() instead of
virXMLFormatElement()). Modernize the code as it'll make it
easier for future expansion.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Convert the field and adjust the XML parsers to use
virXMLPropEnum().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virSecretDefParseUsage() function is called conditionally.
Call it unconditionally and keep pointer to the <usage/> node as
it'll come handy soon.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When adding vtpm virSecret usage type (in v5.6.0-rc1~61) we
forgot to update polkit access check. This limited user's ability
to match secrets in their rules. Add missing case into switch in
virAccessDriverPolkitCheckSecret().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Previously we were only starting or stopping nbdkit when the guest was
started or stopped or when hotplugging/unplugging a disk. But when doing
block operations, the disk backing store sources can also be be added or
removed independently of the disk device. When this happens the nbdkit
backend was not being handled properly. For example, when doing a
blockcopy from a nbdkit-backed disk to a new disk and pivoting to that
new location, the nbdkit process did not get cleaned up properly. Add
some functionality to qemuDomainStorageSourceAccessModify() to handle
this scenario.
Since we're now starting nbdkit from the ChainAccessAllow/Revoke()
functions, we no longer need to explicitly start nbdkit in hotplug code
paths because the hotplug functions already call these allow/revoke
functions and will start/stop nbdkit if necessary.
Add a check to qemuNbdkitProcessStart() to report an error if we
are trying to start nbdkit for a disk source that already has a running
nbdkit process. This shouldn't happen, and if it does it indicates an
error in another part of our code.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When starting nbdkit processes for the backing store of a disk, we were
returning an error if any backing store failed, but we were not cleaning
up processes that succeeded higher in the chain. Make sure that if we
return a failure status from qemuNbdkitStartStorageSource() that we roll
back any processes that had been started.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This will allow us to start or stop nbdkit for just a single disk source
or for every source in the backing chain. This will be used in following
patches.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This patch adds a new attribute "register" to the <domain> element. If
set to "yes", the DNS server created for the virtual network is
registered with systemd-resolved as a name server for the associated
domain. The names known to the dnsmasq process serving DNS and DHCP
requests for the virtual network will then be resolvable from the host
by appending the domain name to them.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When checking for machined we do not really care whether systemd itself
is running, we just need machined to be either running or socket
activated by systemd. That is, exactly the same we do for logind.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
After previous cleanups, qemuMonitorIOWriteWithFD() is but a thin wrapper
over virSocketSendMsgWithFDs(). Replace the body of the former
with a call to the latter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
After previous cleanups, virSocketSendFD() is but a thin wrapper
over virSocketSendMsgWithFDs(). Replace the body of the former
with a call to the latter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Instead of using strlen() to calculate length of payload we're
sending, let caller specify the size: they may want to send just
a portion of a buffer (even though the only current user
doesn't).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Currently, virSocketSendMsgWithFDs() reports two errors:
1) if CMSG_FIRSTHDR() fails,
2) if sendmsg() fails.
Well, the latter sets an errno, so caller can just use
virReportSystemError(). And the former - it is very unlikely to
fail because memory for whole control message was allocated just
a few lines above.
The motivation is to unify behavior of virSocketSendMsgWithFDs()
and virSocketSendFD() because the latter is just a subset of the
former (will be addressed later).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'raw' driver without any special configuration is not needed and
creates overhead in qemu.
Stop using the 'raw' format driver in cases when it's not needed. A
special case when it is needed is for FD passed images with only a
single writable FD passed, where we need an overlay driver to properly
reflect the 'read-only' flag.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Store whether qemu supports the appropriate option for block-stream and
block-commit commands and always use it if available.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The capability is asserted when both block-stream and block-commit QMP
commands support the 'backing-mask-protocol' argument.
The argument causes qemu to record 'raw' as the backing file format in
case when a protocol node is used directly. This is needed to preserve
compatibility of images after a block-commit or block-pull libvirt
operation with older libvirt versions in case when we'll want to remove
the unneded 'raw' format drivers from the block graph.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Inside of virsocket.c there is an include of poll.h and
PKT_TIMEOUT_MS macro definition. Neither of these is really
needed and in fact it's a leftover after I reworked one of
previously merged commits during review.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
enable VIR_DOMAIN_NET_TYPE_ETHERNET network support for ch guests.
Tested with following interface config:
<interface type='ethernet'>
<target dev='chtap0' managed="yes"/>
<model type='virtio'/>
<driver queues='2'/>
<interface>
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This capability checks if ch can receive multiple fds along with net-add
api. This capability is required to enable multiple queues for
domain/guest interfaces.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virSocketSendMsgWithFDs method send fds along with payload using
SCM_RIGHTS. virSocketRecv method polls, receives and sends the response
to callers.
These methods are required to add network suppport in ch driver.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Move domain interface management methods from qemu to hypervisor. This
refactoring allows the domain management methods to be shared between CH and
qemu drivers.
This commit does not introduce any functional changes.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Drop unused parameter from virDomainNetReleaseActualDevice method.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The helper was used only in 'qemucapabilitiesnumbering' test which was
removed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Instead of open-coding a partial version of it.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
The qemuDomainGetSCSIControllerModel() function, which is
responsible for choosing a model for a SCSI controller that
didn't have one provided by the user, considers values >0 to
mean "model has been set".
Since MODEL_SCSI_AUTO == 0, this means that such a value is
considered the same as MODEL_SCSI_DEFAULT (-1). This makes
sense, as not specifying a model name or explicitly asking for
one to be automatically chosen intuitively should result in
the same behavior.
Specifically, there is no case in which a value of
MODEL_SCSI_AUTO or MODEL_SCSI_DEFAULT is encountered after the
initial controller creation: it is either replaced with an
actual model, or an error is raised.
Despite this, there are a few places in the QEMU driver where
we incorrectly treat these values as if they were actual
model names. To reduce confusion, make sure that no longer
happens.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Remove the wannabe error reporting via 'VIR_DEBUG/VIR_INFO' in favor of
proper errors.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The code abused 'VIR_INFO' as an attempt at error reporting. Rework the
code to return the usual 0/-1 and raise proper errors.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rewrite the conditions after exiting the parser so that they are easier
to understand. This partially decreases the granularity of "error"
messages as they are not strictly necessary albeit for debugging.
As it was already observed in this code the logic itself often does
something else than the comment claims, thus the code logic is
preserved.
Changes:
- any case when not all data was processed is aggregated together and
gets a common "error" message
- absence of 'checksum' field is checked separately
- helper variables are removed as they are no longer needed
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use a 'switch' statement instead of a bunch of if/elseif statements.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'fieldFormat' variable is guaranteed to have only the proper enum
values by virPCIVPDResourceGetFieldValueFormat.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Merge the pre-checks with the 'switch' statement which is operating on
the same values to simplify further refactoring.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Replace VIR_INFO being used as form of error reporting with proper
virReportError and the usual return values.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Each caller was checking that the function read as many bytes as it
expected. Move the check inside virPCIVPDReadVPDBytes and make it report
a proper error rather than just a combination of VIR_DEBUG inside the
function and a random VIR_INFO in the caller.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Until now 'virPCIDeviceGetVPD' couldn't reallistically raise an error,
but that will change. Handle the errors by either resetting it if we'd
be ignoring it or forward it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
- fix passing of 'errno' to 'virReportSystemError'
The 'open' syscall returns '-1' and sets 'errno' on failure. The code
passed '-fd' as 'errno' rather than errno itself, thus always reporting
EPERM.
- don't overwrite errors when closing FD
Use VIR_AUTOCLOSE to avoid overwriting the errors from virPCIVPDParse.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
A checker function should not raise VIR_INFO or VIR_WARN messages
especially if they contain information useful only for debugging.
Turn the message into a VIR_DEBUG with universal meaning.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function always succeeded and after the removal of programing error
checks doesn't even have a 'return false' case. Additionally one of the
tests in 'virpcivpdtest' tested that this function never failed on wrong
data. Embrace this logic and remove the return value and adjust logging
to VIR_DEBUG level to avoid spamming logs.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Don't overwrite already reported errors and improve parsing of
attributes.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The errors raised in virNodeDeviceCapVPDParseCustomFields were actually
ignored by continuing the parse rather than raised.
Rather than just replace 'continue' by 'return -1' this patch refactors
the whole parser to simplify it as well as report reasonable errors.
Parsing of individual fields is done without XPath and is extracted into
a common helper.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All callers satisfy these checks as they are just for programming
errors.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function are never called with NULL argument so the checks can be
removed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
None of the callers pass NULL, so the NULL check is pointless. Remove it
an remove the return value.
The function is exported only for use in 'virpcivpdtest' thus marking
the arguments as NONNULL is unnecessary.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use VIR_DEBUG instead of VIR_INFO as that's more appropriate and report
relevant information for debugging.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virXPathNodeSet' returns -1 only when 'ctxt' or 'xpath' are NULL or
when the 'xpath' string is invalid. Both are programming errors. It
doesn't make sense for the code to overwrite the error message for
anything supposedly more relevant.
The majority of calls to 'virXPathNodeSet' already didn't do this, so
this patch fixes the rest to prevent it from spreading again.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The check in 'virPCIVPDResourceIsValidTextValue' allows any printable
characters, thus the XML schema should do the same.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Similarly to previous commit other specific fields which come from the
system data and aren't sanitized enough to be safe for XML were also
formatted via virBufferAsprintf.
Other static and safe strings used virBufferEscapeString instead of
virBufferAddLit.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The custom field data is taken from PCI device data which can contain
any printable characters, and thus must be escaped when putting into
XML.
Originally, based on the comment and XML schema which was fixed in
previous commits the idea seemed to be that the parser would validate
that only characters which don't break the XML would be present but that
didn't seem to materialize.
Switch to proper escaping of the XML.
Fixes: 3954378d06
Resolves: https://issues.redhat.com/browse/RHEL-22314
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function is never called with NULL argument. Remove the check and
refactor the rest including the debug statement.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function does not reject '&', '<', '>' contrary to what it actually
states. Move and adjust the comment.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There's nothing under the 'cleanup:' label thus the whole code can be
simplified.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Originally the migration code didn't register the NBD disk port with the
port allocator when it was manually specified. Later when commit
e74d627bb3 refactored the code and started registering it, the
old logic which was clearing 'priv->nbdPort' in case when it was manually
specified was not removed.
This caused following problems:
- the port was not released after successful migration
- the port was released even when it was not allocated on failures
regarding the NBD server start
- the port was not released on other failures of the migration after
NBD server startup
To address this we remove the assumption that 'priv->nbdPort' is used
only for auto-allocated port and fill it only once the port is
allocated and make the caller of qemuMigrationDstStartNBDServer
responsible for releasing it.
Fixes: e74d627bb3
Resolves: https://issues.redhat.com/browse/RHEL-21543
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Add a few debug statements to be able to trace lifetime of a
reserved/allocated port.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Locks in following text:
A: virNetServer
B: virNetServerClient
C: daemonClientPrivate
'virNetServerSetClientAuthenticated' locks A then B
'remoteDispatchAuthPolkit' calls 'virNetServerSetClientAuthenticated'
while holding C.
If a client closes its connection 'virNetServerProcessClients' with the
lock A and B locked will call 'virNetServerClientCloseLocked' which will
try to dispose of the 'client' private data by:
ref(b);
unlock(b);
remoteClientFreePrivateCallbacks();
lock(b);
unref(b);
Unfortunately remoteClientFreePrivateCallbacks() tries lock C.
Thus the locks are held in the following order:
polkit auth: C -> A
connection close: A -> C
causing a textbook-example deadlock. To resolve it we can simply drop
lock 'C' before calling 'virNetServerSetClientAuthenticated' as the lock
is not needed any more.
Resolves: https://issues.redhat.com/browse/RHEL-20337
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Commit 7a39b04d68 ("apparmor: Enable passt support") grants
passt(1) read-write access to /{,var/}run/libvirt/qemu/passt/* if
started by the libvirt daemon. That's the path where passt creates
PID and socket files only if the guest is started by the root user.
If the guest is started by another user, though, the path is more
commonly /var/run/user/$UID/libvirt/qemu/run/passt: add it as
read-write location. Otherwise, passt won't be able to start, as
reported by Andreas.
While at it, replace /{,var/}run/ in the existing rule by its
corresponding tunable variable, @{run}.
Fixes: 7a39b04d68 ("apparmor: Enable passt support")
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1061678
Reported-by: Andreas B. Mundt <andi@debian.org>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Historically creating offline external snapshot required disk-only flag
as well. Now when user requests new snapshot for offline VM and at least
one disk is specified to use external snapshot we will no longer require
disk-only flag as all other not specified disk will use external
snapshots as well.
Resolves: https://issues.redhat.com/browse/RHEL-22797
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Introduce new function qemuSnapshotCreateUseExternal() that will return
true if we will use external snapshots as default location.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The condition was completely wrong. As per the comment for function
virDomainMomentIsAncestor() it checks that the first argument is
descendant of the second argument.
Consider the following snapshot tree for VM:
s1
|
+- s2
| |
| +- s3
|
+- s4
|
+- s5 (current)
When deleting s2 with the original code we checked if
virDomainMomentIsAncestor(s2, s5) which would return false basically for
any snapshot as s5 is leaf snapshot so no children.
When deleting s2 with fixed code we check if
virDomainMomentIsAncestor(s5, s2) which still returns false but when
deleting s4 it will correctly return true.
Before this fix it fails with the following error:
error: Failed to delete snapshot s2
error: invalid argument: could not find base disk source in disk source chain
After the fix it fails with correct error:
error: Failed to delete snapshot s2
error: unsupported configuration: deletion of non-leaf external snapshot that is not in active chain is not supported
Resolves: https://issues.redhat.com/browse/RHEL-23212
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Rectify the condition to remove a domain only if it is not persistent.
Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It has nothing to do with assigning addresses, so it makes more
sense to have it in qemu_domain.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
qemuDomainGetSCSIControllerModel() can return -1 on failure,
but qemuDomainFindOrCreateSCSIDiskController() didn't implement
any handling for this scenario.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Group things together where it makes sense, avoid unnecessary
uses of 'else if', plus other tweaks.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The current defaults, that can be altered on a per-architecture
basis, are derived from the historical x86 behavior.
Every time support for a new architecture is added to libvirt,
care must be taken to override these default: if that doesn't
happen, guests will end up with additional hardware, which is
something that's generally undesirable.
Turn things around, and require architectures to explicitly
ask for the devices to be created by default instead. The
behavior for existing architectures is preserved.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
They reference functions that have since been renamed.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This is pretty straightforward.
Resolves: https://issues.redhat.com/browse/RHEL-15316
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI_DYNAMIC_MEMSLOTS reflects
whether QEMU is capable of .dynamic-memslots for virtio-mem.
Use it when validating domain configuration.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Starting from v8.2.0-rc0~74^2~2 QEMU has .dynamic-memslots
attribute for virtio-mem-pci device. Introduce a capability which
reflects that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Introduced in v8.2.0-rc0~74^2~2, QEMU now allows setting
.dynamic-memslots attribute for virtio-mem-pci devices. When
turned on, it allows memory exposed to guest to be split into
multiple memslots and thus smaller memory footprint (see the
original commit for detailed explanation).
Therefore, introduce new <target/> attribute which will control
that QEMU knob.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In v9.10.0-rc1~103 the remote driver was switched to g_auto() for
client RPC return parameters. But whilst doing so a small bug
slipped in: previously, when virDomainGetBlockIoTune() was called
with *nparams == 0, the function set *nparams to the number of
supported params and zero was returned (so that client can
allocate memory and call the API second time). IOW - the usual,
old style of APIs where we didn't want to allocate memory on
caller's behalf. But because of this bug, a negative one is
returned instead.
Fixes: 501825011c
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Encryption secrets are considered a format dependency, even
when being used by the storage node itself, as in the case of
using encryption engine=librbd.
Currently, the storage node is created (blockdev-add) before
creating the format dependencies (including encryption secrets).
As a result, when trying to perform a blockcopy when the target
disk uses librbd encryption, an error of this form is returned:
"error: internal error: unable to execute QEMU command 'blockdev-add': No secret with id 'libvirt-5-format-encryption-secret0'"
To overcome this error, we change the order of commands so that
format dependencies are created BEFORE creating the storage node.
Signed-off-by: Or Ozeri <oro@il.ibm.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
After v9.1.0-rc1~116 we track whether it's us who created a
macvtap or not. But when updating a vNIC its definition might be
replaced with a new one (though, ifname is not allowed to
change), e.g. to reflect new QoS, link state, etc.
Now, the fact whether we created macvtap for given vNIC is stored
in net->privateData->created. And replacing definition is done by
simply freeing the old definition and making the pointer point to
the new one. But this does not preserve the 'created' flag, which
in turn means when a domain is shutting off, the macvtap is not
removed (see loop inside of qemuProcessStop()).
Copy this flag into new definition and leave a note in
_qemuDomainNetworkPrivate struct.
Fixes: 61d1b9e659
Resolves: https://issues.redhat.com/browse/RHEL-22714
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Turns out, there are two ways to specify an empty CD-ROM drive in
a .vmx file:
1) .fileName = "emptyBackingString"
2) .fileName = ""
While we do parse 1) successfully, the code does not accept 2)
and an error is reported. Modify the code to treat both cases the
same.
Resolves: https://issues.redhat.com/browse/RHEL-19380
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
After guest is started, or we are reconnecting to already running
one (after daemon restart), qemuProcessRefreshRxFilters() is
called to refresh rx-filters (basically MAC addresses of guest
NICs) as they might have changed while we were not running (for
the case when reconnecting to an already running guest), or we
need to enable them by running a command (for freshly started
guest - see processNicRxFilterChangedEvent()).
Now, our XML parser allowed trustGuestRxFilters attribute for all
types and models of <interface/> while in reality, only virtio
model AND TUN/TAP based types can see MAC address changes. For
other combinations, QEMU reports an error.
This all means that when the daemon is restarted and it
reconnects to a guest with, well invalid configuration, or when
such guest is restored from a saved image, or migrated then we
issue the monitor command, to which QEMU replies with an error
which is then propagated to users:
error: internal error: unable to execute QEMU command 'query-rx-filter': invalid net client name: hostdev0
While on one hand users should fix their configuration (and after
v10.0.0-rc1~123 they can do that even on live domains), libvirt
can also has some logic built in that prevent issuing the command
in the first place (for obviously wrong cases).
Fixes: 060d4c83ef
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This should fix build failures when a daemon code is compiled before the
included *_protocol.h headers are ready, such as:
FAILED: src/virtqemud.p/remote_remote_daemon_config.c.o
../src/remote/remote_daemon_config.c: In function ‘daemonConfigNew’:
../src/remote/remote_daemon_config.c:111:30: error:
‘REMOTE_AUTH_POLKIT’ undeclared (first use in this function)
111 | data->auth_unix_rw = REMOTE_AUTH_POLKIT;
| ^~~~~~~~~~~~~~~~~~
../src/remote/remote_daemon_config.c:111:30: note: each undeclared
identifier is reported only once for each function it appears in
../src/remote/remote_daemon_config.c:115:30: error:
‘REMOTE_AUTH_NONE’ undeclared (first use in this function)
115 | data->auth_unix_rw = REMOTE_AUTH_NONE;
| ^~~~~~~~~~~~~~~~
../src/remote/remote_daemon_config.c: In function
‘daemonConfigLoadOptions’:
../src/remote/remote_daemon_config.c:252:31: error:
‘REMOTE_AUTH_POLKIT’ undeclared (first use in this function)
252 | if (data->auth_unix_rw == REMOTE_AUTH_POLKIT) {
| ^~~~~~~~~~~~~~~~~~
or
FAILED: src/virtqemud.p/remote_remote_daemon_dispatch.c.o
In file included from ../src/remote/remote_daemon.h:28,
from ../src/remote/remote_daemon_dispatch.c:26:
src/remote/lxc_protocol.h:13:5: error:
unknown type name ‘remote_nonnull_domain’
13 | remote_nonnull_domain dom;
| ^~~~~~~~~~~~~~~~~~~~~
In file included from ../src/remote/remote_daemon.h:29,
from ../src/remote/remote_daemon_dispatch.c:26:
src/remote/qemu_protocol.h:13:5: error:
unknown type name ‘remote_nonnull_domain’
13 | remote_nonnull_domain dom;
| ^~~~~~~~~~~~~~~~~~~~~
src/remote/qemu_protocol.h:14:5: error:
unknown type name ‘remote_nonnull_string’
14 | remote_nonnull_string cmd;
| ^~~~~~~~~~~~~~~~~~~~~
...
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When trying to start nbdkit-backed disks in backing chains, we were
accidentally always checking the private data of the top of the chain
instead of using the loop variable.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Since commit 4af3cbafdd the function always returns 0, so it is
possible to make this function void and remove return value checks.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
Current implementation of virDomainMemoryDefCheckConflict() does
only a one way comparison, i.e. if there's a memory device within
def->mems[] which address falls in [mem->address, mem->address +
mem->size] range (mem is basically an iterator within
def->mems[]). And for static XML this works just fine. Problem is
with hot/cold plugging of a memory device. Then mem points to
freshly parsed memory device and these half checks are
insufficient. Not only we must check whether an existing memory
device doesn't clash with freshly parsed memory device, but also
whether freshly parsed memory device does not fall into range of
already existing memory device.
Resolves: https://issues.redhat.com/browse/RHEL-4452
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
meson wraps python scripts already on win32, so we end up with these
failing commands:
[185/868] Generating src/rpc/virnetprotocol.h with a custom command
FAILED: src/rpc/virnetprotocol.h
"sh" "libvirt/scripts/meson-python.sh" "F:/msys64/ucrt64/bin/python3.EXE" "F:/msys64/ucrt64/bin/python.exe" "libvirt/scripts/rpcgen/main.py" "--mode=header" "../src/rpc/virnetprotocol.x" "src/rpc/virnetprotocol.h"
SyntaxError: Non-UTF-8 code starting with '\x90' in file F:/msys64/ucrt64/bin/python.exe on line 1, but no encoding declared; see https://peps.python.org/pep-0263/ for details
The issue was introduced in a62486b95f commit.
These changes are similar as e06beacec2 commit.
Signed-off-by: Biswapriyo Nath <nathbappai@gmail.com>
Rewrite the function so that it's more compact and easier to
extend as new architectures, which will likely come with
multibus support right out the gate, are introduced.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It belongs next to qemuDomainSupportsPCI().
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The way the function is currently written sort of obscures this
fact, but ultimately we already unconditionally assume PCI
support on most architectures.
Arm and RISC-V need some additional checks to maintain
compatibility with existing configurations but for all future
architectures, such as the upcoming LoongArch64, we expect PCI
support to come out of the box.
Last but not least, the functions is made const-correct.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It's no longer used anywhere.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For all versions of QEMU that we support, the virt machine type
has a hard dependency on this device, so we can stop checking
whether the capability is present and just use it unconditionally.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The bus name for the default PHB is always "pci.0".
Fixes: 937f319536
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The XML parser for consoles sets the 'port=' attribute of '<target' to
be always the index of the console.
Thus when the "really crazy backcompat stuff for consoles" function
modifies the order of consoles by inserting the default one for a serial
port it must re-number the ports to ensure that the value will not
change on subsequent parse.
This luckily didn't cause any visible changes to the VM as the port
number isn't used for anything.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Assigning a PCI address needs to also assign any extension addresses
right away. Otherwise they'd be assigned only after subsequent
format->parse cycle and thus be potentially missing on first run after
defining the VM and thus could change.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'size' of a 'shmem' device is parsed and formatted as a "scaled"
value, stored in bytes, but the formatting scale is mebibytes. This
precission loss combined with the fact that the value was validated only
when starting and the size is formatted only when non-zero meant that
on first parse a value < 1 MiB would be accepted, but would be formatted
to the XML as 0 MiB as it was non-zero but truncated and a subsequent
parse would parse of such XML would parse it as 0 bytes, which in turn
would be interpreted as 'default' size.
Fix the issue by moving the validator, which ensures that the number is
a power of two and more than 1 MiB to the validator code so that it'll
be rejected at XML parsing time.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Similarly to auto-adding of controllers, the assignment of indexes can
cause them to be considered in different ordering according to the logic
in 'virDomainControllerInsert' than they currently are.
To prevent changes in commandline between first run after defining a VM
xml and any subsequent run or restart of the daemon, we need to reorder
them when assigning the index.
The simplest method is to assign indexes and then create a new list of
controllers and re-instert them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virDomainDefAddController' which is used in code-paths which auto-add
controllers to the definition such as 'virDomainDefMaybeAddController',
'virDomainDefAddUSBController', 'qemuDomainDefAddDefaultDevices' was
adding the controller at the end of the list. However that is not how
the XML parser would order the controller in the list as it uses
virDomainControllerInsert grouping them by type and additional
properties.
This would cause that auto-added controllers would re-order:
- between first and any subsequent run of the VM (even on commandline)
- after a libvirtd/virtqemud restart
- after any update of the definition based on the 'define' operation
(e.g. virsh edit)
To ensure that the ordering of controllers is identical always use
virDomainControllerInsert.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Format the code the usual way despite having more than 80 columns so
that it's easier to read.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since this tests inactive/config XML files rename it accordingly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virDomainDefFormatInternalSetRootName' which is the top level XML
formatter function has the following condition as the very first thing:
if (def->id == -1)
flags |= VIR_DOMAIN_DEF_FORMAT_INACTIVE;
This makes it pointless to separately do inactive->active and
inactive->inactive XML -> XML testing as both will be in the end treated
as inactive->inactive.
This patch adds a warning to virDomainDefFormatInternalSetRootName and
removes the second pointless invocation of the test from
qemuxml2xmtest.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This makes it so libvirt can obtain accurate information about
guest CPUs from QEMU, and should make it possible to correctly
perform operations such as CPU hotplug.
Of course this is mostly moot at the moment: only aarch64 can use
CPU clusters, and CPU hotplug is not yet implemented on that
architecture.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The default number of CPU clusters is 1, and values other than
that one are currently rejected by all hypervisor drivers.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
For machines that don't expose useful information through sysfs,
the dummy ID 0 is used.
https://issues.redhat.com/browse/RHEL-7043
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Just like in rest of the function virDomainFSDefParseXML,
use goto error so that def will be cleaned up in error cases.
Signed-off-by: Shaleen Bathla <shaleen.bathla@oracle.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
A QEMU change (10218ae6d006f76410804cc4dc690085b3d008b5) introduced
some libnuma calls that require read access to
/sys/devices/system/node/*/cpumap, which currently is forbidden by the
standard apparmor profile.
This commit allows read-only access to the file specified above.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/515
Signed-off-by: Sergio Durigan Junior <sergio.durigan@canonical.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
In v9.7.0-rc1~130 I've shortened the path that's generated for
<channel/> source. With that, I had to adjust regex that matches
all versions of paths we have ever generated so that we can drop
them (see comment around qemuDomainChrDefDropDefaultPath()). But
as it is usually the case with regexes - they are write only. And
while I attempted to make one portion of the path optional
("/target/") I accidentally made regex accept more, which
resulted in libvirt dropping the user provided path and
generating our own instead.
Fixes: d3759d3674
Resolves: https://issues.redhat.com/browse/RHEL-20807
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The first thread to issue a client RPC request will own the event
loop execution, sitting in the virNetClientIOEventLoop function.
It releases the client lock while running:
virNetClientUnlock()
g_main_loop_run()
virNetClientLock()
If a second thread arrives with an RPC request, it will queue it
for the first thread to process. To inform the first thread that
there's a new request it calls g_main_loop_quit() to break it out
of the main loop.
This works if the first thread is in g_main_loop_run() at that
time. There is a small window of opportunity, however, where
the first thread has released the client lock, but not yet got
into g_main_loop_run(). If that happens, the wakeup from the
second thread is lost.
This patch deals with that by changing the way the wakeup is
performed. Instead of directly calling g_main_loop_quit(), the
second thread creates an idle source to run the quit function
from within the first thread. This guarantees that the first
thread will see the wakeup.
Tested by: Fima Shevrin <efim.shevrin@virtuozzo.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When VIR_DOMAIN_BLOCK_RESIZE_CAPACITY is set, the 'size' parameter
is currently ignored. Since applications must none the less pass a
value for this parameter, it is preferrable to declare some explicit
semantics for it.
This declare that the parameter must be 0, or the exact size of the
underlying block device. The latter gives the management application
the ability to sanity check that the block device size matches what
they think it should be.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
During post-copy migration (once it actually switches to post-copy mode)
dirty memory pages are continued to be migrated iteratively, while the
destination can explicitly request a specific page to be migrated before
the iterative process gets to it (which happens when a guest wants to
read a page that was not migrated yet). Without the postcopy-preempt
capability enabled such pages need to wait until all other pages already
queued are transferred. Enabling this capability will instruct the
hypervisor to create a separate migration channel for explicitly
requested pages so that they can preempt the queue.
The only requirement for the feature to work is running a migration over
a protocol that supports multiple connections. In other words, we can't
pre-create the connection and pass its file descriptor to QEMU (i.e.,
using MIGRATION_DEST_CONNECT_SOCKET), but we have to let QEMU open the
connections itself (using MIGRATION_DEST_SOCKET). This change is applied
to all post-copy migrations even if postcopy-preempt is not supported to
avoid making the code even uglier than it is now. There's no real
difference between the two methods with modern QEMU (which can properly
report connection failures) anyway.
This capability is enabled for all post-copy migration as long as the
capability is supported on both sides of migration.
https://issues.redhat.com/browse/RHEL-7100
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We enable various migration capabilities according to the flags passed
to a migration API. Missing support for such capabilities results in an
error because they are required by the corresponding flag. This patch
adds support for additional optional capability we may want to enable
for a given API flag in case it is supported. This is useful for
capabilities which are not critical for the flags to be supported, but
they can make things work better in some way.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The migration cookie contains two bitmaps of migration capabilities:
supported and automatic. qemuMigrationParamsCheck expects the letter so
lets make it more obvious by renaming the parameter as remoteAuto.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Add validation and formatting of the commandline arguments for
'iothread-vq-mapping' parameter. The validation logic mirrors what qemu
allows.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduce a new <iothreads> sub-element of disk's <driver> which will
allow configuring multiple iothreads and also map them to specific
virt-queues of virtio devices.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The capability represents the support for mapping virtqueues to
iothreads for the 'virtio-blk' device.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rework the helper to use a GPtrArray structure to simplify callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rather than always binding to the vfio-pci driver, use the new
function virPCIDeviceFindBestVFIOVariant() to see if the running
kernel has a VFIO variant driver available that is a better match for
the device, and if one is found, use that instead.
virPCIDeviceFindBestVFIOVariant() function reads the modalias file for
the given device from sysfs, then looks through
/lib/modules/${kernel_release}/modules.alias for the vfio_pci alias
that matches with the least number of wildcard ('*') fields.
The appropriate "VFIO variant" driver for a device will be the PCI
driver implemented by the discovered module - these drivers are
compatible with (and provide the entire API of) the standard vfio-pci
driver, but have additional device-specific APIs that can be useful
for, e.g., saving/restoring state for migration.
If a specific driver is named (using <driver model='blah'/> in the
device XML), that will still be used rather than searching
modules.alias; this makes it possible to force binding of vfio-pci if
there is an issue with the auto-selected variant driver.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This patch makes it possible to manually specify which VFIO variant
driver to use for PCI hostdev device assignment, so that, e.g. you
could force use of a VFIO "variant" driver, with e.g.
<driver model='mlx5_vfio_pci'/>
or alternately to force use of the generic vfio-pci driver with
<driver model='vfio-pci'/>
when libvirt would have normally (after applying a subsequent patch)
found a "better match" for a device in the active kernel's
modules.alias file. (The main potential use of this manual override
would probably be to work around a bug in a new VFIO variant driver by
temporarily not using that driver).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Xen only supports a single type of PCI hostdev assignment, so it is
superfluous to have <driver name='xen'/> peppered throughout the
config. It *is* necessary to have the driver type explicitly set in
the hostdev object before calling into the hypervisor-agnostic "hostdev
manager" though (otherwise the hostdev manager doesn't know whether it
should do Xen-specific setup, or VFIO-specific setup).
Historically, the Xen driver has checked for "default" driver name
(i.e. not set in the XML), and set it to "xen', during the XML
postparse, thus guaranteeing that it will be set by the time the
object is sent to the hostdev manager at runtime, but also setting it
so early that a simple round-trip of parse-format results in the XML
always containing an explicit <driver name='xen'/>, even if that
wasn't specified in the original XML.
The QEMU driver *doesn't* set driver.name during postparse though;
instead, it waits until domain startup time (or device attach time for
hotplug), and sets the driver.name then. The result is that a
parse-format round trip of the XML in the QEMU driver *doesn't* add in
the <driver name='vfio'/>.
This patch modifies the Xen driver to behave similarly to the QEMU
driver - the PostParse just checks for a driver.name that isn't
supported by the Xen driver, and any explicit setting to "xen" is
deferred until domain runtime rather than during the postparse, thus
Xen domain XML also doesn't get extraneous <driver name='xen'/>.
This delayed setting of driver.name of course results in slightly
different xml2xml parse-format results, so the unit test data is
modified accordingly.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
virHostdevIsVFIODevice() and virDomainDefHasVFIOHostdev() are only ever
called from the QEMU driver, and in the case of the QEMU driver, any
PCI hostdev by definition uses VFIO, so really all these callers only
need to know if the device is a PCI hostdev.
(It turned out that the less specific virHostdevIsPCIDevice() already
existed in hypervisor/virhostdev.c, so I had to remove one of them;
since conf is a lower level directory than hypervisor, and the
function is called from conf, keeping the copy in hypervisor would
have required moving its caller (virDomainDefHasPCIHostdev()) into
hypervisor as well, so I just removed the copy in hypervisor.)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now if a new attribute is added to <driver>, we only need to update
the formatting/parsing in one place.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This is done so that we can re-use the same parser/formatter for
<network> and <networkport>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>