The call to resolve the actual network type will turn any NICs with
type=network into one of the other types. Thus there should be no need
to handle type=network in later switch() statements jumping off the
actual type.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Ports allocated on virtual networks with type=nat|route|open all get
given an actual type of 'network'.
Only ports in networks with type=bridge use an actual type of 'bridge'.
This distinction makes little sense since the virtualization drivers
will treat both actual types in exactly the same way, as they're all
just bridge devices a VM needs to be connected to.
This doesn't affect user visible XML since the "actual" device XML
is internal only, but we need code to convert the data upgrades.
Reviewed-by: Laine Stump <laine@laine.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Firstly, VIR_STRDUP() accepts NULL, so there is no need to check
if the string we want to duplicate is not-NULL. Secondly,
virDomainNetSetModelString() also accepts NULL. Thirdly, we have
VIR_AUTOFREE().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This requires drivers to opt in to handle the raw modelstr
network model, all others will error if a passed in XML value
is not in the model enum.
Enable this feature for libxl/xen/xm and qemu drivers
Acked-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
This converts the qemu driver to the net model enum, for all
the model values that we have hardcoded for various checks,
which adds e1000e, virtio-transitional, virtio-non-transitional,
usb-net, spapr-vlan, lan9118, smc91c111
Because the qemu driver has historically also allowed the raw
model string onto the qemu command line, this isn't a full
conversion. Unwinding that will require more thought. However
for all new driver code we should be adding explicit enum
values for any model name we have special handling for.
Remove the now unused virDomainNetStreqModelString
Acked-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
This adds a network model enum. The virDomainNetDef property
is named 'model' like most other devices.
When the XML parser or a driver calls NetSetModelString, if
the passed string is in the enum, we will set net->model,
otherwise we copy the string into net->modelstr
Add a single example for the 'netfront' xen model, and wire
that up, just to verify it's all working
Acked-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
To ease converting the net->model value to an enum, add
the wrapper functions:
virDomainNetGetModelString
virDomainNetSetModelString
virDomainNetStreqModelString
virDomainNetStrcaseeqModelString
Acked-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
hostdevs have a link back to the original network device. This is fairly
generic accepting any type of device, however, we don't intend to make
use of this approach in future. It can thus be specialized to network
devices.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The APIs for allocating/notifying/removing network ports just take
an internal domain interface struct right now. As a step towards
turning these into public facing APIs, add a virNetworkPtr argument
to all of them.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The port allocation APIs are currently called unconditionally for all
types of NIC, but (mostly) only do anything for NICs with type=network.
The exception is the port allocate API which does some validation even
for NICs with type!=network. Relying on this validation is flawed,
however, since the network driver may not even be installed. IOW virt
drivers must not delegate validation to the network driver for NICs
with type != network.
This change allows us to report errors when the virtual network driver
is not registered.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This helps in a scenarios where vCPUs run with a priority that is so high they
might starve the emulator thread. And it also fits with the rest of the
settings.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Qemu commit 767abe7 ("chardev: forbid 'wait' option with client
sockets") effectively deprecates usage of "wait" with client sockets
starting with qemu 4.0, and earlier versions ignored the value.
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
My earlier commit be46f61326 was incomplete. It removed caching of
microcode version in the CPU driver, which means the capabilities XML
will see the correct microcode version. But it is also cached in the
QEMU capabilities cache where it is used to detect whether we need to
reprobe QEMU. By missing the second place, the original commit
be46f61326 made the situation even worse since libvirt would report
correct microcode version while still using the old host CPU model
(visible in domain capabilities XML).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Support for kqemu was dropped in libvirt by commit 8e91a400c and even
back then we never set these capabilities when doing QMP probing.
Since no QEMU we aim to support has these, drop them completely.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The NVIDIA V100 GPU has an onboard RAM that is mapped into the
host memory and accessible as normal RAM via an NVLink2 bridge. When
passed through in a guest, QEMU puts the NVIDIA RAM window in a
non-contiguous area, above the PCI MMIO area that starts at 32TiB.
This means that the NVIDIA RAM window starts at 64TiB and go all the
way to 128TiB.
This means that the guest might request a 64-bit window, for each PCI
Host Bridge, that goes all the way to 128TiB. However, the NVIDIA RAM
window isn't counted as regular RAM, thus this window is considered
only for the allocation of the Translation and Control Entry (TCE).
For more information about how NVLink2 support works in QEMU,
refer to the accepted implementation [1].
This memory layout differs from the existing VFIO case, requiring its
own formula. This patch changes the PPC64 code of
@qemuDomainGetMemLockLimitBytes to:
- detect if we have a NVLink2 bridge being passed through to the
guest. This is done by using the @ppc64VFIODeviceIsNV2Bridge function
added in the previous patch. The existence of the NVLink2 bridge in
the guest means that we are dealing with the NVLink2 memory layout;
- if an IBM NVLink2 bridge exists, passthroughLimit is calculated in a
different way to account for the extra memory the TCE table can alloc.
The 64TiB..128TiB window is more than enough to fit all possible
GPUs, thus the memLimit is the same regardless of passing through 1 or
multiple V100 GPUs.
Further reading explaining the background
[1] https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg03700.html
[2] https://www.redhat.com/archives/libvir-list/2019-March/msg00660.html
[3] https://www.redhat.com/archives/libvir-list/2019-April/msg00527.html
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
The NVLink2 support in QEMU implements the detection of NVLink2
capable devices by verifying the attributes of the VFIO mem region
QEMU allocates for the NVIDIA GPUs. To properly allocate an
adequate amount of memLock, Libvirt needs this information before
a QEMU instance is even created, thus querying QEMU is not
possible and opening a VFIO window is too much.
An alternative is presented in this patch. Making the following
assumptions:
- if we want GPU RAM to be available in the guest, an NVLink2 bridge
must be passed through;
- an unknown PCI device can be classified as a NVLink2 bridge
if its device tree node has 'ibm,gpu', 'ibm,nvlink',
'ibm,nvlink-speed' and 'memory-region'.
This patch introduces a helper called @ppc64VFIODeviceIsNV2Bridge
that checks the device tree node of a given PCI device and
check if it meets the criteria to be a NVLink2 bridge. This
new function will be used in a follow-up patch that, using the
first assumption, will set up the rlimits of the guest
accordingly.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This does not cause a problem in usual scenarios thanks to us allowing
CAP_DAC_OVERRIDE for the qemu process, however in some scenarios this might be
an issue because the directory is created with mkdtemp(3) which explicitly
creates that with 0700 permissions and qemu running as non-root cannot access
that.
The scenarios include:
- Builds without CAPNG
- Running libvirtd in certain container configurations [1]
- and possibly others.
[1] https://github.com/kubevirt/kubevirt/pull/2181#issuecomment-481840304
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The structure can only be used for CPUID data now. Adding a type
indicator and moving the data into a union will let us store alternative
data types.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The following patches introduce CPU features read from MSR in addition
to those queried via CPUID instruction. Let's introduce a container
struct which will be able to describe either feature type.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Vim has trouble figuring out the filetype automatically because
the name doesn't follow existing conventions; annotations like
the ones we already have in Makefile.ci help it out.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The latter is deprecated and will be removed soon. The advised
replacement is '-overcommit mem-lock=on|off'.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Added in QEMU commit of v3.0.0-rc0~48^2~9 (then fixed by
v3.1.0-rc0~119^2~37) QEMU is replacing '-realtime mlock' with
'-overcommit mem-lock'. Add a capability to tell if we're dealing
new new enough qemu to use the replacement.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The '-realtime mlock' cmd line argument was introduced in QEMU
commit v1.5.0-rc0~190 which matches minimal QEMU version we
require. Therefore, the capability will always be present.
Apparently, nearly none of our xml2argv test cases had the
capability hence slightly bigger change under qemuxml2argvdata/.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Standardize on putting the _LAST enum value on the second line
of VIR_ENUM_IMPL invocations. Later patches that add string labels
to VIR_ENUM_IMPL will push most of these to the second line anyways,
so this saves some noise.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Demonstrate how VIR_RETURN_PTR is used by refactoring qemu_block.c
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is a locally used helper struct but we can make use of automatic
freeing for it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Commit 6864d8f740 moved this one level up
for qemuBuildMemoryBackendProps but left qemuBuildMemPathStr intact.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
If a management application wants to use firmware auto selection
feature it can't currently know if the libvirtd it's talking to
support is or not. Moreover, it doesn't know which values that
are accepted for the @firmware attribute of <os/> when parsing
will allow successful start of the domain later, i.e. if the mgmt
application wants to use 'bios' whether there exists a FW
descriptor in the system that describes bios.
This commit then adds 'firmware' enum to <os/> element in
<domainCapabilities/> XML like this:
<enum name='firmware'>
<value>bios</value>
<value>efi</value>
</enum>
We can see both 'bios' and 'efi' listed which means that there
are descriptors for both found in the system (matched with the
machine type and architecture reported in the domain capabilities
earlier and not shown here).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
The point of this API is to fetch all FW descriptors, parse them
and return list of supported interfaces and SMM feature for given
combination of machine type and guest architecture.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
This reverts commit a5e1602090.
Getting rid of unistd.h from our headers will require more work than
just fixing the broken mingw build. Revert it until I have a more
complete proposal.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
util/virutil.h bogously included unistd.h. Drop it and replace it by
including it directly where needed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virutil.(c|h) is a very gross collection of random code. Remove the enum
handlers from there so we can limit the scope where virtutil.h is used.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'viralloc.h' does not provide any type or macro which would be necessary
in headers. Prevent leakage of the inclusion.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Keeping them with viralloc.h forcibly pulls in the other stuff from
viralloc.h into other header files. This in turn creates a mess
as more and more headers pull in the 'viral' header file.
If we want to make 'viralloc.h' omnipresent we should pick a different
approach.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The rules are the same for all virt guests, regardless of the
architecture.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Our PCIe topology depends on the availability of PCIe Root Ports,
so if none of the suitable devices (pcie-root-port, ioh3420) is
compiled into QEMU we should fall back to virtio-mmio rather than
trying to use PCI addresses only to fail immediately afterwards
when we realize we can't use the necessary controllers.
Note that this additional check is basically moot for ARM virt
guests, because PCIe Root Ports were enabled in QEMU builds for
the architecture well before guest OS support had been widely
available; however, the opposite is true for RISC-V, and tweaking
the code this way will allow us to share it between architectures.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Since the STOP event handler can use the pausedReason as sent to
qemuProcessStopCPUs, we no longer need to send duplicate suspended
lifecycle events because we know what caused the stop along with extra
details. This processing allows us to also remove the duplicated state
change from qemuProcessStopCPUs.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Map is based on existing cases in code where we send suspended
event after changing domain state to paused.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Similar to commit [1] which saves and passes the running reason to
the RESUME event handler, during qemuProcessStopCPUs let's save and pass
the pause reason in the domain private data so that the STOP event
handler can use it.
[1] 5dab984ed : qemu: Pass running reason to RESUME event handler
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1693066
Up until memfd introduction (in 24b74d187c) we did not need to
know @pagesize because qemuGetDomainHupageMemPath() could deal
with it being zero (value of zero means use the default hugetlbfs
mount). But since for memfd we are not passing a path to
hugetlbfs mount rather the page size value we need to know its
value upfront.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This helper returns the default hugetlbfs mount point from given
array of mount points.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuMigrationSrcPerform callers expect it to call virDomainObjEndAPI
in any case so on error paths we miss the virDomainObjEndAPI call.
To fix this let's make qemuMigrationSrcPerform callers responsible
for the virDomainObjEndAPI call.
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
d_type is a non-portable extension to the struct dirent and even if it
exists, its value may be DT_UNKNOWN if the filesystem doesn't support
it. This is common with older versions of XFS which have ftype=0
feature.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Use virJSONValueToBuffer so that we can append the command terminator
string without copying of the string again. Also avoid a 'strlen' as we
can query the buffer use size.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
The internal qemu machinery already logs the sent message via the PROBE
point in qemuMonitorSend and the monitor receive function. Those are way
better as they are easy grepable. Remove the additional ones from the
monitor code which just duplicate the sent data.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Refactor code paths which clear strings on cleanup paths to use the
automatic helper.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'blockdev-snapshot-sync' is present in QEMU since v0.14.0-rc0 and
'transaction' since v1.1.0 (52e7c241ac766406f05fa)
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemu added the 'drive-mirror' command in v1.3.0 (d9b902db3fb71fdc)
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemu added the 'block-commit' command in v1.3.0 (ed61fc10e8c8d2)
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This was detected by the presence of 'block-stream' which is present in
qemu since v1.1 (db58f9c0605fa151b8c4)
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When adding <migrationSource> I've used a slightly unusual approach. To
allow using the disk source XML parser and formatter convert
<migrationSource> to look like <disk>. This means that <source> will be
added as a subelement of <migrationSource> rather than being formatted
inline.
Conversion from the old format in the parser is very simple as it
involves only moving the XPath context current node slightly if the new
format is found.
The status XML to XML test shows that the upgrade is done correctly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All callers including transitive callers through
virDomainDiskSourceFormatInternal always pass true. Remove the argument.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Using copy_on_read for removable disks is a hassle. It also does not
work for CDROMs at all as the image is supposed to be read-only and we
might ignore it for floppies when they are started as empty. Forbid it
for floppies completely rather than trying to support what probably
nobody is using.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Until the block job completes we can't change the disk chain. Removal
would fail as the block job still has reference to the chain.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Unref the config pointer automatically in code paths which get a local
copy.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuDomainChangeGraphicsPasswords and qemuDomainRemoveHostDevice
don't use 'cfg' any more since commits 4327df7eee and 802c59d4b9
respectively.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Failure of qemuMonitorGetVersion is fatal now that we only support QMP
based qemus. Remove the debug message since we report an error already.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
If the detected qemu version is below our required version 'package'
would be leaked.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Move the check out of virQEMUCapsInitQMPMonitor similarly to other
functions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Move the code out of virQEMUCapsInitQMPMonitor similarly to other
functions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Move the check out of virQEMUCapsInitQMPMonitor similarly to other
functions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Some caps are cleared according to some more advanced logic after
detection. Split all that logic out into virQEMUCapsInitProcessCaps.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
virQEMUCapsInitQMPMonitor is massive now since it collects calls to the
various probing functions and also version based capabilities. Split
out the version based caps into a separate function.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Check that the attribute is the same in qemuDomainDiskChangeSupported
in case somebody tries to change it using the UpdateDevice API.
https://bugzilla.redhat.com/show_bug.cgi?id=1601677
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
https://bugzilla.redhat.com/show_bug.cgi?id=1601677
This reverts commit 047cfb05ee
Using numeric comparison on strings means we reject every update
that does include the group name, even if it's unchanged.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
The warning is reported at a code path which already reports a proper
error so it's pointless to add yet another line into logs.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Avoid the extra parameter passing in the disk 'dst' parameter to be
reported instead of the device alias. Using 'dst' instead of alias does
not add much value.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
qemuDomainRemoveDiskDevice calls qemuDomainReleaseDeviceAddress which
already calls virDomainUSBAddressRelease so we don't need to call it
again.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
There is one specific caller (testInfoSetArgs() in
qemuxml2argvtest.c) which expect the va_list argument to change
after returning from the virQEMUCapsSetVAList() function.
However, since we are passing plain va_list this is not
guaranteed. The man page of stdarg(3) says:
If ap is passed to a function that uses va_arg(ap,type), then
the value of ap is undefined after the return of that function.
(ap is a variable of type va_list)
I've seen this in action in fact: on i686 the qemuxml2argvtest
fails on the second test case because testInfoSetArgs() sees
ARG_QEMU_CAPS and calls virQEMUCapsSetVAList to process the
capabilities (in this case there's just one
QEMU_CAPS_SECCOMP_BLACKLIST). But since the changes are not
reflected in the caller, in the next iteration testInfoSetArgs()
sees the QEMU capability and not ARG_END.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Use the common base class virDomainMoment for iterator callbacks
related to snapshots from the qemu code, so that when checkpoint
operations are introduced, they can share the same callbacks.
Simplify the code for qemuDomainSnapshotCurrent by better utilizing
virDomainMoment helpers.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The qemu driver already had a full-blown virDomainMomentObjPtr to
check against, and the test driver ought to have one since we get
better error checking that the user passed in a valid object. Removes
the need for a helper function added in commit commit 4819f54b.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The VIR_MIGRATE_PARALLEL flag is implemented using QEMU's multifd
migration capability and the corresponding multifd-channels migration
parameter.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
For [some unknown reason, possibly/probably pure chance], Net devices
have been taken offline and their bandwidth tc rules cleared as the
very first operation when detaching the device. This is contrary to
every other type of device, where all hostside teardown is delayed
until we receive the DEVICE_DELETED event back from qemu, indicating
that the guest has finished with the device.
This patch delays these two operations until receipt of
DEVICE_DELETED, which removes an ugly wart from
qemuDomainDetachDeviceLive(), and also seems to be a more correct
sequence of events.
Signed-off-by: Laine Stump <laine@laine.org>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
The VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED event is sent after qemu has
responded to a device_del command with a DEVICE_DELETED event. Before
queuing the event, *some* of the final teardown of the device's
trappings in libvirt is done, but not *all* of it. As a result, an
application may receive and process the DEVICE_REMOVED event before
libvirt has really finished with it.
Usually this doesn't cause a problem, but it can - in the case of the
bug report referenced below, vdsm is assigning a PCI device to a guest
with managed='no', using livirt's virNodeDeviceDetachFlags() and
virNodeDeviceReAttach() APIs. Immediately after receiving a
DEVICE_REMOVED event from libvirt signalling that the device had been
successfully unplugged, vdsm would cal virNodeDeviceReAttach() to
unbind the device from vfio-pci and rebind it to the host driverm but
because the event was received before libvirt had completely finished
processing the removal, that device was still on the "activeDevs"
list, and so virNodeDeviceReAttach() failed.
Experimentation with additional debug logs proved that libvirt would
always end up dispatching the DEVICE_REMOVED event before it had
removed the device from activeDevs (with a *much* greater difference
with managed='yes', since in that case the re-binding of the device
occurred after queuing the device).
Although the case of hostdev devices is the most extreme (since there
is so much involved in tearing down the device), *all* device types
suffer from the same problem - the DEVICE_REMOVED event is queued very
early in the qemuDomainRemove*Device() function for all of them,
resulting in a possibility of any application receiving the event
before libvirt has really finished with the device.
The solution is to save the device's alias (which is the only piece of
info from the device object that is needed for the event) at the
beginning of processing the device removal, and then queue the event
as a final act before returning. Since all of the
qemuDomainRemove*Device() functions (except
qemuDomainRemoveChrDevice()) are now called exclusively from
qemuDomainRemoveDevice() (which selects which of the subordinates to
call in a switch statement based on the type of device), the shortest
route to a solution is to doing the saving of alias, and later
queueing of the event, in the higher level qemuDomainRemoveDevice(),
and just completely remove the event-related code from all the
subordinate functions.
The single exception to this, as mentioned before, is
qemuDomainRemoveChrDevice(), which is still called from somewhere
other than qemuDomainRemoveDevice() (and has a separate arg used to
trigger different behavior when the chr device has targetType ==
GUESTFWD), so it must keep its original behavior intact, and must be
treated differently by qemuDomainRemoveDevice() (similar to the way
that qemuDomainDetachDeviceLive() treats chr and lease devices
differently from all the others).
Resolves: https://bugzilla.redhat.com/1658198
Signed-off-by: Laine Stump <laine@laine.org>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Now that all the qemuDomainDetachPrep*() functions look nearly
identical at the end, we can put one copy of that identical code in
qemuDomainDetachDeviceLive() at the point after the individual prep
functions have been called, and remove the duplicated code from all
the prep functions. The code to locate the target "detach" device
based on the "match" device remains, as do all device-type-specific
validations.
Unfortunately there are a few things going on at once in this patch,
which makes it a bit more difficult to follow than the others; it was
just impossible to do the changes in stages and still have a
buildable/testable tree at each step.
The other changes of note:
* The individual prep functions no longer need their driver or async
args, so those are removed, as are the local "ret" variables, since
in all cases the functions just directly return -1 or 0.
* Some of the prep functions were checking for a valid alias and/or
for attempts to detach a multifunction PCI device, but not all. In
fact, both checks are valid (or at least harmless) for *all* device
types, so they are removed from the prep functions, and done a
single time in the common function.
(any attempts to *create* an alias when there isn't one has been
removed, since that is doomed to failure anyway; the only way the
device wouldn't have an alias is if 1) the domain was created by
calling virsh qemu-attach to attach an existing qemu process to
libvirt, and 2) the qemu command that started said process used "old
style" arguments for creating devices that didn't have any device
ids. Even if we constructed a device id for one of these devices,
qemu wouldn't recognize it in the device_del command anyway, so we
may as well fail earlier with "device missing alias" rather than
failing later with "couldn't delete device net0".)
* Only one type of device has shutdown code that must not be called
until after *all* validation of the device is done (including
checking for multifunction PCI and valid alias, which is done in the
toplevel common code). For this reason, the Net function has been
split in two, with the 2nd half (qemuDomainDetachShutdownNet())
called from the common function, right before sending the delete
command to qemu.
Signed-off-by: Laine Stump <laine@laine.org>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Although all hotpluggable devices other than lease, controller,
watchdof, and vsock can be audited, and *are* audited when an unplug
is successful, only disk, net, and hostdev were actually being audited
on failure.
This patch corrects that omission.
Signed-off-by: Laine Stump <laine@laine.org>
ACKed-by: Peter Krempa <pkrempa@redhat.com>