The slirp helper process should be associated with the VM cgroup, like
other helpers.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Don't stop the DBus daemon if a slirp helper failed to start, as it
may be shared with other helpers.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Add support for xl.cfg(5) 'passthrough' option in the domXML-to-xenconfig
configuration converter.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
'passthrough' is Xen-Specific guest configuration option new to Xen 4.13
that enables IOMMU mappings for a guest and hence whether it supports PCI
passthrough. The default is disabled. See the xl.cfg(5) man page and
xen.git commit babde47a3fe for more details.
The default state of disabled prevents hotlugging PCI devices. However,
if the guest configuration contains a PCI passthrough device at time of
creation, libxl will automatically enable 'passthrough' and subsequent
hotplugging of PCI devices will also be possible. It is not possible to
unconditionally enable 'passthrough' since it would introduce a migration
incompatibility due to guest ABI change. Instead, introduce another Xen
hypervisor feature that can be used to enable guest PCI passthrough
<features>
<xen>
<passthrough state='on'/>
</xen>
</features>
To allow finer control over how IOMMU maps to guest P2M table, the
passthrough element also supports a 'mode' attribute with values
restricted to snyc_pt and share_pt, similar to xl.cfg(5) 'passthrough'
setting .
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
e820_host is a Xen-specific option, only available for PV domains, that
provides the domain a virtual e820 memory map based on the host one. It
is enabled with a new Xen hypervisor feature, e.g.
<features>
<xen>
<e820_host state='on'/>
</xen>
</features>
e820_host is required when using PCI passthrough and is generally
considered safe for any PV kernel. e820_host is silently ignored if set
in HVM domain configuration. See xl.cfg(5) man page in the Xen
documentation for more details.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
The udev monitor thread "udevEventHandleThread()" will lag the
actual/real view of devices in sysfs as it serially processes udev
monitor events. So for instance if you were to run the following cmd
to create a new veth pair and rename one of the veth endpoints
you might see the following monitor events and real world that looks like
time
| create v0 sysfs entry
wake udevEventHandleThread | create v1 sysfs entry
udev_monitor_receive_device(v1-add) | move v0 sysfs to v2
udevHandleOneDevice(v1) |
udev_monitor_receive_device(v0-add) |
udevHandleOneDevice(v0) | <--- error msgs in virNetDevGetLinkInfo()
udev_monitor_receive_device(v2-move) | as v0 no longer exists
udevHandleOneDevice(v2) |
\/
As you can see the changes in sysfs can take place well before we get
to act on the events in the udevEventHandleThread(), so by the time we
get around to processing the v0 add event, the sysfs entry has been
moved to v2.
To work around this we check if the sysfs entry is valid before
attempting to read it and don't bother trying to read link info if
not. This is safe since we will never read sysfs entries earlier than
it existing, ie. if the entry is not there it has either been removed
in the time since we enumerated the device or something bigger is
busted, in either case, no sysfs entry, no link info. In the case
described above we will eventually get the link info as we work
through the queue of monitor events and get to the 'move' event.
https://bugzilla.redhat.com/show_bug.cgi?id=1557902
Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It is possible and common to rename some devices, this is especially
true for ethernet devices such as veth pairs.
In the udevEventHandleThread() we will be notified of this change but
currently we only process "add", "change" and "remove"
events. Renaming a device such as above results in a "move" event, not
a "remove" followed by and "add" or vise versa. This change will add
the new/destination device to our records but unfortunately there is
no usable mechanism to identify the old/source device to remove it
from the records. So this is admittedly only a partial fix.
Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Changes in the API:
- APIs related to the graphics adapter are no longer on the
IMachine interface, but on a IGraphicsAdapter interface
- The LaunchVMProcess method takes a list of env variables
instead of a single variable containing a concatenated
list. Since we only ever pass a single env variable, we
can simply stuff it straight into a list.
- The DHCP server start method no longer needs the network
name
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Changes in the API:
- The CreatedSharedFolder method now accepts a target mount
point. Since we don't request automount, we're just passing
NULL. We could, however, use this to pass the desired
mount target from the XML config in future.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Long ago we switched the vbox driver to run inside libvirtd to avoid
libvirt.so being polluted with GPLv2-only code. Since libvirtd is not
built on Windows, we disabled vbox on Windows builds. Thus the MSCOM
glue code is not required.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
While I'm at it, use more g_autofree and g_autoptr() in this
file. This also fixes a possible mem-leak in
virNetDevGetVirtualFunctions().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
I've just got a new machine and I'm still converging on the
kernel config. Anyway, since I don't have enabled any of SRIO-V
drivers, my kernel doesn't have NET_DEVLINK enabled (i.e.
virNetDevGetFamilyId() returns 0). But this makes nodedev driver
ignore all interfaces, because when enumerating all devices via
udev, the control reaches virNetDevSwitchdevFeature() eventually
and subsequently virNetDevGetFamilyId() which 'fails'. Well, it's
not really a failure - the virNetDevSwitchdevFeature() stub
simply returns 0.
Also, move the call a few lines below, just around the place
where it's needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduced in v3.8.0-rc1~96, the virNetDevGetFamilyId() gets
netlink family ID for passed family name (even though it's used
only for getting "devlink" ID). Nevertheless, the function
returns 0 on an error or if no family ID was found. This makes it
harder for a caller to distinguish these two. Change the retval
so that a negative value is returned upon error, zero is no ID
found (but no error encountered) and a positive value is returned
on successful translation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
As explained in the previous commit, we need to relabel the file
we are restoring the domain from. That is the FD that is passed
to QEMU. If the file is not under /dev then the file inside the
namespace is the very same as the one in the host. And regardless
of using transactions, the file will be relabeled. But, if the
file is under /dev then when using transactions only the copy
inside the namespace is relabeled and the one in the host is not.
But QEMU is reading from the one in the host, actually.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1772838
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
This API allows drivers to separate out handling of @stdin_path
of virSecurityManagerSetAllLabel(). The thing is, the QEMU driver
uses transactions for virSecurityManagerSetAllLabel() which
relabels devices from inside of domain's namespace. This is what
we usually want. Except when resuming domain from a file. The
file is opened before any namespace is set up and the FD is
passed to QEMU to read the migration stream from. Because of
this, the file lives outside of the namespace and if it so
happens that the file is a block device (i.e. it lives under
/dev) its copy will be created in the namespace. But the FD that
is passed to QEMU points to the original living in the host and
not in the namespace. So relabeling the file inside the namespace
helps nothing.
But if we have a separate API for relabeling the restore file
then the QEMU driver can continue calling
virSecurityManagerSetAllLabel() with transactions enabled and
call this new API without transactions.
We already have an API for relabeling a single file
(virSecurityManagerDomainSetPathLabel()) but in case of SELinux
it uses @imagelabel (which allows RW access) and we want to use
@content_context (which allows RO access).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
This commit partially reverts
commit c360ea28dc267802690e129fbad08ca2f22a44e9
Refs: v6.2.0-rc1-1-gc360ea28dc
Author: Rafael Fonseca <r4f4rfs@gmail.com>
AuthorDate: Fri Mar 27 18:40:47 2020 +0100
Commit: Michal Prívozník <mprivozn@redhat.com>
CommitDate: Mon Mar 30 09:48:22 2020 +0200
util: virdaemon: fix compilation on mingw
The daemons are not supported on Win32 and therefore were not compiled
in that platform. However, with the daemon code sharing, all the code in
utils *is* compiled and it failed because `waitpid`, `fork`, and
`setsid` are not available. So, as before, let's not build them on
Win32 and make the code more portable by using existing vir* wrappers.
Not compiling virDaemonForkIntoBackground on Win32 is good, but the
second part of the original patch incorrectly replaced waitpid and fork
with our virProcessWait and virFork APIs. These APIs are more than just
simple wrappers and we don't want any of the extra functionality.
Especially virFork would reset any setup made before
virDaemonForkIntoBackground is called, such as logging, signal handling,
etc.
As a result of the change the additional fix in v6.2.0-67-ga87e4788d2
(util: virdaemon: fix waiting for child processes) is no longer
needed and it is effectively reverted by this commit.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Fixes build error introduced in
commit aa15e9259f1f246e69fb9742581ced720c88695d
Author: Laine Stump <laine@redhat.com>
Date: Sun Apr 5 22:40:37 2020 -0400
qemu/conf: set HOTPLUGGABLE connect flag during PCI address set init
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Trivial comment fix, reflecting the changes in
4ee2b31804f4d3477ee83bac28d9991afb0c3393.
Signed-off-by: Leonid Bloch <lb.workbox@gmail.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Previously, we used virCapabilitiesDomainDataLookup() to fill
machine type in post parse callback if none was provided in the
domain XML. If machine type couldn't be filled in an error was
reported. After 4a4132b4625 we've changed it to
virQEMUCapsGetPreferredMachine() which returns NULL, but we no
longer report an error and proceed with the post parse callbacks
processing. This may lead to a crash because the code later on
assumes def->os.machine is not NULL.
Fixes: 4a4132b4625
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Mores <pmores@redhat.com>
When preparing to do a blockcopy, the mirror image is modified so
that QEMU can access it. For instance, the mirror has seclabels
set, if it is a NVMe disk it is detached from the host and so on.
And usually, the restore is done upon successful finish of the
blockcopy operation. But, if something fails then we need to
explicitly revoke the access to the mirror image (and thus
reattach NVMe disk back to the host).
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1822538
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Mores <pmores@redhat.com>
When we do parallel migration, The multifd-channels migration parameter
needs to be set on the destination side as well before incoming migration
URI, unless we accept the default number of connections(2).
Usually, This can be correctly handled by libvirtd. But in this case if
we use p2p + xbzrle compression without parameter '--comp-xbzrle-cache',
qemuMigrationParamsDump returns too early, The corresponding migration
parameter will not be set on the destination side, It results QEMU hangs.
Reproducer:
virsh migrate --live --p2p --comp-methods xbzrle \
--parallel --parallel-connections 3 GUEST qemu+ssh://dsthost/system
or
virsh migrate --live --p2p --compressed \
--parallel --parallel-connections 3 GUEST qemu+ssh://dsthost/system
Signed-off-by: Lin Ma <lma@suse.com>
Message-Id: <20200416044451.21134-1-lma@suse.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
With libpmem support compiled into qemu it will trigger the following
denials on every startup.
apparmor="DENIED" operation="open" name="/"
apparmor="DENIED" operation="open" name="/sys/bus/nd/devices/"
This is due to [1] that tries to auto-detect if the platform supports
auto flush for all region.
Once we know all the paths that are potentially needed if this feature
is really used we can add them conditionally in virt-aa-helper and labelling
calls in case </pmem> is enabled.
But until then the change here silences the denial warnings seen above.
[1]: https://github.com/pmem/pmdk/blob/master/src/libpmem2/auto_flush_linux.c#L131
Bug: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1871354
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Acked-by: Jamie Strandboge <jamie@canonical.com>
Starting with 3b076391befc3fe72deb0c244ac6c2b4c100b410
(v6.1.0-122-g3b076391be) we support http cookies. Since they may contain
somewhat sensitive information we should not format them into the XML
unless VIR_DOMAIN_DEF_FORMAT_SECURE is asserted.
Reported-by: Han Han <hhan@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
We always tried to install backing store for the image even if it didn't
make sense, e.g. for a full backup into a raw image. Additionally we
didn't record the backing file into the qcow2 metadata so the image
itself contained the diff of data but reading from it would be
incomplete as it depends on the backing image.
This patch fixes both issues by carefully installing the correct backing
file when appropriate and also recording it into the metadata when
creating the image.
https://bugzilla.redhat.com/show_bug.cgi?id=1813310
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is the last missing g_autofree conversion change in the module after
commit 1e2ae2e311c took care of the VIR_AUTOFREE conversion.
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Before this patch we would simply rely on QEMU failing to attach the
device. Since we have a flag in the address set telling us which
controllers support hotplug, we can fail the operation sooner.
This also assures that when hotplugging with no provided PCI address,
that we skip any controllers with hotplug='off', and attempt to assign
the device to a controller that not only supports hotplug, but also
has it enabled.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The HOTPLUGGABLE flag is set for appropriates buses in a PCI address
set, and thnis patch updates virDomainPCIAddressFlagsCompatible() to
check the HOTPLUGGABLE flag when searching for a suitable bus/slot for
a device. No devices request HOTPLUGGABLE though (yet), so there is no
observable effect.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virDomainPCIAddressBusSetModel() is called for each PCI controller
when building an address set prior to assiging PCI addresses to
devices.
This patch adds a new argument, allowHotplug, to that function that
can be set to false if we know for certain that a particular
controller won't support hotplug
The most interesting case is in qemuDomainPCIAddressSetCreate(), where
the config of each existing controller is available while building the
address set, so we can appropriately set allowHotplug = false when the
user has "hotplug='off'" in the config of a controller that normally
would support hotplug. In all other cases, it is set to true or false
in accordance with the capability of the controller model.
So far we aren't doing anything with this bus flag in the address set.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Old behavior: If the address was manually provided by config, copy
device AUTOASSIGN flag into the bus flag, and then later on in the
function *always* check for a match of the flags (which will always
match if the address came from config, since we just copied it).
New behavior: Don't mess with the bus flags - just directly check if
the AUTOASSIGN flag matches in bus and dev, but only make the check if
the address didn't come from config (i.e. it was auto-assigned by
libvirt).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When the HOTPLUGGABLE flag was originally added, it was set for all
the PCI controllers that accepted hotplugged devices, and requested
for all devices that were auto-assigned to a controller. While we're
still autoassigning to the same list of controllers, those controllers
may or may not support hotplug, so let's use the flag that fits what
we're actually doing.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This new flag will be set for any controller that we decide can have
devices assigned to it automatically during PCI device assignment. In
the past PCI_CONNECT_TYPE_HOTPLUGGABLE was used for this purpose, but
that is overloading that flag, and no longer technically correct; what
we *really* want is to auto-assign devices to any pcie-root-port or
pcie-switch-downstream-port regardless of whether or not that
controller happens to have hotplug enabled.
This patch just adds the flag, but doesn't use it at all. Note that
the numbering of all the other flags was changed in order to insert
the new flag near the beginning of the list; that doesn't cause any
problem because the connect flags aren't stored anywhere between runs
of libvirtd.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If a pcie-root-port or pcie-downstream-port has hotplug='off' in its
<target> subelement, and if the qemu binary supports the hotplug=false
option, then it will be added to the commandline for the pcie
controller. This controller will then not allow any hotplug/unplug of
devices while the guest is running (and the hotplug capability won't
be advertised to the guest OS, so the guest OS also won't present
unplugging of PCI devices as an option).
<controller type='pci' model='pcie-root-port'>
<target hotplug='off'/>
</controller>
For any PCI controllers other than pcie-downstream-port and
pcie-root-port, of for qemu binaries that don't support the hotplug
commandline option, an error will be logged during validation.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
a <controller type='pci'...> element can now have a "hotplug"
attribute in the <target> subelement. This is intended to control
whether or not the slot(s) of the controller support
hotplugging/unplugging a device:
<controller type='pci' model='pcie-root-port'>
<target hotplug='off'/>
</controller>
The default value of hotplug is "on".
Since support for configuring such an option is hypervisor-dependent
(and will vary among different types of PCI controllers even on a
single hypervisor), no validation is done in this patch - that
validation will be done in the patch that wires support for the
setting into the hypervisor.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This caps flag is set when the qemu binary supports the option
"hotplug" for pcie-root-port, ioh3420 (Intel pcie-root-port) and
xio3130-downstream (Intel pcie-downstream-port). If it's available,
it's possible to disable hotplugging/unplugging devices on a
particular port by adding ",hotplug=off" to the qemu device
commandline. This option first appears in qemu-5.0.0.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Add support in the domXML<->native config converter for max_event_channels.
The parser and formater functions for max_grant_frames were reworked to
also parse max_event_channels. In doing so the xenbus controller is added
earlier in the config parsing, requiring a small adjustment to one of the
existing tests. Include a new test for the event channel conversion.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add support for setting event_channels in libxl domain config object and
include a test to check that it is properly converted from XML to libxl
domain config.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Event channels are like PV interrupts and in conjuction with grant frames
form a data transfer mechanism for PV drivers. They are also used for
inter-processor interrupts. Guests with a large number of vcpus and/or
many PV devices many need to increase the maximum default value of 1023.
For this reason the native Xen config format supports the
'max_event_channels' setting. See xl.cfg(5) man page for more details.
Similar to the existing maxGrantFrames option, add a new xenbus controller
option 'maxEventChannels', allowing to adjust the maximum value via libvirt.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>