The 'storageMigration' flag is supposed to be true if storage migration
is requested, which is based on VIR_MIGRATE_NON_SHARED_DISK or
VIR_MIGRATE_NON_SHARED_INC flags. The assignment to the variable used
QEMU_MONITOR_MIGRATE_NON_SHARED_INC (0x04) instead of
VIR_MIGRATE_NON_SHARED_INC (0x80), caused libvirtd to skip the actual
copy of data.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1978526
Fixes: da69f4b2084bff140238e450e264d6036ebef898
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Libvirt started emitting two threshold events, once with index and once
withouth when the index isn't registered. Document this caveat.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Remember whether the user passed an explicit index when registering the
event so that we can avoid the top level event when it isn't needed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When users register the threshold event for the top level image with an
explicit index (e.g. vda[3]) they are clearly expecting the index in the
event.
This flag will help avoiding emission of the second event without the
index when the client clearly requested one with the index.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When qos is set or delete, we have to check if the port is an ovs managed
port. If true, call the virNetDevOpenvswitchInterfaceSetQos function when qos
is set, and call the virNetDevOpenvswitchInterfaceClearQos function when
the interface is to be destroyed.
Signed-off-by: Jinsheng Zhang <zhangjl02@inspur.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Return 0 directly if the port is ovs managed. When the ovs port is set
noqueue, qos config on this port will not work.
Signed-off-by: Jinsheng Zhang <zhangjl02@inspur.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Introduce qos setting and cleaning method. Use ovs command to set qos
parameters on specific interface of qemu virtual machine.
When an ovs port is created, we add 'ifname' to external-ids. When setting
qos on an ovs port, query its qos and queue. If found, change qos on queried
queue and qos, otherwise create new queue and qos. When cleaning qos, query
and clean queues and qos in ovs table record by 'ifname' and 'vmid'.
Signed-off-by: Jinsheng Zhang <zhangjl02@inspur.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Tell whether a port definition is an ovs managed virtual port
Signed-off-by: Jinsheng Zhang <zhangjl02@inspur.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When seeing a guest with a sound device, and no audio backend, we
automatically add an audio backend XML element based on the historical
QEMU driver behaviour. Unfortunately when we live migrate back to an
old libvirt, it may not understand the audio driver type we configured.
We thus need to strip the default audio backend when migrating.
Fixes https://gitlab.com/libvirt/libvirt/-/issues/179
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
There might be misunderstanding [1] when libvirt permits domain
redefinition and if it's a valid case at all.
1. b973d7c4b4/plugins/modules/virt.py (L533)
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It all started as a simple bug: trying to move domain memory
between NUMA nodes (e.g. via virsh numatune) did not work. I've
traced the problem to qemuProcessHook() because that's where we
decide whether to rely on CGroups or use numactl APIs to satisfy
<numatune/>. The problem was that virCgroupControllerAvailable()
was telling us that cpuset controller is unavailable. This is
CGroupsV2, and pretty weird because CGroupsV2 definitely do
support cpuset controller and I had them mounted in a standard
way. What I found out (with Pavel's help) was that
virCgroupNewSelf() was looking into the following path to detect
supported controllers:
/sys/fs/cgroup/system.slice/cgroup.controllers
However, if there's no other VM running then the system.slice
only has 'memory' and 'pids' controllers. Therefore, we saw
'cpuset' as not available. The fix is to look at the top most
path, which has the full set of controllers:
/sys/fs/cgroup/cgroup.controllers
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1976690
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When the qemu or libxl driver is configured to use lockd and
file_lockspace_dir is set, virtlockd emits an error when libvirtd
is retarted
May 25 15:44:31 virt81 virtlockd[7723]: Requested operation is not
valid: Lockspace for path /data/libvirtd/lockspace already exists
There is really no need to fail when the lockspace already exists,
paricularly since the user is expected to create the lockspace
specified in file_lockspace_dir. Failure to do so will prevent
starting any domains
virsh start test
error: Failed to start domain 'test'
error: Unable to open/create resource /data/libvirtd/lockspace/de22c4bf931e7c48b49e8ca64b477d44e78a51543e534df488b05ccd08ec5caa: No such file or directory
Also, virLockManagerLockDaemonSetupLockspace already has logic to ignore
the error. Since callers are not interested in the error, change
virtlockd to not report or return an error when the specified lockspace
already exists.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
If guest is configured to use memfd then the function that build
memory-backend-* part of command line will put
memory-backend-memfd, always. Even for NVDIMMs. This is not
correct, because NVDIMMs need a backing path (usually to a real
host NVDIMM device). Therefore, regardless of memfd being
requested, we have to stick with memory-backend-file.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
When constructing guest name for machined we have to be very
cautious as machined expects a name that's basically a valid URI.
Therefore, if there's a dot it has to be followed by a letter or
a number. And if there's a sequence of two or more dashes they
should be joined into a single dash. These rules are implemented
in virDomainMachineNameAppendValid(). There's the @skip variable
which is supposed to track whether it is safe to append a dot or
a dash into name. However, the variable is set to false (meaning
it is safe to append a dot or a dash) even if the current
character we are processing is not in the set of allowed
characters (and thus skipped over).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1948433
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
In 'virResctrlAllocUpdateMask', mask is updated only if 'previous mask' is NULL.
By default, the bitmask for a cache resource for a VM is initialized with
'default-resctrl-group' bitmask. So the 'previous mask' would not be NULL and
mask won't get updated if cachetune is configured for a VM. This causes libvirt
to use same bitmask as 'default-resctrl-group' bitmask for a cache resource for
a VM. This patch fixes the issue.
Fixes: d8a354954aba9cd45ab0317915a0a2be27c04767
Signed-off-by: Vinayak Kale <vkale@nvidia.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Bounding set capabilities were introduced in kernel commit of
v2.6.25-rc1~912. I guess it is safe to assume that all Linux
hosts we ran on have at least that version or newer.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When trying to destroy a node device that is not active, we end up with
a confusing error message:
# nodedev-destroy mdev_88a6b868_46bd_4015_8e5b_26107f82da38
error: Failed to destroy node device 'mdev_88a6b868_46bd_4015_8e5b_26107f82da38'
error: failed to access '/sys/bus/mdev/devices/88a6b868-46bd-4015-8e5b-26107f82da38/iommu_group': No such file or directory
With this patch, the error is more clear:
# nodedev-destroy mdev_88a6b868_46bd_4015_8e5b_26107f82da38
error: Failed to destroy node device 'mdev_88a6b868_46bd_4015_8e5b_26107f82da38'
error: Requested operation is not valid: Device 'mdev_88a6b868_46bd_4015_8e5b_26107f82da38' is not active
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Currently, we have three different types of mdevctl errors:
1. the command cannot be constructed ecause of unsatisfied
preconditions
2. the command cannot be executed due to some error
3. the command is executed, but returns an error status
These different failures are handled differently. Some cases set an
error and return and error status, and some return a error message but
do not set an error.
This means that the caller has to check both whether the return value is
negative and whether the errmsg parameter is non-NULL before deciding
whether to report the error or not. The situation is further complicated
by the fact that there are occasional instances where mdevctl exits with
an error status but does not print an error message. This results in
errmsg being an empty string "" (i.e. non-NULL).
Simplify the situation by ensuring that virReportError() is called for
all error conditions rather than returning an error message back to the
calling function.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
This macro will be utilized in the following patch. Since mdevctl
commands can fail with or without an error message, this macro makes it
easy to print a fallback error in the case that the error message is not
set.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
In commit 68580a51, I removed the checks for NULL cmd variables because
virCommandRun() already handles the case where it is called with a NULL
cmd. Unfortunately, it handles this case by raising a generic error
which is both unhelpful and overwrites our existing error message. So
for example, when I attempt to create a mediated device with an invalid
parent, I get the following output:
virsh # nodedev-create mdev-test.xml
error: Failed to create node device from mdev-test.xml
error: internal error: invalid use of command API
With this patch, I now get a useful error message again:
virsh # nodedev-create mdev-test.xml
error: Failed to create node device from mdev-test.xml
error: internal error: unable to find parent device 'pci_0000_00_03_0'
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
At the point where the error message is emitted, the field def->name is
still set to "new device", so the error message becomes:
Unable to start mediated device 'new device': ...
Since the name doesn't contain anything useful, just omit it from the
error message altogether.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Due to a rather unfortunate misunderstanding, we were parsing the list
of defined devices from mdevctl incorrectly. Since my primary
development machine only has a single device capable of mdevs, I
apparently neglected to test multiple parent devices and made some
assumptions based on reading the mdevctl code. These assumptions turned
out to be incorrect, so the parsing failed when devices from more than
one parent device were returned.
The details: mdevctl returns an array of objects representing the
defined devices. But instead of an array of multiple objects (with each
object representing a parent device), the array always contains only a
single object. That object has a separate property for each parent
device.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It is possible to define/edit(in shut off state) a domain XML with
same hostdev device repeated more than once, as shown below. This
behavior is not expected. So, this patch fixes it.
vser1:
<domain type='kvm'>
[...]
<devices>
[...]
<hostdev mode='subsystem' type='mdev' managed='no' model='vfio-ccw'>
<source>
<address uuid='8e782fea-e5f4-45fa-a0f9-024cf66e5009'/>
</source>
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0005'/>
</hostdev>
<hostdev mode='subsystem' type='mdev' managed='no' model='vfio-ccw'>
<source>
<address uuid='8e782fea-e5f4-45fa-a0f9-024cf66e5009'/>
</source>
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0006'/>
</hostdev>
[...]
</devices>
</domain>
$ virsh define vser1
Domain 'vser1' defined from vser1
Signed-off-by: Shalini Chellathurai Saroja <shalini@linux.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We already reject TPM 1.2 in a number of scenarios; let's add
ARM virt guests to the list.
https://bugzilla.redhat.com/show_bug.cgi?id=1970310
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Tested-by: Liu Yiding <liuyd.fnst@fujitsu.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The TPM 2.0 specification predates ARM virtualization, and so
implementing TPM 1.2 support on ARM was not considered a useful
endeavor.
This is technically a breaking change, but TPM support on ARM was
only introduced fairly recently (libvirt 7.1.0) and the previous
default resulted in non working TPM devices; anyone who has a
working configuration is not going to be affected.
https://bugzilla.redhat.com/show_bug.cgi?id=1970310
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Tested-by: Liu Yiding <liuyd.fnst@fujitsu.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
A process can access a file if the set of MCS categories
for the file is equal-to *or* a subset-of, the set of
MCS categories for the process.
If there are two VMs:
a) svirt_t:s0:c117
b) svirt_t:s0:c117,c720
Then VM (b) is able to access files labelled for VM (a).
IOW, we must discard case where the categories are equal
because that is a subset of many other valid category pairs.
Fixes: https://gitlab.com/libvirt/libvirt/-/issues/153
CVE-2021-3631
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
There are few cases where we execute a virCommand with all caps
cleared (virCommandClearCaps()). For instance
dnsmasqCapsRefreshInternal() does just that. This means, that
after fork() and before exec() the virSetUIDGIDWithCaps() is
called. But since the caller did not want to change anything,
just drop capabilities, these are the values of arguments:
virSetUIDGIDWithCaps (uid=-1, gid=-1, groups=0x0, ngroups=0,
capBits=0, clearExistingCaps=true)
This means that indeed all capabilities will be dropped,
including CAP_SETPCAP. But this capability controls whether
capabilities can be set, IOW whether capng_apply() succeeds.
There are two calls of capng_apply() in the function. The
CAP_SETPCAP is dropped after the first call and thus the other
call (capng_apply(CAPNG_SELECT_BOUNDS);) fails.
The solution is to keep the capability for as long as needed
(just like CAP_SETGID and CAP_SETUID) and drop it only at the
very end (just like CAP_SETGID and CAP_SETUID).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1949388
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
I noticed the following denial when running confined VMs with the QEMU
driver
type=AVC msg=audit(1623865089.263:865): apparmor="DENIED" operation="open" \
profile="virt-aa-helper" name="/etc/ssl/openssl.cnf" pid=12503 \
comm="virt-aa-helper" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
Allow reading the file by including the openssl abstraction in the
virt-aa-helper profile.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
I noticed the following denial messages from apparmor in audit.log when
starting confined VMs via the QEMU driver
type=AVC msg=audit(1623864006.370:837): apparmor="DENIED" operation="open" \
profile="virt-aa-helper" name="/etc/libnl/classid" pid=11265 \
comm="virt-aa-helper" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
type=AVC msg=audit(1623864006.582:849): apparmor="DENIED" operation="open" \
profile="libvirt-0ca2720d-6cff-48bb-86c2-61ab9a79b6e9" \
name="/etc/libnl/classid" pid=11270 comm="qemu-system-x86" \
requested_mask="r" denied_mask="r" fsuid=107 ouid=0
It is possible for site admins to assign names to classids in this file,
which are then used by all libnl tools, possibly those used by libvirt.
To be on the safe side, allow read access to the file in the virt-aa-helper
profile and the libvirt-qemu abstraction.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
The host key fingerprint for SSH servers is used in a scenario where
cryptographic strength is important. We should thus be defaulting to
use of SHA256 where available. We only need SHA1 for Ubuntu 18.04
which does not have libssh >= 0.8.1
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Iterating over all child elements of a node does not require xpath.
By doing away with xpath for this code, the code can be simplified.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Cleanup to follow. This removes the last re-use of `nodes` in this function,
eliminating two VIR_FREEs.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Iterating over all child elements of a node does not require xpath.
By doing away with xpath for this code, the code can be simplified.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
`feature` is always one of the values listed in the switch,
ensured by `virDomainKVMTypeFromString` above.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Iterating over all child elements of a node does not require xpath.
By doing away with xpath for this code, the code can be simplified.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>