All backing chain members which were auto-added by image detection,
including the terminating element, should have the 'detected' property
set to true. This is needed to properly strip the detected elements in
some cases, e.g. for the status XML where we could treat some images as
manually terminated even when it was auto-detected.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rewrap argument definition of qemuDomainSaveInternal and align argument
in the invocation of the aforementioned function in
qemuDomainManagedSaveHelper.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
After the previous commit we no longer require that logind is actually
running, it merely has to be activatable.
https://gitlab.com/libvirt/libvirt/-/issues/489
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Historically we wanted to check if logind was actually running, not
merely activatable, because on systems where systemd is installed,
but the OS is booted into non-systemd init, we want to fallback to
pm-utils.
Requiring logind to be running, however, forces us to serialize libvirtd
startup on startup of logind which is undesirable. We can relax this
dependancy if we check whether systemd itself is running, which implies
that logind will activated when we need it.
https://gitlab.com/libvirt/libvirt/-/issues/489
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Since systemd 240, all services get an open file hard limit of
500k, and a soft limit of 1024. This limit means apps are safe
to use select() by default which is limited to 1024 FDs. Apps
which don't use select() are expected to simply set their soft
limit to match the hard limit during startup.
With our current unit file settings we've been effectively
reducing the max open files we have on most modern systems.
https://gitlab.com/libvirt/libvirt/-/issues/489
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
None of our daemons use select(), so it is safe to raise the max file
limit to its maximum on startup.
https://gitlab.com/libvirt/libvirt/-/issues/489
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Historically the max files limit for processes has always been 1024,
because going beyond this is incompatible with the select() function.
None the less most apps these days will use poll() so should not be
limited in this way.
Since systemd >= 240, the hard limit will be 500k, while the soft
limit remains at 1k. Applications which don't use select() should
raise their soft limit to match the hard limit during their startup.
This function provides a convenient helper to do this limit raising.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
These wrappers added no semantic difference over calling the system
function directly.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The unit files both have After=network.target, and this in turn implies
After=network-pre.target. Both iptables.service & ip6tables.service have
Before=network-pre.target since Fedora >= 35 and RHEL >= 8.4.
When we first added the deps on ip[6]tables.service in
commit 0756415f147dda15a417bd79eef9a62027d176e6
Author: Laine Stump <laine@redhat.com>
Date: Fri May 1 00:05:50 2020 -0400
systemd: start libvirtd after firewalld/iptables services
the Before=network-pre.target didn't exist, but we can rely on it now
given our supported platforms matrix.
The firewalld.service has similarly has a Before=network-pre.target,
even when we took that commit above, so this dep was in face never
actually needed. This answers the question posed in that above commit
message about firewalld ordering.
https://gitlab.com/libvirt/libvirt/-/issues/489
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
All services are ordered after local-fs.target unless they have set
DefaultDependencies=no, which we do not do.
https://gitlab.com/libvirt/libvirt/-/issues/489
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
A test case can be part of a test suite (just like we already
have 'syntax-check'). This then allows developers to run only a
subset of tests. For instance - when using valgrind test setup
(`meson test -C _build/ --setup valgrind`) it makes zero sense to
run syntax-check tests or other script based tests (e.g.
check-augeas-*, check-remote_protocol, etc.). What does makes
sense is to run compiled binaries.
Strictly speaking, reaching that goal is as trivial as annotating
only those compiled tests (declared in tests/meson.build) and
running them selectively:
meson test -C _build/ --setup valgrind --suite $TAG
But it may be also desirable to run test scripts separately.
Therefore, introduce two new tags: 'bin' for compiled tests, and
'script' for script based tests and annotate each test()
accordingly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The current virtStorageBackendZFSCheckPool checks for the existence of a
path under /dev/zvol/ to determine if the pool is active. ZFS does not
create a path under /dev/zvol/ if no ZFS volumes have been created under
a particular dataset, thus, empty ZFS storage pools are deactivated
whenever checkPool is called on them (as noted in referenced issue).
This commit changes virStorageBackendZFSCheckPool so that the 'zfs list'
command is used to explicitly check for the existence a dataset
specified by the pool's def->source.name.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/221
Signed-off-by: Matt Low <matt@mlow.ca>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Since commit 44a0f2f0, we now query mdevctl for transient (active) mdevs
in order to gather attributes for the mdev. Unfortunately, this commit
introduced a regression because nodeDeviceUpdateMediatedDevice() assumed
that all mdevs returned from mdevctl were actually persistent mdevs but
we were using it to update transient mdevs. Refactor the function so
that we can use it to update both persistent and transient mdevs.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
The virtio-gpu 'blob' support was insufficiently validated. Qemu
requires a memfd memory backing in order to use udmabuf and enable blob
support. Example error:
$ virsh start rhel9
error: Failed to start domain 'rhel9'
error: internal error: qemu unexpectedly closed the monitor: 2023-07-18T02:33:57.083178Z qemu-kvm: -device {"driver":"virtio-vga","id":"video0","max_outputs":1,"blob":true,"bus":"pcie.0","addr":"0x1"}: cannot enable blob resources without udmabuf
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Historically, the way to set PC speaker for a guest was to pass:
-soundhw pcspk
but as of QEMU commit v5.1.0-rc0~28^2~3 this is deprecated and we
should use:
-machine pcspk-audiodev=$id
instead. The old way was then removed in commit v7.1.0-rc0~99^2~3.
Now, ideally we would have a capability selecting whether we talk
to a QEMU that understands the new way or not. But it's not that
simple - the machine attribute is just an alias to the .audiodev=
attribute of 'isa-pcspk' object and both are created in
pc_machine_initfn() function, i.e. not then the PC_MACHINE() class
is initialized, but when it's instantiated. IOW, it's not possible
for us to query whether we're dealing with older or newer QEMU.
But given that the newer version is supported since v5.1.0 and the
minimal version we require is v4.2.0 (i.e. there are two releases
which don't understand the newer cmd line) and how frequently this
feature is (un-)used (the issue was reported after ~1 year since it
stopped working), I believe we can live without any capability and
just use the newer cmd line unconditionally.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/490
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Now that the QEMU_CAPS_USB_STORAGE_REMOVABLE capability is no
longer used we can stop querying it and retire it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Introduced in QEMU commit of v0.14.0-rc0~83^2~1 and not being
able to compile the .removable attribute of the "usb-storage"
object out, renders our corresponding capability
QEMU_CAPS_USB_STORAGE_REMOVABLE always set. Stop using it in
command generation / domain validation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
After previous commit, there's no functional difference between
real virRandomGenerateWWN() and the mocked version. Drop the mock
then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This brings the code closer to real implementation:
nodeDeviceCreateXML(). For the unique OUI, let's take the value
from tests/virrandommock.c: 100000.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Firstly, drop needless concatenation of two static strings.
Secondly, use proper (portable) formatter for uint64_t so that
typecast to ULL can be dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Commit be1b7d5b18 introduced parsing /proc/cpuinfo for "address size"
which is not including on S390 and therefore reports an internal error.
Lets remove the parsing on S390.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Add async-teardown to the features list in domain capabilities allowing
high level management to introspect the availability of the asynchronous
teardown feature.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Up until v2.11.0-rc2~19^2~3 QEMU used to require at least one
NUMA node to be configured when memory hotplug was enabled. After
that commit, QEMU automatically adds a NUMA node if none was
specified on the cmd line. Reflect this in domain XML, i.e.
explicitly add a NUMA node into our domain definition if needed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2216236
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Rather than directly executing mdevctl from the udev event thread when
we determine that we need to re-query, schedule the mdevctl thread to
run. This also helps to coalesce multiple back-to-back updates into a
single one when there are multiple updates in a row or at startup when a
host has a very large number of mdevs.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Factor out a new scheduleMdevctlUpdate() function so that we can re-use
it from other places. Now that other events can make it necessary to
re-query mdevctl for mdev updates, this function will be useful for
coalescing multiple updates in quick succession into a single mdevctl
query.
Also rename a couple functions. The names weren't very descriptive of
their behavior. For example, the old scheduleMdevctlHandler() function
didn't actually schedule anything, it just started a thread. So rename
it to free up the 'schedule' name for the above refactored function.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Update the optional mdev attributes by running an mdevctl update on a
new created nodedev object representing an mdev.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2143158
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
If a domain has NUMA configured, then all <memory/> devices
(except for 'virtio-pmem') need to have targetNode set. There are
two checks inside of qemuDomainDefValidateMemoryHotplugDevice()
for this: one inside of big switch() statement, which only checks
'dimm' and 'nvdimm' cases, and the other at the end of the
function that checks all models (except for 'virtio-pmem'). Let's
keep the latter and remove the former as the latter covers the
former too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
The libxl driver has basic support for VIR_MIGRATE_CHANGE_PROTECTION
by starting and stopping modify jobs in the begin/confirm and prepare/finish
phases of migration, but it doesn't advertise that support. This can result
in unterminated jobs because the migration logic skips phases of migration
when the VIR_MIGRATE_CHANGE_PROTECTION feature is absent. Ensure jobs are
terminated properly by advertising support for VIR_MIGRATE_CHANGE_PROTECTION.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
For unknown reasons, the libxl driver attempts to resume a domain in the
confirm phase when a migration operation has been canceled. This has shown
to be problematic when simulating scenarios that result in a canceled
migration. In all scenarios, the domain was in a running state when entering
libxlDomainMigrationSrcConfirm, causing the call to libxl_domain_resume to
fail. Making matters worse, the domain state is changed to paused when in
fact it's running. And finally, libxlDomainMigrationSrcConfirm incorrectly
returns an error.
Remove this incorrect logic from libxlDomainMigrationSrcConfirm. On a
canceled migration it's sufficient to resume the lock process that was
paused in the perform phase.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Our CI started to enable udev backend on FreeBSD. And while there
is udev on FreeBSD some parts of our code are highly Linux
specific, e.g. translating SCSI device type to string (from an
integer obtained from the sysfs). Obviously, this doesn't work
anywhere else. This is the reason why we need to include
scsi/scsi.h header file (which actually comes from the Linux
kernel source tree but for some reason glibc started to
distribute it, followed by musl).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Asynchronous teardown can be specified if the QEMU binary supports it by
adding in the domain XML
<features>
...
<async-teardown enabled='yes|no'/>
...
</features>
By default this new feature is disabled.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
QEMU capability is looking in query-command-line-options response for
...
{
"parameters": [
{
"name": "async-teardown",
"type": "boolean"
}
],
"option": "run-with"
}
...
allow to use the QEMU option -run-with async-teardown=on|off
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Allow //disk/target@removable for scsi disk devices, since QEMU has support
the removable attribute for scsi-hd device from v0.14.0[1].
[1]: 419e691f8e: scsi-disk: Allow overriding SCSI INQUIRY removable bit
Signed-off-by: Han Han <hhan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Do for all other profiles what we already do for the
virt-aa-helper one. In this case we limit the feature to AppArmor
3.x, as it was never implemented for 2.x.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
For AppArmor 3.x we can use 'include if exists', which frees us
from having to create a dummy override. For AppArmor 2.x we keep
things as they are to avoid introducing regressions.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Implement the standard AppArmor 3.x abstraction extension
approach.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
The subprofile can only work by including the abstraction shipped
in the passt package, which we can't assume is present, and
'include if exists' doesn't work well on 2.x.
No distro that's stuck on AppArmor 2.x is likely to be shipping
passt anyway.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Compared to profiles, we only need a single preprocessing step
here, as there is no variable substitution happening.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Perform an additional preprocessing step before the existing
variable substitution. This is the same approach that we already
use to customize systemd unit files based on whether the service
supports TCP connections.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
After v8.1.0-61-g030faee28d it is no longer necessary to make the
/proc/meminfo file nonseekable as our code that fills the file
with spoofed values can handle seeking just fine.
Previously, `free(1)` was okay with failed lseek(), but this was
ages ago and meanwhile the procps project moved to creating a
library and moved the file parsing code under an exported
function. In attempt to make the function callable multiple
times, it can lseek() multiple times and failure to do so is
fatal.
This reverts commit 766495508650bebd5f4ac23224ecd0a2ee2ca9eb
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/492
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Fix the syntax-check failures (which can be seen after
python3-flake8-import-order package is installed) with the help
of isort[1]:
289/316 libvirt:syntax-check / flake8 FAIL 5.24s exit status 2
[1]: https://pycqa.github.io/isort/
Signed-off-by: Han Han <hhan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
As it turns out, apparmor 2.x and 3.x behave differently or have differing
levels of support for local customizations of profiles and profile
abstractions. Additionally the apparmor 2.x tools do not cope well with
'include if exists'. Revert this commit until a more complete solution is
developed that works with old and new apparmor.
Reverts: 9b743ee19053db2fc3da8fba1e9cf81915c1e2f4
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
If VIR_ASYNC_JOB_NONE flag is present, job.current is equal
to NULL, which leads to SIGSEGV. Thus, this check should be
moved up.
Fixes: v8.0.0-427-gf304de0df6
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
In one of my previous commits I've introduced @logfd variable
that was supposed to hold FD of passt logfile. But I've forgot to
assign the qemuDomainOpenFile() retval to it.
Fixes: 8511b96a319836700b4829816cdae27c3630060d
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There are a few situations where passt itself is unable to create
a file because it runs under QEMU user (e.g. just like our
example from formatdomain.rst suggests: /var/log/passt.log). If
libvirtd runs with sufficient permissions (e.g. as root) it can
create the file and set seclabels on it so that passt can then
open it.
Ideally, we would just pass pre-opened FD, but this wasn't viewed
as secure enough [1]. So lets just create the file and set
seclabels.
For the case when both libvirtd and passt have the same
permissions, well then we fail before even needing to fork() and
exec().
1: https://archives.passt.top/passt-dev/20230606225836.63aecebe@elisabeth/
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2209191
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
New storage types are not implemented in generators for -drive and the
xen config. Explicitly reject them in case of a programming error.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If there are no parameters, there is nothing to validate.
If params == NULL, memcpy below results in memcpy(sorted, NULL, 0),
which is UB.
Found by UBSAN. Example of this codepath: virDomainBlockCopy()
(where nparams == 0 is valid) -> qemuDomainBlockCopy()
Signed-off-by: Oleg Vasilev <oleg.vasilev@virtuozzo.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
When detaching a device, the following race condition may happen:
Once qemuDomainSignalDeviceRemoval() marks the device for
removal, it returns true, which means it is the caller
that marked the device for removal is going to remove the
device from domain definition.
But qemuDomainWaitForDeviceRemoval() may still receive
timeout from virDomainObjWaitUntil() which is implemented
by pthread_cond_timedwait() due to an unavoidable race
between the expiration of the timeout and the predicate
state(priv->unplug.alias) change.
And then qemuDomainWaitForDeviceRemoval() will return 0,
thus the caller will not remove the device from domain
definition.
In this situation, the device is still present in the domain
definition but doesn't exist in qemu anymore. Worse, there is
no way to remove it from the domain definition.
Solution is to recheck the value of priv->unplug.alias to
determine who is going to remove the device from domain
definition.
Signed-off-by: zuo boqun <zuoboqun@baidu.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Qemu 8.1.0 will add discard_no_unref option for qcow2 images.
When this option is enabled (default=false), then it will no longer
unreference clusters when guest does a discard, but it will just free
the blocks (useful for incremental backups for example) and pass the
discard to the lower layer.
This was implemented to avoid fragmentation within the qcow2 image.
Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>