Inside daemonStreamHandleWrite on stream completion (status=OK) we
reuse msg object to send confirmation.
Only after that, msg is poped from the queue and checked for continue.
By that time, msg might've already been processed for the confirmation
and freed.
Signed-off-by: Oleg Vasilev <oleg.vasilev@virtuozzo.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Helped to debug next patch use-after-free.
Signed-off-by: Oleg Vasilev <oleg.vasilev@virtuozzo.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If a per-domain SWTPM state directory exists but is empty our
code still considers it a valid state and skips running
'swtpm_setup' (handled in qemuTPMEmulatorRunSetup()).
While we should not try to inspect individual files created by
swtpm, we can still consider empty folder as non-existent state.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/320
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There might be cases where we want to know whether given
directory is empty or not. Introduce a helper for that:
virDirIsEmpty().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Queues is supported by virtio bus, including virtio-blk and
vhost-user-blk.
Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that we don't use it for probing at all we can remove all the
corresponding monitor code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The capability code now probes the presence of commands from the QMP
schema instead of using 'query-commands'. Don't call the command and
adjust the '.replies' files.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Move the probing code to extract the data from the QMP schema rather
than invoking 'query-commands'. This patch doesn't yet remove the actual
invocation of 'query-commands', just moves the actual probing.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
nodeDeviceUpdateMediatedDevices invokes virMdevctlListDefined and
virMdevctlListActive both of which were passed the same 'errmsg' buffer.
Since virCommandSetErrorBuffer() always allocates the error buffer one
of them was leaked.
Fix it by populating the 'errmsg' buffer only on failure of
virMdevctlListActive|Defined which invoke the command.
Add a comment to nodeDeviceGetMdevctlListCommand reminding how
virCommandSetErrorBuffer() works.
Fixes: 44a0f2f0c8f
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
The support for configuring the 'wwn' of a IDE disk was added in qemu
commit 95ebda85e09 (v1.0-1869-g95ebda85e0) and can't be compiled
out.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The support for configuring the 'wwn' of a SCSI disk was added in qemu
commit 27395add759ff4caeb0 (v1.0-3326-g27395add75) and can't be compiled
out.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
CVE-2023-3750
'virStoragePoolObjListSearch' explicitly documents that it's returning
a pointer to a locked and ref'd pool that maches the lookup function.
This was not the case as in commit 0c4b391e2a9 (released in
libvirt-8.3.0) the code was accidentally converted to use 'VIR_LOCK_GUARD'
which auto-unlocked it when leaving the scope, even when the code was
originally "leaking" the lock.
Revert the corresponding conversion and add a comment that this function
is intentionally leaking a locked object.
Fixes: 0c4b391e2a9
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2221851
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All backing chain members which were auto-added by image detection,
including the terminating element, should have the 'detected' property
set to true. This is needed to properly strip the detected elements in
some cases, e.g. for the status XML where we could treat some images as
manually terminated even when it was auto-detected.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rewrap argument definition of qemuDomainSaveInternal and align argument
in the invocation of the aforementioned function in
qemuDomainManagedSaveHelper.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
After the previous commit we no longer require that logind is actually
running, it merely has to be activatable.
https://gitlab.com/libvirt/libvirt/-/issues/489
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Historically we wanted to check if logind was actually running, not
merely activatable, because on systems where systemd is installed,
but the OS is booted into non-systemd init, we want to fallback to
pm-utils.
Requiring logind to be running, however, forces us to serialize libvirtd
startup on startup of logind which is undesirable. We can relax this
dependancy if we check whether systemd itself is running, which implies
that logind will activated when we need it.
https://gitlab.com/libvirt/libvirt/-/issues/489
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Since systemd 240, all services get an open file hard limit of
500k, and a soft limit of 1024. This limit means apps are safe
to use select() by default which is limited to 1024 FDs. Apps
which don't use select() are expected to simply set their soft
limit to match the hard limit during startup.
With our current unit file settings we've been effectively
reducing the max open files we have on most modern systems.
https://gitlab.com/libvirt/libvirt/-/issues/489
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
None of our daemons use select(), so it is safe to raise the max file
limit to its maximum on startup.
https://gitlab.com/libvirt/libvirt/-/issues/489
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Historically the max files limit for processes has always been 1024,
because going beyond this is incompatible with the select() function.
None the less most apps these days will use poll() so should not be
limited in this way.
Since systemd >= 240, the hard limit will be 500k, while the soft
limit remains at 1k. Applications which don't use select() should
raise their soft limit to match the hard limit during their startup.
This function provides a convenient helper to do this limit raising.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
These wrappers added no semantic difference over calling the system
function directly.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The unit files both have After=network.target, and this in turn implies
After=network-pre.target. Both iptables.service & ip6tables.service have
Before=network-pre.target since Fedora >= 35 and RHEL >= 8.4.
When we first added the deps on ip[6]tables.service in
commit 0756415f147dda15a417bd79eef9a62027d176e6
Author: Laine Stump <laine@redhat.com>
Date: Fri May 1 00:05:50 2020 -0400
systemd: start libvirtd after firewalld/iptables services
the Before=network-pre.target didn't exist, but we can rely on it now
given our supported platforms matrix.
The firewalld.service has similarly has a Before=network-pre.target,
even when we took that commit above, so this dep was in face never
actually needed. This answers the question posed in that above commit
message about firewalld ordering.
https://gitlab.com/libvirt/libvirt/-/issues/489
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
All services are ordered after local-fs.target unless they have set
DefaultDependencies=no, which we do not do.
https://gitlab.com/libvirt/libvirt/-/issues/489
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
A test case can be part of a test suite (just like we already
have 'syntax-check'). This then allows developers to run only a
subset of tests. For instance - when using valgrind test setup
(`meson test -C _build/ --setup valgrind`) it makes zero sense to
run syntax-check tests or other script based tests (e.g.
check-augeas-*, check-remote_protocol, etc.). What does makes
sense is to run compiled binaries.
Strictly speaking, reaching that goal is as trivial as annotating
only those compiled tests (declared in tests/meson.build) and
running them selectively:
meson test -C _build/ --setup valgrind --suite $TAG
But it may be also desirable to run test scripts separately.
Therefore, introduce two new tags: 'bin' for compiled tests, and
'script' for script based tests and annotate each test()
accordingly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The current virtStorageBackendZFSCheckPool checks for the existence of a
path under /dev/zvol/ to determine if the pool is active. ZFS does not
create a path under /dev/zvol/ if no ZFS volumes have been created under
a particular dataset, thus, empty ZFS storage pools are deactivated
whenever checkPool is called on them (as noted in referenced issue).
This commit changes virStorageBackendZFSCheckPool so that the 'zfs list'
command is used to explicitly check for the existence a dataset
specified by the pool's def->source.name.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/221
Signed-off-by: Matt Low <matt@mlow.ca>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Since commit 44a0f2f0, we now query mdevctl for transient (active) mdevs
in order to gather attributes for the mdev. Unfortunately, this commit
introduced a regression because nodeDeviceUpdateMediatedDevice() assumed
that all mdevs returned from mdevctl were actually persistent mdevs but
we were using it to update transient mdevs. Refactor the function so
that we can use it to update both persistent and transient mdevs.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
The virtio-gpu 'blob' support was insufficiently validated. Qemu
requires a memfd memory backing in order to use udmabuf and enable blob
support. Example error:
$ virsh start rhel9
error: Failed to start domain 'rhel9'
error: internal error: qemu unexpectedly closed the monitor: 2023-07-18T02:33:57.083178Z qemu-kvm: -device {"driver":"virtio-vga","id":"video0","max_outputs":1,"blob":true,"bus":"pcie.0","addr":"0x1"}: cannot enable blob resources without udmabuf
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Historically, the way to set PC speaker for a guest was to pass:
-soundhw pcspk
but as of QEMU commit v5.1.0-rc0~28^2~3 this is deprecated and we
should use:
-machine pcspk-audiodev=$id
instead. The old way was then removed in commit v7.1.0-rc0~99^2~3.
Now, ideally we would have a capability selecting whether we talk
to a QEMU that understands the new way or not. But it's not that
simple - the machine attribute is just an alias to the .audiodev=
attribute of 'isa-pcspk' object and both are created in
pc_machine_initfn() function, i.e. not then the PC_MACHINE() class
is initialized, but when it's instantiated. IOW, it's not possible
for us to query whether we're dealing with older or newer QEMU.
But given that the newer version is supported since v5.1.0 and the
minimal version we require is v4.2.0 (i.e. there are two releases
which don't understand the newer cmd line) and how frequently this
feature is (un-)used (the issue was reported after ~1 year since it
stopped working), I believe we can live without any capability and
just use the newer cmd line unconditionally.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/490
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Now that the QEMU_CAPS_USB_STORAGE_REMOVABLE capability is no
longer used we can stop querying it and retire it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Introduced in QEMU commit of v0.14.0-rc0~83^2~1 and not being
able to compile the .removable attribute of the "usb-storage"
object out, renders our corresponding capability
QEMU_CAPS_USB_STORAGE_REMOVABLE always set. Stop using it in
command generation / domain validation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
After previous commit, there's no functional difference between
real virRandomGenerateWWN() and the mocked version. Drop the mock
then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This brings the code closer to real implementation:
nodeDeviceCreateXML(). For the unique OUI, let's take the value
from tests/virrandommock.c: 100000.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Firstly, drop needless concatenation of two static strings.
Secondly, use proper (portable) formatter for uint64_t so that
typecast to ULL can be dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Commit be1b7d5b18 introduced parsing /proc/cpuinfo for "address size"
which is not including on S390 and therefore reports an internal error.
Lets remove the parsing on S390.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Add async-teardown to the features list in domain capabilities allowing
high level management to introspect the availability of the asynchronous
teardown feature.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Up until v2.11.0-rc2~19^2~3 QEMU used to require at least one
NUMA node to be configured when memory hotplug was enabled. After
that commit, QEMU automatically adds a NUMA node if none was
specified on the cmd line. Reflect this in domain XML, i.e.
explicitly add a NUMA node into our domain definition if needed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2216236
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Rather than directly executing mdevctl from the udev event thread when
we determine that we need to re-query, schedule the mdevctl thread to
run. This also helps to coalesce multiple back-to-back updates into a
single one when there are multiple updates in a row or at startup when a
host has a very large number of mdevs.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Factor out a new scheduleMdevctlUpdate() function so that we can re-use
it from other places. Now that other events can make it necessary to
re-query mdevctl for mdev updates, this function will be useful for
coalescing multiple updates in quick succession into a single mdevctl
query.
Also rename a couple functions. The names weren't very descriptive of
their behavior. For example, the old scheduleMdevctlHandler() function
didn't actually schedule anything, it just started a thread. So rename
it to free up the 'schedule' name for the above refactored function.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Update the optional mdev attributes by running an mdevctl update on a
new created nodedev object representing an mdev.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2143158
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
If a domain has NUMA configured, then all <memory/> devices
(except for 'virtio-pmem') need to have targetNode set. There are
two checks inside of qemuDomainDefValidateMemoryHotplugDevice()
for this: one inside of big switch() statement, which only checks
'dimm' and 'nvdimm' cases, and the other at the end of the
function that checks all models (except for 'virtio-pmem'). Let's
keep the latter and remove the former as the latter covers the
former too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
The libxl driver has basic support for VIR_MIGRATE_CHANGE_PROTECTION
by starting and stopping modify jobs in the begin/confirm and prepare/finish
phases of migration, but it doesn't advertise that support. This can result
in unterminated jobs because the migration logic skips phases of migration
when the VIR_MIGRATE_CHANGE_PROTECTION feature is absent. Ensure jobs are
terminated properly by advertising support for VIR_MIGRATE_CHANGE_PROTECTION.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
For unknown reasons, the libxl driver attempts to resume a domain in the
confirm phase when a migration operation has been canceled. This has shown
to be problematic when simulating scenarios that result in a canceled
migration. In all scenarios, the domain was in a running state when entering
libxlDomainMigrationSrcConfirm, causing the call to libxl_domain_resume to
fail. Making matters worse, the domain state is changed to paused when in
fact it's running. And finally, libxlDomainMigrationSrcConfirm incorrectly
returns an error.
Remove this incorrect logic from libxlDomainMigrationSrcConfirm. On a
canceled migration it's sufficient to resume the lock process that was
paused in the perform phase.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Our CI started to enable udev backend on FreeBSD. And while there
is udev on FreeBSD some parts of our code are highly Linux
specific, e.g. translating SCSI device type to string (from an
integer obtained from the sysfs). Obviously, this doesn't work
anywhere else. This is the reason why we need to include
scsi/scsi.h header file (which actually comes from the Linux
kernel source tree but for some reason glibc started to
distribute it, followed by musl).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Asynchronous teardown can be specified if the QEMU binary supports it by
adding in the domain XML
<features>
...
<async-teardown enabled='yes|no'/>
...
</features>
By default this new feature is disabled.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
QEMU capability is looking in query-command-line-options response for
...
{
"parameters": [
{
"name": "async-teardown",
"type": "boolean"
}
],
"option": "run-with"
}
...
allow to use the QEMU option -run-with async-teardown=on|off
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Allow //disk/target@removable for scsi disk devices, since QEMU has support
the removable attribute for scsi-hd device from v0.14.0[1].
[1]: 419e691f8e: scsi-disk: Allow overriding SCSI INQUIRY removable bit
Signed-off-by: Han Han <hhan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>