I've came across an aarch64 system which supports hugepages up to
16GiB of size. However, I was unable to allocate them using
virsh allocpages. This is because cmdAllocpages() uses
vshCommandOptScaledInt(), which scales passed value into bytes,
but since the virNodeAllocPages() expects size in KiB the
variable holding bytes is then divided by 1024. However, the
limit for the biggest value passed to vshCommandOptScaledInt() is
UINT_MAX which is now obviously wrong, as it needs to be UINT_MAX
* 1024.
The same bug is in completer. But here, let's use ULLONG_MAX so
that we don't have to care about it anymore.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
While commit a5e659f0 removed the restriction against multiple queues
for the vdpa net device, there were some missing pieces. Configuring a
device statically and then starting the domain worked as expected, but
hotplugging a device didn't have the expected multiqueue support
enabled. Add the missing bits.
Consider the following device xml:
<interface type="vdpa">
<mac address="00:11:22:33:44:03" />
<source dev="/dev/vhost-vdpa-0" />
<model type="virtio" />
<driver queues='2' />
</interface>
Without this patch, hotplugging the above XML description resulted in
the following:
{"execute":"netdev_add","arguments":{"type":"vhost-vdpa","vhostdev":"/dev/fdset/0","id":"hostnet1"},"id":"libvirt-392"}
{"execute":"device_add","arguments":{"driver":"virtio-net-pci","netdev":"hostnet1","id":"net1","mac":"00:11:22:33:44:03","bus":"pci.5","addr":"0x0"},"id":"libvirt-393"}
With the patch, hotplugging results in the following:
{"execute":"netdev_add","arguments":{"type":"vhost-vdpa","vhostdev":"/dev/fdset/0","queues":2,"id":"hostnet1"},"id":"libvirt-392"}
{"execute":"device_add","arguments":{"driver":"virtio-net-pci","mq":true,"vectors":6,"netdev":"hostnet1","id":"net1","mac":"00:11:22:33:44:03","bus":"pci.5","addr":"0x0"},"id":"libvirt-393"}
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2024406
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In 0895a0e, it was noted that the "sockets" value in the topology
section of capabilities reflects not the number of sockets per NUMA
node, not the total number.
Unfortunately, the fix was applied to the wrong place: the domain XML
format documentation, not that for the capabilities output. And, in
fact, the domain XML interprets "sockets" as the total number, not a
per-node value.
Back out this change in favour of a note in the capabilities
documentation instead.
Fixes: 0895a0e75d13874254218e16dc66dcad673671d3
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.
Translation: libvirt/libvirt
Translate-URL: https://translate.fedoraproject.org/projects/libvirt/libvirt/
Co-authored-by: Weblate <noreply@weblate.org>
Signed-off-by: Fedora Weblate Translation <i18n@lists.fedoraproject.org>
The callback ID can be zero, not necessarily positive; correct the
comment to reflect this.
Signed-off-by: John Levon <levon@movementarian.org>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When changing the size of pipe that virFileWrapperFdNew() creates
we start at 1MiB and if that fails because it's above the system
wide limit we get EPERM and continue with half of the size.
However, we might get another error in which case we should
report proper system error and return failure from
virFileWrapperFdNew().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Apply the user-requested changes to the device definition as requested
by the <qemu:deviceOverride> element from the custom qemu XML namespace.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/287
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The definition object will be later used to access the qemu namespace
definition used to override device properties.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Taint the domain object when the user requests custom device properties.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Upcoming patches will add possibility to override configuration of a
device with custom properties as a more versatile replacement to using
QEMU's '-set' parameter, which doesn't work when we use JSON to
instantiate devices.
Describe the XML used for the override as well as expectations of
upstream support in case something breaks.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
currently the only user of virFileWrapperFdNew is the qemu driver;
virsh save is very slow with a default pipe size.
This change improves throughput by ~400% on fast nvme or ramdisk.
Best value currently measured is 1MB, which happens to be also
the kernel default for the pipe-max-size.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When vTPM is secured via virSecret libvirt passes the secret
value via an FD when swtpm is started (arguments --key and
--migration-key). The writing of the secret into the FDs is
handled via virCommand, specifically qemu_tpm calls
virCommandSetSendBuffer()) and then virCommandRunAsync() spawns a
thread to handle writing into the FD via
virCommandDoAsyncIOHelper. But the thread is not created unless
VIR_EXEC_ASYNC_IO flag is set, which it isn't. In order to fix
it, virCommandDoAsyncIO() must be called.
The credit goes to Marc-André Lureau
<marcandre.lureau@redhat.com> who has done all the debugging and
proposed fix in the bugzilla.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2064115
Fixes: a9c500d2b50c5c041a1bb6ae9724402cf1cec8fe
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This reverts commit 06c960e477de4561c7ba956f82994fa120226397.
Turns out, this feature is not needed and QEMU will fix TSC
without any intervention from outside.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>P
This reverts commit 150540394ddaa515f6857616a2bcf792748f162c.
Turns out, this feature is not needed and QEMU will fix TSC
without any intervention from outside.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>P
QEMU 7.0.0 adds a new property tsc-clear-on-reset to x86 CPU, corresponding
to Libvirt's <tsc on_reboot="clear"/> element. Plumb it in the validation,
command line handling and tests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Some versions of Windows hang on reboot if their TSC value is greater
than 2^54. The workaround is to reset the TSC to a small value. Add
to the domain configuration an attribute for this. It can be used
by QEMU and in principle also by ESXi, which has a property called
monitor_control.enable_softResetClearTSC as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Make sure that all tests are run after the helpers and mocks are
(re)built. This enables for example using "meson test" as the
command line passed to "git bisect run".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
libparted_dep is not used if -Dstorage_disk=disabled. Do not
bother looking for this library if the disk storage backend was
not requested.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
rbd_dep is not used if -Dstorage_rbd=disabled. Do not bother looking for
the libraries that compose it if the rbd storage backend was not requested.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It makes sense to have these in the same file as the definitions
of enums.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
These enums are essentially the same and always sorted in the
same order in every hypervisor with jobs. They can be generalized
by using the qemu enums as the main ones as they are the most
extensive.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
I think the code looks cleaner without else branches.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Let's generate prealloc-threads property onto the cmd line if
domain configuration requests so.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Only fairly new QEMUs are capable of user provided number of
preallocation threads. Validate this assumption.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The prealloc-threads is property of memory-backend class which is
parent to the other three classes memory-backend-{ram,file,memfd}.
Therefore the property is present for all, or none if QEMU is
older than v5.0.0-rc0~75^2~1^2~3 which introduced the property.
Anyway, the .reserve property is the same story, and we chose
memory-backend-file to detect it, so stick with our earlier
decision and use the same backend to detect this new property.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Since its v5.0.0 release QEMU is capable of specifying number of
threads used to allocate memory. It defaults to 1, which may be
too low for humongous guests with gigantic pages.
In general, on QEMU cmd line level it is possible to use
different number of threads per each memory-backend-* object, in
practical terms it's not useful. Therefore, use <memoryBacking/>
to set guest wide value and let all memory devices 'inherit' it,
silently. IOW, don't introduce per device knob because that would
only complicate things for a little or no benefit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
This brings in all the fixes made since April 2020.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
By default, stdout/stderr Avocado test log files do not have any file
extension which confuses GitLab's web UI to mangle the MIME type for
these and so the browser will never offer the option to open such file
from in a text editor rather than dowloading it.
Since GitLab sets a proper MIME for .txt and .log file extensions,
rename all Avocado log files without an extension to *.log . This pairs
nicely with the coredumpctl info file which we already name as
'coredumpctl.txt' because of this.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Some Red Hat-like distros have cores limited with a soft limit of 0
which means that neither a stack trace nor a core file will be
available. Since we want the stack trace we need to set the core limit
with systemd globally to unlimited/infinity.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Custom runners are private to a project, so naturally forks cannot run
any workloads on these. The integration test suite which requires
access to our custom runner is naturally disabled on forks and can be
enabled by setting LIBVIRT_CI_INTEGRATION=1.
The problem is that the current integration jobs definitions have tags
statically defined as 'redhat-vm-host'. If users are going to supply
their own private runners for their forks, they can define whatever
tags they want with it and so unless they add 'redhat-vm-host' to their
own runner's tags, the pipeline won't run.
To solve this, define the integration job tag using a variable. The
repo config will use the value defined in the job for the variable
while users can override the value easily on a project/pipeline level
thanks to GitLab's CI variable precedence [1].
[1] https://docs.gitlab.com/ee/ci/variables/#cicd-variable-precedence
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The updateLock is a R/W lock held by anything which needs to read or
modify the rules associated with an NWFilter.
APIs for defining/undefining NW filters rules hold a write lock on
updateLock.
APIs for creating/deleting NW filter bindings hold a read lock on
updateLock, which prevents define/undefine taking place concurrently.
The problems arise when we attempt to creating two NW filter bindings in
parallel.
Thread 1 can acquire the mutex for filter A
Thread 2 can acquire the mutex for filter B
Consider if filters A and B both reference filters C and D, but in
different orders:
Filter A
-> filter C
-> filter D
Filter B
-> filter D
-> filter C
Thread 1 will try to acquire locks in order A, C, D while thread 1 will
try to acquire in order A, D, C. Deadlock can still occur.
Think we can sort the list of filters before acquiring locks on all of
them ? Nope, we allow arbitrary recursion:
Filter A
-> filter C
-> filter E
-> filter F
-> filter H
-> filter K
-> filter D
-> filter G
-> filter I
So we can't tell from looking at 'A' which filters we're going to
need to lock. We can only see the first level of filters references
and we need to lock those before we can see the second level of
filters, etc.
We could probably come up with some cleverness to address this but
it isn't worth the time investment. It is simpler to just keep the
process of creating NW filter bindings totally serialized.
Using two separate locks for this serialization though is pointless.
Every code path which gets a read(updateLock) will go on to hold
updateMutex. It is simpler to just hold write(updateLock) and
get rid of updateMutex. At that point we don't need updateLock
to be a R/W lock, it can be a plain mutex.
Thus this patch gets rid of the current updateLock and updateMutex
and introduces a new top level updateMutex.
This has a secondary benefit of introducing fairness into the
locking. With a POSIX R/W lock, you get writer starvation if
you have lots of readers. IOW, if we call virNWFilterBIndingCreate
and virNWFilterBindingDelete in a tight loop from a couple of
threads, we can prevent virNWFilterDefine from ever acquiring
a write lock.
Getting rid of the R/W lock gives us FIFO lock acquisition
preventing starvation of any API call servicing.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
schemas are used for more than just documentation,
virsh edit fails if schemas are not available.
Therefore, fix the no-docs build by moving schemas/
to the parsing code inside src/conf/.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
In case we are snapshotting at least one 'manual' disk we will pause the
VM and keep it paused.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1866400
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The idea of the manual mode is to allow a synchronized snapshot in cases
when the storage is outsourced to an unmanaged storage provider which
requires cooperation with snapshotting.
The mode will instruct the hypervisor to pause along when the other
components are snapshotted and the 'manual' disk can be snapshotted
along. This increases latency of the snapshot but allows them in
otherwise impossible situations.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>