Older kernels did not support this sysctl, but they did not restrict
userfaultfd in any way so everything worked as if
vm.unprivileged_userfaultfd was set to 1. Thus we can safely ignore
errors when setting the value.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The qemuPrepareNVRAM() function accepts three arguments and the
last one being a boolean type. However, when the function is
called from qemuProcessPrepareHost() the argument passed is a
result of logical and of @flags (unsigned int) and
VIR_QEMU_PROCESS_START_RESET_NVRAM value. In theory this is
unsafe to do because if the value of the flag is ever changed
then this expression might overflow. Do what we do elsewhere:
double negation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In one of recent commits qemuProcessStartFlags enum gained new
value: VIR_QEMU_PROCESS_START_RESET_NVRAM but due to a typo it
has the same value as another member of the enum. Fix that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When the <loader> had an explicit readonly='no' attribute we
accidentally still marked the plfash as readonly due to the
bad conversion from virTristateBool to bool. This was missed
because the test cases run with no capabilities set and thus
are validated the -drive approach for pflash configuration,
not the -blockdev approach.
This affected the following config:
<os>
<loader readonly='no' type='pflash'>/var/lib/libvirt/qemu/nvram/test-bios.fd</loader>
</os>
for the sake of completeness, we also add a test XML config
with no readonly attribute at all, to demonstrate that the
default for pflash is intended to be r/w.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We can now replace the existing NVRAM file on startup when
the API requests this.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
If we crash part way through writing the NVRAM file we end up with an
unusable NVRAM on file. To avoid this we need to write to a temporary
file and fsync(2) at the end, then rename to the real NVRAM file path.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This change was generated using the following spatch:
@ rule1 @
expression a;
identifier f;
@@
<...
- f(*a);
... when != a;
- *a = NULL;
+ g_clear_pointer(a, f);
...>
@ rule2 @
expression a;
identifier f;
@@
<...
- f(a);
... when != a;
- a = NULL;
+ g_clear_pointer(&a, f);
...>
Then, I left some of the changes out, like tools/nss/ (which
doesn't link with glib) and put back a comment in
qemuBlockJobProcessEventCompletedActiveCommit() which coccinelle
decided to remove (I have no idea why).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Modify 'qemuProcessGetVCPUQOMPath' to take the detected QOM path of the
first vCPU which is always present as the QOM path used our code probing
CPU flags via 'qom-get'.
This is needed as upcoming qemu will change it.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/272
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2051451
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Similarly to previous commit we need to probe the vcpus first.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Upcoming changes will require that we have a proper QOM path for cpus
when querying the flags as qemu is going to change it.
By moving the flag probing code later we'll already probe the QOM paths
so no re-query will be needed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The QOM path will be needed by code which is querying the cpu flags via
'qom-get' and thus needs a valid QOM path to the vCPU.
Add it into the private data and transfer from the queried data.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Convert all code using the 'QOM_CPU_PATH' macro to accept the QOM path
as an argument.
For now the new helper for fetching the path 'qemuProcessGetVCPUQOMPath'
will always return the same hard-coded value.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use automatic memory clearing and remove the 'ret' variable.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function is used only as a helper in src/qemu/qemu_monitor_json.c
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This directory contains runtime state, not persistent state.
The latter goes into swtpmStorageDir.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
When looping over TPM devices for a domain, we can avoid calling
this function for each iteration and call it once per domain
instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Using the word "create" can give users the impression that disk
operations will be performed, when in reality all these functions
do is string formatting.
Follow the naming convention established by virBuildPath(),
virFileBuildPath() and virPidFileBuildPath().
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
This leaves qemuExtTPMCleanupHost() to only deal with looping
over TPM devices, same as other qemuExtTPMDoThing() functions.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
This leaves qemuExtTPMSetupCgroup() to only deal with looping
over TPM devices, same as other qemuExtTPMDoThing() functions.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Other functions that operate on a single TPM emulator follow
the qemuTPMEmulatorDoThing() naming convention.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
When we are about to spawn QEMU, we validate the domain
definition against qemuCaps. Except when domain is/was already
running before (i.e. on incoming migration, snapshots, resume
from a file). However, especially on incoming migration it may
happen that the destination QEMU is different to the source
QEMU, e.g. the destination QEMU may have some devices disabled.
And we have a function that validates devices/features requested
in domain XML against the desired QEMU capabilities (aka
qemuCaps) - it's virDomainDefValidate() which calls
qemuValidateDomainDef() and qemuValidateDomainDeviceDef()
subsequently.
But the problem here is that the validation function is
explicitly skipped over in specific scenarios (like incoming
migration, restore from a snapshot or previously saved file).
This in turn means that we may spawn QEMU and request
device/features it doesn't support. When that happens QEMU fails
to load migration stream:
qemu-kvm: ... 'virtio-mem-pci' is not a valid device model name
(NB, while the example shows one particular device, the problem
is paramount)
This problem is easier to run into since we are slowly moving
validation from qemu_command.c into said validation functions.
The solution is simple: do the validation in all cases. And while
it may happen that users would be unable to migrate/restore a
guest due to a bug in our validator, spawning QEMU without
validation is worse (especially when you consider that users can
supply their own XMLs for migrate/restore operations - these were
never validated).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2048435
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The binary validation in virPidFileReadPathIfAlive may fail with EACCES
if the calling process does not have CAP_SYS_PTRACE capability.
Therefore instead do only the check that the pidfile is locked by the
correct process.
Fixes the same issue as with swtpm.
Signed-off-by: Vasiliy Ulyanov <vulyanov@suse.de>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Access to /proc/[pid]/exe may be restricted in certain environments (e.g.
in containers) and any attempt to stat(2) or readlink(2) the file will
result in 'permission denied' error if the calling process does not have
CAP_SYS_PTRACE capability. According to proc(5) manpage:
Permission to dereference or read (readlink(2)) this symbolic link is
governed by a ptrace access mode PTRACE_MODE_READ_FSCREDS check; see
ptrace(2).
The binary validation in virPidFileReadPathIfAlive may fail with EACCES.
Therefore instead do only the check that the pidfile is locked by the
correct process. To ensure this is always the case the daemonization and
pidfile handling of the swtpm command is now controlled by libvirt.
Signed-off-by: Vasiliy Ulyanov <vulyanov@suse.de>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Report an error upfront if the binary does not exist
or is not executable.
https://bugzilla.redhat.com/show_bug.cgi?id=1999372
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Introduce support for
<serial type='pty'>
<target type='isa-debug'>
<model type='isa-debugcon'/>
</target>
<address type='isa' iobase='0x402'/>
</console>
which is used as a way to receive debug messages from the
firmware on x86 platforms.
Note that the default port is hypervisor specific, with QEMU
currently using 0xe9 since that's the original Bochs debug port.
For use with SeaBIOS/OVMF, the iobase port needs to be explicitly
set to 0x402.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This mostly overlaps with virDomainAudioType, but in a couple of
cases the string representations are different.
Right now we're doing that in a somewhat sketchy way, in that we
store values of one enumeration and then convert them to strings
using TypeToString() implementation for the other enumeration;
when converting from string, we open-code the handling of the
special values mentioned above.
Drop the second enumeration and introduce two helpers to deal
with conversion. Most calling sites don't need to be changed, and
one can even be simplified significantly.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This, along with "pa", is the other case where the libvirt and
QEMU names do not match.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We recently started listing these in the spec file and, since we
were not creating them during the installation phase, that broke
RPM builds.
Fixes: 4b43da0bff9b78dcf1189388d4c89e524238b41d
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently, memory device (def->mems) part of cmd line is
generated before any controller. In majority of cases it doesn't
matter because neither of memory devices live on a bus that's
created by an exposed controller (e.g. there's no DIMM
controller, at least not exposed). Except for virtio-mem and
virtio-pmem, which do have a PCI address. And if it so happens
that the device goes onto non-default bus (pci.0) starting such
guest fails, because the controller that creates the desired bus
wasn't processed yet. QEMU processes arguments in order.
For instance, if virtio-mem has address with bus='0x01' QEMU
refuses to start with the following message:
Bus 'pci.1' not found
Similarly for virtio-pmem. I've successfully tested migration and
changing the order does not affect migration stream.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2047271
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
MIPS Malta (and no other supported MIPS machine) has a PCI bus.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This identifies various MIPS Malta machines, be it 32-bit or 64-bit,
little-endian or big-endian.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There are few places where the g_steal_pointer() is open coded.
Switch them to calling the g_steal_pointer() function instead.
Generated by the following spatch:
@ rule1 @
expression a, b;
@@
<...
- b = a;
... when != b
- a = NULL;
+ b = g_steal_pointer(&a);
...>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
The service files were copied out of the service file for libvirtd and
the name of the corresponding manpage was not fixed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2045959
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Any active domain has a copy in the privateData, filled in
qemuProcessInit.
Move the qemu capability check below the activeness check and remove
the extra lookup.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Apparently, some of '&*variable' slipped in. Drop '&*' and access
the variable directly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
We require the header and the secret to be present.
Use a different approach to virParams to report an error if they
are not present, instead of trying to pass empty arguments to QEMU
via QMP.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Refactor some cgroup management methods from qemu into hypervisor.
These methods will be shared with ch driver for cgroup management.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
A <hostdev/> can have <address type='unassigned'/> which means
libvirt manages the device detach from/reattach to the host but
the device is never exposed to the guest. This means that we have
to take a shortcut during hotunplug (e.g. never ask QEMU on the
monitor to detach the device, or never wait for DEVICE_DELETED
event).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
A <hostdev/> can have <address type='unassigned'/> which means
libvirt manages the device detach from/reattach to the host but
the device is never exposed to the guest. This means that we have
to take a shortcut during hotplug, similar to the one we are
taking when constructing the command line (see
qemuBuildHostdevCommandLine()).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2040548
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There are a some scenarios in which we want to prealloc guest
memory (e.g. when requested in domain XML, when using hugepages,
etc.). With 'regular' <memory/> models (like 'dimm', 'nvdimm' or
'virtio-pmem') or regular guest memory it is corresponding
memory-backend-* object that ends up with .prealloc attribute
set. And that's desired because neither of those devices can
change its size on the fly. However, with virtio-mem model things
are a bit different. While one can set .prealloc attribute on
corresponding memory-backend-* object it doesn't make much sense,
because virtio-mem can inflate/deflate on the fly, i.e. change
how big of a portion of the memory-backend-* object is exposed to
the guest. For instance, from a say 4GiB module only a half can
be exposed to the guest. Therefore, it doesn't make much sense to
preallocate whole 4GiB and keep them allocated. But we still want
the part exposed to the guest preallocated (when conditions
described at the beginning are met).
Having said that, with new enough QEMU the virtio-mem-pci device
gained new attribute ".prealloc" which instructs the device to
talk to the memory backend object and allocate only the requested
portion of memory.
Now, that our algorithm for setting .prealloc was isolated in a
single function, the function can be called when constructing cmd
line for virtio-mem-pci device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This new capability tracks whether virtio-mem device is capable
of memory preallocation, which is detected by the device having
.prealloc attribute.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>