All the fd-passing setup of chardevs which this hack meant to disable
was moved to the host-preparation phase which is skipped for formatting
of non-real commandlines.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
And its callers. The parameter is no longer used since virDomainObjSave
was replaced with qemuDomainSaveStatus wrapper.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It is a nice wrapper around virDomainObjSave which logs a warning, but
otherwise ignores the error. Let's use it where appropriate.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
As of ff024b60cc we are opening chardevs before starting QEMU.
However, we are also doing that before domain private directories
are created. This leaves us unable to create guest agent socket
which lives under priv->channelTargetDir.
While creating the dirs can be moved just before
qemuProcessPrepareHostBackendChardev() it's better to do it as
the very first step so that this kind of error is prevented in
future.
Fixes: ff024b60cc
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Add handling to qemuDomainDeviceBackendChardevForeachOne and callbacks
so that we can later use 'qemuBuildChardevCommand' for TPM devices.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add handling to qemuDomainDeviceBackendChardevForeachOne and callbacks
so that we can later use 'qemuBuildChardevCommand' for vhost-user disks
instead of a custom formatter.
Since we don't pass the FD for the vhost-user connection to qemu all of
the setup can be skipped.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use the qemuDomainDeviceBackendChardevForeach helper to iterate all
eligible structs and convert the setup of the TLS defaults from the
config.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The opening of files for FD passing for a chardev backend was
historically done in the function which is formatting the commandline.
This has multiple problems. Firstly the function takes a lot of
parameters which need to be passed through the commandline formatters.
This made the 'qemuBuildChrChardevStr' extremely unappealing to the
extent that we have multiple other custom formatters in places which
didn't really want to use the function.
Additionally the function is also creating files in the host in certain
configurations which is wrong for a commandline formatter to do. This
meant that e.g. not all chardev test cases can be converted to use
DO_TEST_CAPS_LATEST as we attempt to use such code path and attempt to
create files outside of the test directory.
This patch moves the opening of the filedescriptors from
'qemuBuildChrChardevFileStr' into a new helper
'qemuProcessPrepareHostBackendChardevOne' which is called using
'qemuDomainDeviceBackendChardevForeach'.
To preserve test behaviour we also have another instance
'testPrepareHostBackendChardevOne' which is populating mock
filedescriptors.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use automatic memory freeing for the temporary bitmap and remove the
pointless 'cleanup' section.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
I think it makes more sense for the variable about jobs to be in
the job object. I also renamed it to be consistent with the rest
of the struct.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Remove the check from conditions where it's coupled with some other
checks.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The 'xmlopt' parameter can be auto-unref by using g_autoptr().
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The upcoming QEMU 6.2.0 implements a new event called
DEVICE_UNPLUG_GUEST_ERROR, a new event that reports generic device
unplug errors that were detected by the guest and reported back to QEMU.
This new event is going to be specially useful for pseries guests that
uses newer kernels (must have kernel commit 29c9a2699e71), which is the
case for Fedora 34 at this moment. These guests have the capability of
reporting CPU removal errors back to QEMU which, starting in 6.2.0, will
emit the DEVICE_UNPLUG_GUEST_ERROR event. Libvirt can use this event to
abort the device removal immediately instead of waiting for 'setvcpus'
timeout.
QEMU 6.2.0 is also going to emit DEVICE_UNPLUG_GUEST_ERROR for memory
hotunplug errors, both in pseries and ACPI guests. QEMU 6.1.0 reports
memory removal errors using the MEM_UNPLUG_ERROR event, which is going to
be deprecated by DEVICE_UNPLUG_GUEST_ERROR in 6.2.0. Given that
Libvirt wasn't handling the MEM_UNPLUG_ERROR event we don't need to
worry about it - adding support to DEVICE_UNPLUG_GUEST_ERROR will be
enough to cover all future cases.
This patch adds support to DEVICE_UNPLUG_GUEST_ERROR by adding the
minimal wiring required for Libvirt to be aware of it. The monitor
callback for this event will abort the pending removal operation of the
device reported by the "device" property of the event. Most of the heavy
lifting is already done by existing code that handles
QEMU_DOMAIN_UNPLUGGING_DEVICE_STATUS_GUEST_REJECTED, making our life
easier to abort the pending removal operation.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
After previous cleanups this callback is unused. Remove it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Currently, when opening an agent socket the qemuConnectAgent()
increments domain object refcounter and calls qemuAgentOpen()
where the domain object pointer is simply stored inside
_qemuAgent struct. If qemuAgentOpen() fails, then it clears @cb
member only to avoid qemuProcessHandleAgentDestroy() being called
(which decrements the domain object refcounter) and the domain
object refcounter is then decreased explicitly in
qemuConnectAgent().
The same result can be achieved with much cleaner code: increment
the refcounter inside qemuAgentOpen() and drop the dance around
@cb.
Also, the comment in qemuConnectAgent() about holding an extra
reference is not correct. The thread that called
qemuConnectAgent() already holds a reference to the domain
object. No matter how many time the object is locked and unlocked
the reference counter can't be decreased.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Just like qemuMonitorOpen(), hold the domain object locked
throughout the whole time of qemuConnectAgent() and unlock it
only for a brief time of actual connect() (because this is the
only part that has a potential of blocking).
The reason is that qemuAgentOpen() does access domain object
(well, its privateData) AND also at least one argument (@context)
depends on domain object. Accessing these without the lock is
potentially dangerous.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1845468#c12
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
During the vm rebooting, the vm could be paused if the libvirtd is
restarted for some reason, which is not expected. We need continue
fakereboot process if fakereboot flags is true and the vm is in
paused-user status.
Signed-off-by: Bihong Yu <yubihong@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
During the vm rebooting, the vm could be shut down if the libvirtd is
restarted for some reason, which is not expected. We move set
fakereboot flags false after processing fakereboot over, so we can
ensure that fakereboot process have been executed.
Signed-off-by: Bihong Yu <yubihong@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This is a typical example of what can go wrong when sending out
an old patch. Back in January, when I was writing
qemuProcessHandleMemoryDeviceSizeChange() events were sent to the
worker pool thread using virThreadPoolSendJob(). Then, in July a
helper was introduced (qemuProcessEventSubmit()) but since my
code was not committed and I did not pay attention my code wasn't
updated. Later, when I merged my code it uses the old approach.
BTW: this also fixes a possible double free which I completely
missed when writing the code ~10 months ago.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Nobody's interested in the return value of any of
struct _qemuMonitorCallbacks callbacks. They are all void, but
domainMemoryDeviceSizeChange. Change it to void.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Libvirt will put the pid file of pr-helper to per-domain directory.
However, the ownership of the per-domain directory is the user to run
the QEMU process and the user has the write permission of the directory.
If VM escape occurs, the attacker can
1. write arbitrary content to the pid file (if running QEMU using root),
then the attacker can kill any process by writing appropriate pid to
the pid file;
2. spoof the pid file (if running QEMU using a regular user), then the
pr-helper process will never be cleared even if the VM is destroyed.
So, move the pid file of pr-helper from per-domain directory to
stateDir.
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Fill in the effective boot index for network devices (or hostdev-backed
network devices via 'qemuProcessPrepareDeviceBootorder'. This patch
doesn't clean up the cruft to make it more obvious what's happening.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rename it to 'qemuProcessPrepareDeviceBootorder' and call it from
'qemuProcessPrepareDomain' rather than
'qemuProcessPrepareDomainStorage'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'effectiveBootIndex' is a copy of 'bootIndex' if '<boot order=' was
present and left unassigned if not. This allows hypervisor drivers to
reinterpret <os><boot> without being visible in the XML.
QEMU driver had a internal implementation for disks, which is now
replaced. Additionally this will simplify a refactor of network boot
assignment.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We commonly use 'props' for the JSON object describing something. Rename
the monitor device addition code.
Additionally the common approach is to clear the pointer if it was
consumed so the arguments are adjusted to do so.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The -netdev formatter code switched to a real virQEMUCaps flag so we can
remove the old flags which used to enable JSON for -netdev for
validation purposes.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reporting how much memory is exposed to the guest happens under
<currentMemory/> which is taken from def->mem.cur_balloon. The
reported amount should account for both balloon size and the sum
of @currentsize of all virtio-mems. For instance, if domain has
4GiB via balloon and additional 2GiB via virtio-mem, then the
domain XML should report 6GiB. The same applies for domain
statistics.
The way to achieve this is to account for either balloon or
virtio-mem when the size of the other is changed, e.g. on balloon
change we have to add all @currentsize (for non virtio-mem these
will be zero, so the check for memory model is needless, but
makes it more obvious what's happening), and vice versa.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If the QEMU driver restarts it loses the track of the current size
of virtio-mem (because it's runtime type of information and thus
not stored in XML) and therefore, we have to refresh it when
reconnecting to the domain monitor.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
As advertised in previous commit, this event is delivered to us
when virtio-mem module changes the allocation inside the guest.
It comes with one attribute - size - which holds the new size of
the virtio-mem (well, allocated size), in bytes.
Mind you, this is not necessarily the same number as 'requested
size'. It almost certainly will be when sizing the memory up, but
it might not be when sizing the memory down - the guest kernel
might be unable to free some blocks.
This current size is reported in the domain XML as an output
element only.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virtio-mem is paravirtualized mechanism of adding/removing
memory to/from a VM. A virtio-mem-pci device is split into blocks
of equal size which are then exposed (all or only a requested
portion of them) to the guest kernel to use as regular memory.
Therefore, the device has two important attributes:
1) block-size, which defines the size of a block
2) requested-size, which defines how much memory (in bytes)
is the device requested to expose to the guest.
The 'block-size' is configured on command line and immutable
throughout device's lifetime. The 'requested-size' can be set on
the command line too, but also is adjustable via monitor. In
fact, that is how management software places its requests to
change the memory allocation. If it wants to give more memory to
the guest it changes 'requested-size' to a bigger value, and if it
wants to shrink guest memory it changes the 'requested-size' to a
smaller value. Note, value of zero means that guest should
release all memory offered by the device. Of course, guest has to
cooperate. Therefore, there is a third attribute 'size' which is
read only and reflects how much memory the guest still has. This
can be different to 'requested-size', obviously. Because of name
clash, I've named it 'current' and it is dealt with in future
commits (it is a runtime information anyway).
In the backend, memory for virtio-mem is backed by usual objects:
memory-backend-{ram,file,memfd} and their size puts the cap on
the amount of memory that a virtio-mem device can offer to a
guest. But we are already able to express this info using <size/>
under <target/>.
Therefore, we need only two more elements to cover 'block-size'
and 'requested-size' attributes. This is the XML I've came up
with:
<memory model='virtio-mem'>
<source>
<nodemask>1-3</nodemask>
<pagesize unit='KiB'>2048</pagesize>
</source>
<target>
<size unit='KiB'>2097152</size>
<node>0</node>
<block unit='KiB'>2048</block>
<requested unit='KiB'>1048576</requested>
</target>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</memory>
I hope by now it is obvious that:
1) 'requested-size' must be an integer multiple of
'block-size', and
2) virtio-mem-pci device goes onto PCI bus and thus needs PCI
address.
Then there is a limitation that the minimal 'block-size' is
transparent huge page size (I'll leave this without explanation).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When action for 'on_poweroff' is set to 'restart', 'fake reboot'
is triggered and qemu shutdown state is transient. Domain state
need not to be changed and events not sent in this case.
Fixes: 4ffc807214
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We don't need to propagate all public flags, only the information
about the presence of the validation one, which can differ from
function to function. This patch makes it easier and more
readable in case of a future additions of validation flags.
This change was suggested by Daniel.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
'-qmp' in this case behaves the same as '-chardev' so it should have
been converted the same way as others were in 43c9c0859f since
short options are deprecated.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Split out the logic which was used to determine whether qemu should
allow the guest OS to reboot for QEMU versions which don't support the
'set-action' QMP command.
Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In cases when we are adding a <transient/> disk with sharing backend
(and thus hotplugging it) we need to re-initialize ACPI tables so that
the VM boots from the correct device.
This has a side-effect of emitting the RESET event and forwarding it to
the clients which is not correct.
Fix this by ignoring RESET events during startup of the VM.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We don't use the value of the flag when the new handling is in place so
we don't have to initialize it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The '-no-shutdown' flag prevents qemu from terminating if a shutdown was
requested. Libvirt will handle the termination of the qemu process
anyways and using this consistently will allow greater flexibility for
the virDomainSetLifecycleAction API as well as will allow using
the 'system-reset' QMP command during startup to reinitiate devices
exported to the firmware.
This efectively partially reverts 0e034efaf9
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Rather than using '-no-reboot' use the QMP command to update the
lifecycle action of 'on_reboot'.
This will be identical to how we set the behaviour during lifetime and
also avoids problems with use of the 'system-reset' QMP command during
bringup of the VM (used to update the firmware table of disks when disks
were hotplugged as part of startup).
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The RESET event is delivered by qemu only when the guest OS is actually
allowed to reboot ('-no-reboot' or equivalent is not used) and due to
the nature of async handling of the events VM is actually already
executing guest code after the reboot, until our code gets to killing
it.
In general it should have been impossible to reach a state where the
reboot action is 'destroy' but we didn't use '-no-reboot' but due to
various bugs it was.
Due to the fact that this was not a desired operation and additionally
guest code already is executing I think the best option is not to kill
the VM any more (possible data loss?) and rely for the proper fix where
we use the new 'set-action' QMP command to enable an equivalent
behaviour to '-no-reboot' during runtime.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Directly use 'priv->allowReboot' as we now document what the behaiour is
to avoid another lookup.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We simply terminate qemu instead of issuing a reset as the semantics of
the setting dictate.
Fix it by handling it identically to 'fake reboot'.
We need to forbid the combination of 'onReboot' -> 'destroy' and
'onPoweroff' -> reboot though as the handling would be hairy and it
honetly makes no sense.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>