Commit Graph

1642 Commits

Author SHA1 Message Date
Jiri Denemark
38fb9106ed qemu: Use qemuDomainSaveStatus
It is a nice wrapper around virDomainObjSave which logs a warning, but
otherwise ignores the error. Let's use it where appropriate.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-12-15 10:44:50 +01:00
Michal Privoznik
adeec11ba7 qemuProcessPrepareHost: Create domain private dirs as early as possible
As of ff024b60cc we are opening chardevs before starting QEMU.
However, we are also doing that before domain private directories
are created. This leaves us unable to create guest agent socket
which lives under priv->channelTargetDir.

While creating the dirs can be moved just before
qemuProcessPrepareHostBackendChardev() it's better to do it as
the very first step so that this kind of error is prevented in
future.

Fixes: ff024b60cc
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2021-12-13 12:53:39 +01:00
Peter Krempa
5c62df7e78 qemu: Implement chardev source setup for tpm
Add handling to qemuDomainDeviceBackendChardevForeachOne and callbacks
so that we can later use 'qemuBuildChardevCommand' for TPM devices.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-12-10 16:37:42 +01:00
Peter Krempa
5f2cc74257 qemu: Implement chardev source setup for disk
Add handling to qemuDomainDeviceBackendChardevForeachOne and callbacks
so that we can later use 'qemuBuildChardevCommand' for vhost-user disks
instead of a custom formatter.

Since we don't pass the FD for the vhost-user connection to qemu all of
the setup can be skipped.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-12-10 16:37:42 +01:00
Peter Krempa
0eabefb2b8 qemuBuildChrChardevStr: Remove unused arguments and clean up callers
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-12-10 16:37:42 +01:00
Peter Krempa
73871c3a30 qemu: domain: Refactor chardev definition preparing
Use the qemuDomainDeviceBackendChardevForeach helper to iterate all
eligible structs and convert the setup of the TLS defaults from the
config.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-12-10 16:37:42 +01:00
Peter Krempa
ff024b60cc qemu: Move creation and opening of chardev backend FDs to host prepare step
The opening of files for FD passing for a chardev backend was
historically done in the function which is formatting the commandline.

This has multiple problems. Firstly the function takes a lot of
parameters which need to be passed through the commandline formatters.
This made the 'qemuBuildChrChardevStr' extremely unappealing to the
extent that we have multiple other custom formatters in places which
didn't really want to use the function.

Additionally the function is also creating files in the host in certain
configurations which is wrong for a commandline formatter to do. This
meant that e.g. not all chardev test cases can be converted to use
DO_TEST_CAPS_LATEST as we attempt to use such code path and attempt to
create files outside of the test directory.

This patch moves the opening of the filedescriptors from
'qemuBuildChrChardevFileStr' into a new helper
'qemuProcessPrepareHostBackendChardevOne' which is called using
'qemuDomainDeviceBackendChardevForeach'.

To preserve test behaviour we also have another instance
'testPrepareHostBackendChardevOne' which is populating mock
filedescriptors.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-12-10 16:37:42 +01:00
Peter Krempa
e4b4ccb94f qemuProcessValidateHotpluggableVcpus: Refactor cleanup
Use automatic memory freeing for the temporary bitmap and remove the
pointless 'cleanup' section.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-12-10 16:36:24 +01:00
Peter Krempa
e552a0d502 qemuProcessRefreshLegacyBlockjobs: Automatically free GHashTable and refactor cleanup
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2021-12-01 13:53:11 +01:00
Peter Krempa
2e93441697 qemuProcessRefreshDisks: Automatically free GHashTable and refactor cleanup
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2021-12-01 13:53:11 +01:00
Peter Krempa
7ef8e9af6f qemuProcessWaitForMonitor: Automatically free GHashTable
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2021-12-01 13:53:11 +01:00
Peter Krempa
6e9ddad43b qemuRefreshPRManagerState: Automatically free GHashTable and refactor cleanup
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2021-12-01 13:53:11 +01:00
Peter Krempa
b59430d107 qemuRefreshVirtioChannelState: Automatically free GHashTable and refactor cleanup
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2021-12-01 13:53:11 +01:00
Kristina Hanicova
01f9873724 qemu_domainjob: move jobs_queued to struct qemuDomainJobObj
I think it makes more sense for the variable about jobs to be in
the job object. I also renamed it to be consistent with the rest
of the struct.

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2021-12-01 12:45:40 +01:00
Ján Tomko
c3e79a9008 qemu: remove ignore_value for qemuDomainObjExitMonitor
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-12-01 10:56:58 +01:00
Ján Tomko
57d665b390 qemu: do not check return value of qemuDomainObjExitMonitor
Remove the check from conditions where it's coupled with some other
checks.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-12-01 10:56:58 +01:00
Ján Tomko
d7b23755ef qemu: do not check return value of qemuDomainObjExitMonitor
Remove the unreachable code.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-12-01 10:56:58 +01:00
Daniel Henrique Barboza
748c4a6b74 qemu_process.c: use g_autoptr() in qemuProcessQMPInitMonitor
The 'xmlopt' parameter can be auto-unref by using g_autoptr().

Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2021-11-18 14:51:29 -03:00
Daniel Henrique Barboza
df194c5c08 qemu: add DEVICE_UNPLUG_GUEST_ERROR event support
The upcoming QEMU 6.2.0 implements a new event called
DEVICE_UNPLUG_GUEST_ERROR, a new event that reports generic device
unplug errors that were detected by the guest and reported back to QEMU.

This new event is going to be specially useful for pseries guests that
uses newer kernels (must have kernel commit 29c9a2699e71), which is the
case for Fedora 34 at this moment. These guests have the capability of
reporting CPU removal errors back to QEMU which, starting in 6.2.0, will
emit the DEVICE_UNPLUG_GUEST_ERROR event. Libvirt can use this event to
abort the device removal immediately instead of waiting for 'setvcpus'
timeout.

QEMU 6.2.0 is also going to emit DEVICE_UNPLUG_GUEST_ERROR for memory
hotunplug errors, both in pseries and ACPI guests. QEMU 6.1.0 reports
memory removal errors using the MEM_UNPLUG_ERROR event, which is going to
be deprecated by DEVICE_UNPLUG_GUEST_ERROR in 6.2.0. Given that
Libvirt wasn't handling the MEM_UNPLUG_ERROR event we don't need to
worry about it - adding support to DEVICE_UNPLUG_GUEST_ERROR will be
enough to cover all future cases.

This patch adds support to DEVICE_UNPLUG_GUEST_ERROR by adding the
minimal wiring required for Libvirt to be aware of it. The monitor
callback for this event will abort the pending removal operation of the
device reported by the "device" property of the event. Most of the heavy
lifting is already done by existing code that handles
QEMU_DOMAIN_UNPLUGGING_DEVICE_STATUS_GUEST_REJECTED, making our life
easier to abort the pending removal operation.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2021-11-12 13:44:42 -03:00
Michal Privoznik
e812213bc1 qemu_agent: Drop destroy callback
After previous cleanups this callback is unused. Remove it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-11-12 14:11:43 +01:00
Michal Privoznik
0a9cb29ba2 qemuAgentOpen: Rework domain object refcounting
Currently, when opening an agent socket the qemuConnectAgent()
increments domain object refcounter and calls qemuAgentOpen()
where the domain object pointer is simply stored inside
_qemuAgent struct. If qemuAgentOpen() fails, then it clears @cb
member only to avoid qemuProcessHandleAgentDestroy() being called
(which decrements the domain object refcounter) and the domain
object refcounter is then decreased explicitly in
qemuConnectAgent().

The same result can be achieved with much cleaner code: increment
the refcounter inside qemuAgentOpen() and drop the dance around
@cb.

Also, the comment in qemuConnectAgent() about holding an extra
reference is not correct. The thread that called
qemuConnectAgent() already holds a reference to the domain
object. No matter how many time the object is locked and unlocked
the reference counter can't be decreased.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-11-12 14:11:29 +01:00
Michal Privoznik
108e131a3d qemu_agent: Rework domain object locking when opening agent
Just like qemuMonitorOpen(), hold the domain object locked
throughout the whole time of qemuConnectAgent() and unlock it
only for a brief time of actual connect() (because this is the
only part that has a potential of blocking).

The reason is that qemuAgentOpen() does access domain object
(well, its privateData) AND also at least one argument (@context)
depends on domain object. Accessing these without the lock is
potentially dangerous.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1845468#c12
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-11-12 14:11:11 +01:00
Bihong Yu
e3959c928e qemu_process: continue to process fakereboot after restarting libvirtd
During the vm rebooting, the vm could be paused if the libvirtd is
restarted for some reason, which is not expected. We need continue
fakereboot process if fakereboot flags is true and the vm is in
paused-user status.

Signed-off-by: Bihong Yu <yubihong@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-11-10 14:30:10 +01:00
Bihong Yu
83ce9ec0a7 qemu_process: set fakereboot flags false after processing fakereboot over
During the vm rebooting, the vm could be shut down if the libvirtd is
restarted for some reason, which is not expected. We move set
fakereboot flags false after processing fakereboot over, so we can
ensure that fakereboot process have been executed.

Signed-off-by: Bihong Yu <yubihong@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-11-10 14:30:08 +01:00
Michal Privoznik
030e80042e qemuProcessHandleMemoryDeviceSizeChange: Use qemuProcessEventSubmit()
This is a typical example of what can go wrong when sending out
an old patch. Back in January, when I was writing
qemuProcessHandleMemoryDeviceSizeChange() events were sent to the
worker pool thread using virThreadPoolSendJob(). Then, in July a
helper was introduced (qemuProcessEventSubmit()) but since my
code was not committed and I did not pay attention my code wasn't
updated. Later, when I merged my code it uses the old approach.

BTW: this also fixes a possible double free which I completely
missed when writing the code ~10 months ago.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-10-26 10:43:55 +02:00
Michal Privoznik
14c60c3ae7 qemu_monitor: Make domainMemoryDeviceSizeChange cb return void
Nobody's interested in the return value of any of
struct _qemuMonitorCallbacks callbacks. They are all void, but
domainMemoryDeviceSizeChange. Change it to void.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-10-26 10:43:55 +02:00
Peng Liang
74e1ebee7f qemu: Move pid file of pr-helper to stateDir
Libvirt will put the pid file of pr-helper to per-domain directory.
However, the ownership of the per-domain directory is the user to run
the QEMU process and the user has the write permission of the directory.
If VM escape occurs, the attacker can
1. write arbitrary content to the pid file (if running QEMU using root),
   then the attacker can kill any process by writing appropriate pid to
   the pid file;
2. spoof the pid file (if running QEMU using a regular user), then the
   pr-helper process will never be cleared even if the VM is destroyed.

So, move the pid file of pr-helper from per-domain directory to
stateDir.

Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-10-19 09:12:26 +02:00
Ján Tomko
f1818032f5 qemu: Revert "qemuExtDevicesStart: pass logManager"
This reverts commit b164eac5e1

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-10-12 14:12:11 +02:00
Peter Krempa
c3bd60ddc6 qemu: Use 'effectiveBootIndex' to handle <os><boot dev='network'>
Fill in the effective boot index for network devices (or hostdev-backed
network devices via 'qemuProcessPrepareDeviceBootorder'. This patch
doesn't clean up the cruft to make it more obvious what's happening.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-12 10:26:02 +02:00
Peter Krempa
c90d17c812 qemu: process: Make qemuProcessPrepareDomainDiskBootorder more universal
Rename it to 'qemuProcessPrepareDeviceBootorder' and call it from
'qemuProcessPrepareDomain' rather than
'qemuProcessPrepareDomainStorage'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-12 10:26:02 +02:00
Peter Krempa
aee82fe616 conf: Introduce 'effectiveBootIndex' into 'virDomainDeviceInfo'
'effectiveBootIndex' is a copy of 'bootIndex' if '<boot order=' was
present and left unassigned if not. This allows hypervisor drivers to
reinterpret <os><boot> without being visible in the XML.

QEMU driver had a internal implementation for disks, which is now
replaced. Additionally this will simplify a refactor of network boot
assignment.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-12 10:26:02 +02:00
Peter Krempa
2154718c29 qemu: Rename 'qemuMonitorAddDeviceArgs' to 'qemuMonitorAddDeviceProps'
We commonly use 'props' for the JSON object describing something. Rename
the monitor device addition code.

Additionally the common approach is to clear the pointer if it was
consumed so the arguments are adjusted to do so.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-12 10:26:01 +02:00
Peter Krempa
424dc5d2d2 qemu: Remove 'qemuBuildCommandLineFlags' and associated code
The -netdev formatter code switched to a real virQEMUCaps flag so we can
remove the old flags which used to enable JSON for -netdev for
validation purposes.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-12 10:26:00 +02:00
Michal Privoznik
51f65e9522 qemu: Account for both memballoon and virtio-mem
Reporting how much memory is exposed to the guest happens under
<currentMemory/> which is taken from def->mem.cur_balloon. The
reported amount should account for both balloon size and the sum
of @currentsize of all virtio-mems. For instance, if domain has
4GiB via balloon and additional 2GiB via virtio-mem, then the
domain XML should report 6GiB. The same applies for domain
statistics.

The way to achieve this is to account for either balloon or
virtio-mem when the size of the other is changed, e.g. on balloon
change we have to add all @currentsize (for non virtio-mem these
will be zero, so the check for memory model is needless, but
makes it more obvious what's happening), and vice versa.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-01 11:04:57 +02:00
Michal Privoznik
5c2d6908a6 qemu: Refresh the current size of virtio-mem on monitor reconnect
If the QEMU driver restarts it loses the track of the current size
of virtio-mem (because it's runtime type of information and thus
not stored in XML) and therefore, we have to refresh it when
reconnecting to the domain monitor.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-01 11:04:53 +02:00
Michal Privoznik
9985f62b51 qemu: Wire up MEMORY_DEVICE_SIZE_CHANGE event
As advertised in previous commit, this event is delivered to us
when virtio-mem module changes the allocation inside the guest.
It comes with one attribute - size - which holds the new size of
the virtio-mem (well, allocated size), in bytes.
Mind you, this is not necessarily the same number as 'requested
size'. It almost certainly will be when sizing the memory up, but
it might not be when sizing the memory down - the guest kernel
might be unable to free some blocks.

This current size is reported in the domain XML as an output
element only.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-01 11:04:47 +02:00
Michal Privoznik
f931cb7f21 conf: Introduce virtio-mem <memory/> model
The virtio-mem is paravirtualized mechanism of adding/removing
memory to/from a VM. A virtio-mem-pci device is split into blocks
of equal size which are then exposed (all or only a requested
portion of them) to the guest kernel to use as regular memory.
Therefore, the device has two important attributes:

  1) block-size, which defines the size of a block
  2) requested-size, which defines how much memory (in bytes)
     is the device requested to expose to the guest.

The 'block-size' is configured on command line and immutable
throughout device's lifetime. The 'requested-size' can be set on
the command line too, but also is adjustable via monitor. In
fact, that is how management software places its requests to
change the memory allocation. If it wants to give more memory to
the guest it changes 'requested-size' to a bigger value, and if it
wants to shrink guest memory it changes the 'requested-size' to a
smaller value. Note, value of zero means that guest should
release all memory offered by the device. Of course, guest has to
cooperate. Therefore, there is a third attribute 'size' which is
read only and reflects how much memory the guest still has. This
can be different to 'requested-size', obviously. Because of name
clash, I've named it 'current' and it is dealt with in future
commits (it is a runtime information anyway).

In the backend, memory for virtio-mem is backed by usual objects:
memory-backend-{ram,file,memfd} and their size puts the cap on
the amount of memory that a virtio-mem device can offer to a
guest. But we are already able to express this info using <size/>
under <target/>.

Therefore, we need only two more elements to cover 'block-size'
and 'requested-size' attributes. This is the XML I've came up
with:

  <memory model='virtio-mem'>
    <source>
      <nodemask>1-3</nodemask>
      <pagesize unit='KiB'>2048</pagesize>
    </source>
    <target>
      <size unit='KiB'>2097152</size>
      <node>0</node>
      <block unit='KiB'>2048</block>
      <requested unit='KiB'>1048576</requested>
    </target>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
  </memory>

I hope by now it is obvious that:

  1) 'requested-size' must be an integer multiple of
     'block-size', and
  2) virtio-mem-pci device goes onto PCI bus and thus needs PCI
     address.

Then there is a limitation that the minimal 'block-size' is
transparent huge page size (I'll leave this without explanation).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-01 11:02:53 +02:00
Zhenzhong Duan
88a3977922 qemu: ingore the transient domain state in fake reboot
When action for 'on_poweroff' is set to 'restart', 'fake reboot'
is triggered and qemu shutdown state is transient. Domain state
need not to be changed and events not sent in this case.

Fixes: 4ffc807214
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-27 09:53:20 +02:00
Kristina Hanicova
3e4f4c2eec src: network_conf: propagate only bool to virNetworkDefParseString()
We don't need to propagate all public flags, only the information
about the presence of the validation one, which can differ from
function to function. This patch makes it easier and more
readable in case of a future additions of validation flags.
This change was suggested by Daniel.

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-10 17:38:06 +02:00
Peter Krempa
526f2900bc qemuProcessQMPInit: Use long options for '-qmp' when probing qemu.
'-qmp' in this case behaves the same as '-chardev' so it should have
been converted the same way as others were in 43c9c0859f since
short options are deprecated.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-10 15:18:48 +02:00
Masayoshi Mizuma
a2e6039cca qemu: process: Split out logic for setting the 'allowReboot' internal flag
Split out the logic which was used to determine whether qemu should
allow the guest OS to reboot for QEMU versions which don't support the
'set-action' QMP command.

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-09-06 11:13:06 +02:00
Peter Krempa
9eef395fcc qemu: process: Ignore 'RESET' event during startup
In cases when we are adding a <transient/> disk with sharing backend
(and thus hotplugging it) we need to re-initialize ACPI tables so that
the VM boots from the correct device.

This has a side-effect of emitting the RESET event and forwarding it to
the clients which is not correct.

Fix this by ignoring RESET events during startup of the VM.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-08-25 15:32:45 +02:00
Peter Krempa
3061f8f9cb qemu: process: Don't set 'allowReboot' when qemu supports 'set-action'
We don't use the value of the flag when the new handling is in place so
we don't have to initialize it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-08-25 15:32:45 +02:00
Peter Krempa
b67e450a5a qemu: command: Always use '-no-shutdown'
The '-no-shutdown' flag prevents qemu from terminating if a shutdown was
requested. Libvirt will handle the termination of the qemu process
anyways and using this consistently will allow greater flexibility for
the virDomainSetLifecycleAction API as well as will allow using
the 'system-reset' QMP command during startup to reinitiate devices
exported to the firmware.

This efectively partially reverts 0e034efaf9

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-08-25 15:32:45 +02:00
Peter Krempa
d0fad4ab2e qemuProcessLaunch: Setup handling of 'on_reboot' via QMP when starting the process
Rather than using '-no-reboot' use the QMP command to update the
lifecycle action of 'on_reboot'.

This will be identical to how we set the behaviour during lifetime and
also avoids problems with use of the 'system-reset' QMP command during
bringup of the VM (used to update the firmware table of disks when disks
were hotplugged as part of startup).

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-08-25 15:32:45 +02:00
Peter Krempa
24dab19f8a qemuProcessHandleReset: Don't emulate lifecycle actions for RESET event
The RESET event is delivered by qemu only when the guest OS is actually
allowed to reboot ('-no-reboot' or equivalent is not used) and due to
the nature of async handling of the events VM is actually already
executing guest code after the reboot, until our code gets to killing
it.

In general it should have been impossible to reach a state where the
reboot action is 'destroy' but we didn't use '-no-reboot' but due to
various bugs it was.

Due to the fact that this was not a desired operation and additionally
guest code already is executing I think the best option is not to kill
the VM any more (possible data loss?) and rely for the proper fix where
we use the new 'set-action' QMP command to enable an equivalent
behaviour to '-no-reboot' during runtime.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-08-25 15:32:44 +02:00
Peter Krempa
fa11852433 qemu: domain: Remove qemuDomainIsUsingNoShutdown
Directly use 'priv->allowReboot' as we now document what the behaiour is
to avoid another lookup.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-08-25 15:32:44 +02:00
Peter Krempa
4ffc807214 qemu: Honor 'restart' action for 'on_poweroff'
We simply terminate qemu instead of issuing a reset as the semantics of
the setting dictate.

Fix it by handling it identically to 'fake reboot'.

We need to forbid the combination of 'onReboot' -> 'destroy' and
'onPoweroff' -> reboot though as the handling would be hairy and it
honetly makes no sense.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-08-25 15:32:44 +02:00
Kristina Hanicova
8555dee6ba src & network_conf: add validation against schema in define
This patch also includes propagation of flags into the
virNetworkDefParse().

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-08-24 15:46:54 +02:00
Peter Krempa
88f7511923 qemuMonitorSetBlockIoThrottle: Remove booleans controlling used fields
All supported QEMU versions have all the fields so we can remove the
booleans controlling which fields are used on the monitor.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-08-18 09:57:34 +02:00
Peter Krempa
2a47d74758 qemu: capabilities: Rename QEMU_CAPS_CHARDEV_FD_PASS to QEMU_CAPS_CHARDEV_FD_PASS_COMMANDLINE
Make it more obvious that we care about passing FDs on the commandline
before startup of qemu, which is used to avoid startup monitor polling.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-08-18 09:54:46 +02:00
Michal Privoznik
5c254bb541 conf: Store SCSI bus length in virDomainDef
Libvirt assumes that a SCSI bus can fit up to 8 devices
(including controller itself), except for so called wide bus
which can accommodate up to 16 devices (again, including
controller). This plays important role when computing 'drive'
address in virDomainDiskDefAssignAddress(). So far, the only
driver that enables wide SCSI bus is VMX. But with newer
releases, ESX is capable of "super wide" bus (64 devices).

We can blindly bump the limit in our code because then we would
compute address that's invalid for older ESX versions that we
still want to support.

Unfortunately, I haven't found a better place where to store this
than virDomainDef.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-08-16 14:22:38 +02:00
Peter Krempa
d1aa253730 qemu: domain: Store capability overrides in NULL-terminated string list
We always process the full list so there's no value in storing the count
separately.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-08-09 10:09:00 +02:00
Peter Krempa
a257668ede qemuProcessSetupHotpluggableVcpus: Use automatic memory freeing
'bootHotplug' can be auto-freed when terminating the function and moving
the declaration of 'vcpuprops' to the loop which uses it along with
automatic freeing allows us to simplify cleanup in certain cases.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-08-06 08:53:26 +02:00
Peter Krempa
98f6f2081d util: alloc: Reimplement VIR_APPEND_ELEMENT using virAppendElement
Use virAppendElement instead of virInsertElementsN to implement
VIR_APPEND_ELEMENT which allows us to remove error handling as the
only relevant errors were removed when switching to aborting memory
allocation functions.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-08-06 08:53:25 +02:00
Ján Tomko
ff7b8043b6 util: virPidFileForceCleanupPath: add group argument
Add a version of virPidFileForceCleanupPath that takes
a 'group' bool argument and propagate it all the way
down to virProcessKillPainfullyDelay.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-08-05 11:18:09 +02:00
Peter Krempa
e286a62941 qemu: process: Extract code for submitting event handling to separate thread
The submission of the event to the helper thread has a verbose cleanup
path which was duplicated in all the event handlers. Simplify it by
extracting the code into a helper named 'qemuProcessEventSubmit' and
reuse it where appropriate.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
59ba742cbc qemu: Remove return value from qemuMonitorDomainMemoryFailureCallback
Change the callback prototype and fix the callback registered in the
process code.

The removed error messages are impossible as the enum values are
converted via VIR_ENUM helpers and guarded by compiler checks.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
b9357e939d qemu: Remove return value from qemuMonitorDomainGuestCrashloadedCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
7f984ba7eb qemu: Remove return value from qemuMonitorDomainRdmaGidStatusChangedCallback
Change the callback prototype and fix the callback registered in the
process code.

It is also impossible for @info to be non-NULL in the cleanup section so
the cleanup can be completely removed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
3b63871f2c qemu: Remove return value from qemuMonitorDomainPRManagerStatusChangedCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
a55093ec28 qemu: Remove return value from qemuMonitorDomainDumpCompletedCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
4d01996633 qemu: Remove return value from qemuMonitorDomainBlockThresholdCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
a23f9ce576 qemu: Remove return value from qemuMonitorDomainAcpiOstInfoCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
5e7d9542ec qemu: Remove return value from qemuMonitorDomainMigrationPassCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
a4e654f988 qemu: Remove return value from qemuMonitorDomainMigrationStatusCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
1ee09b5d4b qemu: Remove return value from qemuMonitorDomainSpiceMigratedCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
6e8289585b qemu: Remove return value from qemuMonitorDomainSerialChangeCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
f3d62ee3a5 qemu: Remove return value from qemuMonitorDomainNicRxFilterChangedCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
cc121412fc qemu: Remove return value from qemuMonitorDomainDeviceDeletedCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:48 +02:00
Peter Krempa
81db1e75b3 qemu: Remove return value from qemuMonitorDomainGuestPanicCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
05a6da5862 qemu: Remove return value from qemuMonitorDomainPMSuspendDiskCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
e8502f79db qemu: Remove return value from qemuMonitorDomainBalloonChangeCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
fa0af946d8 qemu: Remove return value from qemuMonitorDomainPMSuspendCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
86a5925edd qemu: Remove return value from qemuMonitorDomainPMWakeupCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
f4b36cd87c qemu: Remove return value from qemuMonitorDomainTrayChangeCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
9b69147c05 qemu: Remove return value from qemuMonitorDomainJobStatusChangeCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
5bc4f8dd0f qemu: Remove return value from qemuMonitorDomainBlockJobCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
b0487ba754 qemu: Remove return value from qemuMonitorDomainGraphicsCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
96d98a4b19 qemu: Remove return value from qemuMonitorDomainIOErrorCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
bd9a14cf6e qemu: Remove return value from qemuMonitorDomainWatchdogCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
8ed88fe9a0 qemu: Remove return value from qemuMonitorDomainRTCChangeCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
1b5097172b qemu: Remove return value from qemuMonitorDomainResumeCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
e57a537ad2 qemu: Remove return value from qemuMonitorDomainStopCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
8e95b76b1a qemu: Remove return value from qemuMonitorDomainResetCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
40950f60fc qemu: Remove return value from qemuMonitorDomainShutdownCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
b2bf8d5bab qemu: Remove return value from qemuMonitorDomainEventCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Boris Fiuczynski
9568a4d410 conf: Add s390-pv as launch security type
Add launch security type 's390-pv' as well as some tests.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-07-21 13:30:25 +02:00
Boris Fiuczynski
96bc8312aa conf: Refactor launch security to allow more types
Adding virDomainSecDef for general launch security data
and moving virDomainSEVDef as an element for SEV data.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-07-21 13:30:14 +02:00
Jiri Denemark
364995ed57 qemu: Signal domain condition in qemuProcessStop a bit later
Signaling the condition before vm->def->id is reset to -1 is dangerous:
in case a waiting thread wakes up, it does not see anything interesting
(the domain is still marked as running) and just enters virDomainObjWait
where it waits forever because the condition will never be signalled
again.

Originally it was impossible to get into such situation because the vm
object was locked all the time between signaling the condition and
resetting vm->def->id, but after commit 860a999802 released in 6.8.0,
qemuDomainObjStopWorker called in qemuProcessStop between
virDomainObjBroadcast and setting vm->def->id to -1 unlocks the vm
object giving other threads a chance to wake up and possibly hang.

In real world, this can be easily reproduced by killing, destroying, or
just shutting down (from the guest OS) a domain while it is being
migrated somewhere else. The migration job would never finish.

So let's make sure we delay signaling the domain condition to the point
when a woken up thread can detect the domain is not active anymore.

https://bugzilla.redhat.com/show_bug.cgi?id=1949869

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-07-19 15:49:16 +02:00
Kristina Hanicova
c39757f700 qemu: Do not erase duplicate devices from namespace if error occurs
If the attempt to attach a device failed, we erased the
unattached device from the namespace. This resulted in erasing an
already attached device in case of a duplicate. We need to check
for existing file in the namespace in order to determine erasing
it in case of a failure.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1780508

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-07-15 12:00:24 +02:00
Peter Krempa
a3edda6b9e qemu: Prevent two threshold events when it was registered with index
Remember whether the user passed an explicit index when registering the
event so that we can avoid the top level event when it isn't needed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-07-12 16:34:17 +02:00
zhangjl02
2f922b2c46 qemu: interface: check and use ovs command to set qos of ovs managed port
When qos is set or delete, we have to check if the port is an ovs managed
port. If true, call the virNetDevOpenvswitchInterfaceSetQos function when qos
is set, and call the virNetDevOpenvswitchInterfaceClearQos function when
the interface is to be destroyed.

Signed-off-by: Jinsheng Zhang <zhangjl02@inspur.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-07-12 09:40:13 +02:00
Michal Privoznik
fb1289c155 qemu: Don't set NVRAM label when creating it
The NVRAM label is set in qemuSecuritySetAllLabel(). There's no
need to set its label upfront. In fact, setting it twice creates
an imbalance because it's unset only once which mangles seclabel
remembering. However, plain removal of the
qemuSecurityDomainSetPathLabel() undoes the fix for the original
bug (when dynamic ownership is off then the NVRAM is not created
with cfg->user and cfg->group but as root:root). Therefore, we
have to switch to virFileOpenAs() and pass cfg->user and
cfg->group and VIR_FILE_OPEN_FORCE_OWNER flag. There's no need to
pass VIR_FILE_OPEN_FORCE_MODE because the file will be created
with the proper mode.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1969347
Fixes: bcdaa91a27
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2021-06-17 09:15:09 +02:00
Masayoshi Mizuma
7c69f72230 qemuProcessSetupDisksTransientSnapshot: Skip enabling transientOverlayCreated flag
QEMU_DOMAIN_DISK_PRIVATE(disk)->transientOverlayCreated flag
gets true unexpectedly on qemuProcessSetupDisksTransientSnapshot() when
the disk has <transient shareBacking='yes'> option.

The flag should be enabled on qemuDomainAttachDiskGeneric() after the
overlay setup is completed.

Skip enabling transientOverlayCreated for the disk here.

Fixes: 75871da0ec
Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-06-01 08:20:01 +02:00
Peter Krempa
75871da0ec qemu: Allow <transient> disks with images shared accross VMs
Implement this behaviour by skipping the disks on traditional
commandline and hotplug them before resuming CPUs. That allows to use
the support for hotplugging of transient disks which inherently allows
sharing of the backing image as we open it read-only.

This commit implements the validation code to allow it only with buses
supporting hotplug and the hotplug code while starting up the VM.

When we have such disk we need to issue a system-reset so that firmware
tables are regenerated to allow booting from such device.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-05-24 20:38:08 +02:00
Peter Krempa
34c3291139 qemu: Track creation of <transient> disk overlay individually
In preparation for hotplug of <transient> disks we'll need to track
whether the overlay file was created individually per-disk.

Add 'transientOverlayCreated' to 'struct _qemuDomainDiskPrivate' and
remove 'inhibitDiskTransientDelete' from 'qemuDomainObjPrivate' and
adjust the code for the change.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-05-24 20:38:08 +02:00
Peter Krempa
2976b6aaeb qemu: Move 'bootindex' handling for disks out of command line formatter
The logic assigning the bootindices from the legacy boot order
configuration was spread through the command line formatters for the
disk device and for the floppy controller.

This patch adds 'effectiveBootindex' property to the disk private data
which holds the calculated boot index and moves the logic of determining
the boot index into 'qemuProcessPrepareDomainDiskBootorder' called from
'qemuProcessPrepareDomainStorage'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-05-24 20:38:07 +02:00
Masayoshi Mizuma
b4d87669ba qemu_snapshot: Add the guest name to the transient disk path
Later patches will implement sharing of the backing file, so we'll need
to be able to discriminate the overlays per VM.

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-05-24 20:38:07 +02:00
Peter Krempa
b7583a5ba3 qemu: snapshot: move transient snapshot code to qemu_process
The code deals with the startup of the VM and just uses the snapshot
code to achieve the desired outcome.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-05-24 20:38:07 +02:00
Peter Krempa
06e9b0c28d qemu: process: Setup transient disks only when starting a fresh VM
Creating the overlay for the disk is needed when starting a new VM only.
Additionally for now migration with transient disks is forbidden
anyways.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-05-24 20:38:07 +02:00
Peter Krempa
92a3eddd03 Remove static analysis assertions
None of them are currently needed to pass our upstream CI, most were
either for ancient clang versions or coverity for silencing false
positives.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-05-24 20:26:20 +02:00
Peter Krempa
bbd55e9284 Drop magic comments for coverity
They were added mostly randomly and we don't really want to keep working
around of false positives.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-05-24 20:26:20 +02:00
Kristina Hanicova
bcdaa91a27 qemu: Use qemuDomainOpenFile() in qemuPrepareNVRAM()
Previously, nvram file was created with user/group owner as
'root', rather than specifications defined in libvirtd.conf. The
solution is to call qemuDomainOpenFile(), which creates file with
defined permissions and qemuSecurityDomainSetPathLabel() to set
security label for created nvram file.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1783255

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-05-21 14:36:57 +02:00
Kristina Hanicova
19967f64f4 qemu: Add check for needed paths for memory devices
When building a commandline for a DIMM memory device with
non-default access mode, the qemuBuildMemoryBackendProps() will
tell QEMU to allocate memory from per-domain memory backing dir.
But later, when preparing the host, the
qemuProcessNeedMemoryBackingPath() does not check for memory
devices at all resulting in per-domain memory backing dir not
being created which upsets QEMU.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1961114

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-05-21 08:51:11 +02:00
Michal Privoznik
655f67c68a qemu_process: Drop needless check in qemuProcessNeedMemoryBackingPath()
The aim of this function is to return whether domain definition
and/or memory device that user intents to hotplug needs a private
path inside cfg->memoryBackingDir. The rule for the memory device
that's being hotplug includes checking whether corresponding
guest NUMA node needs memoryBackingDir. Well, while the rationale
behind makes sense it is not necessary to check for that really -
just a few lines above every guest NUMA node was checked exactly
for that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-05-18 17:47:58 +02:00
Michal Privoznik
4d779874ef qemu_process: Deduplicate code in qemuProcessNeedHugepagesPath()
The aim of qemuProcessNeedHugepagesPath() is to return whether
guest needs private path inside HugeTLBFS mounts (deducted from
domain definition @def) or whether the memory device that user is
hotplugging in needs the private path (deducted from the @mem
argument). The actual creation of the path is done in the only
caller qemuProcessBuildDestroyMemoryPaths().

The rule for the first case (@def) and the second case (@mem) is
the same (domain has a DIMM device that has HP requested) and is
written twice. Move the logic into a function to deduplicate the
code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-05-18 17:47:58 +02:00
Michal Privoznik
310b37e486 qemu: Don't double free @node_cpus in qemuProcessSetupPid()
When placing vCPUs into CGroups the qemuProcessSetupPid() is
called which then enters a for() loop (around its middle) where
it calls virDomainNumaGetNodeCpumask() for each guest NUMA node.
But the latter returns only a pointer not new reference/copy and
thus the caller must not free it. But the variable is decorated
with g_autoptr() which leads to a double free.

Fixes: 2d37d8dbc9
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-04-23 11:02:21 +02:00
Luyao Zhong
2d37d8dbc9 qemu: Add support for 'restrictive' mode in numatune
Signed-off-by: Luyao Zhong <luyao.zhong@intel.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-04-19 11:39:21 +02:00
Michal Privoznik
c8238579fb lib: Drop internal virXXXPtr typedefs
Historically, we declared pointer type to our types:

  typedef struct _virXXX virXXX;
  typedef virXXX *virXXXPtr;

But usefulness of such declaration is questionable, at best.
Unfortunately, we can't drop every such declaration - we have to
carry some over, because they are part of public API (e.g.
virDomainPtr). But for internal types - we can do drop them and
use what every other C project uses 'virXXX *'.

This change was generated by a very ugly shell script that
generated sed script which was then called over each file in the
repository. For the shell script refer to the cover letter:

https://listman.redhat.com/archives/libvir-list/2021-March/msg00537.html

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-13 17:00:38 +02:00
Tim Wiederhake
c5d4d0198f qemuProcessUpdateGuestCPU: Check host cpu for forbidden features
See https://bugzilla.redhat.com/show_bug.cgi?id=1840770

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com
2021-03-26 11:40:55 +01:00
Jiri Denemark
1107c0b9c3 Do not check return value of VIR_REALLOC_N
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-03-22 12:44:18 +01:00
Jiri Denemark
b8c919b5b4 qemu: Drop redundant checks for qemuCaps before virQEMUCapsGet
virQEMUCapsGet checks for qemuCaps itself, no need to do it explicitly.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-03-22 12:44:18 +01:00
Peter Krempa
8967ad7be6 qemu: backup: Restore security label on backup disk store image on VM termination
When the backup job is terminated normally the security label is
restored by the blockjob finishing handler.

If the VM dies or is destroyed that wouldn't happen as the blockjob
handler wouldn't be called.

Restore the security label on disk store where we remember that the job
was running at the point when 'qemuBackupJobTerminate' was called.

Not resetting the security label means that we also leak the xattr
attributes remembering the label which prevents any further use of the
file, which is a problem for block devices.

This also requires that the call to 'qemuBackupJobTerminate' from
'qemuProcessStop' happens only after 'vm->pid' was reset as otherwise
the security subdrivers attempt to enter the process namespace which
fails if the process isn't running any more.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1939082
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-03-19 16:41:39 +01:00
Michal Privoznik
6e9c4811be qemu_process: Use accessor for def->mem.total_memory
When connecting to the monitor, a timeout is calculated that is
bigger the more memory guest has (because QEMU has to allocate
and possibly zero out the memory and what not, empirically
deducted). However, when computing the timeout the @total_memory
mmember is accessed directly even though
virDomainDefGetMemoryTotal() should have been used.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-03-16 09:16:13 +01:00
Peter Krempa
aa372e5a01 backup: Store 'apiFlags' in private section of virDomainBackupDef
'qemuBackupJobTerminate' needs the API flags to see whether
VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL. Unfortunately when called via
qemuProcessReconnect()->qemuProcessStop() early (e.g. if the qemu
process died while we were reconnecting) the job is cleared temporarily
so that other APIs can be called. This would mean that we couldn't clean
up the files in some cases.

Save the 'apiFlags' inside the backup object and set it from the
'qemuDomainJobObj' 'apiFlags' member when reconnecting to a VM.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-03-12 10:59:05 +01:00
Andrea Bolognani
c2180c2fd6 qemu: Set limits only when explicitly asked to do so
The current code is written under the assumption that, for all
limits except the core size, asking for the limit to be set to
zero is a no-op, and so the operation is performed
unconditionally.

While this is the behavior we want for the QEMU driver, the
virCommand and virProcess facilities are generic, and should not
implement this kind of policy: asking for a limit to be set to
zero should result in that limit being set to zero every single
time.

Add some checks in the QEMU driver, effectively moving the
policy where it belongs.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-03-08 22:41:40 +01:00
Andrea Bolognani
bd33680f02 qemu: Set all limits at the same time
qemuProcessLaunch() is the correct place to set process limits,
and in fact is where we were dealing with almost all of them,
but the memory locking limit was handled in
qemuBuildCommandLine() instead for some reason.

The code is rewritten so that the desired limit is calculated
and applied in separated steps, which will help with further
changes, but this doesn't alter the behavior.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-03-08 22:41:40 +01:00
Andrea Bolognani
9bf5c00f9b qemu: Make some minor tweaks
Doing this now will make the next changes nicer.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-03-08 22:41:40 +01:00
Peter Krempa
3c546f7eb4 qemuProcessReportLogError: Don't mark "%s: %s" as translatable
The function is constructing an error message from a prefix and the
contents of the qemu log file. Marking just two string modifiers as
translatable is pointless and will certainly confuse translators.

Remove the marking and add a comment which bypasses the
sc_libvirt_unmarked_diagnostics check.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-03-05 15:01:29 +01:00
Peter Krempa
c8ff56c7ad qemuProcessReportLogError: Remove unnecessary math for max error message
Now that error message formatting doesn't use fixed size buffers we can
drop the math for calculating the maximum chunk of log to report in the
error message and use a round number. This also makes it obvious that
the chosen number is arbitrary.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-03-05 15:01:29 +01:00
Michal Privoznik
7f482a67e4 lib: Replace virFileMakePath() with g_mkdir_with_parents()
Generated using the following spatch:

  @@
  expression path;
  @@
  - virFileMakePath(path)
  + g_mkdir_with_parents(path, 0777)

However, 14 occurrences were not replaced, e.g. in
virHostdevManagerNew(). I don't really understand why.
Fixed by hand afterwards.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-03-04 20:52:23 +01:00
Michal Privoznik
b1e3728dec lib: Replace virFileMakePathWithMode() with g_mkdir_with_parents()
These functions are identical. Made using this spatch:

  @@
  expression path, mode;
  @@
  - virFileMakePathWithMode(path, mode)
  + g_mkdir_with_parents(path, mode)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-03-04 20:52:23 +01:00
Jiri Denemark
c8f3b83c72 qemu_domainjob: Make copy of owner API
Using the job owner API name directly works fine as long as it is a
static string or the owner's thread is still running. However, this is
not always the case. For example, when the owner API name is filled in a
job when we're reconnecting to existing domains after daemon restart,
the dynamically allocated owner name will disappear with the
reconnecting thread. Any follow up usage of the pointer will read random
memory.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-02-25 09:55:31 +01:00
Peng Liang
1ac703a7d0 qemu: Add missing lock in qemuProcessHandleMonitorEOF
qemuMonitorUnregister will be called in multiple threads (e.g. threads
in rpc worker pool and the vm event thread).  In some cases, it isn't
protected by the monitor lock, which may lead to call g_source_unref
more than one time and a use-after-free problem eventually.

Add the missing lock in qemuProcessHandleMonitorEOF (which is the only
position missing lock of monitor I found).

Suggested-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-24 15:00:51 +01:00
Stefan Berger
f30aa2ec74 qemu: Fix libvirt hang due to early TPM device stop
This patch partially reverts commit 5cde9dee where the qemuExtDevicesStop()
was moved to a location before the QEMU process is stopped. It may be
alright to tear down some devices before QEMU is stopped, but it doesn't work
for the external TPM (swtpm) which assumes that QEMU sends it a signal to stop
it before libvirt may try to clean it up. So this patch moves the
virFileDeleteTree() calls after the call to qemuExtDevicesStop() so that the
pid file of virtiofsd is not deleted before that call.

Afftected libvirt versions are 6.10 and 7.0.

Fixes: 5cde9dee8c
Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-19 17:31:37 +01:00
Daniel P. Berrangé
17f001c451 qemu: record deprecation messages against the domain
These messages are only valid while the domain is running.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 09:19:12 +00:00
Peter Krempa
480fecaa21 Replace virStringListJoin by g_strjoinv
Our implementation was inspired by glib anyways. The difference is only
the order of arguments.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-11 17:05:34 +01:00
Peter Krempa
56cedfcf38 Replace virStringListHasString by g_strv_contains
The glib variant doesn't accept NULL list, but there's just one caller
where it wasn't checked explicitly, thus there's no need for our own
wrapper.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-11 17:05:33 +01:00
Peter Krempa
d9f7e87673 qemuProcessUpdateDevices: Refactor cleanup and memory handling
Use automatic memory freeing and remove the 'cleanup' label. Also make
it a bit more obvious that nothing happens if the 'old' list wasn't
present.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-11 17:05:33 +01:00
Daniel P. Berrangé
c32f172d12 qemu: wire up support for maximum CPU model
The "max" model can be treated the same way as "host" model in general.

Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-10 11:44:48 +00:00
Laine Stump
674719afe6 qemu: replace VIR_FREE with g_free in all vir*Free() functions
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2021-02-05 00:20:43 -05:00
Pavel Hrdina
836e0a960b storage_source: use virStorageSource prefix for all functions
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-01-22 11:10:27 +01:00
Pavel Hrdina
01f7ade912 util: extract virStorageFile code into storage_source
Up until now we had a runtime code and XML related code in the same
source file inside util directory.

This patch takes the runtime part and extracts it into the new
storage_file directory.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-01-22 11:10:27 +01:00
Laine Stump
c2b2cdf746 call virDomainNetNotifyActualDevice() for all interface types
Now that this function can be called regardless of interface type (and
whether or not we have a conn for the network driver), let's actually
call it for all interface types. This will assure that we re-connect
any disconnected bridge devices for <interface type='bridge'> as
mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1730084#c26
(until now we've only been reconnecting bridge devices for <interface
type='network'>)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-01-08 11:34:49 -05:00
Michal Privoznik
5ac2439a83 qemu_process: Release domain seclabel later in qemuProcessStop()
Some secdrivers (typically SELinux driver) generate unique
dynamic seclabel for each domain (unless a static one is
requested in domain XML). This is achieved by calling
qemuSecurityGenLabel() from qemuProcessPrepareDomain() which
allocates unique seclabel and stores it in domain def->seclabels.
The counterpart is qemuSecurityReleaseLabel() which releases the
label and removes it from def->seclabels. Problem is, that with
current code the qemuProcessStop() may still want to use the
seclabel after it was released, e.g. when it wants to restore the
label of a disk mirror.

What is happening now, is that in qemuProcessStop() the
qemuSecurityReleaseLabel() is called, which removes the SELinux
seclabel from def->seclabels, yada yada yada and eventually
qemuSecurityRestoreImageLabel() is called. This bubbles down to
virSecuritySELinuxRestoreImageLabelSingle() which find no SELinux
seclabel (using virDomainDefGetSecurityLabelDef()) and this
returns early doing nothing.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1751664
Fixes: 8fa0374c5b
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-01-06 13:29:09 +01:00
Jiri Denemark
f7c40b5c71 qemu: The TSC tolerance interval should be closed
The kernel refuses to set guest TSC frequency less than a minimum
frequency or greater than maximum frequency (both computed based on the
host TSC frequency). When writing the libvirt code with a reversed logic
(return success when the requested frequency falls within the tolerance
interval) I forgot to include the boundaries.

Fixes: d8e5b45600
https://bugzilla.redhat.com/show_bug.cgi?id=1839095

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-01-06 11:24:37 +01:00
Peter Krempa
d0819b9f02 qemu: Properly handle setting of <iotune> for empty cdrom
When starting a VM with an empty cdrom which has <iotune> configured the
startup fails as qemu is not happy about setting tuning for an empty
drive:

 error: internal error: unable to execute 'block_set_io_throttle', unexpected error: 'Device has no medium'

Resolve this by skipping the setting of throttling for empty drives and
updating the throttling when new medium is inserted into the drive.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/111
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2021-01-06 09:24:48 +01:00
Shi Lei
9b5d741a9d netdevmacvlan: Use helper function to create unique macvlan/macvtap name
Simplify ReserveName/GenerateName for macvlan and macvtap by using
common functions.

Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2020-12-15 13:35:33 -05:00
Shi Lei
c36cad1a31 netdevtap: Use common helper function to create unique tap name
Simplify GenerateName/ReserveName for netdevtap by using common
functions.

Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2020-12-15 13:35:27 -05:00
Daniel Henrique Barboza
9432693e2b domain_conf.c: move virDomainDeviceDefValidate() to domain_validate.c
Move virDomainDeviceDefValidate() and all its helper functions to
domain_validate.c.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-12-14 09:29:09 -03:00
Peter Krempa
18de9dfd77 virDomainDefValidate: Add per-run 'opaque' data
virDomainDefPostParse infrastructure has apart from the global opaque
data also per-run data, but this was not duplicated into the validation
callbacks.

This is important when drivers want to use correct run-state for the
validation.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-12-09 09:33:47 +01:00
Peter Krempa
c1720b9ac7 qemuDomainDiskLookupByNodename: Lookup also backup 'store' nodenames
Nodename may be asociated to a disk backup job, add support to looking
up in that chain too. This is specifically useful for the
BLOCK_WRITE_THRESHOLD event which can be registered for any nodename.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-12-08 15:12:34 +01:00
Michal Privoznik
40a162f83e qemu: Don't cache NUMA caps
In v6.0.0-rc1~439 (and friends) we tried to cache NUMA
capabilities because we assumed they are immutable. And to some
extent they are (NUMA hotplug is not a thing, is it). However,
our capabilities contain also some runtime info that can change,
e.g. hugepages pool allocation sizes or total amount of memory
per node (host side memory hotplug might change the value).

Because of the caching we might not be reporting the correct
runtime info in 'virsh capabilities'.

The NUMA caps are used in three places:

  1) 'virsh capabilities'
  2) domain startup, when parsing numad reply
  3) parsing domain private data XML

In cases 2) and 3) we need NUMA caps to construct list of
physical CPUs that belong to NUMA nodes from numad reply. And
while this may seem static, it's not really because of possible
CPU hotplug on physical host.

There are two possible approaches:

  1) build a validation mechanism that would invalidate the
     cached NUMA caps, or
  2) drop the caching and construct NUMA caps from scratch on
     each use.

In this commit, the latter approach is implemented, because it's
easier.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1819058
Fixes: 1a1d848694
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-12-07 11:32:40 +01:00
Peter Krempa
0f7b80691b qemuMonitorBlockJobInfo: Store 'ready' and 'ready_present' separately
Don't make the logic confusing by representing the 3 options using an
integer with negative values.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2020-12-07 10:15:00 +01:00
Daniel Henrique Barboza
5a34d0667d qemu: move memory size align to qemuProcessPrepareDomain()
qemuBuildCommandLine() is calling qemuDomainAlignMemorySizes(),
which is an operation that changes live XML and domain and has
little to do with the command line build process.

Move it to qemuProcessPrepareDomain() where we're supposed to
make live XML and domain changes before launch. qemuProcessStart()
is setting VIR_QEMU_PROCESS_START_NEW if !migrate && !snapshot,
same conditions used in qemuBuildCommandLine() to call
qemuDomainAlignMemorySizes(), making this change seamless.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-12-03 17:19:35 -03:00
Daniel Henrique Barboza
3bb9ed8bc2 qemu_process.c: check migrateURI when setting VIR_QEMU_PROCESS_START_NEW
qemuProcessCreatePretendCmdPrepare() is setting the
VIR_QEMU_PROCESS_START_NEW regardless of whether this is
a migration case or not. This behavior differs from what we're
doing in qemuProcessStart(), where the flag is set only
if !migrate && !snapshot.

Fix it by making the flag setting consistent with what we're
doing in qemuProcessStart().

Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-12-03 17:16:33 -03:00
John Ferlan
148cfcf051 qemu: Pass / fill niothreads for qemuMonitorGetIOThreads
Let's pass along / fill @niothreads rather than trying to make dual
use as a return value and thread count.

This resolves a Coverity issue detected in qemuDomainGetIOThreadsMon
where if qemuDomainObjExitMonitor failed, then a -1 was returned and
overwrite @niothreads causing a memory leak.

Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-12-03 17:06:07 +01:00
Michal Privoznik
b7d4e6b67e lib: Replace VIR_AUTOSTRINGLIST with GStrv
Glib provides g_auto(GStrv) which is in-place replacement of our
VIR_AUTOSTRINGLIST.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-12-02 15:43:07 +01:00
Pavel Hrdina
82bda55e2f qemuProcessHandleGraphics: no need to check for NULL
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-11-16 17:25:41 +01:00
Jiri Denemark
d8e5b45600 qemu: Do not require TSC frequency to strictly match host
Some CPUs provide a way to read exact TSC frequency, while measuring it
is required on other CPUs. However, measuring is never exact and the
result may slightly differ across reboots. For this reason both Linux
kernel and QEMU recently started allowing for guests TSC frequency to
fall into +/- 250 ppm tolerance interval around the host TSC frequency.

Let's do the same to avoid unnecessary failures (esp. during migration)
in case the host frequency does not exactly match the frequency
configured in a domain XML.

https://bugzilla.redhat.com/show_bug.cgi?id=1839095

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-11-12 17:29:16 +01:00
Masayoshi Mizuma
5cde9dee8c qemu: Move qemuExtDevicesStop() before removing the pidfiles
A qemu guest which has virtiofs config fails to start if the previous
starting failed because of invalid option or something.

That's because the virtiofsd isn't killed by virPidFileForceCleanupPath()
on the former failure because the pidfile was already removed by
virFileDeleteTree(priv->libDir) in qemuProcessStop(), so
virPidFileForceCleanupPath() just returned.

Move qemuExtDevicesStop() before virFileDeleteTree(priv->libDir) so that
virPidFileForceCleanupPath() can kill virtiofsd correctly.

For example of the reproduction:

  # virsh start guest
  error: Failed to start domain guest
  error: internal error: process exited while connecting to monitor: qemu-system-x86_64: -foo: invalid option

  ... fix the option ...

  # virsh start guest
  error: Failed to start domain guest
  error: Cannot open log file: '/var/log/libvirt/qemu/guest-fs0-virtiofsd.log': Device or resource busy
  #

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-11-11 15:20:12 +01:00
Peter Krempa
62a01d84a3 util: hash: Retire 'virHashTable' in favor of 'GHashTable'
Don't hide our use of GHashTable behind our typedef. This will also
promote the use of glibs hash function directly.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Matt Coleman <matt@datto.com>
2020-11-06 10:40:51 +01:00
Daniel P. Berrangé
99a1cfc438 qemu: honour fatal errors dealing with qemu slirp helper
Currently all errors from qemuInterfacePrepareSlirp() are completely
ignored by the callers. The intention is that missing qemu-slirp binary
should cause the caller to fallback to the built-in slirp impl.

Many of the possible errors though should indeed be considered fatal.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-10-27 12:03:19 +00:00
zhenwei pi
7555a55470 qemu: implement memory failure event
Since QEMU 5.2 (commit-77b285f7f6), QEMU supports 'memory failure'
event, posts event to monitor if hitting a hardware memory error.
Fully support this feature for QEMU.

Test with commit 'libvirt: support memory failure event', build a
little complex environment(nested KVM):
1, install newly built libvirt in L1, and start a L2 vm. run command
in L1:
 ~# virsh event l2 --event memory-failure

2, run command in L0 to inject MCE to L1:
 ~# virsh qemu-monitor-command l1 --hmp mce 0 9 0xbd000000000000c0 0xd 0x62000000 0x8c

Test result in l1(recipient hypervisor case):
event 'memory-failure' for domain l2:
recipient: hypervisor
action: ignore
flags:
        action required: 0
        recursive: 0

Test result in l1(recipient guest case):
event 'memory-failure' for domain l2:
recipient: guest
action: inject
flags:
        action required: 0
        recursive: 0

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-10-23 09:42:00 +02:00
Peter Krempa
d6d4c08daf util: hash: Change type of hash table name/key to 'char'
All users of virHashTable pass strings as the name/key of the entry.
Make this an official requirement by turning the variables to 'const
char *'.

For any other case it's better to use glib's GHashTable.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2020-10-22 15:02:46 +02:00
Daniel P. Berrangé
7b1ed1cd73 qemu: stop passing -enable-fips to QEMU >= 5.2.0
Use of the -enable-fips option is being deprecated in QEMU >= 5.2.0. If
FIPS compliance is required, QEMU must be built with libcrypt which will
unconditionally enforce it.

Thus there is no need for libvirt to pass -enable-fips to modern QEMU.
Unfortunately there was never any way to probe for -enable-fips in the
first instance, it was enabled by libvirt based on version number
originally, and then later unconditionally enabled when libvirt dropped
support for older QEMU. Similarly we now use a version number check to
decide when to stop passing -enable-fips.

Note that the qemu-5.2 capabilities are currently from the pre-release
version and will be updated once qemu-5.2 is released.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-10-22 09:03:33 +02:00
Jonathon Jongsma
08f8fd8413 conf: Add support for vDPA network devices
This patch adds new schema and adds support for parsing and formatting
domain configurations that include vdpa devices.

vDPA network devices allow high-performance networking in a virtual
machine by providing a wire-speed data path. These devices require a
vendor-specific host driver but the data path follows the virtio
specification.

When a device on the host is bound to an appropriate vendor-specific
driver, it will create a chardev on the host at e.g.  /dev/vhost-vdpa-0.
That chardev path can then be used to define a new interface with
type='vdpa'.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2020-10-20 14:46:52 -04:00
Peter Krempa
7b0ced89e7 qemu: Prepare hostdev data which depends on the host state separately
SCSI hostdev setup requires querying the host os for the actual path of
the configured hostdev. This was historically done in the command line
formatter. Our new approach is to split out this part into
'qemuProcessPrepareHost' which is designed to be skipped in tests.

Refactor the hostdev code to use this new semantics, and add appropriate
handlers filling in the data for tests and the qemuConnectDomainXMLToNative
users.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-10-20 15:08:22 +02:00
Peter Krempa
9ff3ad9058 qemuProcessCreatePretendCmd: Split up preparation and command building
Host preparation steps which are deliberately skipped when
pretend-creating a commandline are normally executed after VM object
preparation. In the test code we are faking some of the host
preparation steps, but we were doing that prior to the call to
qemuProcessPrepareDomain embedded in qemuProcessCreatePretendCmd.

By splitting up qemuProcessCreatePretendCmd into two functions we can
ensure that the ordering of the prepare steps stays consistent.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-10-20 15:08:22 +02:00
Erik Skultety
ccb40cf288 qemu: process: sev: Fill missing 'cbitpos' & 'reducedPhysBits' from caps
These XML attributes have been mandatory since the introduction of SEV
support to libvirt. This design decision was based on QEMU's
requirement for these to be mandatory for migration purposes, as
differences in these values across platforms must result in the
pre-migration checks failing (not that migration with SEV works at the
time of this patch).

This patch enables autofill of these attributes right before launching
QEMU and thus updating the live XML.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-10-19 11:03:27 +02:00
Erik Skultety
1fdc907325 qemu: process: Move SEV capability check to qemuValidateDomainDef
Checks such as this one should be done at domain def validation time,
not before starting the QEMU process.
As for this change, existing domains will see some QEMU error when
starting as opposed to a libvirt error that this QEMU binary doesn't
support SEV, but that's okay, we never guaranteed error messages to
remain the same.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-10-19 11:03:16 +02:00
Erik Skultety
649f720a9a qemu_process: sev: Drop an unused variable
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-10-19 11:01:56 +02:00
Pavel Hrdina
5ad8272888 util: vircgroup: change virCgroupFree to take only virCgroupPtr
As preparation for g_autoptr() we need to change the function to take
only virCgroupPtr.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
2020-10-09 16:24:35 +02:00
Ján Tomko
cc3190cc4c qemu: process: use g_new0
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-10-05 16:44:06 +02:00
Ján Tomko
868c350752 qemu: separate out VIR_ALLOC calls
Move them to separate conditions to reduce churn
in following patches.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-10-05 16:44:06 +02:00
Cole Robinson
0fa5c23865 qemu: Taint cpu host-passthrough only after migration
From a discussion last year[1], Dan recommended libvirt drop the tain
flag for cpu host-passthrough, unless the VM has been migrated.

This repurposes the existing host-cpu taint flag to do just that.

[1]: https://www.redhat.com/archives/virt-tools-list/2019-February/msg00041.html

https://bugzilla.redhat.com/show_bug.cgi?id=1673098

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
2020-10-05 10:08:26 -04:00
Peter Krempa
faa88866f5 Don't check return value of virBitmapNewCopy
The function will not fail any more.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-10-05 15:50:45 +02:00
Peter Krempa
cb6fdb0125 virBitmapNew: Don't check return value
Remove return value check from all callers.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-10-05 15:38:47 +02:00
Masayoshi Mizuma
1c9227de5d qemu: process: Handle transient disks on VM startup
Add overlays after the VM starts before we start executing guest code.

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Tested-by: Ján Tomko <jtomko@redhat.com>
2020-10-01 09:55:02 +02:00
Peter Krempa
afc25e8553 qemu: prepare cleanup for <transient/> disk overlays
Later patches will implement support for <transient/> disks in libvirt
by installing an overlay on top of the configured image. This will
require cleanup after the VM will be stopped so that the state is
correctly discarded.

Since the overlay will be installed only during the startup phase of the
VM we need to ensure that qemuProcessStop doesn't delete the original
file on some previous failure. This is solved by adding
'inhibitDiskTransientDelete' VM private data member which is set prior
to any startup step and will be cleared once transient disk overlays are
established.

Based on that we can then delete the overlays for any <transient/> disk.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Tested-by: Ján Tomko <jtomko@redhat.com>
2020-10-01 09:55:02 +02:00
Peter Krempa
3673bdbe13 qemu: domain: Extract preparation of hostdev specific data to a separate function
Historically we've prepared secrets for all objects in one place. This
doesn't make much sense and it's semantically more appealing to prepare
everything for a single device type in one place.

Move the setup of the (iSCSI|SCSI) hostdev secrets into a new function
which will be used to setup other things as well in the future.

This is a similar approach we do for disks.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-15 15:20:23 +02:00
Ján Tomko
af16e754cd qemuProcessReconnect: clear 'oldjob'
After we started copying the privateData pointer in
qemuDomainObjRestoreJob, we should also free them
once we're done with them.

Register the clear function and use g_auto.
Also add a check for job->cb to qemuDomainObjClearJob,
to prevent freeing an uninitialized job.

https://bugzilla.redhat.com/show_bug.cgi?id=1878450

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: aca37c3fb2
2020-09-14 18:10:56 +02:00
Tim Wiederhake
caf5a88e59 qemu: Use glib memory functions in qemuProcessReadLog
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2020-09-11 18:19:58 +02:00
Michal Privoznik
ec46e6d44b qemu_process: Separate VIR_PERF_EVENT_* setting into a function
When starting a domain, qemuProcessLaunch() iterates over all
VIR_PERF_EVENT_* values and (possibly) enables them. While there
is nothing wrong with the code, the for loop where it's done makes
it harder to jump onto next block of code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-08 10:57:24 +02:00
Martin Kletzander
f5b486daea qemu: Allow setting affinity to fail and don't report error
This is just a clean-up of commit 3791f29b08 using the new parameter of
virProcessSetAffinity() introduced in commit 9514e24984 so that there is
no error reported in the logs.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-07 14:48:57 +02:00
Martin Kletzander
9514e24984 Do not report error when setting affinity is allowed to fail
Suggested-by: Ján Tomko <jtomko@redhat.com>

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-07 11:35:36 +02:00
Nikolay Shirokovskiy
5c0cd375d1 qemu: don't shutdown event thread in monitor EOF callback
This hunk was introduced in [1] in order to avoid loosing
events from monitor on stopping qemu process. But as explained
in [2] on destroy we won't get neither EOF nor any other
events as monitor is just closed. In case of crash/shutdown
we won't get any more events as well and qemuDomainObjStopWorker
will be called by qemuProcessStop eventually. Thus let's
remove qemuDomainObjStopWorker from qemuProcessHandleMonitorEOF
as it is not useful anymore.

[1] e6afacb0f: qemu: start/stop an event loop thread for domains
[2] d2954c072: qemu: ensure domain event thread is always stopped

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-09-07 09:33:59 +03:00
Martin Kletzander
fc7d53edf4 qemu: Fix comment in qemuProcessSetupPid
This was supposed to be done in commit 3791f29b08, but I missed a spot.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2020-09-06 13:44:27 +02:00
Martin Kletzander
3791f29b08 qemu: Do not error out when setting affinity failed
Consider a host with 8 CPUs. There are the following possible scenarios

1. Bare metal; libvirtd has affinity of 8 CPUs; QEMU should get 8 CPUs

2. Bare metal; libvirtd has affinity of 2 CPUs; QEMU should get 8 CPUs

3. Container has affinity of 8 CPUs; libvirtd has affinity of 8 CPus;
   QEMU should get 8 CPUs

4. Container has affinity of 8 CPUs; libvirtd has affinity of 2 CPus;
   QEMU should get 8 CPUs

5. Container has affinity of 4 CPUs; libvirtd has affinity of 4 CPus;
   QEMU should get 4 CPUs

6. Container has affinity of 4 CPUs; libvirtd has affinity of 2 CPus;
   QEMU should get 4 CPUs

Scenarios 1 & 2 always work unless systemd restricted libvirtd privs.

Scenario 3 works because libvirt checks current affinity first and
skips the sched_setaffinity call, avoiding the SYS_NICE issue

Scenario 4 works only if CAP_SYS_NICE is availalbe

Scenarios 5 & 6 works only if CAP_SYS_NICE is present *AND* the cgroups
cpuset is not set on the container.

If libvirt blindly ignores the sched_setaffinity failure, then scenarios
4, 5 and 6 should all work, but with caveat in case 4 and 6, that
QEMU will only get 2 CPUs instead of the possible 8 and 4 respectively.
This is still better than failing.

Therefore libvirt can blindly ignore the setaffinity failure, but *ONLY*
ignore it when there was no affinity specified in the XML config.
If user specified affinity explicitly, libvirt must report an error if
it can't be honoured.

Resolves: https://bugzilla.redhat.com/1819801

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-09-04 14:44:21 +02:00
Michal Privoznik
95b9db4ee2 lib: Prefer WITH_* prefix for #if conditionals
Currently, we are mixing: #if HAVE_BLAH with #if WITH_BLAH.
Things got way better with Pavel's work on meson, but apparently,
mixing these two lead to confusing and easy to miss bugs (see
31fb929eca for instance). While we were forced to use HAVE_
prefix with autotools, we are free to chose our own prefix with
meson and since WITH_ prefix appears to be more popular let's use
it everywhere.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-02 10:28:10 +02:00
Laine Stump
95089f481e util: assign tap device names using a monotonically increasing integer
When creating a standard tap device, if provided with an ifname that
contains "%d", rather than taking that literally as the name to use
for the new device, the kernel will instead use that string as a
template, and search for the lowest number that could be put in place
of %d and produce an otherwise unused and unique name for the new
device. For example, if there is no tap device name given in the XML,
libvirt will always send "vnet%d" as the device name, and the kernel
will create new devices named "vnet0", "vnet1", etc. If one of those
devices is deleted, creating a "hole" in the name list, the kernel
will always attempt to reuse the name in the hole first before using a
name with a higher number (i.e. it finds the lowest possible unused
number).

The problem with this, as described in the previous patch dealing with
macvtap device naming, is that it makes "immediate reuse" of a newly
freed tap device name *much* more common, and in the aftermath of
deleting a tap device, there is some other necessary cleanup of things
which are named based on the device name (nwfilter rules, bandwidth
rules, OVS switch ports, to name a few) that could end up stomping
over the top of the setup of a new device of the same name for a
different guest.

Since the kernel "create a name based on a template" functionality for
tap devices doesn't exist for macvtap, this patch for standard tap
devices is a bit different from the previous patch for macvtap - in
particular there was no previous "bitmap ID reservation system" or
overly-complex retry loop that needed to be removed. We simply find
and unused name, and pass that name on to the kernel instead of
"vnet%d".

This counter is also wrapped when either it gets to INT_MAX or if the
full name would overflow IFNAMSIZ-1 characters. In the case of
"vnet%d" and a 32 bit int, we would reach INT_MAX first, but possibly
someday someone will change the name from vnet to something else.

(NB: It is still possible for a user to provide their own
parameterized template name (e.g. "mytap%d") in the XML, and libvirt
will just pass that through to the kernel as it always has.)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-09-01 14:16:44 -04:00
Laine Stump
d7f38beb2e util: replace macvtap name reservation bitmap with a simple counter
There have been some reports that, due to libvirt always trying to
assign the lowest numbered macvtap / tap device name possible, a new
guest would sometimes be started using the same tap device name as
previously used by another guest that is in the process of being
destroyed *as the new guest is starting.

In some cases this has led to, for example, the old guest's
qemuProcessStop() code deleting a port from an OVS switch that had
just been re-added by the new guest (because the port name is based on
only the device name using the port). Similar problems can happen (and
I believe have) with nwfilter rules and bandwidth rules (which are
both instantiated based on the name of the tap device).

A couple patches have been previously proposed to change the ordering
of startup and shutdown processing, or to put a mutex around
everything related to the tap/macvtap device name usage, but in the
end no matter what you do there will still be possible holes, because
the device could be deleted outside libvirt's control (for example,
regular tap devices are automatically deleted when the qemu process
terminates, and that isn't always initiated by libvirt but could
instead happen completely asynchronously - libvirt then has no control
over the ordering of shutdown operations, and no opportunity to
protect it with a mutex.)

But this only happens if a new device is created at the same time as
one is being deleted. We can effectively eliminate the chance of this
happening if we end the practice of always looking for the lowest
numbered available device name, and instead just keep an integer that
is incremented each time we need a new device name. At some point it
will need to wrap back around to 0 (in order to avoid the IFNAMSIZ 15
character limit if nothing else), and we can't guarantee that the new
name really will be the *least* recently used name, but "math"
suggests that it will be *much* less common that we'll try to re-use
the *most* recently used name.

This patch implements such a counter for macvtap/macvlan, replacing
the existing, and much more complicated, "ID reservation" system. The
counter is set according to whatever macvtap/macvlan devices are
already in use by guests when libvirtd is started, incremented each
time a new device name is needed, and wraps back to 0 when either
INT_MAX is reached, or when the resulting device name would be longer
than IFNAMSIZ-1 characters (which actually is what happens when the
template for the device name is "maccvtap%d"). The result is that no
macvtap name will be re-used until the host has created (and possibly
destroyed) 99,999,999 devices.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-09-01 14:16:36 -04:00
Ján Tomko
0a37e0695b Split declarations from initializations
Split those initializations that depend on a statement
above them.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:11 +02:00
Ján Tomko
a5152f23e7 Move declarations before statements
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:11 +02:00
Laine Stump
5cad64ec03 qemu: remove unreachable code in qemuProcessStart()
Back when the original version of this chunk of code was added (commit
41b087198 in libvirt-0.8.1 in April 2010), we used virExecDaemonize()
to start the qemu process, and would continue on in the function
(which at that time was called qemudStartVMDaemon()) even if a -1 was
returned. So it was possible to get to this code with rv == -1 (it was
called "ret" in that version of the code).

In modern libvirt code, qemu is started with virCommandRun(); then we
call virPidFileReadPath(); those are the only two ways of setting "rv"
prior to this code being removed, and in either case if the new value
of rv < 0, then we immediately skip over the rest of the code to the
cleanup: label.

This means that the code being removed by this patch is
unreachable.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-08-24 23:46:51 -04:00
Michal Privoznik
9048dc4e62 qemuDomainBuildNamespace: Populate basic /dev from daemon's namespace
As mentioned in previous commit, populating domain's namespace
from pre-exec() hook is dangerous. This commit moves population
of the namespace with basic /dev nodes (e.g. /dev/null, /dev/kvm,
etc.) into daemon's namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:36 +02:00
Michal Privoznik
8da362fe62 qemu_domain_namespace: Repurpose qemuDomainBuildNamespace()
Okay, here is the deal. Currently, the way we build namespace is
very fragile. It is done from pre-exec hook when starting a
domain, after we mass closed all FDs and before we drop
privileges and exec() QEMU. This fact poses some limitations onto
the namespace build code, e.g. it has to make sure not to keep
any FD opened (not even through a library call), because it would
be leaked to QEMU. Also, it has to call only async signal safe
functions. These requirements are hard to meet - in fact as of my
commit v6.2.0-rc1~235 we are leaking a FD into QEMU by calling
libdevmapper functions.

To solve this issue and avoid similar problems in the future, we
should change our paradigm. We already have functions which can
populate domain's namespace with nodes from the daemon context.
If we use them to populate the namespace and keep only the bare
minimum in the pre-exec hook, we've mitigated the risk.

Therefore, the old qemuDomainBuildNamespace() is renamed to
qemuDomainUnshareNamespace() and new qemuDomainBuildNamespace()
function is introduced. So far, the new function is basically a
NOP and domain's namespace is still populated from the pre-exec
hook - next patches will fix it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:36 +02:00
Michal Privoznik
764eaf1aa4 qemu_domain_namespace: Rename qemuDomainCreateNamespace()
The name of this function is not very helpful, because it doesn't
create anything, it just flips a bit in a bitmask when domain is
starting up. Move the function internals into qemu_process.c and
forget the function ever existed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:33 +02:00
Michal Privoznik
90eee87569 qemu: Separate out namespace handling code
The qemu_domain.c file is big as is and we should split it into
separate semantic blocks. Start with code that handles domain
namespaces.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:32:27 +02:00
Ján Tomko
ee247e1d3f Use g_strfeev instead of virStringFreeList
Both accept a NULL value gracefully and virStringFreeList
does not zero the pointer afterwards, so a straight replace
is safe.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2020-08-03 15:37:36 +02:00
Ján Tomko
1edf164848 Remove redundant conditions
All of these have been checked earlier.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2020-08-03 15:19:28 +02:00
Ján Tomko
6c7ba7b496 qemu: Fix affinity typo
Fixes: 4c0398b528
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2020-07-22 15:51:26 +02:00
Bihong Yu
3ee423c363 qemu: pre-create the dbus directory in qemuStateInitialize
There are races condiction to make '/run/libvirt/qemu/dbus' directory in
virDirCreateNoFork() while concurrent start VMs, and get "failed to create
directory '/run/libvirt/qemu/dbus': File exists" error message. pre-create the
dbus directory in qemuStateInitialize.

Signed-off-by: Bihong Yu <yubihong@huawei.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2020-07-22 09:40:15 +02:00
Jiri Denemark
1031db3600 qemu: Properly set //cpu/@migratable default value for running domains
Since active domains which do not have the attribute already set were
not started by libvirt that probed for CPU migratable property, we need
to check this property on reconnect and update the domain definition
accordingly.

https://bugzilla.redhat.com/show_bug.cgi?id=1857967

Reported-by: Mark Mielke <mark.mielke@gmail.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-07-21 15:40:01 +02:00
Daniel Henrique Barboza
f187b2fb98 qemu_process.c: modernize qemuProcessQMPNew()
Use g_autoptr() and remove the 'cleanup' label.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200717211556.1024748-3-danielhb413@gmail.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2020-07-21 15:34:36 +02:00
Peter Krempa
db712b0673 qemuDomainDiskLookupByNodename: Remove unused 'idx'
All callers pass NULL as the value. Remove the argument.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-07-21 09:52:46 +02:00
Peter Krempa
c414ab00e2 qemuProcessHandleBlockThreshold: Report correct indexes
The index returned by qemuDomainDiskLookupByNodename is the position in
the backing chain rather than the index we report in the XML.

Since with -blockdev they differ now and additionally the disk source
also has an index we need to fix the 'threshold' events we report:

1) If it's the top level image we must always trigger the event without
   any suffix as we did until now

2) We must report the correct index

3) We must report the correct index also for the top level image, when
   blockdev is used.

This means that we need to potentially emit 2 events, one for the device
without the index and then when blockdev is used and the top level image
has an index we must do it also with the index.

This will fix it for blockdev cases, while also not removing previous
semantics.

https://bugzilla.redhat.com/show_bug.cgi?id=1857204

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-07-21 09:52:46 +02:00
Peter Krempa
4a19b7b832 qemuDomainDiskBackingStoreGetName: Remove unused argument
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-07-21 09:52:46 +02:00
Prathamesh Chavan
aca37c3fb2 qemu_domainjob: introduce privateData for qemuDomainJob
To remove dependecy of `qemuDomainJob` on job specific
paramters, a `privateData` pointer is introduced.
To handle it, structure of callback functions is
also introduced.

Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-07-20 15:34:58 +02:00