Commit Graph

12806 Commits

Author SHA1 Message Date
Michal Privoznik
3458c3ff8c qemu_extdevice: Expose qemuExtDevicesInitPaths()
This function is going to be called outside of qemu_extdevice.c.
Expose it to the rest of the driver.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-05 14:25:15 +01:00
Michal Privoznik
f1958a3e5e qemu_extdevice: Init paths in qemuExtDevicesPrepareDomain()
The path generation phase belongs conceptually into domain
preparation phase and not host preparation. Move
qemuExtDevicesInitPaths() call from qemuExtDevicesPrepareHost()
into qemuExtDevicesPrepareDomain().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-05 14:25:03 +01:00
Michal Privoznik
107ebe62f4 qemu_process: Document qemuProcessPrepare{Domain,Host}() order
The domain startup process is split into multiple phases. One of
them is preparing the domain (at that point live) XML, private
data, various paths, etc - see qemuProcessPrepareDomain(); the
other prepares the host - see qemuProcessPrepareHost(). It's
obvious that the domain XML preparation function must be called
before the host preparation function (e.g. the host preparation
might try to create a file which path is generated in the domain
preparation phase). Nevertheless, let's document this
expectation.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-05 14:21:33 +01:00
Peter Krempa
c2a0d09046 qemuDomainDetachDeviceLiveAndConfig: Refactor cleanup
Remove the 'cleanup' label and 'ret' variable as we can now directly
return form all cases.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-05 12:03:11 +01:00
Peter Krempa
333fb6714e qemuDomainDetachDeviceLiveAndConfig: Parse XML twice rather than use virDomainDeviceDefCopy
'virDomainDeviceDefCopy' formats the definition and parses it back.
Since we already are parsing the XML here, we're better off parsing it
twice and save the formatting step.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-05 12:03:11 +01:00
Peter Krempa
645afd640c qemuDomainUpdateDeviceFlags: Parse XML twice rather than use virDomainDeviceDefCopy
'virDomainDeviceDefCopy' formats the definition and parses it back.
Since we already are parsing the XML here, we're better off parsing it
twice and save the formatting step.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-05 12:03:11 +01:00
Peter Krempa
b358995a14 qemu: driver: Fix formatting of function headers around qemuDomainAttachDevice
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-05 12:03:11 +01:00
Shaleen Bathla
465f1b9d4c qemu: Don't report spurious errors from vCPU tid validation on hotunplug timeout
Use of qemuDomainValidateVcpuInfo in the helpers for hotplug and unplug
of vCPUs can lead to spurious errors reported such as:

  internal error: qemu didn't report thread id for vcpu 'XX'"

The reason for this is that qemuDomainValidateVcpuInfo validates the
state of all vCPUs against the expected state of vCPUs. If an unplug
operation completed before libvirt was unable to process it yet the
expected state could not reflect the current state.

To avoid spurious errors the qemuDomainHotplugAddVcpu and
qemuDomainRemoveVcpu functions are modified to do localized validation
only for the vCPUs they actually modify.

We also now ensure that the cgroups are modified before bailing out on
error for any vCPUs which passed validation.

Additionally in order for qemuDomainRemoveVcpuAlias to be able to find
the unplugged vCPU we must ensure that qemuDomainRefreshVcpuInfo does
not clear out the alias in case when the vCPU is no longer reported by
qemu.

Co-authored-by: Partha Satapathy <partha.satapathy@oracle.com>
Signed-off-by: Shaleen Bathla <shaleen.bathla@oracle.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-05 11:15:08 +01:00
Michal Privoznik
713578d77f qemu_tpm: Set log file label on migration
Recently, the QEMU driver gained support for migration with TPM
state on a shared volume (e.g. NFS). As a part of that, the
destination side avoids setting seclabels on it to avoid cutting
off the source while it is still using it. Makes sense, except
for a wee bit: the secdriver API does a bit more - it also sets
label on the swtpm log file. And this one definitely needs to be
labeled (it lives under /var/log/swtpm/libvirt/qemu/..., i.e. not
on a shared volume).

Previously, qemuSecurityStartTPMEmulator() took care of that. But
during rework to shared volume migration, the code was changed so
now plain qemuSecurityCommandRun() would be run (i.e. no
relabelling).

But after previous commits, we can now chose whether the TPM
state should be relabelled or just the log file.

Fixes: 2e669ec789
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2130192#c7
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-05 10:40:52 +01:00
Michal Privoznik
3c2e55c5ed qemu_tpm: Extend start/stop APIs
This is basically just a continuation of the previous commit.
Now that the security driver APIs have a boolean flag that
controls setting/restoring seclabel of either both TPM state and
log files, or just the log file, propagate this boolean into
those APIs that start/stop swtpm emulator. For now, just pass
true. The juicy bits are soon to come.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-05 10:40:52 +01:00
Michal Privoznik
f3259f82fd security: Extend TPM label APIs
The virSecurityDomainSetTPMLabels() and
virSecurityDomainRestoreTPMLabels() APIs set/restore label on two
files/directories:

  1) the TPM state (tpm->data.emulator.storagepath), and
  2) the TPM log file (tpm->data.emulator.logfile).

Soon there will be a need to set the label on the log file but
not on the state. Therefore, extend these APIs for a boolean flag
that when set does both, but when unset does only 2).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-05 10:40:52 +01:00
Peter Krempa
dc5b8dbe66 qemuAgentSSHGetAuthorizedKeys: Convert last use ofvirJSONValueObjectGetStringArray
Use virJSONValueObjectGetArray + virJSONValueArrayToStringList instead
so that the ofvirJSONValueObjectGetStringArray wrapper can be removed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-12-02 16:18:37 +01:00
Peter Krempa
5baeebef0f qemu: monitor: Use qemuMonitorJSONGetReply in conjunction with virJSONValueArrayToStringList
In two instances (qemuMonitorJSONGetStringListProperty,
qemuMonitorJSONGetStringArray) the return value is checked by
qemuMonitorJSONCheckReply and extracted by
virJSONValueObjectGetStringArray.

We can use qemuMonitorJSONGetReply which returns it directly and then
virJSONValueArrayToStringList to convert it without the additional
lookup.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-12-02 16:18:37 +01:00
Peter Krempa
6b3bc1cb2c qemuMonitorJSONGetCPUDefinitions: Avoid double lookup of object
Using 'virJSONValueObjectHasKey' when we want to access the value
afterwards is wasteful. Fetch the JSON value right away.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-12-02 16:18:37 +01:00
Peter Krempa
662ec854d2 qemuMonitorJSONGetCPUDefinitions: Rework lookup of 'unavailable-features'
Rather than checking that the object has the correct key and then
fetching it again use fetch the array first and then use
virJSONValueArrayToStringList to directly convert it.

Additionally we can avoid the conversion if there are no members
simplifying the surrounding logic.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-12-02 16:18:37 +01:00
Peter Krempa
3b576601df qemuAgentGetDisks: Don't use virJSONValueObjectGetStringArray for optional data
The 'dependencies' field in the return data may be missing in some
cases. Historically 'virJSONValueObjectGetStringArray' didn't report
error in such case, but later refactor (commit 043b50b948 ) added
an error in order to use it in other places too.

Unfortunately this results in the error log being spammed with an
irrelevant error in case when qemuAgentGetDisks is invoked on a VM
running windows.

Replace the use of virJSONValueObjectGetStringArray by fetching the
array first and calling virJSONValueArrayToStringList only when we have
an array.

Fixes: 043b50b948
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2149752
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-12-02 16:18:37 +01:00
Peter Krempa
962ce78175 qemu: monitor: Unify and refactor 'PTY' case in qemuMonitorJSONAttachCharDev
Use qemuMonitorJSONGetReply and unify the two blocks with the same
condition.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-12-02 16:18:37 +01:00
Peter Krempa
80f1b8a5b0 qemu: monitor: Use qemuMonitorJSONGetReply when the value is extracted directly
Use qemuMonitorJSONGetReply in cases where qemuMonitorJSONCheckReply
is followed by virJSONValueObjectGet*(reply, "return").

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-12-02 16:18:37 +01:00
Peter Krempa
32573e3d23 qemu: monitor: Use qemuMonitorJSONGetReply for VIR_JSON_TYPE_ARRAY
Replace usage of the following pattern with the new helper:

  if (qemuMonitorJSONCheckReply(cmd, reply, VIR_JSON_TYPE_ARRAY) < 0)
      return -1;

  data = virJSONValueObjectGetArray(reply, "return");

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-12-02 16:18:37 +01:00
Peter Krempa
9c9adc9757 qemu: monitor: Use qemuMonitorJSONGetReply for VIR_JSON_TYPE_OBJECT
Replace usage of the following pattern with the new helper:

  if (qemuMonitorJSONCheckReply(cmd, reply, VIR_JSON_TYPE_OBJECT) < 0)
      return -1;

  data = virJSONValueObjectGetObject(reply, "return");

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-12-02 16:18:37 +01:00
Peter Krempa
a434684a57 qemu: monitor: Introduce qemuMonitorJSONGetReply, a better qemuMonitorJSONCheckReply
Rather than simply checking that the 'return' field is of the expected
type we can directly return it as the caller is very likely going to use
it. Extract the code into the new function and add a wrapper to preserve
old functionality.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-12-02 16:18:37 +01:00
Peter Krempa
2f8968ff76 qemu: migration: Use 'unsigned int' for flags
Don't continue with the historical mistake and fix all internal
functions to use a sane type for flags.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-02 16:18:37 +01:00
Peter Krempa
b0e8fb3ab8 qemu: processGuestPanicEvent: Use 'unsigned int' for flags
No need to use 'long'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-12-02 16:18:37 +01:00
Michal Privoznik
5c1b5f208a virCommandSetSendBuffer: Take double pointer of @buffer
The virCommandSetSendBuffer() function consumes passed @buffer,
but takes it only as plain pointer. Switch to a double pointer to
make this obvious. This allows us then to drop all
g_steal_pointer() in callers.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
2022-12-01 14:22:39 +01:00
Jiri Denemark
7159fb8524 qemu: Reindent qemuMigrationCookieParse prototype arguments
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-12-01 11:01:58 +01:00
Jiri Denemark
9e5b42b5eb qemu: Replace priv with qemuCaps in qemuMigrationCookieParse
QEMU capabilities is the only thing we use from priv so we can just pass
that directly.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-12-01 11:01:58 +01:00
Jiri Denemark
8745591457 qemu: Reorder qemuMigrationCookieParse arguments
When an internal API takes a vm pointer, it's usually just after the
driver argument.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-12-01 11:01:58 +01:00
Jiri Denemark
af59c944bb qemu: Pass vm to qemuMigrationCookieParse if it exists
The vm object is used inside qemuMigrationCookieParse based on the flags
passed to qemuMigrationCookieParse and the content of the cookie. The
callers should not just blindly guess and pass NULL if they
(incorrectly) think the vm object is not needed. We should always pass
the vm object unless it does not exist yet.

This fixes a bug when statistics of a completed migration reported
"Unknown" operation instead of "Incoming migration" on the destination
host.

https://bugzilla.redhat.com/show_bug.cgi?id=2137298

Fixes: v8.7.0-79-g0150f7a8c1
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-12-01 10:30:21 +01:00
Michal Privoznik
bc9d0df4f9 qemu_tpm: Check for qemuTPMSetupEncryption() errors
Inside of qemuTPMEmulatorBuildCommand() there are two calls to
qemuTPMSetupEncryption() which simply ignore returned error. This
is suboptimal because then we rely on swtpm binary reporting a
generic error (something among invalid command line arguments)
while an error reported by qemuTPMSetupEncryption() is more
specific.

However, since virCommandSetSendBuffer() only sets an error
inside of virCommand structure (the error is then reported in
virCommandRun()), we need to exempt its retval from error
checking. Thus, the signature of qemuTPMSetupEncryption() is
changed a bit so that -1/0 can be returned to indicate error.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2022-11-29 15:22:39 +01:00
Jonathon Jongsma
2a2d586043 qemu: fix memlock without vIOMMU
When there is no vIOMMU, vfio devices don't need to lock the entire guest
memory per-device, but they still need to lock the entire guest memory to
share between all vfio devices. This memory accounting is not shared
with vDPA devices, so it should be added to the memlock limit separately.

Commit 8d5704e2 added support for multiple vfio/vdpa devices but
calculated the limits incorrectly when there were both vdpa and vfio
devices and no vIOMMU. In this case, the memory lock limit was not
increased separately for the vfio devices.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2143838

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2022-11-21 15:37:41 -06:00
Jiri Denemark
3211895be3 qemu: Ignore failure in post-copy migration when QEMU says completed
When post-copy migration is running in Finish phase we already did
everything needed and we're just waiting for all the memory to transfer
to the destination. The domain is already running on there at this
point. Once all data is transferred (QEMU sends a MIGRATION completed
event) we're done. So in this specific post-copy case the source does
not need to care about the result of the Finish call as long as QEMU
says migration completed. The Finish call to the destination daemon may
fail for reasons that do not affect QEMU, e.g., libvirt daemon was
restarted there or the libvirt connection broke.

Currently we just mark the post-copy migration as failed on the source
and keep the domain paused there. But when libvirt daemon is restarted
at this point, it will detect migration finished successfully and kill
the domain as migrated. It make sense to do this even without having to
restart the daemon.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/338

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-11-21 18:17:44 +01:00
Jiri Denemark
bf77578c9c qemu: Always restore post-copy migration job on reconnect
We need the restored job even in case the migration already finished
even though we will stop it just a few lines below as the functions we
call in between require an existing migration job.

This fixes a crash on reconnect when post-copy migration finished while
the daemon was not running.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-11-21 18:17:44 +01:00
Michal Privoznik
f1154a4825 qemu_command: Generate thread-context object for main guest memory
When generating memory for main guest memory memory-backend-*
might be used. This means, we may need to generate thread-context
objects too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-11-15 11:56:08 +01:00
Michal Privoznik
f808e7c738 qemu: Generate thread-context object for memory devices
When generating memory for memory devices memory-backend-* might
be used. This means, we may need to generate thread-context
objects too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-11-15 11:56:02 +01:00
Michal Privoznik
1200aa0669 qemu_command: Generate thread-context object for guest NUMA memory
When generating memory for guest NUMA memory-backend-* might be
used. This means, we may need to generate thread-context objects
too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-11-15 11:55:40 +01:00
Michal Privoznik
ba92b86b4f qemu: Delete thread-context objects at domain startup
While technically thread-context objects can be reused, we only
use them (well, will use them) to pin memory allocation threads.
Therefore, once we connect to QEMU monitor, all memory (with
prealloc=yes) was allocated and thus these objects are no longer
needed and can be removed. For on demand allocation the TC object
is left behind.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-11-15 11:54:56 +01:00
Michal Privoznik
b03386d148 qemu_command: Introduce qemuBuildThreadContextProps()
The aim of thread-context object is to set affinity on threads
that allocate memory for a memory-backend-* object. For instance:

-object '{"qom-type":"thread-context","id":"tc-ram-node0","node-affinity":[3]}' \
-object '{"qom-type":"memory-backend-memfd","id":"ram-node0","hugetlb":true,\
          "hugetlbsize":2097152,"share":true,"prealloc":true,"prealloc-threads":8,\
          "size":15032385536,"host-nodes":[3],"policy":"preferred",\
          "prealloc-context":"tc-ram-node0"}' \

allocates 14GiB worth of memory, backed by 2MiB hugepages from
host NUMA node 3, using 8 threads. If it weren't for
thread-context these threads wouldn't have any affinity and thus
theoretically could be scheduled to run on CPUs of different NUMA
node (which is what I saw occasionally).

Therefore, whenever we are pinning memory (IOW setting host-nodes
attribute), we can generate thread-context object with the same
affinity.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-11-15 11:47:08 +01:00
Michal Privoznik
d5320907e3 qemu_capabilities: Introduce QEMU_CAPS_THREAD_CONTEXT
In its commit v7.1.0-1429-g7208429223 QEMU gained new object
thread-context, which allows running specialized tasks with
affinity set to a given subset of host CPUs/NUMA nodes. Even
though only memory allocation task accepts this new object, it's
exactly what we aim to implement in libvirt. Therefore, introduce
a new capability to track whether QEMU is capable of this object.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-11-15 11:28:45 +01:00
Michal Privoznik
e5d8697585 qemu_validate: Use proper printf directive for ssize_t
In one of recent commits an error message was introduced. In this
message a variable of type ssize_t is being printed out, but the
corresponding format directive is %ld instead of %zd which breaks
on 32bits systems. Switch to proper format.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-11 16:51:39 +01:00
Tim Wiederhake
aee64348eb Fix spelling
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
2022-11-11 16:48:48 +01:00
Lin Yang
ddb1bc0519 qemu: Add command-line to generate SGX EPC memory backend
According to the result parsing from xml, add the argument of
SGX EPC memory backend into QEMU command line.

$ qemu-system-x86_64 \
    ...... \
    -object '{"qom-type":"memory-backend-epc","id":"memepc0","prealloc":true,"size":67108864,"host-nodes":[0,1],"policy":"bind"}' \
    -object '{"qom-type":"memory-backend-epc","id":"memepc1","prealloc":true,"size":16777216,"host-nodes":[2,3],"policy":"bind"}' \
    -machine sgx-epc.0.memdev=memepc0,sgx-epc.0.node=0,sgx-epc.1.memdev=memepc1,sgx-epc.1.node=1

Signed-off-by: Lin Yang <lin.a.yang@intel.com>
Signed-off-by: Haibin Huang <haibin.huang@intel.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-11 14:06:47 +01:00
Michal Privoznik
83bb0f0ee1 qemu_namespace: Create SGX related nodes in domain's namespace
This is similar to the previous commit. SGX memory backend needs
to access /dev/sgx_vepc and /dev/sgx_provision. Create these
nodes in domain's private /dev when required by domain's config.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Haibin Huang <haibin.huang@intel.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-11 14:06:42 +01:00
Michal Privoznik
bea39eb9f3 qemu_cgroup: Allow SGX in devices controller
SGX memory backend needs to access /dev/sgx_vepc (which allows
userspace to allocate "raw" EPC without an associated enclave)
and /dev/sgx_provision (which allows creating provisioning
enclaves). Allow these two devices in CGroups if a domain is
configured so.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Haibin Huang <haibin.huang@intel.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-11 14:06:39 +01:00
Lin Yang
facadf2491 conf: Introduce SGX EPC element into device memory xml
<devices>
  ...
  <memory model='sgx-epc'>
    <source>
      <nodemask>0-1</nodemask>
    </source>
    <target>
      <size unit='KiB'>512</size>
      <node>0</node>
    </target>
  </memory>
  ...
</devices>

Signed-off-by: Lin Yang <lin.a.yang@intel.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Haibin Huang <haibin.huang@intel.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-11 14:06:34 +01:00
Haibin Huang
8db09767a9 conf: expose SGX feature in domain capabilities
Extend hypervisor capabilities to include sgx feature. When available,
the hypervisor supports launching an VM with SGX on Intel platfrom.
The SGX feature tag privides additional details like section size and
sgx1 or sgx2.

Signed-off-by: Haibin Huang <haibin.huang@intel.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-11 14:06:31 +01:00
Haibin Huang
6b7c36c8c2 Convert QMP capabilities to domain capabilities
the QMP capabilities:
  {"return":
    {
      "sgx": true,
      "section-size": 1024,
      "flc": true
    }
  }

the domain capabilities:
  <sgx>
    <flc>yes</flc>
    <epc_size>1</epc_size>
  </sgx>

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Haibin Huang <haibin.huang@intel.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-11 14:06:27 +01:00
Haibin Huang
1a68499c01 qemu: Get SGX capabilities form QMP
Generate the QMP command for query-sgx-capabilities and the command
return SGX capabilities from QMP.

{"execute":"query-sgx-capabilities"}

the right reply:
  {"return":
    {
      "sgx": true,
      "section-size": 197132288,
      "flc": true
    }
  }

the error reply:
  {"error":
    {"class": "GenericError", "desc": "SGX is not enabled in KVM"}
  }

Signed-off-by: Haibin Huang <haibin.huang@intel.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-11 14:06:24 +01:00
Peter Krempa
697e26fac6 qemu: capabilities: Detect support for JSON args for -netdev
JSON args for -netdev were added as precursor for adding the 'dgram'
network backend type. Enable the detection and update test cases using
DO_TEST_CAPS_LATEST.

Enabling the capability also ensures that the -netdev argument is
validated against the QAPI schema of 'netdev_add' which was already
implemented but not enabled.

The parser supporting JSON was added by qemu commit f3eedcddba3 and
enabled when adding stream/dgram netdevs in commit 5166fe0ae46.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-10 13:16:15 +01:00
Peter Krempa
2f6e858b3c qemuMonitorJSONQueryNamedBlockNodes: Drop 'flat' argument
All callers pass the equivalent of looking up whether qemu supports
QEMU_CAPS_QMP_QUERY_NAMED_BLOCK_NODES_FLAT. Use
'mon->queryNamedBlockNodesFlat' directly and refactor all callers.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-09 14:08:31 +01:00
Peter Krempa
bbd4d48993 qemuMonitorJSONBlockStatsUpdateCapacityBlockdev: Use 'flat' mode of query-named-block-nodes
'query-named-block-nodes' in non-flat mode returns redundantly nested
data under the 'backing-image' field. Fortunately we don't need it when
updating the capacity stats.

This function was unfortunately not fixed originally when the support
for flat mode was added. Use the flat cached in the monitor object to
force flat mode if available.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-09 14:08:31 +01:00
Peter Krempa
b0e4ad5263 qemu: monitor: Store whether 'query-named-block-nodes' supports 'flat' parameter
Rather than having callers always pass this flag store it in the
qemuMonitor object. Following patches will convert the code to use this
internal flag.

In the future this will also simplify removal when all supported qemu
versions will support the new mode.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-09 14:08:31 +01:00
Peter Krempa
3fe74ebd90 qemu: qemuBlockGetNamedNodeData: Remove pointless error path
We don't need automatic freeing for 'blockNamedNodeData' and we can
directly return it rather than checking it for NULL-ness first.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-09 14:08:31 +01:00
Peter Krempa
9c26c1bfd4 conf: Introduce support for 'hv-avic' Hyper-V enlightenment
qemu-6.2 introduced support for the hv-avic enlightenment which allows
to use Hyper-V SynIC with hardware APICv/AVIC enabled.

Implement the libvirt support for it.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/402
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-09 14:08:31 +01:00
Michal Privoznik
f68a074203 qemu: Add missing 'break' statement in couple of switch()-es
In recent commits migration of TPM on shared storage was
introduced. However, I've only complied it with gcc and thus did
not notice that clang build fails due to missing break; at the
end of some (empty) cases in switch() statements.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-09 13:52:40 +01:00
Stefan Berger
3c9968ec9a qemu: tpm: Never remove state on outgoing migration and shared storage
Never remove the TPM state on outgoing migration if the storage setup
has shared storage for the TPM state files. Also, do not do the security
cleanup on outgoing migration if shared storage is detected.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-09 12:26:42 +01:00
Stefan Berger
2e669ec789 qemu: tpm: Avoid security labels on incoming migration with shared storage
When using shared storage there is no need to apply security labels on the
storage since the files have to have been labeled already on the source
side and we must assume that the source and destination side have been
setup to use the same uid and gid for running swtpm as well as share the
same security labels. Whether the security labels can be used at all
depends on the shared storage and whether and how it supports them.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-09 12:26:38 +01:00
Stefan Berger
188dfeb398 qemu: tpm: Pass --migration option to swtpm if supported and needed
Pass the --migration option to swtpm if swptm supports it (starting
with v0.8) and if the TPM's state is written on shared storage. If this
is the case apply the 'release-lock-outgoing' parameter with this
option and apply the 'incoming' parameter for incoming migration so that
swtpm releases the file lock on the source side when the state is migrated
and locks the file on the destination side when the state is received.

If a started swtpm instance is running with the necessary options of
migrating with share storage then remember this with a flag in the
virDomainTPMPrivateDef.

Report an error if swtpm does not support the --migration option and an
incoming migration across shared storage is requested.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-09 12:26:35 +01:00
Stefan Berger
5597476e40 qemu: tpm: Add support for storing private TPM-related data
Add support for storing private TPM-related data. The first private data
will be related to the capability of the started swtpm indicating whether
it is capable of migration with a shared storage setup since that requires
support for certain command line flags that were only becoming available
in v0.8.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-09 12:26:32 +01:00
Stefan Berger
68103e9daf qemu: tpm: Conditionally create storage on incoming migration
Do not create storage if the TPM state files are on shared storage and
there's an incoming migration since in this case the storage directory
must already exist. Also do not run swtpm_setup in this case.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-09 12:26:27 +01:00
Stefan Berger
384138d790 qemu: tpm: Introduce qemuTPMHasSharedStorage()
New qemuTPMHasSharedStorage() function is introduced which
returns whether the swtpm state directory is on a shared
filesystem (e.g. NFS).

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-11-09 12:26:24 +01:00
Michal Privoznik
56de80cb79 qemu: Retire QEMU_CAPS_DISK_WRITE_CACHE
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
d974ecbab5 qemu_capabilities: Stop detecting QEMU_CAPS_DISK_WRITE_CACHE
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
f28807a1e0 qemu: Assume QEMU_CAPS_DISK_WRITE_CACHE
Introduced in QEMU's commit of v2.7.0-rc0~32^2~5 the .write-cache
attribute of virtio-blk dvice is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.

The change in some .args is justified, because the qemuxml2argvdatatest
runs these test caseses with very minimalistic set of capabilities,
that's nowhere near real life scenario.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
e2927db185 qemu: Retire QEMU_CAPS_DISK_SHARE_RW
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
589e9a769b qemu_capabilities: Stop detecting QEMU_CAPS_DISK_SHARE_RW
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
6c4148f693 qemu: Assume QEMU_CAPS_DISK_SHARE_RW
Introduced in QEMU's commit of v2.9.0-rc0~48^2~25 the .share-rw
attribute of virtio-blk device is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.

The change in controller-order.args is justified, because the
qemuxml2argvdatatest runs the test case with very minimalistic
set of capabilities, that's nowhere near real life scenario.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
d27fb06ec4 qemu: Retire QEMU_CAPS_VIRTIO_BLK_NUM_QUEUES
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
7b1d8933de qemu_capabilities: Stop detecting QEMU_CAPS_VIRTIO_BLK_QUEUE_SIZE
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
f33d9ce977 qemu: Assume QEMU_CAPS_VIRTIO_BLK_NUM_QUEUES
Introduced in QEMU's commit of v2.7.0-rc0~83^2 the .num-queues
attribute of virtio-blk device is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
c568b557d6 qemu: Retire QEMU_CAPS_BLOCKIO
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
0244d42b82 qemu_capabilities: Stop detecting QEMU_CAPS_BLOCKIO
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
718721f0f9 qemu: Assume QEMU_CAPS_BLOCKIO
Introduced in QEMU's commit of v0.13.0-rc0~1072 the
.logical_block_size attribute of virtio-blk device is always
available for all QEMU versions we support (4.2.0, currently).
Therefore, we can assume the capability is always set and thus
doesn't need to be checked for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
c40ea3eaed qemu: Retire QEMU_CAPS_VIRTIO_NET_FAILOVER
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
7c20bca6ae qemu_capabilities: Stop detecting QEMU_CAPS_VIRTIO_NET_FAILOVER
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
69eeea5d92 qemu: Assume QEMU_CAPS_VIRTIO_NET_FAILOVER
Introduced in QEMU's commit of v4.2.0-rc0~23^2~4 the .failover
attribute of virtio-net device is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
0bf7e0cf63 qemu: Retire QEMU_CAPS_VIRTIO_NET_HOST_MTU
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
2390c076ee qemu_capabilities: Stop detecting QEMU_CAPS_VIRTIO_NET_HOST_MTU
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
2eab78d5f5 qemu: Assume QEMU_CAPS_VIRTIO_NET_HOST_MTU
Introduced in QEMU's commit of v2.9.0-rc0~162^2~10 the .host_mtu
attribute of virtio-net device is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
c0896a2e80 qemu: Retire QEMU_CAPS_VIRTIO_NET_TX_QUEUE_SIZE
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
fec918000d qemu_capabilities: Stop detecting QEMU_CAPS_VIRTIO_NET_TX_QUEUE_SIZE
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
ed8696549d qemu: Assume QEMU_CAPS_VIRTIO_NET_TX_QUEUE_SIZE
Introduced in QEMU's commit of v2.10.0-rc0~95^2~20 the
.tx_queue_size attribute of virtio-net device is always available
for all QEMU versions we support (4.2.0, currently). Therefore,
we can assume the capability is always set and thus doesn't need
to be checked for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
1afab9d245 qemu: Retire QEMU_CAPS_VIRTIO_NET_RX_QUEUE_SIZE
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
5bb7fe5437 qemu_capabilities: Stop detecting QEMU_CAPS_VIRTIO_NET_RX_QUEUE_SIZE
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
7fd8465187 qemu: Assume QEMU_CAPS_VIRTIO_NET_RX_QUEUE_SIZE
Introduced in QEMU's commit of v2.8.0-rc0~116^2~26 the
.rx_queue_size attribute of virtio-net device is always available
for all QEMU versions we support (4.2.0, currently). Therefore,
we can assume the capability is always set and thus doesn't need
to be checked for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
4a7ec2b8d4 qemu: Retire QEMU_CAPS_QUERY_DISPLAY_OPTIONS
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
f02190dc54 qemu_capabilities: Stop detecting QEMU_CAPS_QUERY_DISPLAY_OPTIONS
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
b9f70ae05b qemu: Assume QEMU_CAPS_QUERY_DISPLAY_OPTIONS
Introduced in QEMU's commit of v3.1.0-rc3~8^2 the
query-display-options command is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
6e3e008f6e qemu: Retire QEMU_CAPS_BITMAP_MERGE
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
7a69622cf3 qemu_capabilities: Stop detecting QEMU_CAPS_BITMAP_MERGE
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
e42461231f qemu: Retire QEMU_CAPS_QUERY_CURRENT_MACHINE
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
55ff57dbf2 qemu_capabilities: Stop detecting QEMU_CAPS_QUERY_CURRENT_MACHINE
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
80a5dbb478 qemu: Assume QEMU_CAPS_QUERY_CURRENT_MACHINE
Introduced in QEMU's commit of v4.0.0-rc0~202^2~3 the
query-current-machine command is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
cf54743277 qemu: Retire QEMU_CAPS_QOM_LIST_PROPERTIES
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
b15e602278 qemu_capabilities: Stop detecting QEMU_CAPS_QOM_LIST_PROPERTIES
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
3c2697b54c qemu: Assume QEMU_CAPS_QOM_LIST_PROPERTIES
Introduced in QEMU's commit of v2.12.0-rc0~48^2~25 the
qom-list-properties command is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
15919f5558 qemu: Retire QEMU_CAPS_DUMP_COMPLETED
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
45d0015d86 qemu_capabilities: Stop detecting QEMU_CAPS_DUMP_COMPLETED
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
ac02c09dd8 qemu: Assume QEMU_CAPS_DUMP_COMPLETED
Introduced in QEMU's commit of v2.6.0-rc0~74^2~6 the
DUMP_COMPLETED event is always available for all QEMU versions we
support (4.2.0, currently). Therefore, we can assume the
capability is always set and thus doesn't need to be checked for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
5724035ed5 qemu: Retire QEMU_CAPS_VSERPORT_CHANGE
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
bf140a6edd qemu_capabilities: Stop detecting QEMU_CAPS_VSERPORT_CHANGE
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
c18e2fd746 qemu_agent: Drop @singleSync from _qemuAgent
Historically, before sending any guest agent command we would
send 'guest-sync' command to make guest agent reset its internal
state and flush any partially read command (json). This was
because there was no event emitted when the agent
(dis-)connected.

But now that we have the event we can execute the sync command
just once - the first time after we've connected. Should agent
disconnect in the middle of reading a command, and then connect
back again we would get the event and disconnect and connect back
again, resulting in the sync command being executed again.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
3cf0a764cd qemu: Assume QEMU_CAPS_VSERPORT_CHANGE
Introduced in QEMU's commit of v2.1.0-rc0~18^2~2 the
VSERPORT_CHANGE event is always available for all QEMU versions
we support (4.2.0, currently). Therefore, we can assume the
capability is always set and thus doesn't need to be checked for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
05aa2e1a5d qemu: Retire QEMU_CAPS_NUMA
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
8ffcafe211 qemu_capabilities: Stop detecting QEMU_CAPS_NUMA
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
8bf50fa018 qemu: Assume QEMU_CAPS_NUMA
Introduced in QEMU's commit of v3.0.0-rc0~124^2~1 the
set-numa-node command is always available for all QEMU versions
we support (4.2.0, currently). Therefore, we can assume the
capability is always set and thus doesn't need to be checked for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
b697b702ac qemu: Acquire QUERY job in qemuDomainQueryWakeupSuspendSupport()
The qemuDomainQueryWakeupSuspendSupport() does not change state
of the domain as it just runs 'query-current-machine' QMP
command. Therefore, there's no need for it to acquire MODIFY job,
QUERY job is perfectly okay.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
8d175cbe64 qemu: Drop misleading comment for qemuDomainQueryWakeupSuspendSupport()
The was an attempt to document the retvals for
qemuDomainQueryWakeupSuspendSupport(). However, it's misleading
because in reality, the function can return nothing but 0 or -1,
but the comment implies retval of 1 too.

Since the set of possible return values complies with our
unwritten rule (0 for success, -1 for error), there's no real
value in having the comment and as such can be dropped.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 12:10:50 +01:00
Michal Privoznik
91ef81a378 qemu_agent: Bring back single sync
Historically, we had no idea whether the qemu-ga running inside
the guest was running or not. Or whether it crashed in the middle
of reading of a command. That's why we issued guest-sync prior
any intended command, to make the agent flush any partially read
JSON and reset its state machine.

But with VSERPORT_CHANGE event we know when the guest agent
(dis-)connects and thus can issue the sync command just once for
each 'connection'. Whether the agent is synced is tracked in
agent->inSync member, which used to be set to true upon
successful sync. But after rework in v8.0.0-rc1~361 that line is
gone, leaving us with using the historic approach basically.

Fixes: cad84fd51e
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-08 09:17:48 +01:00
Michal Privoznik
7416d19b8d qemu: Retire QEMU_CAPS_OBJECT_MEMORY_FILE_ALIGN
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 15:27:43 +01:00
Michal Privoznik
fc141bfe88 qemu_capabilities: Stop detecting QEMU_CAPS_OBJECT_MEMORY_FILE_ALIGN
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 15:27:43 +01:00
Michal Privoznik
8d5c564622 qemu: Assume QEMU_CAPS_OBJECT_MEMORY_FILE_ALIGN
Introduced in QEMU's commit of v2.12.0-rc0~148^2~4 the .align
attribute of memory-backend-file is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 15:27:43 +01:00
Michal Privoznik
536f561d13 qemu: Retire QEMU_CAPS_OBJECT_MEMORY_FILE_DISCARD
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 15:27:43 +01:00
Michal Privoznik
881cf3c4f1 qemu_capabilities: Stop detecting QEMU_CAPS_OBJECT_MEMORY_FILE_DISCARD
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 15:27:43 +01:00
Michal Privoznik
8c0d43803b qemu: Assume QEMU_CAPS_OBJECT_MEMORY_FILE_DISCARD
Introduced in QEMU's commit of v2.11.0-rc0~95^2~9 the .discard
attribute of memory-backend-file is always available for all QEMU
versions we support (4.2.0, currently). Therefore, we can assume
the capability is always set and thus doesn't need to be checked
for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 15:27:42 +01:00
Michal Privoznik
9d86ae4ca2 qemu: Retire QEMU_CAPS_OBJECT_MEMORY_FILE
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 15:27:42 +01:00
Michal Privoznik
9b279f2d3e qemu_capabilities: Stop detecting QEMU_CAPS_OBJECT_MEMORY_FILE
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 15:27:42 +01:00
Michal Privoznik
8641fcfa63 qemu: Assume QEMU_CAPS_OBJECT_MEMORY_FILE
Introduced in QEMU's commit of v2.1.0-rc0~41^2~26 only for Linux,
and later in v3.1.0-rc0~71^2~10 for all POSIX, the
memory-backend-file is going to be present for all QEMU versions
we support (4.2.0, currently). Therefore, we can assume the
capability is always set and thus doesn't need to be checked for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 15:27:42 +01:00
Michal Privoznik
7addd1baa6 qemu: Retire QEMU_CAPS_OBJECT_MEMORY_RAM
Now that nothing uses this capability, it can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 15:27:42 +01:00
Michal Privoznik
b77f5b08a7 qemu_capabilities: Stop detecting QEMU_CAPS_OBJECT_MEMORY_RAM
All supported QEMUs have this capability. Stop detecting it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 15:27:42 +01:00
Michal Privoznik
fbbae04214 qemu: Assume QEMU_CAPS_OBJECT_MEMORY_RAM
Introduced in QEMU's commit of v2.1.0-rc0~41^2~104 the
memory-backend-ram is going to be present for all QEMU versions
we support (4.2.0, currently). Therefore, we can assume the
capability is always set and thus doesn't need to be checked for.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 15:27:42 +01:00
Michal Privoznik
eebef24d96 qemu: Drop NULL checks guarding g_slist_free_full()
The g_slist_free_full() function is perfectly capable of handling
NULL (in which case it's NOP), therefore there's no need to check
passed pointers for NULL. We have them though in couple of
places. Drop them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 10:47:57 +01:00
Peter Krempa
9acd9fa733 qemu: validate: Validate maximum start time for <clock offset='absolute'>
Glib can internally convert only unix timestamps up to
9999-12-31T23:59:59 (253402300799). Validate that the user doesn't use
more than that as otherwise we cause an assertion failure:

 (process:1183396): GLib-CRITICAL **: 14:25:00.906: g_date_time_format: assertion 'datetime != NULL' failed

Additionally adjust the schema to allow bigger values as we use
'unsigned long long' to parse the value.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128993
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-04 09:54:33 +01:00
Jiri Denemark
a607baf65a qemu: Avoid memory leak in qemuMonitorJSONExtractQueryStatsSchema
In a rare case when virHashAddEntry fails we would just leak the
structure we wanted to add to the hash table.

Fixes: e89acdbc3b
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-02 16:10:38 +01:00
Peter Krempa
423d93967a virDomainJobObj: Use 'unsigned int' instead of 'unsigned long' for 'apiFlags' field
The callers store only an 'unsigned int' in the field. Convert it to the
proper type including parser/formatter.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-02 09:20:58 +01:00
Peter Krempa
08c5c48124 qemuDomainObjPrivateXMLParseBlockjobData: Use virXMLPropUInt instead of virXPathULongHex
Use the function for the proper type.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-02 09:20:58 +01:00
Peter Krempa
83e1368d95 virDomainTimerDef: Convert 'track' field to proper enum type
Adjust the parser and add missing switch cases to make the complier
happy.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-02 09:20:58 +01:00
Peter Krempa
7fb8adc7cd virDomainTimerDef: Convert 'tickpolicy' field to proper enum type
Convert the field, adjust the XML parser to use virXMLPropEnum and add
the VIR_DOMAIN_TIMER_TICKPOLICY_LAST enum case to all appropriate
'switch' statements.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-02 09:20:58 +01:00
Peter Krempa
a3c7426839 virQEMUCapsLoadCache: Use 'virXMLPropUInt' instead of 'virXPathULong'
The libvirt version is stored in an 'unsigned int' use the proper XPath
query function for the type and remove the temporary variable.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-02 09:20:57 +01:00
Michal Privoznik
43ac2e703c qemu_namespace: Make qemuDomainGetPreservedMounts() more robust wrt running VMs
The aim of qemuDomainGetPreservedMounts() is to get a list of
filesystems mounted under /dev and optionally generate a path for
each one where they are moved temporarily when building the
namespace. And if given domain is also running it looks into its
mount table rather than at the host one. But if it did look at
the domain's private mount table, it find /dev mounted twice: the
first time by udev, the second time the tmpfs mounted by us.

Now, later in the function there's a "sorting" algorithm that
tries to reduce number of mount points needing preservation, by
identifying nested mount points. And if we keep the second
occurrence of /dev on the list, well, after the "sorting" we are
left with nothing but "/dev" because all other mount points are
nested.

Fixes: 46b03819ae
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-01 14:51:48 +01:00
Michal Privoznik
bca7a53333 qemu_namespace: Don't leak memory in qemuDomainGetPreservedMounts()
The aim of qemuDomainGetPreservedMounts() is to get a list of
filesystems mounted under /dev and optionally generate a path for
each one where they are moved temporarily when building the
namespace. And the function tries to be a bit clever about it.
For instance, if /dev/shm mount point exists, there's no need to
consider /dev/shm/a nor /dev/shm/b as preserving just 'top level'
/dev/shm gives the same result. To achieve this, the function
iterates over the list of filesystem as returned by
virFileGetMountSubtree() and removes the nested ones. However, it
does so in a bit clumsy way: plain VIR_DELETE_ELEMENT() is used
without freeing the string itself. Therefore, if all three
aforementioned example paths appeared on the list, /dev/shm/a and
/dev/shm/b strings would be leaked.

And when I think about it more, there's no real need to shrink
the array down (realloc()). It's going to be free()-d when
returning from the function. Switch to
VIR_DELETE_ELEMENT_INPLACE() then.

Fixes: cdd9205dff
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-01 14:51:48 +01:00
Peter Krempa
ecb8c93196 qemuAppendDomainMemoryMachineParams: Refactor formatting of 'dump-guest-core'
Use virTristateSwitchFromBool to fill in the default if user didn't
request it explicitly.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-11-01 13:07:20 +01:00
Martin Kletzander
86e27b3506 Remove Before=libvirt-guests.service from other services
libvirt-guests has After= dependency for all the sockets and that is enough.
With the extra Before= in the service file systemd postpones the start of the
socket activated service (when libvirt-guests is trying to connect to the
socket) until after libvirt-guests is stopped effectively making `systemctl stop
libvirt-guests` deadlock.  The reason for that is that all stop jobs are
scheduled before any start job.  Removing the redundant Before= specification
fixes this behaviour.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-10-27 17:39:19 +02:00
Ján Tomko
045072ee3a qemu: fix conversion specifier in qemuBuildVsockDevProps
vhostfd is a signed integer.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-10-24 15:36:33 +02:00
Ján Tomko
0b1da01ef2 qemu: do not attempt to pass unopened vsock FD
On normal vm startup, we open a file descriptor
for the vsock device in qemuProcessPrepareHost.

However, when doing domxml-to-native, no file descriptors are open.

Only pass the fd if it's not -1, to make domxml-to-native work.

https://bugzilla.redhat.com/show_bug.cgi?id=1777212

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-10-24 15:36:33 +02:00
Jiri Denemark
1a570f9712 qemu: Do not crash when canceling migration on reconnect
When libvirtd is restarted during an active outgoing migration (or
snapshot, save, or dump which are internally implemented as migration)
it wants to cancel the migration. But by a mistake in commit
v8.7.0-57-g2d7b22b561 the qemuMigrationSrcCancel function is called with
wait == true, which leads to an instant crash by dereferencing NULL
pointer stored in priv->job.current.

When canceling migration to file (snapshot, save, dump), we don't need
to wait until it is really canceled as no migration capabilities or
parameters need to be restored.

On the other hand we need to wait when canceling outgoing migration and
since we don't have virDomainJobData at this point, we have to
temporarily restore the migration job to make sure we can process
MIGRATION events from QEMU.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-24 15:28:47 +02:00
Jiri Denemark
4dd86f334b qemu_migration: Properly wait for migration to be canceled
In my commit v8.7.0-57-g2d7b22b561 I attempted to make
qemuMigrationSrcCancel synchronous, but failed. When we are canceling
migration after some kind of error which is detected in
in qemuMigrationSrcWaitForCompletion, jobData->status will be set to
VIR_DOMAIN_JOB_STATUS_FAILED regardless on QEMU state. So instead of
relying on the translated jobData->status in qemuMigrationSrcIsCanceled
we need to check the migration status we get from QEMU MIGRATION event.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-24 15:28:47 +02:00
Michal Privoznik
ab966b9d31 qemu: Enable for vCPUs on hotplug
As advertised in the previous commit, QEMU_SCHED_CORE_VCPUS case
is implemented for hotplug case. The implementation is very
similar to the cold boot case, except here we fork off for every
vCPU (because the implementation is done in
qemuProcessSetupVcpu() which is also the function that's called
from hotplug code). But that's okay because our hotplug APIs
allow hotplugging one device at the time.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2074559
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-20 09:01:21 +02:00
Michal Privoznik
d942422482 qemu: Enable SCHED_CORE for vCPUs
For QEMU_SCHED_CORE_VCPUS case, the vCPU threads should be placed
all into one scheduling group, but not the emulator or any of its
threads. Therefore, as soon as vCPU TIDs are detected, fork off a
child which then creates a separate scheduling group and adds all
vCPU threads into it.

Please note, this commit only handles the cold boot case. Hotplug
is going to be implemented in the next commit.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-20 09:00:45 +02:00
Michal Privoznik
000477115e qemu: Enable SCHED_CORE for helper processes
For QEMU_SCHED_CORE_FULL case, all helper processes should be
placed into the same scheduling group as the QEMU process they
serve. It may happen though, that a helper process is started
before QEMU (cold start of a domain). But we have the dummy
process running from which the QEMU process will inherit the
scheduling group, so we can use the dummy process PID as an
argument to virCommandSetRunAmong().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-20 09:00:02 +02:00
Michal Privoznik
279527334d qemu_process: Enable SCHED_CORE for QEMU process
For QEMU_SCHED_CORE_EMULATOR or QEMU_SCHED_CORE_FULL the QEMU
process (and its vCPU threads) should be placed into its own
scheduling group. Since we have the dummy process running for
exactly this purpose use its PID as an argument to
virCommandSetRunAmong().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-20 08:59:15 +02:00
Michal Privoznik
4be75216be qemu_domain: Introduce qemuDomainSchedCoreStart()
The aim of this helper function is to spawn a child process in
which new scheduling group is created. This dummy process will
then used to distribute scheduling group from (e.g. when starting
helper processes or QEMU itself). The process is not needed for
QEMU_SCHED_CORE_NONE case (obviously) nor for
QEMU_SCHED_CORE_VCPUS case (because in that case a slightly
different child will be forked off).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-20 08:58:18 +02:00
Michal Privoznik
6a1500b4ea qemu_conf: Introduce a knob to set SCHED_CORE
Ideally, we would just pick the best default and users wouldn't
have to intervene at all. But in some cases it may be handy to
not bother with SCHED_CORE at all or place helper processes into
the same group as QEMU. Introduce a knob in qemu.conf to allow
users control this behaviour.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-20 08:58:10 +02:00
Michal Privoznik
060d4c83ef qemu: Refresh rx-filters more often
There are couple of scenarios where we need to reflect MAC change
done in the guest:

  1) domain restore from a file (here, we don't store updated MAC
     in the save file and thus on restore create the macvtap with
     the original MAC),
  2) reconnecting to a running domain (here, the guest might have
     changed the MAC while we were not running),
  3) migration (here, guest might change the MAC address but we
     fail to respond to it,

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-20 08:48:31 +02:00
Michal Privoznik
7356dce2b3 qemu: Refresh state after restore from a save image
When restoring a domain from a save image, we need to query QEMU
for some runtime information that is not stored in status XML, or
even if it is, it's not parsed (e.g. virtio-mem actual size, or
soon rx-filters for macvtaps).

During migration, this is done in qemuMigrationDstFinishFresh(),
or in case of newly started domain in qemuProcessStart(). Except,
the way that the code is written, when restoring from a save
image (which is effectively a migration), the state is never
refreshed, because qemuProcessStart() sees incoming migration so
it does not refresh the state thinking it'll be done in the
finish phase. But restoring from a save image has no finish
phase. Therefore, refresh the state explicitly after the domain
was restored but before vCPUs are resumed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-20 08:48:14 +02:00
Michal Privoznik
43973de6f1 qemu: Acquire QUERY job instead of MODIFY when handling NIC_RX_FILTER_CHANGED event
We are not updating domain XML to new MAC address, just merely
setting host side of macvtap. But we don't need a MODIFY job for
that, QUERY is just fine.

This allows us to process the event should it occur during
migration.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-20 08:47:59 +02:00
Michal Privoznik
ebb1e41b3a qemu: Move parts of NIC_RX_FILTER_CHANGED event handling into a function
Parts of the code that responds to the NIC_RX_FILTER_CHANGED
event are going to be re-used. Separate them into a function
(qemuDomainSyncRxFilter()) and move the code into qemu_domain.c
so that it can be re-used from other places of the driver.

There's one slight change though: instead of passing device alias
from the just received event to qemuMonitorQueryRxFilter(), I've
switched to using the alias stored in our domain definition. But
these two are guaranteed to be equal. virDomainDefFindDevice()
made sure about that, if nothing else.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-20 08:47:31 +02:00
Michal Privoznik
1eaf118ce1 processNicRxFilterChangedEvent: Free @guestFilter and @hostFilter automatically
There's no need to call virNetDevRxFilterFree() explicitly, when
corresponding variables can be declared as
g_autoptr(virNetDevRxFilter).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-20 08:44:33 +02:00
Amneesh Singh
8c9e3dae14 qemu_driver: add new stats worker qemuDomainGetStatsVm
This patch adds a new worker qemuDomainGetStatsVm which reports the
stats returned by "query-stats" via qemuMonitorQueryStats for the VM
target.

Signed-off-by: Amneesh Singh <natto@weirdnatto.in>
2022-10-19 15:58:29 +02:00
Amneesh Singh
0f867a3831 qemu_driver: add the vCPU stats by KVM to the current stats
This patch adds the stats queried by qemuMonitorQueryStats for vCPU and
add them according to their QOM device path

Signed-off-by: Amneesh Singh <natto@weirdnatto.in>
2022-10-19 15:58:29 +02:00
Amneesh Singh
b86c77dff2 qemu_monitor: add qemuMonitorGetStatsByQOMPath
This function returns the virJSONValue object which has the
same qom_path as specified.

Signed-off-by: Amneesh Singh <natto@weirdnatto.in>
2022-10-19 15:58:29 +02:00
Amneesh Singh
08af53dcaa qemu_domain: add statsSchema to qemuDomainObjPrivate
This patch adds a hashtable for storing the stats schema and a function
to refresh it by querying "query-stats-schemas" using
qemuMonitorQueryStatsSchema

Signed-off-by: Amneesh Singh <natto@weirdnatto.in>
2022-10-19 15:58:29 +02:00
Amneesh Singh
415f8b2233 qemu_capabilities: add "query-stats-schemas" QMP command to the QEMU capabilities
Related: https://gitlab.com/libvirt/libvirt/-/issues/276

Signed-off-by: Amneesh Singh <natto@weirdnatto.in>
2022-10-19 15:58:29 +02:00
Amneesh Singh
e89acdbc3b qemu_monitor: add qemuMonitorQueryStatsSchema
Related: https://gitlab.com/libvirt/libvirt/-/issues/276

This patch adds a simple API for "query-stats-schemas" QMP command

Signed-off-by: Amneesh Singh <natto@weirdnatto.in>
2022-10-19 15:58:29 +02:00
Martin Kletzander
d057b0bfc4 qemu_driver: Fix indentation in qemuDomainGetStatsVcpu
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2022-10-19 15:57:43 +02:00
Jim Fehlig
71d9836ca1 conf: Add channel devices to domain capabilities
As qemu becomes more modularized, it is important for libvirt to advertise
availability of the modularized functionality through capabilities. This
change adds channel devices to domain capabilities, allowing clients such
as virt-install to avoid using spicevmc channel devices when not supported
by the target qemu.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-10-17 11:39:00 -06:00
Peter Krempa
e8213fb70a qemu: validate: Clarify error messages for unsupported 3d video acceleration
The error message doesn't really convey the information that 3d
acceleration works only for the 'virtio' model and similarly the same
error would be reported if qemu doesn't support acceleration, which is
hard to debug.

Split and clarify the errors.

Noticed in https://gitlab.com/libvirt/libvirt/-/issues/388

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-10-17 14:13:24 +02:00
Michal Privoznik
babcbf2d5c qemu: Create base hugepages path on memory hotplug
Users can play all sorts of games with mount points. For
instance, they can unmount and mount back a hugetlbfs and only
after that attempt to hotplug memory.

This has an unfortunate consequence though. During memory
hotplug, when qemuProcessBuildDestroyMemoryPaths() is called the
path is created with very restrictive mode (0700) because under
the hood g_mkdir_with_parents(path, 0700) is called.

Therefore, create the driver generic portion of the path
separately.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2134009
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2022-10-17 08:40:58 +02:00
Michal Privoznik
72adf3b717 qemu: Separate out hugepages basedir making
During its initialization, the QEMU driver iterates over
hugetlbfs mount points, creating the driver specific path in each
of them ($prefix/libvirt/qemu). This path is created with very
wide mode (0777) because per-domain directories are then created
under it.

Separate this code into a function so that it can be re-used.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2022-10-17 08:40:18 +02:00
Jim Fehlig
e7d6f2d958 qemu: Use command line to properly check for spice support
domcapabilities reports spice graphics support even against a minimal
qemu installation without spice modules. Checking for 'query-spice'
in the list of qmp commands supported by qemu is not sufficient to
determine spice support. Checking the command line produces acurrate
results.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-10-14 16:10:11 -06:00
Jim Fehlig
4e13cc4adb conf: Add USB redirect devices to domain capabilities
As qemu becomes more modularized, it is important for libvirt to advertise
availability of the modularized functionality through capabilities. This
change adds USB redirect devices to domain capabilities, allowing clients
such as virt-install to avoid using redirdev devices when not supported
by the target qemu.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-10-14 16:10:01 -06:00
Jiang Jiacheng
1241670abd qemu: Init address before qemuProcessShutdownOrReboot during reconnect process
When libvirt is restarted, the qemuProcessShutdownReboot command is
executed to restore the VM that is being restarted. In this case, a
coredump may occur when we hotplug a pci device since the PCI address
hasn't be inited yet. Moving the initialization of address to the front
of qemuProcessShutdownOrReboot to ensure that we have the address inited.

Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-10-12 14:47:55 +02:00
Pierre LIBEAU
f30843142a qemu: Fix race condition when detaching a device
If QEMU replies to device_del command with "DeviceNotFound"
error, then libvirt doesn't clean the device from the live
configuration.

This is because qemuMonitorDelDevice() returns -2 to
qemuDomainDeleteDevice() and instead of calling
qemuDomainRemoveDevice() the qemuDomainDetachDeviceLive() jumps
right onto cleanup label.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/359
Signed-off-by: Pierre LIBEAU <pierre.libeau@corp.ovh.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-10-12 14:08:42 +02:00
Michal Privoznik
68bf647788 qemu: Avoid memory leak in virQEMUCapsCPUDefsToModels
The @vendor variable inside virQEMUCapsCPUDefsToModels() is
allocated, but never freed. But there is actually no need for it
to be allocated, because it merely passes a retval of
virCPUGetVendorForModel() (which returns a const string) to
virDomainCapsCPUModelsAdd() (which ten accepts the argument as
const string). Therefore, drop the g_strdup() call and fix the
type of the variable.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2022-10-10 15:47:54 +02:00
Jiri Denemark
b0ff3af412 qemu_capabilities: Translate CPU blockers
Since commit "cpu_x86: Disable blockers from unusable CPU models"
(v3.8.0-99-g9c9620af1d) we explicitly disable CPU features reported by
QEMU as usability blockers for a particular CPU model when creating
baseline or host-model CPU definition. When QEMU changed canonical names
for some features (mostly those with '_' in their names), we forgot to
translate the blocker lists to names used by libvirt and the renamed
features would no longer be explicitly disabled in the created CPU model
even if they were reported as blockers by QEMU.

For example, on a host where EPYC CPU model has the following blockers

    <blocker name='sha-ni'/>
    <blocker name='mmxext'/>
    <blocker name='fxsr-opt'/>
    <blocker name='cr8legacy'/>
    <blocker name='sse4a'/>
    <blocker name='misalignsse'/>
    <blocker name='osvw'/>

we would fail to disable 'fxsr-opt':

    <cpu mode='custom' match='exact'>
      <model fallback='forbid'>EPYC</model>
      <feature policy='disable' name='sha-ni'/>
      <feature policy='disable' name='mmxext'/>
      <feature policy='disable' name='cr8legacy'/>
      <feature policy='disable' name='sse4a'/>
      <feature policy='disable' name='misalignsse'/>
      <feature policy='disable' name='osvw'/>
      <feature policy='disable' name='monitor'/>
    </cpu>

The 'monitor' feature is disabled even though it is not reported as a
blocker by QEMU because libvirt's definition of EPYC includes the
feature while it is missing in EPYC definition in QEMU.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-10 14:31:43 +02:00
Jiri Denemark
bbd2d9cb40 Introduce virCPUGetVendorForModel and use it in QEMU driver
So far QEMU driver does not get CPU model vendor from QEMU directly and
it has to ask the CPU driver for the info stored in CPU map.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-10 14:31:43 +02:00
Jiri Denemark
2784a83907 domain_capabilities: Add vendor attribute for CPU models
Even though several CPU models from various vendors are reported as
usable on a given host, user may still want to use only those that match
the host vendor. Currently the only place where users can check the
vendor of each CPU model is our CPU map, which is considered internal
and users should not really be using it directly. So to allow for such
filtering we now advertise the vendor of each CPU model in domain
capabilities.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-10 14:31:42 +02:00
Jiri Denemark
6f927dce93 qemu: Do not pass qemuCaps to virQEMUCapsCPUFeature{To,From}QEMU
The only part of qemuCaps both functions are interested in is the CPU
architecture. Changing them to expect just virArch makes the functions
more reusable.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-10 14:31:42 +02:00
Jiri Denemark
f0554d88fb conf: virDomainCapsCPUModelsAdd never fails
Since the function always returns 0, we can just return void and make
callers simpler.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-10 14:31:42 +02:00
Peter Krempa
9b3828e263 qemu: capabilities: Convert virQEMUCapsLoadCache to virXMLParse
Use virXMLParse so that the code doesn't have to explicitly allocate
an XPath context and validate the root element.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-06 10:54:25 +02:00
Peter Krempa
402c31f3ac virDomainDefParseNode: Pass only the XPath context as argument
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-06 10:54:25 +02:00
Peter Krempa
1eb67d24de conf: network: Provide only virNetworkDefParse
Replace virNetworkDefParseString/File by direct calls to
virNetworkDefParse.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-06 10:54:25 +02:00
Peter Krempa
573f764ee4 conf: backup: Remove virDomainBackupDefParseNode
Rename virDomainBackupDefParse to virDomainBackupDefParseXML and use
it in place of virDomainBackupDefParseNode. This is possible as
virXMLParse can be used to replace XPath context allocation and root
node checking.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-10-06 10:54:25 +02:00
Stefan Berger
92f7aafced qemu: tpm: Remove TPM state after successful migration
This patch 'fixes' the behavior of the persistent_state TPM domain XML
attribute that intends to preserve the state of the TPM but should not
keep the state around on all the hosts a VM has been migrated to. It
removes the TPM state directory structure from the source host upon
successful migration when non-shared storage is used. Similarly, it
removes it from the destination host upon migration failure when
non-shared storage is used.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-10-04 16:34:28 +02:00
Stefan Berger
60a06693cc qemu: Add UNDEFINE_TPM and UNDEFINE_KEEP_TPM flags
Add UNDEFINE_TPM and UNDEFINE_KEEP_TPM flags to qemuDomainUndefineFlags()
API and --tpm and --keep-tpm to 'virsh undefine'. Pass the
virDomainUndefineFlagsValues via qemuDomainRemoveInactive()
from qemuDomainUndefineFlags() all the way down to
qemuTPMEmulatorCleanupHost() and delete TPM storage there considering that
the UNDEFINE_TPM flag has priority over the persistent_state attribute
from the domain XML. Pass 0 in all other API call sites to
qemuDomainRemoveInactive() for now.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-10-04 16:34:28 +02:00
Ján Tomko
d6245e36c2 qemu: retire QEMU_CAPS_CCW
Now that we no longer use the capability, stop probing for existence
of 'virtual-css-bridge' and its properties.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-10-03 19:46:42 +02:00
Ján Tomko
bbaa22e24a qemu: retire QEMU_CAPS_CCW_CSSID_UNRESTRICTED
Now that it is no longer used, stop probing for it.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-10-03 19:46:42 +02:00
Ján Tomko
0662e6bd36 qemu: Assume QEMU_CAPS_CCW
Introduced in libvirt by:
  commit f245a9791c
    qemu: introduce capability for virtual-css-bridge

Which mentions that its support was in QEMU 2.7.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-10-03 19:46:42 +02:00
Ján Tomko
b02568f1be qemu: Assume QEMU_CAPS_CCW_CSSID_UNRESTRICTED
This capability was introduced by libvirt commit:
  commit 263e65fd20
      qemu: introduce vfio-ccw capability

It probes for the cssid-unrestricted property of
virtual-css-bridge, which was introduced in QEMU v2.12 by:
  commit 99577c492fb2916165ed9bc215f058877f0a4106
      s390x/css: unrestrict cssids

Since we bumped the minimum QEMU version to 4.2.0, assume
this property is always present.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-10-03 19:46:42 +02:00
Michal Privoznik
3478cca80e qemuProcessReconnect: Don't build memory paths
Let me take you on a short trip to history. A long time ago,
libvirt would configure all QEMUs to use $hugetlbfs/libvirt/qemu
for their hugepages setup. This was problematic, because it did
not allow enough separation between guests. Therefore in
v3.0.0-rc1~367 the path changed to a per-domain basis:

  $hugetlbfs/libvirt/qemu/$domainShortName

And to help with migration on daemon restart a call to
qemuProcessBuildDestroyMemoryPaths() was added to
qemuProcessReconnect() (well, it was named
qemuProcessBuildDestroyHugepagesPath() back then, see
v3.10.0-rc1~174). This was desirable then, because the memory
hotplug code did not call the function, it simply assumes
per-domain paths to exist. But this changed in v3.5.0-rc1~92
after which the per-domain paths are created on memory hotplug
too.

Therefore, it's no longer necessary to create these paths in
qemuProcessReconnect(). They are created exactly when needed
(domain startup and memory hotplug).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-30 10:09:42 +02:00
Michal Privoznik
135233df26 qemuNamespaceMknodOne: Call g_file_read_link() in async-signal-safe fashion
When creating a node in QEMU's namespace the whole link chain is
created with it. Here, we use g_file_read_link() from the child
(running inside the namespace) to learn whether a link exists and
points to expected target. Now, when building the namespace there
can't be any symlinks and this g_file_read_link() returns NULL
always. And because we pass a local GError variable to it, glib
tries to set it to a localized error message. This comes with
creating a (static) hash table inside of g_strerror() and is
guarded with a mutex. The hash table is also allocated using
GSlice allocator instead of g_malloc, and since the latter is
safe to use after fork (because it's documented to use plain
malloc), glib went with the former, naturally. Now, GSlice
allocator has plenty of internal mutexes and thus hitting a
locked mutex is not that hard.

Fortunately, we don't care about any error from
g_file_read_link() and thus we can pass NULL which avoids calling
g_strerror().

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2120965
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2022-09-30 10:09:32 +02:00
Michal Privoznik
84adb87105 qemuNamespaceMknodPaths: Don't fork needlessly
The qemuNamespaceMknodPaths() function is responsible for
creating files/directories in QEMU's mount namespace. When
called, it is given list of paths that have to be created in the
namespace. It processes this list and removes items that are not
directly under /dev, but on a 'shared' filesystem (note that all
other mount points are preserved). And it may so happen that
after this pre-process no files/directories need to be created in
the namespace. If that's the case, exit early and avoid
fork()-ing only to find out the same.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-29 16:48:54 +02:00
Lin Ma
85aafea449 qemu: Remove host-passthrough validation check for host-phys-bits=on
Besides the -cpu host, The host-phys-bits=on applies to custom or max
cpu model, So the host-passthrough validation check is unnecessary for
maxphysaddr with mode='passthrough'.

Signed-off-by: Lin Ma <lma@suse.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
2022-09-29 08:45:03 -06:00
lu zhipeng
d95c79fbd0 qemu: fix memory leak about virDomainEventTunableNew
For prevent memory leak and easier to use, So change
virDomainEventTunableNew to get virTypedParameterPtr *params
and set it = NULL.

Signed-off-by: lu zhipeng <luzhipeng@cestc.cn>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2022-09-27 10:04:20 +02:00
Kristina Hanicova
fa2a7f888c qemu_monitor_json: remove unnecessary variable 'rc'
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2022-09-26 15:53:37 +02:00
Michal Privoznik
0377177c78 qemu_process.c: Propagate hugetlbfs mounts on reconnect
When reconnecting to a running QEMU process, we construct the
per-domain path in all hugetlbfs mounts. This is a relict from
the past (v3.4.0-100-g5b24d25062) where we switched to a
per-domain path and we want to create those paths when libvirtd
restarts on upgrade.

And with namespaces enabled there is one corner case where the
path is not created. In fact an error is reported and the
reconnect fails. Ideally, all mount events are propagated into
the QEMU's namespace. And they probably are, except when the
target path does not exist inside the namespace. Now, it's pretty
common for users to mount hugetlbfs under /dev (e.g.
/dev/hugepages), but if domain is started without hugepages (or
more specifically - private hugetlbfs path wasn't created on
domain startup), then the reconnect code tries to create it.
But it fails to do so, well, it fails to set seclabels on the
path because, because the path does not exist in the private
namespace. And it doesn't exist because we specifically create
only a subset of all possible /dev nodes. Therefore, the mount
event, whilst propagated, is not successful and hence the
filesystem is not mounted. We have to do it ourselves.

If hugetlbfs is mount anywhere else there's no problem and this
is effectively a dead code.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2123196
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-09-23 16:33:48 +02:00
Michal Privoznik
5853d70718 qemu_namespace: Introduce qemuDomainNamespaceSetupPath()
Sometimes it may come handy to just bind mount a directory/file
into domain's namespace. Implement a thin wrapper over
qemuNamespaceMknodPaths() which has all the logic we need.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-09-23 16:33:09 +02:00
Michal Privoznik
46b03819ae qemu_namespace: Fix a corner case in qemuDomainGetPreservedMounts()
When setting up namespace for QEMU we look at mount points under
/dev (like /dev/pts, /dev/mqueue/, etc.) because we want to
preserve those (which is done by moving them to a temp location,
unshare(), and then moving them back). We have a convenience
helper - qemuDomainGetPreservedMounts() - that processes the
mount table and (optionally) moves the other filesystems too.
This helper is also used when attempting to create a path in NS,
because the path, while starting with "/dev/" prefix, may
actually lead to one of those filesystems that we preserved.

And here comes the corner case: while we require the parent mount
table to be in shared mode (equivalent of `mount --make-rshared /'),
these mount events propagate iff the target path exist inside the
slave mount table (= QEMU's private namespace). And since we
create only a subset of /dev nodes, well, that assumption is not
always the case.

For instance, assume that a domain is already running, no
hugepages were configured for it nor any hugetlbfs is mounted.
Now, when a hugetlbfs is mounted into '/dev/hugepages', this is
propagated into the QEMU's namespace, but since the target dir
does not exist in the private /dev, the FS is not mounted in the
namespace.

Fortunately, this difference between namespaces is visible when
comparing /proc/mounts and /proc/$PID/mounts (where PID is the
QEMU's PID). Therefore, if possible we should look at the latter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-09-23 16:32:51 +02:00
Michal Privoznik
687374959e qemu_namespace: Tolerate missing ACLs when creating a path in namespace
When creating a path in a domain's mount namespace we try to set
ACLs on it, so that it's a verbatim copy of the path in parent's
namespace. The ACLs are queried upfront (by
qemuNamespaceMknodItemInit()) but this is fault tolerant so the
pointer to ACLs might be NULL (meaning no ACLs were queried, for
instance because the underlying filesystem does not support
them). But then we take this NULL and pass it to virFileSetACLs()
which immediately returns an error because NULL is invalid value.

Mimic what we do with SELinux label - only set ACLs if they are
non-NULL which includes symlinks.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-09-23 15:47:54 +02:00
Michal Privoznik
a8947db1a4 qemu_domain: Ignore all but SCSI hostdevs in qemuDomainDeviceHostdevDefPostParseRestoreBackendAlias()
When retiring QEMU_CAPS_BLOCKDEV_HOSTDEV_SCSI capability the
commit removed a bit too much. Previously, all other devices than
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI were ignored in
qemuDomainDeviceHostdevDefPostParseRestoreBackendAlias(). But the
commit in question removed not only the capability check but also
this return early statement. Restore it back.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2129239
Fixes: dc8dbb27d4
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-09-23 15:28:34 +02:00
Peter Krempa
4b70a0519c virStateInitialize: Propagate whether running in monolithic daemon mode to stateful driver init
Upcoming patch which is fixing the opening of drivers in monolithic mode
needs to know whether we are inside 'libvirtd' but the code where the
decision needs to happen is not re-compiled per daemon. Thus we need to
pass this information to the stateful driver init function so that it
can be remebered.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-13 10:50:02 +02:00
Michal Privoznik
f14f8dff93 qemu_process: Don't require a hugetlbfs mount for memfd
The aim of qemuProcessNeedHugepagesPath() is to determine whether
a hugetlbfs mount point is required for given domain (as in
whether qemuBuildMemoryBackendProps() picks up
memory-backend-file pointing to a hugetlbfs mount point). Well,
when domain is configured to use memfd backend then that
condition can never be true. Therefore, skip creating domain's
private path under hugetlbfs mount points.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-09-12 12:04:55 +02:00
Peter Krempa
b0c680853a qemu: monitor: Renumber QEMU_MONITOR_MIGRATE_RESUME
Now that all preceding flags were deleted we can fix the enum value.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Peter Krempa
bc753aa6f7 qemu: migration: Remove QEMU_MONITOR_MIGRATE_BACKGROUND
'qemuMonitorJSONMigrate' is called from:
 - qemuMonitorMigrateToHost
 - qemuMonitorMigrateToSocket
   Both of the above function are called only from
   qemuMigrationSrcStart.

 - qemuMonitorMigrateToFd
   - called from:
     - qemuMigrationSrcToFile
       Both instances here pass QEMU_MONITOR_MIGRATE_BACKGROUND
       directly.
     - qemuMigrationSrcStart

qemuMigrationSrcStart is then called from qemuMigrationSrcRun and
qemuMigrationSrcResume, both of which always add QEMU_MONITOR_MIGRATE_BACKGROUND
to the flags.

Thus any caller always passes the flag so that we can remove the flag
altogether.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Peter Krempa
d5fb23bc6e qemu: monitor: Drop support for old-style non-shared storage migration
Remove the support for enabling the 'blk' and 'inc' parameters of the
'migrate' command as there are no users any more.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Peter Krempa
62b3f97aee qemu: migration: Don't attempt to fall back to old-style storage migration
QEMU supported the NBD server required for the new-style migration for a
long time already and when coupled with -blockdev the old style
migration doesn't even work, thus remove support for it.

This patch modifies the code to check that the destination returned data
for the NBD migration and returns an error if it did not and deletes the
fallback code paths which would not work.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Peter Krempa
2980268b22 qemu: capabilities: Retire QEMU_CAPS_NBD_SERVER
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Peter Krempa
94ff4f2f91 qemu: migration: Always assume support for QEMU_CAPS_NBD_SERVER
The NBD server (detected via 'nbd-server-start' qmp command) was added
to qemu in v1.3 and can't be compiled out.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Peter Krempa
83ffeae75a qemu: migration: Fix setup of non-shared storage migration in qemuMigrationSrcBeginPhase
In commit 6111b23522 removing pre-blockdev code paths I've
improperly refactored the setup of non-shared storage migration.

Specifically the code checking that there are disks and setting up the
NBD data in the migration cookie was originally outside of the loop
checking the user provided list of specific disks to migrate, but became
part of the block as it was not un-indented when a higher level block
was being removed.

The above caused that if non-shared storage migration is requested, but
the user doesn't provide the list of disks to migrate (thus implying to
migrate every appropriate disk) the code doesn't actually setup the
migration and then later on falls back to the old-style migration which
no longer works with blockdev.

Move the check that there's anything to migrate out of the
'nmigrate_disks' block.

Fixes: 6111b23522
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2125111
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/373
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Kristina Hanicova
ecc742126a qemu & conf: move BeginNestedJob & BeginJobNowait into src/conf
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2022-09-07 12:15:28 +02:00
Kristina Hanicova
4435c026b7 qemu & conf: move BeginAsyncJob & EndAsyncJob into src/conf
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2022-09-07 12:15:06 +02:00