31314 Commits

Author SHA1 Message Date
Michal Privoznik
f931cb7f21 conf: Introduce virtio-mem <memory/> model
The virtio-mem is paravirtualized mechanism of adding/removing
memory to/from a VM. A virtio-mem-pci device is split into blocks
of equal size which are then exposed (all or only a requested
portion of them) to the guest kernel to use as regular memory.
Therefore, the device has two important attributes:

  1) block-size, which defines the size of a block
  2) requested-size, which defines how much memory (in bytes)
     is the device requested to expose to the guest.

The 'block-size' is configured on command line and immutable
throughout device's lifetime. The 'requested-size' can be set on
the command line too, but also is adjustable via monitor. In
fact, that is how management software places its requests to
change the memory allocation. If it wants to give more memory to
the guest it changes 'requested-size' to a bigger value, and if it
wants to shrink guest memory it changes the 'requested-size' to a
smaller value. Note, value of zero means that guest should
release all memory offered by the device. Of course, guest has to
cooperate. Therefore, there is a third attribute 'size' which is
read only and reflects how much memory the guest still has. This
can be different to 'requested-size', obviously. Because of name
clash, I've named it 'current' and it is dealt with in future
commits (it is a runtime information anyway).

In the backend, memory for virtio-mem is backed by usual objects:
memory-backend-{ram,file,memfd} and their size puts the cap on
the amount of memory that a virtio-mem device can offer to a
guest. But we are already able to express this info using <size/>
under <target/>.

Therefore, we need only two more elements to cover 'block-size'
and 'requested-size' attributes. This is the XML I've came up
with:

  <memory model='virtio-mem'>
    <source>
      <nodemask>1-3</nodemask>
      <pagesize unit='KiB'>2048</pagesize>
    </source>
    <target>
      <size unit='KiB'>2097152</size>
      <node>0</node>
      <block unit='KiB'>2048</block>
      <requested unit='KiB'>1048576</requested>
    </target>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
  </memory>

I hope by now it is obvious that:

  1) 'requested-size' must be an integer multiple of
     'block-size', and
  2) virtio-mem-pci device goes onto PCI bus and thus needs PCI
     address.

Then there is a limitation that the minimal 'block-size' is
transparent huge page size (I'll leave this without explanation).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-01 11:02:53 +02:00
Michal Privoznik
ed7c51b42e qemu_capabilities: Introduce QEMU_CAPS_MEMORY_BACKEND_RESERVE
This capability tracks whether memory-backend-* supports .reserve
attribute which is going to be important for backends associated
with virtio-mem devices.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-01 11:02:09 +02:00
Michal Privoznik
284d9c46d7 qemu_capabilities: Introduce QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI
This commit introduces a new capability that reflects virtio-mem-pci
device support in QEMU:

  QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI, /* -device virtio-mem-pci */

The virtio-mem-pci device was introduced in QEMU 5.1.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-01 11:01:32 +02:00
Michal Privoznik
45aa4c1d2a virhostmem: Introduce virHostMemGetTHPSize()
New virHostMemGetTHPSize() is introduced which allows caller to
obtain THP PMD (Page Middle Directory) size, which is equal to
the minimal size that THP can use, taken from kernel doc
(Documentation/admin-guide/mm/transhuge.rst):

  Some userspace (such as a test program, or an optimized memory allocation
  library) may want to know the size (in bytes) of a transparent hugepage::

    cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size

Since this size depends on the host architecture and the kernel
it won't change whilst libvirtd is running. Therefore, we can use
virOnce() and cache the value. Of course, we can be running under
kernel that has THP disabled or has no notion of THP at all. In
that case a negative value is returned to signal error.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-10-01 10:58:27 +02:00
Michal Privoznik
9c47d2754c qemuBuildNumaCommandLine: Separate out building of CPU list
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-10-01 10:52:35 +02:00
Michal Privoznik
c9f47bfc7a qemuBuildNumaCommandLine: Move vars into loops
There are two variables that are used only in a single
loop. Move their definitions into their respective blocks.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-10-01 10:52:35 +02:00
Michal Privoznik
c7d7cae5cc virCPUDefParseXML: Prefer virXMLPropUInt over virXPathUInt
When parsing CPU topology, which is described in <topology/>
attributes we can use virXMLPropUInt() instead of virXPathUInt()
as the former results in shorter code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-10-01 10:52:35 +02:00
Michal Privoznik
97fbb7e7e8 virCPUDefParseXML: Parse uint using virXPathUInt()
There is no need to use virXPathULong() and a temporary UL
variable if we can use virXPathUInt() directly.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-10-01 10:52:35 +02:00
Ján Tomko
0522f02f35 qemu: deprecate QEMU_CAPS_FSDEV_CREATEMODE
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-27 10:11:22 +02:00
Ján Tomko
43fac71b70 qemu: assume QEMU_CAPS_FSDEV_CREATEMODE
Added by QEMU commit:
b96feb2cb9 "9pfs: local: Add support for custom fmode/dmode in 9ps
mapped security modes"
in 2.10.0

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-27 10:11:22 +02:00
Ján Tomko
f501cec73d qemu: Deprecate QEMU_CAPS_MACHINE_KERNEL_IRQCHIP
Now that it's no longer used, remove probing for it
and mark it as deprecated.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-27 10:11:22 +02:00
Ján Tomko
7cd2e25991 qemu: assume QEMU_CAPS_MACHINE_KERNEL_IRQCHIP
Even though we only allow this option on x86,
all QEMUs report the command line option.

Added in QEMU v1.1:
6a48ffaaa7 "kvm: Activate in-kernel irqchip support"

Remove the pointless capability.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-27 10:11:21 +02:00
Ján Tomko
c0f82ba205 qemu: capabilities: do not look at parameters for sandbox
Assume the presence of the 'sandbox' option is enough,
no need to look at the parameters.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-27 10:11:21 +02:00
Ján Tomko
3f3cf5899c qemu: capabilities: deprecate QEMU_CAPS_SECCOMP_BLACKLIST
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-27 10:11:21 +02:00
Ján Tomko
cfb8951e68 qemu: seccomp: remove dead code
There is no QEMU we support that would need the old syntax
for -sandbox on.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-27 10:11:21 +02:00
Ján Tomko
d1be5aa6a4 qemu: conf: simplify seccomp_sandbox comment
It contains too many negations and conditions that are
no longer relevant now that we only support QEMU >= 2.11.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-27 10:11:21 +02:00
Ján Tomko
142938f5c2 qemu: always assume QEMU_CAPS_SECCOMP_BLACKLIST
elevateprivileges was introduced by QEMU commit:
73a1e64725 "seccomp: add elevateprivileges argument to command line"
released in 2.11.0
and later made conditional on SECCOMP support by:
9d0fdecbad sandbox: disable -sandbox if CONFIG_SECCOMP undefined

Use the existence of the sandbox option as a witness for its support.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-27 10:11:21 +02:00
Zhenzhong Duan
88a3977922 qemu: ingore the transient domain state in fake reboot
When action for 'on_poweroff' is set to 'restart', 'fake reboot'
is triggered and qemu shutdown state is transient. Domain state
need not to be changed and events not sent in this case.

Fixes: 4ffc807214cb80086d57e1d3e7b60959a41d2874
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-27 09:53:20 +02:00
Peter Krempa
960ec985a2 qemu: capabilities: Retire QEMU_CAPS_SPICE_FILE_XFER_DISABLE
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-24 10:37:01 +02:00
Peter Krempa
686caa57e5 qemu: validate: Always assume QEMU_CAPS_SPICE_FILE_XFER_DISABLE
QEMU added the capability to disable file transfers via spice in commit
5ad24e5f3b ("spice: Add -spice disable-agent-file-transfer cmdline
option (rhbz#961850)") released in qemu-v1.6.0 and the option can't be
disabled.

Remove the unnecessary validation.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-24 10:37:01 +02:00
Peter Krempa
41763b6cfa qemu: capabilities: Retire QEMU_CAPS_VNC_MULTI_SERVERS
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-24 10:37:00 +02:00
Peter Krempa
c94c76c4e6 qemu: command: Always QEMU_CAPS_VNC_MULTI_SERVERS
All supported qemu versions now use the new commandline parser
functions, thus we can remove the old-style commandline generator.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-24 10:37:00 +02:00
Peter Krempa
3fa36eeb7a qemu: capabilities: Retire QEMU_CAPS_VNC_OPTS
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-24 10:37:00 +02:00
Peter Krempa
087dbb16c6 qemu: command: Always assume QEMU_CAPS_VNC_OPTS
The switch to QemuOpts parser which brought the long-form options
happened in qemu commit 4db14629c3 ("vnc: switch to QemuOpts, allow
multiple servers") released in v2.3.0.

We can always assume this capability and remove the old-style
generators.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-24 10:37:00 +02:00
Peter Krempa
01c65d761c qemu: command: Simplify 'vnc' commandline generator
'qemuDomainSecretGraphicsPrepare' always populates 'gfxPriv->tlsAlias'
when 'cfg->vncTLS' is enabled.

This means we can remove the fallback code setting up TLS for vnc via
the 'x509=' parameter.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-24 10:37:00 +02:00
Peter Krempa
33ebfe3756 qemuBuildTLSx509BackendProps: Remove unused 'qemuCaps'
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-24 10:37:00 +02:00
Peter Krempa
62b019c0fe qemu: capabilities: Retire QEMU_CAPS_OBJECT_TLS_CREDS_X509
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-24 10:37:00 +02:00
Peter Krempa
18de1d7621 qemu: Always assume presence of QEMU_CAPS_OBJECT_TLS_CREDS_X509
The 'tls-creds-x509' object is always registered even when qemu is built
without gnutls for all supported qemu versions. This means we cannot
probe for its support and thus simplify the code using TLS.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-24 10:36:59 +02:00
Ján Tomko
4a6d874946 ch: use g_auto in virCHMonitorBuildVMJson
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2021-09-23 15:32:01 +02:00
Ján Tomko
b4436cc3f5 ch: use g_auto in virCHMonitorBuildNetsJson
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2021-09-23 15:32:00 +02:00
Ján Tomko
08b943d641 ch: use g_auto in virCHMonitorBuildNetJson
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2021-09-23 15:32:00 +02:00
Ján Tomko
1149a6ddc7 ch: use g_auto in virCHMonitorBuildDisksJson
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2021-09-23 15:32:00 +02:00
Ján Tomko
48a089a964 ch: use g_auto in virCHMonitorBuildDiskJson
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2021-09-23 15:32:00 +02:00
Ján Tomko
25ffb2ce86 ch: use g_auto in virCHMonitorBuildCPUJson
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2021-09-23 15:32:00 +02:00
Peter Krempa
f5d8913f91 qemu: driver: Remove unused variable 'cfg'
Commit a50c473ad6c988a2 removed last use of 'cfg' from
qemuDomainMemoryPeek and qemuDomainScreenshot triggering a compile time
warning.

Fixes: a50c473ad6c988a249bf79a30fb7c6dc19733347
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2021-09-23 13:47:00 +02:00
Luke Yue
28d5ee324a test_driver: Introduce testDomainGetStatsIOThread
Introduce testDomainGetStatsIOThread to add support for
testConnectGetAllDomainStats to get IOThread infos.

Signed-off-by: Luke Yue <lukedyue@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 13:41:36 +02:00
Luke Yue
57709f0579 test_driver: Implement virConnectGetAllDomainStats
Implement virConnectGetAllDomainStats in a modular way just like QEMU
driver, though remove some params in GetStatsWorker that we don't need
in test driver currently.

Only add the worker to get state so far, more worker will be added
in the future.

Signed-off-by: Luke Yue <lukedyue@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 13:41:31 +02:00
Luke Yue
fd205b6712 test_driver: Implement testDomainSetIOThreadParams
Signed-off-by: Luke Yue <lukedyue@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 13:41:27 +02:00
Luke Yue
cde87e941f test_driver: Implement virDomainPinIOThread
Signed-off-by: Luke Yue <lukedyue@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 13:41:24 +02:00
Luke Yue
5af7036ec0 test_driver: Implement virDomainGetIOThreadInfo
If we use test driver on different machines, and use 0 as bitmap_size
for virDomainDriverGetIOThreadsConfig(), we would get different results for
the `CPU Affinity`, because it's depending on the host CPU's bitmap. In
order to get a stable result for testing, use result of
virDomainDefGetVcpus() as bitmap_size instead.

Signed-off-by: Luke Yue <lukedyue@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 13:41:22 +02:00
Luke Yue
fac57323fc domain_driver.c: Introduce and use virDomainDriverGetIOThreadsConfig()
The test driver can share the same code with qemu driver when implement
testDomainGetIOThreadsConfig, so extract it for test driver to use.

Also add a new parameter `bitmap_size` to the function, it's used for
specifying the bitmap size of the bitmap to generate, it would be helpful
for test driver or some special situation.

Signed-off-by: Luke Yue <lukedyue@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 13:41:19 +02:00
Luke Yue
958d0a5099 test_driver: Implement virDomainDelIOThread
Signed-off-by: Luke Yue <lukedyue@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 13:41:16 +02:00
Luke Yue
04d25261a6 test_driver: Implement virDomainAddIOThread
Introduce testDomainChgIOThread at the same time, could be used for
virDomainDelIOThread etc.

Signed-off-by: Luke Yue <lukedyue@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 13:41:13 +02:00
Luke Yue
6650d14f6c test_driver: Introduce testIOThreadInfo and generate IOThread infos
Introduce testIOThreadInfo to store IOThread infos: iothread_id,
poll_max_ns, poll_grow and poll_shrink for future usage.

Add an example of IOThread configuration to testdomfc4.xml, we also want
to generate default testIOThreadInfo for the IOThread configured in the
xml, so introduce testDomainGenerateIOThreadInfos, the values are taken
from QEMU.

Signed-off-by: Luke Yue <lukedyue@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 13:41:11 +02:00
Luke Yue
cb3033776f domain_driver.c: Introduce and use virDomainDriverAddIOThreadCheck()
The test driver can share the same code with qemu driver when implement
testDomainAddIOThreadCheck and testDomainDelIOThreadCheck, so extract
them for test driver to use.

Signed-off-by: Luke Yue <lukedyue@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 13:41:07 +02:00
Peng Liang
c4f3c955d5 qemu: don't change ownership of cache directory
Commit 6bcf25017bc6 ("virDomainMemoryPeek API") introduced memory peek
and commit 9936aecfd1b4 ("qemu: Implement the driver methods")
introduced screenshot.  Both of them will put temporary files in
/var/cache/libvirt/qemu, and the temporary files are created by QEMU.
Therefore, the ownership of /var/cache/libvirt/qemu should be changed to
user and group configured in qemu.conf to make sure that QEMU process
can create and write files in the cache directory.

Libvirt will only put the temporary files in /var/cache/libvirt/qemu
until commit cbde35899b90 ("Cache result of QEMU capabilities
extraction"), which will put the cache of QEMU capabilities in
'capabilities' subdir of the cache directory.  Because the capabilities
is used by libvirt, the ownership of both 'capabilities' subdir and
capabilities files are root.  However, when QEMU process runs as a
regular user (e.g. qemu user), the ownership of /var/cache/libvirt/qemu
will be changed to qemu:qemu while that of
/var/cache/libvirt/qemu/capabilities will be still root:root.  Then the
regular user could spoof different capabilities, which maybe lead to
denial of service.

Since the previous patch has move the temp files of screenshot and
memory peek to per-domain directory, no one except domain capabilities
uses cacheDir currently.  And since domain capabilities are used by
libvirtd instead of QEMU, no need to change the ownership of cacheDir to
qemu:qemu explicitly.

Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 12:42:26 +02:00
Peng Liang
a50c473ad6 qemu: move temp file of screenshot and memorypeek to per-domain dir
The temp files of screenshot and memory peek, which are created by QEMU,
are put in the cache directory.  However, the caches of domain
capabilities, which are created and used by libvirtd, are also put in
the cache directory.  In order to make the cache directory more secure,
move the temp files of screenshot and memory peek to per-domain
directory.

Since the temp files are just temporary files and are only used by
libvirtd (libvirtd will delete them after use), the use of screenshot
and memory peek will be affected.

Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 12:42:26 +02:00
Tim Wiederhake
ddbbbcd969 virDomainDefParseXML: Use automatic memory management
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-09-23 11:09:22 +02:00
Peter Krempa
f147634a38 qemu: command: Remove qemuBuildRBDSecinfoURI
Merge the code into the only caller.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-22 14:53:56 +02:00
Peter Krempa
0151c092fb qemu: domain: Rename secrets setup function
Since there's just one type left, we can change the name to a more
generic one.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-09-22 14:53:56 +02:00