Commit Graph

36122 Commits

Author SHA1 Message Date
Michal Privoznik
f1f4cbb50a ch_monitor: Avoid possible double free in virCHMonitorClose()
The virCHMonitorClose() is meant to be called when monitor to
cloud-hypervisor process closes. It removes the socket and frees
string containing path to the socket.

In general, there is a problem with the following pattern:

  if (var) {
      do_something();
      g_free(var);
  }

because if the pattern executes twice the variable is freed
twice. That's why we have VIR_FREE() macro. Well, replace plain
g_free() with g_clear_pointer(). Mind you, this is NOT a
destructor where clearing pointers is needless.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-12 09:11:36 +01:00
John Levon
4be361a385 test_driver: provide basic disk hotunplug support
Signed-off-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-12 09:10:44 +01:00
John Levon
c530a96151 test_driver: provide basic disk hotplug support
Add some basic plumbing, based on the qemu driver.

Signed-off-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-12 09:10:44 +01:00
Martin Kletzander
27ae5e602a qemu_hotplug: Report better error message for platform serial devices
This should be better than the current for both hotplug:

    error: internal error: Invalid target model for serial device

and hot-unplug:

    error: An error occurred, but the cause is unknown

which should not be reached at all.

Resolves: https://issues.redhat.com/browse/RHEL-66222
Resolves: https://issues.redhat.com/browse/RHEL-66223
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-11 12:48:42 +01:00
Martin Kletzander
52c2e3e0a7 qemu: Expose qemuChrIsPlatformDevice outside from qemu_command
Then it can be used from qemu_hotplug.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-11 12:48:42 +01:00
Boris Fiuczynski
bf0308b2d4 qemu: command: add multi boot device support on s390x
If QEMU supports multi boot device make use of it instead of using the
single boot device machine parameter.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-11 10:15:06 +01:00
Boris Fiuczynski
3ccf692e08 qemu: capabilities: Add QEMU_CAPS_VIRTIO_CCW_DEVICE_LOADPARM
Add capability QEMU_CAPS_VIRTIO_CCW_DEVICE_LOADPARM to detect multi boot
device support in QEMU by checking the virtio-blk-ccw device property
existence of loadparm.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-11 10:15:06 +01:00
Michal Privoznik
a3b8753db9 virnetdevopenvswitch: Warn on unsupported QoS settings
Let me preface this with stating the obvious: documentation on
QoS in OVS is very sparse. This is all based on my observation
and OVS codebase analysis.

For the following QoS setting:

  <bandwidth>
    <inbound average="512" peak="1024" burst="32"/>
  </bandwidth>

the following QoS setting is generated into OVS (NB, our XML
values are in KiB/s, OVS has them in bits/s):

  # ovs-vsctl list qos
  _uuid               : a087226b-2da6-4575-ad4c-bf570cb812a9
  external_ids        : {ifname=vnet1, vm-id="7714e6b5-4885-4140-bc59-2f77cc99b3b5"}
  other_config        : {burst="262144", max-rate="8192000", min-rate="4096000"}
  queues              : {0=655bf3a7-e530-4516-9caf-ec9555dfbd4c}
  type                : linux-htb

from which the following topology is generated:

  # for i in qdisc class; do tc -s -d -g $i show dev vnet1; done
  qdisc htb 1: root refcnt 2 r2q 10 default 0x1 direct_packets_stat 0 ver 3.17 direct_qlen 1000
   Sent 2186 bytes 16 pkt (dropped 0, overlimits 0 requeues 0)
   backlog 0b 0p requeues 0

  +---(1:fffe) htb rate 8192Kbit ceil 8192Kbit linklayer ethernet burst 1499b/1mpu 60b cburst 1499b/1mpu 60b level 7
       |       Sent 2186 bytes 16 pkt (dropped 0, overlimits 0 requeues 0)
       |       backlog 0b 0p requeues 0
       |
       +---(1:1) htb prio 0 quantum 51200 rate 4096Kbit ceil 8192Kbit linklayer ethernet burst 32Kb/1mpu 60b cburst 32Kb/1mpu 60b level 0
                 Sent 2186 bytes 16 pkt (dropped 0, overlimits 0 requeues 0)
                 backlog 0b 0p requeues 0

Long story short, the default class (1:) for an OVS interface has
average and peak set exactly as requested. But since it's nested
under another class (1:fffe), it can borrow unused bandwidth. And
the parent is set to have rate = ceil = peak from our XML. From
[1]: htb_tc_install() calls htb_parse_qdisc_details__() which
sets: 'hc->min_rate = hc->max_rate;' and then calls
htb_setup_class_(..., tc_make_handle(1, 0xfffe), tc_make_handle(1, 0), &hc);
to set up the top parent class.

In other words - the interface is set up to so that it can always
consume 'peak' bandwidth and there is no way for us to set it up
differently. It's too late to deny setting 'peak' different to
'average' at XML validation phase so do the next best thing -
throw a warning, just like we do in case <bandwidth/> is set for
an unsupported <interface/> type.

1: https://github.com/openvswitch/ovs/blob/main/lib/netdev-linux.c#L5039
Resolves: https://issues.redhat.com/browse/RHEL-53963
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-11-08 10:41:02 +01:00
Michal Privoznik
844d1036eb qemu_domain: Automagically add IOMMU if needed
If a Q35 domain has huge number of vCPUS (over 255, currently), then
it needs IOMMU with Extended Interrupt Mode enabled (see check in
qemuValidateDomainVCpuTopology()).

Well, we already add some devices and to other tricks when
parsing new domain XML. Might as well add IOMMU device if above
condition is met.

Resolves: https://issues.redhat.com/browse/RHEL-65844
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-07 10:46:33 +01:00
Michal Privoznik
b15047ff26 qemu: Turn EIM IOMMU on automagically
If a Q35 domain has huge number of vCPUS (over 255, currently), then
it needs IOMMU with Extended Interrupt Mode enabled (see check in
qemuValidateDomainVCpuTopology()).

Well, we already add some devices and to other tricks when
parsing new domain XML. Might as well turn the EIM on for IOMMU
device.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-07 10:46:33 +01:00
Michal Privoznik
a9797d7c43 libvirt_private.syms: Export virDomainIOMMUDefNew()
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-07 10:46:32 +01:00
Ján Tomko
e45313c031 ch: check return value of virJSONValueArrayAppend
It only errors out when presented with a non-array, but we do check
it everywhere else.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-06 17:12:32 +01:00
Ján Tomko
da66bf53b0 util: json: check return value of virJSONValueFromJsonC
In virJSONValueFromJsonC, the return value of virJSONValueFromJsonC
was not checked in one case.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-06 17:12:32 +01:00
Ján Tomko
13f40898ab qemu: chardev: avoid impossible overflow
In the rare case where int and long long are not the same size,
the multiplication of an int variable and an int constant might
overflow. Cast the constant to long long to avoid this.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: baa4edfb79
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-06 17:12:32 +01:00
Marc-André Lureau
bb5e26749f qemu: explicit swtpm state locking
With upcoming v0.10 swtpm (commit
aa483aeb6d),
file locking with "lock" option is now supported and reflected in
"tpmstate-opt-lock" capability.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-11-05 15:25:53 +01:00
Marc-André Lureau
f1304cc566 qemu_tpm: handle file/block storage source
When swtpm reports "nvram-backend-dir", it can accepts a single file or
block device where TPM state will be stored. --tpmstate must be
backend-uri=file://<path>.

Teach the storage to use custom directory or file source location.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-11-05 15:25:53 +01:00
Marc-André Lureau
a110042d0c schema: add TPM emulator <source type='dir' path='..'>
Learn to parse a directory for the TPM state.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-11-05 15:25:53 +01:00
Marc-André Lureau
579fd44612 schema: add TPM emulator <source type='file' path='..'>
Learn to parse a file path for the TPM state.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-11-05 15:25:53 +01:00
Marc-André Lureau
6d4eb07a55 tpm: rename 'storagepath' to 'source_path'
Mechanically replace existing 'storagepath' with 'source_path', as the
following patches introduce <source path='..'> configuration.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-11-05 15:25:53 +01:00
Marc-André Lureau
cc0aab9395 util: check swtpm nvram-backend-{dir,file} capabilities
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-11-05 15:25:53 +01:00
Martin Kletzander
a52cd504b3 qemu: Report supported panic device models in domcapabilities
Domain capabilities include information about support for various
devices and models.

Panic devices are not included in the output which means that management
applications need to include the logic for choosing the right device
model or request a default model and try defining such a domain.

Add reporting of panic device models into the domain capabilities based
on the logic in qemuValidateDomainDefPanic() and also report whether
panic devices are supported based on whether at least one model is
supported.  That way consumers of the domain capability XML can
differentiate between libvirt not reporting the panic device models or
no model being supported.

Resolves: https://issues.redhat.com/browse/RHEL-65187
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-05 09:57:37 +01:00
Ján Tomko
faf6edfa74 json: do not call json_tokener_free with NULL
Add an error message for the rare case if json_tokener_new
fails (allocation failure) and guard any use of json_tokener_free
where tok might be NULL (this was possible in libvirt-nss
when the json file could not be opened).

https://gitlab.com/libvirt/libvirt/-/issues/581

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reported-by: Simon Pilkington
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2024-11-04 12:15:10 +01:00
Peter Krempa
d02140383d virstring: Use 'g_new0' instead of improper use of 'g_malloc0_n'
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-01 15:52:18 +01:00
Peter Krempa
bb4bd9d31f Replace improper use of g_malloc(0) with g_new0
Completely remove use of g_malloc (without zeroing of the allocated
memory) and forbid further use.

Replace use of g_malloc0 in cases where the variable holding the pointer
has proper type.

In all of the above cases we can use g_new0 instead.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-01 15:52:18 +01:00
Peter Krempa
354a3d2be4 virJSONValueFromString: Prefix error message from 'json-c'
The error message from 'json-c' was passed along without any libvirt
string which makes it hard to find in the source and isn't exactly clear
when present in logs:

 libvirtd[843]: internal error : invalid utf-8 string

Prefix the message with 'failed to parse JSON'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-11-01 15:51:53 +01:00
Jiri Denemark
e71a510605 qemu: Fix maximum physical address size in baseline CPU
We should include maximum physical address size in the CPU definition
created by virConnectBaselineHypervisorCPU only if we know the value for
all input CPUs. Otherwise we would create a CPU definition that is not
usable on all hosts from which we gathered the CPU info.

https://issues.redhat.com/browse/RHEL-24850

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-01 10:19:24 +01:00
Laine Stump
7581e3b6d5 Revert "network: add rule to nftables backend that zeroes checksum of DHCP responses"
This reverts commit 42ab0148dd.

This patch was supposed to fix the checksum of dhcp response packets
by setting it to 0 (because having a non-0 but incorrect checksum was
causing the packets to be droppe on FreeBSD guests).

Early testing was positive, but after the patch was pushed upstream
and more people could test it, it turned out that while it fixed the
dhcp checksum problem for virtio-net interfaces on FreeBSD and
OpenBSD, it also *broke* dhcp checksums for the e1000 emulated NIC on
*all* guests (but not e1000e).

So we're reverting this fix and looking for something more universal
to be included in the next release.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-10-30 11:39:58 +01:00
Laine Stump
42ab0148dd network: add rule to nftables backend that zeroes checksum of DHCP responses
Many years ago (April 2010), soon after "vhost" in-kernel packet
processing was added to the virtio-net driver, people running RHEL5
virtual machines with a virtio-net interface connected via a libvirt
virtual network noticed that when vhost packet processing was enabled,
their VMs could no longer get an IP address via DHCP - the guest was
ignoring the DHCP response packets sent by the host.

(I've been informed by danpb that the same issue had been encountered,
and "fixed" even earlier than that, in 2006, with Xen as the
hypervisor.)

The "gory details" of the 2010 discussion are chronicled here:

  https://lists.isc.org/pipermail/dhcp-hackers/2010-April/001835.html

but basically it was because packet checksums weren't being fully
computed on the host side (because QEMU on the host and the NIC driver
in the guest had agreed between themselves to turn off checksums
because they were unnecessary due to the "link" between the two being
entirely in local memory rather than an error-prone physical cable),
but

1) a partial checksum was being put into the packets at some point by
   "someone"

2) the "don't use checksums" info was known by the guest kernel, which
   would properly ignore the "bad" checksum), and

3) the packets were being read by the dhclient application on the
   guest side with a "raw" socket (thus bypassing the guest kernel UDP
   processing that would have known the checksum was irrelevant and
   ignore it)),

The "fix" for this ended up being two-tiered:

1) The ISC DHCP package (which contains the aforementioned dhclient
program) made a fix to their dhclient code which caused it to accept
packets anyway even if they didn't have a proper checksum (NB: that's
not a full explanation, and possibly not accurate). This remedied the
problem for guests with an updated dhclient. Here is the code with the
fix to ISC DHCP:

  https://github.com/isc-projects/dhcp/blob/master/common/packet.c#L365

This eliminated the issue for any new/updated guests that had the
fixed dhclient, but it didn't solve the problem for existing/old guest
images that didn't/couldn't get their dhclient updated. This brings us
to:

2) iptables added a new "CHECKSUM" target and "--checksum-fill"
action:

  http://patchwork.ozlabs.org/patch/58525/

and libvirt added an iptables rule for each virtual network to match
DHCP response packets and perform --checksum-fill. This way by the
time dhclient on the guest read the raw packet, the checksum would be
corrected, and the packet would be accepted. This was pushed upstream
in libvirt commit v0.8.2-142-gfd5b15ff1a.

The word at the time from those more knowledgeable than me was that
the bad checksum problem was really specific to ISC's dhclient running
on Linux, and so once their fix was in use everywhere dhclient was
used, bad checksums would be a thing of the past and the
--checksum-fill iptables rules would no longer be needed (but would
otherwise be harmless if they were still there).

(Plot twist: the dhclient code in fix (1) above apparently is on a
Linux-only code path - this is very important later!)

Based on this information (and also due to the opinion that fixing it
by having iptables modify the packet checksum was really the wrong way
to permanently fix things, i.e. an "ugly hack"), the nftables
developers made the decision to not implement an equivalent to
--checksum-fill in nftables. As a result, when I wrote the nftables
firewall backend for libvirt virtual networks earlier this year, it
didn't add in any rule to "fix" broken UDP checksums (since there was
apparently no equivalent in nftables and, after all, that was fixed
somewhere else 14 years ago, right???)

But last week, when Rich Jones was doing routine testing using a Fedora
40 host (the first Fedora release to use the nftables backend of libvirt's
network driver by default) and a FreeBSD guest, for "some strange
reason", the FreeBSD guest was unable to get an IP address from DHCP!!

  https://www.spinics.net/linux/fedora/libvirt-users/msg14356.html

A few quick tests proved that it was the same old "bad checksum"
problem from 2010 come back to haunt us - it wasn't a Linux-only issue
after all.

Phil Sutter and Eric Garver (nftables people) pointed out that, while
nftables doesn't have an action that will *compute* the checksum of a
packet, it *does* have an action that will set the checksum to 0, and
suggested we try adding a "zero the checksum" rule for dhcp response
packets to our nftables ruleset. (Why? Because a checksum value of 0
in a IPv4 UDP packet is defined by RFC768 to mean "no checksum
generated", implying "checksum not needed").  It turns out that this
works - dhclient properly recognizes that a 0 checksum means "don't
bother with the checksum", and accepts the packet as valid.

So to once again fix this timeless bug, this patch adds such a
checksum zeroing rule to the nftables rules setup for each virtual
network.

This has been verified (on a Fedora 40 host) to fix DHCP with FreeBSD
and OpenBSD guests, while not breaking it for Fedora or Windows (10)
guests.

Fixes: b89c4991da
Reported-by: Rich Jones <rjones@redhat.com>
Fix-Suggested-by: Eric Garver <egarver@redhat.com>
Fix-Suggested-by: Phil Sutter <psutter@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-10-25 12:00:52 -04:00
Laine Stump
d5af1e90bb network: don't unset the firewalld zone if it's going to be immediately re-set
Any time the firewalld zone for an interface is set, by definition
that removes it from any previous zone that it was in, so there is
really no point in unsetting the zone if it's just going to be
immediately set again.

This is useful because when firewalld reloads its rules, 3 things happen:

1) firewalld flushes *all* firewall rules (including those added by libvirt)

2) firewalld unsets the zones for all interfaces (including those set
   by libvirt)

3) firewalld re-adds its own rules, and sets the zone for all the
   interfaces it manages

4) firewalld sends a dbus message that libvirt is watching for, and
   when libvirt receives that message, it reloads all of the
   libvirt-generated rules, and also re-sets the firewalld zone for
   the bridge interfaces managed by libvirt.

libvirt accomplishes step 4 by a) calling
networkRemoveFirewallRules(), and then b) calling
networkAddFirewallRules(). But (because it is useful in other
contexts) networkRemoveFirewallRules() will attempt to *unset* the
zone for each bridge interface, and when firewalld receives this
request, it sees that the bridge interface *has no zone* (because it
was unset by firewalld in step (2) above), and thus logs an error
message.

There is no way for libvirt to suppress an error message that is
logged by firewalld when a request to firewalld fails. But what
libvirt *can* do is realize that in these cases, the firewalld zone is
about to be set again anyway, and so we don't need to unset the zone.

This patch handles that by adding a bool unsetZone to the arguments of
networkRemoveFirewallRules(); most calls to networkRemoveFirewallRules()
have unsetZone=true, but in two cases where the zone is about to be
reset, networkRemoveFirewallRules() is called with unsetZone=false,
which prevents the call to virFirewallDInterfaceUnsetZone() and thus
avoids the unnecessary (and confusing to users!) error message that
would have been logged by firewalld.

Signed-off-by: Laine Stump <laine@redat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-24 12:31:03 -04:00
Laine Stump
e8228a9e79 network: ignore/don't log errors when unsetting firewalld zone
The most common "error" when trying to unset the firewalld zone of an
interface is for firewalld to tell us that the interface already isn't
in any zone. Since this is what we want, no need to alarm the user by
logging it as an error.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-24 11:50:41 -04:00
Jiri Denemark
f4dc248a95 domain_capabilities: Report CPU blockers
When a CPU model is reported as usable='no' an additional
<blockers model='...'> element is added for that CPU model to show which
features are missing for the CPU model to become usable.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-24 15:53:51 +02:00
Jiri Denemark
016be5510a domain_capabilities: Sort CPU models
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-24 15:53:51 +02:00
Jiri Denemark
0c6134f190 util: Introduce virStringListRemoveDuplicates
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-24 15:53:51 +02:00
Jiri Denemark
f928eb5fc8 qemu: Change CPU comparison algorithm for future models
When starting a domain we check whether the guest CPU definition is
compatible with the host (i.e., when the host supports all features
required both explicitly and by the specified CPU model) as long as
check == 'partial', which is the default.

We are doing so by checking our definition of the CPU model in the CPU
map amending it with explicitly mentioned features and comparing it to
features QEMU would enabled when started with -cpu host. But since our
CPU model definitions often slightly differ from QEMU we may be checking
features which are not actually needed and on the other hand not
checking something that is part of the CPU model in QEMU.

This patch changes the algorithm for CPU models added in the future
(changing it for existing models could cause them to suddenly become
incompatible with the host and domains using them would fail to start).
The new algorithm uses information we probe from QEMU about features
that block each model from being directly usable. If all those features
are explicitly disabled in the CPU definition we consider the base model
compatible with the host. Then we only need to check that all explicitly
required features are supported by QEMU on the host to get the result
for the whole CPU definition.

After this we only use the model definitions (for newly added models)
from CPU map for creating a CPU definition for host-model.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:45 +02:00
Jiri Denemark
e373f87034 qemu: Introduce virQEMUCapsGetCPUBlockers
A function for accessing a list of features blocking CPU model
usability.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
5f8abbb7d0 cpu: Introduce virCPUCompareUnusable
As opposed to the existing virCPUCompare{,XML} this function does not
use CPU model definitions from CPU map. It relies on CPU model usability
info from a hypervisor with a list of blockers that make the selected
CPU model unusable. Explicitly requested features are checked against
the hypervisor's view of a host CPU.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
591b364f49 qemu: Separate partial CPU check into a function
The new qemuDomainCheckCPU function is used as a replacement for
virCPUCompare to make sure all callers use the same comparison
algorithm. As a side effect qemuConnectCompareHypervisorCPU now properly
reports CPU compatibility for CPU model that are considered runnable by
QEMU even if our definition of the model disagrees.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
52d2a8eb6c qemu: Use virCPUCompare in qemuConnectCompareHypervisorCPU directly
The function already parses CPU XML on s390. By parsing it consistently
on all architecture we can switch to virCPUCompare and easily replace it
with a QEMU specific helper in the following patch.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
1c45473b93 qemu: Use g_autoptr in qemuConnectCompareHypervisorCPU
Let's get rid of the only explicitly freed variable left in
qemuConnectCompareHypervisorCPU.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
5475688a29 cpu: Introduce virCPUGetCheckMode
On x86 the function returns whether an old style compat check mode
should be used for a specified CPU model according to the CPU map. All
other architectures will always use compat mode.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
cd93f7ddab cpu_map: Use compat partial check for all x86 CPU models
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
f8ade72c2b cpu_x86: Introduce <check> element for CPU models
CPU models in the CPU map may be marked with <check partial="compat"/>
to indicate a backward compatible partial check (comparing our
definition of the model with the host CPU) should be performed. Other
models will be checked using just runnability info from QEMU.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Peter Krempa
36080e1b57 qemu: snapshot: Delete leftover overlay files for <transient/> disks
When a VM is terminated by host reboot libvirt doesn't get to cleaning
out the temporary overlay file used for transient disks. Since we create
those files with a very specific suffix it's almost guaranteed that if
it exists it's a leftover from a libvirt run. Delete them instead of
complaining to preserve functionality.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/684
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-10-22 08:15:36 +02:00
Peter Krempa
7cbe9e94c4 util: bitmap: Rewrite virBitmapShrink using new helpers
Rather than reimplement everything manually use virBitmapBuffsize to
find the current number of units, realloc the buffer and clear the tail
using virBitmapClearTail().

This fixes a corner case where the buffer would be over-allocated by one
unit when shrinking to the boundary of the unit size.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-17 17:09:24 +02:00
Peter Krempa
e506e0b3f1 util: virbitmap: Extract clearing of unused bits at the end of the last unit
Extract the clearing of the traling bits from 'virBitmapSetAll' into a
new helper.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-17 17:09:24 +02:00
Peter Krempa
e572150ebe virbitmap: Extract and reuse buffer size calculation into a function
Calculating the number of element can come handy in multiple places,
extract it from virBitmapNew.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-17 17:09:24 +02:00
Peter Krempa
cfe638ef80 virBitmapNewCopy: Honor sizes of either bitmap when doing memcpy()
'virBitmapNewCopy()' allocates a new bitmap with the same number of bits
but uses the internal allocation length as argument for the memcpy()
operation to copy the bits. Due to bugs in other code these may not be
the same resulting into a buffer overflow if the source is
over-allocated. Use the buffer length of the target bitmap instead.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-17 17:09:24 +02:00
Martin Kletzander
f7c89763b1 qemu: Do not hardcode Hyper-V feature names on command line
When constructing the command line for QEMU, some Hyper-V features were
hardcoded, probably due to the fact that they could not have been
automatically translated from the libvirt feature name to QEMU CPU
feature name.

Well now they can be, thanks to their additions to the
virQEMUCapsCPUFeaturesX86 translation table.

Translate all such features the same way when constructing the command
line.  This way any future feature that is not translated will be caught
by tests (if a test is added for it) which was not the case when it was
just hardcoded.  Hopefully this avoids at least some possible future
issues.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-17 00:43:36 +02:00
Martin Kletzander
ca8c0862ac qemu: Add more translations to virQEMUCapsCPUFeatureTranslationTable
Hyper-V enlightenment features can have hyphenated names which libvirt
exposes under Hyper-V features with underscored names.  When libvirt
checks that all requested features were enabled by QEMU (on x86
architectures) it first queries for all those that QEMU knows and
compiles them in a map while using the virQEMUCapsCPUFeaturesX86 for
translations.

Some features (well, all Hyper-V features with underscores) were not
present in the translation table and were incorrectly reported as not
enabled, consequently failing the start of any such domain.

Add all hyphenated/underscored Hyper-V feature names into the
aforementioned translation table.  That way domains with these features
enabled can be started when QEMU and the kernel support them.

Resolves: https://issues.redhat.com/browse/RHEL-7122
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-17 00:43:35 +02:00
Adam Julis
0fd36e9656 lxc: fix variable storage order before call
virDomainConfNWFilterInstantiate() was called without updated
net->ifname, it caused in some cases throwing error message. If
function failed, change is reverted.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/658
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-16 16:30:19 +02:00
Martin Kletzander
f2710260d4 qemu_namespace: Only replicate labels on created files
Function qemuNamespaceMknodOne() is trying to replicate a file from the
parent namespace as perfectly as possible, with the same permissions,
labels, ACLs, etc.

If that file already existed it means that the qemu process is probably
using it already and the current setting is probably more correct than
the ones from the parent namespace.

In order to reflect that only replicate the file metadata when it was
(re-)created in this function.

Resolves: https://issues.redhat.com/browse/RHEL-62174
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-16 15:07:10 +02:00
Martin Kletzander
26f249034d qemu_namespace: Properly report new files
Function qemuNamespaceMknodOne() is supposed to return 0 if the file did
not exist before this function.  If, however, the file existed, but was
removed and recreated by this function the @existed flag should be reset
to its proper state (false) because the function then behaves the same
way as if the file did not exist as it needed to be recreated.

So reset the @existed flag to properly reflect what happened.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-16 15:07:10 +02:00
Martin Kletzander
2b19f4b82d qemu_namespace: Rename variable
The boolean actually tells whether the file existed when the function
was called and using it in more places later on makes them
confusing (e.g. do something with a file if it does not exist).  To
better reflect the above and prepare for next patch rename this
variable.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-16 15:07:10 +02:00
Peter Krempa
baa4edfb79 qemu: chardev: Use 'reconnect-ms' instead of deprecated 'reconnect'
qemu-9.2 will deprecate the 'reconnect' field in favor of
'reconnect-ms'. As libvirt currently doesn't track the timeouts in
milliseconds we simply convert them to avoid use of the deprecated
field.

Quite a lot of churn is caused by the need to plumb 'qemuCaps' into the
chardev props generator.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-16 14:39:09 +02:00
Peter Krempa
23fa1d2184 qemu: capabilities: Introduce QEMU_CAPS_CHARDEV_RECONNECT_MILISECONDS
New qemu introduced the 'reconnect-ms' field for character devices
allowing the reconnect timeout to be specified in milliseconds, which
also deprecates the existing 'reconnect' field that libvirt uses.

To avoid use of deprecated interfaces add a capability which will allow
us to use the new field.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-16 14:39:09 +02:00
Andrea Bolognani
81493d8eb6 apparmor: Allow running i686 VMs on Debian 12
In Debian 12, the qemu-system-i386 binary in /usr/bin is a wrapper
script, with the actual executable living in /usr/libexec instead.
This makes it impossible to run i686 VMs when AppArmor is enabled.

Allow running the actual binary.

https://bugs.debian.org/1030926

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
2024-10-16 09:46:49 +02:00
Ján Tomko
e996536a3b Remove pointless bool conversions
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-15 14:48:35 +02:00
Peter Krempa
e2c6f4c800 qemu: snapshot: Remove dead code in 'qemuSnapshotDeleteBlockJobRunning'
'qemuSnapshotDeleteBlockJobIsRunning' returns only 0 and 1. Convert it
to bool and remove the dead code handling -1 return in the caller.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/682
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-14 16:25:21 +02:00
Peter Krempa
04d6a0ec5d qemu: migration: Fix blockdev config with VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES
The idea of migration with VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES
populated is to sparsify the image. The QEMU NBD client as it was
configured in commit 621f879adf would
signal to the destination to do thick allocation of holes which would
result in a non-sparse image for any backend except a qcow2 image which
I used to test it.

Switch to VIR_DOMAIN_DISK_DETECT_ZEROES_UNMAP and
VIR_DOMAIN_DISK_DISCARD_UNMAP which tells the NBD client (and that in
turn the NBD server) to preserve the sparse blocks it detected from the
image.

Fixes: 621f879adf
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-14 16:25:21 +02:00
Jiri Denemark
0c653fc9a5 util: Rename variable "major" in virIsDevMapperDevice
major() is a macro defined in sys/sysmacros.h so luckily the code works,
but it's very confusing. Let's rename the local variable to make the
difference between it and the macro more obvious. And while touching the
line we can also initialize it to make sure "clever" analyzers do not
think it may be used uninitialized.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-10-14 11:48:50 +02:00
Laine Stump
37800af9a4 network: inhibit idle timeout of daemon if there are any active networks
When the daemons were split out from the monolithic libvirtd, the
network driver didn't implement "inhibit idle timeout if there are any
active objects" as was done for other drivers, so virtnetworkd would
always exit after 120 seconds of no incoming connections. This didn't
every cause any visible problem, although it did mean that anytime a
network API was called after an idle time > 120 seconds, that the
restarting virtnetworkd would flush and reload all the
iptables/nftables rules for any active networks.

This patch replicates what is done in the QEMU driver - an nactive is
added to the network driver object, along with an inhibitCallback; the
latter is passed into networkStateInitialize when the driver is
loaded, and the former is incremented for each already-active network,
then incremented/decremented each time a network is started or
stopped. If nactive transitions from 0 to 1 or 1 to 0, inhibitCallback
is called, and it "does the right stuff" to prevent/enable the idle
timeout.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-10-10 14:07:12 -04:00
Jim Fehlig
d721b6840f libxl: Reject VM config referencing nwfilters
The Xen libxl driver does not support nwfilter. Introduce a
deviceValidateCallback function with a check for nwfilters, returning
VIR_ERR_CONFIG_UNSUPPORTED if any are found. Also fail to start any
existing VMs referencing nwfilters.

Drivers generally ignore unrecognized XML configuration, but ignoring
a user's request to filter VM network traffic can be viewed as a
security issue.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-10-10 08:39:12 -06:00
Laine Stump
c0ba3ed69d network: a different implementation of *un*setting firewalld zone when network is destroyed
(this is a remake of commit v10.7.0-78-g200f60b2e1, which was reverted
due to a regression in another patch it was dependent on. The new
implementation just adds the call to virFirewallDInterfaceUnsetZone()
into the existing networkRemoveFirewallRules() (but only if we had set
a zone when the network was first started).

Replaces: 200f60b2e1
Resolves: https://issues.redhat.com/browse/RHEL-61576
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-09 15:54:08 -04:00
Laine Stump
cb4e38d4b1 network: a different way of supporting firewalld zone for mode='open' networks
Now that networkAddFirewallRules and networkRemoveFirewallRules() are
being called for mode='open' networks, we just need to move the code
that sets the zone outside of the if (mode != ...OPEN) clause, so that
it's done for all forward modes, with the exception of setting the
implied 'libvirt*' zones, which are set when no zone is specified for
all forward modes *except* 'open'.

This was previously done in commit v10.7.0-76-g1a72b83d56, but in a
manner that caused the zone to be unset whenever firewalld reloaded
its rules. That patch was reverted, and this new better patch takes
its place.

Replaces: 1a72b83d56
Resolves: https://issues.redhat.com/browse/RHEL-61576
Re-Resolves: https://gitlab.com/libvirt/libvirt/-/issues/215
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-09 15:54:08 -04:00
Laine Stump
d552d810b9 network: call network(Add|Remove)FirewallRules() for forward mode='open'
Previously networkAddFirewallRules() and networkRemoveFirewallRules()
were only called if the forward mode was none, 'route', or 'nat', so
those functions didn't check the forward mode. Although their current
contents shouldn't be executed for forward mode='open', soon they will
have extra functionality that should be executed for all the current
forward modes and also mode='open'.

This patch modifies all places either of the functions are called to
make sure they are called for mode='open' in addition to current modes
(by either adding 'case ..._OPEN:' to the case of a switch statement,
or just removing an 'if (mode != ...OPEN)' around the calls; to
balance out for that, it puts the entirety of the contents of both
functions inside if (mode != ...OPEN) to retain current behavior. (an
upcoming patch will add code outside that if clause).

debug log messages were also added to make it easier to test that the
right thing is being done in all cases.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-09 15:54:08 -04:00
Laine Stump
ef760a4133 Revert "network: support setting firewalld zone for bridge device of open networks"
This reverts commit 1a72b83d56. That
patch had made the incorrect assumption that the firewalld zone of a
bridge would not be changed/removed when firewalld reloaded its rules
(e.g. with "killall -HUP firewalld"). It turns out my memory was
faulty, and this *does* remove the bridge interface's zone, which
results in guest networking failure after a firewalld reload, until
the virtual network is restarted.

The functionality reverted as a result of this patch reversion will be
added back in an upcoming patch that keeps the zone setting in
networkAddFirewallRules() (rather than moving it into a separate
function) so that it is called every time the network's firewall rules
are reloaded (including the reload that happens in response to a
reload notification from firewalld).

Signed-off-by: Laine Stump
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-09 15:54:08 -04:00
Laine Stump
816876f517 Revert "network: *un*set the firewalld zone while shutting down a network"
This reverts commit 200f60b2e1. The same
functionality will be re-added in a different way in an upcoming patch.

Signed-off-by: Laine Stump
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-09 15:54:08 -04:00
Jim Fehlig
bd6d7ebf62 qemu: Use consistent naming for save image format
The image format setting in qemu.conf is named 'save_image_format'. The
enum of supported format types is declared with name 'virQEMUSaveFormat'.
Let's be consistent and use 'format' instead of 'compressed' when referring
to the save image format.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-10-09 13:48:39 -06:00
Jim Fehlig
b0dc8a923d qemu: conf: Improve the foo_image_format setting descriptions
The current description of the various foo_image_format settings can
be construded to imply the setting is only used to control compression
of the image. Improve the documentation to clarify that format describes
the representation of guest memory blocks on disk, which includes
compression among other possible layouts.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-10-09 13:48:38 -06:00
Peter Krempa
aa08a30048 qemu: snapshot: Allow internal snapshots with PFLASH nvram
With the new snapshot QMP command we can select which block device
backend receives the VM state and thus the main issue with internal
snapshots with pflash was addressed.

Thus we can relax the check and allow snapshots if the pflash nvram is
on qcow2.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 16:00:43 +02:00
Peter Krempa
8be8b7de78 qemuSnapshotActiveInternalDeleteGetDevices: Add warning when deleting inconsistent snapshot
As explained in the commit which added the new internal snapshot
deletion code we don't want to do any form of strict checking whether
the libvirt metadata is consistent with the on-disk state as we didn't
historically do that.

In order to be able to spot the cases add a warning into the logs if
such state is encountered. While warnings are easy to miss it's the only
reasonable way to do that. Users will be encouraged to file an issue
with the information, without requiring them to enable debug logs as
the reproduction of that issue may include very old historical state.

The checker is deliberately added separately so that it can be easily
reverted once it's no longer needed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 16:00:43 +02:00
Peter Krempa
eac1a86f72 qemu snapshot: use QMP snapshot-delete for internal snapshots deletion
Switch to using the modern QMP command.

As the user visible logic when deleting internal snapshots using the old
'delvm' command was very lax in terms of catching inconsistencies
between the snapshot metadata and on-disk state we re-implement this
behaviour even using the new command. We could improve the validation
but that'd go at the cost of possible failures which users might not
expect.

As 'delvm' was simply ignoring any kind of failure the selection of
devices to delete the snapshot from is based on querying qemu first
which top level images do have the internal snapshot and then continuing
only on those.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 16:00:43 +02:00
Nikolai Barybin via Devel
b93af62c40 qemu snapshot: use QMP snapshot-save for internal snapshots creation
The usage of HMP commands are highly discouraged by qemu. Moreover,
current snapshot creation routine does not provide flexibility in
choosing target device for VM state snapshot.

This patch makes use of QMP commands snapshot-save and by
default chooses first writable non-shared qcow2 disk (if present)
as target for VM state.

Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 16:00:43 +02:00
Peter Krempa
6d8ae98fa0 qemu: monitor: Store internal snapshot names from 'query-named-block-nodes'
Store the names of internal snapshots present in supported images in the
data we dump from 'query-named-block-nodes' so that the upcoming changes
to the internal snapshot code can access it.

To test this we use the bitmap detection test cases which can be easily
extended to dump this data.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 16:00:43 +02:00
Nikolai Barybin via Devel
9df1453db8 qemu: capabilities: Introduce QEMU_CAPS_SNAPSHOT_INTERNAL_QMP capability
The 'snapshot-save/delete' QMP commands were introduced in QEMU 6.0.0,
so we add a compatible capability to check if target QEMU binary supports it.

Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 15:22:00 +02:00
Nikolai Barybin via Devel
ce4ed8deef qemu: blockjob: Add job types for 'snapshot-save/delete'
The snapshot creation/deletion QMP commands use the qemu 'job' API
to signal completion thus we need to add corresponding job types.

As the job handles everything internally we don't store anything about
the job.

Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 15:22:00 +02:00
Nikolai Barybin via Devel
5d0773633a qemu: monitor: Add plumbing for 'snaphot-save'/'snapshot-delete' QMP commands
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 15:22:00 +02:00
Peter Krempa
2e325804cc qemuDomainObjWait: Annotate with G_GNUC_WARN_UNUSED_RESULT
Callers must handle the return value of this function as the VM might
have died. Add compiler annotation to force it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 15:22:00 +02:00
Jiri Denemark
93d97d8fa2 cpu_map: Drop vmx-invvpid-single-context from CPU models
QEMU calls the same feature differently, but translating the names in
libvirt does not make sense because the name in QEMU conflicts with
another feature. QEMU will not change the name for compatibility reasons
so we can just drop our invented name as it is not supported by QEMU.
Apart from this slightly different reason behind the feature being
unsupported by QEMU the situation is similar to vmx-ept-{uc,wb} dropped
in the previous patch and so is the implications.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-09 14:46:51 +02:00
Jiri Denemark
b1d4196580 cpu_map: Drop vmx-ept-{uc,wb} features from CPU models
Although QEMU knows and enables the corresponding MSR bits, it does not
allow users to configure them (there are no names attached to them).
They should have never been added to the CPU map and definitely not to
CPU models as the features will always be considered disabled regardless
on their actual state as QEMU will not report them.

While we cannot drop them completely for backward compatibility, we can
at least remove them from all CPU models.

This is effectively no change for CPU models where the features were
marked with added='yes' because migration source would always remove the
features from domain XML so not adding them to the live XML does not
hurt. On the other side the destination could not ever be surprised by
the features being suddenly enabled as QEMU never reports them, which
means libvirt considers them disabled all the time.

GraniteRapids CPU model is the only one which contains the feature ever
since it was introduced in libvirt, but it was never possible to migrate
a domain with such CPU. The source would always mark vmx-ept-wb as
disabled and the destination without the fixes in this series would drop
the feature from the XML completely as it is unsupported by QEMU and
disabled, but when probing for the actual CPU created by QEMU libvirt
would expect the feature to be enabled (as it is included in the CPU
model and not explicitly mentioned in the domain definition) and fail
the migration. There's nothing the source could do to workaround the
behavior on the destination and migration to older libvirt will still be
broken. But it's possible to migrate a domain with GraniteRapids to a
destination with this series applied from both old and new source.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-09 14:46:51 +02:00
Jiri Denemark
29aa9b02aa qemu: Replace big condition in virQEMUCapsCPUFilterFeatures with array
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-09 14:46:51 +02:00
Jiri Denemark
98700d354b qemu: Translate vmx-invvpid-single-context-noglobals CPU feature
This feature is called "vmx-invept-single-context-noglobals" in QEMU and
our CPU map even contains the appropriate alias. But we failed to
actually translate the name when talking to QEMU.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-09 14:46:51 +02:00
Jiri Denemark
00e55059e6 qemu: Do not drop unknown CPU features from domain XML
CPU features with policy='disable' which are unknown to QEMU may be
safely skipped when generating the -cpu command line, but we should
still keep them in the domain definition so that we can properly check
they are disabled after migrating the domain to a newer QEMU.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-09 14:46:51 +02:00
Jiri Denemark
aae8a5774b qemu: Drop vmx-* from migratable CPU model only when origCPU is set
When qemuDomainMakeCPUMigratable is called with origCPU == NULL the code
just removed all vmx-* features marked as added in the specified CPU
model just like when origCPU is not NULL, but does not list any of the
vmx-* features. But this is wrong, we should not touch these features at
all when no origCPU is supplied, which happens when parsing XML passed
by a user (e.g., migration XML). Such XML is supposed to be generated by
libvirt as migration XML and contains only vmx-* features explicitly
requested by a user.

https://issues.redhat.com/browse/RHEL-52314

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-09 14:46:50 +02:00
Martin Kletzander
215cada343 util: Look for newer name of cpu wait time statistic
It looks like linux changed the key for wait time in /proc/<pid>/sched
and /proc/<pid>/task/<tid>/sched files in commit ceeadb83aea2 (or around
that time) from se.statistics.wait_sum to just wait_sum.  Similarly to
the previous change (from se.wait_sum) just look for the new name first.

Resolves: https://issues.redhat.com/browse/RHEL-60030
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-03 20:43:07 +02:00
Andrea Bolognani
7d6759135e qemu: Handle locking of TPM state directory for incoming migration
By not attempting to lock the lock file, which would fail.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-03 14:50:06 +02:00
Andrea Bolognani
454219ad6c security: Allow skipping locking when labeling lock files
This is needed when migrating a guest that has persistent TPM
state: relabeling (which implies locking) needs to happen
before the swtpm process is started on the destination host,
but the lock file won't be released by the swtpm process
running on the source host before a handshake with the target
process has happened, creating a catch-22 scenario.

In order to make migration possible, make it so that locking
for lock files can be explicitly skipped. All other state
files are handled as usual.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-03 13:29:59 +02:00
Andrea Bolognani
8fe803247e security: Always forget labels for TPM state directory
In the case of outgoing migration, we avoid restoring the
remembered labels for the TPM state directory because doing so
would risk cutting off storage access for the target node.

Even in that case though, we should still forget (unref) the
remembered labels: if we don't, the source node will keep
thinking that the state directory is in use.

Note that this change only affects the SELinux driver because
the DAC driver doesn't currently implement label remembering
for TPM state at all.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-03 13:29:56 +02:00
Peter Krempa
3bfcb35dd5 qemu: migration: Don't remember seclabel for images shared from current host
In case when the user exports images from current host and there is an
incoming migration from a remote host, security label remembering would
be possible but would attempt to remember the label allowing access to
the image as the image is already used by a VM on remote host.

To prevent remembering the wrong label, we'll skip the remembering of
the label for any shared resource, so that the code behaves identically
regardless of how the image is accessed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-10-03 13:29:26 +02:00
Peter Krempa
b581045520 storage_source: Add field for skipping seclabel remembering
In case of incoming migration where a local directory is shared to other
hosts we'll need to avoid seclabel remembering as the code would
remember the seclabel already allowing access to the image.

As the decision requires a lot of information not available in the
security driver it would either require plumbing in unpleasant callbacks
able to pass in the data or alternatively we can mark this in the
'virStorageSource' struct.

This patch chose to do the latter approach by adding a field called
'seclabelSkipRemember' which will be filled before starting the process
in cases when it will be required.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-10-03 13:29:26 +02:00
Peter Krempa
eabeae605f security_(dac|selinux): Unref remembered security labels on outgoing migration
When 'qemuSecurityRestoreAllLabel' is called on outgoing migration it
skips the actual relabeling part of the images in dac/selinux drivers in
order to avoid cutting off access to the image.

As shared filesystems don't really support the trusted XATTR groups,
remembering of security labels never worked on those paths so we never
actually had remembered seclabels for images that could be migrated.

With recent changes we now support migration from local storage to
remote in case the admin declares it as shared. This means that in case
when the VM is started on local storage we'd actually store seclabels,
but when migrating out the XATTRs remembering the seclabels would not
actually be unref'd and thus the seclabels would leak.

As we can't know whether a remote host will be able to use the XATTRs or
not (but really it won't) and at the same time the destination side of
migration will actually call 'qemuSecuritySetAllLabel' setting/refing
it's own seclabels we really need to unref them on our side.

This patch adds the appropriate *RecallLabel() calls on the code paths
in which relabelling is skipped due to migration.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-10-03 13:29:26 +02:00
Peter Krempa
2983dd44c5 virSecuritySELinuxRestoreImageLabelInt: Move FD image relabeling after 'migrated' check
Reorganize the code so that the 'migrated' flag isn't checked multiple
times and thus that it's more obvious what is happening when the
'migrated' flag is asserted.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-10-03 13:29:26 +02:00
Peter Krempa
568b3c6abe virParseOwnershipIds: Refactor
Use automatic clearing for temporary variable, remove 'cleanup' label
and declare parameters according to new coding style rules.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-10-03 13:29:26 +02:00
Peter Krempa
7af0b6ea75 virFileIsSharedFSOverride: Export
Document the function and export it for use outside of the 'virfile'
utils module.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-10-03 13:29:26 +02:00
Andrea Bolognani
da0c363835 qemu: Always set labels for TPM state
Up until this point, we have avoided setting labels for
incoming migration when the TPM state is stored on a shared
filesystem. This seems to make sense, because since the
underlying storage is shared surely the labels will be as
well.

There's one problem, though: when a guest is migrated, the
SELinux context for the destination process is different from
the one of the source process.

We haven't hit any issues with the current approach so far
because NFS doesn't support SELinux, so effectively it doesn't
matter whether relabeling happens or not: even if the SELinux
contexts of the source and target processes are different,
both will be able to access the storage.

Now that it's possible for the local admin to manually mark
exported directories as shared filesystems, however, things
can get problematic.

Consider the case in which one host (mig-one) exports its
local filesystem /srv/nfs/libvirt/swtpm via NFS, and at the
same time bind-mounts it to /var/lib/libvirt/swtpm; another
host (mig-two) mounts the same filesystem to the same
location, this time via NFS. Additionally, in order to
allow migration in both directions, on mig-one the
/var/lib/libvirt/swtpm directory is listed in the
shared_filesystems qemu.conf option.

When migrating from mig-one to mig-two, things work just fine;
going in the opposite direction, however, results in an error:

  # virsh migrate cirros qemu+ssh://mig-one/system
  error: internal error: QEMU unexpectedly closed the monitor (vm='cirros'):
  qemu-system-x86_64: tpm-emulator: Setting the stateblob (type 1) failed with a TPM error 0x1f
  qemu-system-x86_64: error while loading state for instance 0x0 of device 'tpm-emulator'
  qemu-system-x86_64: load of migration failed: Input/output error

This is because the directory on mig-one is considered a
shared filesystem and thus labeling is skipped, resulting in
a SELinux denial.

The solution is quite simple: remove the check and always
relabel. We know that it's okay to do so not just because it
makes the error seen above go away, but also because no such
check currently exists for disks and other types of persistent
storage such as NVRAM files, which always get relabeled.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-03 13:29:26 +02:00
Andrea Bolognani
f7b9313ec7 utils: Use overrides in virFileIsSharedFS()
If the local admin has explicitly declared that a certain
filesystem is to be considered shared, we should treat it as
such.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-03 13:29:26 +02:00
Andrea Bolognani
6952af8b43 qemu: Propagate shared_filesystems
virFileIsSharedFS() is the function that ultimately decides
whether a filesystem should be considered shared, but the list
of manually configured shared filesystems is part of the QEMU
driver's configuration, so we need to pass the information
through several layers in order to make use of it.

Note that with this change the list is propagated all the way
through, but its contents are still ignored, so the behavior
remains the same for now.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-03 13:29:26 +02:00
Andrea Bolognani
df3597ee70 qemu: Introduce shared_filesystems configuration option
As explained in the comment, this can help in scenarios where
a shared filesystem can't be detected as such by libvirt, by
giving the admin the opportunity to provide this information
manually.

https://issues.redhat.com/browse/RHEL-35752

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-03 13:29:25 +02:00
Andrea Bolognani
5ea466648c security: Fix alignment
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-03 13:29:25 +02:00
John Levon
c6ba83b3e4 test_driver: provide basic NIC hotunplug support
Provide minimal support for hotunplugging ETHERNET or BRIDGE type NICs
in the test driver.

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-10-03 09:10:23 +02:00
John Levon
cda4ee02a6 test_driver: provide basic NIC hotplug support
Provide minimal support for hotplugging ETHERNET or BRIDGE type NICs in
the test driver.

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-10-03 09:10:22 +02:00
Han Han
3b296a98aa domain_validate: Validate dma_translation for iommu models
The attribute dma_translation is only supported by intel-iommu device.
Report an error when it is used for the other iommu devices.

Fixes: 6866f958c1

Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-10-02 16:06:51 +02:00
Rayhan Faizel
8105426d8f libxl_conf: Add check for unsupported graphics type
libxlMakeVfb always succeeds regardless of if the graphics type is
actually supported or not.

libxl_defbool_val is called in libxlMakeBuildInfoVfb which besides returning
the boolean value of the defbool also has an assertion that the defbool value
is not set to default. It is possible to fail this assertion if an
unsupported graphics type is used. In libxlMakeVfb, the VNC and SDL enable
defbools are still left in their default state if the graphics type falls
outside the two, which leads to this issue.

This patch adds a check to reject graphics types outside of SDL, VNC, and SPICE
very early on in libxlMakeVfb. As a safeguard, we also initialize both vnc
enable and sdl enable defbools as false early.

Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-10-02 16:06:51 +02:00
Rayhan Faizel
cb2a6ef8b5 libxl_conf: Fix config generation for multiple serial devices
Currently, an array of libxl_string_list (char **) or in other words,
a triple char pointer is initialized. This is dereferenced to a char ** type
and stored in serial_list, which is NULL at this point. There is an attempt to
reference an element of this serial_list when making a call to
libxlMakeChrdevStr which causes a segmentation fault.

To fix this, we simply allocate an array of char * instead of
libxl_string_list.

This patch also adds testcases to extend coverage over both single serial and
multiple serial cases.

Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-10-02 16:06:50 +02:00
Peter Krempa
621f879adf qemu: Introduce and wire in 'VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES'
The new 'VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES' migration
parameter allows users of migration to pass in a list of disks where
zero-detection (which avoids transferring the zeroed-blocks) should be
enabled for the migration connection. This comes at the cost of extra
CPU cycles needed to check each block if it's all-zero.

This is useful for storage backends where information about the
allocation state of a block is not available and thus without this the
image would become fully allocated on the destination.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:57:02 +02:00
Peter Krempa
448b14f74d qemu: migration: Remove 'nmigration_disks' variable from all places
Now that none of the functions need it we can drop it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:57:02 +02:00
Peter Krempa
aaefaabf5a qemu: migration: Extract validation of disk target list
The migration code is checking the disk list provided via
VIR_MIGRATE_PARAM_MIGRATE_DISKS against existing disks. Extract it to a
helper function as we'll be passing another list of disk targets soon.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:57:02 +02:00
Peter Krempa
4ebf1acb83 qemu: migration: Avoid use of 'nmigration_disks'
'migration_disks' is a NULL-terminated string list, so the code can be
converted to either iterate the string-list, use existing accessors or
check the presence of the pointers instead of checking the count.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:57:02 +02:00
Peter Krempa
d98beef107 qemu: migration: Don't log 'nmigrate_disks'
The actual number of disks to migrate is not important. The presence of
disks to migrate can be inferred from presence of the 'migrate_disks'
pointer which is logged.

Since 'nmigrate_disks' will eventually be removed remove the logging
right now.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:57:02 +02:00
Peter Krempa
ab52a069ee qemuMigrationSrcBeginPhaseBlockDirtyBitmaps: Use qemuMigrationAnyCopyDisk()
The function open-coded the checking whether a disk is being migrated
with non-shared storage and did so badly (not taking into account if
user doesn't explicitly provide list of disks to migrate).

Use the existing helper instead.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:57:02 +02:00
Peter Krempa
9bf319147c virTypedParamsGetStringList: Ensure that returned string list is NULL-terminated
This can simplify callers who don't really need to know the number of
elements to check that a particular element is present.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:57:02 +02:00
Peter Krempa
7933310ce9 virTypedParamsGetStringList: Ensure that returned array is NULL if there are no matching fields
'virTypedParamsGetStringList' fills the returned array only with string
parameters with matching name. The filtering code though leaves the
possibility that all items are filtered out but the return array is
still (over)allocated.

Since 'virTypedParamsFilter()' now also allows filtering by type we can
move the filtering there ensuring that we always allocate the right
number of elements and more importantly the returned array will be NULL
if none elements are present.

Rework the code and adjust docs.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:57:02 +02:00
Peter Krempa
b74fed0173 virTypedParamsFilter: Introduce option to filter also by type
The only caller of this function is doing some additional filtering so
it's useful if the filtering function was able to do so internally.

Introduce a 'type' parameter which will optionally filter the results by
type and extend the testsuite to cover this scenario.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:57:02 +02:00
Peter Krempa
e5fae984b1 virTypedParamsGetStringList: Refactor and adjust docs
Use automatic freeing, declare one variable per line and return early
when possible. As this is an internal helper there's no need to check
that the caller passed non-NULL @values.

Modify the documentation to be accurate and warn callers to not free the
strings just the array.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:57:02 +02:00
Peter Krempa
933ab93e8f virTypedParamsFilter: Adjust return type and docs
The 'virTypedParamsFilter' function can't fail and thus it never returns
negative values. Change the return type to 'size_t' and adjust callers
to not check the return value for being negative.

Adjust the docs to hilight this and also the fact that the filtered
typed param list returned via @ret is not a deep copy and thus callers
must not use the common function to free it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:57:02 +02:00
Peter Krempa
165b30e06a qemu: migration: Pre-create QCOW2 images for non-shared storage with 0 allocation
Specify that the <allocation> parameter for the newly-created qcow2
image is 0 so that only metadata gets preallocated. Otherwise the
storage driver code instructs qemu to use 'fallocate' preallocation mode
and considers the image fully allocated.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:57:02 +02:00
Peter Krempa
54109db826 qemu: blockjob: Clean out disk mirror data after concluding the job
The 'disk->mirrorJob' and 'disk->mirrorState' fields need to be cleared
after a blockjob, but should be kept around while 'disk->mirror' is
still in place. As 'disk->mirror' is cleared only after conclusion of
the job in 'qemuBlockJobEventProcessConcluded()' we should be resetting
them only afterwards.

Move the code later, but since the job is unregistered from the disk we
need to store the pointer to the disk before concluding the job.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:54:40 +02:00
Peter Krempa
b85b60d140 qemu: blockjob: Update 'mirror' of a copy job before removing images
When concluding a job with a 'mirror' we first unplugged the appropriate
no-longer used images from qemu and then updated the definition.

Normally this wouldn't be a problem because for any other thread this is
done under the VM lock thus atomic. Unfortunately though, the AppArmor
security backend is using a VM XML to pass data to the helper process
and the state of the definition at that point was unsuitable to format a
valid XML thus making 'virt-aa-helper' report parsing failure.

Since we're removing the images the proper state of the VM definition
indeed should not include the mirror element any more at the point when
the images are removed.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/601
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-01 12:54:40 +02:00
Laine Stump
bcd5ae4e73 qemu: fix regression in update-device for interfaces
Commit a37bd2a15b eliminated a failure
to update *any* change in an interface that was connected via a
network that consisted of a pool of VFs using macvtap passthrough
mode. Unfortunately it caused a regression that results in failure to
update changes to bandwidth/vlan/trustGuestRxFilters in any interface
connected via a network that uses a bridge to connect tap devices.

This fixes that problem by narrowing the usage of the fix in the
earlier patch to only be done in the case that the the interface is
connected via a macvtap+passthrough network.

Signed-off-by: Laine Stump <laine@redhat.com>
Fixes: a37bd2a15b
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-10-01 10:25:12 +02:00
Andrea Bolognani
55c3c09197 qemu: Look for qemu-bridge-helper in more directories
Commit 0caacf47d7 recently
made it so the new path used for qemu-bridge-helper in Debian
would be allowed, but the logic used to actually figure out
the complete path for the helper was not updated accordingly.

https://bugs.debian.org/1082530

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-30 10:58:15 +02:00
Jiri Denemark
f527da37be cpu_map: Fix SierraForest CPU model
The model was defined with two CPU features that cannot be explicitly
configured in QEMU (it knows the MSR bits, but there's no name
associated with them). The features should have never existed in the CPU
map. While removing them from the list of features and existing CPU
models is not trivial (to avoid compatibility issues), we can at least
fix the SierraForest CPU model added in this release cycle.

The rest will be handled later in a separate series.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-25 09:57:48 +02:00
Cole Robinson
785dfad13c rpc: ssh: Allow SSH_ASKPASS_REQUIRE
openssh 8.4p1 released in Sep 2020 added a feature to force use
of SSH_ASKPASS

https://man.openbsd.org/ssh.1#SSH_ASKPASS_REQUIRE

Don't strip it from the environment

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-09-25 08:46:52 +02:00
Michal Privoznik
6126f743b1 qemu: Provide sane default for dump_guest_core
QEMU uses Linux extensions to madvise() to include/exclude guest
memory from core dump. These are obviously not available
everywhere. Currently, users have two options:

  1) configure <memory dumpCore=''/> in domain XML, or
  2) configure dump_guest_core in qemu.conf

While these work, they may harm user experience as "things just
don't work" out of the box. Provide sane default in
virQEMUDriverConfigNew() so neither of two options is required.

To have predictable results in tests, explicitly set
cfg->dumpGuestCore to false in qemuTestDriverInit() (which
creates cfg object for tests).

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/679

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-25 08:38:09 +02:00
Michal Privoznik
18b61cb4f9 qemu.conf.in: Fix dumpCore capitalization
In qemu.conf.in we give examples of enabling/disabling core
dumps in domain XML. But the attribute is spelled wrong.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-25 08:38:09 +02:00
Martin Kletzander
6f0974ca32 qemu: Generate domain memory backing path directly
This makes qemuDomainGenerateMemoryBackingPath() nicer to call.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Martin Kletzander
f035f24777 qemu: Rename memory path functions
This way they make sense not only based on where they are located but
the name also relates to what they are actually doing.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Martin Kletzander
d599fc3d57 qemu: Make qemuGetMemoryBackingDomainPath static
After previous patches it is not used (and should not be used) outside
of qemu_domain.c.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Martin Kletzander
ff49d2a8c2 qemu: Use per-domain private memoryBackingDir for new memory backends
The function qemuGetMemoryBackingPath() does not need the @def any more
and priv->memoryBackingDir can be used instead of constructing the path
by calling qemuGetMemoryBackingDomainPath().

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Martin Kletzander
f58a4dc9d5 qemu: Set memoryBackingDir in private data upon start
This way we keep the path for each running VM.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Martin Kletzander
da8a1d7943 qemu: Add memoryBackingDir to qemuDomainObjPrivate
This way we _can_ (but do not, yet) remember the memory backing path for
running domains even after configuration change and daemon restart.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Martin Kletzander
c9a35eb255 qemu: Change parameters of qemuGetMemoryBackingDomainPath()
This way it does not use driver, since it will be later reworked and the
following patches cleaner, hopefully.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Martin Kletzander
edcf14be9c qemu: Move domain-related functions to qemu_domain
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Ján Tomko
81e532c701 util: json: remove yajl implementation
Since the previous commit removed YAJL detection completely,
WITH_YAJL cannot possibly be set. Drop the code.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-09-24 08:24:00 +02:00
Ján Tomko
d96e753d84 meson: options: drop yajl
Drop the yajl option and all references to it.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-09-24 08:24:00 +02:00
Ján Tomko
9e6555fd90 util: json: write a json-c implementation
Write an alternative implementation of our virJSON functions,
using json-c instead of yajl.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-09-24 08:24:00 +02:00
Ján Tomko
28c9872639 meson: switch checks to depend on json-c as well as yajl
Ensure both are required during this series to make bisecting smooth.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-09-24 08:24:00 +02:00
Ján Tomko
330cf7f492 util: json: introduce virJSONStringPrettifyBlanks
A horribly named function for unifying formatting when pretty-printing
empty JSON arrays and objects. Useful for having stable test output
even if different JSON libraries format these differently.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-09-24 08:24:00 +02:00
Laine Stump
e14a5fcac4 util: use uint32 instead of char[4] for several virSocketAddrIPv4 operations
These 3 functions are easier to understand, and more efficient, when
the IPv4 address is viewed as a uint32 rather than an array of bytes.

virsocketAddrGetIPv4Addr() has bothered me for a long time - it was
doing ntohl of the address into a temporary uint32, and then a loop
one-by-one swapping the order of all the bytes back to network
order. Of course this only works as described on little-endian
architectures - on big-endian architectures the first assignment won't
swap the bytes' ordering, but the loop assumes the bytes are now in
little-endian order and "swaps them back", so the result will be
incorrect. (Do we not support any big-endian targets that would have
exposed this bug long before now??)

virSocketAddrCheckNetmask() was checking each byte of the two
addresses individually, when it could instead just do the operation
once on the full 32 bit values.

virSocketGetRange() was checking for "range > 65535" by seeing if the
first 2 bytes of the start and end were different, and then doing
arithmetic combining the lower two bytes (along with necessary bit
shifting to account for network byte order) to determine the exact
size of the range. Instead we can just get the ntohl of start & end,
and do the math directly.

Signed-off-by: Laine Stump <laine@redhat.com>
2024-09-21 15:06:09 -04:00
Laine Stump
009464902a util: make virSocketAddrIPv4 a union
virSocketAddrIPv4 is a type used only internally by
virsocketaddr.c. It is defined to be a character array, which leads to
multiple occurences of extra bit fiddling and byte swapping for no
good reason (except to confuse).

An IPv4 address is really just a uint32_t with the bytes in network
order, which is exactly the type of the s_addr member of the
sockaddr_in that is a part of the publicly consumed struct
virSocketAddr, and that we are copying in and out of a
virSocketAddrIPv4. Sometimes it's simpler to just treat it as a
network-order uint32_t, so let's make our virSocketAddrIPv4 a union
that has both an unsigned char bytes[4] (for the times when we need to
look one byte at a time) and a uint32_t val (for the times when it's
simpler to treat it as a single value).

For now we just change all the uses from, e.g. x[i] to x.bytes[y];
an upcoming patch will simplify some of the code to remove loops by
using x.val instead of x.bytes when appropriate.

Signed-off-by: Laine Stump <laine@redhat.com>
2024-09-21 14:39:05 -04:00
Laine Stump
14623a3424 util: fix virSocketAddrMask() when source and result are the same object
Many years ago (2011), virSocketAddrMask() had caused a bug by failing
to initialize an IPv6-specific field in the result virSocketAddr. This
was fixed by memset(0)ing the entire result (*network) at the
beginning of the function (thus making sure anything and everything
was initialized).

The problem is that virSocketAddrMask() has a comment above it that
says that the source (addr) and destination (network) arguments can
point to the same virSocketAddr. But in that case, the
memset(*network, 0) at the top of the function is actually doing a
memset(*addr, 0), and so there is nothing left for all the assignments
to copy except a giant field of 0's.

Fortunately in the 13 years since the memset was added, nobody has
ever called virSocketAddrMask() with addr and network being the same.

This patch makes the code agree with the comment by copying/masking
into a local virSocketAddr (which is initialized to all 0) and then
copying that to *network after it's finished assigning things from
addr.

Fixes: ba08c5932e
Signed-off-by: Laine Stump <laine@redhat.com>
2024-09-21 14:37:54 -04:00
Laine Stump
f7a2d158f7 network: fix argument order/log level in message about firewall_backend
Oops.

Fixes: 64b966558c
Signed-off-by: Laine Stump <laine@redhat.com>
2024-09-19 16:14:21 -04:00
Laine Stump
c7ea694f7d qemu: rework needBridgeChange/needReconnect decisions in qemuDomainChangeNet()
This patch simplifies (?) the of qemuDomainChangeNet() code while
fixing some incorrect decisions about exactly when it's necessary to
re-attach an interface's bridge device, or to fail the device update
(needReconnect[*]) because the type of connection has changed (or
within bridge and direct (macvtap) type because some attribute of the
connection has changed that can't actually be modified after the
tap/macvtap device of the interface is created).

Example 1: it's pointless to require the bridge device to be
reattached just because the interface has been switched to a different
network (i.e. the name of the network is different), since the new
network could be using the same bridge as the old network (very
uncommon, but technically possible). Instead we should only care if
the name of the *bridge device* changes (or if something in
<virtualport> changes - see Example 3).

Example 2: wrt changing the "type" of the interface, a change should
be allowed if old and new type both used a bridge device (whether or
not the name of the bridge changes), or if old and new type are both
"direct" *and* the device being linked and macvtap mode remain the
same. Any other change in interface type cannot be accommodated and
should be a failure (i.e. needReconnect).

Example 3: there is no valid reason to fail just because the interface
has a <virtualport> element - the <virtualport> could just say
"type='openvswitch'" in both the before and after cases (in which case
it isn't a change by itself, and so is completely acceptable), and
even if the interfaceid changes, or the <virtualport> disappears
completely, that can still be reconciled by simply re-attaching the
bridge device. (If, on the other hand, the modified <virtualport> is
for a type='direct' interface, we can't domodify that, and so must
fail (needReconnect).)

(I tried splitting this into multiple patches, but they were so
intertwined that the intermediate patches made no sense.)

[*] "needReconnect" was a flag added to this function way back in
2012, when I still believed that QEMU might someday support connecting
a new & different device backend (the way the virtual device connects
to the host) to an already existing guest netdev (the virtual device
as it appears to the guest). Sadly that has never happened, so for the
purposes of qemuDOmainChangeNet() "needReconnect" is equivalent to
"fail".

Resolves: https://issues.redhat.com/browse/RHEL-7036
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-19 13:56:39 -04:00
Laine Stump
601f4160b9 qemu: replace open-coded remove/attach bridge with virNetDevTapReattachBridge()
The new function does what the old qemuDomainChangeNetbridge() did
manually, except that:

1) the new function supports changing from a bridge of one type to
   another, e.g. from a Linux host bridge to an OVS
   bridge. (previously that wasn't handled)

2) the new function doesn't emit audit log messages. This is actually
   a good thing, because the old code would just log a "detach"
   followed immediately by "attach" for the same MAC address, so it's
   essentially a NOP. (the audit logs don't have any more detailed
   info about the connection - just the VM name and MAC address, so it
   makes no sense to log the detach/attach pair as it's not providing
   any information).

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-19 13:56:31 -04:00
Laine Stump
e3f8bccea6 util: don't return early from virNetDevTapReattachBridge() if "force" is true
It can be useful to force an interface to be detached/reattached from
its bridge even if it's the same bridge - possibly something like the
virtualport profileID has changed, and a detach/attach cycle will get
it connected with the new profileID.

The one and only current use of virNetDevTapReattachBridge() sets
force to false, to preserve current behavior. An upcoming patch will
use it with force set to true.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-19 13:56:19 -04:00
Laine Stump
a37bd2a15b qemu: prevent unnecessarily failing live interface update
Attempts to use update-device to modify just the link state of a guest
interface were failing due to a supposed attempt to modify something
in the interface that can't be modified live (even though the only
thing that was changing was the link state, which *can* be modified
live).

It turned out that this failure happened because the guest interface
in question was type='network', and the network in question was a
'direct' network that provides each guest interface with one device
from a pool of network devices. As a part of qemuDomainChangeNet() we
would always allocate a new port from the network driver for the
updated interface definition (by way of calling
virDomainNetAllocateActualDevice(newdev)), and this new port (ie the
ActualNetDef in newdev) would of course be allocated a new host device
from the pool (which would of course be different from the one
currently in use by the guest interface (in olddev)). Because direct
interfaces don't support changing the host device in a live update,
this would cause the update to fail.

The solution to this is to realize that as long as the interface
doesn't get switched to a different network as a part of the update,
the network port information (ie the ActualNetDef) will not change as
a part of updating the guest interface itself. So for sake of
comparison we can just point the newdev at the ActualNetDef of olddev,
and then clear out one or the other when we're done (to avoid a double
free or, more likely, attempt to reference freed memory).

(If, on the other hand, the name of the network has changed, or if the
interface type has changed to type='network' from something else, then
we *do* need to allocate a new port (actual device) from the network
driver (as we used to do in all cases when the new type was
'network'), and also indicate that we'll need to replace olddev in the
domain with newdev (because either of these changes is major enough
that we shouldn't just try to fix up olddev)

Partially-Resolves: https://issues.redhat.com/browse/RHEL-7036
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-19 13:56:06 -04:00
Peter Krempa
852380cef5 qemuBuildChardevCommand: Remove unused variable
'charstr' is unused since 36d06a5637, breaking the build on some
platforms. Remove it.

Fixes: 36d06a5637
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2024-09-19 13:12:02 +02:00
Peter Krempa
24d468993c qemu: Reject unsupported chardev backend protocols
QEMU supports only 'raw' and 'telnet' in the

 <protocol type='telnets'/>

element. Reject 'telnets' and 'tls'. TLS transport for qemu chardevs is
configured via "tls='yes'" attribute added to the "<source>" element
instead, so this prevents potential misconfig as the value would be
silently accepted.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/412
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:15 +02:00
Peter Krempa
3778964207 conf: Convert 'protocol' field of TCP char device backend to proper type
Use virDomainChrTcpProtocol as type, convert the parser to use
virXMLPropEnum and fix one switch statement in the VMX driver.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:15 +02:00
Peter Krempa
2256466f70 qemu: monitor: Remove the old chardev backend generator
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:15 +02:00
Peter Krempa
e352a692a7 qemu: Use the new chardev backend JSON props generator also in the monitor
Now that we have a unified generator of chardev backend which is also
validated against the QMP schema we can replace the old generator with
it.

This patch modifies the monitor code to take virJSONValue 'props'
instead of the chardev definition and adds the conversion from the
chardev definition to JSON on higher levels.

The monitor code now also attempts to extract the returned 'pty' if
returned from qemu, so higher level code needs to report the error if
the path is needed and missing.

The current monitor generator is for now abandoned in place and will be
removed later.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:15 +02:00
Peter Krempa
d897ad2b89 qemu: Move check for chardev backends which can't be hotplugged out of the monitor
The upcoming refactor of the monitor code will make the hotplug code
paths use the same generator we have for commandline -chardev backends
which doesn't refuse to format certain backends which can't be
hotplugged.

To prepare for this we add a check to qemuHotplugChardevAttach()
refusing such hotplug and remove 'qemumonitorjsontest' test cases which
will not make sense any more.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:14 +02:00
Peter Krempa
36d06a5637 qemu: Introduce unified chardev backend config generator
Similarly to how we approach the generators for
-device/-object/-blockdev/-netdev rewrite the generator of -chardev to
be unified with the generator for the monitor.

Unfortunately with -chardev it will be a bit more quirky when compared
to the others as the generator itself will need to know whether it
generates command line output or not as a few field names change and data
is nested differently.

This first step adds the generator and uses it only for command line
generation. This was possible to achieve without changing any of the
output in tests.

In further patches the same generator will then be used also in the
monitor code replacing both.

As basis for the generator I took the monitor code but modified it to
have the same field order as the commandline code and extended it
further to support all backend types, even those which are not
hotpluggable.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:14 +02:00
Peter Krempa
9c88a566d8 qemu: capabilities: Explain that QEMU_CAPS_CHARDEV_JSON will be used in tests only
I've added that capability a long time ago when I was converting various
stuff to use JSON but the support in '-chardev' didn't yet materialize.

Fix the comment to make that clear and also that it'll be used in tests
for the upcoming refactor of the chardev code (so that we can validate
generator against the schema even if that doesn't yet work).

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:14 +02:00
Laine Stump
200f60b2e1 network: *un*set the firewalld zone while shutting down a network
When a bridge device for a virtual network had been placed in a
firewalld zone while starting the network, then even after the network
is shut down and the bridge device is deleted, its name will still
show up in the list of interfaces for whichever zone it had been in,
and this setting will persist through the next time a device with the
same name is created (until a zone is once again explicitly set, or
the device is removed via a firewalld API call).

Usually this isn't a problem, but in the case of forward mode='open',
someone might start the network once with a zone specified, then
shut down the network, remove the zone from its config, and start it
again; in this case the bridge device would come up using the zone
from the previous time it was started.

The solution to this is to remove the interface from whatever zone it
is in as the network is being shut down. There is no downside to doing
this, since the device is going to be deleted anyway. Note that
forward mode='bridge' uses a bridge device that was created outside of
libvirt, and libvirt won't be deleting that bridge, so we take care to
not unset the zone in that case.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-17 11:22:56 -04:00
Laine Stump
1a3778fe0a network: remove firewalld version check from networkSetBridgeZone()
At the time the version check in this function was written, there were
still several supported versions of some distros that were using a
version of firewalld too old to support the "rich rule priorities"
used by the 'libvirt' zone that we installed for firewalld. Today the
newest distro that has a version of firewalld < 0.7.0 is
RHEL7/CentOS7, so we can remove the complexity and if the libvirt zone
is missing simply say "the libvirt zone is missing".

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-17 10:55:14 -04:00
Laine Stump
1a72b83d56 network: support setting firewalld zone for bridge device of open networks
The bit of code that sets the firewalld zone was previously a part of
the function networkAddFirewallRules(), which is not called for
networks with <forward mode='open'/>.

Setting the 'libvirt' zone for the bridge device of virtual networks
that also add firewall rules is usually necessary in order to get the
expected traffic through without modifying firewalld's default zone
(which would be a bad idea, because that would affect all the other
host interfaces set to the default zone), but in general we would
*not* want the bridge device for a mode='open' virtual network to be
automatically placed in the "libvirt" zone. However, a user might want
to *explicitly* set some other firewalld zone for mode='open'
networks, and libvirt's network config is a convenient place to do
that.

We enable this by moving the code that sets the firewalld zone into a
separate function that is called for all forward modes that use a
bridge device created/managed by libvirt (nat, route, isolated,
open). If no zone is specified, then the bridge device will be in
whatever zone interfaces are put in by default, but if the <bridge>
element has a "zone" attribute, then the new bridge device will be
placed in the specified zone.

NB: This function is only called when the network is started, and
*not* when the firewall rules of an active network are reloaded at
virtnetworkd restart time, because the firewalld zone of an interface
isn't something that gets inadvertantly changed as a part of some
other unrelated action. For example all iptables rules are cleared by a
firewalld restart, including those rules added by libvirt, but there
is no blanket action that changes the zone of all interfaces, so it's
useful for libvirt to reload its rules when restarting virtnetworkd,
but pointless to re-add the interface to its preferred zone.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/215
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-17 10:55:14 -04:00
Laine Stump
eeebbc1eec network: belatedly update an error message
The 'open' forward type probably hadn't yet been added when this
message was written.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-17 10:55:14 -04:00
Laine Stump
114c0ec656 network: permit <forward mode='open'/> when a network has no IP address
The whole point of <forward mode='open'/> is to supress libvirt from
adding any firewall rules for a network, and someone might want to
create a network with no IP address (i.e. they don't want the guests
to have connectivity to the host via this interface) and no firewall
rules (they don't want any, or they want to add their own). So there's
no reason to fail when a network has <forward mode='open'/> and also
has no IP address.

Kind-of-Resolves: https://gitlab.com/libvirt/libvirt/-/issues/588
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-17 10:55:14 -04:00
Martin Kletzander
d0a48eeb72 network: Remove unused variable in networkDestroy
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-17 10:43:18 +02:00
Martin Kletzander
8a2717e803 network: Clean up after disappeared transient inactive networks
If a network disappeared the daemon should not only remove it from the
list of networks, but also do a proper cleanup.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:38 +02:00
Martin Kletzander
2bea2782d5 network: Separate cleanup from networkRemoveInactive
The new function (networkCleanupInactive) can be called from an iterator
over the list of networks without the risk of deadlock.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:37 +02:00
Martin Kletzander
74a22c09be network: Try to read dnsmasq PIDs for inactive networks too
Just in case one needs a clean up.

Resolves: https://issues.redhat.com/browse/RHEL-50968
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:37 +02:00
Martin Kletzander
447fda8981 network: Clean up after inactive objects during start
Once networkUpdateState() identifies a dead network it should clean up
after it as well.

Resolves: https://issues.redhat.com/browse/RHEL-50968
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:36 +02:00
Martin Kletzander
0e43cb09ee network: Don't check if network is active in networkShutdownNetwork
It skips the cleanup from networkStartNetwork and the only other path
already checks if the network is active or not.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:35 +02:00
Martin Kletzander
3e43670f01 network: Move port deletion into the shutdown function
It will be more useful in there when calling from new places.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:35 +02:00
Martin Kletzander
5988fdec91 network: Do not call virNetworkObjUnsetDefTransient on start cleanup
The function networkShutdownNetwork already does that.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:34 +02:00
Martin Kletzander
97ed0574ea network: Do not update network ports for inactive networks
The semantic does not change since inside networkUpdatePort() (well,
networkNotifyPort, for which the former is a wrapper) exits for inactive
networks, but with an error we can easily avoid with this patch.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:34 +02:00
Andrea Bolognani
d622ca04f6 apparmor: Don't check for existence of templates upfront
Currently, if either template is missing AppArmor support is
completely disabled. This means that uninstalling the LXC
driver from a system results in QEMU domains being started
without AppArmor confinement, which obviously doesn't make any
sense.

The problematic scenario was impossible to hit in Debian until
very recently, because all AppArmor files were shipped as part
of the same package; now that the Debian package is much closer
to the Fedora one, and specifically ships the AppArmor files
together with the corresponding driver, it becomes trivial to
trigger it.

Drop the checks entirely. virt-aa-helper, which is responsible
for creating the per-domain profiles starting from the
driver-specific template, already fails if the latter is not
present, so they were always redundant.

https://bugs.debian.org/1081396

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-09-16 17:57:54 +02:00
Martin Kletzander
4b68c7e55b resctrl: Do not rewrite default MB values for new allocations
The code did it "just in case" the allocation was not reset for new
subdirectories.  That might've happened in the past with CAT settings,
but checking it now it is properly reset to its maximum values for each
new CLOSID (Class of Service ID).

The advantage of this is that we do not rewrite the value with itself
which causes an issue with the current linux kernel and mba_MBps option
where the default is UINT_MAX (or (uint32_t) -1), but gets rounded up to
bandwidth granularity (10), overflows and small number (4) is set
instead.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-16 12:07:15 +02:00
Michal Privoznik
ebc4580a5f Revert "vircommand: Parse /dev/fd on *BSD-like systems when looking for opened FDs"
Unfortunately, devfs on FreeBSD (accessible via /dev/fd) exposes
only those FDs which can be represented as a file. To cite
manpage [1]:

  The files /dev/fd/0 through /dev/fd/# refer to file descriptors
  which can be accessed through the file system.

This means FDs representing pipes and/or unnamed sockets are not
visible by default. To expose all FDs a slightly different
filesystem must be mounted [2]:

  mount -t fdescfs none /dev/fd

Apparently, on my test machine fdescfs is mounted by default and
thus I haven't seen any problem. Only after aforementioned patch
was merged our CI started reporting problems. While we could try
to figure out whether correct FS is mounted, it's a needless
micro optimization. Just revert the code to the state it was
before I touched it.

1: https://man.freebsd.org/cgi/man.cgi?query=fd&sektion=4&manpath=freebsd-release-ports
2: https://man.freebsd.org/cgi/man.cgi?query=fdescfs&sektion=5&n=1

This reverts commit 308ec0fb2c.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-16 12:05:19 +02:00
Michal Privoznik
308ec0fb2c vircommand: Parse /dev/fd on *BSD-like systems when looking for opened FDs
On BSD-like systems "/dev/fd" serves the same purpose as
"/proc/self/fd". And since procfs is usually not mounted, on such
systems we can use "/dev/fd" instead.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/518
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-13 14:50:43 +02:00
Michal Privoznik
4df8dc576f vircommand: Make sysconf(_SC_OPEN_MAX) failure non-fatal
The point of calling sysconf(_SC_OPEN_MAX) is to allocate big
enough bitmap so that subsequent call to
virCommandMassCloseGetFDsDir() can just set the bit instead of
expanding memory (this code runs in a forked off child and thus
using async-signal-unsafe functions like malloc() is a bit
tricky).

But on some systems the limit for opened FDs is virtually
non-existent (typically macOS Ventura started reporting EINVAL).

But with both glibc and musl using malloc() after fork() is safe.
And with sufficiently new glib too, as it's using malloc() with
newer releases instead of their own allocator.

Therefore, pick a sufficiently large value (glibc falls back to
256, [1], Darwin to 10240 [2] so 10240 should be good enough) to
fall back to and make the error non-fatal.

1: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/getdtsz.c;h=4c5a6208067d2f9eaaac6dba652702fb4af9b7e3;hb=HEAD
2  https://github.com/apple/darwin-xnu/blob/main/bsd/sys/syslimits.h#L104

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-13 14:50:43 +02:00
Michal Privoznik
6ded014ba3 vircommand: Isolate FD dir parsing into a separate function
So far, virCommandMassCloseGetFDsLinux() opens "/proc/self/fd",
iterates over it marking opened FDs in @fds bitmap. Well, we can
do the same on other systems (with altered path), like MacOS or
FreeBSD. Therefore, isolate dir iteration into a separate
function that accepts dir path as an argument.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-13 14:50:43 +02:00
Michal Privoznik
dfe496ae33 vircommand: Drop unused arguments from virCommandMassCloseGetFDs*()
Both virCommandMassCloseGetFDsLinux() and
virCommandMassCloseGetFDsGeneric() take @cmd argument only to
mark it as unused. Drop it from both.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-13 14:50:43 +02:00
Martin Kletzander
bfad111c43 resctrl: Use cache IDs instead of max_id/max_cache_id
It is not guaranteed for the cache IDs to be continuous, especially for
L3 caches.  Hence do not assume so and instead record the individual IDs
in a virBitmap.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-13 12:57:41 +02:00
Martin Kletzander
f3fd0664cf resctrl: Don't assume MBA availability in virResctrlAllocNewFromInfo
Weirdly, the existence of /sys/fs/resctrl/info/MB does not always mean
that MBA is available and used on the system.  Instead of assuming that
copy the values from the default (root) allocation.  This also makes it
nicer to use the proper values in case the system does not use
percentages or when the root allocation already limits the bandwidth.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-13 12:55:39 +02:00
Martin Kletzander
bc97a2c043 capabilities: Also report L2 caches
Since some systems support control for L2 caches as well as L3 caches it
would be useful to report their configuration in capabilities.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-13 12:55:39 +02:00
Martin Kletzander
4437a775dc resctrl: Add virResctrlInfoPerTypeFree
It will be easier to add more dynamic data later on.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-13 12:55:39 +02:00
Martin Kletzander
7c40f1ead9 resctrl: Add virResctrlInfoMemBWFree
It will be easier to add more dynamic data later on

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-13 12:55:39 +02:00
Martin Kletzander
03b6383f33 resctrl: Move virResctrlAllocCopyMemBW up in the file
This way it can be used later in virResctrlAllocGetUnused().

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-13 12:55:39 +02:00
Martin Kletzander
d7e3a15a98 resctrl: Relax the limit of maximum memory bandwidth allocation
The value 100 represented the percentage as it was originally done from
Intel in the Linux kernel and on their CPUs.  Since then the situation
changed and there is no error-prone way of figuring out the meaning of
the value in the current configuration, let alone its possible maximum.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-13 12:55:39 +02:00
Martin Kletzander
eae19bb505 resctrl: Account for memory bandwidth of 0 being valid
In some scenarios the memory bandwidth in the schemata file might be 0
and so can the minimum allocation in other ones.  Remove checks which
were added for extra cautiousness.

Resolves: https://issues.redhat.com/browse/RHEL-54235
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-13 12:55:39 +02:00
Stepan Zobal
f60e5f87d4 documentation: Remove untrue statement in GetVersion() description
The description of virConnectGetVersion() says the function might only
work with a privileged access to the hypervisor, not with a read-only
connection.  However that is not true since commit a2e2e4652f and can
be safely removed.

Signed-off-by: Stepan Zobal <szobal@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-12 15:53:10 +02:00
Jakub Palacky
cc05007a43 vmx: use xmlBufferDetach() when applicable
xmlBuffer->content was deprecated in libxml2 v2.13.0-33-gb34dc1e4
xmlBufferDetach(xmlBuffer) should be used instead

Signed-off-by: Jakub Palacky <jpalacky@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-12 14:52:55 +02:00
Jakub Palacky
09ebe53349 util/virxml: use xmlCtxtGetLastError when applicable
xmlParserCtxt->lastError was deprecated in libxml2 v2.13.0-103-g1228b4e0
xmlCtxtGetLastError(xmlParserCtxt) should be used instead

Signed-off-by: Jakub Palacky <jpalacky@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-12 14:52:55 +02:00
Jakub Palacky
317139a316 util/virutil: Use readpassphrase when libbsd is available
When libbsd is available, use the preferred readpassphrase() function isntead of getpass()
as the getpass() function has been marked as obsolete and shouldnt be used

Signed-off-by: Jakub Palacky <jpalacky@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-12 13:12:47 +02:00
Richard W.M. Jones
94e8a5b650 vmx: Allow '*' to appear in VMX file keys
When connecting to a VMware server (eg using vpx://) we download and
try to parse the VMware metadata '*.vmx' file of a guest.  In this
case a VMX file was found which contained this key:

  pciPassthru*.present = "False"

The '*' character was not previously allowed in keys so this failed to
parse with the error:

  VIR_ERR_CONF_SYNTAX: VIR_FROM_CONF: configuration file syntax error:
  memory conf:74: expecting an assignment

Resolves: https://issues.redhat.com/browse/RHEL-58446
Thanks: Daniel Berrange
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-09-12 09:51:58 +02:00
Tom
5f6ccb0875 Allow apparmor parser to be executed in /usr/bin
This commit modifies the AppArmor profile for virt-aa-helper to
accommodate an observed behavior in certain Linux distributions,
such as ArchLinux.

In these distributions, /usr/sbin symlinks to /usr/bin. To ensure
that virt-aa-helper can execute apparmor_parser when it resides
in /usr/bin, the profile has been updated accordingly.

Signed-off-by: Tom <libvirt-patch@douile.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-09-12 09:44:20 +02:00
Peter Krempa
e562b16ede virDiskNameParse: Fix integer overflow in disk name parsing
The conversion to index entails multiplication and accumulation by user
provided data which can easily overflow, use VIR_MULTIPLY_ADD_IS_OVERFLOW
to check if the string is valid.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/674
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-09-10 14:26:39 +02:00
Peter Krempa
a9ede822da virconf: Properly fix numeric overflow when parsing numbers in conf files
The previous fix didn't check the overflow in addition. Use the new
macro to check both multiplication and addition overflows.

Fixes: 8666523b7d
Closes: https://gitlab.com/libvirt/libvirt/-/issues/671
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-09-10 14:26:31 +02:00
Peter Krempa
23cb613606 internal: Add helper macro for checking multiply and add overflows
The macro does the two checks together so that it's obvious what we're
checking as doing it in place is really unpleasant.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-09-10 14:26:28 +02:00
Peter Krempa
3c5839973f virDomainFeaturesDefParse: Add comment warning about features being specified repeatedly
Few of the handlers didn't take that possibility into account. Warn
others.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-09-10 14:26:24 +02:00
Peter Krempa
ead2419df3 virDomainFeaturesTCGDefParse: Don't leak 'tcg_features' when '<tcg>' feature is repeated
Similarly to other cases users may specify the feature flag multiple
times.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-09-10 14:26:20 +02:00
Peter Krempa
574769ceb6 virDomainFeaturesHyperVDefParse: Don't overwrite hypervisor vendor_id
In case when the user specifies the '<hyperv/>' feature multiple times
we could overwrite already parsed data. Clear it beforehand.

As before this isn't trying to address the case of features being
specified multiple times not making much sense.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/675
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-09-10 14:26:09 +02:00
Peter Krempa
8e28f2c5c2 virBitmapShrink: Do not attempt to clear bits beyond end of buffer
'virBitmapShrink' clears the bits beyond the end of the bitmap when
shrinking and then reallocates to match the new size. As it uses the
address of the first bit beyond the bitmap to do the clearing it can
overrun the allocated buffer if we're not actually going to shrink it
and the last bit's address is on the chunk boundary.

Fix it by returning in that corner case and add few more tests to be
sure.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/673
Fixes: d6e582da80
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-09-10 14:25:37 +02:00
Peter Krempa
bc02cb9506 virDomainDefParseBootInitOptions: Don't leak 'name' on failure
One of the failure paths skips code which would assign the string from
the temporary variable to the parsed struct, thus leaking it on failure.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/672
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-09-10 14:24:48 +02:00
Michal Privoznik
2feeefc0b4 cpu_map: Install SierraForest description file
In one of recent commits new CPU model was introduced. But
corresponding change in meson.build is missing which results in
the XML file not being installed.

Fixes: 3afbb1644c
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-09 09:30:04 +02:00
Peter Krempa
ecffc91d02 qemuBackupDiskDataCleanupOne: Don't skip rest of cleanup if we can't enter monitor
Recent fix to use the proper 'async' monitor function would cause
libvirt to leak some of the objects it's supposed to clean up in other
places besides qemu.

Don't skip the whole function on failure to enter the job but just the
monitor section.

Fixes: 9b22c25548
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-06 18:14:34 +02:00
Peter Krempa
8666523b7d virconf: Fix numeric overflow when parsing numbers in conf files
The number is parsed manually without making sure it'll fit.

Fixes: 3bbac7cdb6
Closes: https://gitlab.com/libvirt/libvirt/-/issues/671
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-06 18:14:34 +02:00
Peter Krempa
5d77061d7e conf: Don't overwrite KVM feature config struct if the feature is present twice
Don't allocate the struct if it exists already. This sidesteps the
discussion about whether forbidding multiple feature definitions makes
sense.

Fixes: a8e0f9c682
Closes: https://gitlab.com/libvirt/libvirt/-/issues/670
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-06 18:14:34 +02:00