libvirt

mirror of https://gitlab.com/libvirt/libvirt.git synced 2024-12-22 05:35:25 +00:00

Author	SHA1	Message	Date
Martin Kletzander	27ae5e602a	qemu_hotplug: Report better error message for platform serial devices This should be better than the current for both hotplug: error: internal error: Invalid target model for serial device and hot-unplug: error: An error occurred, but the cause is unknown which should not be reached at all. Resolves: https://issues.redhat.com/browse/RHEL-66222 Resolves: https://issues.redhat.com/browse/RHEL-66223 Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-11-11 12:48:42 +01:00
Martin Kletzander	52c2e3e0a7	qemu: Expose qemuChrIsPlatformDevice outside from qemu_command Then it can be used from qemu_hotplug. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-11-11 12:48:42 +01:00
Boris Fiuczynski	bf0308b2d4	qemu: command: add multi boot device support on s390x If QEMU supports multi boot device make use of it instead of using the single boot device machine parameter. Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-11-11 10:15:06 +01:00
Boris Fiuczynski	3ccf692e08	qemu: capabilities: Add QEMU_CAPS_VIRTIO_CCW_DEVICE_LOADPARM Add capability QEMU_CAPS_VIRTIO_CCW_DEVICE_LOADPARM to detect multi boot device support in QEMU by checking the virtio-blk-ccw device property existence of loadparm. Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-11-11 10:15:06 +01:00
Michal Privoznik	a3b8753db9	virnetdevopenvswitch: Warn on unsupported QoS settings Let me preface this with stating the obvious: documentation on QoS in OVS is very sparse. This is all based on my observation and OVS codebase analysis. For the following QoS setting: <bandwidth> <inbound average="512" peak="1024" burst="32"/> </bandwidth> the following QoS setting is generated into OVS (NB, our XML values are in KiB/s, OVS has them in bits/s): # ovs-vsctl list qos _uuid : a087226b-2da6-4575-ad4c-bf570cb812a9 external_ids : {ifname=vnet1, vm-id="7714e6b5-4885-4140-bc59-2f77cc99b3b5"} other_config : {burst="262144", max-rate="8192000", min-rate="4096000"} queues : {0=655bf3a7-e530-4516-9caf-ec9555dfbd4c} type : linux-htb from which the following topology is generated: # for i in qdisc class; do tc -s -d -g $i show dev vnet1; done qdisc htb 1: root refcnt 2 r2q 10 default 0x1 direct_packets_stat 0 ver 3.17 direct_qlen 1000 Sent 2186 bytes 16 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 +---(1:fffe) htb rate 8192Kbit ceil 8192Kbit linklayer ethernet burst 1499b/1mpu 60b cburst 1499b/1mpu 60b level 7 \| Sent 2186 bytes 16 pkt (dropped 0, overlimits 0 requeues 0) \| backlog 0b 0p requeues 0 \| +---(1:1) htb prio 0 quantum 51200 rate 4096Kbit ceil 8192Kbit linklayer ethernet burst 32Kb/1mpu 60b cburst 32Kb/1mpu 60b level 0 Sent 2186 bytes 16 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 Long story short, the default class (1:) for an OVS interface has average and peak set exactly as requested. But since it's nested under another class (1:fffe), it can borrow unused bandwidth. And the parent is set to have rate = ceil = peak from our XML. From [1]: htb_tc_install() calls htb_parse_qdisc_details__() which sets: 'hc->min_rate = hc->max_rate;' and then calls htb_setup_class_(..., tc_make_handle(1, 0xfffe), tc_make_handle(1, 0), &hc); to set up the top parent class. 
In other words - the interface is set up so that it can always consume 'peak' bandwidth and there is no way for us to set it up differently. It's too late to deny setting 'peak' different to 'average' at XML validation phase so do the next best thing - throw a warning, just like we do in case <bandwidth/> is set for an unsupported <interface/> type. 1: https://github.com/openvswitch/ovs/blob/main/lib/netdev-linux.c#L5039 Resolves: https://issues.redhat.com/browse/RHEL-53963 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-11-08 10:41:02 +01:00
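The unit conversion mentioned in the commit above (XML values in KiB/s, OVS values in bits/s) can be sketched as follows. This is a minimal illustration, not libvirt's code: the function names are hypothetical, the factors 8000 and 8192 are inferred only from the sample `ovs-vsctl` output quoted in the message, and the warning condition is an assumption based on the statement that a 'peak' different from 'average' cannot be honoured.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: average/peak rate from the XML to OVS min-rate/max-rate.
 * Factor inferred from the sample output: 512 -> 4096000, 1024 -> 8192000. */
static uint64_t rate_to_ovs_bits(uint64_t xml_rate)
{
    return xml_rate * 8000ULL;
}

/* Hypothetical helper: burst from the XML to the OVS burst value.
 * Factor inferred from the sample output: 32 -> 262144. */
static uint64_t burst_to_ovs_bits(uint64_t xml_burst)
{
    return xml_burst * 8192ULL;
}

/* Assumption: a warning is warranted when 'peak' is set and differs from
 * 'average', because the parent class always allows 'peak' bandwidth. */
static bool ovs_peak_needs_warning(uint64_t average, uint64_t peak)
{
    return peak != 0 && peak != average;
}
```

With the values from the message, `rate_to_ovs_bits(512)` gives the reported `min-rate`, `rate_to_ovs_bits(1024)` the reported `max-rate`, and `burst_to_ovs_bits(32)` the reported `burst`.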
Michal Privoznik	844d1036eb	qemu_domain: Automagically add IOMMU if needed If a Q35 domain has a huge number of vCPUs (over 255, currently), then it needs IOMMU with Extended Interrupt Mode enabled (see check in qemuValidateDomainVCpuTopology()). Well, we already add some devices and do other tricks when parsing new domain XML. Might as well add IOMMU device if above condition is met. Resolves: https://issues.redhat.com/browse/RHEL-65844 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-11-07 10:46:33 +01:00
Michal Privoznik	b15047ff26	qemu: Turn EIM IOMMU on automagically If a Q35 domain has a huge number of vCPUs (over 255, currently), then it needs IOMMU with Extended Interrupt Mode enabled (see check in qemuValidateDomainVCpuTopology()). Well, we already add some devices and do other tricks when parsing new domain XML. Might as well turn the EIM on for IOMMU device. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-11-07 10:46:33 +01:00
Michal Privoznik	a9797d7c43	libvirt_private.syms: Export virDomainIOMMUDefNew() Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-11-07 10:46:32 +01:00
Ján Tomko	e45313c031	ch: check return value of virJSONValueArrayAppend It only errors out when presented with a non-array, but we do check it everywhere else. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-11-06 17:12:32 +01:00
Ján Tomko	da66bf53b0	util: json: check return value of virJSONValueFromJsonC In virJSONValueFromJsonC, the return value of virJSONValueFromJsonC was not checked in one case. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-11-06 17:12:32 +01:00
Ján Tomko	13f40898ab	qemu: chardev: avoid impossible overflow In the rare case where int and long long are not the same size, the multiplication of an int variable and an int constant might overflow. Cast the constant to long long to avoid this. Signed-off-by: Ján Tomko <jtomko@redhat.com> Fixes: `baa4edfb79` Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-11-06 17:12:32 +01:00
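The cast described in the commit above can be illustrated with a small sketch. The function name is hypothetical and the seconds-to-milliseconds conversion is assumed from the referenced commit `baa4edfb79`, which converts reconnect timeouts to milliseconds; this is not the actual libvirt code.

```c
#include <limits.h>

/* Hypothetical helper: convert a timeout in seconds to milliseconds.
 * Writing the constant as 1000LL promotes 'seconds' to long long before
 * the multiplication, so the product is computed in the wider type
 * instead of overflowing in int arithmetic. */
static long long reconnect_seconds_to_ms(int seconds)
{
    return seconds * 1000LL;
}
```

Without the `LL` suffix, `seconds * 1000` would be evaluated as an `int` and only then widened, which is undefined behaviour for large inputs on platforms where `long long` is wider than `int`.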
Marc-André Lureau	bb5e26749f	qemu: explicit swtpm state locking With upcoming v0.10 swtpm (commit `aa483aeb6d`), file locking with "lock" option is now supported and reflected in "tpmstate-opt-lock" capability. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>	2024-11-05 15:25:53 +01:00
Marc-André Lureau	f1304cc566	qemu_tpm: handle file/block storage source When swtpm reports "nvram-backend-dir", it can accept a single file or block device where TPM state will be stored. --tpmstate must be backend-uri=file://<path>. Teach the storage to use custom directory or file source location. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>	2024-11-05 15:25:53 +01:00
Marc-André Lureau	a110042d0c	schema: add TPM emulator <source type='dir' path='..'> Learn to parse a directory for the TPM state. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>	2024-11-05 15:25:53 +01:00
Marc-André Lureau	579fd44612	schema: add TPM emulator <source type='file' path='..'> Learn to parse a file path for the TPM state. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>	2024-11-05 15:25:53 +01:00
Marc-André Lureau	6d4eb07a55	tpm: rename 'storagepath' to 'source_path' Mechanically replace existing 'storagepath' with 'source_path', as the following patches introduce <source path='..'> configuration. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>	2024-11-05 15:25:53 +01:00
Marc-André Lureau	cc0aab9395	util: check swtpm nvram-backend-{dir,file} capabilities Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>	2024-11-05 15:25:53 +01:00
Martin Kletzander	a52cd504b3	qemu: Report supported panic device models in domcapabilities Domain capabilities include information about support for various devices and models. Panic devices are not included in the output which means that management applications need to include the logic for choosing the right device model or request a default model and try defining such a domain. Add reporting of panic device models into the domain capabilities based on the logic in qemuValidateDomainDefPanic() and also report whether panic devices are supported based on whether at least one model is supported. That way consumers of the domain capability XML can differentiate between libvirt not reporting the panic device models or no model being supported. Resolves: https://issues.redhat.com/browse/RHEL-65187 Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-11-05 09:57:37 +01:00
Ján Tomko	faf6edfa74	json: do not call json_tokener_free with NULL Add an error message for the rare case if json_tokener_new fails (allocation failure) and guard any use of json_tokener_free where tok might be NULL (this was possible in libvirt-nss when the json file could not be opened). https://gitlab.com/libvirt/libvirt/-/issues/581 Signed-off-by: Ján Tomko <jtomko@redhat.com> Reported-by: Simon Pilkington Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com>	2024-11-04 12:15:10 +01:00
Peter Krempa	d02140383d	virstring: Use 'g_new0' instead of improper use of 'g_malloc0_n' Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-11-01 15:52:18 +01:00
Peter Krempa	bb4bd9d31f	Replace improper use of g_malloc(0) with g_new0 Completely remove use of g_malloc (without zeroing of the allocated memory) and forbid further use. Replace use of g_malloc0 in cases where the variable holding the pointer has proper type. In all of the above cases we can use g_new0 instead. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-11-01 15:52:18 +01:00
Peter Krempa	354a3d2be4	virJSONValueFromString: Prefix error message from 'json-c' The error message from 'json-c' was passed along without any libvirt string which makes it hard to find in the source and isn't exactly clear when present in logs: libvirtd[843]: internal error : invalid utf-8 string Prefix the message with 'failed to parse JSON'. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2024-11-01 15:51:53 +01:00
Jiri Denemark	e71a510605	qemu: Fix maximum physical address size in baseline CPU We should include maximum physical address size in the CPU definition created by virConnectBaselineHypervisorCPU only if we know the value for all input CPUs. Otherwise we would create a CPU definition that is not usable on all hosts from which we gathered the CPU info. https://issues.redhat.com/browse/RHEL-24850 Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-11-01 10:19:24 +01:00
Laine Stump	7581e3b6d5	Revert "network: add rule to nftables backend that zeroes checksum of DHCP responses" This reverts commit `42ab0148dd`. This patch was supposed to fix the checksum of dhcp response packets by setting it to 0 (because having a non-0 but incorrect checksum was causing the packets to be dropped on FreeBSD guests). Early testing was positive, but after the patch was pushed upstream and more people could test it, it turned out that while it fixed the dhcp checksum problem for virtio-net interfaces on FreeBSD and OpenBSD, it also broke dhcp checksums for the e1000 emulated NIC on all guests (but not e1000e). So we're reverting this fix and looking for something more universal to be included in the next release. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>	2024-10-30 11:39:58 +01:00
Laine Stump	42ab0148dd	network: add rule to nftables backend that zeroes checksum of DHCP responses Many years ago (April 2010), soon after "vhost" in-kernel packet processing was added to the virtio-net driver, people running RHEL5 virtual machines with a virtio-net interface connected via a libvirt virtual network noticed that when vhost packet processing was enabled, their VMs could no longer get an IP address via DHCP - the guest was ignoring the DHCP response packets sent by the host. (I've been informed by danpb that the same issue had been encountered, and "fixed" even earlier than that, in 2006, with Xen as the hypervisor.) The "gory details" of the 2010 discussion are chronicled here: https://lists.isc.org/pipermail/dhcp-hackers/2010-April/001835.html but basically it was because packet checksums weren't being fully computed on the host side (because QEMU on the host and the NIC driver in the guest had agreed between themselves to turn off checksums because they were unnecessary due to the "link" between the two being entirely in local memory rather than an error-prone physical cable), but 1) a partial checksum was being put into the packets at some point by "someone" 2) the "don't use checksums" info was known by the guest kernel, which would properly ignore the "bad" checksum), and 3) the packets were being read by the dhclient application on the guest side with a "raw" socket (thus bypassing the guest kernel UDP processing that would have known the checksum was irrelevant and ignore it)), The "fix" for this ended up being two-tiered: 1) The ISC DHCP package (which contains the aforementioned dhclient program) made a fix to their dhclient code which caused it to accept packets anyway even if they didn't have a proper checksum (NB: that's not a full explanation, and possibly not accurate). This remedied the problem for guests with an updated dhclient. 
Here is the code with the fix to ISC DHCP: https://github.com/isc-projects/dhcp/blob/master/common/packet.c#L365 This eliminated the issue for any new/updated guests that had the fixed dhclient, but it didn't solve the problem for existing/old guest images that didn't/couldn't get their dhclient updated. This brings us to: 2) iptables added a new "CHECKSUM" target and "--checksum-fill" action: http://patchwork.ozlabs.org/patch/58525/ and libvirt added an iptables rule for each virtual network to match DHCP response packets and perform --checksum-fill. This way by the time dhclient on the guest read the raw packet, the checksum would be corrected, and the packet would be accepted. This was pushed upstream in libvirt commit v0.8.2-142-gfd5b15ff1a. The word at the time from those more knowledgeable than me was that the bad checksum problem was really specific to ISC's dhclient running on Linux, and so once their fix was in use everywhere dhclient was used, bad checksums would be a thing of the past and the --checksum-fill iptables rules would no longer be needed (but would otherwise be harmless if they were still there). (Plot twist: the dhclient code in fix (1) above apparently is on a Linux-only code path - this is very important later!) Based on this information (and also due to the opinion that fixing it by having iptables modify the packet checksum was really the wrong way to permanently fix things, i.e. an "ugly hack"), the nftables developers made the decision to not implement an equivalent to --checksum-fill in nftables. As a result, when I wrote the nftables firewall backend for libvirt virtual networks earlier this year, it didn't add in any rule to "fix" broken UDP checksums (since there was apparently no equivalent in nftables and, after all, that was fixed somewhere else 14 years ago, right???) 
But last week, when Rich Jones was doing routine testing using a Fedora 40 host (the first Fedora release to use the nftables backend of libvirt's network driver by default) and a FreeBSD guest, for "some strange reason", the FreeBSD guest was unable to get an IP address from DHCP!! https://www.spinics.net/linux/fedora/libvirt-users/msg14356.html A few quick tests proved that it was the same old "bad checksum" problem from 2010 come back to haunt us - it wasn't a Linux-only issue after all. Phil Sutter and Eric Garver (nftables people) pointed out that, while nftables doesn't have an action that will compute the checksum of a packet, it does have an action that will set the checksum to 0, and suggested we try adding a "zero the checksum" rule for dhcp response packets to our nftables ruleset. (Why? Because a checksum value of 0 in an IPv4 UDP packet is defined by RFC768 to mean "no checksum generated", implying "checksum not needed"). It turns out that this works - dhclient properly recognizes that a 0 checksum means "don't bother with the checksum", and accepts the packet as valid. So to once again fix this timeless bug, this patch adds such a checksum zeroing rule to the nftables rules setup for each virtual network. This has been verified (on a Fedora 40 host) to fix DHCP with FreeBSD and OpenBSD guests, while not breaking it for Fedora or Windows (10) guests. Fixes: `b89c4991da` Reported-by: Rich Jones <rjones@redhat.com> Fix-Suggested-by: Eric Garver <egarver@redhat.com> Fix-Suggested-by: Phil Sutter <psutter@redhat.com> Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2024-10-25 12:00:52 -04:00
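The RFC 768 rule that the commit above relies on (a transmitted checksum of 0 in an IPv4 UDP packet means "no checksum generated") can be sketched from the receiver's side. This is a hypothetical helper for illustration only; it is not code from dhclient, nftables, or libvirt, and it assumes the caller has already computed the checksum over the pseudo-header and datagram.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receiver-side check for an IPv4 UDP datagram.
 * 'received' is the checksum field from the packet, 'computed' is the
 * checksum the receiver calculated itself. */
static bool udp_checksum_ok(uint16_t received, uint16_t computed)
{
    /* RFC 768: an all-zero transmitted checksum means the sender
     * generated no checksum, so there is nothing to verify. */
    if (received == 0)
        return true;

    /* RFC 768: a computed checksum of zero is transmitted as all ones. */
    uint16_t expected = (computed == 0) ? 0xFFFF : computed;

    return received == expected;
}
```

A partial or stale checksum fails this check, which matches the behaviour described in the message; zeroing the field makes the same packet pass.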
Laine Stump	d5af1e90bb	network: don't unset the firewalld zone if it's going to be immediately re-set Any time the firewalld zone for an interface is set, by definition that removes it from any previous zone that it was in, so there is really no point in unsetting the zone if it's just going to be immediately set again. This is useful because when firewalld reloads its rules, 3 things happen: 1) firewalld flushes all firewall rules (including those added by libvirt) 2) firewalld unsets the zones for all interfaces (including those set by libvirt) 3) firewalld re-adds its own rules, and sets the zone for all the interfaces it manages 4) firewalld sends a dbus message that libvirt is watching for, and when libvirt receives that message, it reloads all of the libvirt-generated rules, and also re-sets the firewalld zone for the bridge interfaces managed by libvirt. libvirt accomplishes step 4 by a) calling networkRemoveFirewallRules(), and then b) calling networkAddFirewallRules(). But (because it is useful in other contexts) networkRemoveFirewallRules() will attempt to unset the zone for each bridge interface, and when firewalld receives this request, it sees that the bridge interface has no zone (because it was unset by firewalld in step (2) above), and thus logs an error message. There is no way for libvirt to suppress an error message that is logged by firewalld when a request to firewalld fails. But what libvirt can do is realize that in these cases, the firewalld zone is about to be set again anyway, and so we don't need to unset the zone. This patch handles that by adding a bool unsetZone to the arguments of networkRemoveFirewallRules(); most calls to networkRemoveFirewallRules() have unsetZone=true, but in two cases where the zone is about to be reset, networkRemoveFirewallRules() is called with unsetZone=false, which prevents the call to virFirewallDInterfaceUnsetZone() and thus avoids the unnecessary (and confusing to users!) 
error message that would have been logged by firewalld. Signed-off-by: Laine Stump <laine@redat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-24 12:31:03 -04:00
Laine Stump	e8228a9e79	network: ignore/don't log errors when unsetting firewalld zone The most common "error" when trying to unset the firewalld zone of an interface is for firewalld to tell us that the interface already isn't in any zone. Since this is what we want, no need to alarm the user by logging it as an error. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-24 11:50:41 -04:00
Jiri Denemark	f4dc248a95	domain_capabilities: Report CPU blockers When a CPU model is reported as usable='no' an additional <blockers model='...'> element is added for that CPU model to show which features are missing for the CPU model to become usable. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-24 15:53:51 +02:00
Jiri Denemark	016be5510a	domain_capabilities: Sort CPU models Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-24 15:53:51 +02:00
Jiri Denemark	0c6134f190	util: Introduce virStringListRemoveDuplicates Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-24 15:53:51 +02:00
Jiri Denemark	f928eb5fc8	qemu: Change CPU comparison algorithm for future models When starting a domain we check whether the guest CPU definition is compatible with the host (i.e., when the host supports all features required both explicitly and by the specified CPU model) as long as check == 'partial', which is the default. We are doing so by checking our definition of the CPU model in the CPU map amending it with explicitly mentioned features and comparing it to features QEMU would enable when started with -cpu host. But since our CPU model definitions often slightly differ from QEMU we may be checking features which are not actually needed and on the other hand not checking something that is part of the CPU model in QEMU. This patch changes the algorithm for CPU models added in the future (changing it for existing models could cause them to suddenly become incompatible with the host and domains using them would fail to start). The new algorithm uses information we probe from QEMU about features that block each model from being directly usable. If all those features are explicitly disabled in the CPU definition we consider the base model compatible with the host. Then we only need to check that all explicitly required features are supported by QEMU on the host to get the result for the whole CPU definition. After this we only use the model definitions (for newly added models) from CPU map for creating a CPU definition for host-model. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-23 16:00:45 +02:00
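The two-step check described in the commit above can be sketched roughly as follows. All names and the sample feature lists are made up for illustration; the real implementation works on libvirt's CPU definition structures, not on plain string arrays.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if the NULL-terminated list contains 'name'. */
static bool list_contains(const char *const *list, const char *name)
{
    for (size_t i = 0; list && list[i]; i++) {
        if (strcmp(list[i], name) == 0)
            return true;
    }
    return false;
}

/* Hypothetical sketch of the new algorithm:
 * 1. every blocker reported by QEMU for the model must be explicitly
 *    disabled in the CPU definition;
 * 2. every explicitly required feature must be supported on the host. */
static bool cpu_def_usable(const char *const *blockers,
                           const char *const *disabled,
                           const char *const *required,
                           const char *const *host_features)
{
    for (size_t i = 0; blockers && blockers[i]; i++) {
        if (!list_contains(disabled, blockers[i]))
            return false;
    }
    for (size_t i = 0; required && required[i]; i++) {
        if (!list_contains(host_features, required[i]))
            return false;
    }
    return true;
}

/* Made-up sample data, not real CPU feature names. */
static const char *const sample_blockers[] = { "feat-a", "feat-b", NULL };
static const char *const sample_disabled_all[] = { "feat-a", "feat-b", NULL };
static const char *const sample_disabled_some[] = { "feat-a", NULL };
static const char *const sample_required[] = { "feat-c", NULL };
static const char *const sample_host[] = { "feat-c", "feat-d", NULL };
```

With this sample data the definition is usable only when both blockers are disabled; leaving `feat-b` enabled, or requiring a feature the host lacks, makes the check fail.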
Jiri Denemark	e373f87034	qemu: Introduce virQEMUCapsGetCPUBlockers A function for accessing a list of features blocking CPU model usability. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-23 16:00:44 +02:00
Jiri Denemark	5f8abbb7d0	cpu: Introduce virCPUCompareUnusable As opposed to the existing virCPUCompare{,XML} this function does not use CPU model definitions from CPU map. It relies on CPU model usability info from a hypervisor with a list of blockers that make the selected CPU model unusable. Explicitly requested features are checked against the hypervisor's view of a host CPU. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-23 16:00:44 +02:00
Jiri Denemark	591b364f49	qemu: Separate partial CPU check into a function The new qemuDomainCheckCPU function is used as a replacement for virCPUCompare to make sure all callers use the same comparison algorithm. As a side effect qemuConnectCompareHypervisorCPU now properly reports CPU compatibility for CPU models that are considered runnable by QEMU even if our definition of the model disagrees. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-23 16:00:44 +02:00
Jiri Denemark	52d2a8eb6c	qemu: Use virCPUCompare in qemuConnectCompareHypervisorCPU directly The function already parses CPU XML on s390. By parsing it consistently on all architectures we can switch to virCPUCompare and easily replace it with a QEMU specific helper in the following patch. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-23 16:00:44 +02:00
Jiri Denemark	1c45473b93	qemu: Use g_autoptr in qemuConnectCompareHypervisorCPU Let's get rid of the only explicitly freed variable left in qemuConnectCompareHypervisorCPU. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-23 16:00:44 +02:00
Jiri Denemark	5475688a29	cpu: Introduce virCPUGetCheckMode On x86 the function returns whether an old style compat check mode should be used for a specified CPU model according to the CPU map. All other architectures will always use compat mode. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-23 16:00:44 +02:00
Jiri Denemark	cd93f7ddab	cpu_map: Use compat partial check for all x86 CPU models Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-23 16:00:44 +02:00
Jiri Denemark	f8ade72c2b	cpu_x86: Introduce <check> element for CPU models CPU models in the CPU map may be marked with <check partial="compat"/> to indicate a backward compatible partial check (comparing our definition of the model with the host CPU) should be performed. Other models will be checked using just runnability info from QEMU. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-23 16:00:44 +02:00
Peter Krempa	36080e1b57	qemu: snapshot: Delete leftover overlay files for <transient/> disks When a VM is terminated by host reboot libvirt doesn't get to cleaning out the temporary overlay file used for transient disks. Since we create those files with a very specific suffix it's almost guaranteed that if it exists it's a leftover from a libvirt run. Delete them instead of complaining to preserve functionality. Closes: https://gitlab.com/libvirt/libvirt/-/issues/684 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2024-10-22 08:15:36 +02:00
Peter Krempa	7cbe9e94c4	util: bitmap: Rewrite virBitmapShrink using new helpers Rather than reimplement everything manually use virBitmapBuffsize to find the current number of units, realloc the buffer and clear the tail using virBitmapClearTail(). This fixes a corner case where the buffer would be over-allocated by one unit when shrinking to the boundary of the unit size. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-17 17:09:24 +02:00
Peter Krempa	e506e0b3f1	util: virbitmap: Extract clearing of unused bits at the end of the last unit Extract the clearing of the trailing bits from 'virBitmapSetAll' into a new helper. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-17 17:09:24 +02:00
Peter Krempa	e572150ebe	virbitmap: Extract and reuse buffer size calculation into a function Calculating the number of elements can come in handy in multiple places, extract it from virBitmapNew. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-17 17:09:24 +02:00
Peter Krempa	cfe638ef80	virBitmapNewCopy: Honor sizes of either bitmap when doing memcpy() 'virBitmapNewCopy()' allocates a new bitmap with the same number of bits but uses the internal allocation length as argument for the memcpy() operation to copy the bits. Due to bugs in other code these may not be the same resulting into a buffer overflow if the source is over-allocated. Use the buffer length of the target bitmap instead. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-17 17:09:24 +02:00
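The helpers introduced by the four bitmap commits above (buffer size calculation and clearing of the unused tail bits) can be sketched as below. This is a simplified illustration with hypothetical names that operates on a bare array of `unsigned long`; it is not the actual virBitmap implementation.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits stored in one unit of the bitmap buffer. */
#define UNIT_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Number of units needed to hold 'nbits' bits. Rounding up this way
 * yields exactly one unit when nbits equals UNIT_BITS, i.e. no extra
 * unit is allocated at the unit boundary. */
static size_t bitmap_buffsize(size_t nbits)
{
    return (nbits + UNIT_BITS - 1) / UNIT_BITS;
}

/* Mask selecting the bits of the last unit that are actually in use. */
static unsigned long bitmap_tail_mask(size_t nbits)
{
    size_t tail = nbits % UNIT_BITS;

    return tail ? ~0UL >> (UNIT_BITS - tail) : ~0UL;
}

/* Clear the unused bits at the end of the last unit. */
static void bitmap_clear_tail(unsigned long *map, size_t nbits)
{
    if (nbits > 0)
        map[bitmap_buffsize(nbits) - 1] &= bitmap_tail_mask(nbits);
}
```

A copy routine that sizes its `memcpy()` with `bitmap_buffsize()` of the target bitmap cannot write past the target buffer even if the source happens to be over-allocated, which is the idea behind the virBitmapNewCopy fix.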
Martin Kletzander	f7c89763b1	qemu: Do not hardcode Hyper-V feature names on command line When constructing the command line for QEMU, some Hyper-V features were hardcoded, probably due to the fact that they could not have been automatically translated from the libvirt feature name to QEMU CPU feature name. Well now they can be, thanks to their additions to the virQEMUCapsCPUFeaturesX86 translation table. Translate all such features the same way when constructing the command line. This way any future feature that is not translated will be caught by tests (if a test is added for it) which was not the case when it was just hardcoded. Hopefully this avoids at least some possible future issues. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-17 00:43:36 +02:00
Martin Kletzander	ca8c0862ac	qemu: Add more translations to virQEMUCapsCPUFeatureTranslationTable Hyper-V enlightenment features can have hyphenated names which libvirt exposes under Hyper-V features with underscored names. When libvirt checks that all requested features were enabled by QEMU (on x86 architectures) it first queries for all those that QEMU knows and compiles them in a map while using the virQEMUCapsCPUFeaturesX86 for translations. Some features (well, all Hyper-V features with underscores) were not present in the translation table and were incorrectly reported as not enabled, consequently failing the start of any such domain. Add all hyphenated/underscored Hyper-V feature names into the aforementioned translation table. That way domains with these features enabled can be started when QEMU and the kernel support them. Resolves: https://issues.redhat.com/browse/RHEL-7122 Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-17 00:43:35 +02:00
Adam Julis	0fd36e9656	lxc: fix variable storage order before call virDomainConfNWFilterInstantiate() was called without an updated net->ifname, which in some cases caused an error message to be thrown. If the function fails, the change is reverted. Resolves: https://gitlab.com/libvirt/libvirt/-/issues/658 Signed-off-by: Adam Julis <ajulis@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-16 16:30:19 +02:00
Martin Kletzander	f2710260d4	qemu_namespace: Only replicate labels on created files Function qemuNamespaceMknodOne() is trying to replicate a file from the parent namespace as perfectly as possible, with the same permissions, labels, ACLs, etc. If that file already existed it means that the qemu process is probably using it already and the current setting is probably more correct than the ones from the parent namespace. In order to reflect that only replicate the file metadata when it was (re-)created in this function. Resolves: https://issues.redhat.com/browse/RHEL-62174 Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-10-16 15:07:10 +02:00
Martin Kletzander	26f249034d	qemu_namespace: Properly report new files Function qemuNamespaceMknodOne() is supposed to return 0 if the file did not exist before this function. If, however, the file existed, but was removed and recreated by this function the @existed flag should be reset to its proper state (false) because the function then behaves the same way as if the file did not exist as it needed to be recreated. So reset the @existed flag to properly reflect what happened. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-10-16 15:07:10 +02:00
Martin Kletzander	2b19f4b82d	qemu_namespace: Rename variable The boolean actually tells whether the file existed when the function was called and using it in more places later on makes them confusing (e.g. do something with a file if it does not exist). To better reflect the above and prepare for next patch rename this variable. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-10-16 15:07:10 +02:00
Peter Krempa	baa4edfb79	qemu: chardev: Use 'reconnect-ms' instead of deprecated 'reconnect' qemu-9.2 will deprecate the 'reconnect' field in favor of 'reconnect-ms'. As libvirt currently doesn't track the timeouts in milliseconds we simply convert them to avoid use of the deprecated field. Quite a lot of churn is caused by the need to plumb 'qemuCaps' into the chardev props generator. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-16 14:39:09 +02:00
Peter Krempa	23fa1d2184	qemu: capabilities: Introduce QEMU_CAPS_CHARDEV_RECONNECT_MILISECONDS New qemu introduced the 'reconnect-ms' field for character devices allowing the reconnect timeout to be specified in milliseconds, which also deprecates the existing 'reconnect' field that libvirt uses. To avoid use of deprecated interfaces add a capability which will allow us to use the new field. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-16 14:39:09 +02:00
Andrea Bolognani	81493d8eb6	apparmor: Allow running i686 VMs on Debian 12 In Debian 12, the qemu-system-i386 binary in /usr/bin is a wrapper script, with the actual executable living in /usr/libexec instead. This makes it impossible to run i686 VMs when AppArmor is enabled. Allow running the actual binary. https://bugs.debian.org/1030926 Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Jim Fehlig <jfehlig@suse.com>	2024-10-16 09:46:49 +02:00
Ján Tomko	e996536a3b	Remove pointless bool conversions Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-15 14:48:35 +02:00
Peter Krempa	e2c6f4c800	qemu: snapshot: Remove dead code in 'qemuSnapshotDeleteBlockJobRunning' 'qemuSnapshotDeleteBlockJobIsRunning' returns only 0 and 1. Convert it to bool and remove the dead code handling -1 return in the caller. Closes: https://gitlab.com/libvirt/libvirt/-/issues/682 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-14 16:25:21 +02:00
Peter Krempa	04d6a0ec5d	qemu: migration: Fix blockdev config with VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES The idea of migration with VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES populated is to sparsify the image. The QEMU NBD client as it was configured in commit `621f879adf` would signal to the destination to do thick allocation of holes which would result in a non-sparse image for any backend except a qcow2 image which I used to test it. Switch to VIR_DOMAIN_DISK_DETECT_ZEROES_UNMAP and VIR_DOMAIN_DISK_DISCARD_UNMAP which tells the NBD client (and that in turn the NBD server) to preserve the sparse blocks it detected from the image. Fixes: `621f879adf` Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-14 16:25:21 +02:00
Jiri Denemark	0c653fc9a5	util: Rename variable "major" in virIsDevMapperDevice major() is a macro defined in sys/sysmacros.h so luckily the code works, but it's very confusing. Let's rename the local variable to make the difference between it and the macro more obvious. And while touching the line we can also initialize it to make sure "clever" analyzers do not think it may be used uninitialized. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2024-10-14 11:48:50 +02:00
Laine Stump	37800af9a4	network: inhibit idle timeout of daemon if there are any active networks When the daemons were split out from the monolithic libvirtd, the network driver didn't implement "inhibit idle timeout if there are any active objects" as was done for other drivers, so virtnetworkd would always exit after 120 seconds of no incoming connections. This didn't ever cause any visible problem, although it did mean that anytime a network API was called after an idle time > 120 seconds, the restarting virtnetworkd would flush and reload all the iptables/nftables rules for any active networks. This patch replicates what is done in the QEMU driver - an nactive is added to the network driver object, along with an inhibitCallback; the latter is passed into networkStateInitialize when the driver is loaded, and the former is incremented for each already-active network, then incremented/decremented each time a network is started or stopped. If nactive transitions from 0 to 1 or 1 to 0, inhibitCallback is called, and it "does the right stuff" to prevent/enable the idle timeout. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2024-10-10 14:07:12 -04:00
Jim Fehlig	d721b6840f	libxl: Reject VM config referencing nwfilters The Xen libxl driver does not support nwfilter. Introduce a deviceValidateCallback function with a check for nwfilters, returning VIR_ERR_CONFIG_UNSUPPORTED if any are found. Also fail to start any existing VMs referencing nwfilters. Drivers generally ignore unrecognized XML configuration, but ignoring a user's request to filter VM network traffic can be viewed as a security issue. Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-10-10 08:39:12 -06:00
Laine Stump	c0ba3ed69d	network: a different implementation of unsetting firewalld zone when network is destroyed (this is a remake of commit v10.7.0-78-g200f60b2e1, which was reverted due to a regression in another patch it was dependent on.) The new implementation just adds the call to virFirewallDInterfaceUnsetZone() into the existing networkRemoveFirewallRules() (but only if we had set a zone when the network was first started). Replaces: `200f60b2e1` Resolves: https://issues.redhat.com/browse/RHEL-61576 Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-09 15:54:08 -04:00
Laine Stump	cb4e38d4b1	network: a different way of supporting firewalld zone for mode='open' networks Now that networkAddFirewallRules() and networkRemoveFirewallRules() are being called for mode='open' networks, we just need to move the code that sets the zone outside of the if (mode != ...OPEN) clause, so that it's done for all forward modes, with the exception of setting the implied 'libvirt' zones, which are set when no zone is specified for all forward modes except 'open'. This was previously done in commit v10.7.0-76-g1a72b83d56, but in a manner that caused the zone to be unset whenever firewalld reloaded its rules. That patch was reverted, and this new better patch takes its place. Replaces: `1a72b83d56` Resolves: https://issues.redhat.com/browse/RHEL-61576 Re-Resolves: https://gitlab.com/libvirt/libvirt/-/issues/215 Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-09 15:54:08 -04:00
Laine Stump	d552d810b9	network: call network(Add|Remove)FirewallRules() for forward mode='open' Previously networkAddFirewallRules() and networkRemoveFirewallRules() were only called if the forward mode was none, 'route', or 'nat', so those functions didn't check the forward mode. Although their current contents shouldn't be executed for forward mode='open', soon they will have extra functionality that should be executed for all the current forward modes and also mode='open'. This patch modifies all places either of the functions are called to make sure they are called for mode='open' in addition to current modes (by either adding 'case ..._OPEN:' to the case of a switch statement, or just removing an 'if (mode != ...OPEN)' around the calls); to balance out for that, it puts the entirety of the contents of both functions inside if (mode != ...OPEN) to retain current behavior (an upcoming patch will add code outside that if clause). Debug log messages were also added to make it easier to test that the right thing is being done in all cases. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-09 15:54:08 -04:00
Laine Stump	ef760a4133	Revert "network: support setting firewalld zone for bridge device of open networks" This reverts commit `1a72b83d56`. That patch had made the incorrect assumption that the firewalld zone of a bridge would not be changed/removed when firewalld reloaded its rules (e.g. with "killall -HUP firewalld"). It turns out my memory was faulty, and this does remove the bridge interface's zone, which results in guest networking failure after a firewalld reload, until the virtual network is restarted. The functionality reverted as a result of this patch reversion will be added back in an upcoming patch that keeps the zone setting in networkAddFirewallRules() (rather than moving it into a separate function) so that it is called every time the network's firewall rules are reloaded (including the reload that happens in response to a reload notification from firewalld). Signed-off-by: Laine Stump Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-09 15:54:08 -04:00
Laine Stump	816876f517	Revert "network: unset the firewalld zone while shutting down a network" This reverts commit `200f60b2e1`. The same functionality will be re-added in a different way in an upcoming patch. Signed-off-by: Laine Stump Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-10-09 15:54:08 -04:00
Jim Fehlig	bd6d7ebf62	qemu: Use consistent naming for save image format The image format setting in qemu.conf is named 'save_image_format'. The enum of supported format types is declared with name 'virQEMUSaveFormat'. Let's be consistent and use 'format' instead of 'compressed' when referring to the save image format. Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-10-09 13:48:39 -06:00
Jim Fehlig	b0dc8a923d	qemu: conf: Improve the foo_image_format setting descriptions The current description of the various foo_image_format settings can be construed to imply the setting is only used to control compression of the image. Improve the documentation to clarify that format describes the representation of guest memory blocks on disk, which includes compression among other possible layouts. Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-10-09 13:48:38 -06:00
Peter Krempa	aa08a30048	qemu: snapshot: Allow internal snapshots with PFLASH nvram With the new snapshot QMP command we can select which block device backend receives the VM state and thus the main issue with internal snapshots with pflash was addressed. Thus we can relax the check and allow snapshots if the pflash nvram is on qcow2. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-09 16:00:43 +02:00
Peter Krempa	8be8b7de78	qemuSnapshotActiveInternalDeleteGetDevices: Add warning when deleting inconsistent snapshot As explained in the commit which added the new internal snapshot deletion code we don't want to do any form of strict checking whether the libvirt metadata is consistent with the on-disk state as we didn't historically do that. In order to be able to spot the cases add a warning into the logs if such state is encountered. While warnings are easy to miss it's the only reasonable way to do that. Users will be encouraged to file an issue with the information, without requiring them to enable debug logs as the reproduction of that issue may include very old historical state. The checker is deliberately added separately so that it can be easily reverted once it's no longer needed. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-09 16:00:43 +02:00
Peter Krempa	eac1a86f72	qemu snapshot: use QMP snapshot-delete for internal snapshots deletion Switch to using the modern QMP command. As the user visible logic when deleting internal snapshots using the old 'delvm' command was very lax in terms of catching inconsistencies between the snapshot metadata and on-disk state we re-implement this behaviour even using the new command. We could improve the validation but that'd go at the cost of possible failures which users might not expect. As 'delvm' was simply ignoring any kind of failure the selection of devices to delete the snapshot from is based on querying qemu first which top level images do have the internal snapshot and then continuing only on those. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-09 16:00:43 +02:00
Nikolai Barybin via Devel	b93af62c40	qemu snapshot: use QMP snapshot-save for internal snapshots creation The usage of HMP commands is highly discouraged by qemu. Moreover, the current snapshot creation routine does not provide flexibility in choosing the target device for the VM state snapshot. This patch makes use of the QMP command snapshot-save and by default chooses the first writable non-shared qcow2 disk (if present) as the target for VM state. Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com> Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-09 16:00:43 +02:00
Peter Krempa	6d8ae98fa0	qemu: monitor: Store internal snapshot names from 'query-named-block-nodes' Store the names of internal snapshots present in supported images in the data we dump from 'query-named-block-nodes' so that the upcoming changes to the internal snapshot code can access it. To test this we use the bitmap detection test cases which can be easily extended to dump this data. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-09 16:00:43 +02:00
Nikolai Barybin via Devel	9df1453db8	qemu: capabilities: Introduce QEMU_CAPS_SNAPSHOT_INTERNAL_QMP capability The 'snapshot-save/delete' QMP commands were introduced in QEMU 6.0.0, so we add a compatible capability to check if target QEMU binary supports it. Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com> Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-09 15:22:00 +02:00
Nikolai Barybin via Devel	ce4ed8deef	qemu: blockjob: Add job types for 'snapshot-save/delete' The snapshot creation/deletion QMP commands use the qemu 'job' API to signal completion thus we need to add corresponding job types. As the job handles everything internally we don't store anything about the job. Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com> Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-09 15:22:00 +02:00
Nikolai Barybin via Devel	5d0773633a	qemu: monitor: Add plumbing for 'snapshot-save'/'snapshot-delete' QMP commands Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com> Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-09 15:22:00 +02:00
Peter Krempa	2e325804cc	qemuDomainObjWait: Annotate with G_GNUC_WARN_UNUSED_RESULT Callers must handle the return value of this function as the VM might have died. Add compiler annotation to force it. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-09 15:22:00 +02:00
Jiri Denemark	93d97d8fa2	cpu_map: Drop vmx-invvpid-single-context from CPU models QEMU calls the same feature differently, but translating the names in libvirt does not make sense because the name in QEMU conflicts with another feature. QEMU will not change the name for compatibility reasons so we can just drop our invented name as it is not supported by QEMU. Apart from this slightly different reason behind the feature being unsupported by QEMU, the situation is similar to vmx-ept-{uc,wb} dropped in the previous patch and so are the implications. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-09 14:46:51 +02:00
Jiri Denemark	b1d4196580	cpu_map: Drop vmx-ept-{uc,wb} features from CPU models Although QEMU knows and enables the corresponding MSR bits, it does not allow users to configure them (there are no names attached to them). They should have never been added to the CPU map and definitely not to CPU models as the features will always be considered disabled regardless of their actual state as QEMU will not report them. While we cannot drop them completely for backward compatibility, we can at least remove them from all CPU models. This is effectively no change for CPU models where the features were marked with added='yes' because migration source would always remove the features from domain XML so not adding them to the live XML does not hurt. On the other side the destination could not ever be surprised by the features being suddenly enabled as QEMU never reports them, which means libvirt considers them disabled all the time. GraniteRapids CPU model is the only one which contains the feature ever since it was introduced in libvirt, but it was never possible to migrate a domain with such CPU. The source would always mark vmx-ept-wb as disabled and the destination without the fixes in this series would drop the feature from the XML completely as it is unsupported by QEMU and disabled, but when probing for the actual CPU created by QEMU libvirt would expect the feature to be enabled (as it is included in the CPU model and not explicitly mentioned in the domain definition) and fail the migration. There's nothing the source could do to work around the behavior on the destination and migration to older libvirt will still be broken. But it's possible to migrate a domain with GraniteRapids to a destination with this series applied from both old and new source. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-09 14:46:51 +02:00
Jiri Denemark	29aa9b02aa	qemu: Replace big condition in virQEMUCapsCPUFilterFeatures with array Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-09 14:46:51 +02:00
Jiri Denemark	98700d354b	qemu: Translate vmx-invvpid-single-context-noglobals CPU feature This feature is called "vmx-invept-single-context-noglobals" in QEMU and our CPU map even contains the appropriate alias. But we failed to actually translate the name when talking to QEMU. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-09 14:46:51 +02:00
Jiri Denemark	00e55059e6	qemu: Do not drop unknown CPU features from domain XML CPU features with policy='disable' which are unknown to QEMU may be safely skipped when generating the -cpu command line, but we should still keep them in the domain definition so that we can properly check they are disabled after migrating the domain to a newer QEMU. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-09 14:46:51 +02:00
Jiri Denemark	aae8a5774b	qemu: Drop vmx-* from migratable CPU model only when origCPU is set When qemuDomainMakeCPUMigratable is called with origCPU == NULL the code just removed all vmx-* features marked as added in the specified CPU model just like when origCPU is not NULL, but does not list any of the vmx-* features. But this is wrong, we should not touch these features at all when no origCPU is supplied, which happens when parsing XML passed by a user (e.g., migration XML). Such XML is supposed to be generated by libvirt as migration XML and contains only vmx-* features explicitly requested by a user. https://issues.redhat.com/browse/RHEL-52314 Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-10-09 14:46:50 +02:00
Martin Kletzander	215cada343	util: Look for newer name of cpu wait time statistic It looks like linux changed the key for wait time in /proc/<pid>/sched and /proc/<pid>/task/<tid>/sched files in commit ceeadb83aea2 (or around that time) from se.statistics.wait_sum to just wait_sum. Similarly to the previous change (from se.wait_sum) just look for the new name first. Resolves: https://issues.redhat.com/browse/RHEL-60030 Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-10-03 20:43:07 +02:00
Andrea Bolognani	7d6759135e	qemu: Handle locking of TPM state directory for incoming migration By not attempting to lock the lock file, which would fail. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-10-03 14:50:06 +02:00
Andrea Bolognani	454219ad6c	security: Allow skipping locking when labeling lock files This is needed when migrating a guest that has persistent TPM state: relabeling (which implies locking) needs to happen before the swtpm process is started on the destination host, but the lock file won't be released by the swtpm process running on the source host before a handshake with the target process has happened, creating a catch-22 scenario. In order to make migration possible, make it so that locking for lock files can be explicitly skipped. All other state files are handled as usual. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-10-03 13:29:59 +02:00
Andrea Bolognani	8fe803247e	security: Always forget labels for TPM state directory In the case of outgoing migration, we avoid restoring the remembered labels for the TPM state directory because doing so would risk cutting off storage access for the target node. Even in that case though, we should still forget (unref) the remembered labels: if we don't, the source node will keep thinking that the state directory is in use. Note that this change only affects the SELinux driver because the DAC driver doesn't currently implement label remembering for TPM state at all. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-10-03 13:29:56 +02:00
Peter Krempa	3bfcb35dd5	qemu: migration: Don't remember seclabel for images shared from current host In case when the user exports images from current host and there is an incoming migration from a remote host, security label remembering would be possible but would attempt to remember the label allowing access to the image as the image is already used by a VM on remote host. To prevent remembering the wrong label, we'll skip the remembering of the label for any shared resource, so that the code behaves identically regardless of how the image is accessed. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>	2024-10-03 13:29:26 +02:00
Peter Krempa	b581045520	storage_source: Add field for skipping seclabel remembering In case of incoming migration where a local directory is shared to other hosts we'll need to avoid seclabel remembering as the code would remember the seclabel already allowing access to the image. As the decision requires a lot of information not available in the security driver it would either require plumbing in unpleasant callbacks able to pass in the data or alternatively we can mark this in the 'virStorageSource' struct. This patch chose to do the latter approach by adding a field called 'seclabelSkipRemember' which will be filled before starting the process in cases when it will be required. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>	2024-10-03 13:29:26 +02:00
Peter Krempa	eabeae605f	security_(dac|selinux): Unref remembered security labels on outgoing migration When 'qemuSecurityRestoreAllLabel' is called on outgoing migration it skips the actual relabeling part of the images in dac/selinux drivers in order to avoid cutting off access to the image. As shared filesystems don't really support the trusted XATTR groups, remembering of security labels never worked on those paths so we never actually had remembered seclabels for images that could be migrated. With recent changes we now support migration from local storage to remote in case the admin declares it as shared. This means that in case when the VM is started on local storage we'd actually store seclabels, but when migrating out the XATTRs remembering the seclabels would not actually be unref'd and thus the seclabels would leak. As we can't know whether a remote host will be able to use the XATTRs or not (but really it won't) and at the same time the destination side of migration will actually call 'qemuSecuritySetAllLabel' setting/refing its own seclabels we really need to unref them on our side. This patch adds the appropriate *RecallLabel() calls on the code paths in which relabelling is skipped due to migration. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>	2024-10-03 13:29:26 +02:00
Peter Krempa	2983dd44c5	virSecuritySELinuxRestoreImageLabelInt: Move FD image relabeling after 'migrated' check Reorganize the code so that the 'migrated' flag isn't checked multiple times and thus that it's more obvious what is happening when the 'migrated' flag is asserted. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>	2024-10-03 13:29:26 +02:00
Peter Krempa	568b3c6abe	virParseOwnershipIds: Refactor Use automatic clearing for temporary variable, remove 'cleanup' label and declare parameters according to new coding style rules. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>	2024-10-03 13:29:26 +02:00
Peter Krempa	7af0b6ea75	virFileIsSharedFSOverride: Export Document the function and export it for use outside of the 'virfile' utils module. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>	2024-10-03 13:29:26 +02:00
Andrea Bolognani	da0c363835	qemu: Always set labels for TPM state Up until this point, we have avoided setting labels for incoming migration when the TPM state is stored on a shared filesystem. This seems to make sense, because since the underlying storage is shared surely the labels will be as well. There's one problem, though: when a guest is migrated, the SELinux context for the destination process is different from the one of the source process. We haven't hit any issues with the current approach so far because NFS doesn't support SELinux, so effectively it doesn't matter whether relabeling happens or not: even if the SELinux contexts of the source and target processes are different, both will be able to access the storage. Now that it's possible for the local admin to manually mark exported directories as shared filesystems, however, things can get problematic. Consider the case in which one host (mig-one) exports its local filesystem /srv/nfs/libvirt/swtpm via NFS, and at the same time bind-mounts it to /var/lib/libvirt/swtpm; another host (mig-two) mounts the same filesystem to the same location, this time via NFS. Additionally, in order to allow migration in both directions, on mig-one the /var/lib/libvirt/swtpm directory is listed in the shared_filesystems qemu.conf option. When migrating from mig-one to mig-two, things work just fine; going in the opposite direction, however, results in an error: # virsh migrate cirros qemu+ssh://mig-one/system error: internal error: QEMU unexpectedly closed the monitor (vm='cirros'): qemu-system-x86_64: tpm-emulator: Setting the stateblob (type 1) failed with a TPM error 0x1f qemu-system-x86_64: error while loading state for instance 0x0 of device 'tpm-emulator' qemu-system-x86_64: load of migration failed: Input/output error This is because the directory on mig-one is considered a shared filesystem and thus labeling is skipped, resulting in a SELinux denial. The solution is quite simple: remove the check and always relabel. We know that it's okay to do so not just because it makes the error seen above go away, but also because no such check currently exists for disks and other types of persistent storage such as NVRAM files, which always get relabeled. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-10-03 13:29:26 +02:00
Andrea Bolognani	f7b9313ec7	utils: Use overrides in virFileIsSharedFS() If the local admin has explicitly declared that a certain filesystem is to be considered shared, we should treat it as such. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-10-03 13:29:26 +02:00
Andrea Bolognani	6952af8b43	qemu: Propagate shared_filesystems virFileIsSharedFS() is the function that ultimately decides whether a filesystem should be considered shared, but the list of manually configured shared filesystems is part of the QEMU driver's configuration, so we need to pass the information through several layers in order to make use of it. Note that with this change the list is propagated all the way through, but its contents are still ignored, so the behavior remains the same for now. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-10-03 13:29:26 +02:00
Andrea Bolognani	df3597ee70	qemu: Introduce shared_filesystems configuration option As explained in the comment, this can help in scenarios where a shared filesystem can't be detected as such by libvirt, by giving the admin the opportunity to provide this information manually. https://issues.redhat.com/browse/RHEL-35752 Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-10-03 13:29:25 +02:00
Andrea Bolognani	5ea466648c	security: Fix alignment Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-10-03 13:29:25 +02:00
John Levon	c6ba83b3e4	test_driver: provide basic NIC hotunplug support Provide minimal support for hotunplugging ETHERNET or BRIDGE type NICs in the test driver. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-10-03 09:10:23 +02:00
John Levon	cda4ee02a6	test_driver: provide basic NIC hotplug support Provide minimal support for hotplugging ETHERNET or BRIDGE type NICs in the test driver. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-10-03 09:10:22 +02:00
Han Han	3b296a98aa	domain_validate: Validate dma_translation for iommu models The attribute dma_translation is only supported by intel-iommu device. Report an error when it is used for the other iommu devices. Fixes: `6866f958c1` Signed-off-by: Han Han <hhan@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-10-02 16:06:51 +02:00
Rayhan Faizel	8105426d8f	libxl_conf: Add check for unsupported graphics type libxlMakeVfb always succeeds regardless of if the graphics type is actually supported or not. libxl_defbool_val is called in libxlMakeBuildInfoVfb which besides returning the boolean value of the defbool also has an assertion that the defbool value is not set to default. It is possible to fail this assertion if an unsupported graphics type is used. In libxlMakeVfb, the VNC and SDL enable defbools are still left in their default state if the graphics type falls outside the two, which leads to this issue. This patch adds a check to reject graphics types outside of SDL, VNC, and SPICE very early on in libxlMakeVfb. As a safeguard, we also initialize both vnc enable and sdl enable defbools as false early. Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-10-02 16:06:51 +02:00
Rayhan Faizel	cb2a6ef8b5	libxl_conf: Fix config generation for multiple serial devices Currently, an array of libxl_string_list (char **) or in other words, a triple char pointer is initialized. This is dereferenced to a char ** type and stored in serial_list, which is NULL at this point. There is an attempt to reference an element of this serial_list when making a call to libxlMakeChrdevStr which causes a segmentation fault. To fix this, we simply allocate an array of char * instead of libxl_string_list. This patch also adds testcases to extend coverage over both single serial and multiple serial cases. Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-10-02 16:06:50 +02:00
Peter Krempa	621f879adf	qemu: Introduce and wire in 'VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES' The new 'VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES' migration parameter allows users of migration to pass in a list of disks where zero-detection (which avoids transferring the zeroed-blocks) should be enabled for the migration connection. This comes at the cost of extra CPU cycles needed to check each block if it's all-zero. This is useful for storage backends where information about the allocation state of a block is not available and thus without this the image would become fully allocated on the destination. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:57:02 +02:00
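A minimal Python sketch of the trade-off described in 621f879adf: every block is compared against zeroes before transfer, which costs CPU time but keeps all-zero blocks off the migration connection. The block size, the function name, and the in-memory image are assumptions made for illustration; this is not how QEMU or libvirt implement zero detection.

```python
BLOCK_SIZE = 4096  # hypothetical block size, chosen only for this example


def blocks_to_send(image: bytes, block_size: int = BLOCK_SIZE):
    """Yield (offset, data) for every block containing a non-zero byte."""
    zero_block = bytes(block_size)
    for offset in range(0, len(image), block_size):
        block = image[offset:offset + block_size]
        # The comparison below is the extra per-block CPU work that
        # zero detection pays for.
        if block != zero_block[:len(block)]:
            yield offset, block


# Three blocks, only the middle one holds data.
image = bytes(BLOCK_SIZE) + b"data".ljust(BLOCK_SIZE, b"\0") + bytes(BLOCK_SIZE)
sent = list(blocks_to_send(image))
```

Only the middle block is yielded, so a destination that cannot report allocation state still receives a sparse stream instead of a fully allocated image.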
Peter Krempa	448b14f74d	qemu: migration: Remove 'nmigration_disks' variable from all places Now that none of the functions need it we can drop it. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:57:02 +02:00
Peter Krempa	aaefaabf5a	qemu: migration: Extract validation of disk target list The migration code is checking the disk list provided via VIR_MIGRATE_PARAM_MIGRATE_DISKS against existing disks. Extract it to a helper function as we'll be passing another list of disk targets soon. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:57:02 +02:00
Peter Krempa	4ebf1acb83	qemu: migration: Avoid use of 'nmigration_disks' 'migration_disks' is a NULL-terminated string list, so the code can be converted to either iterate the string-list, use existing accessors or check the presence of the pointers instead of checking the count. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:57:02 +02:00
Peter Krempa	d98beef107	qemu: migration: Don't log 'nmigrate_disks' The actual number of disks to migrate is not important. The presence of disks to migrate can be inferred from presence of the 'migrate_disks' pointer which is logged. Since 'nmigrate_disks' will eventually be removed remove the logging right now. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:57:02 +02:00
Peter Krempa	ab52a069ee	qemuMigrationSrcBeginPhaseBlockDirtyBitmaps: Use qemuMigrationAnyCopyDisk() The function open-coded the checking whether a disk is being migrated with non-shared storage and did so badly (not taking into account if user doesn't explicitly provide list of disks to migrate). Use the existing helper instead. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:57:02 +02:00
Peter Krempa	9bf319147c	virTypedParamsGetStringList: Ensure that returned string list is NULL-terminated This can simplify callers who don't really need to know the number of elements to check that a particular element is present. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:57:02 +02:00
Peter Krempa	7933310ce9	virTypedParamsGetStringList: Ensure that returned array is NULL if there are no matching fields 'virTypedParamsGetStringList' fills the returned array only with string parameters with matching name. The filtering code though leaves the possibility that all items are filtered out but the return array is still (over)allocated. Since 'virTypedParamsFilter()' now also allows filtering by type we can move the filtering there ensuring that we always allocate the right number of elements and more importantly the returned array will be NULL if no elements are present. Rework the code and adjust docs. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:57:02 +02:00
Peter Krempa	b74fed0173	virTypedParamsFilter: Introduce option to filter also by type The only caller of this function is doing some additional filtering so it's useful if the filtering function was able to do so internally. Introduce a 'type' parameter which will optionally filter the results by type and extend the testsuite to cover this scenario. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:57:02 +02:00
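The optional type filter from b74fed0173 can be pictured with a short Python sketch. The `TypedParam` tuple, the string type tags, and `filter_params` are hypothetical stand-ins for the C structures and are not libvirt's API; the sketch only shows filtering by name with an optional second criterion.

```python
from typing import List, NamedTuple, Optional


class TypedParam(NamedTuple):
    field: str
    type: str      # hypothetical tag such as "string" or "int"
    value: object


def filter_params(params: List[TypedParam], name: str,
                  wanted_type: Optional[str] = None) -> List[TypedParam]:
    """Return the params matching name and, if given, wanted_type.

    The result holds references to the original entries, not copies.
    """
    return [p for p in params
            if p.field == name
            and (wanted_type is None or p.type == wanted_type)]


params = [
    TypedParam("migrate_disks", "string", "vda"),
    TypedParam("migrate_disks", "int", 1),
    TypedParam("bandwidth", "int", 100),
    TypedParam("migrate_disks", "string", "vdb"),
]
by_name = filter_params(params, "migrate_disks")
strings_only = filter_params(params, "migrate_disks", "string")
```

Doing the type check inside the filter means a caller that only wants strings never sees, or allocates room for, entries it would discard anyway.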
Peter Krempa	e5fae984b1	virTypedParamsGetStringList: Refactor and adjust docs Use automatic freeing, declare one variable per line and return early when possible. As this is an internal helper there's no need to check that the caller passed non-NULL @values. Modify the documentation to be accurate and warn callers to not free the strings just the array. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:57:02 +02:00
Peter Krempa	933ab93e8f	virTypedParamsFilter: Adjust return type and docs The 'virTypedParamsFilter' function can't fail and thus it never returns negative values. Change the return type to 'size_t' and adjust callers to not check the return value for being negative. Adjust the docs to highlight this and also the fact that the filtered typed param list returned via @ret is not a deep copy and thus callers must not use the common function to free it. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:57:02 +02:00
Peter Krempa	165b30e06a	qemu: migration: Pre-create QCOW2 images for non-shared storage with 0 allocation Specify that the <allocation> parameter for the newly-created qcow2 image is 0 so that only metadata gets preallocated. Otherwise the storage driver code instructs qemu to use 'fallocate' preallocation mode and considers the image fully allocated. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:57:02 +02:00
Peter Krempa	54109db826	qemu: blockjob: Clean out disk mirror data after concluding the job The 'disk->mirrorJob' and 'disk->mirrorState' fields need to be cleared after a blockjob, but should be kept around while 'disk->mirror' is still in place. As 'disk->mirror' is cleared only after conclusion of the job in 'qemuBlockJobEventProcessConcluded()' we should be resetting them only afterwards. Move the code later, but since the job is unregistered from the disk we need to store the pointer to the disk before concluding the job. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:54:40 +02:00
Peter Krempa	b85b60d140	qemu: blockjob: Update 'mirror' of a copy job before removing images When concluding a job with a 'mirror' we first unplugged the appropriate no-longer used images from qemu and then updated the definition. Normally this wouldn't be a problem because for any other thread this is done under the VM lock thus atomic. Unfortunately though, the AppArmor security backend is using a VM XML to pass data to the helper process and the state of the definition at that point was unsuitable to format a valid XML thus making 'virt-aa-helper' report parsing failure. Since we're removing the images the proper state of the VM definition indeed should not include the mirror element any more at the point when the images are removed. Closes: https://gitlab.com/libvirt/libvirt/-/issues/601 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-10-01 12:54:40 +02:00
Laine Stump	bcd5ae4e73	qemu: fix regression in update-device for interfaces Commit `a37bd2a15b` eliminated a failure to update any change in an interface that was connected via a network that consisted of a pool of VFs using macvtap passthrough mode. Unfortunately it caused a regression that results in failure to update changes to bandwidth/vlan/trustGuestRxFilters in any interface connected via a network that uses a bridge to connect tap devices. This fixes that problem by narrowing the usage of the fix in the earlier patch to only be done in the case that the interface is connected via a macvtap+passthrough network. Signed-off-by: Laine Stump <laine@redhat.com> Fixes: `a37bd2a15b` Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-10-01 10:25:12 +02:00
Andrea Bolognani	55c3c09197	qemu: Look for qemu-bridge-helper in more directories Commit `0caacf47d7` recently made it so the new path used for qemu-bridge-helper in Debian would be allowed, but the logic used to actually figure out the complete path for the helper was not updated accordingly. https://bugs.debian.org/1082530 Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-30 10:58:15 +02:00
Jiri Denemark	f527da37be	cpu_map: Fix SierraForest CPU model The model was defined with two CPU features that cannot be explicitly configured in QEMU (it knows the MSR bits, but there's no name associated with them). The features should have never existed in the CPU map. While removing them from the list of features and existing CPU models is not trivial (to avoid compatibility issues), we can at least fix the SierraForest CPU model added in this release cycle. The rest will be handled later in a separate series. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-25 09:57:48 +02:00
Cole Robinson	785dfad13c	rpc: ssh: Allow SSH_ASKPASS_REQUIRE openssh 8.4p1 released in Sep 2020 added a feature to force use of SSH_ASKPASS https://man.openbsd.org/ssh.1#SSH_ASKPASS_REQUIRE Don't strip it from the environment Signed-off-by: Cole Robinson <crobinso@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2024-09-25 08:46:52 +02:00
Michal Privoznik	6126f743b1	qemu: Provide sane default for dump_guest_core QEMU uses Linux extensions to madvise() to include/exclude guest memory from core dump. These are obviously not available everywhere. Currently, users have two options: 1) configure <memory dumpCore=''/> in domain XML, or 2) configure dump_guest_core in qemu.conf While these work, they may harm user experience as "things just don't work" out of the box. Provide sane default in virQEMUDriverConfigNew() so neither of two options is required. To have predictable results in tests, explicitly set cfg->dumpGuestCore to false in qemuTestDriverInit() (which creates cfg object for tests). Resolves: https://gitlab.com/libvirt/libvirt/-/issues/679 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-25 08:38:09 +02:00
Michal Privoznik	18b61cb4f9	qemu.conf.in: Fix dumpCore capitalization In qemu.conf.in we give examples of enabling/disabling core dumps in domain XML. But the attribute is spelled wrong. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-25 08:38:09 +02:00
Martin Kletzander	6f0974ca32	qemu: Generate domain memory backing path directly This makes qemuDomainGenerateMemoryBackingPath() nicer to call. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-24 10:12:08 +02:00
Martin Kletzander	f035f24777	qemu: Rename memory path functions This way they make sense not only based on where they are located but the name also relates to what they are actually doing. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-24 10:12:08 +02:00
Martin Kletzander	d599fc3d57	qemu: Make qemuGetMemoryBackingDomainPath static After previous patches it is not used (and should not be used) outside of qemu_domain.c. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-24 10:12:08 +02:00
Martin Kletzander	ff49d2a8c2	qemu: Use per-domain private memoryBackingDir for new memory backends The function qemuGetMemoryBackingPath() does not need the @def any more and priv->memoryBackingDir can be used instead of constructing the path by calling qemuGetMemoryBackingDomainPath(). Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-24 10:12:08 +02:00
Martin Kletzander	f58a4dc9d5	qemu: Set memoryBackingDir in private data upon start This way we keep the path for each running VM. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-24 10:12:08 +02:00
Martin Kletzander	da8a1d7943	qemu: Add memoryBackingDir to qemuDomainObjPrivate This way we _can_ (but do not, yet) remember the memory backing path for running domains even after configuration change and daemon restart. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-24 10:12:08 +02:00
Martin Kletzander	c9a35eb255	qemu: Change parameters of qemuGetMemoryBackingDomainPath() This way it does not use driver, since it will be later reworked and the following patches cleaner, hopefully. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-24 10:12:08 +02:00
Martin Kletzander	edcf14be9c	qemu: Move domain-related functions to qemu_domain Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-24 10:12:08 +02:00
Ján Tomko	81e532c701	util: json: remove yajl implementation Since the previous commit removed YAJL detection completely, WITH_YAJL cannot possibly be set. Drop the code. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-09-24 08:24:00 +02:00
Ján Tomko	d96e753d84	meson: options: drop yajl Drop the yajl option and all references to it. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-09-24 08:24:00 +02:00
Ján Tomko	9e6555fd90	util: json: write a json-c implementation Write an alternative implementation of our virJSON functions, using json-c instead of yajl. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-09-24 08:24:00 +02:00
Ján Tomko	28c9872639	meson: switch checks to depend on json-c as well as yajl Ensure both are required during this series to make bisecting smooth. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-09-24 08:24:00 +02:00
Ján Tomko	330cf7f492	util: json: introduce virJSONStringPrettifyBlanks A horribly named function for unifying formatting when pretty-printing empty JSON arrays and objects. Useful for having stable test output even if different JSON libraries format these differently. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-09-24 08:24:00 +02:00
Laine Stump	e14a5fcac4	util: use uint32 instead of char[4] for several virSocketAddrIPv4 operations These 3 functions are easier to understand, and more efficient, when the IPv4 address is viewed as a uint32 rather than an array of bytes. virSocketAddrGetIPv4Addr() has bothered me for a long time - it was doing ntohl of the address into a temporary uint32, and then a loop one-by-one swapping the order of all the bytes back to network order. Of course this only works as described on little-endian architectures - on big-endian architectures the first assignment won't swap the bytes' ordering, but the loop assumes the bytes are now in little-endian order and "swaps them back", so the result will be incorrect. (Do we not support any big-endian targets that would have exposed this bug long before now??) virSocketAddrCheckNetmask() was checking each byte of the two addresses individually, when it could instead just do the operation once on the full 32 bit values. virSocketAddrGetRange() was checking for "range > 65535" by seeing if the first 2 bytes of the start and end were different, and then doing arithmetic combining the lower two bytes (along with necessary bit shifting to account for network byte order) to determine the exact size of the range. Instead we can just get the ntohl of start & end, and do the math directly. Signed-off-by: Laine Stump <laine@redhat.com>	2024-09-21 15:06:09 -04:00
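The idea in e14a5fcac4, doing the arithmetic once on a host-order 32-bit value instead of byte by byte, can be sketched in Python. The helper names below are hypothetical and this is not libvirt's C code; `struct.unpack("!I", ...)` plays the role that ntohl() plays in C.

```python
import socket
import struct


def ipv4_to_host_int(addr: str) -> int:
    """Read the 4 network-order bytes as one 32-bit integer (like ntohl)."""
    return struct.unpack("!I", socket.inet_aton(addr))[0]


def same_network(addr1: str, addr2: str, netmask: str) -> bool:
    """One masked comparison on full 32-bit values, no per-byte loop."""
    mask = ipv4_to_host_int(netmask)
    return (ipv4_to_host_int(addr1) & mask) == (ipv4_to_host_int(addr2) & mask)


def range_size(start: str, end: str) -> int:
    """Number of addresses in [start, end], computed directly."""
    first = ipv4_to_host_int(start)
    last = ipv4_to_host_int(end)
    if last < first:
        raise ValueError("end address is below start address")
    return last - first + 1
```

With the addresses in host order, a limit such as "range > 65535" becomes a plain integer comparison on the result of `range_size()` rather than a test on the upper two bytes.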
Laine Stump	009464902a	util: make virSocketAddrIPv4 a union virSocketAddrIPv4 is a type used only internally by virsocketaddr.c. It is defined to be a character array, which leads to multiple occurrences of extra bit fiddling and byte swapping for no good reason (except to confuse). An IPv4 address is really just a uint32_t with the bytes in network order, which is exactly the type of the s_addr member of the sockaddr_in that is a part of the publicly consumed struct virSocketAddr, and that we are copying in and out of a virSocketAddrIPv4. Sometimes it's simpler to just treat it as a network-order uint32_t, so let's make our virSocketAddrIPv4 a union that has both an unsigned char bytes[4] (for the times when we need to look one byte at a time) and a uint32_t val (for the times when it's simpler to treat it as a single value). For now we just change all the uses from, e.g. x[i] to x.bytes[i]; an upcoming patch will simplify some of the code to remove loops by using x.val instead of x.bytes when appropriate. Signed-off-by: Laine Stump <laine@redhat.com>	2024-09-21 14:39:05 -04:00
Laine Stump	14623a3424	util: fix virSocketAddrMask() when source and result are the same object Many years ago (2011), virSocketAddrMask() had caused a bug by failing to initialize an IPv6-specific field in the result virSocketAddr. This was fixed by memset(0)ing the entire result (network) at the beginning of the function (thus making sure anything and everything was initialized). The problem is that virSocketAddrMask() has a comment above it that says that the source (addr) and destination (network) arguments can point to the same virSocketAddr. But in that case, the memset(network, 0) at the top of the function is actually doing a memset(addr, 0), and so there is nothing left for all the assignments to copy except a giant field of 0's. Fortunately in the 13 years since the memset was added, nobody has ever called virSocketAddrMask() with addr and network being the same. This patch makes the code agree with the comment by copying/masking into a local virSocketAddr (which is initialized to all 0) and then copying that to network after it's finished assigning things from addr. Fixes: `ba08c5932e` Signed-off-by: Laine Stump <laine@redhat.com>	2024-09-21 14:37:54 -04:00
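The aliasing problem fixed in 14623a3424 is easy to reproduce in miniature. The Python sketch below uses bytearrays in place of virSocketAddr and hypothetical function names; it shows why clearing the result first is wrong when source and result are the same object, and how masking into a local temporary avoids it.

```python
def mask_buggy(addr: bytearray, netmask: bytes, network: bytearray) -> None:
    # Clearing the result up front also clears the input when both
    # arguments refer to the same object.
    network[:] = bytes(len(network))
    for i in range(4):
        network[i] = addr[i] & netmask[i]


def mask_fixed(addr: bytearray, netmask: bytes, network: bytearray) -> None:
    # Build the result in a zero-initialized local, then copy it out.
    tmp = bytearray(4)
    for i in range(4):
        tmp[i] = addr[i] & netmask[i]
    network[:] = tmp


netmask = bytes([255, 255, 255, 0])

aliased_buggy = bytearray([192, 168, 122, 57])
mask_buggy(aliased_buggy, netmask, aliased_buggy)

aliased_fixed = bytearray([192, 168, 122, 57])
mask_fixed(aliased_fixed, netmask, aliased_fixed)
```

After these calls `aliased_buggy` holds only zeroes, while `aliased_fixed` holds the expected network address; with distinct source and result objects both versions agree, which is why the bug stayed hidden.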
Laine Stump	f7a2d158f7	network: fix argument order/log level in message about firewall_backend Oops. Fixes: `64b966558c` Signed-off-by: Laine Stump <laine@redhat.com>	2024-09-19 16:14:21 -04:00
Laine Stump	c7ea694f7d	qemu: rework needBridgeChange/needReconnect decisions in qemuDomainChangeNet() This patch simplifies (?) the qemuDomainChangeNet() code while fixing some incorrect decisions about exactly when it's necessary to re-attach an interface's bridge device, or to fail the device update (needReconnect[*]) because the type of connection has changed (or within bridge and direct (macvtap) type because some attribute of the connection has changed that can't actually be modified after the tap/macvtap device of the interface is created). Example 1: it's pointless to require the bridge device to be reattached just because the interface has been switched to a different network (i.e. the name of the network is different), since the new network could be using the same bridge as the old network (very uncommon, but technically possible). Instead we should only care if the name of the bridge device changes (or if something in <virtualport> changes - see Example 3). Example 2: wrt changing the "type" of the interface, a change should be allowed if old and new type both used a bridge device (whether or not the name of the bridge changes), or if old and new type are both "direct" and the device being linked and macvtap mode remain the same. Any other change in interface type cannot be accommodated and should be a failure (i.e. needReconnect). Example 3: there is no valid reason to fail just because the interface has a <virtualport> element - the <virtualport> could just say "type='openvswitch'" in both the before and after cases (in which case it isn't a change by itself, and so is completely acceptable), and even if the interfaceid changes, or the <virtualport> disappears completely, that can still be reconciled by simply re-attaching the bridge device. (If, on the other hand, the modified <virtualport> is for a type='direct' interface, we can't modify that, and so must fail (needReconnect).) 
(I tried splitting this into multiple patches, but they were so intertwined that the intermediate patches made no sense.) [*] "needReconnect" was a flag added to this function way back in 2012, when I still believed that QEMU might someday support connecting a new & different device backend (the way the virtual device connects to the host) to an already existing guest netdev (the virtual device as it appears to the guest). Sadly that has never happened, so for the purposes of qemuDomainChangeNet() "needReconnect" is equivalent to "fail". Resolves: https://issues.redhat.com/browse/RHEL-7036 Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-19 13:56:39 -04:00
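One way to read the three examples in c7ea694f7d is as a small decision table. The Python sketch below is an interpretation of the commit message only, not the logic of qemuDomainChangeNet() itself; the `Iface` fields and the `decide()` helper are hypothetical, and `kind` means "how the interface ends up connected" (via a bridge device, or directly via macvtap).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Iface:
    kind: str                          # "bridge", "direct", or anything else
    bridge: Optional[str] = None       # bridge device name, for kind="bridge"
    linkdev: Optional[str] = None      # linked host device, for kind="direct"
    mode: Optional[str] = None         # macvtap mode, for kind="direct"
    virtualport: Optional[str] = None  # simplified <virtualport> contents


def decide(old: Iface, new: Iface) -> Tuple[bool, bool]:
    """Return (need_bridge_change, need_reconnect); need_reconnect means fail."""
    if old.kind == "bridge" and new.kind == "bridge":
        # Examples 1 and 3: only the bridge name or <virtualport> matters,
        # and either change is handled by re-attaching the bridge device.
        changed = (old.bridge != new.bridge
                   or old.virtualport != new.virtualport)
        return changed, False
    if old.kind == "direct" and new.kind == "direct":
        # Example 2 and the end of Example 3: nothing about a macvtap
        # connection can be changed live.
        changed = ((old.linkdev, old.mode, old.virtualport)
                   != (new.linkdev, new.mode, new.virtualport))
        return False, changed
    # Any other change of connection type cannot be accommodated.
    return False, True
```

Note that the network name does not appear in the comparison at all, which is the point of Example 1: two differently named networks may still resolve to the same bridge device.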
Laine Stump	601f4160b9	qemu: replace open-coded remove/attach bridge with virNetDevTapReattachBridge() The new function does what the old qemuDomainChangeNetBridge() did manually, except that: 1) the new function supports changing from a bridge of one type to another, e.g. from a Linux host bridge to an OVS bridge. (previously that wasn't handled) 2) the new function doesn't emit audit log messages. This is actually a good thing, because the old code would just log a "detach" followed immediately by "attach" for the same MAC address, so it's essentially a NOP. (the audit logs don't have any more detailed info about the connection - just the VM name and MAC address, so it makes no sense to log the detach/attach pair as it's not providing any information). Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-19 13:56:31 -04:00
Laine Stump	e3f8bccea6	util: don't return early from virNetDevTapReattachBridge() if "force" is true It can be useful to force an interface to be detached/reattached from its bridge even if it's the same bridge - possibly something like the virtualport profileID has changed, and a detach/attach cycle will get it connected with the new profileID. The one and only current use of virNetDevTapReattachBridge() sets force to false, to preserve current behavior. An upcoming patch will use it with force set to true. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-19 13:56:19 -04:00
Laine Stump	a37bd2a15b	qemu: prevent unnecessarily failing live interface update Attempts to use update-device to modify just the link state of a guest interface were failing due to a supposed attempt to modify something in the interface that can't be modified live (even though the only thing that was changing was the link state, which can be modified live). It turned out that this failure happened because the guest interface in question was type='network', and the network in question was a 'direct' network that provides each guest interface with one device from a pool of network devices. As a part of qemuDomainChangeNet() we would always allocate a new port from the network driver for the updated interface definition (by way of calling virDomainNetAllocateActualDevice(newdev)), and this new port (ie the ActualNetDef in newdev) would of course be allocated a new host device from the pool (which would of course be different from the one currently in use by the guest interface (in olddev)). Because direct interfaces don't support changing the host device in a live update, this would cause the update to fail. The solution to this is to realize that as long as the interface doesn't get switched to a different network as a part of the update, the network port information (ie the ActualNetDef) will not change as a part of updating the guest interface itself. So for sake of comparison we can just point the newdev at the ActualNetDef of olddev, and then clear out one or the other when we're done (to avoid a double free or, more likely, attempt to reference freed memory). 
(If, on the other hand, the name of the network has changed, or if the interface type has changed to type='network' from something else, then we do need to allocate a new port (actual device) from the network driver (as we used to do in all cases when the new type was 'network'), and also indicate that we'll need to replace olddev in the domain with newdev (because either of these changes is major enough that we shouldn't just try to fix up olddev).) Partially-Resolves: https://issues.redhat.com/browse/RHEL-7036 Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-19 13:56:06 -04:00
Peter Krempa	852380cef5	qemuBuildChardevCommand: Remove unused variable 'charstr' is unused since `36d06a5637`, breaking the build on some platforms. Remove it. Fixes: `36d06a5637` Signed-off-by: Peter Krempa <pkrempa@redhat.com>	2024-09-19 13:12:02 +02:00
Peter Krempa	24d468993c	qemu: Reject unsupported chardev backend protocols QEMU supports only 'raw' and 'telnet' in the <protocol type='telnets'/> element. Reject 'telnets' and 'tls'. TLS transport for qemu chardevs is configured via "tls='yes'" attribute added to the "<source>" element instead, so this prevents potential misconfig as the value would be silently accepted. Closes: https://gitlab.com/libvirt/libvirt/-/issues/412 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-09-19 10:30:15 +02:00
Peter Krempa	3778964207	conf: Convert 'protocol' field of TCP char device backend to proper type Use virDomainChrTcpProtocol as type, convert the parser to use virXMLPropEnum and fix one switch statement in the VMX driver. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-09-19 10:30:15 +02:00
Peter Krempa	2256466f70	qemu: monitor: Remove the old chardev backend generator Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-09-19 10:30:15 +02:00
Peter Krempa	e352a692a7	qemu: Use the new chardev backend JSON props generator also in the monitor Now that we have a unified generator of chardev backend which is also validated against the QMP schema we can replace the old generator with it. This patch modifies the monitor code to take virJSONValue 'props' instead of the chardev definition and adds the conversion from the chardev definition to JSON on higher levels. The monitor code now also attempts to extract the returned 'pty' if returned from qemu, so higher level code needs to report the error if the path is needed and missing. The current monitor generator is for now abandoned in place and will be removed later. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-09-19 10:30:15 +02:00
Peter Krempa	d897ad2b89	qemu: Move check for chardev backends which can't be hotplugged out of the monitor The upcoming refactor of the monitor code will make the hotplug code paths use the same generator we have for commandline -chardev backends which doesn't refuse to format certain backends which can't be hotplugged. To prepare for this we add a check to qemuHotplugChardevAttach() refusing such hotplug and remove 'qemumonitorjsontest' test cases which will not make sense any more. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-09-19 10:30:14 +02:00
Peter Krempa	36d06a5637	qemu: Introduce unified chardev backend config generator Similarly to how we approach the generators for -device/-object/-blockdev/-netdev rewrite the generator of -chardev to be unified with the generator for the monitor. Unfortunately with -chardev it will be a bit more quirky when compared to the others as the generator itself will need to know whether it generates command line output or not as a few field names change and data is nested differently. This first step adds the generator and uses it only for command line generation. This was possible to achieve without changing any of the output in tests. In further patches the same generator will then be used also in the monitor code replacing both. As basis for the generator I took the monitor code but modified it to have the same field order as the commandline code and extended it further to support all backend types, even those which are not hotpluggable. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-09-19 10:30:14 +02:00
Peter Krempa	9c88a566d8	qemu: capabilities: Explain that QEMU_CAPS_CHARDEV_JSON will be used in tests only I've added that capability a long time ago when I was converting various stuff to use JSON but the support in '-chardev' didn't yet materialize. Fix the comment to make that clear and also that it'll be used in tests for the upcoming refactor of the chardev code (so that we can validate generator against the schema even if that doesn't yet work). Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-09-19 10:30:14 +02:00
Laine Stump	200f60b2e1	network: unset the firewalld zone while shutting down a network When a bridge device for a virtual network had been placed in a firewalld zone while starting the network, then even after the network is shut down and the bridge device is deleted, its name will still show up in the list of interfaces for whichever zone it had been in, and this setting will persist through the next time a device with the same name is created (until a zone is once again explicitly set, or the device is removed via a firewalld API call). Usually this isn't a problem, but in the case of forward mode='open', someone might start the network once with a zone specified, then shut down the network, remove the zone from its config, and start it again; in this case the bridge device would come up using the zone from the previous time it was started. The solution to this is to remove the interface from whatever zone it is in as the network is being shut down. There is no downside to doing this, since the device is going to be deleted anyway. Note that forward mode='bridge' uses a bridge device that was created outside of libvirt, and libvirt won't be deleting that bridge, so we take care to not unset the zone in that case. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-17 11:22:56 -04:00
Laine Stump	1a3778fe0a	network: remove firewalld version check from networkSetBridgeZone() At the time the version check in this function was written, there were still several supported versions of some distros that were using a version of firewalld too old to support the "rich rule priorities" used by the 'libvirt' zone that we installed for firewalld. Today the newest distro that has a version of firewalld < 0.7.0 is RHEL7/CentOS7, so we can remove the complexity and if the libvirt zone is missing simply say "the libvirt zone is missing". Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-17 10:55:14 -04:00
Laine Stump	1a72b83d56	network: support setting firewalld zone for bridge device of open networks The bit of code that sets the firewalld zone was previously a part of the function networkAddFirewallRules(), which is not called for networks with <forward mode='open'/>. Setting the 'libvirt' zone for the bridge device of virtual networks that also add firewall rules is usually necessary in order to get the expected traffic through without modifying firewalld's default zone (which would be a bad idea, because that would affect all the other host interfaces set to the default zone), but in general we would not want the bridge device for a mode='open' virtual network to be automatically placed in the "libvirt" zone. However, a user might want to explicitly set some other firewalld zone for mode='open' networks, and libvirt's network config is a convenient place to do that. We enable this by moving the code that sets the firewalld zone into a separate function that is called for all forward modes that use a bridge device created/managed by libvirt (nat, route, isolated, open). If no zone is specified, then the bridge device will be in whatever zone interfaces are put in by default, but if the <bridge> element has a "zone" attribute, then the new bridge device will be placed in the specified zone. NB: This function is only called when the network is started, and not when the firewall rules of an active network are reloaded at virtnetworkd restart time, because the firewalld zone of an interface isn't something that gets inadvertently changed as a part of some other unrelated action. For example all iptables rules are cleared by a firewalld restart, including those rules added by libvirt, but there is no blanket action that changes the zone of all interfaces, so it's useful for libvirt to reload its rules when restarting virtnetworkd, but pointless to re-add the interface to its preferred zone. 
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/215 Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-17 10:55:14 -04:00
Laine Stump	eeebbc1eec	network: belatedly update an error message The 'open' forward type probably hadn't yet been added when this message was written. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-17 10:55:14 -04:00
Laine Stump	114c0ec656	network: permit <forward mode='open'/> when a network has no IP address The whole point of <forward mode='open'/> is to suppress libvirt from adding any firewall rules for a network, and someone might want to create a network with no IP address (i.e. they don't want the guests to have connectivity to the host via this interface) and no firewall rules (they don't want any, or they want to add their own). So there's no reason to fail when a network has <forward mode='open'/> and also has no IP address. Kind-of-Resolves: https://gitlab.com/libvirt/libvirt/-/issues/588 Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-17 10:55:14 -04:00
Martin Kletzander	d0a48eeb72	network: Remove unused variable in networkDestroy Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-17 10:43:18 +02:00
Martin Kletzander	8a2717e803	network: Clean up after disappeared transient inactive networks If a network disappeared the daemon should not only remove it from the list of networks, but also do a proper cleanup. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2024-09-17 09:40:38 +02:00
Martin Kletzander	2bea2782d5	network: Separate cleanup from networkRemoveInactive The new function (networkCleanupInactive) can be called from an iterator over the list of networks without the risk of deadlock. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2024-09-17 09:40:37 +02:00
Martin Kletzander	74a22c09be	network: Try to read dnsmasq PIDs for inactive networks too Just in case one needs a clean up. Resolves: https://issues.redhat.com/browse/RHEL-50968 Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2024-09-17 09:40:37 +02:00
Martin Kletzander	447fda8981	network: Clean up after inactive objects during start Once networkUpdateState() identifies a dead network it should clean up after it as well. Resolves: https://issues.redhat.com/browse/RHEL-50968 Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2024-09-17 09:40:36 +02:00
Martin Kletzander	0e43cb09ee	network: Don't check if network is active in networkShutdownNetwork It skips the cleanup from networkStartNetwork and the only other path already checks if the network is active or not. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2024-09-17 09:40:35 +02:00
Martin Kletzander	3e43670f01	network: Move port deletion into the shutdown function It will be more useful in there when calling from new places. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2024-09-17 09:40:35 +02:00
Martin Kletzander	5988fdec91	network: Do not call virNetworkObjUnsetDefTransient on start cleanup The function networkShutdownNetwork already does that. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2024-09-17 09:40:34 +02:00
Martin Kletzander	97ed0574ea	network: Do not update network ports for inactive networks The semantics do not change, since networkUpdatePort() (well, networkNotifyPort(), for which the former is a wrapper) already exits for inactive networks, but it does so with an error that we can easily avoid with this patch. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2024-09-17 09:40:34 +02:00
Andrea Bolognani	d622ca04f6	apparmor: Don't check for existence of templates upfront Currently, if either template is missing AppArmor support is completely disabled. This means that uninstalling the LXC driver from a system results in QEMU domains being started without AppArmor confinement, which obviously doesn't make any sense. The problematic scenario was impossible to hit in Debian until very recently, because all AppArmor files were shipped as part of the same package; now that the Debian package is much closer to the Fedora one, and specifically ships the AppArmor files together with the corresponding driver, it becomes trivial to trigger it. Drop the checks entirely. virt-aa-helper, which is responsible for creating the per-domain profiles starting from the driver-specific template, already fails if the latter is not present, so they were always redundant. https://bugs.debian.org/1081396 Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2024-09-16 17:57:54 +02:00
Martin Kletzander	4b68c7e55b	resctrl: Do not rewrite default MB values for new allocations The code did it "just in case" the allocation was not reset for new subdirectories. That might've happened in the past with CAT settings, but checking it now, it is properly reset to its maximum values for each new CLOSID (Class of Service ID). The advantage of this is that we do not rewrite the value with itself, which causes an issue with the current Linux kernel and the mba_MBps option, where the default is UINT_MAX (or (uint32_t) -1), but gets rounded up to bandwidth granularity (10), overflows, and a small number (4) is set instead. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-16 12:07:15 +02:00
Michal Privoznik	ebc4580a5f	Revert "vircommand: Parse /dev/fd on *BSD-like systems when looking for opened FDs" Unfortunately, devfs on FreeBSD (accessible via /dev/fd) exposes only those FDs which can be represented as a file. To cite manpage [1]: The files /dev/fd/0 through /dev/fd/# refer to file descriptors which can be accessed through the file system. This means FDs representing pipes and/or unnamed sockets are not visible by default. To expose all FDs a slightly different filesystem must be mounted [2]: mount -t fdescfs none /dev/fd Apparently, on my test machine fdescfs is mounted by default and thus I haven't seen any problem. Only after aforementioned patch was merged our CI started reporting problems. While we could try to figure out whether correct FS is mounted, it's a needless micro optimization. Just revert the code to the state it was before I touched it. 1: https://man.freebsd.org/cgi/man.cgi?query=fd&sektion=4&manpath=freebsd-release-ports 2: https://man.freebsd.org/cgi/man.cgi?query=fdescfs&sektion=5&n=1 This reverts commit `308ec0fb2c`. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-16 12:05:19 +02:00
Michal Privoznik	308ec0fb2c	vircommand: Parse /dev/fd on *BSD-like systems when looking for opened FDs On BSD-like systems "/dev/fd" serves the same purpose as "/proc/self/fd". And since procfs is usually not mounted, on such systems we can use "/dev/fd" instead. Resolves: https://gitlab.com/libvirt/libvirt/-/issues/518 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-13 14:50:43 +02:00
Michal Privoznik	4df8dc576f	vircommand: Make sysconf(_SC_OPEN_MAX) failure non-fatal The point of calling sysconf(_SC_OPEN_MAX) is to allocate big enough bitmap so that subsequent call to virCommandMassCloseGetFDsDir() can just set the bit instead of expanding memory (this code runs in a forked off child and thus using async-signal-unsafe functions like malloc() is a bit tricky). But on some systems the limit for opened FDs is virtually non-existent (typically macOS Ventura started reporting EINVAL). But with both glibc and musl using malloc() after fork() is safe. And with sufficiently new glib too, as it's using malloc() with newer releases instead of their own allocator. Therefore, pick a sufficiently large value (glibc falls back to 256, [1], Darwin to 10240 [2] so 10240 should be good enough) to fall back to and make the error non-fatal. 1: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/getdtsz.c;h=4c5a6208067d2f9eaaac6dba652702fb4af9b7e3;hb=HEAD 2: https://github.com/apple/darwin-xnu/blob/main/bsd/sys/syslimits.h#L104 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-13 14:50:43 +02:00
Michal Privoznik	6ded014ba3	vircommand: Isolate FD dir parsing into a separate function So far, virCommandMassCloseGetFDsLinux() opens "/proc/self/fd", iterates over it marking opened FDs in @fds bitmap. Well, we can do the same on other systems (with altered path), like MacOS or FreeBSD. Therefore, isolate dir iteration into a separate function that accepts dir path as an argument. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-13 14:50:43 +02:00
Michal Privoznik	dfe496ae33	vircommand: Drop unused arguments from virCommandMassCloseGetFDs*() Both virCommandMassCloseGetFDsLinux() and virCommandMassCloseGetFDsGeneric() take @cmd argument only to mark it as unused. Drop it from both. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-13 14:50:43 +02:00
Martin Kletzander	bfad111c43	resctrl: Use cache IDs instead of max_id/max_cache_id It is not guaranteed for the cache IDs to be continuous, especially for L3 caches. Hence do not assume so and instead record the individual IDs in a virBitmap. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-13 12:57:41 +02:00
Martin Kletzander	f3fd0664cf	resctrl: Don't assume MBA availability in virResctrlAllocNewFromInfo Weirdly, the existence of /sys/fs/resctrl/info/MB does not always mean that MBA is available and used on the system. Instead of assuming that copy the values from the default (root) allocation. This also makes it nicer to use the proper values in case the system does not use percentages or when the root allocation already limits the bandwidth. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-13 12:55:39 +02:00
Martin Kletzander	bc97a2c043	capabilities: Also report L2 caches Since some systems support control for L2 caches as well as L3 caches it would be useful to report their configuration in capabilities. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-13 12:55:39 +02:00
Martin Kletzander	4437a775dc	resctrl: Add virResctrlInfoPerTypeFree It will be easier to add more dynamic data later on. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-13 12:55:39 +02:00
Martin Kletzander	7c40f1ead9	resctrl: Add virResctrlInfoMemBWFree It will be easier to add more dynamic data later on. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-13 12:55:39 +02:00
Martin Kletzander	03b6383f33	resctrl: Move virResctrlAllocCopyMemBW up in the file This way it can be used later in virResctrlAllocGetUnused(). Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-13 12:55:39 +02:00
Martin Kletzander	d7e3a15a98	resctrl: Relax the limit of maximum memory bandwidth allocation The value 100 represented the percentage as it was originally done from Intel in the Linux kernel and on their CPUs. Since then the situation changed and there is no reliable way of figuring out the meaning of the value in the current configuration, let alone its possible maximum. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-13 12:55:39 +02:00
Martin Kletzander	eae19bb505	resctrl: Account for memory bandwidth of 0 being valid In some scenarios the memory bandwidth in the schemata file might be 0 and so can the minimum allocation in other ones. Remove checks which were added for extra cautiousness. Resolves: https://issues.redhat.com/browse/RHEL-54235 Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-13 12:55:39 +02:00
Stepan Zobal	f60e5f87d4	documentation: Remove untrue statement in GetVersion() description The description of virConnectGetVersion() says the function might only work with a privileged access to the hypervisor, not with a read-only connection. However that is not true since commit `a2e2e4652f` and can be safely removed. Signed-off-by: Stepan Zobal <szobal@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-12 15:53:10 +02:00
Jakub Palacky	cc05007a43	vmx: use xmlBufferDetach() when applicable xmlBuffer->content was deprecated in libxml2 v2.13.0-33-gb34dc1e4; xmlBufferDetach(xmlBuffer) should be used instead. Signed-off-by: Jakub Palacky <jpalacky@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-09-12 14:52:55 +02:00
Jakub Palacky	09ebe53349	util/virxml: use xmlCtxtGetLastError when applicable xmlParserCtxt->lastError was deprecated in libxml2 v2.13.0-103-g1228b4e0; xmlCtxtGetLastError(xmlParserCtxt) should be used instead. Signed-off-by: Jakub Palacky <jpalacky@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-09-12 14:52:55 +02:00
Jakub Palacky	317139a316	util/virutil: Use readpassphrase when libbsd is available When libbsd is available, use the preferred readpassphrase() function instead of getpass(), as the getpass() function has been marked as obsolete and shouldn't be used. Signed-off-by: Jakub Palacky <jpalacky@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-12 13:12:47 +02:00
Richard W.M. Jones	94e8a5b650	vmx: Allow '' to appear in VMX file keys When connecting to a VMware server (eg using vpx://) we download and try to parse the VMware metadata '.vmx' file of a guest. In this case a VMX file was found which contained this key: pciPassthru.present = "False" The '' character was not previously allowed in keys so this failed to parse with the error: VIR_ERR_CONF_SYNTAX: VIR_FROM_CONF: configuration file syntax error: memory conf:74: expecting an assignment Resolves: https://issues.redhat.com/browse/RHEL-58446 Thanks: Daniel Berrange Signed-off-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-09-12 09:51:58 +02:00
Tom	5f6ccb0875	Allow apparmor parser to be executed in /usr/bin This commit modifies the AppArmor profile for virt-aa-helper to accommodate an observed behavior in certain Linux distributions, such as ArchLinux. In these distributions, /usr/sbin symlinks to /usr/bin. To ensure that virt-aa-helper can execute apparmor_parser when it resides in /usr/bin, the profile has been updated accordingly. Signed-off-by: Tom <libvirt-patch@douile.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>	2024-09-12 09:44:20 +02:00
Peter Krempa	e562b16ede	virDiskNameParse: Fix integer overflow in disk name parsing The conversion to index entails multiplication and accumulation by user provided data which can easily overflow, use VIR_MULTIPLY_ADD_IS_OVERFLOW to check if the string is valid. Closes: https://gitlab.com/libvirt/libvirt/-/issues/674 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-09-10 14:26:39 +02:00
Peter Krempa	a9ede822da	virconf: Properly fix numeric overflow when parsing numbers in conf files The previous fix didn't check the overflow in addition. Use the new macro to check both multiplication and addition overflows. Fixes: `8666523b7d` Closes: https://gitlab.com/libvirt/libvirt/-/issues/671 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-09-10 14:26:31 +02:00
Peter Krempa	23cb613606	internal: Add helper macro for checking multiply and add overflows The macro does the two checks together so that it's obvious what we're checking as doing it in place is really unpleasant. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-09-10 14:26:28 +02:00
Peter Krempa	3c5839973f	virDomainFeaturesDefParse: Add comment warning about features being specified repeatedly Few of the handlers didn't take that possibility into account. Warn others. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-09-10 14:26:24 +02:00
Peter Krempa	ead2419df3	virDomainFeaturesTCGDefParse: Don't leak 'tcg_features' when '<tcg>' feature is repeated Similarly to other cases users may specify the feature flag multiple times. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-09-10 14:26:20 +02:00
Peter Krempa	574769ceb6	virDomainFeaturesHyperVDefParse: Don't overwrite hypervisor vendor_id In case when the user specifies the '<hyperv/>' feature multiple times we could overwrite already parsed data. Clear it beforehand. As before this isn't trying to address the case of features being specified multiple times not making much sense. Closes: https://gitlab.com/libvirt/libvirt/-/issues/675 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-09-10 14:26:09 +02:00
Peter Krempa	8e28f2c5c2	virBitmapShrink: Do not attempt to clear bits beyond end of buffer 'virBitmapShrink' clears the bits beyond the end of the bitmap when shrinking and then reallocates to match the new size. As it uses the address of the first bit beyond the bitmap to do the clearing it can overrun the allocated buffer if we're not actually going to shrink it and the last bit's address is on the chunk boundary. Fix it by returning in that corner case and add few more tests to be sure. Closes: https://gitlab.com/libvirt/libvirt/-/issues/673 Fixes: `d6e582da80` Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-09-10 14:25:37 +02:00
Peter Krempa	bc02cb9506	virDomainDefParseBootInitOptions: Don't leak 'name' on failure One of the failure paths skips code which would assign the string from the temporary variable to the parsed struct, thus leaking it on failure. Closes: https://gitlab.com/libvirt/libvirt/-/issues/672 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-09-10 14:24:48 +02:00
Michal Privoznik	2feeefc0b4	cpu_map: Install SierraForest description file In one of recent commits new CPU model was introduced. But corresponding change in meson.build is missing which results in the XML file not being installed. Fixes: `3afbb1644c` Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-09 09:30:04 +02:00
Peter Krempa	ecffc91d02	qemuBackupDiskDataCleanupOne: Don't skip rest of cleanup if we can't enter monitor Recent fix to use the proper 'async' monitor function would cause libvirt to leak some of the objects it's supposed to clean up in other places besides qemu. Don't skip the whole function on failure to enter the job but just the monitor section. Fixes: `9b22c25548` Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-06 18:14:34 +02:00
Peter Krempa	8666523b7d	virconf: Fix numeric overflow when parsing numbers in conf files The number is parsed manually without making sure it'll fit. Fixes: `3bbac7cdb6` Closes: https://gitlab.com/libvirt/libvirt/-/issues/671 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-06 18:14:34 +02:00
Peter Krempa	5d77061d7e	conf: Don't overwrite KVM feature config struct if the feature is present twice Don't allocate the struct if it exists already. This sidesteps the discussion about whether forbidding multiple feature definitions makes sense. Fixes: `a8e0f9c682` Closes: https://gitlab.com/libvirt/libvirt/-/issues/670 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-06 18:14:34 +02:00
Tim Wiederhake	3afbb1644c	cpu_map: Add SierraForest CPU model This was added in qemu commit 6e82d3b6220777667968a04c87e1667f164ebe88. Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-06 18:10:15 +02:00
Tim Wiederhake	6ac72ea6dd	cpu_map: Add missing feature "avx-vnni-int16" Introduced in qemu commit 138c3377a9b27accec516b2c0da90dedef98a780. Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-06 18:10:06 +02:00
Peter Krempa	9b22c25548	qemu: backup: Use 'async' monitor in 'qemuBackupDiskDataCleanupOne' 'qemuBackupDiskDataCleanupOne()' is entering the monitor while we're in the async backup job inside 'qemuBackupBegin()' which is semantically wrong and per upstream report causes crashes if some monitoring commands are run in parallel. Use qemuDomainObjEnterMonitorAsync() instead. Closes: https://gitlab.com/libvirt/libvirt/-/issues/668 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-09-05 15:52:26 +02:00
Peter Krempa	61c8a7180e	qemuProcessSetupRawIO: Refactor return value and remove useless #ifdef The function can return directly rather than setting 'ret' as there's no cleanup. It also doesn't make sense to conditionally compile out the 'break' statement when checking whether a disk has rawio enabled if 'CAP_SYS_RAWIO' is _not_ defined as the function will still behave the same. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-09-05 15:24:55 +02:00
Peter Krempa	ce1c9bb8ea	storage: fs: Remove build-time detection of 'showmount' program With the new virCommand infrastructure which can find the program in path automatically we no longer need the build-time detection. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-09-05 15:24:55 +02:00
Martin Kletzander	f6fb097e11	virnetdevtap: Add better error message for a possible common user error When users pre-create a tap device to use with multiqueue interface that has `managed="no"`, change the error so that it does not indicate we are trying to create the device, and on top of that hint at the most probable error cause. Resolves: https://issues.redhat.com/browse/RHEL-55749 Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-04 12:21:20 +02:00
Martin Kletzander	4ce9196dc4	virnetdevtap: Do (not) use NULLSTR consistently The function generates *ifname from the get go and most functions do not wrap the string in a NULLSTR as it is not necessary. The few leftovers are outliers that are changed to fit the theme better. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-04 12:21:19 +02:00
Andrea Bolognani	ad92468924	qemu: Use pvpanic by default on aarch64 pvpanic-pci is the only reasonable implementation of a panic device for aarch64/virt guests. Right now we're asking users to provide the model name manually, but we can be more helpful and fill it in automatically instead. With this change, the aarch64-panic-no-model test no longer fails and so it's no longer useful to us. Instead, we can amend the aarch64-virt-default-models test case to include panic coverage, something that until now wasn't possible. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-03 14:08:34 +02:00
Andrea Bolognani	6d92185a49	qemu: Sometimes the default panic model doesn't exist Right now the fallback behavior is to use MODEL_ISA if we haven't been able to find a better match, but that's not very useful as we're still going to hit an error later, when QEMU_CAPS_DEVICE_PANIC is not found at Validate time. Instead of doing that, allow MODEL_DEFAULT to get all the way to Validate and report an error upon encountering it. The reported error changes slightly, but other than that the set of configurations that are allowed and blocked remains the same. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-03 14:07:47 +02:00
Andrea Bolognani	9e1970efa5	qemu: Refactor default panic model Perform decisions based on the architecture and machine type in a single place instead of duplicating them. This technically adds new behavior for MODEL_ISA in qemuDomainDefAddDefaultDevices(), but it doesn't make any difference functionally since we don't set addPanicDevice outside of ppc64(le) and s390(x). If we did, the lack of handling for that value would be a latent bug. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-09-03 14:06:11 +02:00
Martin Kletzander	ac05dc8d4f	qemu_driver: Fix indentation Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2024-09-03 13:13:58 +02:00
Kamil Szczęk	76f6caee3c	qemu: Fix a few comments Fixes: `d292c5ba17` Signed-off-by: Kamil Szczęk <kamil@szczek.dev> Reviewed-by: Andrea Bolognani <abologna@redhat.com>	2024-08-29 13:52:12 +02:00
Peter Krempa	8dfb12cb77	udevListInterfaces: Honour array length for zero-length NULL arrays (CVE-2024-8235) The refactor of 'udevListInterfacesByStatus()' which attempted to make it usable as backend for 'udevNumOfInterfacesByStatus()' neglected to consider the corner case of 'g_new0(..., 0)' returning NULL if the user actually requests 0 elements. As the code was modified to report the full number of interfaces in the system when the list of names is NULL, the RPC code would be asked to serialize a NULL-list of interface names with declared length of 1+ causing a crash. To fix this corner case we make callers pass '-1' as @names_len (it's conveniently an 'int' due to RPC type usage) if they don't wish to fetch the actual list and convert all decisions to be done on @names_len being non-negative instead of @names being non-NULL. CVE-2024-8235 Fixes: `bc596f2751` Resolves: https://issues.redhat.com/browse/RHEL-55373 Reported-by: Yanqiu Zhang <yanqzhan@redhat.com> Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-08-29 10:38:40 +02:00
Andrea Bolognani	725afb4e7b	qemu: Expose availability of PS/2 feature in domcaps This advertises the feature only for the architectures and machine types where it can actually be used. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-29 09:44:53 +02:00
Andrea Bolognani	e0e496d90c	qemu: Change signature for virQEMUCapsSupportsI8042Toggle() We will soon need to use it in a context where we don't have a virDomainDef handy. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-29 09:44:51 +02:00
Andrea Bolognani	d292c5ba17	qemu: Export a few functions We're going to need them in a minute. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-29 09:44:46 +02:00
Praveen K Paladugu	d9be0beb4c	ch: Enable bridge network mode Tested with following interface config: <interface type='bridge'> <mac address='52:54:00:71:b9:b6'/> <source bridge='clhbr0'/> <model type='virtio'/> </interface> Signed-off-by: Praveen K Paladugu <praveenkpaladugu@gmail.com> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-28 13:36:14 +02:00
Pavel Hrdina	8a44f78625	ch: interface: correctly update nicindexes Originally nicindexes were updated only for VIR_DOMAIN_NET_TYPE_BRIDGE and VIR_DOMAIN_NET_TYPE_DIRECT. The mentioned commit adds support for NAT network mode and changes the code to update nicindexes for VIR_DOMAIN_NET_TYPE_ETHERNET and VIR_DOMAIN_NET_TYPE_NETWORK as well. It doesn't work as intended and after the change nicindexes are updated only for VIR_DOMAIN_NET_TYPE_ETHERNET and VIR_DOMAIN_NET_TYPE_NETWORK. Fixes: `aa64209073` Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-28 10:45:58 +02:00
Sergey Dyasli	87c3fa1cde	conf: check for migration job during domain start It's possible to hit the following situation during qemu p2p live migration: 1. qemu has live migrated and exited (making virDomainObjIsActive() return false) 2. the live migration job is still in progress, waiting for a confirmation from the remote libvirt daemon. This may last for a while in the presence of networking issues (up to the keepalive timeout). Any attempt to start the domain again would fail with a "domain is already being started" message, which is misleading in this situation as it doesn't reflect what's really happening. Add a check for the migration job and report a different error message if the migration job is still running. Signed-off-by: Sergey Dyasli <sergey.dyasli@nutanix.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-08-27 10:44:58 +02:00
Peter Krempa	805f66d7ca	hypervisor: interface: Stub out virDomainCreateInBridgePortWithHelper using 'socketpair' on win32 Mingw build failed after commit `af87ee7927` as 'socketpair()' is not available on that platform. Stub out the function to return failure. Fixes: `af87ee7927` Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2024-08-27 10:06:31 +02:00
aokblast	1b815465d9	remoteDispatchConnectOpen: Fix check for 'BHYVE' connection type 'bhyveConnectGetType' (which is called from 'virConnectGetType') returns 'BHYVE' as the type, but the code in 'remoteDispatchConnectOpen' responsible for selecting the sub-driver URIs in modular deployment checks for 'bhyve' and thus would not properly fill the URIs to the sub-daemons. Signed-off-by: aokblast <aokblast@FreeBSD.org> Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2024-08-26 18:23:10 +02:00
Praveen K Paladugu	aa64209073	ch: Enable NAT Network mode support From: Praveen K Paladugu <prapal@linux.microsoft.com> enable VIR_DOMAIN_NET_TYPE_NETWORK network support for ch guests. Tested with following config: <interface type='network'> <source network="default" bridge='virbr0'/> <model type='virtio'/> <driver queues="1"/> </interface> Signed-off-by: Praveen K Paladugu <praveenkpaladugu@gmail.com> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-26 16:13:23 +02:00
Praveen K Paladugu	af87ee7927	hypervisor: Move domain interface mgmt methods From: Praveen K Paladugu <prapal@linux.microsoft.com> Move methods to connect domain interfaces to host bridges to hypervisor. This is to allow reuse between qemu and ch drivers. Signed-off-by: Praveen K Paladugu <praveenkpaladugu@gmail.com> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-26 16:10:04 +02:00
Tim Wiederhake	7b6702d516	hyperv: Support hv-xmm-input enlightenment qemu supports this enlightenment since version 7.10. From the qemu commit: Hyper-V specification allows to pass parameters for certain hypercalls using XMM registers ("XMM Fast Hypercall Input"). When the feature is in use, it allows for faster hypercalls processing as KVM can avoid reading guest's memory. Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-08-26 11:48:15 +02:00
Tim Wiederhake	0313a500a9	hyperv: Support hv-emsr-bitmap enlightenment qemu supports this enlightenment since version 7.10. From the qemu commit: The newly introduced enlightenment allow L0 (KVM) and L1 (Hyper-V) hypervisors to collaborate to avoid unnecessary updates to L2 MSR-Bitmap upon vmexits. Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-08-26 11:48:15 +02:00
Michal Privoznik	0888784f38	qemu: Use virEventThreadStop() in qemuProcessStop() Currently, qemuProcessStop() unlocks given domain object right in the middle of cleanup process. This is dangerous because there might be another thread which is executing virDomainObjListAdd(). And since the domain object is on the list of domain objects AND by the time qemuProcessStop() unlocks it the object is also marked as inactive, the other thread acquires the lock and switches vm->def pointer. The unlocking of domain object is needed though, to allow the event processing thread to finish its queue. Well, the processing can be done before any cleanup is attempted. Therefore, use freshly introduced virEventThreadStop() to join the event thread and drop lock/unlock from the middle of qemuProcessStop(). Now, there's a comment being removed that mentions qemuDomainObjStopWorker() and why it has to be called only after the domain is marked as dead. This comment is no longer applicable because the call to qemuDomainObjStopWorker() is removed as well. Moreover, priv->beingDestroyed is set to true before unlocking the domain object, thus any event processing callback is going to see the domain being destroyed and can choose to either exit early or finish processing the event. Fixes: `3865410e7f` Resolves: https://issues.redhat.com/browse/RHEL-49607 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-08-22 13:33:09 +02:00
Michal Privoznik	7aca235d8d	vireventthread: Introduce virEventThreadStop The aim is to move parts of vir_event_thread_finalize() that MAY block into a separate function, so that unrefing a virEventThread no longer blocks (or requires releasing and subsequent re-acquiring of a mutex). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-08-22 13:33:06 +02:00
Kamil Szczęk	a9a5f8ef39	qemu: Introduce the 'ps2' feature This introduces a new 'ps2' feature which, when disabled, results in no implicit PS/2 bus input devices being automatically added to the domain and addition of the 'i8042=off' machine option to the QEMU command-line. A notable side effect of disabling the i8042 controller in QEMU is that the vmport device won't be created. For this reason we will not allow setting the vmport feature if the ps2 feature is explicitly disabled. Signed-off-by: Kamil Szczęk <kamil@szczek.dev> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-21 17:10:51 +02:00
Kamil Szczęk	9eb3c28323	qemu_capabilities: Introduce QEMU_CAPS_MACHINE_I8042_OPT This capability tells us whether given QEMU binary supports the '-machine xxx,i8042=on/off' toggle used to enable/disable PS/2 controller emulation. A few facts: - This option was introduced in QEMU 7.0 and defaults to 'on' - QEMU versions before 7.0 enabled i8042 controller emulation implicitly - This option (and i8042 controller emulation itself) is only supported by descendants of the generic PC machine type (e.g. i440fx, q35, etc.) Signed-off-by: Kamil Szczęk <kamil@szczek.dev> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-21 17:10:51 +02:00
Kamil Szczęk	51521d13a8	qemu: Improve PS/2 controller detection Up until now, we've assumed that all x86 machines have a PS/2 controller built-in. This assumption was correct until QEMU v4.2 introduced a new x86-based machine type - microvm. Due to this assumption, a pair of unnecessary PS/2 inputs are implicitly added to all microvm domains. This patch fixes that by whitelisting machine types which are known to include the i8042 PS/2 controller. Signed-off-by: Kamil Szczęk <kamil@szczek.dev> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-21 17:10:51 +02:00
Peter Krempa	62d6e8dcb2	qemu: validate: Reject empty USB disks Attempting to start qemu with or hotplug an empty 'usb-storage' based disk results in the following error: qemu-system-x86_64: -device {"driver":"usb-storage","bus":"usb.0","port":"2","id":"usb-disk1","removable":true}: drive property not set Reject such config at validation step and adjust tests. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-08-21 15:49:36 +02:00
Peter Krempa	204013d4aa	qemu: block: Allow NULL 'data' in 'qemuBlockStorageSourceChainDetach' Some code paths, such as a failed hotplug of an empty cdrom, can cause 'qemuBlockStorageSourceChainDetach' to be called with 'NULL' @data as there is no backend for the disk. The above case became possible once we allowed hotplug of cdroms and subsequently fixed the case when users would hotplug an empty cdrom, which ultimately made it possible to have no backend in the hotplug code path, something that was not possible before (see 'Fixes:' below and also the commit linked from there). Make 'qemuBlockStorageSourceChainDetach' tolerate NULL @data by simply returning early. Fixes: `894c6c5c16` Resolves: https://issues.redhat.com/browse/RHEL-54550 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2024-08-21 15:49:36 +02:00
Michal Privoznik	85e07fb1ce	security: apparmor: Allow QEMU read /proc/sys/vm/max_map_count In its commit v9.0.0-rc0~1^2 QEMU started to read /proc/sys/vm/max_map_count file to set up coroutine limits better (something about VMAs, mmap(), see the commit for more info). Allow the file in apparmor profile. Resolves: https://gitlab.com/libvirt/libvirt/-/issues/660 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-08-20 14:44:04 +02:00
Michal Privoznik	a70cdeeb2a	conf: Validate QoS values Since we use 'tc' to set QoS, or we instruct OVS which then uses 'tc', we have to make sure values are within range acceptable to 'tc'. Resolves: https://issues.redhat.com/browse/RHEL-45200 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-08-20 09:19:28 +02:00
Michal Privoznik	ab489ea318	conf: Introduce virNetDevBandwidthValidate() This function validates whether parsed limits are within range as defined by 'tc' sources (since we use tc to set QoS; or OVS which then uses tc too). The 'tc' program stores speeds in 64bit integers (unit is bytes per second) and sizes in uints (unit is bytes). We use different units: kilobytes per second and kibibytes and therefore we can parse values larger than 'tc' can handle and thus need a function to check if values still fit. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-08-20 09:19:25 +02:00
Michal Privoznik	ab7f877f27	lib: Use NULLSTR family of macros more There is a family of convenient macros: NULLSTR, NULLSTR_EMPTY, NULLSTR_STAR, NULLSTR_MINUS which hides ternary operator. Generated using the following spatch (and its obvious variants): @@ expression s; @@ <+... - s ? s : "<null>" + NULLSTR(s) ...+> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2024-08-19 13:44:12 +02:00
Martin Kletzander	4de8962a79	virarptable: End parsing earlier in case of NLMSG_DONE Check for the last multipart message right as the first thing. The presumption probably was that the last message might still contain a payload we want to parse. However that cannot be true since that would have to be a type RTM_NEWNEIGH. This was not caught because older kernels were not sending NLMSG_DONE and probably relied on the fact that the parsing just stops after all the messages are walked through, which the NLMSG_OK macro successfully did. Resolves: https://issues.redhat.com/browse/RHEL-52449 Resolves: https://bugzilla.redhat.com/2302245 Fixes: `a176d67cdf` Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2024-08-19 12:16:14 +02:00
Martin Kletzander	ef84581a69	virarptable: Fix check for message length The previous check was all wrong since it calculated how long the netlink message would be if the netlink header was the payload and then subtracted that from the whole message length, a variable that was not used later in the code. This check can fail if there are no additional payloads, struct rtattr in particular, which we are parsing later, however the RTA_OK macro would've caught that anyway. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2024-08-19 12:14:26 +02:00
Martin Kletzander	e7530769e8	virarptable: Properly calculate rtattr length Use convenience macro which does almost the same thing we were doing, but also pads out the payload length to a multiple of NLMSG_ALIGNTO (4) bytes. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2024-08-19 12:12:16 +02:00
Tim Wiederhake	03852c85af	cpu_map: Add GraniteRapids CPU model This was added in qemu commit 6d5e9694ef. Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-13 17:00:41 +02:00
Tim Wiederhake	19f30f68ce	sync_qemu_models_i386.py: Add missing features This brings the tool's list of features in sync with qemu commit 37fbfda8f4145ba1700f63f0cb7be4c108d545de. Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-13 17:00:41 +02:00
Tim Wiederhake	a18b232712	cpu_map: Add libcpuinfo as optional data source This adds an option to use libcpuinfo [1] as data source for libvirt's list of x86 cpu features. This is purely optional and does not change the script's behavior if libcpuinfo is not installed. libcpuinfo is a cross-vendor, cross-architecture source for CPU related information that has the capability to replace libvirt's dependence on qemu's cpu feature list. [1] https://gitlab.com/twiederh/libcpuinfo Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-13 17:00:41 +02:00
Peter Krempa	b3edf03c31	qemu: hotplug: Rollback FD passthrough for 'slirpfd' and 'vdpafd' on hotplug failure On failure to plug the device the cleanup path didn't roll back the FD passing to qemu thus qemu would hold the FDs indefinitely. Resolves: https://issues.redhat.com/browse/RHEL-53964 Fixes: `b79abf9c3c` (vdpafd) Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2024-08-13 16:34:47 +02:00
Michal Privoznik	1b797e6421	virnetlibsshsession: Reflect API change in libssh As of libssh commit libssh-0.11.0~70 [1] the ssh_channel_get_exit_status() function is deprecated and a new one is introduced instead: ssh_channel_get_exit_state(). It's not a drop-in replacement, but it's simple enough. Adapt our libssh handling code to this change. 1: https://git.libssh.org/projects/libssh.git/commit/?id=04d86aeeae73c78af8b3dcdabb2e588cd31a8923 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2024-08-12 15:47:48 +02:00
Daniel P. Berrangé	cd9709a6ab	glibcompat: remove obsolete clang workaround This mostly reverts commit `65491a2dfe`. There was a bug introduced in glib 2.67.0 which impacted libvirt with clang causing -Wincompatible-pointer-types-discards-qualifiers warnings. This was actually fixed quite quickly in 2.67.1 with https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1719 Our workaround was then broken with glib 2.81.1 due to commit 14b3d5da9019150d821f6178a075d85044b4c255 changing the signature of the (private) macro we were overriding. Since odd-number glib releases are development snapshots, and the original problem was only present in 2.67.0 and no other releases, just drop the workaround entirely. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2024-08-12 11:52:57 +01:00
Peter Krempa	63080f0582	glibcompat: "Backport" 'g_string_replace' Backport the implementation of 'g_string_replace' until we require at least glib-2.68 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-09 16:34:32 +02:00
Purna Pavan Chandra	c4be2cb2de	ch: kill CH process if restore fails Invoke virCHProcessStop to kill the CH process in case of any failures during the restore operation. Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-09 15:03:07 +02:00
Purna Pavan Chandra	0587ee2aab	ch: support restore with net devices Cloud-hypervisor now supports restoring with new net fds. Ref: https://github.com/cloud-hypervisor/cloud-hypervisor/pull/6402 So, pass new tap fds via SCM_RIGHTS to CH's restore api. Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-09 15:03:07 +02:00
Purna Pavan Chandra	4ae70b7c2d	ch: refactor virCHMonitorSaveVM Remove the unwanted utility function and make api calls directly from virCHMonitorSaveVM fn Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-09 15:03:07 +02:00
Purna Pavan Chandra	fd34fbed79	ch: use monitor socket fd to send restore request Instead of curl, use low-level socket connections to make restore api request to CH. This will enable passing new net FDs to CH while restoring domains with network configuration. Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-09 15:03:07 +02:00
Purna Pavan Chandra	4919f3a120	ch: support poll with -1 in chSocketRecv chSocketRecv fn can be used by operations such as restore, which cannot have a specific poll timeout. The runtime of these operations at server side (vmm) cannot be determined or capped as it depends on the guest configuration. Hence, add a new parameter 'use_timeout' which when set will pass -1 as timeout to poll, otherwise the default PKT_TIMEOUT_MS is used. Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-09 15:03:07 +02:00
Purna Pavan Chandra	ea271081dd	ch: refactor chProcessAddNetworkDevices Move monitor socket connection, response handling and closing FDs code into new functions in preparation for adding restore support for net devices. Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-09 15:03:07 +02:00
Purna Pavan Chandra	3e41cd5e82	ch: Pass net ids explicitly during vm creation Pass "net_<index>" as net id to CH. This is to have better control over the network configs. This id can be further used in performing operations like restore etc. Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2024-08-09 15:03:07 +02:00

36169 Commits