35846 Commits

Author SHA1 Message Date
Martin Kletzander
d599fc3d57 qemu: Make qemuGetMemoryBackingDomainPath static
After previous patches it is not used (and should not be used) outside
of qemu_domain.c.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Martin Kletzander
ff49d2a8c2 qemu: Use per-domain private memoryBackingDir for new memory backends
The function qemuGetMemoryBackingPath() does not need the @def any more
and priv->memoryBackingDir can be used instead of constructing the path
by calling qemuGetMemoryBackingDomainPath().

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Martin Kletzander
f58a4dc9d5 qemu: Set memoryBackingDir in private data upon start
This way we keep the path for each running VM.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Martin Kletzander
da8a1d7943 qemu: Add memoryBackingDir to qemuDomainObjPrivate
This way we _can_ (but do not, yet) remember the memory backing path for
running domains even after configuration change and daemon restart.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Martin Kletzander
c9a35eb255 qemu: Change parameters of qemuGetMemoryBackingDomainPath()
This way it does not use the driver, since it will be reworked later, and
the following patches will hopefully be cleaner.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Martin Kletzander
edcf14be9c qemu: Move domain-related functions to qemu_domain
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-24 10:12:08 +02:00
Ján Tomko
81e532c701 util: json: remove yajl implementation
Since the previous commit removed YAJL detection completely,
WITH_YAJL cannot possibly be set. Drop the code.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-09-24 08:24:00 +02:00
Ján Tomko
d96e753d84 meson: options: drop yajl
Drop the yajl option and all references to it.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-09-24 08:24:00 +02:00
Ján Tomko
9e6555fd90 util: json: write a json-c implementation
Write an alternative implementation of our virJSON functions,
using json-c instead of yajl.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-09-24 08:24:00 +02:00
Ján Tomko
28c9872639 meson: switch checks to depend on json-c as well as yajl
Ensure both are required during this series to make bisecting smooth.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-09-24 08:24:00 +02:00
Ján Tomko
330cf7f492 util: json: introduce virJSONStringPrettifyBlanks
A horribly named function for unifying formatting when pretty-printing
empty JSON arrays and objects. Useful for having stable test output
even if different JSON libraries format these differently.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-09-24 08:24:00 +02:00
Laine Stump
e14a5fcac4 util: use uint32 instead of char[4] for several virSocketAddrIPv4 operations
These 3 functions are easier to understand, and more efficient, when
the IPv4 address is viewed as a uint32 rather than an array of bytes.

virSocketAddrGetIPv4Addr() has bothered me for a long time - it was
doing ntohl of the address into a temporary uint32, and then a loop
one-by-one swapping the order of all the bytes back to network
order. Of course this only works as described on little-endian
architectures - on big-endian architectures the first assignment won't
swap the bytes' ordering, but the loop assumes the bytes are now in
little-endian order and "swaps them back", so the result will be
incorrect. (Do we not support any big-endian targets that would have
exposed this bug long before now??)

virSocketAddrCheckNetmask() was checking each byte of the two
addresses individually, when it could instead just do the operation
once on the full 32 bit values.

virSocketAddrGetRange() was checking for "range > 65535" by seeing if the
first 2 bytes of the start and end were different, and then doing
arithmetic combining the lower two bytes (along with necessary bit
shifting to account for network byte order) to determine the exact
size of the range. Instead we can just get the ntohl of start & end,
and do the math directly.
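
As a rough illustration of the idea (not the actual libvirt code), the netmask check and the range computation reduce to a few operations on 32-bit values. The helper names same_subnet_v4() and range_size_v4() and their return conventions are assumptions made for this sketch only.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical helper: are two addresses in the same subnet?
 * All arguments are IPv4 addresses in network byte order.
 * A bitwise AND is byte-order independent, so no ntohl() is needed. */
static int
same_subnet_v4(uint32_t addr1, uint32_t addr2, uint32_t netmask)
{
    return (addr1 & netmask) == (addr2 & netmask);
}

/* Hypothetical helper: number of addresses in [start, end], or -1 if
 * end precedes start or the two are more than 65535 apart.
 * Arguments are in network byte order; the arithmetic needs host
 * order, hence the ntohl() calls. */
static int
range_size_v4(uint32_t start, uint32_t end)
{
    uint32_t s = ntohl(start);
    uint32_t e = ntohl(end);

    if (e < s)
        return -1;
    if (e - s > 65535)
        return -1;
    return (int)(e - s + 1);
}
```

Because the only byte-order conversion is a single ntohl() per address, the same code behaves identically on little-endian and big-endian hosts.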

Signed-off-by: Laine Stump <laine@redhat.com>
2024-09-21 15:06:09 -04:00
Laine Stump
009464902a util: make virSocketAddrIPv4 a union
virSocketAddrIPv4 is a type used only internally by
virsocketaddr.c. It is defined to be a character array, which leads to
multiple occurrences of extra bit fiddling and byte swapping for no
good reason (except to confuse).

An IPv4 address is really just a uint32_t with the bytes in network
order, which is exactly the type of the s_addr member of the
sockaddr_in that is a part of the publicly consumed struct
virSocketAddr, and that we are copying in and out of a
virSocketAddrIPv4. Sometimes it's simpler to just treat it as a
network-order uint32_t, so let's make our virSocketAddrIPv4 a union
that has both an unsigned char bytes[4] (for the times when we need to
look one byte at a time) and a uint32_t val (for the times when it's
simpler to treat it as a single value).

For now we just change all the uses from, e.g. x[i] to x.bytes[i];
an upcoming patch will simplify some of the code to remove loops by
using x.val instead of x.bytes when appropriate.
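
A minimal sketch of such a union, assuming the member names bytes and val from the description above. The type name ipv4_addr and both helpers are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Two views of the same four bytes: one byte at a time, or as a
 * single network-order 32-bit value. */
typedef union {
    unsigned char bytes[4];
    uint32_t val;
} ipv4_addr;

/* Build an address from its dotted-quad components. */
static ipv4_addr
ipv4_from_bytes(unsigned char a, unsigned char b,
                unsigned char c, unsigned char d)
{
    ipv4_addr r;

    r.bytes[0] = a;
    r.bytes[1] = b;
    r.bytes[2] = c;
    r.bytes[3] = d;
    return r;
}

/* Mask an address in one operation instead of a four-iteration loop. */
static ipv4_addr
ipv4_mask(ipv4_addr addr, ipv4_addr netmask)
{
    ipv4_addr r;

    r.val = addr.val & netmask.val;
    return r;
}
```

Reading bytes after writing val is well defined for unions in C, which is what makes the two views interchangeable.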

Signed-off-by: Laine Stump <laine@redhat.com>
2024-09-21 14:39:05 -04:00
Laine Stump
14623a3424 util: fix virSocketAddrMask() when source and result are the same object
Many years ago (2011), virSocketAddrMask() had caused a bug by failing
to initialize an IPv6-specific field in the result virSocketAddr. This
was fixed by memset(0)ing the entire result (*network) at the
beginning of the function (thus making sure anything and everything
was initialized).

The problem is that virSocketAddrMask() has a comment above it that
says that the source (addr) and destination (network) arguments can
point to the same virSocketAddr. But in that case, the
memset(*network, 0) at the top of the function is actually doing a
memset(*addr, 0), and so there is nothing left for all the assignments
to copy except a giant field of 0's.

Fortunately in the 13 years since the memset was added, nobody has
ever called virSocketAddrMask() with addr and network being the same.

This patch makes the code agree with the comment by copying/masking
into a local virSocketAddr (which is initialized to all 0) and then
copying that to *network after it's finished assigning things from
addr.
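
The aliasing problem can be reproduced with a much smaller stand-in. The sock_addr struct and the two functions below are hypothetical simplifications, not the real virSocketAddr or virSocketAddrMask().

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for virSocketAddr. */
typedef struct {
    uint32_t addr;
    uint32_t scope;   /* stands in for the field that needed zeroing */
} sock_addr;

/* Old pattern: zero the result first. If network == addr, this wipes
 * the source before it is read, so the result is always 0. */
static void
mask_buggy(const sock_addr *addr, const sock_addr *netmask,
           sock_addr *network)
{
    memset(network, 0, sizeof(*network));
    network->addr = addr->addr & netmask->addr;
}

/* Fixed pattern: build the result in a zeroed local, then copy it out
 * once everything has been read from addr. */
static void
mask_fixed(const sock_addr *addr, const sock_addr *netmask,
           sock_addr *network)
{
    sock_addr tmp;

    memset(&tmp, 0, sizeof(tmp));
    tmp.addr = addr->addr & netmask->addr;
    *network = tmp;
}
```

Calling mask_buggy(&a, &m, &a) leaves a.addr at 0, while mask_fixed(&a, &m, &a) leaves the masked address in place.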

Fixes: ba08c5932e556aa4f5101357127a6224c40e5ebe
Signed-off-by: Laine Stump <laine@redhat.com>
2024-09-21 14:37:54 -04:00
Laine Stump
f7a2d158f7 network: fix argument order/log level in message about firewall_backend
Oops.

Fixes: 64b966558cc6002fe150a0292a24eb2802a792c5
Signed-off-by: Laine Stump <laine@redhat.com>
2024-09-19 16:14:21 -04:00
Laine Stump
c7ea694f7d qemu: rework needBridgeChange/needReconnect decisions in qemuDomainChangeNet()
This patch simplifies (?) the qemuDomainChangeNet() code while
fixing some incorrect decisions about exactly when it's necessary to
re-attach an interface's bridge device, or to fail the device update
(needReconnect[*]) because the type of connection has changed (or
within bridge and direct (macvtap) type because some attribute of the
connection has changed that can't actually be modified after the
tap/macvtap device of the interface is created).

Example 1: it's pointless to require the bridge device to be
reattached just because the interface has been switched to a different
network (i.e. the name of the network is different), since the new
network could be using the same bridge as the old network (very
uncommon, but technically possible). Instead we should only care if
the name of the *bridge device* changes (or if something in
<virtualport> changes - see Example 3).

Example 2: wrt changing the "type" of the interface, a change should
be allowed if old and new type both used a bridge device (whether or
not the name of the bridge changes), or if old and new type are both
"direct" *and* the device being linked and macvtap mode remain the
same. Any other change in interface type cannot be accommodated and
should be a failure (i.e. needReconnect).

Example 3: there is no valid reason to fail just because the interface
has a <virtualport> element - the <virtualport> could just say
"type='openvswitch'" in both the before and after cases (in which case
it isn't a change by itself, and so is completely acceptable), and
even if the interfaceid changes, or the <virtualport> disappears
completely, that can still be reconciled by simply re-attaching the
bridge device. (If, on the other hand, the modified <virtualport> is
for a type='direct' interface, we can't modify that, and so must
fail (needReconnect).)

(I tried splitting this into multiple patches, but they were so
intertwined that the intermediate patches made no sense.)

[*] "needReconnect" was a flag added to this function way back in
2012, when I still believed that QEMU might someday support connecting
a new & different device backend (the way the virtual device connects
to the host) to an already existing guest netdev (the virtual device
as it appears to the guest). Sadly that has never happened, so for the
purposes of qemuDomainChangeNet() "needReconnect" is equivalent to
"fail".
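
The three examples above can be condensed into a small decision function. This is only a sketch of the rules as described in this message: the net_conn struct, the flattened vport string and classify_change() are all hypothetical, only bridge-based and direct connections are modelled, and the real qemuDomainChangeNet() compares far more than this.

```c
#include <stdbool.h>
#include <string.h>

typedef enum { CONN_BRIDGE, CONN_DIRECT } conn_kind;

typedef struct {
    conn_kind kind;
    const char *dev;    /* bridge name, or link device for direct */
    const char *mode;   /* macvtap mode (direct only), may be NULL */
    const char *vport;  /* flattened <virtualport> contents, or NULL */
} net_conn;

typedef enum { CHANGE_NONE, CHANGE_REATTACH_BRIDGE, CHANGE_FAIL } change_result;

/* NULL-safe string comparison: two NULLs are equal. */
static bool
same_str(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}

static change_result
classify_change(const net_conn *olddev, const net_conn *newdev)
{
    bool same_dev = same_str(olddev->dev, newdev->dev);
    bool same_vport = same_str(olddev->vport, newdev->vport);

    /* Example 2: bridge <-> direct cannot be done live. */
    if (olddev->kind != newdev->kind)
        return CHANGE_FAIL;

    /* Examples 2 and 3: nothing about a direct connection can change. */
    if (olddev->kind == CONN_DIRECT) {
        if (same_dev && same_vport && same_str(olddev->mode, newdev->mode))
            return CHANGE_NONE;
        return CHANGE_FAIL;
    }

    /* Examples 1 and 3: for bridges, a different bridge device or a
     * different <virtualport> is handled by re-attaching. */
    return (same_dev && same_vport) ? CHANGE_NONE : CHANGE_REATTACH_BRIDGE;
}
```

Note that the network name does not appear anywhere in the comparison; only the resulting bridge device matters.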

Resolves: https://issues.redhat.com/browse/RHEL-7036
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-19 13:56:39 -04:00
Laine Stump
601f4160b9 qemu: replace open-coded remove/attach bridge with virNetDevTapReattachBridge()
The new function does what the old qemuDomainChangeNetBridge() did
manually, except that:

1) the new function supports changing from a bridge of one type to
   another, e.g. from a Linux host bridge to an OVS
   bridge. (previously that wasn't handled)

2) the new function doesn't emit audit log messages. This is actually
   a good thing, because the old code would just log a "detach"
   followed immediately by "attach" for the same MAC address, so it's
   essentially a NOP. (the audit logs don't have any more detailed
   info about the connection - just the VM name and MAC address, so it
   makes no sense to log the detach/attach pair as it's not providing
   any information).

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-19 13:56:31 -04:00
Laine Stump
e3f8bccea6 util: don't return early from virNetDevTapReattachBridge() if "force" is true
It can be useful to force an interface to be detached/reattached from
its bridge even if it's the same bridge - possibly something like the
virtualport profileID has changed, and a detach/attach cycle will get
it connected with the new profileID.

The one and only current use of virNetDevTapReattachBridge() sets
force to false, to preserve current behavior. An upcoming patch will
use it with force set to true.
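
In outline, the early-return condition changes as sketched below. needs_reattach() is a hypothetical helper that isolates just that decision; it is not the signature of virNetDevTapReattachBridge(), and both names are assumed to be non-NULL.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: should a detach/attach cycle be performed?
 * Without force, an unchanged bridge name means there is nothing to
 * do; with force, the cycle happens regardless. */
static bool
needs_reattach(const char *cur_bridge, const char *new_bridge, bool force)
{
    if (!force && strcmp(cur_bridge, new_bridge) == 0)
        return false;
    return true;
}
```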

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-19 13:56:19 -04:00
Laine Stump
a37bd2a15b qemu: prevent unnecessarily failing live interface update
Attempts to use update-device to modify just the link state of a guest
interface were failing due to a supposed attempt to modify something
in the interface that can't be modified live (even though the only
thing that was changing was the link state, which *can* be modified
live).

It turned out that this failure happened because the guest interface
in question was type='network', and the network in question was a
'direct' network that provides each guest interface with one device
from a pool of network devices. As a part of qemuDomainChangeNet() we
would always allocate a new port from the network driver for the
updated interface definition (by way of calling
virDomainNetAllocateActualDevice(newdev)), and this new port (ie the
ActualNetDef in newdev) would of course be allocated a new host device
from the pool (which would of course be different from the one
currently in use by the guest interface (in olddev)). Because direct
interfaces don't support changing the host device in a live update,
this would cause the update to fail.

The solution to this is to realize that as long as the interface
doesn't get switched to a different network as a part of the update,
the network port information (ie the ActualNetDef) will not change as
a part of updating the guest interface itself. So for sake of
comparison we can just point the newdev at the ActualNetDef of olddev,
and then clear out one or the other when we're done (to avoid a double
free or, more likely, attempt to reference freed memory).

(If, on the other hand, the name of the network has changed, or if the
interface type has changed to type='network' from something else, then
we *do* need to allocate a new port (actual device) from the network
driver (as we used to do in all cases when the new type was
'network'), and also indicate that we'll need to replace olddev in the
domain with newdev (because either of these changes is major enough
that we shouldn't just try to fix up olddev).)

Partially-Resolves: https://issues.redhat.com/browse/RHEL-7036
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-19 13:56:06 -04:00
Peter Krempa
852380cef5 qemuBuildChardevCommand: Remove unused variable
'charstr' is unused since 36d06a5637f, breaking the build on some
platforms. Remove it.

Fixes: 36d06a5637f
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2024-09-19 13:12:02 +02:00
Peter Krempa
24d468993c qemu: Reject unsupported chardev backend protocols
QEMU supports only 'raw' and 'telnet' in the

 <protocol type='telnets'/>

element. Reject 'telnets' and 'tls'. TLS transport for qemu chardevs is
configured via "tls='yes'" attribute added to the "<source>" element
instead, so this prevents potential misconfig as the value would be
silently accepted.
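
A sketch of the validation this implies: the parser still recognises all four strings, and a separate QEMU-specific check rejects the two that QEMU cannot handle. The enum, table and function names here are made up for the example and are not libvirt's.

```c
#include <string.h>

/* Hypothetical enum mirroring the four protocol strings. */
typedef enum {
    PROTO_RAW,
    PROTO_TELNET,
    PROTO_TELNETS,
    PROTO_TLS,
    PROTO_LAST
} tcp_protocol;

static const char *const proto_names[PROTO_LAST] = {
    "raw", "telnet", "telnets", "tls",
};

/* Parse a protocol string; returns the enum value or -1 if unknown. */
static int
parse_protocol(const char *str)
{
    int i;

    for (i = 0; i < PROTO_LAST; i++) {
        if (strcmp(str, proto_names[i]) == 0)
            return i;
    }
    return -1;
}

/* Only 'raw' and 'telnet' are accepted for QEMU chardevs. */
static int
qemu_supports_protocol(tcp_protocol proto)
{
    return proto == PROTO_RAW || proto == PROTO_TELNET;
}
```

Keeping the two steps apart lets other drivers accept 'telnets' and 'tls' while the QEMU driver reports an error instead of silently ignoring them.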

Closes: https://gitlab.com/libvirt/libvirt/-/issues/412
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:15 +02:00
Peter Krempa
3778964207 conf: Convert 'protocol' field of TCP char device backend to proper type
Use virDomainChrTcpProtocol as type, convert the parser to use
virXMLPropEnum and fix one switch statement in the VMX driver.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:15 +02:00
Peter Krempa
2256466f70 qemu: monitor: Remove the old chardev backend generator
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:15 +02:00
Peter Krempa
e352a692a7 qemu: Use the new chardev backend JSON props generator also in the monitor
Now that we have a unified generator of chardev backend which is also
validated against the QMP schema we can replace the old generator with
it.

This patch modifies the monitor code to take virJSONValue 'props'
instead of the chardev definition and adds the conversion from the
chardev definition to JSON on higher levels.

The monitor code now also attempts to extract the returned 'pty' if
returned from qemu, so higher level code needs to report the error if
the path is needed and missing.

The current monitor generator is for now abandoned in place and will be
removed later.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:15 +02:00
Peter Krempa
d897ad2b89 qemu: Move check for chardev backends which can't be hotplugged out of the monitor
The upcoming refactor of the monitor code will make the hotplug code
paths use the same generator we have for commandline -chardev backends
which doesn't refuse to format certain backends which can't be
hotplugged.

To prepare for this we add a check to qemuHotplugChardevAttach()
refusing such hotplug and remove 'qemumonitorjsontest' test cases which
will not make sense any more.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:14 +02:00
Peter Krempa
36d06a5637 qemu: Introduce unified chardev backend config generator
Similarly to how we approach the generators for
-device/-object/-blockdev/-netdev rewrite the generator of -chardev to
be unified with the generator for the monitor.

Unfortunately with -chardev it will be a bit more quirky when compared
to the others as the generator itself will need to know whether it
generates command line output or not as a few field names change and data
is nested differently.

This first step adds the generator and uses it only for command line
generation. This was possible to achieve without changing any of the
output in tests.

In further patches the same generator will then be used also in the
monitor code replacing both.

As basis for the generator I took the monitor code but modified it to
have the same field order as the commandline code and extended it
further to support all backend types, even those which are not
hotpluggable.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:14 +02:00
Peter Krempa
9c88a566d8 qemu: capabilities: Explain that QEMU_CAPS_CHARDEV_JSON will be used in tests only
I've added that capability a long time ago when I was converting various
stuff to use JSON but the support in '-chardev' didn't yet materialize.

Fix the comment to make that clear and also that it'll be used in tests
for the upcoming refactor of the chardev code (so that we can validate
generator against the schema even if that doesn't yet work).

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-09-19 10:30:14 +02:00
Laine Stump
200f60b2e1 network: *un*set the firewalld zone while shutting down a network
When a bridge device for a virtual network had been placed in a
firewalld zone while starting the network, then even after the network
is shut down and the bridge device is deleted, its name will still
show up in the list of interfaces for whichever zone it had been in,
and this setting will persist through the next time a device with the
same name is created (until a zone is once again explicitly set, or
the device is removed via a firewalld API call).

Usually this isn't a problem, but in the case of forward mode='open',
someone might start the network once with a zone specified, then
shut down the network, remove the zone from its config, and start it
again; in this case the bridge device would come up using the zone
from the previous time it was started.

The solution to this is to remove the interface from whatever zone it
is in as the network is being shut down. There is no downside to doing
this, since the device is going to be deleted anyway. Note that
forward mode='bridge' uses a bridge device that was created outside of
libvirt, and libvirt won't be deleting that bridge, so we take care to
not unset the zone in that case.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-17 11:22:56 -04:00
Laine Stump
1a3778fe0a network: remove firewalld version check from networkSetBridgeZone()
At the time the version check in this function was written, there were
still several supported versions of some distros that were using a
version of firewalld too old to support the "rich rule priorities"
used by the 'libvirt' zone that we installed for firewalld. Today the
newest distro that has a version of firewalld < 0.7.0 is
RHEL7/CentOS7, so we can remove the complexity and if the libvirt zone
is missing simply say "the libvirt zone is missing".

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-17 10:55:14 -04:00
Laine Stump
1a72b83d56 network: support setting firewalld zone for bridge device of open networks
The bit of code that sets the firewalld zone was previously a part of
the function networkAddFirewallRules(), which is not called for
networks with <forward mode='open'/>.

Setting the 'libvirt' zone for the bridge device of virtual networks
that also add firewall rules is usually necessary in order to get the
expected traffic through without modifying firewalld's default zone
(which would be a bad idea, because that would affect all the other
host interfaces set to the default zone), but in general we would
*not* want the bridge device for a mode='open' virtual network to be
automatically placed in the "libvirt" zone. However, a user might want
to *explicitly* set some other firewalld zone for mode='open'
networks, and libvirt's network config is a convenient place to do
that.

We enable this by moving the code that sets the firewalld zone into a
separate function that is called for all forward modes that use a
bridge device created/managed by libvirt (nat, route, isolated,
open). If no zone is specified, then the bridge device will be in
whatever zone interfaces are put in by default, but if the <bridge>
element has a "zone" attribute, then the new bridge device will be
placed in the specified zone.

NB: This function is only called when the network is started, and
*not* when the firewall rules of an active network are reloaded at
virtnetworkd restart time, because the firewalld zone of an interface
isn't something that gets inadvertently changed as a part of some
other unrelated action. For example all iptables rules are cleared by a
firewalld restart, including those rules added by libvirt, but there
is no blanket action that changes the zone of all interfaces, so it's
useful for libvirt to reload its rules when restarting virtnetworkd,
but pointless to re-add the interface to its preferred zone.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/215
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-17 10:55:14 -04:00
Laine Stump
eeebbc1eec network: belatedly update an error message
The 'open' forward type probably hadn't yet been added when this
message was written.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-17 10:55:14 -04:00
Laine Stump
114c0ec656 network: permit <forward mode='open'/> when a network has no IP address
The whole point of <forward mode='open'/> is to suppress libvirt from
adding any firewall rules for a network, and someone might want to
create a network with no IP address (i.e. they don't want the guests
to have connectivity to the host via this interface) and no firewall
rules (they don't want any, or they want to add their own). So there's
no reason to fail when a network has <forward mode='open'/> and also
has no IP address.

Kind-of-Resolves: https://gitlab.com/libvirt/libvirt/-/issues/588
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-17 10:55:14 -04:00
Martin Kletzander
d0a48eeb72 network: Remove unused variable in networkDestroy
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-17 10:43:18 +02:00
Martin Kletzander
8a2717e803 network: Clean up after disappeared transient inactive networks
If a network disappeared the daemon should not only remove it from the
list of networks, but also do a proper cleanup.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:38 +02:00
Martin Kletzander
2bea2782d5 network: Separate cleanup from networkRemoveInactive
The new function (networkCleanupInactive) can be called from an iterator
over the list of networks without the risk of deadlock.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:37 +02:00
Martin Kletzander
74a22c09be network: Try to read dnsmasq PIDs for inactive networks too
Just in case one needs a clean up.

Resolves: https://issues.redhat.com/browse/RHEL-50968
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:37 +02:00
Martin Kletzander
447fda8981 network: Clean up after inactive objects during start
Once networkUpdateState() identifies a dead network it should clean up
after it as well.

Resolves: https://issues.redhat.com/browse/RHEL-50968
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:36 +02:00
Martin Kletzander
0e43cb09ee network: Don't check if network is active in networkShutdownNetwork
It skips the cleanup from networkStartNetwork and the only other path
already checks if the network is active or not.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:35 +02:00
Martin Kletzander
3e43670f01 network: Move port deletion into the shutdown function
It will be more useful in there when calling from new places.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:35 +02:00
Martin Kletzander
5988fdec91 network: Do not call virNetworkObjUnsetDefTransient on start cleanup
The function networkShutdownNetwork already does that.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:34 +02:00
Martin Kletzander
97ed0574ea network: Do not update network ports for inactive networks
The semantics do not change, since networkUpdatePort() (well,
networkNotifyPort, for which the former is a wrapper) already exits for
inactive networks, but with an error that this patch easily avoids.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-09-17 09:40:34 +02:00
Andrea Bolognani
d622ca04f6 apparmor: Don't check for existence of templates upfront
Currently, if either template is missing AppArmor support is
completely disabled. This means that uninstalling the LXC
driver from a system results in QEMU domains being started
without AppArmor confinement, which obviously doesn't make any
sense.

The problematic scenario was impossible to hit in Debian until
very recently, because all AppArmor files were shipped as part
of the same package; now that the Debian package is much closer
to the Fedora one, and specifically ships the AppArmor files
together with the corresponding driver, it becomes trivial to
trigger it.

Drop the checks entirely. virt-aa-helper, which is responsible
for creating the per-domain profiles starting from the
driver-specific template, already fails if the latter is not
present, so they were always redundant.

https://bugs.debian.org/1081396

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-09-16 17:57:54 +02:00
Martin Kletzander
4b68c7e55b resctrl: Do not rewrite default MB values for new allocations
The code did it "just in case" the allocation was not reset for new
subdirectories.  That might've happened in the past with CAT settings,
but checking it now it is properly reset to its maximum values for each
new CLOSID (Class of Service ID).

The advantage of this is that we do not rewrite the value with itself
which causes an issue with the current linux kernel and mba_MBps option
where the default is UINT_MAX (or (uint32_t) -1), but gets rounded up to
bandwidth granularity (10), overflows, and a small number (4) is set
instead.
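
The arithmetic behind that "4" can be reproduced as follows. How the kernel actually performs the rounding is not stated here; the sketch assumes the round-up happens in a wider type and is then truncated to 32 bits, which is one way to arrive at exactly this value. round_up_truncated() is a hypothetical name.

```c
#include <stdint.h>

/* Round value up to a multiple of granularity in 64-bit arithmetic,
 * then truncate the result to 32 bits (assumed behaviour, see above). */
static uint32_t
round_up_truncated(uint32_t value, uint32_t granularity)
{
    uint64_t wide = ((uint64_t)value + granularity - 1)
                    / granularity * granularity;

    return (uint32_t)wide;
}
```

With value = UINT32_MAX (4294967295) and granularity = 10, the rounded value is 4294967300, which is 2^32 + 4 and therefore truncates to 4.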

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-16 12:07:15 +02:00
Michal Privoznik
ebc4580a5f Revert "vircommand: Parse /dev/fd on *BSD-like systems when looking for opened FDs"
Unfortunately, devfs on FreeBSD (accessible via /dev/fd) exposes
only those FDs which can be represented as a file. To cite
manpage [1]:

  The files /dev/fd/0 through /dev/fd/# refer to file descriptors
  which can be accessed through the file system.

This means FDs representing pipes and/or unnamed sockets are not
visible by default. To expose all FDs a slightly different
filesystem must be mounted [2]:

  mount -t fdescfs none /dev/fd

Apparently, on my test machine fdescfs is mounted by default and
thus I haven't seen any problem. Only after aforementioned patch
was merged our CI started reporting problems. While we could try
to figure out whether correct FS is mounted, it's a needless
micro optimization. Just revert the code to the state it was
before I touched it.

1: https://man.freebsd.org/cgi/man.cgi?query=fd&sektion=4&manpath=freebsd-release-ports
2: https://man.freebsd.org/cgi/man.cgi?query=fdescfs&sektion=5&n=1

This reverts commit 308ec0fb2c77f4867179f00c628f05d1d784f370.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-16 12:05:19 +02:00
Michal Privoznik
308ec0fb2c vircommand: Parse /dev/fd on *BSD-like systems when looking for opened FDs
On BSD-like systems "/dev/fd" serves the same purpose as
"/proc/self/fd". And since procfs is usually not mounted, on such
systems we can use "/dev/fd" instead.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/518
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-13 14:50:43 +02:00
Michal Privoznik
4df8dc576f vircommand: Make sysconf(_SC_OPEN_MAX) failure non-fatal
The point of calling sysconf(_SC_OPEN_MAX) is to allocate big
enough bitmap so that subsequent call to
virCommandMassCloseGetFDsDir() can just set the bit instead of
expanding memory (this code runs in a forked off child and thus
using async-signal-unsafe functions like malloc() is a bit
tricky).

But on some systems the limit for opened FDs is virtually
non-existent (typically macOS Ventura started reporting EINVAL).

But with both glibc and musl using malloc() after fork() is safe.
And with sufficiently new glib too, as it's using malloc() with
newer releases instead of their own allocator.

Therefore, pick a sufficiently large value (glibc falls back to
256, [1], Darwin to 10240 [2] so 10240 should be good enough) to
fall back to and make the error non-fatal.
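
A minimal sketch of the fallback, assuming a hypothetical helper get_open_max() and the 10240 value chosen above:

```c
#include <unistd.h>

#define FALLBACK_OPEN_MAX 10240

/* Return the FD limit, or a fixed fallback when sysconf() fails
 * (it returns -1 both on error and when the limit is indeterminate). */
static long
get_open_max(void)
{
    long open_max = sysconf(_SC_OPEN_MAX);

    if (open_max < 0)
        open_max = FALLBACK_OPEN_MAX;
    return open_max;
}
```

The returned value only sizes the initial bitmap; FDs above it would still need the bitmap to grow, which is why the safety of malloc() after fork() matters here.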

1: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/getdtsz.c;h=4c5a6208067d2f9eaaac6dba652702fb4af9b7e3;hb=HEAD
2: https://github.com/apple/darwin-xnu/blob/main/bsd/sys/syslimits.h#L104

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-13 14:50:43 +02:00
Michal Privoznik
6ded014ba3 vircommand: Isolate FD dir parsing into a separate function
So far, virCommandMassCloseGetFDsLinux() opens "/proc/self/fd",
iterates over it marking opened FDs in @fds bitmap. Well, we can
do the same on other systems (with altered path), like MacOS or
FreeBSD. Therefore, isolate dir iteration into a separate
function that accepts dir path as an argument.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-13 14:50:43 +02:00
Michal Privoznik
dfe496ae33 vircommand: Drop unused arguments from virCommandMassCloseGetFDs*()
Both virCommandMassCloseGetFDsLinux() and
virCommandMassCloseGetFDsGeneric() take @cmd argument only to
mark it as unused. Drop it from both.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-09-13 14:50:43 +02:00
Martin Kletzander
bfad111c43 resctrl: Use cache IDs instead of max_id/max_cache_id
The cache IDs are not guaranteed to be contiguous, especially for
L3 caches.  Hence do not assume so and instead record the individual IDs
in a virBitmap.
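
To see why a maximum ID is not enough, consider a system whose L3 cache IDs are 0, 2 and 4. The sketch below uses a single 64-bit word as a stand-in for virBitmap; all function names are hypothetical and IDs of 64 or more are simply ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Record each cache ID as one bit. */
static uint64_t
cache_ids_from_list(const unsigned int *ids, size_t n)
{
    uint64_t set = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (ids[i] < 64)
            set |= UINT64_C(1) << ids[i];
    }
    return set;
}

/* Is this cache ID actually present on the system? */
static int
cache_id_present(uint64_t set, unsigned int id)
{
    return id < 64 && ((set >> id) & 1) != 0;
}

/* Number of caches that really exist. */
static unsigned int
cache_id_count(uint64_t set)
{
    unsigned int count = 0;

    for (; set != 0; set >>= 1)
        count += (unsigned int)(set & 1);
    return count;
}
```

For IDs {0, 2, 4} the set holds three caches, whereas iterating from 0 to the maximum ID would have assumed five and touched IDs 1 and 3, which do not exist.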

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-13 12:57:41 +02:00
Martin Kletzander
f3fd0664cf resctrl: Don't assume MBA availability in virResctrlAllocNewFromInfo
Weirdly, the existence of /sys/fs/resctrl/info/MB does not always mean
that MBA is available and used on the system.  Instead of assuming that,
copy the values from the default (root) allocation.  This also makes it
nicer to use the proper values in case the system does not use
percentages or when the root allocation already limits the bandwidth.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-09-13 12:55:39 +02:00