This is brand new way of closing FDs before exec(). We need to
close all FDs except those we want to explicitly pass to avoid
leaking FDs into the child. Historically, we've done this by
either iterating over all opened FDs and closing them one by one
(or preserving them), or by iterating over an FD interval [2 ...
N] and closing them one by one followed by calling closefrom(N +
1). This is a lot of syscalls.
That's why Linux kernel developers introduced new close_from
syscall. It closes all FDs within given range, in a single
syscall. Since we keep list of FDs we want to preserve and pass
to the child process, we can use this syscall to close all FDs
in between. We don't even need to care about opened FDs.
Of course, we have to check whether the syscall is available and
fall back to the old implementation if it isn't.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
We have two version of mass FD closing: one for FreeBSD (because
it has closefrom()) and the other for everything else. But now
that we have closefrom() wrapper even for Linux, we can unify
these two.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
It is handy to close all FDs from given FD to infinity. On
FreeBSD the libc even has a function for that: closefrom(). It
was ported to glibc too, but not musl. At least glibc
implementation falls back to calling:
close_range(from, ~0U, 0);
Now that we have a wrapper for close_range() we implement
closefrom() trivially.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Linux gained new close_range() syscall (in v5.9) that allows
closing a range of FDs in a single syscall. Ideally, we would use
it to close FDs when spawning a process (e.g. via virCommand
module).
Glibc has close_range() wrapper over the syscall, which falls
back to iterative closing of all FDs inside the range if running
under older kernel. We don't wane that as in that case we might
just close opened FDs (see Linux version of
virCommandMassClose()). And musl doesn't have close_range() at
all. Therefore, call syscall directly.
Now, mass close of FDs happens in a fork()-ed off child. While it
could detect whether the kernel does support close_range(), it
has no way of passing this info back to the parent and thus each
child would need to query it again and again.
Since this can't change while we are running we can cache the
information - hence virCloseRangeInit().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
As advertised earlier, now that the _virDomainMemoryDef struct is
cleaned up, we can shorten some names as their placement within
unions define their use.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The _virDomainMemoryDef struct is getting a bit messy. It has
various members and only some of them are valid for given model.
Worse, some are re-used for different models. We tried to make
this more bearable by putting a comment next to each member
describing what models the member is valid for, but that gets
messy too.
Therefore, do what we do elsewhere: introduce an union of structs
and move individual members into their respective groups.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The _virDomainMemoryDef struct is getting a bit messy. It has
various members and only some of them are valid for given model.
Worse, some are re-used for different models. We tried to make
this more bearable by putting a comment next to each member
describing what models the member is valid for, but that gets
messy too.
Therefore, do what we do elsewhere: introduce an union of structs
and move individual members into their respective groups.
This allows us to shorten some names (e.g. nvdimmPath or
sourceNodes) as their purpose is obvious due to their placement.
But to make this commit as small as possible, that'll be
addressed later.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When guest acknowledges change in size of virtio-mem (portion
that's exposed to the guest), QEMU emits
MEMORY_DEVICE_SIZE_CHANGE event. We process it in
processMemoryDeviceSizeChange(). So far, QEMU emits the even only
for virtio-mem (as that's the only memory device model that
allows live changes to its size). Nevertheless, if this ever
changes, validate the memory model upon processing the event as
the rest of the code blindly expects virtio-mem model.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is similar to one of my previous commits. Simply speaking,
users can specify address where a memory device is mapped to. And
as such, we should include it when looking up corresponding
device in domain definition (e.g. on device hot unplug).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The qemuDomainChangeMemoryLiveValidateChange() function is called
when a live memory device change is requested (via
virDomainUpdateDeviceFlags()). Currently, the only model that is
allowed to change is VIRTIO_MEM (and the only value that's
allowed to change is requestedsize). The aim of the function is
to check whether the change user requested follows this rule. And
in accordance with defensive programming I made the function
check all virDomainMemoryDef struct members. Even those which are
unused for VIRTIO_MEM model.
Drop these checks as the respective members will be inaccessible
soon (as the struct is refined).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
As of v7.9.0-rc1~296 users have ability to adjust what portion of
virtio-mem is exposed to the guest. Then, as of v9.4.0-rc2~5 they
have ability to set address where the memory is mapped. But due
to a missing check it was possible to feed
virDomainUpdateDeviceFlags() API with memory device XML that
changes the address. This is of course not possible and should be
forbidden. Add the missing check.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Conceptually, from host POV there's no difference between NVDIMM
and VIRTIO_PMEM. Both expose a file to the guest (which is used
as a permanent storage). Other secdriver treat NVDIMM and
VIRTIO_PMEM the same. Thus, modify virt-aa-helper so that is
appends virtio-pmem backing path into the domain profile too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Currently, inside of virt-aa-helper code the domain definition is
parsed and then all def->mems are iterated over and for NVDIMM
models corresponding nvdimmPath is set label on. Conceptually,
this code works (except the label should be set for VIRTIO_PMEM
model too, but that is addressed in the next commit), but it can
be written in more extensible way. Firstly, there's no need to
check whether def->mems[i] is not NULL because we're inside a
for() loop that's counting through def->nmems. Secondly, we can
have a helper variable ('mem') to make the code more readable
(just like we do in other loops). Then, we can use switch() to
allow compiler warn us on new memory model.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since its introduction in 4d1b771fbb
it has only been used to differentiate between START and non-START.
Last use of QEMU_DOMAIN_LOG_CONTEXT_MODE_ATTACH was removed by:
commit f709377301
qemu: Fix qemuDomainObjTaint with virtlogd
QEMU_DOMAIN_LOG_CONTEXT_MODE_STOP is unused since:
commit cf3ea0769c
qemu: process: Append the "shutting down" message using the new APIs
Now, the only caller passes QEMU_DOMAIN_LOG_CONTEXT_MODE_START.
Assume that's always the case and remove the 'mode' argument.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that the support to revert external snapshots is implemented we can
drop this check.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
With the introduction of external snapshot revert support we need to
error out in some cases when trying to delete some snapshots.
If users reverts to non-leaf snapshots and would try to delete it after
the revert is done it would not work currently as this operation would
require using block-stream which is not implemented for now as in this
case the snapshot has two children so the disk files have multiple
overlays.
Similarly if user reverts to non-leaf snapshot and would try to delete
snapshot that is non-leaf but not in currently active snapshot chain we
would still need to use block-commit operation. The issue here is that
in order to do that we would have to start new qemu process with
different domain definition than what is currently used by the domain.
If the current domain would be running it would complicate things even
more so this operation is not yet supported.
If user creates new snapshot after reverting to non-leaf snapshot it
creates a new branch. Deleting snapshot with multiple children will
require block-stream which is not implemented for now.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
There will be more external snapshot checks introduced by following
patch so group them together.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
With introduction of external snapshot revert we will have to update
backing store of qcow images not actively used be QEMU manually.
The need for this patch comes from the fact that we stop and start QEMU
process therefore after revert not all existing snapshots will be known
to that QEMU process due to reverting to non-leaf snapshot or having
multiple branches.
We need to loop over all existing snapshots and check all disks to see
if they happen to have the image we are deleting as backing store and
update them to point to the new image except for images currently used
by the running QEMU process doing the merge operation.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We only need the domain definition from domain object. This will allow
us to use it from snapshot code where we need to pass different domain
definition.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This new helper will allow us to check if we are able to delete external
snapshot after user did revert to non-leaf snapshot.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When user creates a new snapshot after reverting to non-leaf snapshot we
no longer need to store the temporary overlays as they will be part of
the VM XMLs stored in the newly created snapshot.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When deleting external snapshot and parent snapshot is the currently
active snapshot as user reverted to it we need to properly update the
parent snapshot metadata.
After the delete is done the new overlay files will be the currently
used files created when snapshot revert was done, replacing the original
overlay files.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When block commit is not needed we can just simply unlink the
disk files.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In this case there is no need to run block commit and using qemu process
at all.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Before external snapshot revert every delete operation did block commit
in order to delete a snapshot. But now when user reverts to non-leaf
snapshot deleting leaf snapshot will not have any overlay files so we
can just simply delete the snapshot images.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This part of code is about to grow to make deletion work when user
reverts to non-leaf snapshot.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The new name reflects that we prepare data for external snapshot
deletion and the old name will be used later for different part of code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When reverting to external snapshot we need to create new overlay qcow2
files from the disk files the VM had when the snapshot was taken.
There are some specifics and limitations when reverting to a snapshot:
1) When reverting to last snapshot we need to first create new overlay
files before we can safely delete the old overlay files in case the
creation fails so we have still recovery option when we error out.
These new files will not have the suffix as when the snapshot was
created as renaming the original files in order to use the same file
names as when the snapshot was created would add unnecessary
complexity to the code.
2) When reverting to any snapshot we will always create overlay files
for every disk the VM had when the snapshot was done. Otherwise we
would have to figure out if there is any other qcow2 image already
using any of the VM disks as backing store and that itself might be
extremely complex and in some cases impossible.
3) When reverting from any state the current overlay files will be
always removed as that VM state is not meant to be saved. It's the
same as with internal snapshots. If user want's to keep the current
state before reverting they need to create a new snapshot. For now
this will only work if the current snapshot is the last.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Both creating and deleting snapshot are using VIR_ASYNC_JOB_SNAPSHOT but
reverting is using VIR_ASYNC_JOB_START. Let's unify it to make it
consistent for all snapshot operations.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We will need to reuse the functionality when reverting external
snapshots.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
To create new overlay files when external snapshot revert support is
introduced we will be using different domain definition than what is
currently used by the domain.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Extract creation of qcow2 files for external snapshots to separate
function as we will need it for external snapshot revert code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When creating external snapshot this function is called only when the VM
is not running so there is only one definition to care about. However,
it will be used by external snapshot revert code for active and inactive
definition and they may be different if a disk was (un)plugged only for
the active or inactive definition.
The current code would crash so use virDomainDiskByName() to get the
correct disk from the domain definition based on the disk name and make
sure it exists.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Extract the code that updates disks in domain definition while creating
external snapshots. We will use it later in the external snapshot revert
code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This new option will be used by external snapshot revert code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This new element will hold the new disk overlay created when reverting
to non-leaf snapshot in order to remember the files libvirt created.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Commit <ef3f3884a2432958bdd4ea0ce45509d47a91a453> introduced new
argument for virDomainSnapshotAlignDisks() that allows passing alternate
domain definition in case the snapshot parent.dom is NULL.
In case of redefining snapshot it will not hit the part of code that
unconditionally uses parent.dom as there will not be need to generate
default external file names.
It should be still fixed to make it safe. Future external snapshot
revert code will use this to generate default file names and in this
case it would crash.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We will need to call this function from qemu_snapshot when introducing
external snapshot revert support.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We will need to call this function from qemu_snapshot when introducing
external snapshot revert support.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
qemu removed the support for the old 'ivshmem' device in 4.0 release.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The device was removed in qemu-4.0 and is superseded by 'ivshmem-plain'
and 'ivshmem-doorbell'.
Always report error when the old version is used and drop the irrelevant
tests.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Historically we've used QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS as witness
that the topology must cover the maximum number ov vcpus. qemu started
to enforce this in qemu-2.5, thus we can now always do the check.
This change also requires aligning the topology in certain test files.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Due to the way the information is stored by the XML parser, we've
had this quirk where specifying any information about the loader
or NVRAM would implicitly set its format to raw. That is,
<nvram>/path/to/guest_VARS.fd</nvram>
would effectively be interpreted as
<nvram format='raw'>/path/to/guest_VARS.fd</nvram>
forcing the use of raw format firmware even when qcow2 format
would normally be preferred based on the ordering of firmware
descriptors. This behavior can be worked around in a number of
ways, but it's fairly unintuitive.
In order to remove this quirk, move the selection of the default
firmware format from the parser down to the individual drivers.
Most drivers only support raw firmware images, so they can
unconditionally set the format early and be done with it; the
QEMU driver, however, supports multiple formats and so in that
case we want this default to be applied as late as possible,
when we have already ruled out the possibility of using qcow2
formatted firmware images.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Keep things consistent by using the same file extension for the
generated NVRAM path as the NVRAM template.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If the user included loader.readonly=no in the domain XML, we
should not pick a firmware build that expects to work with
loader.readonly=yes.
https://bugzilla.redhat.com/show_bug.cgi?id=2196178
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Right now, we only generate it after finding a matching entry
either among firmware descriptors or in the legacy firmware
list.
Even if the domain is configured to use a custom firmware build
that we know nothing about, however, we should still automatically
generate the NVRAM path instead of requiring the user to provide
it manually.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Just like the more common split builds, these are of type
QEMU_FIRMWARE_DEVICE_FLASH; however, they have no associated
NVRAM template, so we can't access the corresponding structure
member unconditionally or we'll trigger a crash.
https://bugzilla.redhat.com/show_bug.cgi?id=2196178
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The documentation states that, just like the Modern() variant,
this function should return 1 if a match wasn't found. It
currently doesn't do that, and returns 0 instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In mu previous commits I've moved internals of
qemuDomainChrDefDropDefaultPath() into a separate function
(qemuDomainChrMatchDefaultPath()) but forgot to remove @buf and
@regexp variables which are now unused.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For historical reasons (i.e. unknown reason) we put channel
sockets into a path derived from cfg->libDir which is a path that
survives host reboots (e.g. /var/lib/libvirt/...). This is not
necessary and in fact for session daemon creates a longer prefix:
XDG_CONFIG_HOME -> /home/user/.config
XDG_RUNTIME_DIR -> /run/user/1000
Worse, if host is rebooted suddenly (e.g. due to power loss) then
we leave files behind and nobody will ever remove them.
Therefore, place the channel target dir into state dir.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2173980
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
A <channel/> device is basically an UNIX socket into guest.
Whatever is sent from the host, appears in the guest and vice
versa. But because of that, the length of the path to the socket
is important (underscored by fact that we derive the path from
domain short name). But there are still cases where we might not
fit into UNIX_PATH_MAX limit (usually 108 characters), because
the path is derived also from other variables, e.g.
XDG_CONFIG_HOME for session domains.
There are two components though, that are needless: "/target/"
and "domain-" prefix. Drop them. This is safe to do, because
running domains have their path saved in status XML and even
though paths are dropped on migration, they are not part of guest
ABI and thus we are free to change them.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
If a user passes a list of disks to migrate but don't actually use
'VIR_MIGRATE_NON_SHARED_DISK' or 'VIR_MIGRATE_NON_SHARED_INC' flags the
parameter would be simply ignored without informing the user of the
error.
Add a proper error in such case.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When VIR_MIGRATE_TUNNELLED is used without
VIR_MIGRATE_NON_SHARED_DISK/VIR_MIGRATE_NON_SHARED_INC
an error was reported without actually returning failure.
This was caused by a refactor which dropped many error paths.
Fixes: 6111b23522
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The '/dev' filesystem convenience directory for a LVM volume group is
not created when the volume group is empty.
The logic in 'virStorageBackendLogicalCheckPool' which is used to see
whether a pool is active was first checking presence of the directory,
which failed for an empty VG.
Since the second step is virStorageBackendLogicalMatchPoolSource which
is checking mapping between configured PVs and the VG, we can simply
rely on the function to also check presence of the pool.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2228223
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In case of invalid placement its value should
be passed as a parameter of virReportError
instead of mode.
Fixes: 93e82727ec ("numatune: Encapsulate numatune configuration in order to unify results")
Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When spawning a new container (via clone()) we allocate stack for
lxcContainerChild(). So far, we allocate 4 pages for the stack
and this used to be enough until we started rewriting everything
to glib. With glib we switched to g_strerror() which localizes
errno strings and thus increases stack usage, while the
previously used strerror_r() was more compact.
Fortunately, the solution is easy - just increase how much stack
the child can use (16 pages ought to be enough for anybody).
And while at it, lets use mmap() for allocation which offer some
nice features:
MAP_STACK - align allocation to be suitable for stack (even
though, currently ignored on Linux),
MAP_GROWSDOWN - kernel guards out of bounds access from child
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/511
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
This fixes
commit 38abf9c34d
Author: Daniel P. Berrangé <berrange@redhat.com>
Date: Wed Jun 21 13:22:40 2023 +0100
src: set max open file limit to match systemd >= 240 defaults
The bug referenced in that commit had suggested to set
LimitNOFile=512000:1024
on the basis that matches current systemd default behaviour and is
compatible with old systemd. That was good except
* The setting is LimitNOFILE and these are case sensitive
* The hard and soft limits were inverted - soft must come
first and so it would have been ignored even if the
setting name was correct.
* The default hard limit is 524288 not 512000
Reported-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When a query for an interface via virInterfaceLookupByMACString finds
multiple interfaces an error is returned. Treat such error with the same
'debug' priority as we treat when the interface was not found to avoid
spamming logs with such configurations.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/514
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Since the error message originates from a log file it contains a
trailing newline. Strip it as all error handling adds it's own newline.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The caller passes in a 1k buffer, which when debug logging is in use is
easily filled with debug messages only. Thus after the first pass which
is common if the controller process already terminated the buffer will
not contain the real error, but rather a truncated debug message,
which will result in an error such as:
error: internal error: guest failed to start: 2023-08-01 12:58:31.948+0000: 798195: i
instead of the proper error:
error: internal error: guest failed to start: Failure in libvirt_lxc startup: Failed to create /home/rootfs/.oldroot: Permission denied
To fix the above retry the reading loop if the filtering function made
space in the buffer.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Avoid logging multiline debug logs so that the function which attempts
to extract a non-debug log error message can work properly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
If one of previous commits taught us something, it's that:
sizeof(variable) and sizeof(type) are not the same. Especially
because for live enough code the type might change (e.g. as we
use autoptr more). And since we don't get any warnings when an
incorrect length is passed to memset() it is easy to mess up. But
with sizeof(variable) instead, it's not as easy. Therefore,
switch to using memset(variable, 0, sizeof(*variable)), or its
alternatives, depending on level of pointers.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
There are some cases left after previous commit which were not
picked up by coccinelle. Mostly, becuase the spatch was not
generic enough. We are left with cases like: two variables
declared on one line, a variable declared in #ifdef-s (there are
notoriously difficult for coccinelle), arrays, macro definitions,
etc.
Finish what coccinelle started, by hand.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
This is a more concise approach and guarantees there is
no time window where the struct is uninitialized.
Generated using the following semantic patch:
@@
type T;
identifier X;
@@
- T X;
+ T X = { 0 };
... when exists
(
- memset(&X, 0, sizeof(X));
|
- memset(&X, 0, sizeof(T));
)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Ideally, these would be fixed by coccinelle (see next commit),
but because of various reasons they aren't. Fix them manually.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Instead of suggesting to zero structs out using memset() we
should suggest initializing structs with zero initializer.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
The fds variable inside of virNetlinkCommand() is not used
really. It's passed to memset() (hence compilers do not
complain), but that's about it. Drop it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
This is a residue of v6.8.0-rc1~100. The error variable inside of
virFirewallDApplyRule() is already initialized to NULL. There's
no need to memset() it to zero again.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Inside of remoteAuthSASL() the sargs variable is already
initialized to zero during declaration. There's no need to
memset() it again as it's unused in between it's declaration and
said memset().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
When a VSERPORT_CHANGE event is processed, we firstly do a little
detour and try to detect whether the event is coming from guest
agent. If so, we notify threads that are currently talking to the
agent about this fact. Then we proceed with usual event
processing (BeginJob(), update domain def, emit event, and so
on).
In both cases we use the same @dev variable to refer to domain
device. While this works, it will make writing semantic patch
unnecessary harder (see next commit(s)). Therefore, introduce a
separate variable for the detour code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
There are couple of variables that are declared at function
beginning but then used solely within a block (either for() loop
or if() statement). And just before their use they are zeroed
explicitly using memset(). Decrease their scope, use struct zero
initializer and drop explicit memset().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
This is similar to the previous commit, except this is for a
different type (vahControl) and also fixes the case where _ctl is
passed not initialized to vah_error() (via ctl pointer so that's
probably why compilers don't complain).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
When I implemented passt support in libvirt, I saw the --mac-addr
option on the passt commandline, immediately assumed that this was
used for setting the guest interface's mac address somewhere within
passt, and read no further. As a result, "--mac-addr" is always added
to the passt commandline, specifying the setting from <mac
addr='blah'/> in the guest's interface config.
But as pointed out in this bugzilla comment:
https://bugzilla.redhat.com/2184967#c8
That is *not at all* what passt's --mac-addr option does. Instead, it
is used to force the *remote* mac address for incoming traffic to a
specific value. So setting --mac-addr results in all traffic on the
interface having the same (the guest's) mac address for both source
and destination in all traffic. Surprisingly, this still works, so
nobody noticed it during testing.
The proper thing is to not specify any mac address to passt - the
remote MAC addresses can and should remain untouched, and the local
MAC address will end up being known to passt just by the guest sending
out packets with that MAC address.
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
This reverts commit 8511b96a31.
Turns out, we need to do a bit more than just plain
qemuSecurityDomainSetPathLabel() which sets svirt_image_t. Passt
has its own SELinux policy and as a part of that they invent
passt_log_t for log files. Right now, I don't know how libvirt
could query that and even if I did, passt SELinux policy would
need to permit relabelling from svirt_t to passt_log_t, which it
doesn't [1].
Until these problems are addressed we shouldn't be pre-creating
the file as it puts users into way worse position - even
scenarios that used to work don't work. But then again - using
log file for passt is usually valuable for developers only and
not regular users.
1: https://bugzilla.redhat.com/show_bug.cgi?id=2209191#c10
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
This reverts commit 83686f1eea.
This is needed only so that the next revert is clean.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
We dropped our private virXXXPtr typedefs in v7.3.0-rc1~229 but
somehow v7.9.0-rc1~292 introduced one back:
virDomainEventMemoryDeviceSizeChangePtr. There's no need for it
and it's internal only. Drop it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
A new bug was introduced as a part of use-after-free fix below:
commit 411cbe7199
Author: Oleg Vasilev <oleg.vasilev@virtuozzo.com>
Date: Tue Jul 4 13:10:22 2023 +0600
remote: fix stream use-after-free
When the message was processed partially, it is actually supposed to
stay in the queue to be processed again. In such case, reinsert it back.
Signed-off-by: Oleg Vasilev <oleg.vasilev@virtuozzo.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The virRandomGenerateWWN() is used solely by nodedev driver to
autogenerate WWNN and WWNP when parsing a nodedev XML. Now, the
idea was (at least during monolithic daemon) that depending on
which hypervisor driver called the nodedev XML parsing (and
virRandomGenerateWWN() under the hood) the corresponding OUI is
used (e.g. "001a4a" for the QEMU driver).
But in era of split daemons things are not that easy. We do not
know which hypervisor driver called us. And there might be no
hypervisor driver at all - users are allowed to connect to
individual drivers directly (e.g. "nodedev:///system").
In this case, we can't use proper OUI. Well, do the next best
thing: pick one (QUMRANET_OUI).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When automatically adding a NUMA node (qemuDomainDefNumaAutoAdd()) the
memory size of the node is computed as:
total_memory - sum(memory devices)
And we have a nice helper for that: virDomainDefGetMemoryInitial() so
it looks logical to just call it. Except, this code runs in post parse
callback, i.e. memory sizes were not validated and it may happen that
the sum is greater than the total memory. This would be caught by
virDomainDefPostParseMemory() but that runs only after driver specific
callbacks (i.e. after qemuDomainDefNumaAutoAdd()) and because the
domain config was changed and memory was increased to this huge
number no error is caught.
So let's do what virDomainDefGetMemoryInitial() would do, but
with error checking.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2216236
Fixes: f5d4f5c8ee
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
This brings the tool's list of features in sync with qemu
commit 6f05a92ddc73ac8aa16cfd6188f907b30b0501e3.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Inside daemonStreamHandleWrite on stream completion (status=OK) we
reuse msg object to send confirmation.
Only after that, msg is poped from the queue and checked for continue.
By that time, msg might've already been processed for the confirmation
and freed.
Signed-off-by: Oleg Vasilev <oleg.vasilev@virtuozzo.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Helped to debug next patch use-after-free.
Signed-off-by: Oleg Vasilev <oleg.vasilev@virtuozzo.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If a per-domain SWTPM state directory exists but is empty our
code still considers it a valid state and skips running
'swtpm_setup' (handled in qemuTPMEmulatorRunSetup()).
While we should not try to inspect individual files created by
swtpm, we can still consider empty folder as non-existent state.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/320
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There might be cases where we want to know whether given
directory is empty or not. Introduce a helper for that:
virDirIsEmpty().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Queues is supported by virtio bus, including virtio-blk and
vhost-user-blk.
Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that we don't use it for probing at all we can remove all the
corresponding monitor code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The capability code now probes the presence of commands from the QMP
schema instead of using 'query-commands'. Don't call the command and
adjust the '.replies' files.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Move the probing code to extract the data from the QMP schema rather
than invoking 'query-commands'. This patch doesn't yet remove the actual
invocation of 'query-commands', just moves the actual probing.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
nodeDeviceUpdateMediatedDevices invokes virMdevctlListDefined and
virMdevctlListActive both of which were passed the same 'errmsg' buffer.
Since virCommandSetErrorBuffer() always allocates the error buffer one
of them was leaked.
Fix it by populating the 'errmsg' buffer only on failure of
virMdevctlListActive|Defined which invoke the command.
Add a comment to nodeDeviceGetMdevctlListCommand reminding how
virCommandSetErrorBuffer() works.
Fixes: 44a0f2f0c8
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
The support for configuring the 'wwn' of a IDE disk was added in qemu
commit 95ebda85e09 (v1.0-1869-g95ebda85e0) and can't be compiled
out.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The support for configuring the 'wwn' of a SCSI disk was added in qemu
commit 27395add759ff4caeb0 (v1.0-3326-g27395add75) and can't be compiled
out.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>