The pidfile is guaranteed to be non-NULL (thanks to glib allocation
functions) and it's dereferenced two lines above anyway.
Reported by coverity:
/src/qemu/qemu_passt.c: 278 in qemuPasstStart()
272 return 0;
273
274 error:
275 ignore_value(virPidFileReadPathIfLocked(pidfile, &pid));
276 if (pid != -1)
277 virProcessKillPainfully(pid, true);
>>> CID 404360: Null pointer dereferences (REVERSE_INULL)
>>> Null-checking "pidfile" suggests that it may be null, but it
>>> has already been dereferenced on all paths leading to the check.
278 if (pidfile)
279 unlink(pidfile);
280
281 return -1;
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
In our current code the function is not called with a NULL argument, but
we should follow our common practice and make it safe anyway.
Reported by coverity:
/src/conf/domain_conf.c: 2635 in virDomainNetPortForwardFree()
2629 {
2630 size_t i;
2631
2632 if (pf)
2633 g_free(pf->dev);
2634
>>> CID 404359: Null pointer dereferences (FORWARD_NULL)
>>> Dereferencing null pointer "pf".
2635 for (i = 0; i < pf->nRanges; i++)
2636 g_free(pf->ranges[i]);
2637
2638 g_free(pf->ranges);
2639 g_free(pf);
2640 }
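A minimal sketch of the NULL-safe pattern, assuming simplified stand-in
types (PortRange, PortForward and port_forward_free are hypothetical
names, and plain free() replaces g_free()); this is not the actual
libvirt code:

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the real libvirt structs. */
typedef struct {
    unsigned int start;
    unsigned int end;
} PortRange;

typedef struct {
    char *dev;
    size_t nRanges;
    PortRange **ranges;
} PortForward;

/* Free a PortForward. A NULL argument is a no-op, like free(NULL). */
void
port_forward_free(PortForward *pf)
{
    size_t i;

    /* Return before the first dereference instead of guarding
     * only one of the member accesses. */
    if (!pf)
        return;

    free(pf->dev);

    for (i = 0; i < pf->nRanges; i++)
        free(pf->ranges[i]);

    free(pf->ranges);
    free(pf);
}
```

With the early return every later access to pf is known to be safe, so
no per-member NULL checks are needed.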
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
To ensure the same behaviour whether or not the remote driver is used,
we must not steal the FDs and the array holding them that are passed to
qemuDomainFDAssociate, but rather duplicate them. At the same time the
remote driver must close and free them to prevent a leak.
Pointed out by Coverity as FD leak on error path:
*** CID 404348: Resource leaks (RESOURCE_LEAK)
/src/remote/remote_daemon_dispatch.c: 7484 in remoteDispatchDomainFdAssociate()
7478 rv = 0;
7479
7480 cleanup:
7481 if (rv < 0)
7482 virNetMessageSaveError(rerr);
7483 virObjectUnref(dom);
>>> CID 404348: Resource leaks (RESOURCE_LEAK)
>>> Variable "fds" going out of scope leaks the storage it points to.
7484 return rv;
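The ownership rule described above (the callee duplicates, the caller
keeps and releases its own copies) can be sketched as follows. The
helper names fds_duplicate and fds_close_and_free are hypothetical, and
plain dup() is used for brevity; real code would probably also want
close-on-exec on the duplicates:

```c
#include <stdlib.h>
#include <unistd.h>

/* Duplicate every FD in @fds into a newly allocated array.
 * Returns NULL on failure or when @nfds is 0. The caller keeps
 * ownership of the original FDs and of the original array. */
int *
fds_duplicate(const int *fds, size_t nfds)
{
    int *copy;
    size_t i;

    if (nfds == 0)
        return NULL;

    copy = malloc(nfds * sizeof(*copy));
    if (!copy)
        return NULL;

    for (i = 0; i < nfds; i++) {
        copy[i] = dup(fds[i]);
        if (copy[i] < 0) {
            /* Roll back the duplicates made so far. */
            while (i > 0)
                close(copy[--i]);
            free(copy);
            return NULL;
        }
    }

    return copy;
}

/* Close all FDs in a heap-allocated array and free the array. */
void
fds_close_and_free(int *fds, size_t nfds)
{
    size_t i;

    if (!fds)
        return;

    for (i = 0; i < nfds; i++)
        close(fds[i]);

    free(fds);
}
```

Because the callee never takes the caller's FDs, the caller can release
its copies unconditionally on both the success and the error path.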
Fixes: abd9025c2f
Fixes: f762f87534
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The remote_*_args methods will generally borrow pointers
passed in by the caller, so these should not be freed.
On failure of the virTypedParamsSerialize method, however,
xdr_free was being called. This is presumably because it
was thought that the params may have been partially
serialized and need cleaning up. This is incorrect, as
virTypedParamsSerialize takes care to clean up partially
serialized data. This xdr_free call would lead to free'ing
the borrowed cookie pointers, which would be a double free.
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
A few admin client methods had the xdr_free call on the wrong
side of the cleanup label, so typed parameters would not
be freed on error.
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This consists of (1) adding the necessary args to the qemu commandline
netdev option, and (2) starting a passt process prior to starting
qemu, and making sure that it is terminated when it's no longer
needed. Under normal circumstances, passt will terminate itself as
soon as qemu closes its socket, but in case of some error where qemu
is never started, or fails to start up completely, we need to terminate
passt manually.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
passt support requires "-netdev stream", which was added to QEMU in
qemu-7.2.0.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This implements XML config to represent a subset of the features
supported by 'passt' (https://passt.top), which is an alternative
backend for emulated network devices that requires no elevated
privileges (similar to slirp, but "better").
Along with setting the backend to use passt (via <backend
type='passt'/> when the interface type='user'), we also support
passt's --log-file and --interface options (via the <backend>
subelement logFile and upstream attributes) and its --tcp-ports and
--udp-ports options (which selectively forward incoming connections to
the host on to the guest) via the new <portForward> subelement of
<interface>. Here is an example of the config for a network interface
that uses passt to connect:
<interface type='user'>
<mac address='52:54:00:a8:33:fc'/>
<ip address='192.168.221.122' family='ipv4'/>
<model type='virtio'/>
<backend type='passt' logFile='/tmp/xyzzy.log' upstream='eth0'/>
<portForward address='10.0.0.1' proto='tcp' dev='eth0'>
<range start='2022' to='22'/>
<range start='5000' end='5099' to='1000'/>
<range start='5010' end='5029' exclude='yes'/>
</portForward>
<portForward proto='udp'>
<range start='10101'/>
</portForward>
</interface>
In this case:
* the guest will be offered address 192.168.221.122 for its interface
via DHCP
* the passt process will write all log messages to /tmp/xyzzy.log
* routes to the outside for the guest will be derived from the
addresses and routes associated with the host interface "eth0".
* incoming tcp port 2022 to the host will be forwarded to port 22
on the guest.
* incoming tcp ports 5000-5099 (with the exception of ports 5010-5029)
to the host will be forwarded to port 1000-1099 on the guest.
* incoming udp packets on port 10101 will be forwarded (unchanged) to
the guest.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Initial support for network devices using passt (https://passt.top)
for the backend connection will require:
* new attributes of the <backend> subelement:
* "type" that can have the value "passt" (to differentiate from
slirp, because both slirp and passt will use <interface
type='user'>)
* "logFile" (a path to a file that passt should use for its logging)
* "upstream" (a netdev name, e.g. "eth0").
* a new subelement <portForward> (described in more detail later)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This will allow us to call parser/formatter functions with a pointer
to just the backend part.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This fits better with the element containing the value (<driver>), and
allows us to use virDomainNetBackend* for things in the <backend>
element.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Assert support for VIR_DOMAIN_DEF_FEATURE_DISK_FD in the qemu driver
now that all code paths are adapted.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Probing stats and block copy to a FD passed image is not yet supported.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
We assume that FD passed images already exist so all existence checks
are skipped.
For the case that a FD-passed image is passed without a terminated
backing chain (thus forcing us to detect) we attempt to read the header
from the FD.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Unfortunately unlike with DAC we can't simply ignore labelling for the
FD and it also influences the on-disk state.
Thus we need to relabel the FD and we also store the existing label in
case the user requests best-effort label replacement.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
DAC security label is irrelevant once you have the FD. Disable all
labelling for such images.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Prepare the internal data for passing FDs instead of having qemu open
the file internally.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When starting up a VM with FD-passed images we need to look up the
corresponding named FD set and associate it with the virStorageSource
based on the name.
The association is brought into virStorageSource as security labelling
code will need to access the FD to perform selinux labelling.
Similarly when startup is complete in certain cases we no longer need to
keep the copy of FDs and thus can close them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The helper will be used in various places that need to check that a disk
source struct is using FD passing.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The new helper qemuDomainStartupCleanup is used to perform cleanup after
a startup of a VM (successful or not). The initial implementation just
calls qemuDomainSecretDestroy, which can be un-exported.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The 'fdgroup' will allow users to specify a passed FD (via the
'virDomainFDAssociate()' API) to be used instead of opening a path.
This is useful in cases when e.g. the file is not accessible from inside
a container.
Since this uses the same disk type as when we open files by name, this
patch also introduces a hypervisor feature by which the hypervisor
asserts that its code paths are ready for this possibility.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Introduce a new argument type for testQemuInfoSetArgs named ARG_FD_GROUP
which allows users to instantiate tests with populated FD passing hash
table.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Implement passing and storage of FDs for the qemu driver. The FD tuples
are g_object instances stored in a per-domain hash table and are
automatically removed once the connection is closed.
In the future we can also consider not tying the lifetime of the passed
FDs to the connection.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
For FD-passing of disk sources we'll need to keep the FDs around.
Introduce a data type helper based on a g_object so that we get
reference counting.
One instance will (due to security labelling) need to be part of the
virStorageSource struct, thus it's declared in the storage_source_conf
module.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The API can be used to associate one or more (e.g. a RO and RW fd for a
disk backend image) FDs to a VM. They can be then used per definition.
The primary use case for now is for complex deployment where
libvirtd/virtqemud may be run inside a container and getting the image
into the container is complicated.
In the future it will also allow passing e.g. vhost FDs and other
resources to a VM without the need to have a filesystem representation
for it.
Passing raw FDs has a few intricacies and thus libvirt will by default
not restore security labels.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The patch moving the code didn't faithfully represent the typecasting
of the 'bandwidth' variable needed to properly convert from the legacy
'unsigned long' argument which resulted in a build failure on 32 bit
systems:
../src/qemu/qemu_block.c: In function ‘qemuBlockCommit’:
../src/qemu/qemu_block.c:3249:23: error: comparison is always false due to limited range of data type [-Werror=type-limits]
3249 | if (bandwidth > LLONG_MAX >> 20) {
| ^
Fix it by moving the check back into qemuDomainBlockCommit, as it's
needed only because of the legacy argument type in the old API, and use
'unsigned long long' for qemuBlockCommit.
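A minimal sketch of the idea, assuming the legacy value is in MiB/s and
using a hypothetical helper name (bandwidth_mib_to_bytes). The legacy
'unsigned long' is widened first, so the range check is done on a
64-bit type and is meaningful on both 32 bit and 64 bit systems:

```c
#include <limits.h>

/* Hypothetical helper: convert a legacy MiB/s value to bytes/s.
 * Returns 0 on success, -1 if the result would exceed LLONG_MAX. */
int
bandwidth_mib_to_bytes(unsigned long mib, unsigned long long *bytes)
{
    /* Widen before comparing: comparing a 32 bit 'unsigned long'
     * directly against LLONG_MAX >> 20 can never be true. */
    unsigned long long speed = mib;

    if (speed > LLONG_MAX >> 20)
        return -1;

    *bytes = speed << 20;
    return 0;
}
```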
Fixes: f5a77198bf
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Some compilers aren't happy when an automatically freed variable is used
just to free something (thus it's only assigned in the code):
When compiling qemuSnapshotDelete after recent commits they complain:
../src/qemu/qemu_snapshot.c:3153:61: error: variable 'delData' set but not used [-Werror,-Wunused-but-set-variable]
g_autoslist(qemuSnapshotDeleteExternalData) delData = NULL;
^
To work around the issue we can restructure the code which also has the
following semantic implications:
- since qemuSnapshotDeleteExternalPrepare does validation we error out
sooner than attempting to start the VM
- we read the temporary variable at least in one code path
Fixes: 4a4d89a925
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Use a temporary variable to avoid memory alignment issues on ARM:
../src/nwfilter/nwfilter_dhcpsnoop.c: In function ‘virNWFilterSnoopLeaseFileLoad’:
../src/nwfilter/nwfilter_dhcpsnoop.c:1745:20: error: cast increases required alignment of target type [-Werror=cast-align]
1745 | (unsigned long long *) &ipl.timeout,
|
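A sketch of the temporary-variable approach, assuming a simplified
lease record and a two-field line format (the Lease struct and
lease_parse_line are hypothetical and much smaller than the real
code). sscanf() writes into a local whose type matches "%llu" exactly,
and the value is copied into the struct field afterwards:

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical, simplified lease record. */
typedef struct {
    time_t timeout;
    char ifkey[64];
} Lease;

/* Parse "<timeout> <ifkey>" from one line.
 * Returns 0 on success, -1 on a malformed line. */
int
lease_parse_line(const char *line, Lease *lease)
{
    /* The temporary has exactly the type "%llu" expects, so no
     * pointer cast (and no alignment assumption) is needed. */
    unsigned long long timeout = 0;

    if (sscanf(line, "%llu %63s", &timeout, lease->ifkey) != 2)
        return -1;

    lease->timeout = (time_t) timeout;
    return 0;
}
```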
Fixes: 0d278aa089
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Now that deletion of external snapshot is implemented document the
current virDomainSnapshotDelete supported state.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
If the daemon crashes or is restarted while the snapshot delete is in
progress we have to handle it gracefully to not leave any block jobs
active.
For now we will simply abort the snapshot delete operation so the user
can start it again. We need to refuse deleting external snapshots if
there is already another active job as we would have to figure out
which jobs we can abort.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When the daemon is restarted and libvirt tries to recover domain jobs we
need to know if the snapshot job was a snapshot delete in order to
safely abort running QEMU block jobs.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When deleting external snapshots the operation may fail at any point,
which could lead to a situation where some disks have finished the block
commit operation but for other disks it failed and the libvirt job ends.
In order to make sure that the qcow2 images are in a consistent state,
introduce a new element "<snapshotDeleteInProgress/>" that will mark the
disk in snapshot metadata as invalid until the snapshot delete is
completed successfully.
This will prevent deleting a snapshot with an invalid disk and, in the
future, reverting to a snapshot with an invalid disk.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
With external snapshots we need to modify the metadata a bit more than
what is required for internal snapshots. Mainly the storage source
location changes with every external snapshot.
This means that if we delete a non-leaf snapshot we need to update all
children snapshots and modify the disk sources for all affected disks.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When deleting a snapshot we start a block-commit job for every disk
that is part of the snapshot.
This operation may fail as it writes data changes to the backing qcow2
image, so we need to wait for all the disks to finish the operation and
wait for the correct signal from QEMU. When deleting an active snapshot
we will get the `ready` signal; for inactive snapshots we need to
disable autofinalize in order to get the `pending` signal.
At this point, if the commit for any disk fails for some reason and we
abort, the VM is still in a consistent state and the user can fix the
reason why the deletion failed.
After that we do `pivot` (for an active snapshot) or `finalize` (for an
inactive one) to finish the block job. It may still fail but there is
nothing else we can do about it.
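The synchronization described above can be sketched as a small decision
function. All names here (job_state, job_action, commit_next_action)
are hypothetical and only model the logic; the sketch does not talk to
QEMU:

```c
#include <stdbool.h>
#include <stddef.h>

enum job_state { JOB_RUNNING, JOB_READY, JOB_PENDING, JOB_FAILED };
enum job_action { ACTION_WAIT, ACTION_ABORT_ALL, ACTION_PIVOT, ACTION_FINALIZE };

/* Decide what to do next for the commit jobs of one snapshot.
 * @active: true when the snapshot being deleted is the active one,
 * in which case jobs are expected to reach JOB_READY; otherwise
 * they are expected to reach JOB_PENDING. */
enum job_action
commit_next_action(const enum job_state *jobs, size_t njobs, bool active)
{
    enum job_state wanted = active ? JOB_READY : JOB_PENDING;
    bool all_synced = true;
    size_t i;

    for (i = 0; i < njobs; i++) {
        /* A single failure aborts every job, which leaves the
         * images untouched from the guest's point of view. */
        if (jobs[i] == JOB_FAILED)
            return ACTION_ABORT_ALL;

        if (jobs[i] != wanted)
            all_synced = false;
    }

    if (!all_synced)
        return ACTION_WAIT;

    return active ? ACTION_PIVOT : ACTION_FINALIZE;
}
```

Only once every job has reached the expected state is the point of no
return (pivot or finalize) taken.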
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In order to save some CPU cycles we will collect all the necessary data
to delete external snapshot before we even start. They will be later
used by code that deletes the snapshots and updates metadata when
needed.
With external snapshots we need data that libvirt gets from a running
QEMU process, so if the VM is not running we need to start a paused QEMU
process for the snapshot deletion and kill it afterwards.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Deleting external snapshots will need to run as an async domain job,
the same way as we do for snapshot creation.
For internal snapshots modify the job mask in order to forbid any other
job to be started.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Deleting internal snapshot when the currently active disk image is
different than where the internal snapshot was taken doesn't work
correctly.
This applies to a running VM only as we are using QMP command and
talking to the QEMU process that is using different disk.
This works correctly when the VM is shut off, as in this case we spawn
a qemu-img process to delete the snapshot.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Prepare the validation function for external snapshot delete support.
There is one exception when deleting `children-only` snapshots. If the
snapshot tree is like this example:
snap1 (external)
|
+- snap2 (internal)
|
+- snap3 (internal)
|
+- snap4 (internal)
and the user calls `snapshot-delete snap1 --children-only`, the current
snapshot is external but all the children snapshots are internal only
and we are able to delete them.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Extract the code deleting external snapshot metadata to separate
function.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Previously the reparent happened before the actual snapshot deletion.
This change moves the code closer to the rest of the code handling
snapshot metadata when deletion happens. This makes the metadata
deletion happen after the data files are deleted.
The following patch will extract it into a separate function.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This simplifies the code a bit by reusing existing parts that delete
a single snapshot.
The drawback of this change is that we will now call the re-parent bits
to keep the metadata in sync for every child even though it will get
deleted as well.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Extract code that deletes children of specific snapshot to separate
function.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Extract code that deletes single snapshot to separate function.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Move code around to make it clear what is called when deleting single
snapshot or children snapshots.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Looks up disk storage source within storage source chain using storage
source object instead of path to make it work with all disk types.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
QEMU emits this signal when the job finished its work and is about to be
finalized. If the job is started with autofinalize disabled the job
waits for user input to finalize the job.
This will be used by snapshot delete code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The created job will be needed by external snapshot delete code so
rework qemuBlockCommit to return that pointer.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
External snapshots will use this to synchronize qemu block jobs.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Deleting external snapshots will require configuring autofinalize to
synchronize the block jobs for disks within a single snapshot in order
to be able to safely abort if one of the jobs fails.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Upcoming snapshot deletion code will require that multiple commit jobs
are finished in sync. To allow aborting them if one fails we will need
to use manual finalization of the jobs.
This commit implements the monitor code for `job-finalize`.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This will allow using it while an async domain job is active, which we
will do when deleting external snapshots.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This will allow using it while an async domain job is active, which we
will do when deleting external snapshots. At the same time we will need
to have the block job started as synchronous.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Move the code for finishing a job in the ready state to qemu_block.c.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Up until commit 629282d884, using mode=restrictive caused
virNumaSetupMemoryPolicy() to be called from qemuProcessHook(),
and that in turn resulted in virNumaNodesetIsAvailable() being
called and the nodeset being validated.
After that change, the only validation for the nodeset is the one
happening in qemuBuildMemoryBackendProps(), which is skipped when
using mode=restrictive.
Make sure virNumaNodesetIsAvailable() is called whenever a
nodeset has been provided by the user, regardless of the mode.
https://bugzilla.redhat.com/show_bug.cgi?id=2156289
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When post-copy migration fails, the domain stays running on the
destination with a VIR_DOMAIN_RUNNING_POSTCOPY_FAILED reason. Both the
state and the reason can later be rewritten in case the domain gets
paused for other reasons (such as an I/O error). Thus we need a separate
place to remember that the post-copy migration failed so that we are
able to resume the migration.
https://bugzilla.redhat.com/show_bug.cgi?id=2111948
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The parameter was only used to select which states correspond to an
active or failed post-copy migration. But these states are either
applicable to both operations or the check would just paper over a code
bug in case of an impossible combination of state and operation. By
dropping the check we can make the code simpler and also reuse existing
virDomainObjIsFailedPostcopy function and only check for active
post-copy states.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Convert to a switch instead of a bunch of 'if (type == ...)'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Until now the function didn't have a return value. Returning the
'virLockGuard' struct allows the callers to use automatic unlocking of
the mutex.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Require check of return value of the ACL checking functions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Now that all code was refactored to use the new version we can remove
the old code.
For now the new close callbacks code has no error messages so
syntax-check forced me to remove the POTFILES entry for
virclosecallbacks.c
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The qemu driver uses connection close callbacks in more places requiring
more changes than other drivers, but luckily the changes are very
straightforward. The migration code was written in a way ensuring that
there's just one callback present so this can be preserved directly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The rewrite is straightforward as bhyve registers only the
'bhyveProcessAutoDestroy' callback which by design doesn't need any
special handling (there's just one caller which can start the VM thus
implicitly there's only one possible registration for that function).
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The rewrite is straightforward as LXC registers only the
'lxcProcessAutoDestroy' callback which by design doesn't need any
special handling (there's just one caller which can start the VM thus
implicitly there's only one possible registration for that function).
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The new APIs store the list of callbacks for a VM inside the
virDomainObj and also allow registering multiple callbacks for a single
domain and also for multiple connections.
For now this code is dormant until each driver using the old APIs is
refactored to use the new APIs.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The new connect close callbacks for domains will be represented by a
virObject associated with the domain object itself.
To simplify handling, the close callback data will be referenced by an
immutable pointer allocated directly when allocating the
corresponding virDomainObj struct.
This patch adds the 'closecallbacks' field to virDomainObj and a
corresponding callback to allocate it into virDomainXMLOption.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The function can't fail so there's no point in returning anything.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Introduce a helper which will return a list of all domain objects inside
of the list without filtering and thus without the need to lock
individual members.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Use the new style which doesn't require re-aligning the argument list
once you change the return type.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Remove extraneous spaces and put comment on a single line.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
'virObjectNew' can't return NULL. If we pre-check the arguments we don't
need a cleanup label.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Coverity scan reports:
"A time_t value is stored in an integer with too few bits to accommodate
it. The expression timeout is cast to unsigned int"
We are already casting and storing the time_t timeout variable into an
unsigned int. We can use time_t for timeout and cast it to unsigned long
(should be big enough) instead of unsigned int in sscanf and
g_strdup_printf as required.
Signed-off-by: Shaleen Bathla <shaleen.bathla@oracle.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
1. clear passwd in debug log
2. alignment
3. use the same variable name for function definition and declaration
Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
These while loops exit directly due to break after entering.
Use if instead of these while loops.
Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Fix a misspelling in the documentation of 'daemonCreateClientStream'.
Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In a recent commit I've introduced an umount() call. But the
function where the call lives is compiled on all OSes, not just
Linux. But umount() is Linux specific. Other OSes have unmount
(FreeBSD), or maybe something else. But since namespaces are
Linux specific, we can wrap the call in #ifdef __linux__ and not
care about other OSes.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When calling virConnectGetDomainCapabilities() (exposed as virsh
domcapabilities) users have the option to specify whatever subset of
{ emulatorbin, arch, machine, virttype } they want. Then we have
logic (hidden in virQEMUCapsCacheLookupDefault()) that picks
qemuCaps that satisfy the values passed by the user. And whatever was not
specified is then set to the default value as specified by picked
qemuCaps. For instance: if no machine type was provided but
emulatorbin was, then the machine type is set to the default one
as defined by the emulatorbin.
Or, when just virttype was set then the remaining three values
are set to their respective defaults. Except, we have a crasher
in this case:
# virsh domcapabilities --virttype hvf
error: Disconnected from qemu:///system due to end of file
error: failed to get emulator capabilities
error: End of file while reading data: Input/output error
This is because for 'hvf' virttype (at least my) QEMU does not
have any machine type. Therefore, @machine is set to NULL and the
rest of the code does not expect that.
What we can do about this is to validate all arguments. Well,
except for the emulatorbin which is obtained from passed
qemuCaps. This also fixes the issue when domcapabilities for a
virttype of a different driver are requested, or a different
arch.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When deciding whether to bind mount a path in the domain's namespace,
we look at the QEMU mount table (/proc/$pid/mounts) and try to
match a prefix of the given path with one of the mount points. Well, we
do that in a somewhat clumsy way. For instance, if there's
"/dev/hugepages" already mounted inside the namespace and we are
deciding whether to bind mount "/dev/hugepages1G/..." we decide
to skip over the path and NOT bind mount it. This is because
plain STRPREFIX() is used and yes, the former is a prefix of the
latter. What we also need to check is whether the next character
after the prefix is a slash.
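A minimal sketch of a component-aware prefix check, assuming a
hypothetical helper name (path_is_under_mount). Treating an exact match
and a mount point ending in a slash (such as "/") as matches is an
addition of this sketch, not something stated above:

```c
#include <stdbool.h>
#include <string.h>

/* Return true if @path is @mount itself or lies below it. */
bool
path_is_under_mount(const char *path, const char *mount)
{
    size_t len = strlen(mount);

    if (strncmp(path, mount, len) != 0)
        return false;

    /* A mount point ending in '/' (e.g. the root) already ends
     * on a component boundary. */
    if (len > 0 && mount[len - 1] == '/')
        return true;

    /* Otherwise the prefix must be followed by a slash or be the
     * whole path; "/dev/hugepages1G" is not under "/dev/hugepages". */
    return path[len] == '/' || path[len] == '\0';
}
```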
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Our code relies on mount events propagating into the namespace we
create for a domain. However, there's one caveat. In v8.8.0-rc1~8
I've tried to make us detect differences in mount tables between
the namespace in which libvirtd runs and the domain namespace.
This is crucial for any mounts that happen after the domain was
started (for instance new hugetlbfs can be mounted on say
/dev/hugepages1G).
Therefore, we take a look into /proc/$(pgrep qemu)/mounts to see
what filesystems are mounted under /dev. Now, since we don't
umount the original /dev, just mount a tmpfs over it, we get all
the events (e.g. aforementioned hugetlbfs mount on
/dev/hugepages1G), but we are not really able to access it
because of the tmpfs that's placed on top. This then confuses our
algorithm for detecting which filesystems are mounted (the
algorithm is implemented in qemuDomainGetPreservedMounts()).
To break the link between host's and guest's /dev we just need to
umount() the original /dev in the namespace, just before our
artificially created tmpfs is moved into its place.
Fixes: 46b03819ae
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2151869#c6
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Inside of qemuCaps (for the corresponding accelerator) we have
full host CPU expansion stored, along with supported Hyper-V
Enlightenments. To report them in the domain capabilities, we
just have to pick those starting with "hv-" and see if we know
them.
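The selection step could look something like the sketch below. It is only an illustration: pick_known_hv_props() and the known_hv table are hypothetical names, and the table is a small illustrative subset rather than the list libvirt actually knows.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative subset only; the real list of known enlightenments
 * is longer. */
static const char *const known_hv[] = { "relaxed", "vapic", "spinlocks" };

/* Hypothetical helper: stores in 'out' the names (without the "hv-"
 * prefix) of all properties that start with "hv-" and are known.
 * 'out' must have room for 'nprops' entries. Returns the count. */
static size_t
pick_known_hv_props(const char *const *props, size_t nprops,
                    const char **out)
{
    size_t kept = 0;

    for (size_t i = 0; i < nprops; i++) {
        const char *name;

        if (strncmp(props[i], "hv-", 3) != 0)
            continue;

        name = props[i] + 3;
        for (size_t j = 0; j < sizeof(known_hv) / sizeof(known_hv[0]); j++) {
            if (strcmp(name, known_hv[j]) == 0) {
                out[kept++] = name;
                break;
            }
        }
    }

    return kept;
}
```

Unknown "hv-" properties are silently dropped in this sketch; whether that is the right policy is a separate decision.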
You may notice that none of our domaincapsdata tests shows any
enlightenment. This is because the test works by parsing the
corresponding qemucapabilitiesdata/caps_*.xml file and none of
these store the full host CPU expansion (hostCPU.fullQEMU)
because that is a runtime piece of information and is not formatted
into virQEMUCaps XML.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1717611
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now that we have qemuMonitorGetCPUModelExpansion() aware of
Hyper-V Enlightenments, we can start querying it. Two conditions
need to be met:
1) KVM is in use,
2) Arch is either x86 or arm.
It may look like modifying the first call to
qemuMonitorGetCPUModelExpansion() inside of
virQEMUCapsProbeQMPHostCPU() would be sufficient but it is not.
We really need to ask QEMU for full expansion and the first call
does not guarantee that.
For the test data, I've just copied whatever
'query-cpu-model-expansion' returned earlier, therefore there are
no hv-* props. But that's okay - the full expansion is not stored
in cache (and thus not formatted in
tests/qemucapabilitiesdata/caps_*.replies files either). This is
a purely runtime thing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This continues and finishes propagation of the @hv_passthrough
argument started in the previous commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Apart from setting @migratable prop to the
query-cpu-model-expansion command, we will need @hv-passthrough
so that we can query for expansion of Hyper-V Enlightenments
supported on the current host. The idea is to run:
  {
    "execute": "query-cpu-model-expansion",
    "arguments": {
      "type": "full",
      "model": {
        "name": "host",
        "props": {
          "hv-passthrough": true
        }
      }
    }
  }
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The virDomainCapsEnumFormat() function does not return anything
but zero and none of its callers is interested in the failure
anyways. Switch its return type from integer to void.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We are formatting <enum/> element and its children using
virBufferAddLit(), virBufferAsprintf(), virBufferAdjustIndent(),
etc. We can avoid that by switching to
virXMLFormatElement().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In a recent commit, when ditching virXPathULong() the parsing of
<selfvers/> was changed. But it was changed to virXMLPropUInt()
which is not correct because the value we're interested in is not
in an attribute but in the element itself.
Fixes: a3c7426839
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Since we really only need to handle key skipping in the top level object
the caller doesn't at this point even pass it to the array formatting
helper function. Remove the unused argument.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Skipping of a specific key is needed only for the top level object to
specially handle the object type. We must not pass it to any recursed
printing of nested objects as skipping keys there might be surprising
and also is unhandlable later when formatting the commandline.
Until now this did not pose a problem but was discovered when adding a
new netdev backend which has a nested config object which also has the
'type' key which was being skipped.
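A simplified sketch of the intended behaviour follows. The struct node type and format_object() are hypothetical stand-ins for the real JSON-to-commandline formatter; the point is only that the key to skip is honoured at the top level and never forwarded into the recursion.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified object model: a node is either a scalar
 * (value != NULL) or a nested object (children/nchildren). */
struct node {
    const char *key;
    const char *value;
    const struct node *children;
    size_t nchildren;
};

/* Appends "key=value" pairs to 'buf', which must already hold a
 * NUL-terminated string shorter than 'buflen'. 'skip_key' is only
 * honoured for the object passed in by the caller. */
static void
format_object(const struct node *items, size_t n, const char *prefix,
              const char *skip_key, char *buf, size_t buflen)
{
    for (size_t i = 0; i < n; i++) {
        char name[128];

        if (skip_key && strcmp(items[i].key, skip_key) == 0)
            continue;

        snprintf(name, sizeof(name), "%s%s%s",
                 prefix ? prefix : "", prefix ? "." : "", items[i].key);

        if (items[i].value) {
            size_t used = strlen(buf);

            snprintf(buf + used, buflen - used, "%s%s=%s",
                     used ? "," : "", name, items[i].value);
        } else {
            /* Nested object: deliberately pass NULL so that a nested
             * 'type' key is formatted like any other key. */
            format_object(items[i].children, items[i].nchildren,
                          name, NULL, buf, buflen);
        }
    }
}
```

Called on a top-level object with skip_key set to "type", the sketch drops only the outer 'type' and keeps the one inside the nested object.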
Modern usage will prefer JSON directly but fix the commandline generator
to prevent surprises.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The @hash variable inside of virQEMUCapsProbeQMPHostCPU() is used
only within a block, but declared at the beginning of the
function. Bring the variable declaration into the said block.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
There's nothing qemu specific about
qemuDomainCapsFeatureFormatSimple() and in fact, the function
lives in a hypervisor-agnostic location and thus mustn't have the qemu
prefix.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
After previous cleanup this function is no longer used and thus
can be dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When starting swtpm binary, the qemuSecurityStartTPMEmulator() is
called which sets seclabel on the TPM state and then uses
qemuSecurityCommandRun() to execute the swtpm binary with proper
seclabel. Well, the aim is to ditch
qemuSecurityStartTPMEmulator() because it entangles two distinct
operations. Just call functions for them separately.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
If swtpm binary fails to start after successful exec() (e.g. it
fails to initialize itself), the seclabels set in
qemuSecurityStartTPMEmulator() are not restored. This is due to
a missing qemuSecurityRestoreTPMLabels() call in the error path.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now that we have qemuSecurityRestoreTPMLabels() we might as well
have qemuSecuritySetTPMLabels(). The aim here is to remove
qemuSecurityStartTPMEmulator() which couples two separate things
into a single function call.
Therefore, introduce qemuSecuritySetTPMLabels() which only
sets seclabels on the TPM state.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The qemuSecurityCleanupTPMEmulator() function calls
virSecurityManagerRestoreTPMLabels() and thus the proper name is
qemuSecurityRestoreTPMLabels(). Rename it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Currently, qemuSecurityCleanupTPMEmulator() returns nothing which
means a caller (well, there's only one - qemuExtTPMStop()) can't
produce a warning when restoring seclabels on TPM state failed.
True, qemuSecurityCleanupTPMEmulator() does report a warning
itself, but only in one specific error path.
Make the function return an integer, just like the rest of
qemuSecurity*Restore() functions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
qemu is about to deprecate the '-no-hpet' option in favor of configuring
the timer via '-machine'.
Use the QEMU_CAPS_MACHINE_HPET capability to switch to the new syntax
and mask out the old QEMU_CAPS_NO_HPET capability at the same time to
prevent using the old syntax.
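The choice between the two syntaxes can be illustrated roughly like this. format_hpet_args() and its boolean parameters are hypothetical; in the real generator the hpet property is presumably one of several -machine properties rather than a separate argument.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: writes the HPET-related argument into 'buf'
 * (buflen must be > 0) and returns the snprintf() result, or 0 when
 * nothing needs to be emitted.
 * 'machine_hpet_cap' stands in for QEMU_CAPS_MACHINE_HPET. */
static int
format_hpet_args(bool machine_hpet_cap, bool hpet_enabled,
                 char *buf, size_t buflen)
{
    /* New syntax: always state the timer setting explicitly. */
    if (machine_hpet_cap)
        return snprintf(buf, buflen, "-machine hpet=%s",
                        hpet_enabled ? "on" : "off");

    /* Old syntax: only the "disabled" state has an option. */
    if (!hpet_enabled)
        return snprintf(buf, buflen, "-no-hpet");

    buf[0] = '\0';
    return 0;
}
```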
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
The capability represents that qemu accepts the configuration of the
HPET timer via -machine hpet=on/off rather than the
soon-to-be-deprecated '-no-hpet' option.
The capability is detected from 'query-command-line-options' which
recently added the 'hpet' option.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
That way it actually fits with what the condition checks for.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Our secret driver divides secrets into two groups: ephemeral
(stored only in memory) and persistent (stored on disk). Now, the
aim of ephemeral secrets is to define them shortly before being
used and then undefine them. But 'shortly before being used' is a
very vague time frame. And since we default to socket activation
and thus pass '--timeout 120' to every daemon, it may happen that
a just-defined ephemeral secret is gone along with the virtsecretd.
This is no problem for persistent secrets as their definition
(and value) is restored when the virtsecretd starts again, but
ephemeral secrets can't be restored.
Therefore, we could view ephemeral secrets as active objects that
the daemon manages and thus inhibit automatic shutdown (just like
hypervisor daemons do when a guest is running).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Xen 4.17 has strict parsing of 'soundhw' option that allows only
specific values (instead of passing through any value directly to
qemu's -soundhw option, it uses -device now). For 'intel-hda' audio
device, it requires "hda" string. "hda" works with older libxl too.
Other supported models are the same as in libvirt XML.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Xen supports only a subset of libvirt's sound devices, and starting with
Xen 4.17 it is enforced by libxl. Verify it early.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Automatically free 'sec' and remove the 'cleanup' section and 'ret'
variables.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virStorageBackendRBDRADOSConfSet' logs its arguments but it's also
used to set the RBD secret/key.
All the security theatre with securely erasing the string we do to fetch
the secret would be quite pointless if we logged it, so introduce
virStorageBackendRBDRADOSConfSetQuiet and use it to avoid logging the
password.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The initialization vector is not optional, thus we also don't need to
check whether the caller passed it in. Additionally we can use C99
initializers for the gnutls_datum_t structs.
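For illustration, a C99 designated initializer for such a struct might look as below. To keep the sketch self-contained it uses struct datum, a stand-in that mirrors the two members of gnutls_datum_t (a data pointer and a size); datum_from() is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for gnutls_datum_t so the sketch builds without GnuTLS. */
struct datum {
    unsigned char *data;
    unsigned int size;
};

/* Hypothetical helper: wraps an existing buffer without copying it. */
static struct datum
datum_from(uint8_t *buf, size_t len)
{
    /* Designated initializer: members are named, so nothing has to
     * be assigned field by field afterwards. */
    struct datum d = { .data = buf, .size = (unsigned int)len };

    return d;
}
```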
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'gnutls_datum_t' simply holds pointers to the encryption key and its
length. There's absolutely no point in securely erasing that.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduce a new backend type 'external' for connecting to a swtpm daemon
not managed by libvirtd.
Mostly in one commit, thanks to -Wswitch and the way we generate
capabilities.
https://bugzilla.redhat.com/show_bug.cgi?id=2063723
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In the past, the preferred policy
(VIR_DOMAIN_NUMATUNE_MEM_PREFERRED) required exactly one (host)
NUMA node. This made sense because:
1) the libnuma API - numa_set_preferred() allowed exactly one
node, because
2) corresponding kernel syscall (__NR_set_mempolicy) accepted
exactly one node (for MPOL_PREFERRED mode).
But things have changed since then. Firstly, the kernel introduced
the new MPOL_PREFERRED_MANY mode (v5.15-rc1~107^2~21) which was then
exposed in libnuma as numa_set_preferred_many() (v2.0.15~24).
Fortunately, libnuma also exposes numa_has_preferred_many() which
returns whether the kernel has support for the new mode (1) or
not (0).
Putting this all together, we can lift our check when the kernel
and libnuma are sufficiently new.
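The relaxed validation could be sketched as follows. preferred_nodeset_ok() is a hypothetical helper; its 'has_preferred_many' argument is assumed to carry the result of numa_has_preferred_many(), which is passed in here so that the sketch builds without libnuma.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: decides whether a nodeset with 'nnodes' host
 * NUMA nodes is acceptable for the preferred policy. */
static bool
preferred_nodeset_ok(size_t nnodes, bool has_preferred_many)
{
    /* A single node works with plain MPOL_PREFERRED; more than one
     * node needs MPOL_PREFERRED_MANY support. */
    return nnodes <= 1 || has_preferred_many;
}
```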
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2151064
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Although the qemuMigrationSrcPerformResume actually got called
indirectly via qemuMigrationSrcPerformNative and the recovery process
worked, wrong job phases were used for the "perform" phase, which could
cause issues when libvirt daemon crashed (or was otherwise restarted)
during post-copy recovery.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
It will need to be called from a place above its current definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When qemuDomainObjReleaseAsyncJob is called when the current async job
is already released we emit a rather useless warning which was implemented
to warn about releasing a job owned by another thread.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The function is called even if QEMU reports migration as
postcopy-paused, i.e., it's not migrating anymore. And while changing
the warning, we can drop the part about unattended migration to make the
warning shorter and consistent with qemuMigrationSrcPostcopyFailed.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
There are some cases when the internal state of disks can change
without qemu sending events about it (e.g. a disk can close
during reset). In case this happens, we should emit an event
about the modified disk.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1824722#c20
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
While only a couple of the message types include sensitive data,
the overhead of calling secure erase is not noticeable enough
to worry about making the erasure selective per type. Thus it is
simplest to unconditionally securely erase the buffer.
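As a rough illustration of what "securely erase" means here, the sketch below zeroes a buffer through a volatile pointer so the compiler is discouraged from optimizing the stores away. secure_erase() is a hypothetical name, and this is one common portable technique, not necessarily what libvirt's helper does.

```c
#include <stddef.h>

/* Hypothetical helper: overwrite 'len' bytes at 'buf' with zeros.
 * The volatile qualifier keeps the stores from being treated as
 * dead writes when the buffer is freed right afterwards. */
static void
secure_erase(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    while (len--)
        *p++ = 0;
}
```

Calling this unconditionally on the whole message buffer before freeing it avoids having to track which message types carried secrets.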
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The buffer length refers to the allocated buffer memory size,
while the offset refers to how much of the buffer we have
read/written. After reading the message payload we must thus
update the latter.
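The distinction can be shown with a simplified message buffer. struct msg and msg_append() are hypothetical and only loosely modelled on the description above: the length stays fixed at the allocated size, while the offset advances with every chunk of payload.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified message buffer. */
struct msg {
    char *buffer;
    size_t bufferLength;   /* allocated size of 'buffer' */
    size_t bufferOffset;   /* bytes read/written so far */
};

/* Copies 'n' bytes of payload into the message and advances the
 * offset. Returns -1 if the payload would not fit. */
static int
msg_append(struct msg *m, const void *src, size_t n)
{
    if (n > m->bufferLength - m->bufferOffset)
        return -1;

    memcpy(m->buffer + m->bufferOffset, src, n);

    /* Update the offset, not the length: the allocation did not
     * change, only how much of it is in use. */
    m->bufferOffset += n;
    return 0;
}
```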
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This is available on at least FreeBSD and glibc >= 2.25.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The BPF_CGROUP_DEVICE constant was introduced to Linux in
commit ebc614f687369f9df99828572b1d85a7c2de3d92
Author: Roman Gushchin <roman.gushchin@linux.dev>
Date: Sun Nov 5 08:15:32 2017 -0500
bpf, cgroup: implement eBPF-based device controller for cgroup v2
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The BPF_PROG_QUERY constant was introduced to Linux in
commit defd9c476fa6b01b4eb5450452bfd202138decb7
Author: Alexei Starovoitov <ast@kernel.org>
Date: Mon Oct 2 22:50:26 2017 -0700
libbpf: sync bpf.h
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The VHOST_VSOCK_SET_GUEST_CID constant was introduced to Linux in
commit 433fc58e6bf2c8bd97e57153ed28e64fd78207b8
Author: Asias He <asias@redhat.com>
Date: Thu Jul 28 15:36:34 2016 +0100
VSOCK: Introduce vhost_vsock.ko
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The linux/magic.h header has existed since
commit e18fa700c9a31360bc8f193aa543b7ef7b39a06b
Author: Jeff Garzik <jeff@garzik.org>
Date: Sun Sep 24 11:13:19 2006 -0400
Move several *_SUPER_MAGIC symbols to include/linux/magic.h.
This is old enough that all our supported platforms can be assumed
to have this header.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The DEVLINK_CMD_ESWITCH_GET constant was introduced to Linux in
commit adf200f31c000d707e4afe238ed1d1199e0cce7c
Author: Jiri Pirko <jiri@mellanox.com>
Date: Thu Feb 9 15:54:33 2017 +0100
devlink: fix the name of eswitch commands
This is old enough that all our supported platforms can be assumed
to have this feature.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>