This enables support for running QEMU embedded to the calling
application process using a URI:
qemu:///embed?root=/some/path
Note that it is important to keep the path reasonably short to
avoid risk of hitting the limit on UNIX socket path names
which is 108 characters.
When using the embedded mode with a root=/var/tmp/embed, the
driver will use the following paths:
logDir: /var/tmp/embed/log/qemu
swtpmLogDir: /var/tmp/embed/log/swtpm
configBaseDir: /var/tmp/embed/etc/qemu
stateDir: /var/tmp/embed/run/qemu
swtpmStateDir: /var/tmp/embed/run/swtpm
cacheDir: /var/tmp/embed/cache/qemu
libDir: /var/tmp/embed/lib/qemu
swtpmStorageDir: /var/tmp/embed/lib/swtpm
defaultTLSx509certdir: /var/tmp/embed/etc/pki/qemu
These are identical whether the embedded driver is privileged
or unprivileged.
This compares with the system instance which uses
logDir: /var/log/libvirt/qemu
swtpmLogDir: /var/log/swtpm/libvirt/qemu
configBaseDir: /etc/libvirt/qemu
stateDir: /run/libvirt/qemu
swtpmStateDir: /run/libvirt/qemu/swtpm
cacheDir: /var/cache/libvirt/qemu
libDir: /var/lib/libvirt/qemu
swtpmStorageDir: /var/lib/libvirt/swtpm
defaultTLSx509certdir: /etc/pki/qemu
At this time all features present in the QEMU driver are available when
running in embedded mode, availability matching whether the embedded
driver is privileged or unprivileged.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The function virSecretGetSecretString calls into secret driver and is
used from other hypervisors drivers and as such makes more sense in
util.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Libvirt's original atomic ops impls were largely copied
from GLib's code at the time. The only API difference
was that libvirt's virAtomicIntInc() would return a
value, but g_atomic_int_inc was void. We thus use
g_atomic_int_add(v, 1) instead, though this means
virAtomicIntInc() now returns the original value,
instead of the new value.
This rewrites libvirt's impl in terms of g_atomic_int*
as a short term conversion. The key motivation was to
quickly eliminate use of GNULIB's verify_expr() macro
which is not a direct match for G_STATIC_ASSERT_EXPR.
Long term all the callers should be updated to use
g_atomic_int* directly.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When pause-before-switchover QEMU capability is enabled, we get STOP
event before MIGRATION event with postcopy-active state. To properly
handle post-copy migration and emit correct events commit
v4.10.0-rc1-4-geca9d21e6c added a hack to
qemuProcessHandleMigrationStatus which translates the paused state
reason to VIR_DOMAIN_PAUSED_POSTCOPY and emits
VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY event when migration state changes
to post-copy.
However, the code was effective on both sides of migration resulting in
a confusing VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY event on the destination
host, where entering post-copy mode is already properly advertised by
VIR_DOMAIN_EVENT_RESUMED_POSTCOPY event.
https://bugzilla.redhat.com/show_bug.cgi?id=1791458
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function needs domain definition really, we don't need to
pass the whole domain object. This saves couple of dereferences
and characters esp. in more checks to come.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Commit d75f865fb9 caused a job-deadlock if
a VM is running the backup job and being destroyed as it removed the
cleanup of the async job type and there was nothing to clean up the
backup job.
Add an explicit cleanup of the backup job when destroying a VM.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
qemuDomainObjPrivateDataClear clears state which become invalid after VM
stopped running and the node name allocator belongs there.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
If we use fake reboot then domain goes thru running->shutdown->running
state changes with shutdown state only for short period of time. At
least this is implementation details leaking into API. And also there is
one real case when this is not convinient. I'm doing a backup with the
help of temporary block snapshot (with the help of qemu's API which is
used in the newly created libvirt's backup API). If guest is shutdowned
I want to continue to backup so I don't kill the process and domain is
in shutdown state. Later when backup is finished I want to destroy qemu
process. So I check if it is in shutdowned state and destroy it if it
is. Now if instead of shutdown domain got fake reboot then I can destroy
process in the middle of fake reboot process.
After shutdown event we also get stop event and now as domain state is
running it will be transitioned to paused state and back to running
later. Though this is not critical for the described case I guess it is
better not to leak these details to user too. So let's leave domain in
running state on stop event if fake reboot is in process.
Reconnection code handles this patch without modification. It detects
that qemu is not running due to shutdown and then calls qemuProcessShutdownOrReboot
which reboots as fake reboot flag is set.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
The 'cleanup' flag is doing no cleaup in this function. We can
remove it and return NULL on error or qemuBuildCommandLine().
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The g_auto*() changes made by the previous patches made a lot
of 'cleanup' labels obsolete. Let's remove them.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Change all feasible pointers to use g_autoptr().
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Change all feasible strings and scalar pointers to use g_autofree.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Now, that we have everything prepared, we can generate command
line for NVMe disks.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This reverts commit 7be5fe66cd.
This commit broke resctrl, because it missed the fact that the
virResctrlInfoGetCache() has side-effects causing it to actually
change the virResctrlInfo parameter, not merely get data from
it.
This code will need some refactoring before we can try separating
it from virCapabilities again.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When QEMU uid/gid is set to non-root this is pointless as if we just
used a regular setuid/setgid call, the process will have all its
capabilities cleared anyway by the kernel.
When QEMU uid/gid is set to root, this is almost (always?) never
what people actually want. People make QEMU run as root in order
to access some privileged resource that libvirt doesn't support
yet and this often requires capabilities. As a result they have
to go find the qemu.conf param to turn this off. This is not
viable for libguestfs - they want to control everything via the
XML security label to request running as root regardless of the
qemu.conf settings for user/group.
Clearing capabilities was implemented originally because there
was a proposal in Fedora to change permissions such that root,
with no capabilities would not be able to compromise the system.
ie a locked down root account. This never went anywhere though,
and as a result clearing capabilities when running as root does
not really get us any security benefit AFAICT. The root user
can easily do something like create a cronjob, which will then
faithfully be run with full capabilities, trivially bypassing
the restriction we place.
IOW, our clearing of capabilities is both useless from a security
POV, and breaks valid use cases when people need to run as root.
This removes the clear_emulator_capabilities configuration
option from qemu.conf, and always runs QEMU with capabilities
when root. The behaviour when non-root is unchanged.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We will want to use the async job infrastructure along with all the APIs
and event for the backup job so add the backup job as a new async job
type.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We always refresh the capabilities object when using virResctrlInfo
during process startup. This is undesirable overhead, because we can
just directly create a virResctrlInfo instead.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Avoid grabbing the whole virCapsPtr object when we only need the
host CPU information.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Avoid grabbing the whole virCapsPtr object when we only need the
NUMA information.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The NUMA cells are stored directly in the virCapsHostPtr
struct. This moves them into their own struct allowing
them to be stored independantly of the rest of the host
capabilities. The change is used as an excuse to switch
the representation to use a GPtrArray too.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Now that the domain XML APIs don't use virCapsPtr we can stop passing it
around many QEMU driver methods.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Our normal practice is for the object type to be the name prefix, and
the object instance be the first parameter passed in.
Rename these to virDomainObjSave and virDomainDefSave moving their
primary parameter to be the first one. Ensure that the xml options
are passed into both functions in prep for future work.
Finally enforce checking of the return type and mark all parameters
as non-NULL.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
As part of a goal to eliminate the need to use virCapsPtr for anything
other than the virConnectGetCapabilies() API impl, cache the host arch
against the QEMU driver struct and use that field directly.
In the tests we move virArchFromHost() globally in testutils.c so that
every test runs with a fixed default architecture reported.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
In v5.9.0-370-g8fa0374c5b I've tried to fix a bug by removing
some stale XATTRs in qemuProcessStop(). However, I forgot to
do nothing when the VIR_QEMU_PROCESS_STOP_NO_RELABEL flag was
specified.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We clear some capabilities here so the lockouts need to be
re-evaluated.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Do all post-processing of capabilities in qemuProcessPrepareQEMUCaps.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Move the post-processing of the QEMU_CAPS_CHARDEV_FD_PASS flag to the
new function.
The clearing of the capability is based on the presence of
VIR_QEMU_PROCESS_START_STANDALONE so we must also pass in the process
start flags.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Start aggregating all capability post-processing code in one place.
The comment was modified while moving it as it was mentioning floppies
which are no longer clearing the blockdev capability.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function is now used only in qemu_process.c so move it there and
name it 'qemuProcessPrepareQEMUCaps' which is more appropriate to what
it's doing.
The reworded comment now mentions that it will also post-process the
caps for VM startup.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The redetection was originally added in 43c01d3838 as a way to recover
from libvirtd upgrade from the time when we didn't persist the qemu
capabilities in the status XML. Also this the oldest supported qemu by
more than two years.
Even if somebody would have a running VM running at least qemu 1.5 with
such an old libvirt we certainly wouldn't do the right thing by
redetecting the capabilities and then trying to communicate with qemu.
For now it will be the best to just stop considering this scenario any
more and error out for such VM.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Commit c90fb5a828 added explicit use of the private copy of the qemu
capabilities to various places. The change to qemuProcessInit was bogus
though as at the point where we re-initiate the post parse callbacks
priv->qemuCaps is still NULL as we clear it after shutdown of the VM and
don't initiate it until a later point.
Using the value from priv->qemuCaps might mislead readers of the code
into thinking that something useful is being passed at that point so go
with an explicit NULL instead.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Block jobs may be members of async jobs so it makes more sense to
refresh block job state after we do steps for async job recovery.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This also isn't required (due to the vportprofile being stored in the
NetDef as a pointer rather than being directly contained), but it
seemed dishonest to not mark it as const (and thus permit users to
modify its contents)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
If user starts a blockcommit or a blockcopy then we modify access
for qemu on both images and leave it like that until the job
terminates. So far so good. Problem is, if user instead of
terminating the job (where we would modify the access again so
that the state before the job is restored) calls destroy on the
domain or if qemu dies whilst executing the block job. In this
case we don't ever clear the access we granted at the beginning.
To fix this, maybe a bit harsh approach is used, but it works:
after all labels were restored (that is after
qemuSecurityRestoreAllLabel() was called), we iterate over each
disk in the domain and remove XATTRs from the whole backing chain
and also from any file the disk is being mirrored to.
This would have been done at the time of pivot, but it isn't
because user decided to kill the domain instead. If we don't do
this and leave some XATTRs behind the domain might be unable to
start.
Also, secdriver can't do this because it doesn't know if there is
any job running. It's outside of its scope - the hypervisor
driver is responsible for calling secdriver's APIs.
Moreover, this is safe to call because we don't remember labels
for any member of a backing chain except of the top layer. But
that one was restored in qemuSecurityRestoreAllLabel() call done
earlier. Therefore, not only we don't remember labels (and thus
this is basically a NOP for other images in the backing chain) it
is also safe to call this when no blockjob was started in the
first place, or if some parts of the backing chain are shared
with some other domains - this is NOP, unless a block job is
active at the time of domain destroy.
https://bugzilla.redhat.com/show_bug.cgi?id=1741456#c19
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Install the convertor function which enables the internals that will use
-blockdev to make qemu open the firmware image and stop using -drive.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The functions return virDomainCapsCPUModelsPtr and thus they should be
called *CPUModels for consistency. Functions called *CPUDefinitions will
work on qemuMonitorCPUDefsPtr.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function would return a valid virDomainCapsCPUModelsPtr with empty
CPU models list if query-cpu-definitions exists in QEMU, but returns
GenericError meaning it's not in fact implemented. This behaviour is a
bit strange especially after such virDomainCapsCPUModels structure is
stored in capabilities XML and parsed back, which will result in NULL
virDomainCapsCPUModelsPtr rather than a structure containing nothing.
Let's just keep virDomainCapsCPUModelsPtr NULL if the QMP command is not
implemented and change the return value to int so that callers can
easily check for failure or success.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Some callers of virQEMUCapsGetCPUDefinitions will need to filter the
returned list of CPU models. Let's add the filtering parameters directly
to virQEMUCapsGetCPUDefinitions to avoid copying the CPU models list
twice.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rather than returning a direct pointer the list stored in qemuCaps the
function now creates a new copy of the CPU models list.
The main purpose of this seemingly useless change is to update callers
to free the result returned by virQEMUCapsGetCPUDefinitions because the
internals of this function will change significantly in the following
patches.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The @def variable holds pointer to the domain defintion, but is
set only somewhere in the middle of the function. This is
suboptimal.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Commit <f136b83139c63f20de0df3285d9e82df2fb97bfc> reworked process
affinity setting but did not take cgroups into account which introduced
an issue when starting VM with custom cpuset.cpus for the whole machine
group.
If the machine group is limited to some pCPUs libvirt should not try to
set a VM to run on all pCPUs as it will result in permission denied when
writing to cpuset.cpus.
To fix this the affinity has to be set separately from cgroups cpuset.
Resolves: <https://bugzilla.redhat.com/show_bug.cgi?id=1746517>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
As suggested by Cole, this patch uses the domain capabilities to
validate the supported video model types. This allows us to remove the
model type validation from qemu_process.c and qemu_domain.c and
consolidates it all in a single place that will automatically adjust
when new domain capabilities are added.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Continue consolidation of video device validation started in previous
patch.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
The goal is to move all of the video device validation to a single place
and use domain caps to validate the supported video device models. Since
qemuDomainDeviceDefValidateVideo() is called from
qemuProcessStartValidate(), these changes should not change anny
behavior.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
When a CPU definition wants to explicitly disable some features that are
unknown to QEMU, we can safely drop them from the definition before
starting QEMU. Naturally QEMU won't enable such features implicitly.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
In few places we have the following code pattern:
int ret;
... /* @ret is not accessed here */
ret = f(...);
return ret;
This pattern can be written less verbose:
...
return f(...);
This patch was generated with following coccinelle spatch:
@@
type T;
constant C;
expression f;
identifier ret;
@@
-T ret = C;
... when != ret
-ret = f;
-return ret;
+return f;
Afterwards I needed to fix a few places, e.g. comment in
virDomainNetIPParseXML() was removed too because coccinelle
thinks it refers to @ret while in fact it doesn't. Also in few
places it replaced @ret declaration with a few spaces instead of
removing the line. But nothing terribly wrong.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Replace all occurrences of
if (VIR_STRDUP(a, b) < 0)
/* effectively dead code */
with:
a = g_strdup(b);
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Replace all the occurrences of
ignore_value(VIR_STRDUP_QUIET(a, b));
with
a = g_strdup(b);
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
On musl libc "stderr" is a preprocessor macro whose expansion leads to
compilation errors:
In file included from qemu/qemu_process.c:66:
qemu/qemu_process.c: In function 'qemuProcessQMPFree':
qemu/qemu_process.c:8418:21: error: expected identifier before '(' token
VIR_FREE((proc->stderr));
^~~~~~
Prevent this by renaming the homonymous field in the _qemuProcessQMP
struct to "stdErr".
Signed-off-by: Carlos Santos <casantos@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Provide some consistency over error message variable name and usage
when saving error messages across possible other errors or possibility
of resetting of the last error.
Instead of virSaveLastError paired up with virSetError and virFreeError,
we should use the newer virErrorPreserveLast and virRestoreError.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that all the types using VIR_AUTOUNREF have a cleanup func defined
to virObjectUnref, use g_autoptr instead of VIR_AUTOUNREF.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Since commit 44e7f02915
util: rewrite auto cleanup macros to use glib's equivalent
VIR_AUTOPTR aliases to g_autoptr. Replace all of its use by the GLib
macro version.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Since commit 44e7f02915
util: rewrite auto cleanup macros to use glib's equivalent
VIR_AUTOFREE is just an alias for g_autofree. Use the GLib macros
directly instead of our custom aliases.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Also define the macro for building with GLib older than 2.60
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use G_GNUC_UNUSED from GLib instead of ATTRIBUTE_UNUSED.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In upcoming commits, virSecurityManagerSetAllLabel() will perform
rollback in case of failure by calling
virSecurityManagerRestoreAllLabel(). But in order to do that, the
former needs to have @migrated argument so that it can be passed
to the latter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
The usleep function was missing on older mingw versions, but we can rely
on it existing everywhere these days. It may only support times upto 1
second in duration though, so we'll prefer to use g_usleep instead.
The commandhelper program is not changed since that can't link to glib.
Fortunately it doesn't need to build on Windows platforms either.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Converting from virObject to GObject is reasonably straightforward,
as illustrated by this patch for virIdentity
In the header file
- Remove
typedef struct _virIdentity virIdentity
- Add
#define VIR_TYPE_IDENTITY virIdentity_get_type ()
G_DECLARE_FINAL_TYPE (virIdentity, vir_identity, VIR, IDENTITY, GObject);
Which provides the typedef we just removed, and class
declaration boilerplate and various other constants/macros.
In the source file
- Change 'virObject parent' to 'GObject parent' in the struct
- Remove the virClass variable and its initializing call
- Add
G_DEFINE_TYPE(virIdentity, vir_identity, G_TYPE_OBJECT)
which declares the instance & class constructor functions
- Add an impl of the instance & class constructors
wiring up the finalize method to point to our dispose impl
In all files
- Replace VIR_AUTOUNREF(virIdentityPtr) with g_autoptr(virIdentity)
- Replace virObjectRef/Unref with g_object_ref/unref. Note
the latter functions do *NOT* accept a NULL object where as
libvirt's do. If you replace g_object_unref with g_clear_object
it is NULL safe, but also clears the pointer.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When constructing QMP capabilities we allocate a dummy domain
object to pass to qemuMonitorOpen(). However, after 75dd595861
the function also expects domain definition to be allocated for
the domain object. The referenced commit already fixed
qemumonitortestutils.c but forgot to fix the other caller:
qemuProcessQMPConnectMonitor().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This device is a very simple framebuffer device supported by qemu that
is mostly intended to use as a boot framebuffer in conjunction with a
vgpu. However, there is also a standalone ramfb device that can be used
as a primary display device and is useful for e.g. aarch64 guests where
different memory mappings between the host and guest can prevent use of
other devices with framebuffers such as virtio-vga.
https://bugzilla.redhat.com/show_bug.cgi?id=1679680 describes the
issues in more detail.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
When the bochs display type was added, the capability was never checked.
Add that check in the same place as the other video device capability
checks.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
This reverts commit a5a777a8ba.
After previous commit the domain won't disappear while connecting
to monitor. There's no need to ref monitor config then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
When connecting to qemu's monitor the @vm object is unlocked.
This is justified - connecting may take a long time and we don't
want to wait with the domain object locked. However, just before
the domain object is locked again, the monitor's FD is registered
in the event loop. Therefore, there is a small window where the
event loop has a chance to call a handler for an event that
occurred on the monitor FD but vm is not initalized properly just
yet (i.e. priv->mon is not set). For instance, if there's an
incoming migration, qemu creates its socket but then fails to
initialize (for various reasons, I'm reproducing this by using
hugepages but leaving the HP pool empty) then the following may
happen:
1) qemuConnectMonitor() unlocks @vm
2) qemuMonitorOpen() connects to the monitor socket and by
calling qemuMonitorOpenInternal() which subsequently calls
qemuMonitorRegister() the event handler is installed
3) qemu fails to initialize and exit()-s, which closes the
monitor
4) The even loop sees EOF on the monitor and the control gets to
qemuProcessEventHandler() which locks @vm and calls
processMonitorEOFEvent() which then calls
qemuMonitorLastError(priv->mon). But priv->mon is not set just
yet.
5) qemuMonitorLastError() dereferences NULL pointer
The solution is to unlock the domain object for a shorter time
and most importantly, register event handler with domain object
locked so that any possible event processing is done only after
@vm's private data was properly initialized.
This issue is also mentioned in v4.2.0-99-ga5a777a8ba.
Since we are unlocking @vm and locking it back, another thread
might have destroyed the domain meanwhile. Therefore we have to
check if domain is still active, and we have to do it at the
same place where domain lock is acquired back, i.e. in
qemuMonitorOpen(). This creates a small problem for our test
suite which calls qemuMonitorOpen() directly and passes @vm which
has no definition. This makes virDomainObjIsActive() call crash.
Fortunately, allocating empty domain definition is sufficient.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
QEMU 2.11 for ppc64 changed all CPU model names to lower case. Since
libvirt can't change the model names for compatibility reasons, we need
to translate the matching lower case models to the names known by
libvirt.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Call qemuExtVhostUserGPUPrepareDomain() to fill the domain with the
location of the vhost-user binary to start.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Check qemu capability, and accept 3d acceleration. 3d acceleration
support is checked when looking for a suitable vhost-user helper.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
The fact that qemu is capable -netdev socket is not enough to
start a migratable domain. It also needs dbus-vmstate capability.
Since there are already some qemu releases which have
net-socket-dgram capability and don't have dbus-vmstate we need
to check for dbus-vmstate.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If managed='no', then the tap device must already exist, and setting
of MAC address and online status (IFF_UP) is skipped.
NB: we still set IFF_VNET_HDR and IFF_MULTI_QUEUE as appropriate,
because those bits must be properly set in the TUNSETIFF we use to set
the tap device name of the handle we've opened - if IFF_VNET_HDR has
not been set and we set it the request will be honored even when
running libvirtd unprivileged; if IFF_MULTI_QUEUE is requested to be
different than how it was created, that will result in an error from
the kernel. This means that you don't need to pay attention to
IFF_VNET_HDR when creating the tap devices, but you *do* need to set
IFF_MULTI_QUEUE if you're going to use multiple queues for your tap
device.
NB2: /dev/vhost-net normally has permissions 600, so it can't be
opened by an unprivileged process. This would normally cause a warning
message when using a virtio net device from an unprivileged
libvirtd. I've found that setting the permissions for /dev/vhost-net
permits unprivileged libvirtd to use vhost-net for virtio devices, but
have no idea what sort of security implications that has. I haven't
changed libvrit's code to avoid *attempting* to open /dev/vhost-net -
if you are concerned about the security of opening up permissions of
/dev/vhost-net (probably a good idea at least until we ask someone who
knows about the code) then add <driver name='qemu'/> to the interface
definition and you'll avoid the warning message.
Note that virNetDevTapCreate() is the correct function to call in the
case of an existing device, because the same ioctl() that creates a
new tap device will also open an existing tap device.
Resolves: https://bugzilla.redhat.com/1723367 (partially)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
In some places where virDomainObjListForEach() is called the
passed callback calls virDomainObjListRemoveLocked(). Well, this
is unsafe, because the former only grabs a read lock but the
latter modifies the list.
I've identified the following unsafe calls:
- qemuProcessReconnectAll()
- libxlReconnectDomains()
The rest seem to be safe.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When the network interface is of "user" type, and QEMU has the "-net
socket,fd=" datagram support, call qemuInterfacePrepareSlirp() to
probe and associate a slirp-helper with the interface.
The usage of automated slirp-helper can be prevented with
disableSlirp (in particular when resuming a
VM that didn't start with slirp-helper before).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If a slirp-helper is associated with a network interface,
prepare/start/stop the process via qemu-extdevice.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
pid filenames (from swtpm and other helpers from this series) are
based on VM shortname, which is derived from VM id. If the id is reset
to early, the state filenames will not be found.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Once QEMU is started, the qemuDomainLogContext is owned by it, and can
no longer be used from libvirt. Instead, use
qemuDomainLogAppendMessage() which will redirect the log.
This is not strictly necessary for swtpm, but the following patches
are going to reuse qemuExtDeviceLogCommand().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
With blockdev we must issue the block_set_io_throttle QMP command to
setup disk throttling as we currently can't do it with the 'throttle'
layer.
Unfortunately there's nothing we can do if it fails.
https://bugzilla.redhat.com/show_bug.cgi?id=1733163
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
QEMU-4.1 supports 'Direct Mode' for Hyper-V synthetic timers
(hv-stimer-direct CPU flag): Windows guests can request that timer
expiration notifications are delivered as normal interrupts (and not
VMBus messages). This is used by Hyper-V on KVM.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Since qemuDomainDefPostParse callback requires qemuCaps, we need to make
sure it gets the capabilities stored in the domain's private data if the
domain is running. Passing NULL may cause QEMU capabilities probing to
be triggered in case QEMU binary changed in the meantime. When this
happens while a running domain object is locked, QMP event delivered to
the domain before QEMU capabilities probing finishes will deadlock the
event loop.
This patch fixes all paths leading to virDomainDefPostParse.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Since qemuDomainDefPostParse callback requires qemuCaps, we need to make
sure it gets the capabilities stored in the domain's private data if the
domain is running. Passing NULL may cause QEMU capabilities probing to
be triggered in case QEMU binary changed in the meantime. When this
happens while a running domain object is locked, QMP event delivered to
the domain before QEMU capabilities probing finishes will deadlock the
event loop.
Several general functions from domain_conf.c were lazily passing NULL as
the parseOpaque pointer instead of letting their callers pass the right
data. This patch fixes all paths leading to virDomainDefCopy to do the
right thing.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Since qemuDomainDefPostParse callback requires qemuCaps, we need to make
sure it gets the capabilities stored in the domain's private data if the
domain is running. Passing NULL may cause QEMU capabilities probing to
be triggered in case QEMU binary changed in the meantime. When this
happens while a running domain object is locked, QMP event delivered to
the domain before QEMU capabilities probing finishes will deadlock the
event loop.
This patch fixes all paths leading to qemuDomainDefFormatBufInternal.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
'default monitor of an allocation' is defined as the resctrl
monitor group that created along with an resctrl allocation,
which is created by resctrl file system. If the monitor group
specified in domain configuration file is happened to be a
default monitor group of an allocation, then it is not necessary
to create monitor group since it is already created. But if
an monitor group is not an allocation default group, you
should create the group under folder
'/sys/fs/resctrl/mon_groups' and fill the vcpu PIDs to 'tasks'
file.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
hv-spinlocks is not a CPUID feature and should not be checked as such.
While starting a domain with hv-spinlocks enabled, we would report a
warning about unsupported hyperv spinlocks feature even though it was
set properly.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Originally the names of the hyperv CPU features were only used
internally for looking up their CPUID bits. So we used "__kvm_hv_"
prefix for them to make sure the names do not collide with normal CPU
features stored in our CPU map.
But with QEMU 4.1 we check which features were enabled or disabled by a
freshly started QEMU process using their names rather than their CPUID
bits (mostly because of MSR features). Thus we need to change our made
up internal names into the actual names used by QEMU. Most of the names
are only used with QEMU 4.1 and newer and the reset was introduced with
QEMU recently enough to already support spelling with "-". Thus we don't
need to define them as "hv_*" with a translation to "hv-*" for new QEMU.
Without this patch libvirt would mistakenly report all hyperv features
as unavailable and refuse to start any domain using them with QEMU 4.1.
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In case of an incoming migration we do not need to run swtpm_setup
with all the parameters but only want to get the benefit of it
creating a TPM state file for us that we can then label with an
SELinux label. The actual state will be overwritten by the in-
coming state. So we have to pass an indicator for incomingMigration
all the way to the command line parameter generation for swtpm_setup.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
If we are using -blockdev, then node names are always available
(because we set them). But when not using it, we have to scrape node
names from QMP, and want to do so as infrequently as possible. We
were scraping node names after reconnecting a new libvirtd to an
existing guest (see qemuProcessReconnect), and after any block job
that may have changed the set of node names we care about (legacy
block jobs), but forgot to scrape the names when first starting a
guest. Do so now in order to allow the checkpoint code to always have
access to a node name without having to repeat a node name scrape
itself.
Future patches may need to clean up qemuDomainSetBlockThreshold (if
node names are always available, then it doesn't need to repeat a
scrape) and/or hotplug and media changes (if the addition of new nodes
can result in a null node name, then scraping at that point in time
would be appropriate). But for now, this patch addresses only the
most common instance of a missing node name.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The block job event handler qemuProcessHandleBlockJob looks at the block
job data to see whether the job requires synchronous handling. Since the
block job event may arrive before we continue the job handling (if the
job has no data to copy) we could hit the state when the job is still
set as QEMU_BLOCKJOB_STATE_NEW (as we move it to the
QEMU_BLOCKJOB_STATE_RUNNING state only after returning from monitor).
If the event handler uses qemuBlockJobStartupFinalize it would
unregister and free the job. Thankfully this is not a big problem for
legacy blockjobs as we don't need much data for them but since we'd
re-instantiate the job data structure we'd report wrong job type for
active commit as qemu reports it as a regular commit job.
Fix it by not using qemuBlockJobStartupFinalize function in
qemuProcessHandleBlockJob as it is not starting the job anyways.
https://bugzilla.redhat.com/show_bug.cgi?id=1721375
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The PR manager is a property of the format layer in qemu so we need to
be able to track it also in the chains of orphaned block jobs.
Add a helper for qemu to look also into the blockjob state.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Refresh the state of the jobs and process any events that might have
happened while libvirt was not running.
The job state processing requires some care to figure out if a job
needs to be bumped.
For any invalid job try doing our best to cancel it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add support for handling the event either synchronously or
asynchronously using the event thread.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
With blockdev we'll need to use the JOB_STATUS_CHANGE so gate the old
events by the blockdev capability.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that block job data is stored in the status XML portion we need to
make sure that everything which changes the state also saves the status
XML. The job registering function is used while parsing the status XML
so in that case we need to skip the XML saving.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add the job structure to the table when instantiating a new job and
remove it when it terminates/fails.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Pass an xmlopt argument through all the needed network conf
functions, like is done for domain XML handling. No functional
change for now
Reviewed-by: Laine Stump <laine@laine.org>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Filter out the given capabilities and set domain taint if we've done so.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For testing purposes it's sometimes desired to be able to control the
presence of capabilities of qemu. This adds the possibility to do this
via the qemu namespace.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When connecting to virtlogd fails e.g. due to wrong libvirtd selinux
process label we'd report an utterly useless error message:
$ virsh start upstream
error: Failed to start domain upstream
error: Cannot recv data: Connection reset by peer
Use virLastErrorPrefixMessage in the correct place to give a better
sense of what's going on:
$ virsh start upstream
error: Failed to start domain upstream
error: can't connect to virtlogd: Cannot recv data: Connection reset by peer
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Without "unavailable-features" CPU property we cannot properly detect
whether a specific MSR feature we asked for (either explicitly or
implicitly via a CPU model) was disabled by QEMU for some reason.
Because this could break migration, snapshots, and save/restore
operaions, it's better to just forbid any use of MSR features with QEMU
which lacks "unavailable-features" CPU property.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Always assume JSON monitor was requested, since all the callers
pass true anyway.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
Now that we no longer support the HMP monitor, remove some dead code.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
Now that the virDomainQemuAttach API returns an error, we can remove the
unused qemuProcessAttach function as well, deleting the only user
that possibly could have requested to open a non-JSON monitor.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
If spawning qemu fails then we report an error and proceed to
writing status XML onto the disk. This is unnecessary as we are
sure that the domain is not running.
At the same time, if virPidFileReadPath() fails it returns
-errno. Use it in the error message. It may explain what went
wrong.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When updating guest CPU definition according to the vCPU actually
created by QEMU, we want to use the generic qemuMonitorGetGuestCPU to
get both CPUID and MSR features.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It was never implemented or used for anything else anyway. Mainly
because it uses CPUID features bits. The function is renamed as
qemuMonitorGetGuestCPUx86 to make this explicit.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Properly filter features which should not be passed to QEMU because they
were never supported by QEMU or they did nothing and QEMU dropped them.
Currently they are just silently ignored by the command line generator.
Let's make this process more visible and clean by dropping the features
from the domain's active definition in qemuProcessUpdateGuestCPU.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If libvirt receives DISCONNECTED event and prDaemonRunning is set
to false, and qemuDomainRemoveDiskDevice() is performing in the
meantime, then qemuDomainRemoveDiskDevice() will fail to remove
pr-helper object because prDaemonRunning is false. But removing
that check from qemuHotplugRemoveManagedPR() is not enough,
because after removing the object through monitor the
qemuProcessKillManagedPRDaemon() is called which contains the
same check. Thus the pr-helper process might be left behind.
Signed-off-by: Jie Wang <wangjie88@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The hash table returned by qemuMonitorGetAllBlockJobInfo is organized by
the frontend name (which skipps the 'drive-' prefix). While our code
properly matches the jobs to the disk, qemu needs the full job name
including the 'drive-' prefix to be able to identify jobs.
Fix this by adding an argument to qemuMonitorGetAllBlockJobInfo which
does not modify the job name before filling the hash.
This fixes a regression where users would not be able to cancel/pivot
block jobs after restarting libvirtd while a blockjob is running.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Commit 2f2254c7f4 attempted to fix a memory leak by ensuring
cpumapToSet is always a freshly allocated bitmap, but regrettably
introduced a NULL pointer access while doing so, because it called
virBitmapCopy() without allocating the destination bitmap first.
Solve the issue by using virBitmapNewCopy() instead.
Reported-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
We're using VIR_AUTOPTR() for everything now, plus the
cleanup section was not doing anything useful anyway.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In two out of three scenarios we are cleaning up properly after
ourselves, but commit 5f2212c062 has changed the remaining one
in a way that caused it to start leaking cpumapToSet.
Refactor the logic so that cpumapToSet is always a freshly
allocated bitmap that gets cleaned up automatically thanks to
VIR_AUTOPTR(); this also allows us to remove the hostcpumap
variable.
Reported-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Ever since the feature was introduced with commit 0f8e7ae33a,
it has contained a logic error in that it attempted to use a NUMA
node map where a CPU map was expected.
Because of that, guests using <numatune> might fail to start:
# virsh start guest
error: Failed to start domain guest
error: cannot set CPU affinity on process 40055: Invalid argument
This was particularly easy to trigger on POWER 8 machines, where
secondary threads always show up as offline in the host: having
<numatune>
<memory mode='strict' placement='static' nodeset='1'/>
</numatune>
in the guest configuration, for example, would result in libvirt
trying to set the process affinity so that it would prefer
running on CPU 1, but since that's a secondary thread and thus
shows up as offline, the operation would fail, and so would
starting the guest.
Use the newly introduced virNumaNodesetToCPUset() to convert the
NUMA node map to a CPU map, which in the example above would be
48,56,64,72,80,88 - a valid input for virProcessSetAffinity().
https://bugzilla.redhat.com/show_bug.cgi?id=1703661
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When migrating a domain with invtsc CPU feature enabled, the TSC
frequency of the destination host must match the frequency used when the
domain was started on the source host or the destination host has to
support TSC scaling.
If the frequencies do not match and the destination host does not
support TSC scaling, QEMU will fail to set the right TSC frequency when
starting vCPUs on the destination and thus migration will fail. However,
this is quite late since both host might have spent significant time
transferring memory and perhaps even storage data.
By adding the check to libvirt we can let migration fail before any data
starts to be sent over. If for some reason libvirt is unable to detect
the host's TSC frequency or scaling support, we'll just let QEMU try and
the migration will either succeed or fail later.
Luckily, we mandate TSC frequency to be explicitly set in the domain XML
to even allow migration of domains with invtsc. We can just check
whether the requested frequency is compatible with the current host
before starting QEMU.
https://bugzilla.redhat.com/show_bug.cgi?id=1641702
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
If the scheduler is set before vCPU0 cannot be moved into its cpu,cpuacct
cgroup. While it is not yet known whether this is a bug or not, it makes sense
for us to do that later as otherwise the scheduler would be inherited by vCPU
and I/O Threads even when they do not have any such setting specified.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
It's funny how this went unnoticed for such a long time. Long
story short, if a domain is configured with
VIR_DOMAIN_NUMATUNE_MEM_STRICT libvirt doesn't really honour
that. This is because of 7e72ac7878 after which libvirt allowed
qemu to allocate memory just anywhere and only after that it used
some magic involving cpuset.memory_migrate and cpuset.mems to
move the memory to desired NUMA nodes. This was done in order to
work around some KVM bug where KVM would fail if there wasn't a
DMA zone available on the NUMA node. Well, while the work around
might stopped libvirt tickling the KVM bug it also caused a bug
on libvirt side: if there is not enough memory on configured NUMA
node(s) then any attempt to start a domain must fail. Because of
the way we play with guest memory domains can start just happily.
The solution is to move the child we've just forked into emulator
cgroup, set up cpuset.mems and exec() qemu only after that.
This basically reverts 7e72ac7878 which was a workaround
for kernel bug. This bug was apparently fixed because I've tested
this successfully with recent kernel.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The call to resolve the actual network type will turn any NICs with
type=network into one of the other types. Thus there should be no need
to handle type=network in later switch() statements jumping off the
actual type.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The APIs for allocating/notifying/removing network ports just take
an internal domain interface struct right now. As a step towards
turning these into public facing APIs, add a virNetworkPtr argument
to all of them.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The port allocation APIs are currently called unconditionally for all
types of NIC, but (mostly) only do anything for NICs with type=network.
The exception is the port allocate API which does some validation even
for NICs with type!=network. Relying on this validation is flawed,
however, since the network driver may not even be installed. IOW virt
drivers must not delegate validation to the network driver for NICs
with type != network.
This change allows us to report errors when the virtual network driver
is not registered.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This helps in a scenarios where vCPUs run with a priority that is so high they
might starve the emulator thread. And it also fits with the rest of the
settings.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This does not cause a problem in usual scenarios thanks to us allowing
CAP_DAC_OVERRIDE for the qemu process, however in some scenarios this might be
an issue because the directory is created with mkdtemp(3) which explicitly
creates that with 0700 permissions and qemu running as non-root cannot access
that.
The scenarios include:
- Builds without CAPNG
- Running libvirtd in certain container configurations [1]
- and possibly others.
[1] https://github.com/kubevirt/kubevirt/pull/2181#issuecomment-481840304
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Since the STOP event handler can use the pausedReason as sent to
qemuProcessStopCPUs, we no longer need to send duplicate suspended
lifecycle events because we know what caused the stop along with extra
details. This processing allows us to also remove the duplicated state
change from qemuProcessStopCPUs.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Map is based on existing cases in code where we send suspended
event after changing domain state to paused.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Similar to commit [1] which saves and passes the running reason to
the RESUME event handler, during qemuProcessStopCPUs let's save and pass
the pause reason in the domain private data so that the STOP event
handler can use it.
[1] 5dab984ed : qemu: Pass running reason to RESUME event handler
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Now that the core of SnapshotObj is agnostic to snapshots and can be
shared with upcoming checkpoint code, it is time to rename the struct
and the functions specific to list operations. A later patch will
shuffle which file holds the common code. This is a fairly mechanical
patch.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Replace virDomainChrSourceDefFree with virObjectUnref.
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Use refcounting for priv->monConfig instead of asymmetric freeing.
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
qemuProcessQMPStart starts a QEMU process and monitor connection that
can be used by multiple functions possibly for multiple QMP commands.
The QMP exchange to exit capabilities negotiation mode and enter command
mode can only be performed once after the monitor connection is
established.
Move responsibility for entering QMP command mode into the
qemuProcessQMP code so multiple functions can issue QMP commands in
arbitrary orders.
This also simplifies the functions using the connection provided by
qemuProcessQMPStart to issue QMP commands.
Test code now needs to call qemuMonitorSetCapabilities to send the
message to switch to command mode because the test code does not use the
qemuProcessQMP command that internally calls qemuMonitorSetCapabilities.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Multiple QEMU processes for QMP commands can operate concurrently.
Use a unique directory under libDir for each QEMU process to avoid
pidfile and unix socket collision between processes.
The pid file name is changed from "capabilities.pidfile" to "qmp.pid"
because we no longer need to avoid a possible clash with a qemu domain
called "capabilities" now that the processes artifacts are stored in
their own unique temporary directories.
"Capabilities" was changed to "qmp" in the pid file name because these
processes are no longer specific to the capabilities usecase and are
more generic in terms of being used for any general purpose QMP message
exchanges with a QEMU process that is not associated with a domain.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Users qemuProcessQMP struct were always forced to call both
qemuProcessQMPStop and qemuProcessQMPFree when they are done with the
process. We can just call qemuProcessQMPStop from qemuProcessQMPFree and
let users call qemuProcessQMPFree only.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuProcessQMPNew is one of the public functions used to create and
manage a QEMU process for QMP command exchanges outside of domain
operations.
Add descriptive comment block, debug statement and make source
consistent with the cleanup / VIR_STEAL_PTR format used elsewhere.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The monitor config data is removed from the qemuProcessQMP struct.
The monitor config data can be initialized immediately before call to
qemuMonitorOpen and does not need to be maintained after the call
because qemuMonitorOpen copies any strings it needs.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Move code for setting paths and prepping file system from
qemuProcessQMPNew to qemuProcessQMPInit.
This keeps qemuProcessQMPNew limited to data structures and path
initialization is done in qemuProcessQMPInit.
The patch is a non-functional, cut / paste change, however goto is now
"cleanup" rather than "error".
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Store libDir path in the qemuProcessQMP struct in anticipation of moving
path construction code into qemuProcessQMPInit function.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All code related to QEMU monitor is moved from qemuProcessQMPNew and
qemuProcessQMPInit into qemuProcessQMPConnectMonitor.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is a replacement for qemuProcessQMPRun to make the name consistent
with qemuProcessStart. The original qemuProcessQMPRun function is
renamed as qemuProcessQMPLaunch and becomes one of the simpler functions
called from the main qemuProcessQMPStart entry point. The following
patches will move parts of the code in qemuProcessQMPLaunch to the other
functions (qemuProcessQMPInit and qemuProcessQMPConnectMonitor).
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Keep the pointer to QEMU stderr output in qemuProcessQMP struct instead
of requiring the caller to provide it (and free it).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
While qemuProcessQMPRun and virQEMUCapsInitQMPMonitor* functions called
from virQEMUCapsInit ignore some errors, the caller of virQEMUCapsInit
would report an error unless usedQMP is true anyway. And since usedQMP
can only be true if the probing code really succeeded (i.e., no errors
were ignored), we can just simplify the logic by not ignoring the errors
in the first place.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In new process code, move from model where qemuProcessQMP struct can be
used to activate a series of Qemu processes to model where one
qemuProcessQMP struct is used for one and only one Qemu process.
By allowing only one process activation per qemuProcessQMP struct, the
struct can safely store process outputs like status and stderr, without
being overwritten, until qemuProcessQMPFree is called.
By doing this, process outputs like status and stderr can remain stored
in the qemuProcessQMP struct without being overwritten by subsequent
process activations.
The forceTCG parameter (use / don't use KVM) will be passed when the
qemuProcessQMP struct is initialized since the qemuProcessQMP struct
won't be reused.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virQEMUCapsInitQMP now stops QEMU process in all execution paths,
before freeing the process structure.
The qemuProcessQMPStop function can be called multiple times without
problems... Won't attempt to stop processes and free resources multiple
times.
Follow the convention established in qemu_process of
1) alloc process structure
2) start process
3) use process
4) stop process
5) free process data structure
The process data structure persists after the process activation fails
or the process dies or is killed so stderr strings can be retrieved
until the process data structure is freed.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
s/qemuProcessQMPAbort/qemuProcessQMPStop/ applied to change function
name used to stop QEMU processes in process code moved from
qemu_capabilities.
No functionality change.
The new name, qemuProcessQMPStop, is consistent with the existing
function qemuProcessStop used to stop Domain processes.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
s/cmd/proc/ in process code imported from qemu_capabilities.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add the const qualifier on non modified strings
(string only copied inside qemuProcessQMPNew)
so that const strings can be used directly in calls to
qemuProcessQMPNew in future patches.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>