qemuProcessQMPStart starts a QEMU process and monitor connection that
can be used by multiple functions possibly for multiple QMP commands.
The QMP exchange to exit capabilities negotiation mode and enter command
mode can only be performed once after the monitor connection is
established.
Move responsibility for entering QMP command mode into the
qemuProcessQMP code so multiple functions can issue QMP commands in
arbitrary orders.
This also simplifies the functions using the connection provided by
qemuProcessQMPStart to issue QMP commands.
Test code now needs to call qemuMonitorSetCapabilities to send the
message to switch to command mode because the test code does not use the
qemuProcessQMP command that internally calls qemuMonitorSetCapabilities.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Multiple QEMU processes for QMP commands can operate concurrently.
Use a unique directory under libDir for each QEMU process to avoid
pidfile and unix socket collision between processes.
The pid file name is changed from "capabilities.pidfile" to "qmp.pid"
because we no longer need to avoid a possible clash with a qemu domain
called "capabilities" now that the processes artifacts are stored in
their own unique temporary directories.
"Capabilities" was changed to "qmp" in the pid file name because these
processes are no longer specific to the capabilities usecase and are
more generic in terms of being used for any general purpose QMP message
exchanges with a QEMU process that is not associated with a domain.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Users qemuProcessQMP struct were always forced to call both
qemuProcessQMPStop and qemuProcessQMPFree when they are done with the
process. We can just call qemuProcessQMPStop from qemuProcessQMPFree and
let users call qemuProcessQMPFree only.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuProcessQMPNew is one of the public functions used to create and
manage a QEMU process for QMP command exchanges outside of domain
operations.
Add descriptive comment block, debug statement and make source
consistent with the cleanup / VIR_STEAL_PTR format used elsewhere.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The monitor config data is removed from the qemuProcessQMP struct.
The monitor config data can be initialized immediately before call to
qemuMonitorOpen and does not need to be maintained after the call
because qemuMonitorOpen copies any strings it needs.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Move code for setting paths and prepping file system from
qemuProcessQMPNew to qemuProcessQMPInit.
This keeps qemuProcessQMPNew limited to data structures and path
initialization is done in qemuProcessQMPInit.
The patch is a non-functional, cut / paste change, however goto is now
"cleanup" rather than "error".
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Store libDir path in the qemuProcessQMP struct in anticipation of moving
path construction code into qemuProcessQMPInit function.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All code related to QEMU monitor is moved from qemuProcessQMPNew and
qemuProcessQMPInit into qemuProcessQMPConnectMonitor.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is a replacement for qemuProcessQMPRun to make the name consistent
with qemuProcessStart. The original qemuProcessQMPRun function is
renamed as qemuProcessQMPLaunch and becomes one of the simpler functions
called from the main qemuProcessQMPStart entry point. The following
patches will move parts of the code in qemuProcessQMPLaunch to the other
functions (qemuProcessQMPInit and qemuProcessQMPConnectMonitor).
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Keep the pointer to QEMU stderr output in qemuProcessQMP struct instead
of requiring the caller to provide it (and free it).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
While qemuProcessQMPRun and virQEMUCapsInitQMPMonitor* functions called
from virQEMUCapsInit ignore some errors, the caller of virQEMUCapsInit
would report an error unless usedQMP is true anyway. And since usedQMP
can only be true if the probing code really succeeded (i.e., no errors
were ignored), we can just simplify the logic by not ignoring the errors
in the first place.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In new process code, move from model where qemuProcessQMP struct can be
used to activate a series of Qemu processes to model where one
qemuProcessQMP struct is used for one and only one Qemu process.
By allowing only one process activation per qemuProcessQMP struct, the
struct can safely store process outputs like status and stderr, without
being overwritten, until qemuProcessQMPFree is called.
By doing this, process outputs like status and stderr can remain stored
in the qemuProcessQMP struct without being overwritten by subsequent
process activations.
The forceTCG parameter (use / don't use KVM) will be passed when the
qemuProcessQMP struct is initialized since the qemuProcessQMP struct
won't be reused.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virQEMUCapsInitQMP now stops QEMU process in all execution paths,
before freeing the process structure.
The qemuProcessQMPStop function can be called multiple times without
problems... Won't attempt to stop processes and free resources multiple
times.
Follow the convention established in qemu_process of
1) alloc process structure
2) start process
3) use process
4) stop process
5) free process data structure
The process data structure persists after the process activation fails
or the process dies or is killed so stderr strings can be retrieved
until the process data structure is freed.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
s/qemuProcessQMPAbort/qemuProcessQMPStop/ applied to change function
name used to stop QEMU processes in process code moved from
qemu_capabilities.
No functionality change.
The new name, qemuProcessQMPStop, is consistent with the existing
function qemuProcessStop used to stop Domain processes.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
s/cmd/proc/ in process code imported from qemu_capabilities.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add the const qualifier on non modified strings
(string only copied inside qemuProcessQMPNew)
so that const strings can be used directly in calls to
qemuProcessQMPNew in future patches.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
QEMU process code in qemu_capabilities.c is moved to qemu_process.c in
order to make the code usable outside the original capabilities use
cases.
The moved code activates and manages QEMU processes without establishing
a guest domain.
This patch is a straight cut/paste move between files.
Signed-off-by: Chris Venteicher <cventeic@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For normal starts (no incoming migration) the refresh of the QEMU
state must be done before the VCPUs getting started since otherwise
there might be a race condition between a possible shutdown of the
guest OS and the QEMU monitor queries.
This fixes "qemu: migration: Refresh device information after
transferring state" (93db7eea1b864).
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1503284
The way we currently start qemu from CPU affinity POV is as
follows:
1) the child process is set affinity to all online CPUs (unless
some vcpu pinning was given in the domain XML)
2) Once qemu is running, cpuset cgroup is configured taking
memory pinning into account
Problem is that we let qemu allocate its memory just anywhere in
1) and then rely in 2) to be able to move the memory to
configured NUMA nodes. This might not be always possible (e.g.
qemu might lock some parts of its memory) and is very suboptimal
(copying large memory between NUMA nodes takes significant amount
of time).
The solution is to set affinity to one of (in priority order):
- The CPUs associated with NUMA memory affinity mask
- The CPUs associated with emulator pinning
- All online host CPUs
Later (once QEMU has allocated its memory) we then change this
again to (again in priority order):
- The CPUs associated with emulator pinning
- The CPUs returned by numad
- The CPUs associated with vCPU pinning
- All online host CPUs
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Be more sensible when setting labels of the target of a
virDomainBlockCopy operation. Previously we'd relabel everything in case
it's a copy job even if there's no unlabelled backing chain. Since we
are also not sure whether the backing chain is shared we don't relabel
the chain on completion of the blockjob. This certainly won't play nice
with the image permission relabelling feature.
While this does not fix the case where the image is reused and has
backing chain it certainly sanitizes all the other cases. Later on it
will also allow to do the correct thing in cases where only one layer
was introduced.
The change is necessary as in case when -blockdev will be used we will
need to hotplug the backing chain and thus labeling needs to be setup in
advance and not only at the time of pivot. To avoid multiple code paths
move the labeling now.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When we need to detect a chain for a image which will become the new
source for a disk (e.g. after a disk media change or a blockjob) we'd
need to replace disk->src temporarily to do so.
Move the 'disksrc' temporary variable to an argument and adjust callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
If we validate that memballoon is NONE|VIRTIO at parse time,
we can drop similar checks elsewhere in the qemu driver
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Hanlde all the possible failure codes as per ACPI standard documented in
the function header.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1660410
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We forgot to document the specific fields for the 0x103 and 0x200
sources which are tied to device removal and device hotplug
respectively.
The value description is based on the ACPI 6.2A standard Table 6-207 and
Table 6-208. At the time of writing of this patch the standard can be
accessed e.g. at:
https://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Be generic instead of trying to enumerate all the involved
device types.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Currently the job name corresponds to the disk the job belongs to. For
jobs which will not correspond to disks we'll need to track the name
separately.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add a field tracking the current state of job so that it can be queried
later. Until now the job state e.g. that the job is _READY for
finalizing was tracked only for mirror jobs. Add tracking of state for
all jobs.
Similarly to 'qemuBlockJobType' this maps the existing states of the
blockjob from virConnectDomainEventBlockJobStatus to
'qemuBlockJobState' so that we can track some internal states as well.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We can properly track the job type when starting the job so that we
don't have to infer it later.
This patch also adds an enum of block job types specific to qemu
(qemuBlockjobType) which mirrors the public block job types
(virDomainBlockJobType) but allows for other types to be added later
which will not be public.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rather than directly modifying fields in the qemuBlockJobDataPtr
structure add a bunch of fields which allow to do the transitions.
This will help later when adding more complexity to the job handling.
APIs introduced in this patch are:
qemuBlockJobDiskNew - prepare for starting a new blockjob on a disk
qemuBlockJobDiskGetJob - get the block job data structure for a disk
For individual job state manipulation the following APIs are added:
qemuBlockJobStarted - Sets the job as started with qemu. Until that
the job can be cancelled without asking qemu.
qemuBlockJobStartupFinalize - finalize job startup. If the job was
started in qemu already, just releases
reference to the job object. Otherwise
clears everything as if the job was never
started.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The field is used to note the state the job has transitioned to while
handling the blockjob state change event. Rename the field so that it's
obvious that this is the new state and not the general state of the
blockjob.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Block job state was widely untracked by libvirt across restarts which
was allowed by a stateless block job finishing handler which discarded
disk state and redetected it. This is undesirable since we'll need to
track more information for individual blockjobs due to -blockdev
integration requirements.
In case of legacy blockjobs we can recover whether the job is present at
reconnect time by querying qemu. Adding tracking whether a job is
present will allow simplification of the non-shared-storage cancellation
code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'cleanup' label was accessed only from a jump to 'error'. Consolidate
everyting into 'cleanup'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Struct qemuDomainDiskPrivate was holding multiple variables connected to
a disk block job. Consolidate them into a new struct qemuBlockJobData.
This will also allow simpler extensions to the block job mechanisms.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In the previous commit we are using uint64_t for storing subnet
prefix and interface id that qemu reports in
RDMA_GID_STATUS_CHANGED event. We also report them in some debug
messages. This poses a problem because uint64_t can be UL or ULL
depending on the host architecture and hence we wouldn't know
which format to use. Switch to ULL which is big enough and
doesn't suffer from the issue.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This event is emitted on the monitor when a GID table in pvrdma device
is modified and the change needs to be propagate to the backend RDMA
device's GID table.
The control over the RDMA device's GID table is done by updating the
device's Ethernet function addresses.
Usually the first GID entry is determine by the MAC address, the second
by the first IPv6 address and the third by the IPv4 address. Other
entries can be added by adding more IP addresses. The opposite is the
same, i.e. whenever an address is removed, the corresponding GID entry
is removed.
The process is done by the network and RDMA stacks. Whenever an address
is added the ib_core driver is notified and calls the device driver's
add_gid function which in turn update the device.
To support this in pvrdma device we need to hook into the create_bind
and destroy_bind HW commands triggered by pvrdma driver in guest.
Whenever a changed is made to the pvrdma device's GID table a special
QMP messages is sent to be processed by libvirt to update the address of
the backend Ethernet device.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Before launching a SEV guest we take the base64-encoded guest owner's
data specified in launchSecurity and create files with the same content
under /var/lib/libvirt/qemu/<domain>. The reason for this is that we
need to pass these files on to QEMU which then uses them to communicate
with the SEV firmware, except when it doesn't have permissions to open
those files since we don't relabel them.
https://bugzilla.redhat.com/show_bug.cgi?id=1658112
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
Since SEV operates on a per domain basis, it's very likely that all
SEV launch-related data will be created under
/var/lib/libvirt/qemu/<domain_name>. Therefore, when calling into
qemuProcessSEVCreateFile we can assume @libDir as the directory prefix
rather than passing it explicitly.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
Because missing optional storage source is not error. The patch
address only local files. Fixing other cases is a bit ugly.
Below is example of error notice in log now:
error: virStorageFileReportBrokenChain:427 :
Cannot access storage file '/path/to/missing/optional/disk':
No such file or directory
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
The QEMU command line arguments are very long and currently all written
on a single line to /var/log/libvirt/qemu/$GUEST.log. This introduces
logic to add line breaks after every env variable and "-" optional
argument, and every positional argument. This will create a clearer log
file, which will in turn present better in bug reports when people cut +
paste from the log into a bug comment.
An example log file entry now looks like this:
2018-12-14 12:57:03.677+0000: starting up libvirt version: 5.0.0, qemu version: 3.0.0qemu-3.0.0-1.fc29, kernel: 4.19.5-300.fc29.x86_64, hostname: localhost.localdomain
LC_ALL=C \
PATH=/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin \
HOME=/home/berrange \
USER=berrange \
LOGNAME=berrange \
QEMU_AUDIO_DRV=none \
/usr/bin/qemu-system-ppc64 \
-name guest=guest,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/home/berrange/.config/libvirt/qemu/lib/domain-33-guest/master-key.aes \
-machine pseries-2.10,accel=tcg,usb=off,dump-guest-core=off \
-m 1024 \
-realtime mlock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid c8a74977-ab18-41d0-ae3b-4041c7fffbcd \
-display none \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=23,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc \
-no-shutdown \
-boot strict=on \
-device qemu-xhci,id=usb,bus=pci.0,addr=0x1 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x2 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
2018-12-14 12:57:03.730+0000: shutting down, reason=failed
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Require that all headers are guarded by a symbol named
LIBVIRT_$FILENAME
where $FILENAME is the uppercased filename, with all characters
outside a-z changed into '_'.
Note we do not use a leading __ because that is technically a
namespace reserved for the toolchain.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This introduces a syntax-check script that validates header files use a
common layout:
/*
...copyright header...
*/
<one blank line>
#ifndef SYMBOL
# define SYMBOL
....content....
#endif /* SYMBOL */
For any file ending priv.h, before the #ifndef, we will require a
guard to prevent bogus imports:
#ifndef SYMBOL_ALLOW
# error ....
#endif /* SYMBOL_ALLOW */
<one blank line>
The many mistakes this script identifies are then fixed.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Unlike with SPICE and SDL which use the <gl> subelement to enable OpenGL
acceleration, specifying egl-headless graphics in the XML has
essentially the same meaning, thus in case of egl-headless we don't have
a need for the 'enable' element attribute and we'll only be interested
in the 'rendernode' one further down the road.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Up until now, we formatted 'rendernode=' onto QEMU cmdline only if the
user specified it in the XML, otherwise we let QEMU do it for us. This
causes permission issues because by default the /dev/dri/renderDX
permissions are as follows:
crw-rw----. 1 root video
There's literally no reason why it shouldn't be libvirt picking the DRM
render node instead of QEMU, that way (and because we're using
namespaces by default), we can safely relabel the device within the
namespace.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Post-copy migration has been broken on the source since commit
v3.8.0-245-g32c29f10db which implemented support for
pause-before-switchover QEMU migration capability.
Even though the migration itself went well, the source did not really
know when it switched to the post-copy mode despite the messages logged
by MIGRATION event handler. As a result of this, the events emitted by
source libvirtd were not accurate and statistics of the completed
migration would cover only the pre-copy part of migration. Moreover, if
migration failed during the post-copy phase for some reason, the source
libvirtd would just happily resume the domain, which could lead to disk
corruption.
With the pause-before-switchover capability enabled, the order of events
emitted by QEMU changed:
pause-before-switchover
disabled enabled
MIGRATION, postcopy-active STOP
STOP MIGRATION, pre-switchover
MIGRATION, postcopy-active
The STOP even handler checks the migration status (postcopy-active) and
sets the domain state accordingly. Which is sufficient when
pause-before-switchover is disabled, but once we enable it, the
migration status is still active when we get STOP from QEMU. Thus the
domain state set in the STOP handler has to be corrected once we are
notified that migration changed to postcopy-active.
This results in two SUSPENDED events to be emitted by the source
libvirtd during post-copy migration. The first one with
VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED detail, while the second one reports
the corrected VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY detail. This is
inevitable because we don't know whether migration will eventually
switch to post-copy at the time we emit the first event.
https://bugzilla.redhat.com/show_bug.cgi?id=1647365
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For metadata locking we might need an extra fork() which given
latest attempts to do fewer fork()-s is suboptimal. Therefore,
there will be a qemu.conf knob to {en|dis}able this feature. But
since the feature is actually not metadata locking itself rather
than remembering of the original owner of the file this is named
as 'rememberOwner'. But patches for that feature are not even
posted yet so there is actually no qemu.conf entry in this patch
nor a way to enable this feature.
Even though this is effectively a dead code for now it is still
desired.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>