Once we called qemuDomainObjEnterRemote to talk to the destination
daemon during a peer to peer migration, the vm lock is released and we
only hold an async job. If the source domain dies at this point the
monitor EOF callback is allowed to do its job and (among other things)
clear all private data irrelevant for stopped domain. Thus when we call
qemuDomainObjExitRemote, the domain may already be gone and we should
avoid touching runtime private data (such as current job info).
In other words after acquiring the lock in qemuDomainObjExitRemote, we
need to check the domain is still alive. Unless we're doing offline
migration.
https://bugzilla.redhat.com/show_bug.cgi?id=1589730
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The point is to break QEMU_JOB_* into smaller pieces which
enables us to achieve higher throughput. For instance, if there
are two threads, one is trying to query something on qemu
monitor while the other is trying to query something on agent
monitor these two threads would serialize. There is not much
reason for that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Introduce guest agent specific job categories to allow threads to
run agent monitor specific jobs while normal monitor jobs can
also be running.
Alter _qemuDomainJobObj in order to duplicate certain fields that
will be used for guest agent specific tasks to increase
concurrency and throughput and reduce serialization.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The aim of this API is to allow the caller to do best effort.
Some functions can work even when acquiring the job fails (e.g.
qemuConnectGetAllDomainStats()). But what they can't bear is
delay if they have to wait up to 30 seconds for each domain that
is processing some other job.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
And replace all calls with virObjectEventStateQueue such that:
qemuDomainEventQueue(driver, event);
becomes:
virObjectEventStateQueue(driver->domainEventState, event);
And remove NULL checking from all callers.
Signed-off-by: Anya Harter <aharter@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Remove the call to the validating function from the function which sets
stuff up.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Convert the function to just prepare data for the disk. Callers need to
do the looping since there's more to do than just copy the data around.
The code path in qemuDomainPrepareDiskSource doesn't need to loop over
the chain yet, since there currently is no chain at this point. This
will be addressed later in the blockdev series where we will setup much
more stuff.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It's desired to keep the alias around to allow referencing of the secret
object used with qemu. Add set of APIs which will destroy all data
except the alias.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function checks whether the storage source requires authentication
secret setup. Rename it accordingly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This helper checks that the vm has the master key setup and libvirt
supports the given encryption algorithm.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Create a new vsock endpoint by opening /dev/vhost-vsock,
set the requested CID via ioctl (or assign a free one if auto='yes'),
pass the file descriptor to QEMU and build the command line.
https://bugzilla.redhat.com/show_bug.cgi?id=1291851
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Allow saving various aspects necessary to do NBD migration via blockdev
by storing a 'virStorageSource' in the disk private data meant to store
the NBD target of migration. Along with this add code to parse and
format it into the status XML.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Rather than always checking which path to use pre-assign it when
preparing storage source.
This reduces the need to pass 'vm' around too much. For later use the
path can be retrieved from the status XML.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Before we exec() qemu we have to spawn pr-helper processes for
all managed reservations (well, technically there can only one).
The only caveat there is that we should place the process into
the same namespace and cgroup as qemu (so that it shares the same
view of the system). But we can do that only after we've forked.
That means calling the setup function between fork() and exec().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
For command line we need two things:
1) -object pr-manager-helper,id=$alias,path=$socketPath
2) -drive file.pr-manager=$alias
In -object pr-manager-helper we tell qemu which socket to connect
to, then in -drive file-pr-manager we just reference the object
the drive in question should use.
For managed PR helper the alias is always "pr-helper0" and socket
path "${vm->priv->libDir}/pr-helper0.sock".
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Add helper which will map values of disk cache mode to the flags which
are accepted by various parts of the qemu block layer.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
We store the flags passed to the API which started the migration. Let's
use them instead of a separate bool to check if post-copy migration was
requested.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We store the flags passed to the API which started QEMU_ASYNC_JOB_DUMP
and we can use them to check whether a memory-only dump is running.
There's no need for a specific bool flag.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When an async job is running, we sometimes need to know how it was
started to distinguish between several types of the job, e.g., post-copy
vs. normal migration. So far we added a specific bool item to
qemuDomainJobObj for such cases, which doesn't scale very well and
storing such bools in status XML would be painful so we didn't do it.
A better approach is to store the flags passed to the API which started
the async job, which can be easily stored in status XML.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function checks whether QEMU supports TLS migration and stores the
original value of tls-creds parameter to priv->migTLSAlias. This is no
longer needed because we already have the original value stored in
priv->migParams.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Any job which touches migration parameters will first store their
original values (i.e., QEMU defaults) to qemuDomainJobObj to make it
easier to reset them back once the job finishes.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since the function is tightly connected to migration, it was renamed as
qemuMigrationCapsCheck and moved to qemu_migration_params.c.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that we assume QEMU_CAPS_NETDEV, the only thing left to check
is whether we need to use the legacy -net syntax because of
a non-conforming armchitecture.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
It will be necessary to initialize various aspects for the detected
members of the backing chain. Add a function that will handle it and
call it from qemuDomainPrepareDiskSource and qemuDomainDetermineDiskChain
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
During domain startup there are many places where we need to acquire
secrets. Currently code passes around a virConnectPtr, except in the
places where we pass in NULL. So there are a few codepaths where ability
to start guests using secrets will fail. Change to acquire a handle to
the secret driver when needed.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Since it may be possible that the state is unknown in some cases we
should store it as a tristate so that other code using it can determine
whether the state was updated.
Handle a DUMP_COMPLETED event processing the status, stats, and
error string. Use the @status in order to copy the error that
was generated whilst processing the @stats data. If an error was
provided by QEMU, then use that instead.
If there's no async job, we can just ignore the data.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Define the qemuMonitorDumpStats as a new job JobStatsType to handle
being able to get memory dump statistics. For now do nothing with
the new TYPE_MEMDUMP.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Add a TYPE_SAVEDUMP so that when coalescing stats for a save or
dump we don't needlessly try to get the mirror stats for a migration.
Other conditions can still use MIGRATION and SAVEDUMP interchangably
including usage of the @migStats field to fetch/store the data.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Convert the stats field in _qemuDomainJobInfo to be a union. This
will allow for the collection of various different types of stats
in the same field.
When starting the async job that will end up being used for stats,
set the @statsType value appropriately. The @mirrorStats are
special and are used with stats.mig in order to generate the
returned job stats for a migration.
Using the NONE should avoid the possibility that some random
async job would try to return stats for migration even though
a migration is not in progress.
For now a migration and a save job will use the same statsType
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Add and use qemuProcessEventFree for freeing qemuProcessEvents. This
is less error-prone as the compiler can help us make sure that for
every new enumeration value of qemuProcessEventType the
qemuProcessEventFree function has to be adapted.
All process*Event functions are *only* called by
qemuProcessHandleEvent and this function does the freeing by itself
with qemuProcessEventFree. This means that an explicit freeing of
processEvent->data is no longer required in each process*Event
handler.
The effectiveness of this change is also demonstrated by the fact that
it fixes a memory leak of the panic info data in
qemuProcessHandleGuestPanic.
Reported-by: Wang Dong <dongdwdw@linux.vnet.ibm.com>
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch pass event error up to the place where we can
use it. Error is passed only for sync blockjob event mode
as we can't use the error in async mode. In async mode we
just pass the event details to the client thru event API
but current blockjob event API can not carry extra parameter.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When doing block commit we need to allow write for members of the
backing chain so that we can commit the data into them.
qemuDomainDiskChainElementPrepare was used for this which since commit
786d8d91b4 calls qemuDomainNamespaceSetupDisk which has very adverse
side-effects, namely it relabels the nodes to the same label it has in
the main namespace. This was messing up permissions for the commit
operation since its touching various parts of a single backing chain.
Since we are are actually not introducing new images at that point add a
flag for qemuDomainDiskChainElementPrepare which will refrain from
calling to the namespace setup function.
Calls from qemuDomainSnapshotCreateSingleDiskActive and
qemuDomainBlockCopyCommon do introduce new members all calls from
qemuDomainBlockCommit do not, so the calls are anotated accordingly.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1506072
Right-aligning backslashes when defining macros or using complex
commands in Makefiles looks cute, but as soon as any changes is
required to the code you end up with either distractingly broken
alignment or unnecessarily big diffs where most of the changes
are just pushing all backslashes a few characters to one side.
Generated using
$ git grep -El '[[:blank:]][[:blank:]]\\$' | \
grep -E '*\.([chx]|am|mk)$$' | \
while read f; do \
sed -Ei 's/[[:blank:]]*[[:blank:]]\\$/ \\/g' "$f"; \
done
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
qemuDomainGetImageIds and qemuDomainStorageFileInit are helpful when
trying to access a virStorageSource from the qemu driver since they
figure out the correct uid and gid for the image.
When accessing members of a backing chain the permissions for the top
level would be used. To allow using specific permissions per backing
chain level but still allow inheritance from the parent of the chain we
need to add a new parameter to the image ID APIs.
This new capability enables a pause before device state serialization so
that we can finish all block jobs without racing with the end of the
migration. The pause is indicated by "pre-switchover" state. Once we're
done QEMU enters "device" migration state.
This patch just defines the new capability and QEMU migration states and
their mapping to our job states.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Each time we need to check whether a given migration capability is
supported by QEMU, we call query-migrate-capabilities QMP command and
lookup the capability in the returned list. Asking for the list of
supported capabilities once when we connect to QEMU and storing the
result in a bitmap is much better and we don't need to enter a monitor
just to check whether a migration capability is supported.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Since the encryption information can also be disk source specific
move it from qemuDomainDiskPrivate to qemuDomainStorageSourcePrivate
Since the last allocated element from qemuDomainDiskPrivate is
removed, that means we no longer need qemuDomainDiskPrivateDispose.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Since the secret information is really virStorageSource specific
piece of data, let's manage the privateData from there instead of
at the Disk level.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Add the object definition and helpers to store security-related private
data for virStorageSources.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
We need to send allowReboot in the migration cookie to ensure the same
behavior of the virDomainSetLifecycleAction() API on the destination.
Consider this scenario:
1. On the source the domain is started with:
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
2. User calls an API to set "destroy" for <on_reboot>:
<on_poweroff>destroy</on_poweroff>
<on_reboot>destroy</on_reboot>
<on_crash>destroy</on_crash>
3. The guest is migrated to a different host
4a. Without the allowReboot in the migration cookie the QEMU
process on destination would be started with -no-reboot
which would prevent using the virDomainSetLifecycleAction() API
for the rest of the guest lifetime.
4b. With the allowReboot in the migration cookie the QEMU process
on destination is started without -no-reboot like it was started
on the source host and the virDomainSetLifecycleAction() API
continues to work.
The following patch adds a QEMU implementation of the
virDomainSetLifecycleAction() API and that implementation disallows
using the API if all actions are set to "destroy" because we add
"-no-reboot" on the QEMU command line. Changing the lifecycle action
is in this case pointless because the QEMU process is always terminated.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This will be used later on in implementation of new API
virDomainSetLifecycleAction(). In order to use it, we need to store
the value in status XML to not lose the information if libvirtd is
restarted.
If some guest was started by old libvirt where it was not possible
to change the lifecycle action for running guest, we can safely
detect it based on the current actions from the status XML.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
So we have a syntax-check rule to catch all tab indents but it naturally
can't catch tab spacing, i.e. as a delimiter. This patch is a result of
running 'vim -en +retab +wq'
(using tabstop=8 softtabstop=4 shiftwidth=4 expandtab) on each file from
a list generated by the following:
find . -regextype gnu-awk \
-regex ".*\.(rng|syms|html|s?[ch]|py|pl|php(\.code)?)(\.in)?" \
| xargs git grep -lP "\t"
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When libvirt older than 3.9.0 reconnected to a running domain started by
old libvirt it could have messed up the expansion of host-model by
adding features QEMU does not support (such as cmt). Thus whenever we
reconnect to a running domain, revert to an active snapshot, or restore
a saved domain we need to check the guest CPU model and remove the
CPU features unknown to QEMU. We can do this because we know the domain
was successfully started, which means the CPU did not contain the
features when libvirt started the domain.
https://bugzilla.redhat.com/show_bug.cgi?id=1495171
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Introduce a new function to prepare domain disks which will also do the
volume source to actual disk source translation.
The 'pretend' condition is not transferred to the new location since it
does not help in writing tests and also no tests abuse it.
qemu 2.7.0 introduces multiqueue virtio-blk(commit 2f27059).
This patch introduces a new attribute "queues". An example of
the XML:
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' queues='4'/>
The corresponding QEMU command line:
-device virtio-blk-pci,scsi=off,num-queues=4,id=virtio-disk0
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Introduce a function to setup any TLS needs for a disk source.
If there's a configuration or other error setting up the disk source
for TLS, then cause the domain startup to fail.
For VxHS, follow the chardevTLS model where if the src->haveTLS hasn't
been configured, then take the system/global cfg->haveTLS setting for
the storage source *and* mark that we've done so via the tlsFromConfig
setting in storage source.
Next, if we are using TLS, then generate an alias into a virStorageSource
'tlsAlias' field that will be used to create the TLS object and added to
the disk object in order to link the two together for QEMU.
Signed-off-by: John Ferlan <jferlan@redhat.com>
VM private data is cleared when the VM is turned off and also when the
VM object is being freed. Some of the clearing code was duplicated.
Extract it to a separate function.
This also removes the now unnecessary function
qemuDomainClearPrivatePaths.
Since commit v2.2.0-199-g7ce711a30e libvirt stores an updated guest CPU
in domain's live definition and there's no need to update it every time
we want to format the definition. The commit itself tried to address
this in qemuDomainFormatXML, but forgot to fix qemuDomainDefFormatLive.
Not to mention that masking a previously set flag is only acceptable if
the flag was set by a public API user. Internally, libvirt should have
never set the flag in the first place.
https://bugzilla.redhat.com/show_bug.cgi?id=1485022
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Block job QMP commands with underscores rather than dashes were never
released in upstream qemu, (they were added, but modified in the same
release [1]), but a certain distro managed to backport the version in the
middle.
The change also slightly modified semantics for the abort command, which
made us have a lot of code which was only ever present in certain
downstream distros.
Clean the upstream code from the legacy cruft and support only the
upstream implementations.
[1] See qemu commit v1.0-2176-gdb58f9c060
Reviewed-by: Eric Blake <eblake@redhat.com>
No need to pass a @driver parameter since all that's done is deref
the @cfg especially since the only caller can just pass an already
referenced @cfg.
Also, looks like commit id '0298531b' at one time had a different
name for the API, so I took the liberty of fixing the comments too
since I would already be updating them for the @cfg variable.
In case of real migration (not migrating to file on save, dump etc)
migration info is not complete at time qemu finishes migration
in normal (non postcopy) mode. We need to update disks stats,
downtime info etc. Thus let's not expose this job status as
completed.
To archive this let's set status to 'qemu completed' after
qemu reports migration is finished. It is not visible as complete
job to clients. Cookie code on confirm phase will finally turn
job into completed. As we don't need more things to do when
migrating to file status is set to 'completed' as before
in this case.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When getting job info in case mirror does not reach ready phase
fetch mirror stats from qemu. Otherwise mirror stats are already
saved in current job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Instead of checking stat.status let's set status to migrating
as soon as migrate command is send (waiting for completion
is a good place too).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Let's introduce QEMU_DOMAIN_JOB_STATUS_POSTCOPY state for job.current->status
instead of checking job.current->stats.status. The latter can be changed
when fetching migration statistics. Moving state function from the variable
and leave only store function seems more managable.
This patch removes all state checking usage of stats except for
qemuDomainGetJobStatsInternal. This place will be handled separately.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch simply switches code from using VIR_DOMAIN_JOB_* to
introduced QEMU_DOMAIN_JOB_STATUS_*. Later this gives us freedom
to introduce states for postcopy and mirroring phases.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
At some places we either already have synchronous job or we just
released it. Also, some APIs might want to use this code without
having to release their job. Anyway, the job acquire code is
moved out to qemuDomainRemoveInactiveJob so that
qemuDomainRemoveInactive does just what it promises.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
It is more related to a domain as we might use it even when there is
no systemd and it does not use any dbus/systemd functions. In order
not to use code from conf/ in util/ pass machineName in cgroups code
as a parameter. That also fixes a leak of machineName in the lxc
driver and cleans up and de-duplicates some code.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
vcpu properties gathered from query-hotpluggable cpus need to be passed
back to qemu. As qemu did not use the node-id property until now and
libvirt forgot to pass it back properly (it was parsed but not passed
around) we did not honor this.
This patch adds node-id to the structures where it was missing and
passes it around as necessary.
The test data was generated with a VM with following config:
<numa>
<cell id='0' cpus='0,2,4,6' memory='512000' unit='KiB'/>
<cell id='1' cpus='1,3,5,7' memory='512000' unit='KiB'/>
</numa>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1452053
In QEMU driver we can use virtlogd as stdio handler for source backend
of char devices if current QEMU is new enough and it's enabled in
qemu.conf. We should store this information while starting a guest
because the config option may change while the guest is running.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When making ABI stability checks for an active domain, we need to make
sure we use the same migratable definition which virDomainGetXMLDesc
with the MIGRATABLE flag provides, otherwise the ABI check will fail.
This is implemented in the new qemuDomainCheckABIStability which takes a
domain object and generates the right migratable definition from it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
If QEMU is new enough and we have the live updated CPU definition in
either save or migration cookie, we can use it to enforce ABI. The
original guest CPU from domain XML will be stored in private data.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Since the domain XML saved in a snapshot or saved image uses the
original guest CPU definition but we still want to enforce ABI when
restoring the domain if libvirt and QEMU are new enough, we save the
live updated CPU definition in a save cookie.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The destination host may not be able to start a domain using the live
updated CPU definition because either libvirt or QEMU may not be new
enough. Thus we need to send the original guest CPU definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When starting a domain we update the guest CPU definition to match what
QEMU actually provided (since it is allowed to add or removed some
features unless check='full' is specified). Let's store the original CPU
in domain private data so that we can use it to provide a backward
compatible domain XML.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This patch implements a new save cookie object and callbacks for qemu
driver. The actual useful content will be added in the object later.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This will be used later when a save cookie will become part of the
snapshot XML using new driver specific parser/formatter functions.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1450349
Problem is, qemu fails to load guest memory image if these
attribute change on migration/restore from an image.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Not all async jobs are visible via virDomainGetJobStats (either they are
too fast or getting the stats is not allowed during the job), but
forcing all of them to advertise the operation is easier than hunting
the jobs for which fetching statistics is allowed. And we won't need to
think about this when we add support for getting stats for more jobs.
https://bugzilla.redhat.com/show_bug.cgi?id=1441563
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Introduce new wrapper functions without *Machine* in the function
name that take the whole virDomainDef structure as argument and
call the existing functions with *Machine* in the function name.
Change the arguments of existing functions to *machine* and *arch*
because they don't need the whole virDomainDef structure and they
could be used in places where we don't have virDomainDef.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This way qemuDomainLogContextRef() and qemuDomainLogContextFree() is
no longer needed. The naming qemuDomainLogContextFree() was also
somewhat misleading. Additionally, it's easier to turn
qemuDomainLogContext in a self-locking object.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Instead of having a separate function, we can simply return
zero from the existing qemuDomainGetMemLockLimitBytes() to
signal the caller that the memory locking limit doesn't need
to be set for the guest.
Having a single function instead of two makes it less likely
that we will use the wrong value, which is exactly what
happened when we started applying the limit that was meant
for VFIO-using guests to <memoryBacking><locked>-using
guests.
So far, the official support is for x86_64 arch guests so unless a
different device API than vfio-pci is available let's only turn on
support for PCI address assignment. Once a different device API is
introduced, we can enable another address type easily.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
To allow matching the node names gathered via 'query-named-block-nodes'
we need to query and then use the top level nodes from 'query-block'.
Add the data to the structure returned by qemuMonitorGetBlockInfo.
The code is currently simple, but if we later add node names, it will be
necessary to generate the names based on the node name. Add a helper so
that there's a central point to fix once we add self-generated node
names.
If the migration flags indicate this migration will be using TLS,
then set up the destination during the prepare phase once the target
domain has been started to add the TLS objects to perform the migration.
This will create at least an "-object tls-creds-x509,endpoint=server,..."
for TLS credentials and potentially an "-object secret,..." to handle the
passphrase response to access the TLS credentials. The alias/id used for
the TLS objects will contain "libvirt_migrate".
Once the objects are created, the code will set the "tls-creds" and
"tls-hostname" migration parameters to signify usage of TLS.
During the Finish phase we'll be sure to attempt to clear the
migration parameters and delete those objects (whether or not they
were created). We'll also perform the same reset during recovery
if we've reached FINISH3.
If the migration isn't using TLS, then be sure to check if the
migration parameters exist and clear them if so.
https://bugzilla.redhat.com/show_bug.cgi?id=1430634
If a qemu process has died, we get EOF on its monitor. At this
point, since qemu process was the only one running in the
namespace kernel has already cleaned the namespace up. Any
attempt of ours to enter it has to fail.
This really happened in the bug linked above. We've tried to
attach a disk to qemu and while we were in the monitor talking to
qemu it just died. Therefore our code tried to do some roll back
(e.g. deny the device in cgroups again, restore labels, etc.).
However, during the roll back (esp. when restoring labels) we
still thought that domain has a namespace. So we used secdriver's
transactions. This failed as there is no namespace to enter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The functions in virCommand() after fork() must be careful with regard
to accessing any mutexes that may have been locked by other threads in
the parent process. It is possible that another thread in the parent
process holds the lock for the virQEMUDriver while fork() is called.
This leads to a deadlock in the child process when
'virQEMUDriverGetConfig(driver)' is called and therefore the handshake
never completes between the child and the parent process. Ultimately
the virDomainObjectPtr will never be unlocked.
It gets much worse if the other thread of the parent process, that
holds the lock for the virQEMUDriver, tries to lock the already locked
virDomainObject. This leads to a completely unresponsive libvirtd.
It's possible to reproduce this case with calling 'virsh start XXX'
and 'virsh managedsave XXX' in a tight loop for multiple domains.
This commit fixes the deadlock in the same way as it is described in
commit 61b52d2e38.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
So far, qemuDomainGetHostdevPath has no knowledge of the reasong
it is called and thus reports /dev/vfio/vfio for every VFIO
backed device. This is suboptimal, as we want it to:
a) report /dev/vfio/vfio on every addition or domain startup
b) report /dev/vfio/vfio only on last VFIO device being unplugged
If a domain is being stopped then namespace and CGroup die with
it so no need to worry about that. I mean, even when a domain
that's exiting has more than one VFIO devices assigned to it,
this function does not clean /dev/vfio/vfio in CGroup nor in the
namespace. But that doesn't matter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
So far, we are allowing /dev/vfio/vfio in the devices cgroup
unconditionally (and creating it in the namespace too). Even if
domain has no hostdev assignment configured. This is potential
security hole. Therefore, when starting the domain (or
hotplugging a hostdev) create & allow /dev/vfio/vfio too (if
needed).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Since these two functions are nearly identical (with
qemuSetupHostdevCgroup actually calling virCgroupAllowDevicePath)
we can have one function call the other and thus de-duplicate
some code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The bare fact that mnt namespace is available is not enough for
us to allow/enable qemu namespaces feature. There are other
requirements: we must copy all the ACL & SELinux labels otherwise
we might grant access that is administratively forbidden or vice
versa.
At the same time, the check for namespace prerequisites is moved
from domain startup time to qemu.conf parser as it doesn't make
much sense to allow users to start misconfigured libvirt just to
find out they can't start a single domain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
These functions do not need to see the whole virDomainDiskDef.
Moreover, they are going to be called from places where we don't
have access to the full disk definition. Sticking with
virStorageSource is more than enough.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far we allow to set MTU for libvirt networks. However, not all
domain interfaces have to be plugged into a libvirt network and
even if they are, they might want to have a different MTU (e.g.
for testing purposes).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1405269
If a secret was not provided for what was determined to be a LUKS
encrypted disk (during virStorageFileGetMetadata processing when
called from qemuDomainDetermineDiskChain as a result of hotplug
attach qemuDomainAttachDeviceDiskLive), then do not attempt to
look it up (avoiding a libvirtd crash) and do not alter the format
to "luks" when adding the disk; otherwise, the device_add would
fail with a message such as:
"unable to execute QEMU command 'device_add': Property 'scsi-hd.drive'
can't find value 'drive-scsi0-0-0-0'"
because of assumptions that when the format=luks that libvirt would have
provided the secret to decrypt the volume.
Access to unlock the volume will thus be left to the application.
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Prime time. When it comes to spawning qemu process and
relabelling all the devices it's going to touch, there's inherent
race with other applications in the system (e.g. udev). Instead
of trying convincing udev to not touch libvirt managed devices,
we can create a separate mount namespace for the qemu, and mount
our own /dev there. Of course this puts more work onto us as we
have to maintain /dev files on each domain start and device
hot(un-)plug. On the other hand, this enhances security also.
From technical POV, on domain startup process the parent
(libvirtd) creates:
/var/lib/libvirt/qemu/$domain.dev
/var/lib/libvirt/qemu/$domain.devpts
The child (which is going to be qemu eventually) calls unshare()
to create new mount namespace. From now on anything that child
does is invisible to the parent. Child then mounts tmpfs on
$domain.dev (so that it still sees original /dev from the host)
and creates some devices (as explained in one of the previous
patches). The devices have to be created exactly as they are in
the host (including perms, seclabels, ACLs, ...). After that it
moves $domain.dev mount to /dev.
What's the $domain.devpts mount there for then you ask? QEMU can
create PTYs for some chardevs. And historically we exposed the
host ends in our domain XML allowing users to connect to them.
Therefore we must preserve devpts mount to be shared with the
host's one.
To make this patch as small as possible, creating of devices
configured for domain in question is implemented in next patches.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
qemuDomainObjExitAgent is unsafe.
First it accesses domain object without domain lock.
Second it uses outdated logic that goes back to commit 79533da1 of
year 2009 when code was quite different. (unref function
instead of unreferencing only unlocked and disposed object
in case of last reference and leaved unlocking to the caller otherwise).
Nowadays this logic may lead to disposing locked object
i guess.
Another problem is that the callers of qemuDomainObjEnterAgent
use domain object again (namely priv->agent) without domain lock.
This patch address these two problems.
qemuDomainGetAgent is dropped as unused.
Detect on reconnect to a running qemu VM whether the alias of a
hotpluggable memory device (dimm) does not match the dimm slot number
where it's connected to. This is necessary as qemu is actually
considering the alias as machine ABI used to connect the backend object
to the dimm device.
This will require us to keep them consistent so that we can reliably
restore them on migration. In some situations it was currently possible
to create a mismatched configuration and qemu would refuse to restore
the migration stream.
To avoid breaking existing VMs we'll need to keep the old algorithm
though.
Add the secret object so the 'passwordid=' can be added if the command line
if there's a secret defined in/on the host for TCP chardev TLS objects.
Preparation for the secret involves adding the secinfo to the char source
device prior to command line processing. There are multiple possibilities
for TCP chardev source backend usage.
Add test for at least a serial chardev as an example.
Adding a field to the domain's private vcpu object to hold the halted
state information.
Adding two functions in support of the halted state:
- qemuDomainGetVcpuHalted: retrieve the halted state from a
private vcpu object
- qemuDomainRefreshVcpuHalted: obtain the per-vcpu halted states
via qemu monitor and store the results in the private vcpu objects
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Hao QingFeng <haoqf@linux.vnet.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Add an optional "tls='yes|no'" attribute for a TCP chardev.
For QEMU, this will allow for disabling the host config setting of the
'chardev_tls' for a domain chardev channel by setting the value to "no" or
to attempt to use a host TLS environment when setting the value to "yes"
when the host config 'chardev_tls' setting is disabled, but a TLS environment
is configured via either the host config 'chardev_tls_x509_cert_dir' or
'default_tls_x509_cert_dir'
Signed-off-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Commit id '5f2a132786' should have placed the data in the host source
def structure since that's also used by smartcard, redirdev, and rng in
order to provide a backend tcp channel. The data in the private structure
will be necessary in order to provide the secret properly.
This also renames the previous names from "Chardev" to "ChrSource" for
the private data structures and API's
Modeled after the qemuDomainHostdevPrivatePtr (commit id '27726d8c'),
create a privateData pointer in the _virDomainChardevDef to allow storage
of private data for a hypervisor in order to at least temporarily store
secret data for usage during qemuBuildCommandLine.
NB: Since the qemu_parse_command (qemuParseCommandLine) code is not
expecting to restore the secret data, there's no need to add code
code to handle this new structure there.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This improves commit 706b5b6277 in a way that we check qemu capabilities
instead of what architecture we are running on to detect whether we can
use *virtio-vga* model or not. This is not a case only for arm/aarch64.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Sometimes adding a separate variable to access vm->privateData is not
necessary. Add a macro that will do the typecasting rather than having
to add a temp variable to force the compiler to typecast it.
Put it into qemuDomainPrepareShmemChardev() so it can be used later.
Also don't fill in the path unless the server option is enabled.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When migration fails, we need to poke QEMU monitor to check for a reason
of the failure. We did this using query-migrate QMP command, which is
not supposed to return any meaningful result on the destination side.
Thus if the monitor was still functional when we detected the migration
failure, parsing the answer from query-migrate always failed with the
following error message:
"info migration reply was missing return status"
This irrelevant message was then used as the reason for the migration
failure replacing any message we might have had.
Let's use harmless query-status for poking the monitor to make sure we
only get an error if the monitor connection is broken.
https://bugzilla.redhat.com/show_bug.cgi?id=1374613
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The vcpu order information is extracted only for hotpluggable entities,
while vcpu definitions belonging to the same hotpluggable entity need
to all share the order information.
We also can't overwrite it right away in the vcpu info detection code as
the order is necessary to add the hotpluggable vcpus enabled on boot in
the correct order.
The helper will store the order information in places where we are
certain that it's necessary.
Introduce a new migration cookie flag that will be used for any
configurations that are not compatible with libvirt that would not
support the specific vcpu hotplug approach. This will make sure that old
libvirt does not fail to reproduce the configuration correctly.
Similarly to devices the guest may allow unplug of the VCPU if libvirt
is down. To avoid problems, refresh the vcpu state on reconnect. Don't
mess with the vcpu state otherwise.
Now that the monitor code gathers all the data we can extract it to
relevant places either in the definition or the private data of a vcpu.
As only thread id is broken for TCG guests we may extract the rest of
the data and just skip assigning of the thread id. In case where qemu
would allow cpu hotplug in TCG mode this will make it work eventually.
Validate the presence of the thread id according to state of the vCPU
rather than just checking the vCPU count. Additionally put the new
validation code into a separate function so that the information
retrieval can be split from the validation.
Until now we simply errored out when the translation from pool+volume
failed. However, we should instead check whether that disk is needed or
not since there is an option for that.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1168453
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Dropping the caching of ccw address set.
The cached set is not required anymore, because the set is now being
recalculated from the domain definition on demand, so the cache
can be deleted.
Dropping the caching of virtio serial address set.
The cached set is not required anymore, because the set is now being
recalculated from the domain definition on demand, so the cache
can be deleted.
Credit goes to Cole Robinson.
Introduce a helper to help determine if a disk src could be possibly used
for a disk secret... Going to need this for hot unplug.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Put it into separate function called qemuDomainPrepareChannel() and call
it from the new qemuProcessPrepareDomain().
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Along with the virtlogd addition of the log file appending API implement
a helper for logging one-shot entries to the log file including the
fallback approach of using direct file access.
This will be used for noting the shutdown of the qemu proces and
possibly other actions such as VM migration and other critical VM
lifecycle events.
Based on some digital archaeology performed by jtomko, it's been determined
that the persistentAddrs variable is no longer necessary...
The variable was added by:
commit 141dea6bc7
CommitDate: 2010-02-12 17:25:52 +0000
Add persistence of PCI addresses to QEMU
Where it was set to 0 on domain startup if qemu did not support the
QEMUD_CMD_FLAG_DEVICE capability, to clear the addresses at shutdown,
because QEMU might make up different ones next time.
As of commit f5dd58a608
CommitDate: 2012-07-11 11:19:05 +0200
qemu: Extended qemuDomainAssignAddresses to be callable from
everywhere.
this was broken, when the persistentAddrs = 0 assignment was moved
inside qemuDomainAssignPCIAddresses and while it pretends to check
for !QEMU_CAPS_DEVICE, its parent qemuDomainAssignAddresses is only
called if QEMU_CAPS_DEVICE is present.
Extract information for all disks and update tray state and source only
for removable drives. Additionally store whether a drive is removable
and whether it has a tray.
Extract whether a given drive has a tray and whether there is no image
inserted.
Negative logic for the image insertion is chosen so that the flag is set
only if we are certain of the fact.
Rather than returning a "char *" indicating perhaps some sized set of
characters that is NUL terminated, alter the function to return 0 or -1
for success/failure and add two parameters to handle returning the
buffer and it's size.
The function no longer encodes the returned secret, rather it returns
the unencoded secret forcing callers to make the necessary adjustments.
Alter the callers to handle the adjusted model.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Extract the relevant parts of the existing checker and reuse them for
blockcopy since copying to a non-block device creates an invalid
configuration.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1209802
Add the data structure and infrastructure to support an initialization
vector (IV) secrets. The IV secret generation will need to have access
to the domain private master key, so let's make sure the prepare disk
and hostdev functions can accept that now.
Anywhere that needs to make a decision over which secret type to use
in order to fill in or use the IV secret has a switch added.
Signed-off-by: John Ferlan <jferlan@redhat.com>
A recent review of related changes noted that we should split the creation
(or generation) of the master key into the qemuProcessPrepareDomain and leave
the writing of the master key for qemuProcessPrepareHost.
Made the adjustment and modified some comments to functions that have
changed calling parameters, but didn't change the intro doc.
Signed-off-by: John Ferlan <jferlan@redhat.com>
From a review after push, add the "_TYPE" into the name.
Also use qemuDomainSecretInfoType in the struct rather than int
with the comment field containing the struct name
Signed-off-by: John Ferlan <jferlan@redhat.com>
Similar to the qemuDomainSecretDiskPrepare, generate the secret
for the Hostdev's prior to call qemuProcessLaunch which calls
qemuBuildCommandLine. Additionally, since the secret is not longer
added as part of building the command, the hotplug code will need
to make the call to add the secret in the hostdevPriv.
Since this then is the last requirement to pass a virConnectPtr
to qemuBuildCommandLine, we now can remove that as part of these
changes. That removal has cascading effects through various callers.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Modeled after the qemuDomainDiskPrivatePtr logic, create a privateData
pointer in the _virDomainHostdevDef to allow storage of private data
for a hypervisor in order to at least temporarily store auth/secrets
data for usage during qemuBuildCommandLine.
NB: Since the qemu_parse_command (qemuParseCommandLine) code is not
expecting to restore the auth/secret data, there's no need to add
code to handle this new structure there.
Updated copyrights for modules touched. Some didn't have updates in a
couple years even though changes have been made.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than needing to pass the conn parameter to various command
line building API's, add qemuDomainSecretPrepare just prior to the
qemuProcessLaunch which calls qemuBuilCommandLine. The function
must be called after qemuProcessPrepareHost since it's expected
to eventually need the domain masterKey generated during the prepare
host call. Additionally, future patches may require device aliases
(assigned during the prepare domain call) in order to associate
the secret objects.
The qemuDomainSecretDestroy is called after the qemuProcessLaunch
finishes in order to clear and free memory used by the secrets
that were recently prepared, so they are not kept around in memory
too long.
Placing the setup here is beneficial for future patches which will
need the domain masterKey in order to generate an encrypted secret
along with an initialization vector to be saved and passed (since
the masterKey shouldn't be passed around).
Finally, since the secret is not added during command line build,
the hotplug code will need to get the secret into the private disk data.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Introduce a new private structure to hold qemu domain auth/secret data.
This will be stored in the qemuDomainDiskPrivate as a means to store the
auth and fetched secret data rather than generating during building of
the command line.
The initial changes will handle the current username and secret values
for rbd and iscsi disks (in their various forms). The rbd secret is
stored as a base64 encoded value, while the iscsi secret is stored as
a plain text value. Future changes will store encoded/encrypted secret
data as well as an initialization vector needed to be given to qemu
in order to decrypt the encoded password along with the domain masterKey.
The inital assumption will be that VIR_DOMAIN_SECRET_INFO_PLAIN is
being used.
Although it's expected that the cleanup of the secret data will be
done immediately after command line generation, reintroduce the object
dispose function qemuDomainDiskPrivateDispose to handle removing
memory associated with the structure for "normal" cleanup paths.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When creating the master key, we used mode 0600 (which we should) but
because we were creating it as root, the file is not readable by any
qemu running as non-root. Fortunately, it's just a matter of labelling
the file. We are generating the file path few times already, so let's
label it in the same function that has access to the path already.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Similarly to the DEVICE_DELETED event we will be able to tell when
unplug of certain device types will be rejected by the guest OS. Wire up
the device deletion signalling code to allow handling this.
Add a masterKey and masterKeyLen to _qemuDomainObjPrivate to store a
random domain master key and its length in order to support the ability
to encrypt/decrypt sensitive data shared between libvirt and qemu. The
key will be base64 encoded and written to a file to be used by the
command line building code to share with qemu.
New API's from this patch:
qemuDomainGetMasterKeyFilePath:
Return a path to where the key is located
qemuDomainWriteMasterKeyFile: (private)
Open (create/trunc) the masterKey path and write the masterKey
qemuDomainMasterKeyReadFile:
Using the master key path, open/read the file, and store the
masterKey and masterKeyLen. Expected use only from qemuProcessReconnect
qemuDomainGenerateRandomKey: (private)
Generate a random key using available algorithms
The key is generated either from the gnutls_rnd function if it
exists or a less cryptographically strong mechanism using
virGenerateRandomBytes
qemuDomainMasterKeyRemove:
Remove traces of the master key, remove the *KeyFilePath
qemuDomainMasterKeyCreate:
Generate the domain master key and save the key in the location
returned by qemuDomainGetMasterKeyFilePath.
This API will first ensure the QEMU_CAPS_OBJECT_SECRET is set
in the capabilities. If not, then there's no need to generate
the secret or file.
The creation of the key will be attempted from qemuProcessPrepareHost
once the libDir directory structure exists.
The removal of the key will handled from qemuProcessStop just prior
to deleting the libDir tree.
Since the key will not be written out to the domain object XML file,
the qemuProcessReconnect will read the saved file and restore the
masterKey and masterKeyLen.
The paths have the domain ID in them. Without cleaning them, they would
contain the same ID even after multiple restarts. That could cause
various problems, e.g. with access.
Add function qemuDomainClearPrivatePaths() for this as a counterpart of
qemuDomainSetPrivatePaths().
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If use of virtlogd is enabled, then use it for backing the
character device log files too. This avoids the possibility
of a guest denial of service by writing too much data to
the log file.
With a very old QEMU which doesn't support events we need to explicitly
call qemuMigrationSetOffline at the end of migration to update our
internal state. On the other hand, if we talk to QEMU using QMP, we
should just wait for the STOP event and let the event handler update the
state and trigger a libvirt event.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When SPICE graphics is configured for a domain but we did not ask the
client to switch to the destination, we should not wait for
SPICE_MIGRATE_COMPLETED event (which will never come).
https://bugzilla.redhat.com/show_bug.cgi?id=1151723
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Per-domain directories were introduced in order to be able to
completely separate security labels for each domain (commit
f1f68ca334). However when the domain
name is long (let's say a ridiculous 110 characters), we cannot
connect to the monitor socket because on length of UNIX socket address
is limited. In order to get around this, let's shorten it in similar
fashion and in order to avoid conflicts, throw in an ID there as well.
Also save that into the status XML and load the old status XMLs
properly (to clean up after older domains). That way we can change it
in the future.
The shortening can be seen in qemuxml2argv tests, for example in the
hugepages-pages2 case.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Stopping a domain without a job risks a race condition with another
thread which started a job a which does not expect anyone else to be
messing around with the same domain object.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Our existing virHashForEach method iterates through all items disregarding the
fact, that some of the iterators might have actually failed. Errors are usually
dispatched through an error element in opaque data which then causes the
original caller of virHashForEach to return -1. In that case, virHashForEach
could return as soon as one of the iterators fail. This patch changes the
iterator return type and adjusts all of its instances accordingly, so the
actual refactor of virHashForEach method can be dealt with later.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
So, systemd-machined has this philosophy that machine names are like
hostnames and hence should follow the same rules. But we always allowed
international characters in domain names. Thus we need to modify the
machine name we are passing to systemd.
In order to change some machine names that we will be passing to systemd,
we also need to call TerminateMachine at the end of a lifetime of a
domain. Even for domains that were started with older libvirt. That
can be achieved thanks to virSystemdGetMachineNameByPID(). And because
we can change machine names, we can get rid of the inconsistent and
pointless escaping of domain names when creating machine names.
So this patch modifies the naming in the following way. It creates the
name as <drivername>-<id>-<name> where invalid hostname characters are
stripped out of the name and if the resulting name is longer, it
truncates it to 64 characters. That way we can start domains we
couldn't start before. Well, at least on systemd.
To make it work all together, the machineName (which is needed only with
systemd) is saved in domain's private data. That way the generation is
moved to the driver and we don't need to pass various unnecessary
arguments to cgroup functions.
The only thing this complicates a bit is the scope generation when
validating a cgroup where we must check both old and new naming, so a
slight modification was needed there.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The virDomainSnapshotDefFormat calls into virDomainDefFormat,
so should be providing a non-NULL virCapsPtr instance. On the
qemu driver we change qemuDomainSnapshotWriteMetadata to also
include caps since it calls virDomainSnapshotDefFormat.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
While this is no functional change, whole channel definition is
going to be needed very soon. Moreover, while touching this obey
const correctness rule in qemuAgentOpen() - so far it was passed
regular pointer to channel config even though the function is
expected to not change pointee at all. Pass const pointer
instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The structure actually contains migration statistics rather than just
the status as the name suggests. Renaming it as
qemuMonitorMigrationStats removes the confusion.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Add qemuDomainHasVCpuPids to do the checking and replace in place checks
with it.
We no longer need checking whether the thread contains fake data
(vcpupids[0] == vm->pid) as in b07f3d821d
and 65686e5a81 this was removed.
Currently the QEMU monitor is given an FD to the logfile. This
won't work in the future with virtlogd, so it needs to use the
qemuDomainLogContextPtr instead, but it shouldn't directly
access that object either. So define a callback that the
monitor can use for reporting errors from the log file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemuDomainTaint APIs currently expect to be passed a log file
descriptor. Change them to instead use a qemuDomainLogContextPtr
to hide the implementation details.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the places which create/open log files to use the new
qemuDomainLogContextPtr object instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce a qemuDomainLogContext object to encapsulate
handling of I/O to/from the domain log file. This will
hide details of the log file implementation from the
rest of the driver, making it easier to introduce
support for virtlogd later.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are two pretty similar functions qemuProcessReadLog and
qemuProcessReadChildErrors. Both read from the QEMU log file
and try to strip out libvirt messages. The latter then reports
an error, while the former lets the callers report an error.
Re-write qemuProcessReadLog so that it uses a single read
into a dynamically allocated buffer. Then introduce a new
qemuProcessReportLogError that calls qemuProcessReadLog
and reports an error.
Convert all callers to use qemuProcessReportLogError.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We only started an async job for incoming migration from another host.
When we were starting a domain from scratch or restoring from a saved
state (migration from file) we didn't set any async job. Let's introduce
a new QEMU_ASYNC_JOB_START for these cases.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
If mlock is required either due to use of VFIO hostdevs or due to the
fact that it's enabled it needs to be tweaked prior to adding new memory
or after removing a module. Add a helper to determine when it's
necessary and reuse it both on hotplug and hotunplug.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1273491