Commit Graph

1563 Commits

Author SHA1 Message Date
Peter Krempa
96d98a4b19 qemu: Remove return value from qemuMonitorDomainIOErrorCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
bd9a14cf6e qemu: Remove return value from qemuMonitorDomainWatchdogCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
8ed88fe9a0 qemu: Remove return value from qemuMonitorDomainRTCChangeCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
1b5097172b qemu: Remove return value from qemuMonitorDomainResumeCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
e57a537ad2 qemu: Remove return value from qemuMonitorDomainStopCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
8e95b76b1a qemu: Remove return value from qemuMonitorDomainResetCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
40950f60fc qemu: Remove return value from qemuMonitorDomainShutdownCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Peter Krempa
b2bf8d5bab qemu: Remove return value from qemuMonitorDomainEventCallback
Change the callback prototype and fix the callback registered in the
process code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-07-23 10:01:47 +02:00
Boris Fiuczynski
9568a4d410 conf: Add s390-pv as launch security type
Add launch security type 's390-pv' as well as some tests.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-07-21 13:30:25 +02:00
Boris Fiuczynski
96bc8312aa conf: Refactor launch security to allow more types
Adding virDomainSecDef for general launch security data
and moving virDomainSEVDef as an element for SEV data.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-07-21 13:30:14 +02:00
Jiri Denemark
364995ed57 qemu: Signal domain condition in qemuProcessStop a bit later
Signaling the condition before vm->def->id is reset to -1 is dangerous:
in case a waiting thread wakes up, it does not see anything interesting
(the domain is still marked as running) and just enters virDomainObjWait
where it waits forever because the condition will never be signalled
again.

Originally it was impossible to get into such situation because the vm
object was locked all the time between signaling the condition and
resetting vm->def->id, but after commit 860a999802 released in 6.8.0,
qemuDomainObjStopWorker called in qemuProcessStop between
virDomainObjBroadcast and setting vm->def->id to -1 unlocks the vm
object giving other threads a chance to wake up and possibly hang.

In real world, this can be easily reproduced by killing, destroying, or
just shutting down (from the guest OS) a domain while it is being
migrated somewhere else. The migration job would never finish.

So let's make sure we delay signaling the domain condition to the point
when a woken up thread can detect the domain is not active anymore.

https://bugzilla.redhat.com/show_bug.cgi?id=1949869

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-07-19 15:49:16 +02:00
Kristina Hanicova
c39757f700 qemu: Do not erase duplicate devices from namespace if error occurs
If the attempt to attach a device failed, we erased the
unattached device from the namespace. This resulted in erasing an
already attached device in case of a duplicate. We need to check
for existing file in the namespace in order to determine erasing
it in case of a failure.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1780508

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-07-15 12:00:24 +02:00
Peter Krempa
a3edda6b9e qemu: Prevent two threshold events when it was registered with index
Remember whether the user passed an explicit index when registering the
event so that we can avoid the top level event when it isn't needed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-07-12 16:34:17 +02:00
zhangjl02
2f922b2c46 qemu: interface: check and use ovs command to set qos of ovs managed port
When qos is set or delete, we have to check if the port is an ovs managed
port. If true, call the virNetDevOpenvswitchInterfaceSetQos function when qos
is set, and call the virNetDevOpenvswitchInterfaceClearQos function when
the interface is to be destroyed.

Signed-off-by: Jinsheng Zhang <zhangjl02@inspur.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-07-12 09:40:13 +02:00
Michal Privoznik
fb1289c155 qemu: Don't set NVRAM label when creating it
The NVRAM label is set in qemuSecuritySetAllLabel(). There's no
need to set its label upfront. In fact, setting it twice creates
an imbalance because it's unset only once which mangles seclabel
remembering. However, plain removal of the
qemuSecurityDomainSetPathLabel() undoes the fix for the original
bug (when dynamic ownership is off then the NVRAM is not created
with cfg->user and cfg->group but as root:root). Therefore, we
have to switch to virFileOpenAs() and pass cfg->user and
cfg->group and VIR_FILE_OPEN_FORCE_OWNER flag. There's no need to
pass VIR_FILE_OPEN_FORCE_MODE because the file will be created
with the proper mode.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1969347
Fixes: bcdaa91a27
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2021-06-17 09:15:09 +02:00
Masayoshi Mizuma
7c69f72230 qemuProcessSetupDisksTransientSnapshot: Skip enabling transientOverlayCreated flag
QEMU_DOMAIN_DISK_PRIVATE(disk)->transientOverlayCreated flag
gets true unexpectedly on qemuProcessSetupDisksTransientSnapshot() when
the disk has <transient shareBacking='yes'> option.

The flag should be enabled on qemuDomainAttachDiskGeneric() after the
overlay setup is completed.

Skip enabling transientOverlayCreated for the disk here.

Fixes: 75871da0ec
Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-06-01 08:20:01 +02:00
Peter Krempa
75871da0ec qemu: Allow <transient> disks with images shared accross VMs
Implement this behaviour by skipping the disks on traditional
commandline and hotplug them before resuming CPUs. That allows to use
the support for hotplugging of transient disks which inherently allows
sharing of the backing image as we open it read-only.

This commit implements the validation code to allow it only with buses
supporting hotplug and the hotplug code while starting up the VM.

When we have such disk we need to issue a system-reset so that firmware
tables are regenerated to allow booting from such device.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-05-24 20:38:08 +02:00
Peter Krempa
34c3291139 qemu: Track creation of <transient> disk overlay individually
In preparation for hotplug of <transient> disks we'll need to track
whether the overlay file was created individually per-disk.

Add 'transientOverlayCreated' to 'struct _qemuDomainDiskPrivate' and
remove 'inhibitDiskTransientDelete' from 'qemuDomainObjPrivate' and
adjust the code for the change.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-05-24 20:38:08 +02:00
Peter Krempa
2976b6aaeb qemu: Move 'bootindex' handling for disks out of command line formatter
The logic assigning the bootindices from the legacy boot order
configuration was spread through the command line formatters for the
disk device and for the floppy controller.

This patch adds 'effectiveBootindex' property to the disk private data
which holds the calculated boot index and moves the logic of determining
the boot index into 'qemuProcessPrepareDomainDiskBootorder' called from
'qemuProcessPrepareDomainStorage'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-05-24 20:38:07 +02:00
Masayoshi Mizuma
b4d87669ba qemu_snapshot: Add the guest name to the transient disk path
Later patches will implement sharing of the backing file, so we'll need
to be able to discriminate the overlays per VM.

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-05-24 20:38:07 +02:00
Peter Krempa
b7583a5ba3 qemu: snapshot: move transient snapshot code to qemu_process
The code deals with the startup of the VM and just uses the snapshot
code to achieve the desired outcome.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-05-24 20:38:07 +02:00
Peter Krempa
06e9b0c28d qemu: process: Setup transient disks only when starting a fresh VM
Creating the overlay for the disk is needed when starting a new VM only.
Additionally for now migration with transient disks is forbidden
anyways.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-05-24 20:38:07 +02:00
Peter Krempa
92a3eddd03 Remove static analysis assertions
None of them are currently needed to pass our upstream CI, most were
either for ancient clang versions or coverity for silencing false
positives.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-05-24 20:26:20 +02:00
Peter Krempa
bbd55e9284 Drop magic comments for coverity
They were added mostly randomly and we don't really want to keep working
around of false positives.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-05-24 20:26:20 +02:00
Kristina Hanicova
bcdaa91a27 qemu: Use qemuDomainOpenFile() in qemuPrepareNVRAM()
Previously, nvram file was created with user/group owner as
'root', rather than specifications defined in libvirtd.conf. The
solution is to call qemuDomainOpenFile(), which creates file with
defined permissions and qemuSecurityDomainSetPathLabel() to set
security label for created nvram file.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1783255

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-05-21 14:36:57 +02:00
Kristina Hanicova
19967f64f4 qemu: Add check for needed paths for memory devices
When building a commandline for a DIMM memory device with
non-default access mode, the qemuBuildMemoryBackendProps() will
tell QEMU to allocate memory from per-domain memory backing dir.
But later, when preparing the host, the
qemuProcessNeedMemoryBackingPath() does not check for memory
devices at all resulting in per-domain memory backing dir not
being created which upsets QEMU.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1961114

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-05-21 08:51:11 +02:00
Michal Privoznik
655f67c68a qemu_process: Drop needless check in qemuProcessNeedMemoryBackingPath()
The aim of this function is to return whether domain definition
and/or memory device that user intents to hotplug needs a private
path inside cfg->memoryBackingDir. The rule for the memory device
that's being hotplug includes checking whether corresponding
guest NUMA node needs memoryBackingDir. Well, while the rationale
behind makes sense it is not necessary to check for that really -
just a few lines above every guest NUMA node was checked exactly
for that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-05-18 17:47:58 +02:00
Michal Privoznik
4d779874ef qemu_process: Deduplicate code in qemuProcessNeedHugepagesPath()
The aim of qemuProcessNeedHugepagesPath() is to return whether
guest needs private path inside HugeTLBFS mounts (deducted from
domain definition @def) or whether the memory device that user is
hotplugging in needs the private path (deducted from the @mem
argument). The actual creation of the path is done in the only
caller qemuProcessBuildDestroyMemoryPaths().

The rule for the first case (@def) and the second case (@mem) is
the same (domain has a DIMM device that has HP requested) and is
written twice. Move the logic into a function to deduplicate the
code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-05-18 17:47:58 +02:00
Michal Privoznik
310b37e486 qemu: Don't double free @node_cpus in qemuProcessSetupPid()
When placing vCPUs into CGroups the qemuProcessSetupPid() is
called which then enters a for() loop (around its middle) where
it calls virDomainNumaGetNodeCpumask() for each guest NUMA node.
But the latter returns only a pointer not new reference/copy and
thus the caller must not free it. But the variable is decorated
with g_autoptr() which leads to a double free.

Fixes: 2d37d8dbc9
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-04-23 11:02:21 +02:00
Luyao Zhong
2d37d8dbc9 qemu: Add support for 'restrictive' mode in numatune
Signed-off-by: Luyao Zhong <luyao.zhong@intel.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-04-19 11:39:21 +02:00
Michal Privoznik
c8238579fb lib: Drop internal virXXXPtr typedefs
Historically, we declared pointer type to our types:

  typedef struct _virXXX virXXX;
  typedef virXXX *virXXXPtr;

But usefulness of such declaration is questionable, at best.
Unfortunately, we can't drop every such declaration - we have to
carry some over, because they are part of public API (e.g.
virDomainPtr). But for internal types - we can do drop them and
use what every other C project uses 'virXXX *'.

This change was generated by a very ugly shell script that
generated sed script which was then called over each file in the
repository. For the shell script refer to the cover letter:

https://listman.redhat.com/archives/libvir-list/2021-March/msg00537.html

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-13 17:00:38 +02:00
Tim Wiederhake
c5d4d0198f qemuProcessUpdateGuestCPU: Check host cpu for forbidden features
See https://bugzilla.redhat.com/show_bug.cgi?id=1840770

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com
2021-03-26 11:40:55 +01:00
Jiri Denemark
1107c0b9c3 Do not check return value of VIR_REALLOC_N
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-03-22 12:44:18 +01:00
Jiri Denemark
b8c919b5b4 qemu: Drop redundant checks for qemuCaps before virQEMUCapsGet
virQEMUCapsGet checks for qemuCaps itself, no need to do it explicitly.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-03-22 12:44:18 +01:00
Peter Krempa
8967ad7be6 qemu: backup: Restore security label on backup disk store image on VM termination
When the backup job is terminated normally the security label is
restored by the blockjob finishing handler.

If the VM dies or is destroyed that wouldn't happen as the blockjob
handler wouldn't be called.

Restore the security label on disk store where we remember that the job
was running at the point when 'qemuBackupJobTerminate' was called.

Not resetting the security label means that we also leak the xattr
attributes remembering the label which prevents any further use of the
file, which is a problem for block devices.

This also requires that the call to 'qemuBackupJobTerminate' from
'qemuProcessStop' happens only after 'vm->pid' was reset as otherwise
the security subdrivers attempt to enter the process namespace which
fails if the process isn't running any more.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1939082
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-03-19 16:41:39 +01:00
Michal Privoznik
6e9c4811be qemu_process: Use accessor for def->mem.total_memory
When connecting to the monitor, a timeout is calculated that is
bigger the more memory guest has (because QEMU has to allocate
and possibly zero out the memory and what not, empirically
deducted). However, when computing the timeout the @total_memory
mmember is accessed directly even though
virDomainDefGetMemoryTotal() should have been used.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-03-16 09:16:13 +01:00
Peter Krempa
aa372e5a01 backup: Store 'apiFlags' in private section of virDomainBackupDef
'qemuBackupJobTerminate' needs the API flags to see whether
VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL. Unfortunately when called via
qemuProcessReconnect()->qemuProcessStop() early (e.g. if the qemu
process died while we were reconnecting) the job is cleared temporarily
so that other APIs can be called. This would mean that we couldn't clean
up the files in some cases.

Save the 'apiFlags' inside the backup object and set it from the
'qemuDomainJobObj' 'apiFlags' member when reconnecting to a VM.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-03-12 10:59:05 +01:00
Andrea Bolognani
c2180c2fd6 qemu: Set limits only when explicitly asked to do so
The current code is written under the assumption that, for all
limits except the core size, asking for the limit to be set to
zero is a no-op, and so the operation is performed
unconditionally.

While this is the behavior we want for the QEMU driver, the
virCommand and virProcess facilities are generic, and should not
implement this kind of policy: asking for a limit to be set to
zero should result in that limit being set to zero every single
time.

Add some checks in the QEMU driver, effectively moving the
policy where it belongs.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-03-08 22:41:40 +01:00
Andrea Bolognani
bd33680f02 qemu: Set all limits at the same time
qemuProcessLaunch() is the correct place to set process limits,
and in fact is where we were dealing with almost all of them,
but the memory locking limit was handled in
qemuBuildCommandLine() instead for some reason.

The code is rewritten so that the desired limit is calculated
and applied in separated steps, which will help with further
changes, but this doesn't alter the behavior.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-03-08 22:41:40 +01:00
Andrea Bolognani
9bf5c00f9b qemu: Make some minor tweaks
Doing this now will make the next changes nicer.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-03-08 22:41:40 +01:00
Peter Krempa
3c546f7eb4 qemuProcessReportLogError: Don't mark "%s: %s" as translatable
The function is constructing an error message from a prefix and the
contents of the qemu log file. Marking just two string modifiers as
translatable is pointless and will certainly confuse translators.

Remove the marking and add a comment which bypasses the
sc_libvirt_unmarked_diagnostics check.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-03-05 15:01:29 +01:00
Peter Krempa
c8ff56c7ad qemuProcessReportLogError: Remove unnecessary math for max error message
Now that error message formatting doesn't use fixed size buffers we can
drop the math for calculating the maximum chunk of log to report in the
error message and use a round number. This also makes it obvious that
the chosen number is arbitrary.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-03-05 15:01:29 +01:00
Michal Privoznik
7f482a67e4 lib: Replace virFileMakePath() with g_mkdir_with_parents()
Generated using the following spatch:

  @@
  expression path;
  @@
  - virFileMakePath(path)
  + g_mkdir_with_parents(path, 0777)

However, 14 occurrences were not replaced, e.g. in
virHostdevManagerNew(). I don't really understand why.
Fixed by hand afterwards.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-03-04 20:52:23 +01:00
Michal Privoznik
b1e3728dec lib: Replace virFileMakePathWithMode() with g_mkdir_with_parents()
These functions are identical. Made using this spatch:

  @@
  expression path, mode;
  @@
  - virFileMakePathWithMode(path, mode)
  + g_mkdir_with_parents(path, mode)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-03-04 20:52:23 +01:00
Jiri Denemark
c8f3b83c72 qemu_domainjob: Make copy of owner API
Using the job owner API name directly works fine as long as it is a
static string or the owner's thread is still running. However, this is
not always the case. For example, when the owner API name is filled in a
job when we're reconnecting to existing domains after daemon restart,
the dynamically allocated owner name will disappear with the
reconnecting thread. Any follow up usage of the pointer will read random
memory.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-02-25 09:55:31 +01:00
Peng Liang
1ac703a7d0 qemu: Add missing lock in qemuProcessHandleMonitorEOF
qemuMonitorUnregister will be called in multiple threads (e.g. threads
in rpc worker pool and the vm event thread).  In some cases, it isn't
protected by the monitor lock, which may lead to call g_source_unref
more than one time and a use-after-free problem eventually.

Add the missing lock in qemuProcessHandleMonitorEOF (which is the only
position missing lock of monitor I found).

Suggested-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-24 15:00:51 +01:00
Stefan Berger
f30aa2ec74 qemu: Fix libvirt hang due to early TPM device stop
This patch partially reverts commit 5cde9dee where the qemuExtDevicesStop()
was moved to a location before the QEMU process is stopped. It may be
alright to tear down some devices before QEMU is stopped, but it doesn't work
for the external TPM (swtpm) which assumes that QEMU sends it a signal to stop
it before libvirt may try to clean it up. So this patch moves the
virFileDeleteTree() calls after the call to qemuExtDevicesStop() so that the
pid file of virtiofsd is not deleted before that call.

Afftected libvirt versions are 6.10 and 7.0.

Fixes: 5cde9dee8c
Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-19 17:31:37 +01:00
Daniel P. Berrangé
17f001c451 qemu: record deprecation messages against the domain
These messages are only valid while the domain is running.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 09:19:12 +00:00
Peter Krempa
480fecaa21 Replace virStringListJoin by g_strjoinv
Our implementation was inspired by glib anyways. The difference is only
the order of arguments.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-11 17:05:34 +01:00
Peter Krempa
56cedfcf38 Replace virStringListHasString by g_strv_contains
The glib variant doesn't accept NULL list, but there's just one caller
where it wasn't checked explicitly, thus there's no need for our own
wrapper.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-11 17:05:33 +01:00
Peter Krempa
d9f7e87673 qemuProcessUpdateDevices: Refactor cleanup and memory handling
Use automatic memory freeing and remove the 'cleanup' label. Also make
it a bit more obvious that nothing happens if the 'old' list wasn't
present.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-11 17:05:33 +01:00
Daniel P. Berrangé
c32f172d12 qemu: wire up support for maximum CPU model
The "max" model can be treated the same way as "host" model in general.

Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-10 11:44:48 +00:00
Laine Stump
674719afe6 qemu: replace VIR_FREE with g_free in all vir*Free() functions
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2021-02-05 00:20:43 -05:00
Pavel Hrdina
836e0a960b storage_source: use virStorageSource prefix for all functions
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-01-22 11:10:27 +01:00
Pavel Hrdina
01f7ade912 util: extract virStorageFile code into storage_source
Up until now we had a runtime code and XML related code in the same
source file inside util directory.

This patch takes the runtime part and extracts it into the new
storage_file directory.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-01-22 11:10:27 +01:00
Laine Stump
c2b2cdf746 call virDomainNetNotifyActualDevice() for all interface types
Now that this function can be called regardless of interface type (and
whether or not we have a conn for the network driver), let's actually
call it for all interface types. This will assure that we re-connect
any disconnected bridge devices for <interface type='bridge'> as
mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1730084#c26
(until now we've only been reconnecting bridge devices for <interface
type='network'>)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-01-08 11:34:49 -05:00
Michal Privoznik
5ac2439a83 qemu_process: Release domain seclabel later in qemuProcessStop()
Some secdrivers (typically SELinux driver) generate unique
dynamic seclabel for each domain (unless a static one is
requested in domain XML). This is achieved by calling
qemuSecurityGenLabel() from qemuProcessPrepareDomain() which
allocates unique seclabel and stores it in domain def->seclabels.
The counterpart is qemuSecurityReleaseLabel() which releases the
label and removes it from def->seclabels. Problem is, that with
current code the qemuProcessStop() may still want to use the
seclabel after it was released, e.g. when it wants to restore the
label of a disk mirror.

What is happening now, is that in qemuProcessStop() the
qemuSecurityReleaseLabel() is called, which removes the SELinux
seclabel from def->seclabels, yada yada yada and eventually
qemuSecurityRestoreImageLabel() is called. This bubbles down to
virSecuritySELinuxRestoreImageLabelSingle() which find no SELinux
seclabel (using virDomainDefGetSecurityLabelDef()) and this
returns early doing nothing.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1751664
Fixes: 8fa0374c5b
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-01-06 13:29:09 +01:00
Jiri Denemark
f7c40b5c71 qemu: The TSC tolerance interval should be closed
The kernel refuses to set guest TSC frequency less than a minimum
frequency or greater than maximum frequency (both computed based on the
host TSC frequency). When writing the libvirt code with a reversed logic
(return success when the requested frequency falls within the tolerance
interval) I forgot to include the boundaries.

Fixes: d8e5b45600
https://bugzilla.redhat.com/show_bug.cgi?id=1839095

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-01-06 11:24:37 +01:00
Peter Krempa
d0819b9f02 qemu: Properly handle setting of <iotune> for empty cdrom
When starting a VM with an empty cdrom which has <iotune> configured the
startup fails as qemu is not happy about setting tuning for an empty
drive:

 error: internal error: unable to execute 'block_set_io_throttle', unexpected error: 'Device has no medium'

Resolve this by skipping the setting of throttling for empty drives and
updating the throttling when new medium is inserted into the drive.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/111
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2021-01-06 09:24:48 +01:00
Shi Lei
9b5d741a9d netdevmacvlan: Use helper function to create unique macvlan/macvtap name
Simplify ReserveName/GenerateName for macvlan and macvtap by using
common functions.

Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2020-12-15 13:35:33 -05:00
Shi Lei
c36cad1a31 netdevtap: Use common helper function to create unique tap name
Simplify GenerateName/ReserveName for netdevtap by using common
functions.

Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2020-12-15 13:35:27 -05:00
Daniel Henrique Barboza
9432693e2b domain_conf.c: move virDomainDeviceDefValidate() to domain_validate.c
Move virDomainDeviceDefValidate() and all its helper functions to
domain_validate.c.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-12-14 09:29:09 -03:00
Peter Krempa
18de9dfd77 virDomainDefValidate: Add per-run 'opaque' data
virDomainDefPostParse infrastructure has apart from the global opaque
data also per-run data, but this was not duplicated into the validation
callbacks.

This is important when drivers want to use correct run-state for the
validation.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-12-09 09:33:47 +01:00
Peter Krempa
c1720b9ac7 qemuDomainDiskLookupByNodename: Lookup also backup 'store' nodenames
Nodename may be asociated to a disk backup job, add support to looking
up in that chain too. This is specifically useful for the
BLOCK_WRITE_THRESHOLD event which can be registered for any nodename.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-12-08 15:12:34 +01:00
Michal Privoznik
40a162f83e qemu: Don't cache NUMA caps
In v6.0.0-rc1~439 (and friends) we tried to cache NUMA
capabilities because we assumed they are immutable. And to some
extent they are (NUMA hotplug is not a thing, is it). However,
our capabilities contain also some runtime info that can change,
e.g. hugepages pool allocation sizes or total amount of memory
per node (host side memory hotplug might change the value).

Because of the caching we might not be reporting the correct
runtime info in 'virsh capabilities'.

The NUMA caps are used in three places:

  1) 'virsh capabilities'
  2) domain startup, when parsing numad reply
  3) parsing domain private data XML

In cases 2) and 3) we need NUMA caps to construct list of
physical CPUs that belong to NUMA nodes from numad reply. And
while this may seem static, it's not really because of possible
CPU hotplug on physical host.

There are two possible approaches:

  1) build a validation mechanism that would invalidate the
     cached NUMA caps, or
  2) drop the caching and construct NUMA caps from scratch on
     each use.

In this commit, the latter approach is implemented, because it's
easier.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1819058
Fixes: 1a1d848694
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-12-07 11:32:40 +01:00
Peter Krempa
0f7b80691b qemuMonitorBlockJobInfo: Store 'ready' and 'ready_present' separately
Don't make the logic confusing by representing the 3 options using an
integer with negative values.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2020-12-07 10:15:00 +01:00
Daniel Henrique Barboza
5a34d0667d qemu: move memory size align to qemuProcessPrepareDomain()
qemuBuildCommandLine() is calling qemuDomainAlignMemorySizes(),
which is an operation that changes live XML and domain and has
little to do with the command line build process.

Move it to qemuProcessPrepareDomain() where we're supposed to
make live XML and domain changes before launch. qemuProcessStart()
is setting VIR_QEMU_PROCESS_START_NEW if !migrate && !snapshot,
same conditions used in qemuBuildCommandLine() to call
qemuDomainAlignMemorySizes(), making this change seamless.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-12-03 17:19:35 -03:00
Daniel Henrique Barboza
3bb9ed8bc2 qemu_process.c: check migrateURI when setting VIR_QEMU_PROCESS_START_NEW
qemuProcessCreatePretendCmdPrepare() is setting the
VIR_QEMU_PROCESS_START_NEW regardless of whether this is
a migration case or not. This behavior differs from what we're
doing in qemuProcessStart(), where the flag is set only
if !migrate && !snapshot.

Fix it by making the flag setting consistent with what we're
doing in qemuProcessStart().

Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-12-03 17:16:33 -03:00
John Ferlan
148cfcf051 qemu: Pass / fill niothreads for qemuMonitorGetIOThreads
Let's pass along / fill @niothreads rather than trying to make dual
use as a return value and thread count.

This resolves a Coverity issue detected in qemuDomainGetIOThreadsMon
where if qemuDomainObjExitMonitor failed, then a -1 was returned and
overwrite @niothreads causing a memory leak.

Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-12-03 17:06:07 +01:00
Michal Privoznik
b7d4e6b67e lib: Replace VIR_AUTOSTRINGLIST with GStrv
Glib provides g_auto(GStrv) which is in-place replacement of our
VIR_AUTOSTRINGLIST.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-12-02 15:43:07 +01:00
Pavel Hrdina
82bda55e2f qemuProcessHandleGraphics: no need to check for NULL
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-11-16 17:25:41 +01:00
Jiri Denemark
d8e5b45600 qemu: Do not require TSC frequency to strictly match host
Some CPUs provide a way to read exact TSC frequency, while measuring it
is required on other CPUs. However, measuring is never exact and the
result may slightly differ across reboots. For this reason both Linux
kernel and QEMU recently started allowing for guests TSC frequency to
fall into +/- 250 ppm tolerance interval around the host TSC frequency.

Let's do the same to avoid unnecessary failures (esp. during migration)
in case the host frequency does not exactly match the frequency
configured in a domain XML.

https://bugzilla.redhat.com/show_bug.cgi?id=1839095

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-11-12 17:29:16 +01:00
Masayoshi Mizuma
5cde9dee8c qemu: Move qemuExtDevicesStop() before removing the pidfiles
A qemu guest which has virtiofs config fails to start if the previous
starting failed because of invalid option or something.

That's because the virtiofsd isn't killed by virPidFileForceCleanupPath()
on the former failure because the pidfile was already removed by
virFileDeleteTree(priv->libDir) in qemuProcessStop(), so
virPidFileForceCleanupPath() just returned.

Move qemuExtDevicesStop() before virFileDeleteTree(priv->libDir) so that
virPidFileForceCleanupPath() can kill virtiofsd correctly.

For example of the reproduction:

  # virsh start guest
  error: Failed to start domain guest
  error: internal error: process exited while connecting to monitor: qemu-system-x86_64: -foo: invalid option

  ... fix the option ...

  # virsh start guest
  error: Failed to start domain guest
  error: Cannot open log file: '/var/log/libvirt/qemu/guest-fs0-virtiofsd.log': Device or resource busy
  #

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-11-11 15:20:12 +01:00
Peter Krempa
62a01d84a3 util: hash: Retire 'virHashTable' in favor of 'GHashTable'
Don't hide our use of GHashTable behind our typedef. This will also
promote the use of glibs hash function directly.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Matt Coleman <matt@datto.com>
2020-11-06 10:40:51 +01:00
Daniel P. Berrangé
99a1cfc438 qemu: honour fatal errors dealing with qemu slirp helper
Currently all errors from qemuInterfacePrepareSlirp() are completely
ignored by the callers. The intention is that missing qemu-slirp binary
should cause the caller to fallback to the built-in slirp impl.

Many of the possible errors though should indeed be considered fatal.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-10-27 12:03:19 +00:00
zhenwei pi
7555a55470 qemu: implement memory failure event
Since QEMU 5.2 (commit-77b285f7f6), QEMU supports 'memory failure'
event, posts event to monitor if hitting a hardware memory error.
Fully support this feature for QEMU.

Test with commit 'libvirt: support memory failure event', build a
little complex environment(nested KVM):
1, install newly built libvirt in L1, and start a L2 vm. run command
in L1:
 ~# virsh event l2 --event memory-failure

2, run command in L0 to inject MCE to L1:
 ~# virsh qemu-monitor-command l1 --hmp mce 0 9 0xbd000000000000c0 0xd 0x62000000 0x8c

Test result in l1(recipient hypervisor case):
event 'memory-failure' for domain l2:
recipient: hypervisor
action: ignore
flags:
        action required: 0
        recursive: 0

Test result in l1(recipient guest case):
event 'memory-failure' for domain l2:
recipient: guest
action: inject
flags:
        action required: 0
        recursive: 0

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-10-23 09:42:00 +02:00
Peter Krempa
d6d4c08daf util: hash: Change type of hash table name/key to 'char'
All users of virHashTable pass strings as the name/key of the entry.
Make this an official requirement by turning the variables to 'const
char *'.

For any other case it's better to use glib's GHashTable.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2020-10-22 15:02:46 +02:00
Daniel P. Berrangé
7b1ed1cd73 qemu: stop passing -enable-fips to QEMU >= 5.2.0
Use of the -enable-fips option is being deprecated in QEMU >= 5.2.0. If
FIPS compliance is required, QEMU must be built with libcrypt which will
unconditionally enforce it.

Thus there is no need for libvirt to pass -enable-fips to modern QEMU.
Unfortunately there was never any way to probe for -enable-fips in the
first instance, it was enabled by libvirt based on version number
originally, and then later unconditionally enabled when libvirt dropped
support for older QEMU. Similarly we now use a version number check to
decide when to stop passing -enable-fips.

Note that the qemu-5.2 capabilities are currently from the pre-release
version and will be updated once qemu-5.2 is released.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-10-22 09:03:33 +02:00
Jonathon Jongsma
08f8fd8413 conf: Add support for vDPA network devices
This patch adds new schema and adds support for parsing and formatting
domain configurations that include vdpa devices.

vDPA network devices allow high-performance networking in a virtual
machine by providing a wire-speed data path. These devices require a
vendor-specific host driver but the data path follows the virtio
specification.

When a device on the host is bound to an appropriate vendor-specific
driver, it will create a chardev on the host at e.g.  /dev/vhost-vdpa-0.
That chardev path can then be used to define a new interface with
type='vdpa'.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2020-10-20 14:46:52 -04:00
Peter Krempa
7b0ced89e7 qemu: Prepare hostdev data which depends on the host state separately
SCSI hostdev setup requires querying the host os for the actual path of
the configured hostdev. This was historically done in the command line
formatter. Our new approach is to split out this part into
'qemuProcessPrepareHost' which is designed to be skipped in tests.

Refactor the hostdev code to use this new semantics, and add appropriate
handlers filling in the data for tests and the qemuConnectDomainXMLToNative
users.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-10-20 15:08:22 +02:00
Peter Krempa
9ff3ad9058 qemuProcessCreatePretendCmd: Split up preparation and command building
Host preparation steps which are deliberately skipped when
pretend-creating a commandline are normally executed after VM object
preparation. In the test code we are faking some of the host
preparation steps, but we were doing that prior to the call to
qemuProcessPrepareDomain embedded in qemuProcessCreatePretendCmd.

By splitting up qemuProcessCreatePretendCmd into two functions we can
ensure that the ordering of the prepare steps stays consistent.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-10-20 15:08:22 +02:00
Erik Skultety
ccb40cf288 qemu: process: sev: Fill missing 'cbitpos' & 'reducedPhysBits' from caps
These XML attributes have been mandatory since the introduction of SEV
support to libvirt. This design decision was based on QEMU's
requirement for these to be mandatory for migration purposes, as
differences in these values across platforms must result in the
pre-migration checks failing (not that migration with SEV works at the
time of this patch).

This patch enables autofill of these attributes right before launching
QEMU and thus updating the live XML.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-10-19 11:03:27 +02:00
Erik Skultety
1fdc907325 qemu: process: Move SEV capability check to qemuValidateDomainDef
Checks such as this one should be done at domain def validation time,
not before starting the QEMU process.
As for this change, existing domains will see some QEMU error when
starting as opposed to a libvirt error that this QEMU binary doesn't
support SEV, but that's okay, we never guaranteed error messages to
remain the same.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-10-19 11:03:16 +02:00
Erik Skultety
649f720a9a qemu_process: sev: Drop an unused variable
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-10-19 11:01:56 +02:00
Pavel Hrdina
5ad8272888 util: vircgroup: change virCgroupFree to take only virCgroupPtr
As preparation for g_autoptr() we need to change the function to take
only virCgroupPtr.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
2020-10-09 16:24:35 +02:00
Ján Tomko
cc3190cc4c qemu: process: use g_new0
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-10-05 16:44:06 +02:00
Ján Tomko
868c350752 qemu: separate out VIR_ALLOC calls
Move them to separate conditions to reduce churn
in following patches.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-10-05 16:44:06 +02:00
Cole Robinson
0fa5c23865 qemu: Taint cpu host-passthrough only after migration
From a discussion last year[1], Dan recommended libvirt drop the tain
flag for cpu host-passthrough, unless the VM has been migrated.

This repurposes the existing host-cpu taint flag to do just that.

[1]: https://www.redhat.com/archives/virt-tools-list/2019-February/msg00041.html

https://bugzilla.redhat.com/show_bug.cgi?id=1673098

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
2020-10-05 10:08:26 -04:00
Peter Krempa
faa88866f5 Don't check return value of virBitmapNewCopy
The function will not fail any more.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-10-05 15:50:45 +02:00
Peter Krempa
cb6fdb0125 virBitmapNew: Don't check return value
Remove return value check from all callers.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-10-05 15:38:47 +02:00
Masayoshi Mizuma
1c9227de5d qemu: process: Handle transient disks on VM startup
Add overlays after the VM starts before we start executing guest code.

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Tested-by: Ján Tomko <jtomko@redhat.com>
2020-10-01 09:55:02 +02:00
Peter Krempa
afc25e8553 qemu: prepare cleanup for <transient/> disk overlays
Later patches will implement support for <transient/> disks in libvirt
by installing an overlay on top of the configured image. This will
require cleanup after the VM will be stopped so that the state is
correctly discarded.

Since the overlay will be installed only during the startup phase of the
VM we need to ensure that qemuProcessStop doesn't delete the original
file on some previous failure. This is solved by adding
'inhibitDiskTransientDelete' VM private data member which is set prior
to any startup step and will be cleared once transient disk overlays are
established.

Based on that we can then delete the overlays for any <transient/> disk.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Tested-by: Ján Tomko <jtomko@redhat.com>
2020-10-01 09:55:02 +02:00
Peter Krempa
3673bdbe13 qemu: domain: Extract preparation of hostdev specific data to a separate function
Historically we've prepared secrets for all objects in one place. This
doesn't make much sense and it's semantically more appealing to prepare
everything for a single device type in one place.

Move the setup of the (iSCSI|SCSI) hostdev secrets into a new function
which will be used to setup other things as well in the future.

This is a similar approach we do for disks.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-15 15:20:23 +02:00
Ján Tomko
af16e754cd qemuProcessReconnect: clear 'oldjob'
After we started copying the privateData pointer in
qemuDomainObjRestoreJob, we should also free them
once we're done with them.

Register the clear function and use g_auto.
Also add a check for job->cb to qemuDomainObjClearJob,
to prevent freeing an uninitialized job.

https://bugzilla.redhat.com/show_bug.cgi?id=1878450

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: aca37c3fb2
2020-09-14 18:10:56 +02:00
Tim Wiederhake
caf5a88e59 qemu: Use glib memory functions in qemuProcessReadLog
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2020-09-11 18:19:58 +02:00
Michal Privoznik
ec46e6d44b qemu_process: Separate VIR_PERF_EVENT_* setting into a function
When starting a domain, qemuProcessLaunch() iterates over all
VIR_PERF_EVENT_* values and (possibly) enables them. While there
is nothing wrong with the code, the for loop where it's done makes
it harder to jump onto next block of code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-08 10:57:24 +02:00
Martin Kletzander
f5b486daea qemu: Allow setting affinity to fail and don't report error
This is just a clean-up of commit 3791f29b08 using the new parameter of
virProcessSetAffinity() introduced in commit 9514e24984 so that there is
no error reported in the logs.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-07 14:48:57 +02:00
Martin Kletzander
9514e24984 Do not report error when setting affinity is allowed to fail
Suggested-by: Ján Tomko <jtomko@redhat.com>

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-07 11:35:36 +02:00
Nikolay Shirokovskiy
5c0cd375d1 qemu: don't shutdown event thread in monitor EOF callback
This hunk was introduced in [1] in order to avoid loosing
events from monitor on stopping qemu process. But as explained
in [2] on destroy we won't get neither EOF nor any other
events as monitor is just closed. In case of crash/shutdown
we won't get any more events as well and qemuDomainObjStopWorker
will be called by qemuProcessStop eventually. Thus let's
remove qemuDomainObjStopWorker from qemuProcessHandleMonitorEOF
as it is not useful anymore.

[1] e6afacb0f: qemu: start/stop an event loop thread for domains
[2] d2954c072: qemu: ensure domain event thread is always stopped

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-09-07 09:33:59 +03:00
Martin Kletzander
fc7d53edf4 qemu: Fix comment in qemuProcessSetupPid
This was supposed to be done in commit 3791f29b08, but I missed a spot.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2020-09-06 13:44:27 +02:00
Martin Kletzander
3791f29b08 qemu: Do not error out when setting affinity failed
Consider a host with 8 CPUs. There are the following possible scenarios

1. Bare metal; libvirtd has affinity of 8 CPUs; QEMU should get 8 CPUs

2. Bare metal; libvirtd has affinity of 2 CPUs; QEMU should get 8 CPUs

3. Container has affinity of 8 CPUs; libvirtd has affinity of 8 CPus;
   QEMU should get 8 CPUs

4. Container has affinity of 8 CPUs; libvirtd has affinity of 2 CPus;
   QEMU should get 8 CPUs

5. Container has affinity of 4 CPUs; libvirtd has affinity of 4 CPus;
   QEMU should get 4 CPUs

6. Container has affinity of 4 CPUs; libvirtd has affinity of 2 CPus;
   QEMU should get 4 CPUs

Scenarios 1 & 2 always work unless systemd restricted libvirtd privs.

Scenario 3 works because libvirt checks current affinity first and
skips the sched_setaffinity call, avoiding the SYS_NICE issue

Scenario 4 works only if CAP_SYS_NICE is availalbe

Scenarios 5 & 6 works only if CAP_SYS_NICE is present *AND* the cgroups
cpuset is not set on the container.

If libvirt blindly ignores the sched_setaffinity failure, then scenarios
4, 5 and 6 should all work, but with caveat in case 4 and 6, that
QEMU will only get 2 CPUs instead of the possible 8 and 4 respectively.
This is still better than failing.

Therefore libvirt can blindly ignore the setaffinity failure, but *ONLY*
ignore it when there was no affinity specified in the XML config.
If user specified affinity explicitly, libvirt must report an error if
it can't be honoured.

Resolves: https://bugzilla.redhat.com/1819801

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-09-04 14:44:21 +02:00
Michal Privoznik
95b9db4ee2 lib: Prefer WITH_* prefix for #if conditionals
Currently, we are mixing: #if HAVE_BLAH with #if WITH_BLAH.
Things got way better with Pavel's work on meson, but apparently,
mixing these two lead to confusing and easy to miss bugs (see
31fb929eca for instance). While we were forced to use HAVE_
prefix with autotools, we are free to chose our own prefix with
meson and since WITH_ prefix appears to be more popular let's use
it everywhere.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-02 10:28:10 +02:00
Laine Stump
95089f481e util: assign tap device names using a monotonically increasing integer
When creating a standard tap device, if provided with an ifname that
contains "%d", rather than taking that literally as the name to use
for the new device, the kernel will instead use that string as a
template, and search for the lowest number that could be put in place
of %d and produce an otherwise unused and unique name for the new
device. For example, if there is no tap device name given in the XML,
libvirt will always send "vnet%d" as the device name, and the kernel
will create new devices named "vnet0", "vnet1", etc. If one of those
devices is deleted, creating a "hole" in the name list, the kernel
will always attempt to reuse the name in the hole first before using a
name with a higher number (i.e. it finds the lowest possible unused
number).

The problem with this, as described in the previous patch dealing with
macvtap device naming, is that it makes "immediate reuse" of a newly
freed tap device name *much* more common, and in the aftermath of
deleting a tap device, there is some other necessary cleanup of things
which are named based on the device name (nwfilter rules, bandwidth
rules, OVS switch ports, to name a few) that could end up stomping
over the top of the setup of a new device of the same name for a
different guest.

Since the kernel "create a name based on a template" functionality for
tap devices doesn't exist for macvtap, this patch for standard tap
devices is a bit different from the previous patch for macvtap - in
particular there was no previous "bitmap ID reservation system" or
overly-complex retry loop that needed to be removed. We simply find
and unused name, and pass that name on to the kernel instead of
"vnet%d".

This counter is also wrapped when either it gets to INT_MAX or if the
full name would overflow IFNAMSIZ-1 characters. In the case of
"vnet%d" and a 32 bit int, we would reach INT_MAX first, but possibly
someday someone will change the name from vnet to something else.

(NB: It is still possible for a user to provide their own
parameterized template name (e.g. "mytap%d") in the XML, and libvirt
will just pass that through to the kernel as it always has.)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-09-01 14:16:44 -04:00
Laine Stump
d7f38beb2e util: replace macvtap name reservation bitmap with a simple counter
There have been some reports that, due to libvirt always trying to
assign the lowest numbered macvtap / tap device name possible, a new
guest would sometimes be started using the same tap device name as
previously used by another guest that is in the process of being
destroyed *as the new guest is starting.

In some cases this has led to, for example, the old guest's
qemuProcessStop() code deleting a port from an OVS switch that had
just been re-added by the new guest (because the port name is based on
only the device name using the port). Similar problems can happen (and
I believe have) with nwfilter rules and bandwidth rules (which are
both instantiated based on the name of the tap device).

A couple patches have been previously proposed to change the ordering
of startup and shutdown processing, or to put a mutex around
everything related to the tap/macvtap device name usage, but in the
end no matter what you do there will still be possible holes, because
the device could be deleted outside libvirt's control (for example,
regular tap devices are automatically deleted when the qemu process
terminates, and that isn't always initiated by libvirt but could
instead happen completely asynchronously - libvirt then has no control
over the ordering of shutdown operations, and no opportunity to
protect it with a mutex.)

But this only happens if a new device is created at the same time as
one is being deleted. We can effectively eliminate the chance of this
happening if we end the practice of always looking for the lowest
numbered available device name, and instead just keep an integer that
is incremented each time we need a new device name. At some point it
will need to wrap back around to 0 (in order to avoid the IFNAMSIZ 15
character limit if nothing else), and we can't guarantee that the new
name really will be the *least* recently used name, but "math"
suggests that it will be *much* less common that we'll try to re-use
the *most* recently used name.

This patch implements such a counter for macvtap/macvlan, replacing
the existing, and much more complicated, "ID reservation" system. The
counter is set according to whatever macvtap/macvlan devices are
already in use by guests when libvirtd is started, incremented each
time a new device name is needed, and wraps back to 0 when either
INT_MAX is reached, or when the resulting device name would be longer
than IFNAMSIZ-1 characters (which actually is what happens when the
template for the device name is "maccvtap%d"). The result is that no
macvtap name will be re-used until the host has created (and possibly
destroyed) 99,999,999 devices.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-09-01 14:16:36 -04:00
Ján Tomko
0a37e0695b Split declarations from initializations
Split those initializations that depend on a statement
above them.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:11 +02:00
Ján Tomko
a5152f23e7 Move declarations before statements
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:11 +02:00
Laine Stump
5cad64ec03 qemu: remove unreachable code in qemuProcessStart()
Back when the original version of this chunk of code was added (commit
41b087198 in libvirt-0.8.1 in April 2010), we used virExecDaemonize()
to start the qemu process, and would continue on in the function
(which at that time was called qemudStartVMDaemon()) even if a -1 was
returned. So it was possible to get to this code with rv == -1 (it was
called "ret" in that version of the code).

In modern libvirt code, qemu is started with virCommandRun(); then we
call virPidFileReadPath(); those are the only two ways of setting "rv"
prior to this code being removed, and in either case if the new value
of rv < 0, then we immediately skip over the rest of the code to the
cleanup: label.

This means that the code being removed by this patch is
unreachable.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-08-24 23:46:51 -04:00
Michal Privoznik
9048dc4e62 qemuDomainBuildNamespace: Populate basic /dev from daemon's namespace
As mentioned in previous commit, populating domain's namespace
from pre-exec() hook is dangerous. This commit moves population
of the namespace with basic /dev nodes (e.g. /dev/null, /dev/kvm,
etc.) into daemon's namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:36 +02:00
Michal Privoznik
8da362fe62 qemu_domain_namespace: Repurpose qemuDomainBuildNamespace()
Okay, here is the deal. Currently, the way we build namespace is
very fragile. It is done from pre-exec hook when starting a
domain, after we mass closed all FDs and before we drop
privileges and exec() QEMU. This fact poses some limitations onto
the namespace build code, e.g. it has to make sure not to keep
any FD opened (not even through a library call), because it would
be leaked to QEMU. Also, it has to call only async signal safe
functions. These requirements are hard to meet - in fact as of my
commit v6.2.0-rc1~235 we are leaking a FD into QEMU by calling
libdevmapper functions.

To solve this issue and avoid similar problems in the future, we
should change our paradigm. We already have functions which can
populate domain's namespace with nodes from the daemon context.
If we use them to populate the namespace and keep only the bare
minimum in the pre-exec hook, we've mitigated the risk.

Therefore, the old qemuDomainBuildNamespace() is renamed to
qemuDomainUnshareNamespace() and new qemuDomainBuildNamespace()
function is introduced. So far, the new function is basically a
NOP and domain's namespace is still populated from the pre-exec
hook - next patches will fix it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:36 +02:00
Michal Privoznik
764eaf1aa4 qemu_domain_namespace: Rename qemuDomainCreateNamespace()
The name of this function is not very helpful, because it doesn't
create anything, it just flips a bit in a bitmask when domain is
starting up. Move the function internals into qemu_process.c and
forget the function ever existed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:33 +02:00
Michal Privoznik
90eee87569 qemu: Separate out namespace handling code
The qemu_domain.c file is big as is and we should split it into
separate semantic blocks. Start with code that handles domain
namespaces.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:32:27 +02:00
Ján Tomko
ee247e1d3f Use g_strfeev instead of virStringFreeList
Both accept a NULL value gracefully and virStringFreeList
does not zero the pointer afterwards, so a straight replace
is safe.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2020-08-03 15:37:36 +02:00
Ján Tomko
1edf164848 Remove redundant conditions
All of these have been checked earlier.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2020-08-03 15:19:28 +02:00
Ján Tomko
6c7ba7b496 qemu: Fix affinity typo
Fixes: 4c0398b528
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2020-07-22 15:51:26 +02:00
Bihong Yu
3ee423c363 qemu: pre-create the dbus directory in qemuStateInitialize
There are races condiction to make '/run/libvirt/qemu/dbus' directory in
virDirCreateNoFork() while concurrent start VMs, and get "failed to create
directory '/run/libvirt/qemu/dbus': File exists" error message. pre-create the
dbus directory in qemuStateInitialize.

Signed-off-by: Bihong Yu <yubihong@huawei.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2020-07-22 09:40:15 +02:00
Jiri Denemark
1031db3600 qemu: Properly set //cpu/@migratable default value for running domains
Since active domains which do not have the attribute already set were
not started by libvirt that probed for CPU migratable property, we need
to check this property on reconnect and update the domain definition
accordingly.

https://bugzilla.redhat.com/show_bug.cgi?id=1857967

Reported-by: Mark Mielke <mark.mielke@gmail.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-07-21 15:40:01 +02:00
Daniel Henrique Barboza
f187b2fb98 qemu_process.c: modernize qemuProcessQMPNew()
Use g_autoptr() and remove the 'cleanup' label.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200717211556.1024748-3-danielhb413@gmail.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2020-07-21 15:34:36 +02:00
Peter Krempa
db712b0673 qemuDomainDiskLookupByNodename: Remove unused 'idx'
All callers pass NULL as the value. Remove the argument.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-07-21 09:52:46 +02:00
Peter Krempa
c414ab00e2 qemuProcessHandleBlockThreshold: Report correct indexes
The index returned by qemuDomainDiskLookupByNodename is the position in
the backing chain rather than the index we report in the XML.

Since with -blockdev they differ now and additionally the disk source
also has an index we need to fix the 'threshold' events we report:

1) If it's the top level image we must always trigger the event without
   any suffix as we did until now

2) We must report the correct index

3) We must report the correct index also for the top level image, when
   blockdev is used.

This means that we need to potentially emit 2 events, one for the device
without the index and then when blockdev is used and the top level image
has an index we must do it also with the index.

This will fix it for blockdev cases, while also not removing previous
semantics.

https://bugzilla.redhat.com/show_bug.cgi?id=1857204

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-07-21 09:52:46 +02:00
Peter Krempa
4a19b7b832 qemuDomainDiskBackingStoreGetName: Remove unused argument
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-07-21 09:52:46 +02:00
Prathamesh Chavan
aca37c3fb2 qemu_domainjob: introduce privateData for qemuDomainJob
To remove dependecy of `qemuDomainJob` on job specific
paramters, a `privateData` pointer is introduced.
To handle it, structure of callback functions is
also introduced.

Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-07-20 15:34:58 +02:00
Michal Privoznik
824e349397 qemu: Use qemuSecuritySetSavedStateLabel() to label restore path
Currently, when restoring from a domain the path that the domain
restores from is labelled under qemuSecuritySetAllLabel() (and after
v6.3.0-rc1~108 even outside transactions). While this grants QEMU
the access, it has a flaw, because once the domain is restored, up
and running then qemuSecurityDomainRestorePathLabel() is called,
which is not real counterpart. In case of DAC driver the
SetAllLabel() does nothing with the restore path but
RestorePathLabel() does - it chown()-s the file back and since there
is no original label remembered, the file is chown()-ed to
root:root. While the apparent solution is to have DAC driver set the
label (and thus remember the original one) in SetAllLabel(), we can
do better.

Turns out, we are opening the file ourselves (because it may live on
a root squashed NFS) and then are just passing the FD to QEMU. But
this means, that we don't have to chown() the file at all, we need
to set SELinux labels and/or add the path to AppArmor profile.

And since we want to restore labels right after QEMU is done loading
the migration stream (we don't want to wait until
qemuSecurityRestoreAllLabel()), the best way to approach this is to
have separate APIs for labelling and restoring label on the restore
file.

I will investigate whether AppArmor can use the SavedStateLabel()
API instead of passing the restore path to SetAllLabel().

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1851016

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-07-10 14:18:07 +02:00
Daniel P. Berrangé
fd460ef561 qemu: stop checking virObjectUnref return value
Some, but not all, of the monitor event handlers check
the virObjectUnref return value to see if the domain
was disposed.

It should not be possible for this to happen, since
the function already holds a lock on the domain and
has only just acquired an extra reference on the
domain a few lines earlier.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-06-03 10:20:17 +01:00
Daniel Henrique Barboza
9665b27dba qemuProcessRefreshCPU: skip 'host-model' logic for pSeries guests
Commit v3.10.0-182-g237f045d9a ("qemu: Ignore fallback CPU attribute
on reconnect") forced CPU 'fallback' to ALLOW, regardless of user
choice. This fixed a situation in which guests created with older
Libvirt versions, which used CPU mode 'host-model' in runtime, would
fail to launch in a newer Libvirt if the fallback was set to FORBID.
This would lead to a scenario where the CPU was translated to 'host-model'
to 'custom', but then the FORBID setting would make the translation
process fail.

PSeries can operate with 'host-model' in runtime due to specific PPC64
mechanics regarding compatibility mode. The update() implementation of
the cpuDriverPPC64 driver is a NO-OP if CPU mode is 'host-model', and
the driver does not implement translate(). The commit mentioned above
is causing PSeries guests to get their 'fallback' setting to ALLOW,
overwriting user choice, exposing a design problem in
qemuProcessRefreshCPU() - for PSeries guests, handling 'host-model'
as it is being done does not apply.

All other cpuArchDrivers implements update() and changes guest mode
to VIR_CPU_MODE_CUSTOM, meaning that PSeries is currently the only
exception to this logic. Let's make it official.

https://bugzilla.redhat.com/show_bug.cgi?id=1660711

Suggested-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200525123945.4049591-2-danielhb413@gmail.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2020-05-25 16:20:25 +02:00
Daniel Henrique Barboza
f600c42627 qemu_process.c: modernize qemuProcessUpdateCPU code path
Use automatic cleanup on qemuProcessUpdateCPU and the functions called
by it.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200522195620.3843442-5-danielhb413@gmail.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2020-05-25 12:31:14 +02:00
Peter Krempa
78d30aa0bf qemu: Prepare for testing of 'netdev_add' props via qemuxml2argvtest
qemuxml2argv test suite is way more comprehensive than the hotplug
suite. Since we share the code paths for monitor and command line
hotplug we can easily test the properties of devices against the QAPI
schema.

To achieve this we'll need to skip the JSON->commandline conversion for
the test run so that we can analyze the pure properties. This patch adds
flags for the comand line generator and hook them into the
JSON->commandline convertor for -netdev. An upcoming patch will make use
of this new infrastructure.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-05-20 09:41:58 +02:00
Michal Privoznik
8fd2749b2d qemuProcessStop: Reattach NVMe disks a domain is mirroring into
If the mirror destination is not a file but a NVMe disk, then
call qemuHostdevReAttachOneNVMeDisk() to reattach the NVMe back
to the host.

This would be done by blockjob code when the job finishes, but in
this case the job won't finish - QEMU is killed meanwhile.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1825785

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-05-18 15:14:27 +02:00
Michal Privoznik
0230e38384 qemuProcessStop: Use XATTRs to restore seclabels on disks a domain is mirroring into
In v5.10.0-rc1~42 (which was later fixed in v6.0.0-rc1~487) I am
removing XATTRs for a file that QEMU is mirroring a disk into but
it is killed meanwhile. Well, we can call
qemuSecurityRestoreImageLabel() which will not only remove XATTRs
but also use them to restore the original owner of the file.

This would be done by blockjob code when the job finishes, but in
this case the job won't finish - QEMU is killed meanwhile

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-05-18 15:13:14 +02:00
Ján Tomko
006782a8bc qemu: only stop external devices after the domain
A failure in qemuProcessLaunch would lead to qemuExtDevicesStop
being called twice - once in the cleanup section and then again
in qemuProcessStop.

However, the first one is called while the QEMU process is
still running, which is too soon for the swtpm process, because
the swtmp_ioctl command can lock up:

https://bugzilla.redhat.com/show_bug.cgi?id=1822523

Remove the first call and only leave the one in qemuProcessStop,
which is called after the QEMU process is killed.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-05-13 15:29:37 +02:00
Peter Krempa
3bcbdc51da qemu: process: Don't clear QEMU_CAPS_BLOCKDEV when SD card is present
Help QEMU in deprecation of -drive if=none without the need to refactor
all old boards. Stop masking out -blockdev support when -drive if=sd
needs to be used. We achieve this by forbidding blockjobs and
special-casing all other code paths. Blockjobs are sacrificed in this
case as SD cards are a corner case for some ARM boards and are thus not
used commonly.

https://bugzilla.redhat.com/show_bug.cgi?id=1821692

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-05-12 06:55:00 +02:00
Peter Krempa
e664abb62e qemu: Prepare for 'sd' card use together with blockdev
SD cards need to be instantiated via -drive if=sd. This means that all
cases where we use the blockdev path need to be special-cased for SD
cards.

Note that at this point QEMU_CAPS_BLOCKDEV is still cleared if the VM
config has a SD card.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-05-12 06:55:00 +02:00
Peter Krempa
d876a93f05 qemu: Handle cases when 'qomName' isn't present
Use the drive alias for all cases when we can't generate qomName. This
is meant to handle disks on 'sd' bus which are instantiated via -drive
if=sd as there isn't any specific QOM name for them.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-05-12 06:55:00 +02:00
Peter Krempa
cc4a277db2 qemu: Rename qemuDiskBusNeedsDriveArg to qemuDiskBusIsSD
The function effectively boils down to whether the disk is 'SD'. Since
we'll need to make more decisions based on the fact whether the disk is
on the SD bus, rename the function.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-05-12 06:54:59 +02:00
Michal Privoznik
1d3a9ee9da qemu: Make memory path generation embed driver aware
So far, libvirt generates the following path for memory:

  $memoryBackingDir/$id-$shortName/ram-nodeN

where $memoryBackingDir is the path where QEMU mmaps() memory for
the guest (e.g. /var/lib/libvirt/qemu/ram), $id is domain ID
and $shortName is shortened version of domain name. So for
instance, the generated path may look something like this:

  /var/lib/libvirt/qemu/ram/1-QEMUGuest/ram-node0

While in case of embed driver the following path would be
generated by default:

  $root/lib/qemu/ram/1-QEMUGuest/ram-node0

which is not clashing with other embed drivers, we allow users to
override the default and have all embed drivers use the same
prefix. This can create clashing paths. Fortunately, we can reuse
the approach for machined name generation
(v6.1.0-178-gc9bd08ee35) and include part of hash of the root in
the generated path.

Note, the important change is in qemuGetMemoryBackingBasePath().
The rest is needed to pass driver around.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-04-07 15:26:32 +02:00
Michal Privoznik
bf54784cb1 qemu: Make hugepages path generation embed driver aware
So far, libvirt generates the following path for hugepages:

  $mnt/libvirt/qemu/$id-$shortName

where $mnt is the mount point of hugetlbfs corresponding to
hugepages of desired size (e.g. /dev/hugepages), $id is domain ID
and $shortName is shortened version of domain name. So for
instance, the generated path may look something like this:

  /dev/hugepages/libvirt/qemu/1-QEMUGuest

But this won't work with embed driver really, because if there
are two instances of embed driver, and they both want to start a
domain with the same name and with hugepages, both drivers will
generate the same path which is not desired. Fortunately, we can
reuse the approach for machined name generation
(v6.1.0-178-gc9bd08ee35) and include part of hash of the root in
the generated path.

Note, the important change is in qemuGetBaseHugepagePath(). The
rest is needed to pass driver around.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-04-07 15:26:26 +02:00
Daniel P. Berrangé
d2954c0729 qemu: ensure domain event thread is always stopped
In previous commit:

  commit e6afacb0fe
  Author: Daniel P. Berrangé <berrange@redhat.com>
  Date:   Wed Feb 12 12:26:11 2020 +0000

    qemu: start/stop an event loop thread for domains

A bogus comment was added claiming we didn't need to shutdown the
event thread in the qemuProcessStop method, because this would be
done in the monitor EOF callback. This was wrong because the EOF
callback only runs in the case of a QEMU crash or a guest initiated
clean shutdown & poweroff.  In the case where the libvirt admin
calls virDomainDestroy, the EOF callback never fires because we
have already unregistered the event callbacks. We must thus always
attempt to stop the event thread in qemuProcessStop.

Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reported-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-03-30 16:48:15 +01:00
Marc-André Lureau
db670b8d67 qemu: prepare and stop the dbus daemon
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-03-24 15:57:33 +01:00
Michal Privoznik
a02c589886 qemuProcessStartManagedPRDaemon: Don't pass -f pidfile to the daemon
Now, that our virCommandSetPidFile() is more intelligent we don't
need to rely on the daemon to create and lock the pidfile and use
virCommandSetPidFile() at the same time.

NOTE that as advertised in the previous commit, this was
temporarily broken, because both virCommand and
qemuProcessStartManagedPRDaemon() would try to lock the pidfile.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2020-03-24 15:53:03 +01:00
Gaurav Agrawal
d2c43a5b51 qemu: convert DomainLogContext class to use GObject
Signed-off-by: Gaurav Agrawal <agrawalgaurav@gnome.org>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2020-03-16 17:28:39 +01:00
Ján Tomko
b0eea635b3 Use g_strerror instead of virStrerror
Remove lots of stack-allocated buffers.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-03-13 17:26:55 +01:00
Nikolay Shirokovskiy
b47e3b9b5c qemu: agent: sync once if qemu has serial port event
Sync was introduced in [1] to check for ga presence. This
check is racy but in the era before serial events are available
there was not better solution I guess.

In case we have the events the sync function is different. It allows us
to flush stateless ga channel from remnants of previous communications.
But we need to do it only once. Until we get timeout on issued command
channel state is ok.

[1] qemu_agent: Issue guest-sync prior to every command

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-03-12 18:07:50 +01:00
Daniel P. Berrangé
a18f2c52ac qemu: convert agent to use the per-VM event loop
This converts the QEMU agent APIs to use the per-VM
event loop, which involves switching from virEvent APIs
to GMainContext / GSource APIs.

A GSocket is used as a convenient way to create a GSource
for a socket, but is not yet used for actual I/O.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-03-11 14:45:01 +00:00
Daniel P. Berrangé
436a56e37d qemu: convert monitor to use the per-VM event loop
This converts the QEMU monitor APIs to use the per-VM
event loop, which involves switching from virEvent APIs
to GMainContext / GSource APIs.

A GSocket is used as a convenient way to create a GSource
for a socket, but is not yet used for actual I/O.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-03-11 14:44:55 +00:00
Daniel P. Berrangé
92890fbfa1 qemu: start/stop an event thread for QMP probing
In common with regular QEMU guests, the QMP probing
will need an event loop for handling monitor I/O
operations.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-03-11 14:44:47 +00:00
Daniel P. Berrangé
e6afacb0fe qemu: start/stop an event loop thread for domains
The event loop thread will be responsible for handling
any per-domain I/O operations, most notably the QEMU
monitor and agent sockets.

We start this event loop when launching QEMU, but stopping
the event loop is a little more complicated. The obvious
idea is to stop it in qemuProcessStop(), but if we do that
we risk loosing the final events from the QEMU monitor, as
they might not have been read by the event thread at the
time we tell the thread to stop.

The solution is to delay shutdown of the event thread until
we have seen EOF from the QEMU monitor, and thus we know
there are no further events to process.

Note that this assumes that we don't have events to process
from the QEMU agent.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-03-11 14:44:44 +00:00
Michal Privoznik
13eb6c1468 qemu: Tell secdrivers which images are top parent
When preparing images for block jobs we modify their seclabels so
that QEMU can open them. However, as mentioned in the previous
commit, secdrivers base some it their decisions whether the image
they are working on is top of of the backing chain. Fortunately,
in places where we call secdrivers we know this and the
information can be passed to secdrivers.

The problem is the following: after the first blockcommit from
the base to one of the parents the XATTRs on the base image are
not cleared and therefore the second attempt to do another
blockcommit fails. This is caused by blockcommit code calling
qemuSecuritySetImageLabel() over the base image, possibly
multiple times (to ensure RW/RO access). A naive fix would be to
call the restore function. But this is not possible, because that
would deny QEMU the access to the base image.  Fortunately, we
can use the fact that seclabels are remembered only for the top
of the backing chain and not for the rest of the backing chain.
And thanks to the previous commit we can tell secdrivers which
images are top of the backing chain.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1803551

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-03-09 14:14:55 +01:00
Daniel P. Berrangé
5bff668dfb src: improve thread naming with human targetted names
Historically threads are given a name based on the C function,
and this name is just used inside libvirt. With OS level thread
naming this name is now visible to debuggers, but also has to
fit in 15 characters on Linux, so function names are too long
in some cases.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-03-05 12:23:04 +00:00
Ján Tomko
b164eac5e1 qemuExtDevicesStart: pass logManager
Pass logManager to qemuExtDevicesStart for future usage.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
2020-03-04 12:08:50 +01:00
Ján Tomko
feb69a19ac conf: do not pass vm object to virDomainClearNetBandwidth
This function only uses the domain definition.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-02-25 17:50:47 +01:00
Ján Tomko
7e0d11be5b virsh: include virutil.h where used
Include virutil.h in all files that use it,
instead of relying on it being pulled in somehow.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-02-24 23:15:50 +01:00
Michal Privoznik
74ec3f4d7d qemu: Don't explicitly remove pidfile after virPidFileForceCleanupPath()
In two places where virPidFileForceCleanupPath() is called, we
try to unlink() the pidfile again. This is needless because
virPidFileForceCleanupPath() has done just that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-02-20 12:57:19 +01:00
zhenwei pi
26badd13e8 qemu: support Panic Crashloaded event handling
Pvpanic device supports bit 1 as crashloaded event, it means that
guest actually panicked and run kexec to handle error by guest side.

Handle crashloaded as a lifecyle event in libvirt.

Test case:
Guest side:
before testing, we need make sure kdump is enabled,
1, build new pvpanic driver (with commit from upstream
   e0b9a42735f2672ca2764cfbea6e55a81098d5ba
   191941692a3d1b6a9614502b279be062926b70f5)
2, insmod new kmod
3, enable crash_kexec_post_notifiers,
  # echo 1 > /sys/module/kernel/parameters/crash_kexec_post_notifiers
4, trigger kernel panic
  # echo 1 > /proc/sys/kernel/sysrq
  # echo c > /proc/sysrq-trigger

Host side:
1, build new qemu with pvpanic patches (with commit from upstream
   600d7b47e8f5085919fd1d1157f25950ea8dbc11
   7dc58deea79a343ac3adc5cadb97215086054c86)
2, build libvirt with this patch
3, handle lifecycle event and trigger guest side panic
  # virsh event stretch --event lifecycle
  event 'lifecycle' for domain stretch: Crashed Crashloaded
  events received: 1

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-02-07 14:05:25 +00:00
Jiri Denemark
80791859ac qemu: Pass machine type to virQEMUCapsIsCPUModeSupported
The usability of a specific CPU mode may depend on machine type, let's
prepare for this by passing it to virQEMUCapsIsCPUModeSupported.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-02-07 09:19:02 +01:00
Michal Privoznik
a37a8c569d Drop virAtomic module
Now, that every use of virAtomic was replaced with its g_atomic
equivalent, let's remove the module.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-02-02 16:36:58 +01:00
Michal Privoznik
7390ff3caa src: Drop virAtomicIntDecAndTest() with g_atomic_int_dec_and_test()
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-02-02 16:36:56 +01:00
Michal Privoznik
574678a27f src: Replace virAtomicIntInc() with g_atomic_int_add()
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-02-02 16:36:54 +01:00
Daniel P. Berrangé
068efae5b1 qemu: add support for running QEMU driver in embedded mode
This enables support for running QEMU embedded to the calling
application process using a URI:

   qemu:///embed?root=/some/path

Note that it is important to keep the path reasonably short to
avoid risk of hitting the limit on UNIX socket path names
which is 108 characters.

When using the embedded mode with a root=/var/tmp/embed, the
driver will use the following paths:

                logDir: /var/tmp/embed/log/qemu
           swtpmLogDir: /var/tmp/embed/log/swtpm
         configBaseDir: /var/tmp/embed/etc/qemu
              stateDir: /var/tmp/embed/run/qemu
         swtpmStateDir: /var/tmp/embed/run/swtpm
              cacheDir: /var/tmp/embed/cache/qemu
                libDir: /var/tmp/embed/lib/qemu
       swtpmStorageDir: /var/tmp/embed/lib/swtpm
 defaultTLSx509certdir: /var/tmp/embed/etc/pki/qemu

These are identical whether the embedded driver is privileged
or unprivileged.

This compares with the system instance which uses

                logDir: /var/log/libvirt/qemu
           swtpmLogDir: /var/log/swtpm/libvirt/qemu
         configBaseDir: /etc/libvirt/qemu
              stateDir: /run/libvirt/qemu
         swtpmStateDir: /run/libvirt/qemu/swtpm
              cacheDir: /var/cache/libvirt/qemu
                libDir: /var/lib/libvirt/qemu
       swtpmStorageDir: /var/lib/libvirt/swtpm
 defaultTLSx509certdir: /etc/pki/qemu

At this time all features present in the QEMU driver are available when
running in embedded mode, availability matching whether the embedded
driver is privileged or unprivileged.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-27 11:04:03 +00:00
Pavel Hrdina
894556ca81 secret: move virSecretGetSecretString into virsecret
The function virSecretGetSecretString calls into secret driver and is
used from other hypervisors drivers and as such makes more sense in
util.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-17 15:52:37 +01:00
Daniel P. Berrangé
7b9645a7d1 util: replace atomic ops impls with g_atomic_int*
Libvirt's original atomic ops impls were largely copied
from GLib's code at the time. The only API difference
was that libvirt's virAtomicIntInc() would return a
value, but g_atomic_int_inc was void. We thus use
g_atomic_int_add(v, 1) instead, though this means
virAtomicIntInc() now returns the original value,
instead of the new value.

This rewrites libvirt's impl in terms of g_atomic_int*
as a short term conversion. The key motivation was to
quickly eliminate use of GNULIB's verify_expr() macro
which is not a direct match for G_STATIC_ASSERT_EXPR.
Long term all the callers should be updated to use
g_atomic_int* directly.

Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-17 10:02:00 +00:00
Jiri Denemark
bd04d63ad9 qemu: Don't emit SUSPENDED_POSTCOPY event on destination
When pause-before-switchover QEMU capability is enabled, we get STOP
event before MIGRATION event with postcopy-active state. To properly
handle post-copy migration and emit correct events commit
v4.10.0-rc1-4-geca9d21e6c added a hack to
qemuProcessHandleMigrationStatus which translates the paused state
reason to VIR_DOMAIN_PAUSED_POSTCOPY and emits
VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY event when migration state changes
to post-copy.

However, the code was effective on both sides of migration resulting in
a confusing VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY event on the destination
host, where entering post-copy mode is already properly advertised by
VIR_DOMAIN_EVENT_RESUMED_POSTCOPY event.

https://bugzilla.redhat.com/show_bug.cgi?id=1791458

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-01-16 15:12:19 +01:00
Michal Privoznik
50d7465f3d qemu_firmware: Pass virDomainDef into qemuFirmwareFillDomain()
This function needs domain definition really, we don't need to
pass the whole domain object. This saves couple of dereferences
and characters esp. in more checks to come.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-07 16:26:47 +01:00
Peter Krempa
5632ed8bad qemu: process: Terminate backup job on VM destroy
Commit d75f865fb9 caused a job-deadlock if
a VM is running the backup job and being destroyed as it removed the
cleanup of the async job type and there was nothing to clean up the
backup job.

Add an explicit cleanup of the backup job when destroying a VM.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:36 +01:00
Peter Krempa
728b993c8a qemu: Reset the node-name allocator in qemuDomainObjPrivateDataClear
qemuDomainObjPrivateDataClear clears state which become invalid after VM
stopped running and the node name allocator belongs there.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:35 +01:00
Nikolay Shirokovskiy
6c6d93bc62 qemu: hide details of fake reboot
If we use fake reboot then domain goes thru running->shutdown->running
state changes with shutdown state only for short period of time.  At
least this is implementation details leaking into API. And also there is
one real case when this is not convinient. I'm doing a backup with the
help of temporary block snapshot (with the help of qemu's API which is
used in the newly created libvirt's backup API). If guest is shutdowned
I want to continue to backup so I don't kill the process and domain is
in shutdown state. Later when backup is finished I want to destroy qemu
process. So I check if it is in shutdowned state and destroy it if it
is. Now if instead of shutdown domain got fake reboot then I can destroy
process in the middle of fake reboot process.

After shutdown event we also get stop event and now as domain state is
running it will be transitioned to paused state and back to running
later. Though this is not critical for the described case I guess it is
better not to leak these details to user too. So let's leave domain in
running state on stop event if fake reboot is in process.

Reconnection code handles this patch without modification. It detects
that qemu is not running due to shutdown and then calls qemuProcessShutdownOrReboot
which reboots as fake reboot flag is set.

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-24 09:22:40 +03:00
Daniel Henrique Barboza
7a7d36055c qemu_process.c: remove 'cleanup' label from qemuProcessCreatePretendCmd()
The 'cleanup' flag is doing no cleaup in this function. We can
remove it and return NULL on error or qemuBuildCommandLine().

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-20 18:31:51 -05:00
Daniel Henrique Barboza
d8eb3ab9e1 qemu_process.c: remove cleanup labels after g_auto*() changes
The g_auto*() changes made by the previous patches made a lot
of 'cleanup' labels obsolete. Let's remove them.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-20 18:31:51 -05:00
Daniel Henrique Barboza
d234efc59a qemu_process.c: use g_autoptr()
Change all feasible pointers to use g_autoptr().

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-20 18:31:51 -05:00
Daniel Henrique Barboza
982ea95142 qemu_process.c: use g_autofree
Change all feasible strings and scalar pointers to use g_autofree.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-20 18:31:51 -05:00
Michal Privoznik
8e2026cc18 qemu: Generate command line of NVMe disks
Now, that we have everything prepared, we can generate command
line for NVMe disks.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:44 +01:00
Daniel P. Berrangé
766c8ae963 Revert "qemu: directly create virResctrlInfo ignoring capabilities"
This reverts commit 7be5fe66cd.

This commit broke resctrl, because it missed the fact that the
virResctrlInfoGetCache() has side-effects causing it to actually
change the virResctrlInfo parameter, not merely get data from
it.

This code will need some refactoring before we can try separating
it from virCapabilities again.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-12 11:16:44 +00:00
Daniel P. Berrangé
1902356231 qemu: keep capabilities when running QEMU as root
When QEMU uid/gid is set to non-root this is pointless as if we just
used a regular setuid/setgid call, the process will have all its
capabilities cleared anyway by the kernel.

When QEMU uid/gid is set to root, this is almost (always?) never
what people actually want. People make QEMU run as root in order
to access some privileged resource that libvirt doesn't support
yet and this often requires capabilities. As a result they have
to go find the qemu.conf param to turn this off. This is not
viable for libguestfs - they want to control everything via the
XML security label to request running as root regardless of the
qemu.conf settings for user/group.

Clearing capabilities was implemented originally because there
was a proposal in Fedora to change permissions such that root,
with no capabilities would not be able to compromise the system.
ie a locked down root account. This never went anywhere though,
and as a result clearing capabilities when running as root does
not really get us any security benefit AFAICT. The root user
can easily do something like create a cronjob, which will then
faithfully be run with full capabilities, trivially bypassing
the restriction we place.

IOW, our clearing of capabilities is both useless from a security
POV, and breaks valid use cases when people need to run as root.

This removes the clear_emulator_capabilities configuration
option from qemu.conf, and always runs QEMU with capabilities
when root.  The behaviour when non-root is unchanged.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-11 16:01:20 +00:00
Peter Krempa
3656bb0a13 qemu: domain: Introduce QEMU_ASYNC_JOB_BACKUP async job type
We will want to use the async job infrastructure along with all the APIs
and event for the backup job so add the backup job as a new async job
type.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-10 12:41:57 +01:00
Daniel P. Berrangé
7be5fe66cd qemu: directly create virResctrlInfo ignoring capabilities
We always refresh the capabilities object when using virResctrlInfo
during process startup. This is undesirable overhead, because we can
just directly create a virResctrlInfo instead.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-09 10:17:27 +00:00
Daniel P. Berrangé
adf009b48f qemu: use host CPU object directly
Avoid grabbing the whole virCapsPtr object when we only need the
host CPU information.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-09 10:17:27 +00:00
Daniel P. Berrangé
1a1d848694 qemu: use NUMA capabilities object directly
Avoid grabbing the whole virCapsPtr object when we only need the
NUMA information.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-09 10:17:27 +00:00
Daniel P. Berrangé
6cc992bd1a conf: move NUMA capabilities into self contained object
The NUMA cells are stored directly in the virCapsHostPtr
struct. This moves them into their own struct allowing
them to be stored independantly of the rest of the host
capabilities. The change is used as an excuse to switch
the representation to use a GPtrArray too.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-09 10:17:27 +00:00
Daniel P. Berrangé
bc1676fc2f qemu: drop virCapsPtr param & vars from many APIs
Now that the domain XML APIs don't use virCapsPtr we can stop passing it
around many QEMU driver methods.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-09 10:17:27 +00:00
Daniel P. Berrangé
78d8228eec conf: drop virCapsPtr param from APIs for saving domains
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-09 10:17:27 +00:00
Daniel P. Berrangé
24d87d2e88 conf: drop virCapsPtr param from domain APIs for copying config
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-09 10:17:27 +00:00
Daniel P. Berrangé
b5f591cdb4 conf: drop virCapsPtr param from domain post parse & validate APIs
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-09 10:17:27 +00:00
Daniel P. Berrangé
908701c64a conf: sanitize virDomainSaveStatus & virDomainSaveConfig APIs
Our normal practice is for the object type to be the name prefix, and
the object instance be the first parameter passed in.

Rename these to virDomainObjSave and virDomainDefSave moving their
primary parameter to be the first one. Ensure that the xml options
are passed into both functions in prep for future work.

Finally enforce checking of the return type and mark all parameters
as non-NULL.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-09 10:15:16 +00:00
Daniel P. Berrangé
bce3b0807e qemu: cache host arch separately from virCapsPtr
As part of a goal to eliminate the need to use virCapsPtr for anything
other than the virConnectGetCapabilies() API impl, cache the host arch
against the QEMU driver struct and use that field directly.

In the tests we move virArchFromHost() globally in testutils.c so that
every test runs with a fixed default architecture reported.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-09 10:15:15 +00:00
Michal Privoznik
516b867685 qemuProcessStop: Remove image metadata only when allowed
In v5.9.0-370-g8fa0374c5b I've tried to fix a bug by removing
some stale XATTRs in qemuProcessStop(). However, I forgot to
do nothing when the VIR_QEMU_PROCESS_STOP_NO_RELABEL flag was
specified.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-12-05 15:08:28 +01:00
Peter Krempa
cd1e6fd478 qemu: process: Re-process qemu capability lockout in qemuProcessPrepareQEMUCaps
We clear some capabilities here so the lockouts need to be
re-evaluated.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-03 15:26:54 +01:00
Peter Krempa
78c2a8b934 qemu: process: Move handling of qemu capability overrides
Do all post-processing of capabilities in qemuProcessPrepareQEMUCaps.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-03 15:26:54 +01:00
Peter Krempa
97c9ece79b qemu: process: Move clearing of QEMU_CAPS_CHARDEV_FD_PASS to qemuProcessPrepareQEMUCaps
Move the post-processing of the QEMU_CAPS_CHARDEV_FD_PASS flag to the
new function.

The clearing of the capability is based on the presence of
VIR_QEMU_PROCESS_START_STANDALONE so we must also pass in the process
start flags.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-03 15:26:54 +01:00
Peter Krempa
3a075524d9 qemu: process: Move clearing of the BLOCKDEV capability to qemuProcessPrepareQEMUCaps
Start aggregating all capability post-processing code in one place.

The comment was modified while moving it as it was mentioning floppies
which are no longer clearing the blockdev capability.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-03 15:26:54 +01:00
Peter Krempa
dbbc9a3c40 qemu: Move and rename qemuDomainUpdateQEMUCaps
The function is now used only in qemu_process.c so move it there and
name it 'qemuProcessPrepareQEMUCaps' which is more appropriate to what
it's doing.

The reworded comment now mentions that it will also post-process the
caps for VM startup.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-03 15:26:54 +01:00
Peter Krempa
530d7a73f4 qemu: process: Don't try to redetect missing qemuCaps on reconnect
The redetection was originally added in 43c01d3838 as a way to recover
from libvirtd upgrade from the time when we didn't persist the qemu
capabilities in the status XML. Also this the oldest supported qemu by
more than two years.

Even if somebody would have a running VM running at least qemu 1.5 with
such an old libvirt we certainly wouldn't do the right thing by
redetecting the capabilities and then trying to communicate with qemu.

For now it will be the best to just stop considering this scenario any
more and error out for such VM.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-03 15:26:54 +01:00
Peter Krempa
2504dbeb5d qemu: process: Make it obvious that virDomainDefPostParse is called with NULL opaque
Commit c90fb5a828 added explicit use of the private copy of the qemu
capabilities to various places. The change to qemuProcessInit was bogus
though as at the point where we re-initiate the post parse callbacks
priv->qemuCaps is still NULL as we clear it after shutdown of the VM and
don't initiate it until a later point.

Using the value from priv->qemuCaps might mislead readers of the code
into thinking that something useful is being passed at that point so go
with an explicit NULL instead.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-03 15:26:54 +01:00
Peter Krempa
ccde9ca1f4 qemu: process: Move block job refresh after async job recovery
Block jobs may be members of async jobs so it makes more sense to
refresh block job state after we do steps for async job recovery.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-11-27 15:59:33 +01:00
Laine Stump
fdcd273be2 conf: return a const from virDomainNetGetActualVirtPortProfile
This also isn't required (due to the vportprofile being stored in the
NetDef as a pointer rather than being directly contained), but it
seemed dishonest to not mark it as const (and thus permit users to
modify its contents)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-11-25 15:29:56 -05:00
Michal Privoznik
8fa0374c5b qemuProcessStop: Remove image metadata for running mirror jobs
If user starts a blockcommit or a blockcopy then we modify access
for qemu on both images and leave it like that until the job
terminates.  So far so good. Problem is, if user instead of
terminating the job (where we would modify the access again so
that the state before the job is restored) calls destroy on the
domain or if qemu dies whilst executing the block job.  In this
case we don't ever clear the access we granted at the beginning.
To fix this, maybe a bit harsh approach is used, but it works:
after all labels were restored (that is after
qemuSecurityRestoreAllLabel() was called), we iterate over each
disk in the domain and remove XATTRs from the whole backing chain
and also from any file the disk is being mirrored to.

This would have been done at the time of pivot, but it isn't
because user decided to kill the domain instead. If we don't do
this and leave some XATTRs behind the domain might be unable to
start.

Also, secdriver can't do this because it doesn't know if there is
any job running. It's outside of its scope - the hypervisor
driver is responsible for calling secdriver's APIs.

Moreover, this is safe to call because we don't remember labels
for any member of a backing chain except of the top layer. But
that one was restored in qemuSecurityRestoreAllLabel() call done
earlier. Therefore, not only we don't remember labels (and thus
this is basically a NOP for other images in the backing chain) it
is also safe to call this when no blockjob was started in the
first place, or if some parts of the backing chain are shared
with some other domains - this is NOP, unless a block job is
active at the time of domain destroy.

https://bugzilla.redhat.com/show_bug.cgi?id=1741456#c19

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-22 10:48:14 +01:00
Peter Krempa
86085c9a2f qemu: Instantiate pflash via -machine when using blockdev
Install the convertor function which enables the internals that will use
-blockdev to make qemu open the firmware image and stop using -drive.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2019-11-22 08:32:25 +01:00
Jiri Denemark
76baa994b7 qemu: Rename virQEMUCaps{Get,Fetch}CPUDefinitions
The functions return virDomainCapsCPUModelsPtr and thus they should be
called *CPUModels for consistency. Functions called *CPUDefinitions will
work on qemuMonitorCPUDefsPtr.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-20 17:22:05 +01:00
Jiri Denemark
a94f67ee69 qemu: Change return type of virQEMUCapsFetchCPUDefinitions
The function would return a valid virDomainCapsCPUModelsPtr with empty
CPU models list if query-cpu-definitions exists in QEMU, but returns
GenericError meaning it's not in fact implemented. This behaviour is a
bit strange especially after such virDomainCapsCPUModels structure is
stored in capabilities XML and parsed back, which will result in NULL
virDomainCapsCPUModelsPtr rather than a structure containing nothing.

Let's just keep virDomainCapsCPUModelsPtr NULL if the QMP command is not
implemented and change the return value to int so that callers can
easily check for failure or success.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-20 17:22:05 +01:00
Jiri Denemark
4d74990143 qemu: Filter models in virQEMUCapsGetCPUDefinitions
Some callers of virQEMUCapsGetCPUDefinitions will need to filter the
returned list of CPU models. Let's add the filtering parameters directly
to virQEMUCapsGetCPUDefinitions to avoid copying the CPU models list
twice.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-20 17:22:05 +01:00
Jiri Denemark
e20a11eecf qemu: Copy CPU models in virQEMUCapsGetCPUDefinitions
Rather than returning a direct pointer the list stored in qemuCaps the
function now creates a new copy of the CPU models list.

The main purpose of this seemingly useless change is to update callers
to free the result returned by virQEMUCapsGetCPUDefinitions because the
internals of this function will change significantly in the following
patches.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-20 17:22:05 +01:00
Daniel Henrique Barboza
6c63adc4a0 qemu: remove unneeded cleanup labels
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-11-19 15:22:37 +01:00
Michal Privoznik
6c37ee4da2 qemuProcessStop: Set @def early
The @def variable holds pointer to the domain defintion, but is
set only somewhere in the middle of the function. This is
suboptimal.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2019-11-19 10:25:56 +01:00