libvirt

mirror of https://gitlab.com/libvirt/libvirt.git synced 2025-01-10 14:57:42 +00:00

Author	SHA1	Message	Date
Michal Privoznik	655f67c68a	qemu_process: Drop needless check in qemuProcessNeedMemoryBackingPath() The aim of this function is to return whether domain definition and/or memory device that user intends to hotplug needs a private path inside cfg->memoryBackingDir. The rule for the memory device that's being hotplugged includes checking whether corresponding guest NUMA node needs memoryBackingDir. Well, while the rationale behind it makes sense it is not necessary to check for that really - just a few lines above every guest NUMA node was checked exactly for that. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2021-05-18 17:47:58 +02:00
Michal Privoznik	4d779874ef	qemu_process: Deduplicate code in qemuProcessNeedHugepagesPath() The aim of qemuProcessNeedHugepagesPath() is to return whether guest needs private path inside HugeTLBFS mounts (deduced from domain definition @def) or whether the memory device that user is hotplugging in needs the private path (deduced from the @mem argument). The actual creation of the path is done in the only caller qemuProcessBuildDestroyMemoryPaths(). The rule for the first case (@def) and the second case (@mem) is the same (domain has a DIMM device that has HP requested) and is written twice. Move the logic into a function to deduplicate the code. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2021-05-18 17:47:58 +02:00
Michal Privoznik	310b37e486	qemu: Don't double free @node_cpus in qemuProcessSetupPid() When placing vCPUs into CGroups the qemuProcessSetupPid() is called which then enters a for() loop (around its middle) where it calls virDomainNumaGetNodeCpumask() for each guest NUMA node. But the latter returns only a pointer not new reference/copy and thus the caller must not free it. But the variable is decorated with g_autoptr() which leads to a double free. Fixes: `2d37d8dbc9` Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2021-04-23 11:02:21 +02:00
Luyao Zhong	2d37d8dbc9	qemu: Add support for 'restrictive' mode in numatune Signed-off-by: Luyao Zhong <luyao.zhong@intel.com> Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2021-04-19 11:39:21 +02:00
Michal Privoznik	c8238579fb	lib: Drop internal virXXXPtr typedefs Historically, we declared pointer type to our types: typedef struct _virXXX virXXX; typedef virXXX *virXXXPtr; But usefulness of such declaration is questionable, at best. Unfortunately, we can't drop every such declaration - we have to carry some over, because they are part of public API (e.g. virDomainPtr). But for internal types - we can drop them and use what every other C project uses 'virXXX *'. This change was generated by a very ugly shell script that generated sed script which was then called over each file in the repository. For the shell script refer to the cover letter: https://listman.redhat.com/archives/libvir-list/2021-March/msg00537.html Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2021-04-13 17:00:38 +02:00
Tim Wiederhake	c5d4d0198f	qemuProcessUpdateGuestCPU: Check host cpu for forbidden features See https://bugzilla.redhat.com/show_bug.cgi?id=1840770 Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2021-03-26 11:40:55 +01:00
Jiri Denemark	1107c0b9c3	Do not check return value of VIR_REALLOC_N Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>	2021-03-22 12:44:18 +01:00
Jiri Denemark	b8c919b5b4	qemu: Drop redundant checks for qemuCaps before virQEMUCapsGet virQEMUCapsGet checks for qemuCaps itself, no need to do it explicitly. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>	2021-03-22 12:44:18 +01:00
Peter Krempa	8967ad7be6	qemu: backup: Restore security label on backup disk store image on VM termination When the backup job is terminated normally the security label is restored by the blockjob finishing handler. If the VM dies or is destroyed that wouldn't happen as the blockjob handler wouldn't be called. Restore the security label on disk store where we remember that the job was running at the point when 'qemuBackupJobTerminate' was called. Not resetting the security label means that we also leak the xattr attributes remembering the label which prevents any further use of the file, which is a problem for block devices. This also requires that the call to 'qemuBackupJobTerminate' from 'qemuProcessStop' happens only after 'vm->pid' was reset as otherwise the security subdrivers attempt to enter the process namespace which fails if the process isn't running any more. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1939082 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2021-03-19 16:41:39 +01:00
Michal Privoznik	6e9c4811be	qemu_process: Use accessor for def->mem.total_memory When connecting to the monitor, a timeout is calculated that is bigger the more memory guest has (because QEMU has to allocate and possibly zero out the memory and what not, empirically deduced). However, when computing the timeout the @total_memory member is accessed directly even though virDomainDefGetMemoryTotal() should have been used. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2021-03-16 09:16:13 +01:00
Peter Krempa	aa372e5a01	backup: Store 'apiFlags' in private section of virDomainBackupDef 'qemuBackupJobTerminate' needs the API flags to see whether VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL. Unfortunately when called via qemuProcessReconnect()->qemuProcessStop() early (e.g. if the qemu process died while we were reconnecting) the job is cleared temporarily so that other APIs can be called. This would mean that we couldn't clean up the files in some cases. Save the 'apiFlags' inside the backup object and set it from the 'qemuDomainJobObj' 'apiFlags' member when reconnecting to a VM. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2021-03-12 10:59:05 +01:00
Andrea Bolognani	c2180c2fd6	qemu: Set limits only when explicitly asked to do so The current code is written under the assumption that, for all limits except the core size, asking for the limit to be set to zero is a no-op, and so the operation is performed unconditionally. While this is the behavior we want for the QEMU driver, the virCommand and virProcess facilities are generic, and should not implement this kind of policy: asking for a limit to be set to zero should result in that limit being set to zero every single time. Add some checks in the QEMU driver, effectively moving the policy where it belongs. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2021-03-08 22:41:40 +01:00
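The split described in c2180c2fd6 (a generic setter that always applies the value it is given, and a driver that decides whether to call it) can be sketched roughly as follows. This is a minimal illustration, not libvirt's code: `ProcLimits`, `generic_set_max_files()` and `driver_apply_max_files()` are hypothetical names invented for the example.

```c
#include <stdbool.h>

/* Hypothetical record of what was requested for a child process. */
typedef struct {
    unsigned long long max_files;  /* value to apply */
    bool max_files_set;            /* true once a limit was requested */
} ProcLimits;

/* Generic facility (hypothetical): no policy, applies any value,
 * including zero, every time it is called. */
static inline void
generic_set_max_files(ProcLimits *limits, unsigned long long value)
{
    limits->max_files = value;
    limits->max_files_set = true;
}

/* Driver-level policy (hypothetical): a configured value of zero means
 * "leave the limit alone", so the generic setter is not called at all. */
static inline void
driver_apply_max_files(ProcLimits *limits, unsigned long long configured)
{
    if (configured > 0)
        generic_set_max_files(limits, configured);
}
```

With this shape, a caller that really wants a zero limit can still get one by calling the generic setter directly, while the driver keeps its "zero means unset" convention in one place.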
Andrea Bolognani	bd33680f02	qemu: Set all limits at the same time qemuProcessLaunch() is the correct place to set process limits, and in fact is where we were dealing with almost all of them, but the memory locking limit was handled in qemuBuildCommandLine() instead for some reason. The code is rewritten so that the desired limit is calculated and applied in separated steps, which will help with further changes, but this doesn't alter the behavior. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2021-03-08 22:41:40 +01:00
Andrea Bolognani	9bf5c00f9b	qemu: Make some minor tweaks Doing this now will make the next changes nicer. Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2021-03-08 22:41:40 +01:00
Peter Krempa	3c546f7eb4	qemuProcessReportLogError: Don't mark "%s: %s" as translatable The function is constructing an error message from a prefix and the contents of the qemu log file. Marking just two string modifiers as translatable is pointless and will certainly confuse translators. Remove the marking and add a comment which bypasses the sc_libvirt_unmarked_diagnostics check. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2021-03-05 15:01:29 +01:00
Peter Krempa	c8ff56c7ad	qemuProcessReportLogError: Remove unnecessary math for max error message Now that error message formatting doesn't use fixed size buffers we can drop the math for calculating the maximum chunk of log to report in the error message and use a round number. This also makes it obvious that the chosen number is arbitrary. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2021-03-05 15:01:29 +01:00
Michal Privoznik	7f482a67e4	lib: Replace virFileMakePath() with g_mkdir_with_parents() Generated using the following spatch: @@ expression path; @@ - virFileMakePath(path) + g_mkdir_with_parents(path, 0777) However, 14 occurrences were not replaced, e.g. in virHostdevManagerNew(). I don't really understand why. Fixed by hand afterwards. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2021-03-04 20:52:23 +01:00
Michal Privoznik	b1e3728dec	lib: Replace virFileMakePathWithMode() with g_mkdir_with_parents() These functions are identical. Made using this spatch: @@ expression path, mode; @@ - virFileMakePathWithMode(path, mode) + g_mkdir_with_parents(path, mode) Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2021-03-04 20:52:23 +01:00
Jiri Denemark	c8f3b83c72	qemu_domainjob: Make copy of owner API Using the job owner API name directly works fine as long as it is a static string or the owner's thread is still running. However, this is not always the case. For example, when the owner API name is filled in a job when we're reconnecting to existing domains after daemon restart, the dynamically allocated owner name will disappear with the reconnecting thread. Any follow up usage of the pointer will read random memory. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2021-02-25 09:55:31 +01:00
Peng Liang	1ac703a7d0	qemu: Add missing lock in qemuProcessHandleMonitorEOF qemuMonitorUnregister will be called in multiple threads (e.g. threads in rpc worker pool and the vm event thread). In some cases, it isn't protected by the monitor lock, which may lead to call g_source_unref more than one time and a use-after-free problem eventually. Add the missing lock in qemuProcessHandleMonitorEOF (which is the only position missing lock of monitor I found). Suggested-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Peng Liang <liangpeng10@huawei.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2021-02-24 15:00:51 +01:00
Stefan Berger	f30aa2ec74	qemu: Fix libvirt hang due to early TPM device stop This patch partially reverts commit `5cde9dee` where the qemuExtDevicesStop() was moved to a location before the QEMU process is stopped. It may be alright to tear down some devices before QEMU is stopped, but it doesn't work for the external TPM (swtpm) which assumes that QEMU sends it a signal to stop it before libvirt may try to clean it up. So this patch moves the virFileDeleteTree() calls after the call to qemuExtDevicesStop() so that the pid file of virtiofsd is not deleted before that call. Affected libvirt versions are 6.10 and 7.0. Fixes: `5cde9dee8c` Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2021-02-19 17:31:37 +01:00
Daniel P. Berrangé	17f001c451	qemu: record deprecation messages against the domain These messages are only valid while the domain is running. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2021-02-12 09:19:12 +00:00
Peter Krempa	480fecaa21	Replace virStringListJoin by g_strjoinv Our implementation was inspired by glib anyways. The difference is only the order of arguments. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2021-02-11 17:05:34 +01:00
Peter Krempa	56cedfcf38	Replace virStringListHasString by g_strv_contains The glib variant doesn't accept NULL list, but there's just one caller where it wasn't checked explicitly, thus there's no need for our own wrapper. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2021-02-11 17:05:33 +01:00
Peter Krempa	d9f7e87673	qemuProcessUpdateDevices: Refactor cleanup and memory handling Use automatic memory freeing and remove the 'cleanup' label. Also make it a bit more obvious that nothing happens if the 'old' list wasn't present. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2021-02-11 17:05:33 +01:00
Daniel P. Berrangé	c32f172d12	qemu: wire up support for maximum CPU model The "max" model can be treated the same way as "host" model in general. Reviewed-by: Pavel Hrdina <phrdina@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2021-02-10 11:44:48 +00:00
Laine Stump	674719afe6	qemu: replace VIR_FREE with g_free in all vir*Free() functions Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2021-02-05 00:20:43 -05:00
Pavel Hrdina	836e0a960b	storage_source: use virStorageSource prefix for all functions Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2021-01-22 11:10:27 +01:00
Pavel Hrdina	01f7ade912	util: extract virStorageFile code into storage_source Up until now we had a runtime code and XML related code in the same source file inside util directory. This patch takes the runtime part and extracts it into the new storage_file directory. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2021-01-22 11:10:27 +01:00
Laine Stump	c2b2cdf746	call virDomainNetNotifyActualDevice() for all interface types Now that this function can be called regardless of interface type (and whether or not we have a conn for the network driver), let's actually call it for all interface types. This will assure that we re-connect any disconnected bridge devices for <interface type='bridge'> as mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1730084#c26 (until now we've only been reconnecting bridge devices for <interface type='network'>) Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2021-01-08 11:34:49 -05:00
Michal Privoznik	5ac2439a83	qemu_process: Release domain seclabel later in qemuProcessStop() Some secdrivers (typically SELinux driver) generate unique dynamic seclabel for each domain (unless a static one is requested in domain XML). This is achieved by calling qemuSecurityGenLabel() from qemuProcessPrepareDomain() which allocates unique seclabel and stores it in domain def->seclabels. The counterpart is qemuSecurityReleaseLabel() which releases the label and removes it from def->seclabels. Problem is, that with current code the qemuProcessStop() may still want to use the seclabel after it was released, e.g. when it wants to restore the label of a disk mirror. What is happening now, is that in qemuProcessStop() the qemuSecurityReleaseLabel() is called, which removes the SELinux seclabel from def->seclabels, yada yada yada and eventually qemuSecurityRestoreImageLabel() is called. This bubbles down to virSecuritySELinuxRestoreImageLabelSingle() which find no SELinux seclabel (using virDomainDefGetSecurityLabelDef()) and this returns early doing nothing. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1751664 Fixes: `8fa0374c5b` Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2021-01-06 13:29:09 +01:00
Jiri Denemark	f7c40b5c71	qemu: The TSC tolerance interval should be closed The kernel refuses to set guest TSC frequency less than a minimum frequency or greater than maximum frequency (both computed based on the host TSC frequency). When writing the libvirt code with a reversed logic (return success when the requested frequency falls within the tolerance interval) I forgot to include the boundaries. Fixes: `d8e5b45600` https://bugzilla.redhat.com/show_bug.cgi?id=1839095 Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2021-01-06 11:24:37 +01:00
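The closed-interval check that f7c40b5c71 describes can be sketched as below. The +/- 250 ppm figure comes from commit d8e5b45600 in this log; the function name and layout are hypothetical and only meant to show why both comparisons need to be inclusive.

```c
#include <stdbool.h>
#include <stdint.h>

/* Tolerance named in commit d8e5b45600: +/- 250 parts per million. */
#define TSC_TOLERANCE_PPM 250

/*
 * Hypothetical helper (not libvirt's actual function): returns true when
 * guest_hz lies inside the closed interval [host - tol, host + tol].
 * Using >= and <= keeps the boundary frequencies acceptable, which is
 * the point of the fix.
 */
static inline bool
tsc_frequency_tolerated(uint64_t host_hz, uint64_t guest_hz)
{
    uint64_t tolerance = host_hz * TSC_TOLERANCE_PPM / 1000000;
    uint64_t min_hz = host_hz - tolerance;
    uint64_t max_hz = host_hz + tolerance;

    return guest_hz >= min_hz && guest_hz <= max_hz;
}
```

For a 1 GHz host the tolerance is 250000 Hz, so 999750000 Hz and 1000250000 Hz are both accepted; an open interval (strict `<` and `>`) would reject exactly those two values.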
Peter Krempa	d0819b9f02	qemu: Properly handle setting of <iotune> for empty cdrom When starting a VM with an empty cdrom which has <iotune> configured the startup fails as qemu is not happy about setting tuning for an empty drive: error: internal error: unable to execute 'block_set_io_throttle', unexpected error: 'Device has no medium' Resolve this by skipping the setting of throttling for empty drives and updating the throttling when new medium is inserted into the drive. Resolves: https://gitlab.com/libvirt/libvirt/-/issues/111 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2021-01-06 09:24:48 +01:00
Shi Lei	9b5d741a9d	netdevmacvlan: Use helper function to create unique macvlan/macvtap name Simplify ReserveName/GenerateName for macvlan and macvtap by using common functions. Signed-off-by: Shi Lei <shi_lei@massclouds.com> Reviewed-by: Laine Stump <laine@redhat.com>	2020-12-15 13:35:33 -05:00
Shi Lei	c36cad1a31	netdevtap: Use common helper function to create unique tap name Simplify GenerateName/ReserveName for netdevtap by using common functions. Signed-off-by: Shi Lei <shi_lei@massclouds.com> Reviewed-by: Laine Stump <laine@redhat.com>	2020-12-15 13:35:27 -05:00
Daniel Henrique Barboza	9432693e2b	domain_conf.c: move virDomainDeviceDefValidate() to domain_validate.c Move virDomainDeviceDefValidate() and all its helper functions to domain_validate.c. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-12-14 09:29:09 -03:00
Peter Krempa	18de9dfd77	virDomainDefValidate: Add per-run 'opaque' data virDomainDefPostParse infrastructure has apart from the global opaque data also per-run data, but this was not duplicated into the validation callbacks. This is important when drivers want to use correct run-state for the validation. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-12-09 09:33:47 +01:00
Peter Krempa	c1720b9ac7	qemuDomainDiskLookupByNodename: Lookup also backup 'store' nodenames Nodename may be associated with a disk backup job, add support to looking up in that chain too. This is specifically useful for the BLOCK_WRITE_THRESHOLD event which can be registered for any nodename. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-12-08 15:12:34 +01:00
Michal Privoznik	40a162f83e	qemu: Don't cache NUMA caps In v6.0.0-rc1~439 (and friends) we tried to cache NUMA capabilities because we assumed they are immutable. And to some extent they are (NUMA hotplug is not a thing, is it). However, our capabilities contain also some runtime info that can change, e.g. hugepages pool allocation sizes or total amount of memory per node (host side memory hotplug might change the value). Because of the caching we might not be reporting the correct runtime info in 'virsh capabilities'. The NUMA caps are used in three places: 1) 'virsh capabilities' 2) domain startup, when parsing numad reply 3) parsing domain private data XML In cases 2) and 3) we need NUMA caps to construct list of physical CPUs that belong to NUMA nodes from numad reply. And while this may seem static, it's not really because of possible CPU hotplug on physical host. There are two possible approaches: 1) build a validation mechanism that would invalidate the cached NUMA caps, or 2) drop the caching and construct NUMA caps from scratch on each use. In this commit, the latter approach is implemented, because it's easier. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1819058 Fixes: `1a1d848694` Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-12-07 11:32:40 +01:00
Peter Krempa	0f7b80691b	qemuMonitorBlockJobInfo: Store 'ready' and 'ready_present' separately Don't make the logic confusing by representing the 3 options using an integer with negative values. Signed-off-by: Peter Krempa <pkrempa@redhat.com>	2020-12-07 10:15:00 +01:00
Daniel Henrique Barboza	5a34d0667d	qemu: move memory size align to qemuProcessPrepareDomain() qemuBuildCommandLine() is calling qemuDomainAlignMemorySizes(), which is an operation that changes live XML and domain and has little to do with the command line build process. Move it to qemuProcessPrepareDomain() where we're supposed to make live XML and domain changes before launch. qemuProcessStart() is setting VIR_QEMU_PROCESS_START_NEW if !migrate && !snapshot, same conditions used in qemuBuildCommandLine() to call qemuDomainAlignMemorySizes(), making this change seamless. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-12-03 17:19:35 -03:00
Daniel Henrique Barboza	3bb9ed8bc2	qemu_process.c: check migrateURI when setting VIR_QEMU_PROCESS_START_NEW qemuProcessCreatePretendCmdPrepare() is setting the VIR_QEMU_PROCESS_START_NEW regardless of whether this is a migration case or not. This behavior differs from what we're doing in qemuProcessStart(), where the flag is set only if !migrate && !snapshot. Fix it by making the flag setting consistent with what we're doing in qemuProcessStart(). Reviewed-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-12-03 17:16:33 -03:00
John Ferlan	148cfcf051	qemu: Pass / fill niothreads for qemuMonitorGetIOThreads Let's pass along / fill @niothreads rather than trying to make dual use as a return value and thread count. This resolves a Coverity issue detected in qemuDomainGetIOThreadsMon where if qemuDomainObjExitMonitor failed, then a -1 was returned and overwrote @niothreads causing a memory leak. Signed-off-by: John Ferlan <jferlan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-12-03 17:06:07 +01:00
Michal Privoznik	b7d4e6b67e	lib: Replace VIR_AUTOSTRINGLIST with GStrv Glib provides g_auto(GStrv) which is in-place replacement of our VIR_AUTOSTRINGLIST. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-12-02 15:43:07 +01:00
Pavel Hrdina	82bda55e2f	qemuProcessHandleGraphics: no need to check for NULL Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2020-11-16 17:25:41 +01:00
Jiri Denemark	d8e5b45600	qemu: Do not require TSC frequency to strictly match host Some CPUs provide a way to read exact TSC frequency, while measuring it is required on other CPUs. However, measuring is never exact and the result may slightly differ across reboots. For this reason both Linux kernel and QEMU recently started allowing for guests TSC frequency to fall into +/- 250 ppm tolerance interval around the host TSC frequency. Let's do the same to avoid unnecessary failures (esp. during migration) in case the host frequency does not exactly match the frequency configured in a domain XML. https://bugzilla.redhat.com/show_bug.cgi?id=1839095 Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-11-12 17:29:16 +01:00
Masayoshi Mizuma	5cde9dee8c	qemu: Move qemuExtDevicesStop() before removing the pidfiles A qemu guest which has virtiofs config fails to start if the previous starting failed because of invalid option or something. That's because the virtiofsd isn't killed by virPidFileForceCleanupPath() on the former failure because the pidfile was already removed by virFileDeleteTree(priv->libDir) in qemuProcessStop(), so virPidFileForceCleanupPath() just returned. Move qemuExtDevicesStop() before virFileDeleteTree(priv->libDir) so that virPidFileForceCleanupPath() can kill virtiofsd correctly. For example of the reproduction: # virsh start guest error: Failed to start domain guest error: internal error: process exited while connecting to monitor: qemu-system-x86_64: -foo: invalid option ... fix the option ... # virsh start guest error: Failed to start domain guest error: Cannot open log file: '/var/log/libvirt/qemu/guest-fs0-virtiofsd.log': Device or resource busy # Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-11-11 15:20:12 +01:00
Peter Krempa	62a01d84a3	util: hash: Retire 'virHashTable' in favor of 'GHashTable' Don't hide our use of GHashTable behind our typedef. This will also promote the use of glibs hash function directly. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Matt Coleman <matt@datto.com>	2020-11-06 10:40:51 +01:00
Daniel P. Berrangé	99a1cfc438	qemu: honour fatal errors dealing with qemu slirp helper Currently all errors from qemuInterfacePrepareSlirp() are completely ignored by the callers. The intention is that missing qemu-slirp binary should cause the caller to fallback to the built-in slirp impl. Many of the possible errors though should indeed be considered fatal. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-10-27 12:03:19 +00:00
zhenwei pi	7555a55470	qemu: implement memory failure event Since QEMU 5.2 (commit-77b285f7f6), QEMU supports 'memory failure' event, posts event to monitor if hitting a hardware memory error. Fully support this feature for QEMU. Test with commit 'libvirt: support memory failure event', build a little complex environment(nested KVM): 1, install newly built libvirt in L1, and start a L2 vm. run command in L1: ~# virsh event l2 --event memory-failure 2, run command in L0 to inject MCE to L1: ~# virsh qemu-monitor-command l1 --hmp mce 0 9 0xbd000000000000c0 0xd 0x62000000 0x8c Test result in l1(recipient hypervisor case): event 'memory-failure' for domain l2: recipient: hypervisor action: ignore flags: action required: 0 recursive: 0 Test result in l1(recipient guest case): event 'memory-failure' for domain l2: recipient: guest action: inject flags: action required: 0 recursive: 0 Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-10-23 09:42:00 +02:00
Peter Krempa	d6d4c08daf	util: hash: Change type of hash table name/key to 'char *' All users of virHashTable pass strings as the name/key of the entry. Make this an official requirement by turning the variables to 'const char *'. For any other case it's better to use glib's GHashTable. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>	2020-10-22 15:02:46 +02:00
Daniel P. Berrangé	7b1ed1cd73	qemu: stop passing -enable-fips to QEMU >= 5.2.0 Use of the -enable-fips option is being deprecated in QEMU >= 5.2.0. If FIPS compliance is required, QEMU must be built with libcrypt which will unconditionally enforce it. Thus there is no need for libvirt to pass -enable-fips to modern QEMU. Unfortunately there was never any way to probe for -enable-fips in the first instance, it was enabled by libvirt based on version number originally, and then later unconditionally enabled when libvirt dropped support for older QEMU. Similarly we now use a version number check to decide when to stop passing -enable-fips. Note that the qemu-5.2 capabilities are currently from the pre-release version and will be updated once qemu-5.2 is released. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-10-22 09:03:33 +02:00
Jonathon Jongsma	08f8fd8413	conf: Add support for vDPA network devices This patch adds new schema and adds support for parsing and formatting domain configurations that include vdpa devices. vDPA network devices allow high-performance networking in a virtual machine by providing a wire-speed data path. These devices require a vendor-specific host driver but the data path follows the virtio specification. When a device on the host is bound to an appropriate vendor-specific driver, it will create a chardev on the host at e.g. /dev/vhost-vdpa-0. That chardev path can then be used to define a new interface with type='vdpa'. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2020-10-20 14:46:52 -04:00
Peter Krempa	7b0ced89e7	qemu: Prepare hostdev data which depends on the host state separately SCSI hostdev setup requires querying the host os for the actual path of the configured hostdev. This was historically done in the command line formatter. Our new approach is to split out this part into 'qemuProcessPrepareHost' which is designed to be skipped in tests. Refactor the hostdev code to use this new semantics, and add appropriate handlers filling in the data for tests and the qemuConnectDomainXMLToNative users. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-10-20 15:08:22 +02:00
Peter Krempa	9ff3ad9058	qemuProcessCreatePretendCmd: Split up preparation and command building Host preparation steps which are deliberately skipped when pretend-creating a commandline are normally executed after VM object preparation. In the test code we are faking some of the host preparation steps, but we were doing that prior to the call to qemuProcessPrepareDomain embedded in qemuProcessCreatePretendCmd. By splitting up qemuProcessCreatePretendCmd into two functions we can ensure that the ordering of the prepare steps stays consistent. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-10-20 15:08:22 +02:00
Erik Skultety	ccb40cf288	qemu: process: sev: Fill missing 'cbitpos' & 'reducedPhysBits' from caps These XML attributes have been mandatory since the introduction of SEV support to libvirt. This design decision was based on QEMU's requirement for these to be mandatory for migration purposes, as differences in these values across platforms must result in the pre-migration checks failing (not that migration with SEV works at the time of this patch). This patch enables autofill of these attributes right before launching QEMU and thus updating the live XML. Signed-off-by: Erik Skultety <eskultet@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-10-19 11:03:27 +02:00
Erik Skultety	1fdc907325	qemu: process: Move SEV capability check to qemuValidateDomainDef Checks such as this one should be done at domain def validation time, not before starting the QEMU process. As for this change, existing domains will see some QEMU error when starting as opposed to a libvirt error that this QEMU binary doesn't support SEV, but that's okay, we never guaranteed error messages to remain the same. Signed-off-by: Erik Skultety <eskultet@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-10-19 11:03:16 +02:00
Erik Skultety	649f720a9a	qemu_process: sev: Drop an unused variable Signed-off-by: Erik Skultety <eskultet@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-10-19 11:01:56 +02:00
Pavel Hrdina	5ad8272888	util: vircgroup: change virCgroupFree to take only virCgroupPtr As preparation for g_autoptr() we need to change the function to take only virCgroupPtr. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>	2020-10-09 16:24:35 +02:00
Ján Tomko	cc3190cc4c	qemu: process: use g_new0 Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>	2020-10-05 16:44:06 +02:00
Ján Tomko	868c350752	qemu: separate out VIR_ALLOC calls Move them to separate conditions to reduce churn in following patches. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>	2020-10-05 16:44:06 +02:00
Cole Robinson	0fa5c23865	qemu: Taint cpu host-passthrough only after migration From a discussion last year[1], Dan recommended libvirt drop the taint flag for cpu host-passthrough, unless the VM has been migrated. This repurposes the existing host-cpu taint flag to do just that. [1]: https://www.redhat.com/archives/virt-tools-list/2019-February/msg00041.html https://bugzilla.redhat.com/show_bug.cgi?id=1673098 Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Cole Robinson <crobinso@redhat.com>	2020-10-05 10:08:26 -04:00
Peter Krempa	faa88866f5	Don't check return value of virBitmapNewCopy The function will not fail any more. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-10-05 15:50:45 +02:00
Peter Krempa	cb6fdb0125	virBitmapNew: Don't check return value Remove return value check from all callers. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-10-05 15:38:47 +02:00
Masayoshi Mizuma	1c9227de5d	qemu: process: Handle transient disks on VM startup Add overlays after the VM starts before we start executing guest code. Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Tested-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 09:55:02 +02:00
Peter Krempa	afc25e8553	qemu: prepare cleanup for <transient/> disk overlays Later patches will implement support for <transient/> disks in libvirt by installing an overlay on top of the configured image. This will require cleanup after the VM is stopped so that the state is correctly discarded. Since the overlay will be installed only during the startup phase of the VM we need to ensure that qemuProcessStop doesn't delete the original file on some previous failure. This is solved by adding 'inhibitDiskTransientDelete' VM private data member which is set prior to any startup step and will be cleared once transient disk overlays are established. Based on that we can then delete the overlays for any <transient/> disk. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Tested-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 09:55:02 +02:00
Peter Krempa	3673bdbe13	qemu: domain: Extract preparation of hostdev specific data to a separate function Historically we've prepared secrets for all objects in one place. This doesn't make much sense and it's semantically more appealing to prepare everything for a single device type in one place. Move the setup of the (iSCSI|SCSI) hostdev secrets into a new function which will be used to set up other things as well in the future. This is similar to the approach we use for disks. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-15 15:20:23 +02:00
Ján Tomko	af16e754cd	qemuProcessReconnect: clear 'oldjob' After we started copying the privateData pointer in qemuDomainObjRestoreJob, we should also free them once we're done with them. Register the clear function and use g_auto. Also add a check for job->cb to qemuDomainObjClearJob, to prevent freeing an uninitialized job. https://bugzilla.redhat.com/show_bug.cgi?id=1878450 Signed-off-by: Ján Tomko <jtomko@redhat.com> Fixes: `aca37c3fb2`	2020-09-14 18:10:56 +02:00
Tim Wiederhake	caf5a88e59	qemu: Use glib memory functions in qemuProcessReadLog Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-09-11 18:19:58 +02:00
Michal Privoznik	ec46e6d44b	qemu_process: Separate VIR_PERF_EVENT_* setting into a function When starting a domain, qemuProcessLaunch() iterates over all VIR_PERF_EVENT_* values and (possibly) enables them. While there is nothing wrong with the code, the for loop where it's done makes it harder to jump onto next block of code. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-08 10:57:24 +02:00
Martin Kletzander	f5b486daea	qemu: Allow setting affinity to fail and don't report error This is just a clean-up of commit `3791f29b08` using the new parameter of virProcessSetAffinity() introduced in commit `9514e24984` so that there is no error reported in the logs. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-07 14:48:57 +02:00
Martin Kletzander	9514e24984	Do not report error when setting affinity is allowed to fail Suggested-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-07 11:35:36 +02:00
Nikolay Shirokovskiy	5c0cd375d1	qemu: don't shutdown event thread in monitor EOF callback This hunk was introduced in [1] in order to avoid losing events from monitor on stopping qemu process. But as explained in [2] on destroy we will get neither EOF nor any other events as monitor is just closed. In case of crash/shutdown we won't get any more events as well and qemuDomainObjStopWorker will be called by qemuProcessStop eventually. Thus let's remove qemuDomainObjStopWorker from qemuProcessHandleMonitorEOF as it is not useful anymore. [1] `e6afacb0f`: qemu: start/stop an event loop thread for domains [2] `d2954c072`: qemu: ensure domain event thread is always stopped Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-09-07 09:33:59 +03:00
Martin Kletzander	fc7d53edf4	qemu: Fix comment in qemuProcessSetupPid This was supposed to be done in commit `3791f29b08`, but I missed a spot. Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2020-09-06 13:44:27 +02:00
Martin Kletzander	3791f29b08	qemu: Do not error out when setting affinity failed Consider a host with 8 CPUs. There are the following possible scenarios 1. Bare metal; libvirtd has affinity of 8 CPUs; QEMU should get 8 CPUs 2. Bare metal; libvirtd has affinity of 2 CPUs; QEMU should get 8 CPUs 3. Container has affinity of 8 CPUs; libvirtd has affinity of 8 CPUs; QEMU should get 8 CPUs 4. Container has affinity of 8 CPUs; libvirtd has affinity of 2 CPUs; QEMU should get 8 CPUs 5. Container has affinity of 4 CPUs; libvirtd has affinity of 4 CPUs; QEMU should get 4 CPUs 6. Container has affinity of 4 CPUs; libvirtd has affinity of 2 CPUs; QEMU should get 4 CPUs Scenarios 1 & 2 always work unless systemd restricted libvirtd privs. Scenario 3 works because libvirt checks current affinity first and skips the sched_setaffinity call, avoiding the SYS_NICE issue. Scenario 4 works only if CAP_SYS_NICE is available. Scenarios 5 & 6 work only if CAP_SYS_NICE is present AND the cgroups cpuset is not set on the container. If libvirt blindly ignores the sched_setaffinity failure, then scenarios 4, 5 and 6 should all work, but with the caveat in cases 4 and 6 that QEMU will only get 2 CPUs instead of the possible 8 and 4 respectively. This is still better than failing. Therefore libvirt can blindly ignore the setaffinity failure, but ONLY ignore it when there was no affinity specified in the XML config. If user specified affinity explicitly, libvirt must report an error if it can't be honoured. Resolves: https://bugzilla.redhat.com/1819801 Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-09-04 14:44:21 +02:00
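The policy described in commit `3791f29b08` can be sketched roughly as follows. This is a minimal illustration in Python, not libvirt's C implementation: the function name `apply_affinity` and its `explicit`, `setter` and `getter` parameters are hypothetical, and only `os.sched_setaffinity` / `os.sched_getaffinity` (Linux-only) are real library calls.

```python
import os


def apply_affinity(pid, cpus, explicit, setter=None, getter=None):
    """Try to pin `pid` to `cpus`.

    Returns True if the affinity is (or already was) in place, False if the
    attempt failed but the failure was tolerated. `explicit` is True when the
    affinity was requested in the domain XML (hypothetical flag name).
    `setter` and `getter` are injectable for testing only.
    """
    setter = setter or os.sched_setaffinity
    getter = getter or os.sched_getaffinity
    wanted = set(cpus)

    # Scenario 3: the current affinity already matches, so skip the call
    # entirely and avoid needing CAP_SYS_NICE.
    if set(getter(pid)) == wanted:
        return True

    try:
        setter(pid, wanted)
    except OSError:
        # A failure is only fatal when the user asked for this affinity.
        if explicit:
            raise
        return False
    return True
```

With `explicit=False` a failed call leaves the process on whatever CPUs it inherited, which matches the "2 CPUs instead of 8" caveat for scenarios 4 and 6.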
Michal Privoznik	95b9db4ee2	lib: Prefer WITH_* prefix for #if conditionals Currently, we are mixing: #if HAVE_BLAH with #if WITH_BLAH. Things got way better with Pavel's work on meson, but apparently, mixing these two leads to confusing and easy to miss bugs (see `31fb929eca` for instance). While we were forced to use HAVE_ prefix with autotools, we are free to choose our own prefix with meson and since WITH_ prefix appears to be more popular let's use it everywhere. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-02 10:28:10 +02:00
Laine Stump	95089f481e	util: assign tap device names using a monotonically increasing integer When creating a standard tap device, if provided with an ifname that contains "%d", rather than taking that literally as the name to use for the new device, the kernel will instead use that string as a template, and search for the lowest number that could be put in place of %d and produce an otherwise unused and unique name for the new device. For example, if there is no tap device name given in the XML, libvirt will always send "vnet%d" as the device name, and the kernel will create new devices named "vnet0", "vnet1", etc. If one of those devices is deleted, creating a "hole" in the name list, the kernel will always attempt to reuse the name in the hole first before using a name with a higher number (i.e. it finds the lowest possible unused number). The problem with this, as described in the previous patch dealing with macvtap device naming, is that it makes "immediate reuse" of a newly freed tap device name much more common, and in the aftermath of deleting a tap device, there is some other necessary cleanup of things which are named based on the device name (nwfilter rules, bandwidth rules, OVS switch ports, to name a few) that could end up stomping over the top of the setup of a new device of the same name for a different guest. Since the kernel "create a name based on a template" functionality for tap devices doesn't exist for macvtap, this patch for standard tap devices is a bit different from the previous patch for macvtap - in particular there was no previous "bitmap ID reservation system" or overly-complex retry loop that needed to be removed. We simply find an unused name, and pass that name on to the kernel instead of "vnet%d". This counter is also wrapped when either it gets to INT_MAX or if the full name would overflow IFNAMSIZ-1 characters. 
In the case of "vnet%d" and a 32 bit int, we would reach INT_MAX first, but possibly someday someone will change the name from vnet to something else. (NB: It is still possible for a user to provide their own parameterized template name (e.g. "mytap%d") in the XML, and libvirt will just pass that through to the kernel as it always has.) Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-09-01 14:16:44 -04:00
Laine Stump	d7f38beb2e	util: replace macvtap name reservation bitmap with a simple counter There have been some reports that, due to libvirt always trying to assign the lowest numbered macvtap / tap device name possible, a new guest would sometimes be started using the same tap device name as previously used by another guest that is in the process of being destroyed as the new guest is starting. In some cases this has led to, for example, the old guest's qemuProcessStop() code deleting a port from an OVS switch that had just been re-added by the new guest (because the port name is based on only the device name using the port). Similar problems can happen (and I believe have) with nwfilter rules and bandwidth rules (which are both instantiated based on the name of the tap device). A couple patches have been previously proposed to change the ordering of startup and shutdown processing, or to put a mutex around everything related to the tap/macvtap device name usage, but in the end no matter what you do there will still be possible holes, because the device could be deleted outside libvirt's control (for example, regular tap devices are automatically deleted when the qemu process terminates, and that isn't always initiated by libvirt but could instead happen completely asynchronously - libvirt then has no control over the ordering of shutdown operations, and no opportunity to protect it with a mutex.) But this only happens if a new device is created at the same time as one is being deleted. We can effectively eliminate the chance of this happening if we end the practice of always looking for the lowest numbered available device name, and instead just keep an integer that is incremented each time we need a new device name. 
At some point it will need to wrap back around to 0 (in order to avoid the IFNAMSIZ 15 character limit if nothing else), and we can't guarantee that the new name really will be the least recently used name, but "math" suggests that it will be much less common that we'll try to re-use the most recently used name. This patch implements such a counter for macvtap/macvlan, replacing the existing, and much more complicated, "ID reservation" system. The counter is set according to whatever macvtap/macvlan devices are already in use by guests when libvirtd is started, incremented each time a new device name is needed, and wraps back to 0 when either INT_MAX is reached, or when the resulting device name would be longer than IFNAMSIZ-1 characters (which actually is what happens when the template for the device name is "macvtap%d"). The result is that no macvtap name will be re-used until the host has created (and possibly destroyed) 99,999,999 devices. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-09-01 14:16:36 -04:00
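The wrap-around rule shared by the two device-naming commits above (`95089f481e` and `d7f38beb2e`) can be illustrated with a small counter. This is a hedged sketch, not libvirt's code: the class `IfNameCounter` and its methods are hypothetical names, `IFNAMSIZ = 16` is the usual Linux value, and the check that a candidate name is not already in use on the host is deliberately left out.

```python
IFNAMSIZ = 16          # Linux interface name buffer size, including the NUL
INT_MAX = 2**31 - 1    # assuming a 32-bit int, as the commit message does


class IfNameCounter:
    """Hands out prefix + N names, wrapping to 0 instead of reusing holes."""

    def __init__(self, prefix, last_id=-1):
        self.prefix = prefix
        self.last_id = last_id

    def reserve(self, in_use_id):
        # At daemon start, move the counter past any ID already in use.
        if in_use_id > self.last_id:
            self.last_id = in_use_id

    def next_name(self):
        # Wrap when the counter itself would overflow ...
        candidate = 0 if self.last_id >= INT_MAX else self.last_id + 1
        name = f"{self.prefix}{candidate}"
        # ... or when the name would no longer fit in IFNAMSIZ-1 characters.
        if len(name) > IFNAMSIZ - 1:
            candidate = 0
            name = f"{self.prefix}{candidate}"
        self.last_id = candidate
        return name
```

For the `macvtap` prefix (7 characters) the length limit is hit first, after ID 99999999; for `vnet` (4 characters) INT_MAX is reached first, which agrees with the two commit messages.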
Ján Tomko	0a37e0695b	Split declarations from initializations Split those initializations that depend on a statement above them. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-08-25 19:03:11 +02:00
Ján Tomko	a5152f23e7	Move declarations before statements Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-08-25 19:03:11 +02:00
Laine Stump	5cad64ec03	qemu: remove unreachable code in qemuProcessStart() Back when the original version of this chunk of code was added (commit `41b087198` in libvirt-0.8.1 in April 2010), we used virExecDaemonize() to start the qemu process, and would continue on in the function (which at that time was called qemudStartVMDaemon()) even if a -1 was returned. So it was possible to get to this code with rv == -1 (it was called "ret" in that version of the code). In modern libvirt code, qemu is started with virCommandRun(); then we call virPidFileReadPath(); those are the only two ways of setting "rv" prior to this code being removed, and in either case if the new value of rv < 0, then we immediately skip over the rest of the code to the cleanup: label. This means that the code being removed by this patch is unreachable. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-08-24 23:46:51 -04:00
Michal Privoznik	9048dc4e62	qemuDomainBuildNamespace: Populate basic /dev from daemon's namespace As mentioned in previous commit, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with basic /dev nodes (e.g. /dev/null, /dev/kvm, etc.) into daemon's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-08-03 19:40:36 +02:00
Michal Privoznik	8da362fe62	qemu_domain_namespace: Repurpose qemuDomainBuildNamespace() Okay, here is the deal. Currently, the way we build namespace is very fragile. It is done from pre-exec hook when starting a domain, after we mass closed all FDs and before we drop privileges and exec() QEMU. This fact poses some limitations onto the namespace build code, e.g. it has to make sure not to keep any FD opened (not even through a library call), because it would be leaked to QEMU. Also, it has to call only async signal safe functions. These requirements are hard to meet - in fact as of my commit v6.2.0-rc1~235 we are leaking a FD into QEMU by calling libdevmapper functions. To solve this issue and avoid similar problems in the future, we should change our paradigm. We already have functions which can populate domain's namespace with nodes from the daemon context. If we use them to populate the namespace and keep only the bare minimum in the pre-exec hook, we've mitigated the risk. Therefore, the old qemuDomainBuildNamespace() is renamed to qemuDomainUnshareNamespace() and new qemuDomainBuildNamespace() function is introduced. So far, the new function is basically a NOP and domain's namespace is still populated from the pre-exec hook - next patches will fix it. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-08-03 19:40:36 +02:00
Michal Privoznik	764eaf1aa4	qemu_domain_namespace: Rename qemuDomainCreateNamespace() The name of this function is not very helpful, because it doesn't create anything, it just flips a bit in a bitmask when domain is starting up. Move the function internals into qemu_process.c and forget the function ever existed. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-08-03 19:40:33 +02:00
Michal Privoznik	90eee87569	qemu: Separate out namespace handling code The qemu_domain.c file is big as is and we should split it into separate semantic blocks. Start with code that handles domain namespaces. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-08-03 19:32:27 +02:00
Ján Tomko	ee247e1d3f	Use g_strfreev instead of virStringFreeList Both accept a NULL value gracefully and virStringFreeList does not zero the pointer afterwards, so a straight replace is safe. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2020-08-03 15:37:36 +02:00
Ján Tomko	1edf164848	Remove redundant conditions All of these have been checked earlier. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2020-08-03 15:19:28 +02:00
Ján Tomko	6c7ba7b496	qemu: Fix affinity typo Fixes: `4c0398b528` Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-07-22 15:51:26 +02:00
Bihong Yu	3ee423c363	qemu: pre-create the dbus directory in qemuStateInitialize There is a race condition when creating the '/run/libvirt/qemu/dbus' directory in virDirCreateNoFork() while starting VMs concurrently, which results in a "failed to create directory '/run/libvirt/qemu/dbus': File exists" error message. Pre-create the dbus directory in qemuStateInitialize. Signed-off-by: Bihong Yu <yubihong@huawei.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-07-22 09:40:15 +02:00
Jiri Denemark	1031db3600	qemu: Properly set //cpu/@migratable default value for running domains Since active domains which do not have the attribute already set were not started by libvirt that probed for CPU migratable property, we need to check this property on reconnect and update the domain definition accordingly. https://bugzilla.redhat.com/show_bug.cgi?id=1857967 Reported-by: Mark Mielke <mark.mielke@gmail.com> Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-07-21 15:40:01 +02:00
Daniel Henrique Barboza	f187b2fb98	qemu_process.c: modernize qemuProcessQMPNew() Use g_autoptr() and remove the 'cleanup' label. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200717211556.1024748-3-danielhb413@gmail.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-07-21 15:34:36 +02:00
Peter Krempa	db712b0673	qemuDomainDiskLookupByNodename: Remove unused 'idx' All callers pass NULL as the value. Remove the argument. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-07-21 09:52:46 +02:00
Peter Krempa	c414ab00e2	qemuProcessHandleBlockThreshold: Report correct indexes The index returned by qemuDomainDiskLookupByNodename is the position in the backing chain rather than the index we report in the XML. Since with -blockdev they differ now and additionally the disk source also has an index we need to fix the 'threshold' events we report: 1) If it's the top level image we must always trigger the event without any suffix as we did until now 2) We must report the correct index 3) We must report the correct index also for the top level image, when blockdev is used. This means that we need to potentially emit 2 events, one for the device without the index and then when blockdev is used and the top level image has an index we must do it also with the index. This will fix it for blockdev cases, while also not removing previous semantics. https://bugzilla.redhat.com/show_bug.cgi?id=1857204 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-07-21 09:52:46 +02:00
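The three rules listed in commit `c414ab00e2` can be read as a small decision function. The sketch below is an illustration only: the function name, its parameters, and the `target[index]` suffix format are assumptions made for the example, not taken from the commit.

```python
def threshold_event_targets(disk_target, source_index, is_top_level, blockdev):
    """Return the device strings a 'threshold' event would be emitted for.

    disk_target:  disk target name, e.g. "vda"
    source_index: index of the image as reported in the XML (0 if none)
    is_top_level: True if the image is the top of the backing chain
    blockdev:     True if the VM uses -blockdev
    """
    targets = []
    # Rule 1: the top level image always triggers the event without a suffix.
    if is_top_level:
        targets.append(disk_target)
    # Rules 2 and 3: report the XML index for backing images, and for the
    # top level image too when blockdev is used and it has an index.
    if source_index and (not is_top_level or blockdev):
        targets.append(f"{disk_target}[{source_index}]")
    return targets
```

A top level image under blockdev therefore yields two events, which keeps the older suffix-less semantics while adding the indexed form.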
Peter Krempa	4a19b7b832	qemuDomainDiskBackingStoreGetName: Remove unused argument Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-07-21 09:52:46 +02:00
Prathamesh Chavan	aca37c3fb2	qemu_domainjob: introduce `privateData` for `qemuDomainJob` To remove dependency of `qemuDomainJob` on job specific parameters, a `privateData` pointer is introduced. To handle it, structure of callback functions is also introduced. Signed-off-by: Prathamesh Chavan <pc44800@gmail.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-07-20 15:34:58 +02:00
Michal Privoznik	824e349397	qemu: Use qemuSecuritySetSavedStateLabel() to label restore path Currently, when restoring from a domain the path that the domain restores from is labelled under qemuSecuritySetAllLabel() (and after v6.3.0-rc1~108 even outside transactions). While this grants QEMU the access, it has a flaw, because once the domain is restored, up and running then qemuSecurityDomainRestorePathLabel() is called, which is not a real counterpart. In case of DAC driver the SetAllLabel() does nothing with the restore path but RestorePathLabel() does - it chown()-s the file back and since there is no original label remembered, the file is chown()-ed to root:root. While the apparent solution is to have DAC driver set the label (and thus remember the original one) in SetAllLabel(), we can do better. Turns out, we are opening the file ourselves (because it may live on a root squashed NFS) and then are just passing the FD to QEMU. But this means, that we don't have to chown() the file at all, we need to set SELinux labels and/or add the path to AppArmor profile. And since we want to restore labels right after QEMU is done loading the migration stream (we don't want to wait until qemuSecurityRestoreAllLabel()), the best way to approach this is to have separate APIs for labelling and restoring label on the restore file. I will investigate whether AppArmor can use the SavedStateLabel() API instead of passing the restore path to SetAllLabel(). Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1851016 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>	2020-07-10 14:18:07 +02:00
Daniel P. Berrangé	fd460ef561	qemu: stop checking virObjectUnref return value Some, but not all, of the monitor event handlers check the virObjectUnref return value to see if the domain was disposed. It should not be possible for this to happen, since the function already holds a lock on the domain and has only just acquired an extra reference on the domain a few lines earlier. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-06-03 10:20:17 +01:00
Daniel Henrique Barboza	9665b27dba	qemuProcessRefreshCPU: skip 'host-model' logic for pSeries guests Commit v3.10.0-182-g237f045d9a ("qemu: Ignore fallback CPU attribute on reconnect") forced CPU 'fallback' to ALLOW, regardless of user choice. This fixed a situation in which guests created with older Libvirt versions, which used CPU mode 'host-model' in runtime, would fail to launch in a newer Libvirt if the fallback was set to FORBID. This would lead to a scenario where the CPU was translated from 'host-model' to 'custom', but then the FORBID setting would make the translation process fail. PSeries can operate with 'host-model' in runtime due to specific PPC64 mechanics regarding compatibility mode. The update() implementation of the cpuDriverPPC64 driver is a NO-OP if CPU mode is 'host-model', and the driver does not implement translate(). The commit mentioned above is causing PSeries guests to get their 'fallback' setting to ALLOW, overwriting user choice, exposing a design problem in qemuProcessRefreshCPU() - for PSeries guests, handling 'host-model' as it is being done does not apply. All other cpuArchDrivers implement update() and change guest mode to VIR_CPU_MODE_CUSTOM, meaning that PSeries is currently the only exception to this logic. Let's make it official. https://bugzilla.redhat.com/show_bug.cgi?id=1660711 Suggested-by: Jiri Denemark <jdenemar@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200525123945.4049591-2-danielhb413@gmail.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-05-25 16:20:25 +02:00
Daniel Henrique Barboza	f600c42627	qemu_process.c: modernize qemuProcessUpdateCPU code path Use automatic cleanup on qemuProcessUpdateCPU and the functions called by it. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200522195620.3843442-5-danielhb413@gmail.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-05-25 12:31:14 +02:00
Peter Krempa	78d30aa0bf	qemu: Prepare for testing of 'netdev_add' props via qemuxml2argvtest qemuxml2argv test suite is way more comprehensive than the hotplug suite. Since we share the code paths for monitor and command line hotplug we can easily test the properties of devices against the QAPI schema. To achieve this we'll need to skip the JSON->commandline conversion for the test run so that we can analyze the pure properties. This patch adds flags for the command line generator and hooks them into the JSON->commandline convertor for -netdev. An upcoming patch will make use of this new infrastructure. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-05-20 09:41:58 +02:00
Michal Privoznik	8fd2749b2d	qemuProcessStop: Reattach NVMe disks a domain is mirroring into If the mirror destination is not a file but a NVMe disk, then call qemuHostdevReAttachOneNVMeDisk() to reattach the NVMe back to the host. This would be done by blockjob code when the job finishes, but in this case the job won't finish - QEMU is killed meanwhile. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1825785 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2020-05-18 15:14:27 +02:00
Michal Privoznik	0230e38384	qemuProcessStop: Use XATTRs to restore seclabels on disks a domain is mirroring into In v5.10.0-rc1~42 (which was later fixed in v6.0.0-rc1~487) I am removing XATTRs for a file that QEMU is mirroring a disk into but it is killed meanwhile. Well, we can call qemuSecurityRestoreImageLabel() which will not only remove XATTRs but also use them to restore the original owner of the file. This would be done by blockjob code when the job finishes, but in this case the job won't finish - QEMU is killed meanwhile Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2020-05-18 15:13:14 +02:00
Ján Tomko	006782a8bc	qemu: only stop external devices after the domain A failure in qemuProcessLaunch would lead to qemuExtDevicesStop being called twice - once in the cleanup section and then again in qemuProcessStop. However, the first one is called while the QEMU process is still running, which is too soon for the swtpm process, because the swtpm_ioctl command can lock up: https://bugzilla.redhat.com/show_bug.cgi?id=1822523 Remove the first call and only leave the one in qemuProcessStop, which is called after the QEMU process is killed. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>	2020-05-13 15:29:37 +02:00
Peter Krempa	3bcbdc51da	qemu: process: Don't clear QEMU_CAPS_BLOCKDEV when SD card is present Help QEMU in deprecation of -drive if=none without the need to refactor all old boards. Stop masking out -blockdev support when -drive if=sd needs to be used. We achieve this by forbidding blockjobs and special-casing all other code paths. Blockjobs are sacrificed in this case as SD cards are a corner case for some ARM boards and are thus not used commonly. https://bugzilla.redhat.com/show_bug.cgi?id=1821692 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-05-12 06:55:00 +02:00
Peter Krempa	e664abb62e	qemu: Prepare for 'sd' card use together with blockdev SD cards need to be instantiated via -drive if=sd. This means that all cases where we use the blockdev path need to be special-cased for SD cards. Note that at this point QEMU_CAPS_BLOCKDEV is still cleared if the VM config has a SD card. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-05-12 06:55:00 +02:00
Peter Krempa	d876a93f05	qemu: Handle cases when 'qomName' isn't present Use the drive alias for all cases when we can't generate qomName. This is meant to handle disks on 'sd' bus which are instantiated via -drive if=sd as there isn't any specific QOM name for them. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-05-12 06:55:00 +02:00
Peter Krempa	cc4a277db2	qemu: Rename qemuDiskBusNeedsDriveArg to qemuDiskBusIsSD The function effectively boils down to whether the disk is 'SD'. Since we'll need to make more decisions based on the fact whether the disk is on the SD bus, rename the function. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-05-12 06:54:59 +02:00
Michal Privoznik	1d3a9ee9da	qemu: Make memory path generation embed driver aware So far, libvirt generates the following path for memory: $memoryBackingDir/$id-$shortName/ram-nodeN where $memoryBackingDir is the path where QEMU mmaps() memory for the guest (e.g. /var/lib/libvirt/qemu/ram), $id is domain ID and $shortName is shortened version of domain name. So for instance, the generated path may look something like this: /var/lib/libvirt/qemu/ram/1-QEMUGuest/ram-node0 While in case of embed driver the following path would be generated by default: $root/lib/qemu/ram/1-QEMUGuest/ram-node0 which is not clashing with other embed drivers, we allow users to override the default and have all embed drivers use the same prefix. This can create clashing paths. Fortunately, we can reuse the approach for machined name generation (v6.1.0-178-gc9bd08ee35) and include part of hash of the root in the generated path. Note, the important change is in qemuGetMemoryBackingBasePath(). The rest is needed to pass driver around. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-04-07 15:26:32 +02:00
Michal Privoznik	bf54784cb1	qemu: Make hugepages path generation embed driver aware So far, libvirt generates the following path for hugepages: $mnt/libvirt/qemu/$id-$shortName where $mnt is the mount point of hugetlbfs corresponding to hugepages of desired size (e.g. /dev/hugepages), $id is domain ID and $shortName is shortened version of domain name. So for instance, the generated path may look something like this: /dev/hugepages/libvirt/qemu/1-QEMUGuest But this won't work with embed driver really, because if there are two instances of embed driver, and they both want to start a domain with the same name and with hugepages, both drivers will generate the same path which is not desired. Fortunately, we can reuse the approach for machined name generation (v6.1.0-178-gc9bd08ee35) and include part of hash of the root in the generated path. Note, the important change is in qemuGetBaseHugepagePath(). The rest is needed to pass driver around. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-04-07 15:26:26 +02:00
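The path layouts quoted in commits `1d3a9ee9da` and `bf54784cb1` can be sketched as below. The non-embed layouts follow the commit messages; everything about the embed case is an assumption made for illustration: the helper `root_tag`, the use of SHA-256, the 8-character truncation, and the position of the hash in the path are hypothetical and do not describe what qemuGetMemoryBackingBasePath() or qemuGetBaseHugepagePath() actually do.

```python
import hashlib
import posixpath


def root_tag(embed_root):
    # Hypothetical: a short, stable tag derived from the embed driver root.
    return hashlib.sha256(embed_root.encode("utf-8")).hexdigest()[:8]


def memory_backing_path(memory_backing_dir, dom_id, short_name, node,
                        embed_root=None):
    # $memoryBackingDir/$id-$shortName/ram-nodeN
    parts = [memory_backing_dir]
    if embed_root is not None:
        parts.append(root_tag(embed_root))  # assumed placement of the hash
    parts += [f"{dom_id}-{short_name}", f"ram-node{node}"]
    return posixpath.join(*parts)


def hugepage_path(mnt, dom_id, short_name, embed_root=None):
    # $mnt/libvirt/qemu/$id-$shortName
    parts = [mnt, "libvirt", "qemu"]
    if embed_root is not None:
        parts.append(root_tag(embed_root))  # assumed placement of the hash
    parts.append(f"{dom_id}-{short_name}")
    return posixpath.join(*parts)
```

The point of the sketch is only that two embed driver instances with different roots, starting a domain with the same ID and name, no longer produce the same path.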
Daniel P. Berrangé	d2954c0729	qemu: ensure domain event thread is always stopped In previous commit: commit `e6afacb0fe` Author: Daniel P. Berrangé <berrange@redhat.com> Date: Wed Feb 12 12:26:11 2020 +0000 qemu: start/stop an event loop thread for domains A bogus comment was added claiming we didn't need to shutdown the event thread in the qemuProcessStop method, because this would be done in the monitor EOF callback. This was wrong because the EOF callback only runs in the case of a QEMU crash or a guest initiated clean shutdown & poweroff. In the case where the libvirt admin calls virDomainDestroy, the EOF callback never fires because we have already unregistered the event callbacks. We must thus always attempt to stop the event thread in qemuProcessStop. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reported-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-03-30 16:48:15 +01:00
Marc-André Lureau	db670b8d67	qemu: prepare and stop the dbus daemon Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-03-24 15:57:33 +01:00
Michal Privoznik	a02c589886	qemuProcessStartManagedPRDaemon: Don't pass -f pidfile to the daemon Now, that our virCommandSetPidFile() is more intelligent we don't need to rely on the daemon to create and lock the pidfile and use virCommandSetPidFile() at the same time. NOTE that as advertised in the previous commit, this was temporarily broken, because both virCommand and qemuProcessStartManagedPRDaemon() would try to lock the pidfile. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2020-03-24 15:53:03 +01:00
Gaurav Agrawal	d2c43a5b51	qemu: convert DomainLogContext class to use GObject Signed-off-by: Gaurav Agrawal <agrawalgaurav@gnome.org> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-03-16 17:28:39 +01:00
Ján Tomko	b0eea635b3	Use g_strerror instead of virStrerror Remove lots of stack-allocated buffers. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-03-13 17:26:55 +01:00
Nikolay Shirokovskiy	b47e3b9b5c	qemu: agent: sync once if qemu has serial port event Sync was introduced in [1] to check for ga presence. This check is racy but in the era before serial events were available there was no better solution I guess. In case we have the events the sync function is different. It allows us to flush stateless ga channel from remnants of previous communications. But we need to do it only once. Until we get a timeout on an issued command, the channel state is OK. [1] qemu_agent: Issue guest-sync prior to every command Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-03-12 18:07:50 +01:00
Daniel P. Berrangé	a18f2c52ac	qemu: convert agent to use the per-VM event loop This converts the QEMU agent APIs to use the per-VM event loop, which involves switching from virEvent APIs to GMainContext / GSource APIs. A GSocket is used as a convenient way to create a GSource for a socket, but is not yet used for actual I/O. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-03-11 14:45:01 +00:00
Daniel P. Berrangé	436a56e37d	qemu: convert monitor to use the per-VM event loop This converts the QEMU monitor APIs to use the per-VM event loop, which involves switching from virEvent APIs to GMainContext / GSource APIs. A GSocket is used as a convenient way to create a GSource for a socket, but is not yet used for actual I/O. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-03-11 14:44:55 +00:00
Daniel P. Berrangé	92890fbfa1	qemu: start/stop an event thread for QMP probing In common with regular QEMU guests, the QMP probing will need an event loop for handling monitor I/O operations. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-03-11 14:44:47 +00:00
Daniel P. Berrangé	e6afacb0fe	qemu: start/stop an event loop thread for domains The event loop thread will be responsible for handling any per-domain I/O operations, most notably the QEMU monitor and agent sockets. We start this event loop when launching QEMU, but stopping the event loop is a little more complicated. The obvious idea is to stop it in qemuProcessStop(), but if we do that we risk losing the final events from the QEMU monitor, as they might not have been read by the event thread at the time we tell the thread to stop. The solution is to delay shutdown of the event thread until we have seen EOF from the QEMU monitor, and thus we know there are no further events to process. Note that this assumes that we don't have events to process from the QEMU agent. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-03-11 14:44:44 +00:00
Michal Privoznik	13eb6c1468	qemu: Tell secdrivers which images are top parent When preparing images for block jobs we modify their seclabels so that QEMU can open them. However, as mentioned in the previous commit, secdrivers base some of their decisions on whether the image they are working on is top of the backing chain. Fortunately, in places where we call secdrivers we know this and the information can be passed to secdrivers. The problem is the following: after the first blockcommit from the base to one of the parents the XATTRs on the base image are not cleared and therefore the second attempt to do another blockcommit fails. This is caused by blockcommit code calling qemuSecuritySetImageLabel() over the base image, possibly multiple times (to ensure RW/RO access). A naive fix would be to call the restore function. But this is not possible, because that would deny QEMU the access to the base image. Fortunately, we can use the fact that seclabels are remembered only for the top of the backing chain and not for the rest of the backing chain. And thanks to the previous commit we can tell secdrivers which images are top of the backing chain. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1803551 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2020-03-09 14:14:55 +01:00
Daniel P. Berrangé	5bff668dfb	src: improve thread naming with human targeted names Historically threads are given a name based on the C function, and this name is just used inside libvirt. With OS level thread naming this name is now visible to debuggers, but also has to fit in 15 characters on Linux, so function names are too long in some cases. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-03-05 12:23:04 +00:00
Ján Tomko	b164eac5e1	qemuExtDevicesStart: pass logManager Pass logManager to qemuExtDevicesStart for future usage. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Tested-by: Andrea Bolognani <abologna@redhat.com>	2020-03-04 12:08:50 +01:00
Ján Tomko	feb69a19ac	conf: do not pass vm object to virDomainClearNetBandwidth This function only uses the domain definition. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-02-25 17:50:47 +01:00
Ján Tomko	7e0d11be5b	virsh: include virutil.h where used Include virutil.h in all files that use it, instead of relying on it being pulled in somehow. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-02-24 23:15:50 +01:00
Michal Privoznik	74ec3f4d7d	qemu: Don't explicitly remove pidfile after virPidFileForceCleanupPath() In two places where virPidFileForceCleanupPath() is called, we try to unlink() the pidfile again. This is needless because virPidFileForceCleanupPath() has done just that. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-02-20 12:57:19 +01:00
zhenwei pi	26badd13e8	qemu: support Panic Crashloaded event handling Pvpanic device supports bit 1 as crashloaded event, it means that guest actually panicked and ran kexec to handle the error on the guest side. Handle crashloaded as a lifecycle event in libvirt. Test case: Guest side: before testing, we need to make sure kdump is enabled, 1, build new pvpanic driver (with commit from upstream e0b9a42735f2672ca2764cfbea6e55a81098d5ba 191941692a3d1b6a9614502b279be062926b70f5) 2, insmod new kmod 3, enable crash_kexec_post_notifiers, # echo 1 > /sys/module/kernel/parameters/crash_kexec_post_notifiers 4, trigger kernel panic # echo 1 > /proc/sys/kernel/sysrq # echo c > /proc/sysrq-trigger Host side: 1, build new qemu with pvpanic patches (with commit from upstream 600d7b47e8f5085919fd1d1157f25950ea8dbc11 7dc58deea79a343ac3adc5cadb97215086054c86) 2, build libvirt with this patch 3, handle lifecycle event and trigger guest side panic # virsh event stretch --event lifecycle event 'lifecycle' for domain stretch: Crashed Crashloaded events received: 1 Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>	2020-02-07 14:05:25 +00:00
Jiri Denemark	80791859ac	qemu: Pass machine type to virQEMUCapsIsCPUModeSupported The usability of a specific CPU mode may depend on machine type, let's prepare for this by passing it to virQEMUCapsIsCPUModeSupported. Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-02-07 09:19:02 +01:00
Michal Privoznik	a37a8c569d	Drop virAtomic module Now, that every use of virAtomic was replaced with its g_atomic equivalent, let's remove the module. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-02-02 16:36:58 +01:00
Michal Privoznik	7390ff3caa	src: Drop virAtomicIntDecAndTest() with g_atomic_int_dec_and_test() Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-02-02 16:36:56 +01:00
Michal Privoznik	574678a27f	src: Replace virAtomicIntInc() with g_atomic_int_add() Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-02-02 16:36:54 +01:00
Daniel P. Berrangé	068efae5b1	qemu: add support for running QEMU driver in embedded mode This enables support for running QEMU embedded to the calling application process using a URI: qemu:///embed?root=/some/path Note that it is important to keep the path reasonably short to avoid risk of hitting the limit on UNIX socket path names which is 108 characters. When using the embedded mode with a root=/var/tmp/embed, the driver will use the following paths: logDir: /var/tmp/embed/log/qemu swtpmLogDir: /var/tmp/embed/log/swtpm configBaseDir: /var/tmp/embed/etc/qemu stateDir: /var/tmp/embed/run/qemu swtpmStateDir: /var/tmp/embed/run/swtpm cacheDir: /var/tmp/embed/cache/qemu libDir: /var/tmp/embed/lib/qemu swtpmStorageDir: /var/tmp/embed/lib/swtpm defaultTLSx509certdir: /var/tmp/embed/etc/pki/qemu These are identical whether the embedded driver is privileged or unprivileged. This compares with the system instance which uses logDir: /var/log/libvirt/qemu swtpmLogDir: /var/log/swtpm/libvirt/qemu configBaseDir: /etc/libvirt/qemu stateDir: /run/libvirt/qemu swtpmStateDir: /run/libvirt/qemu/swtpm cacheDir: /var/cache/libvirt/qemu libDir: /var/lib/libvirt/qemu swtpmStorageDir: /var/lib/libvirt/swtpm defaultTLSx509certdir: /etc/pki/qemu At this time all features present in the QEMU driver are available when running in embedded mode, availability matching whether the embedded driver is privileged or unprivileged. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-01-27 11:04:03 +00:00
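The directory layout listed in the commit above can be expressed as a small lookup. This is a sketch for illustration only: the relative paths and the URI form are taken from the commit message, while the function and table names are hypothetical and the root is assumed to need no URI escaping.

```python
# Relative locations under the embed root, as listed in the commit message.
EMBED_LAYOUT = {
    "logDir": "log/qemu",
    "swtpmLogDir": "log/swtpm",
    "configBaseDir": "etc/qemu",
    "stateDir": "run/qemu",
    "swtpmStateDir": "run/swtpm",
    "cacheDir": "cache/qemu",
    "libDir": "lib/qemu",
    "swtpmStorageDir": "lib/swtpm",
    "defaultTLSx509certdir": "etc/pki/qemu",
}


def embed_uri(root):
    """Return the connection URI for an embedded driver rooted at `root`.

    Assumes `root` contains no characters that need URI escaping.
    """
    return f"qemu:///embed?root={root}"


def embed_paths(root):
    """Map each driver directory name to its absolute path under `root`."""
    base = root.rstrip("/")
    return {name: f"{base}/{rel}" for name, rel in EMBED_LAYOUT.items()}


if __name__ == "__main__":
    print(embed_uri("/var/tmp/embed"))
    for name, path in embed_paths("/var/tmp/embed").items():
        print(f"{name}: {path}")
```

Because every directory hangs off the root, a long root lengthens every derived path, which is why the commit message advises keeping it short relative to the 108-character UNIX socket path limit.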
Pavel Hrdina	894556ca81	secret: move virSecretGetSecretString into virsecret The function virSecretGetSecretString calls into secret driver and is used from other hypervisors drivers and as such makes more sense in util. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-01-17 15:52:37 +01:00
Daniel P. Berrangé	7b9645a7d1	util: replace atomic ops impls with g_atomic_int* Libvirt's original atomic ops impls were largely copied from GLib's code at the time. The only API difference was that libvirt's virAtomicIntInc() would return a value, but g_atomic_int_inc was void. We thus use g_atomic_int_add(v, 1) instead, though this means virAtomicIntInc() now returns the original value, instead of the new value. This rewrites libvirt's impl in terms of g_atomic_int* as a short term conversion. The key motivation was to quickly eliminate use of GNULIB's verify_expr() macro which is not a direct match for G_STATIC_ASSERT_EXPR. Long term all the callers should be updated to use g_atomic_int* directly. Reviewed-by: Pavel Hrdina <phrdina@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-01-17 10:02:00 +00:00
Jiri Denemark	bd04d63ad9	qemu: Don't emit SUSPENDED_POSTCOPY event on destination When pause-before-switchover QEMU capability is enabled, we get STOP event before MIGRATION event with postcopy-active state. To properly handle post-copy migration and emit correct events commit v4.10.0-rc1-4-geca9d21e6c added a hack to qemuProcessHandleMigrationStatus which translates the paused state reason to VIR_DOMAIN_PAUSED_POSTCOPY and emits VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY event when migration state changes to post-copy. However, the code was effective on both sides of migration resulting in a confusing VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY event on the destination host, where entering post-copy mode is already properly advertised by VIR_DOMAIN_EVENT_RESUMED_POSTCOPY event. https://bugzilla.redhat.com/show_bug.cgi?id=1791458 Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-01-16 15:12:19 +01:00
Michal Privoznik	50d7465f3d	qemu_firmware: Pass virDomainDef into qemuFirmwareFillDomain() This function needs domain definition really, we don't need to pass the whole domain object. This saves couple of dereferences and characters esp. in more checks to come. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Cole Robinson <crobinso@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-01-07 16:26:47 +01:00
Peter Krempa	5632ed8bad	qemu: process: Terminate backup job on VM destroy Commit `d75f865fb9` caused a job-deadlock if a VM is running the backup job and being destroyed as it removed the cleanup of the async job type and there was nothing to clean up the backup job. Add an explicit cleanup of the backup job when destroying a VM. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-01-06 10:15:36 +01:00
Peter Krempa	728b993c8a	qemu: Reset the node-name allocator in qemuDomainObjPrivateDataClear qemuDomainObjPrivateDataClear clears state which become invalid after VM stopped running and the node name allocator belongs there. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-01-06 10:15:35 +01:00
Nikolay Shirokovskiy	6c6d93bc62	qemu: hide details of fake reboot If we use fake reboot then domain goes through running->shutdown->running state changes with shutdown state only for short period of time. At least this is implementation details leaking into API. And also there is one real case when this is not convenient. I'm doing a backup with the help of temporary block snapshot (with the help of qemu's API which is used in the newly created libvirt's backup API). If guest is shut down I want to continue to backup so I don't kill the process and domain is in shutdown state. Later when backup is finished I want to destroy qemu process. So I check if it is in shutdown state and destroy it if it is. Now if instead of shutdown domain got fake reboot then I can destroy process in the middle of fake reboot process. After shutdown event we also get stop event and now as domain state is running it will be transitioned to paused state and back to running later. Though this is not critical for the described case I guess it is better not to leak these details to user too. So let's leave domain in running state on stop event if fake reboot is in progress. Reconnection code handles this patch without modification. It detects that qemu is not running due to shutdown and then calls qemuProcessShutdownOrReboot which reboots as fake reboot flag is set. Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> Reviewed-by: Cole Robinson <crobinso@redhat.com>	2019-12-24 09:22:40 +03:00
Daniel Henrique Barboza	7a7d36055c	qemu_process.c: remove 'cleanup' label from qemuProcessCreatePretendCmd() The 'cleanup' flag is doing no cleanup in this function. We can remove it and return NULL on error or qemuBuildCommandLine(). Reviewed-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2019-12-20 18:31:51 -05:00
Daniel Henrique Barboza	d8eb3ab9e1	qemu_process.c: remove cleanup labels after g_auto() changes The g_auto() changes made by the previous patches made a lot of 'cleanup' labels obsolete. Let's remove them. Reviewed-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2019-12-20 18:31:51 -05:00
Daniel Henrique Barboza	d234efc59a	qemu_process.c: use g_autoptr() Change all feasible pointers to use g_autoptr(). Reviewed-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2019-12-20 18:31:51 -05:00
Daniel Henrique Barboza	982ea95142	qemu_process.c: use g_autofree Change all feasible strings and scalar pointers to use g_autofree. Reviewed-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2019-12-20 18:31:51 -05:00
Michal Privoznik	8e2026cc18	qemu: Generate command line of NVMe disks Now, that we have everything prepared, we can generate command line for NVMe disks. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Cole Robinson <crobinso@redhat.com>	2019-12-17 10:04:44 +01:00
Daniel P. Berrangé	766c8ae963	Revert "qemu: directly create virResctrlInfo ignoring capabilities" This reverts commit `7be5fe66cd`. This commit broke resctrl, because it missed the fact that the virResctrlInfoGetCache() has side-effects causing it to actually change the virResctrlInfo parameter, not merely get data from it. This code will need some refactoring before we can try separating it from virCapabilities again. Reviewed-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2019-12-12 11:16:44 +00:00
Daniel P. Berrangé	1902356231	qemu: keep capabilities when running QEMU as root When QEMU uid/gid is set to non-root this is pointless as if we just used a regular setuid/setgid call, the process will have all its capabilities cleared anyway by the kernel. When QEMU uid/gid is set to root, this is almost (always?) never what people actually want. People make QEMU run as root in order to access some privileged resource that libvirt doesn't support yet and this often requires capabilities. As a result they have to go find the qemu.conf param to turn this off. This is not viable for libguestfs - they want to control everything via the XML security label to request running as root regardless of the qemu.conf settings for user/group. Clearing capabilities was implemented originally because there was a proposal in Fedora to change permissions such that root, with no capabilities would not be able to compromise the system. ie a locked down root account. This never went anywhere though, and as a result clearing capabilities when running as root does not really get us any security benefit AFAICT. The root user can easily do something like create a cronjob, which will then faithfully be run with full capabilities, trivially bypassing the restriction we place. IOW, our clearing of capabilities is both useless from a security POV, and breaks valid use cases when people need to run as root. This removes the clear_emulator_capabilities configuration option from qemu.conf, and always runs QEMU with capabilities when root. The behaviour when non-root is unchanged. Reviewed-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2019-12-11 16:01:20 +00:00
Peter Krempa	3656bb0a13	qemu: domain: Introduce QEMU_ASYNC_JOB_BACKUP async job type We will want to use the async job infrastructure along with all the APIs and event for the backup job so add the backup job as a new async job type. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2019-12-10 12:41:57 +01:00
Daniel P. Berrangé	7be5fe66cd	qemu: directly create virResctrlInfo ignoring capabilities We always refresh the capabilities object when using virResctrlInfo during process startup. This is undesirable overhead, because we can just directly create a virResctrlInfo instead. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2019-12-09 10:17:27 +00:00
Daniel P. Berrangé	adf009b48f	qemu: use host CPU object directly Avoid grabbing the whole virCapsPtr object when we only need the host CPU information. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2019-12-09 10:17:27 +00:00
Daniel P. Berrangé	1a1d848694	qemu: use NUMA capabilities object directly Avoid grabbing the whole virCapsPtr object when we only need the NUMA information. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2019-12-09 10:17:27 +00:00
Daniel P. Berrangé	6cc992bd1a	conf: move NUMA capabilities into self contained object The NUMA cells are stored directly in the virCapsHostPtr struct. This moves them into their own struct allowing them to be stored independently of the rest of the host capabilities. The change is used as an excuse to switch the representation to use a GPtrArray too. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2019-12-09 10:17:27 +00:00

1487 Commits