libvirt

mirror of https://gitlab.com/libvirt/libvirt.git synced 2025-02-04 02:45:22 +00:00

Author	SHA1	Message	Date
Martin Kletzander	3791f29b08	qemu: Do not error out when setting affinity failed Consider a host with 8 CPUs. There are the following possible scenarios 1. Bare metal; libvirtd has affinity of 8 CPUs; QEMU should get 8 CPUs 2. Bare metal; libvirtd has affinity of 2 CPUs; QEMU should get 8 CPUs 3. Container has affinity of 8 CPUs; libvirtd has affinity of 8 CPus; QEMU should get 8 CPUs 4. Container has affinity of 8 CPUs; libvirtd has affinity of 2 CPus; QEMU should get 8 CPUs 5. Container has affinity of 4 CPUs; libvirtd has affinity of 4 CPus; QEMU should get 4 CPUs 6. Container has affinity of 4 CPUs; libvirtd has affinity of 2 CPus; QEMU should get 4 CPUs Scenarios 1 & 2 always work unless systemd restricted libvirtd privs. Scenario 3 works because libvirt checks current affinity first and skips the sched_setaffinity call, avoiding the SYS_NICE issue Scenario 4 works only if CAP_SYS_NICE is availalbe Scenarios 5 & 6 works only if CAP_SYS_NICE is present AND the cgroups cpuset is not set on the container. If libvirt blindly ignores the sched_setaffinity failure, then scenarios 4, 5 and 6 should all work, but with caveat in case 4 and 6, that QEMU will only get 2 CPUs instead of the possible 8 and 4 respectively. This is still better than failing. Therefore libvirt can blindly ignore the setaffinity failure, but ONLY ignore it when there was no affinity specified in the XML config. If user specified affinity explicitly, libvirt must report an error if it can't be honoured. Resolves: https://bugzilla.redhat.com/1819801 Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-09-04 14:44:21 +02:00
Michal Privoznik	95b9db4ee2	lib: Prefer WITH_* prefix for #if conditionals Currently, we are mixing: #if HAVE_BLAH with #if WITH_BLAH. Things got way better with Pavel's work on meson, but apparently, mixing these two lead to confusing and easy to miss bugs (see 31fb929eca for instance). While we were forced to use HAVE_ prefix with autotools, we are free to chose our own prefix with meson and since WITH_ prefix appears to be more popular let's use it everywhere. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-02 10:28:10 +02:00
Laine Stump	95089f481e	util: assign tap device names using a monotonically increasing integer When creating a standard tap device, if provided with an ifname that contains "%d", rather than taking that literally as the name to use for the new device, the kernel will instead use that string as a template, and search for the lowest number that could be put in place of %d and produce an otherwise unused and unique name for the new device. For example, if there is no tap device name given in the XML, libvirt will always send "vnet%d" as the device name, and the kernel will create new devices named "vnet0", "vnet1", etc. If one of those devices is deleted, creating a "hole" in the name list, the kernel will always attempt to reuse the name in the hole first before using a name with a higher number (i.e. it finds the lowest possible unused number). The problem with this, as described in the previous patch dealing with macvtap device naming, is that it makes "immediate reuse" of a newly freed tap device name much more common, and in the aftermath of deleting a tap device, there is some other necessary cleanup of things which are named based on the device name (nwfilter rules, bandwidth rules, OVS switch ports, to name a few) that could end up stomping over the top of the setup of a new device of the same name for a different guest. Since the kernel "create a name based on a template" functionality for tap devices doesn't exist for macvtap, this patch for standard tap devices is a bit different from the previous patch for macvtap - in particular there was no previous "bitmap ID reservation system" or overly-complex retry loop that needed to be removed. We simply find and unused name, and pass that name on to the kernel instead of "vnet%d". This counter is also wrapped when either it gets to INT_MAX or if the full name would overflow IFNAMSIZ-1 characters. In the case of "vnet%d" and a 32 bit int, we would reach INT_MAX first, but possibly someday someone will change the name from vnet to something else. (NB: It is still possible for a user to provide their own parameterized template name (e.g. "mytap%d") in the XML, and libvirt will just pass that through to the kernel as it always has.) Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-09-01 14:16:44 -04:00
Laine Stump	d7f38beb2e	util: replace macvtap name reservation bitmap with a simple counter There have been some reports that, due to libvirt always trying to assign the lowest numbered macvtap / tap device name possible, a new guest would sometimes be started using the same tap device name as previously used by another guest that is in the process of being destroyed as the new guest is starting. In some cases this has led to, for example, the old guest's qemuProcessStop() code deleting a port from an OVS switch that had just been re-added by the new guest (because the port name is based on only the device name using the port). Similar problems can happen (and I believe have) with nwfilter rules and bandwidth rules (which are both instantiated based on the name of the tap device). A couple patches have been previously proposed to change the ordering of startup and shutdown processing, or to put a mutex around everything related to the tap/macvtap device name usage, but in the end no matter what you do there will still be possible holes, because the device could be deleted outside libvirt's control (for example, regular tap devices are automatically deleted when the qemu process terminates, and that isn't always initiated by libvirt but could instead happen completely asynchronously - libvirt then has no control over the ordering of shutdown operations, and no opportunity to protect it with a mutex.) But this only happens if a new device is created at the same time as one is being deleted. We can effectively eliminate the chance of this happening if we end the practice of always looking for the lowest numbered available device name, and instead just keep an integer that is incremented each time we need a new device name. At some point it will need to wrap back around to 0 (in order to avoid the IFNAMSIZ 15 character limit if nothing else), and we can't guarantee that the new name really will be the least* recently used name, but "math" suggests that it will be much less common that we'll try to re-use the most recently used name. This patch implements such a counter for macvtap/macvlan, replacing the existing, and much more complicated, "ID reservation" system. The counter is set according to whatever macvtap/macvlan devices are already in use by guests when libvirtd is started, incremented each time a new device name is needed, and wraps back to 0 when either INT_MAX is reached, or when the resulting device name would be longer than IFNAMSIZ-1 characters (which actually is what happens when the template for the device name is "maccvtap%d"). The result is that no macvtap name will be re-used until the host has created (and possibly destroyed) 99,999,999 devices. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-09-01 14:16:36 -04:00
Ján Tomko	0a37e0695b	Split declarations from initializations Split those initializations that depend on a statement above them. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-08-25 19:03:11 +02:00
Ján Tomko	a5152f23e7	Move declarations before statements Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-08-25 19:03:11 +02:00
Laine Stump	5cad64ec03	qemu: remove unreachable code in qemuProcessStart() Back when the original version of this chunk of code was added (commit 41b087198 in libvirt-0.8.1 in April 2010), we used virExecDaemonize() to start the qemu process, and would continue on in the function (which at that time was called qemudStartVMDaemon()) even if a -1 was returned. So it was possible to get to this code with rv == -1 (it was called "ret" in that version of the code). In modern libvirt code, qemu is started with virCommandRun(); then we call virPidFileReadPath(); those are the only two ways of setting "rv" prior to this code being removed, and in either case if the new value of rv < 0, then we immediately skip over the rest of the code to the cleanup: label. This means that the code being removed by this patch is unreachable. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-08-24 23:46:51 -04:00
Michal Privoznik	9048dc4e62	qemuDomainBuildNamespace: Populate basic /dev from daemon's namespace As mentioned in previous commit, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with basic /dev nodes (e.g. /dev/null, /dev/kvm, etc.) into daemon's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-08-03 19:40:36 +02:00
Michal Privoznik	8da362fe62	qemu_domain_namespace: Repurpose qemuDomainBuildNamespace() Okay, here is the deal. Currently, the way we build namespace is very fragile. It is done from pre-exec hook when starting a domain, after we mass closed all FDs and before we drop privileges and exec() QEMU. This fact poses some limitations onto the namespace build code, e.g. it has to make sure not to keep any FD opened (not even through a library call), because it would be leaked to QEMU. Also, it has to call only async signal safe functions. These requirements are hard to meet - in fact as of my commit v6.2.0-rc1~235 we are leaking a FD into QEMU by calling libdevmapper functions. To solve this issue and avoid similar problems in the future, we should change our paradigm. We already have functions which can populate domain's namespace with nodes from the daemon context. If we use them to populate the namespace and keep only the bare minimum in the pre-exec hook, we've mitigated the risk. Therefore, the old qemuDomainBuildNamespace() is renamed to qemuDomainUnshareNamespace() and new qemuDomainBuildNamespace() function is introduced. So far, the new function is basically a NOP and domain's namespace is still populated from the pre-exec hook - next patches will fix it. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-08-03 19:40:36 +02:00
Michal Privoznik	764eaf1aa4	qemu_domain_namespace: Rename qemuDomainCreateNamespace() The name of this function is not very helpful, because it doesn't create anything, it just flips a bit in a bitmask when domain is starting up. Move the function internals into qemu_process.c and forget the function ever existed. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-08-03 19:40:33 +02:00
Michal Privoznik	90eee87569	qemu: Separate out namespace handling code The qemu_domain.c file is big as is and we should split it into separate semantic blocks. Start with code that handles domain namespaces. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-08-03 19:32:27 +02:00
Ján Tomko	ee247e1d3f	Use g_strfeev instead of virStringFreeList Both accept a NULL value gracefully and virStringFreeList does not zero the pointer afterwards, so a straight replace is safe. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2020-08-03 15:37:36 +02:00
Ján Tomko	1edf164848	Remove redundant conditions All of these have been checked earlier. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>	2020-08-03 15:19:28 +02:00
Ján Tomko	6c7ba7b496	qemu: Fix affinity typo Fixes: 4c0398b5284d14c55eca51095673b6fadbbd85fb Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-07-22 15:51:26 +02:00
Bihong Yu	3ee423c363	qemu: pre-create the dbus directory in qemuStateInitialize There are races condiction to make '/run/libvirt/qemu/dbus' directory in virDirCreateNoFork() while concurrent start VMs, and get "failed to create directory '/run/libvirt/qemu/dbus': File exists" error message. pre-create the dbus directory in qemuStateInitialize. Signed-off-by: Bihong Yu <yubihong@huawei.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-07-22 09:40:15 +02:00
Jiri Denemark	1031db3600	qemu: Properly set //cpu/@migratable default value for running domains Since active domains which do not have the attribute already set were not started by libvirt that probed for CPU migratable property, we need to check this property on reconnect and update the domain definition accordingly. https://bugzilla.redhat.com/show_bug.cgi?id=1857967 Reported-by: Mark Mielke <mark.mielke@gmail.com> Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-07-21 15:40:01 +02:00
Daniel Henrique Barboza	f187b2fb98	qemu_process.c: modernize qemuProcessQMPNew() Use g_autoptr() and remove the 'cleanup' label. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200717211556.1024748-3-danielhb413@gmail.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-07-21 15:34:36 +02:00
Peter Krempa	db712b0673	qemuDomainDiskLookupByNodename: Remove unused 'idx' All callers pass NULL as the value. Remove the argument. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-07-21 09:52:46 +02:00
Peter Krempa	c414ab00e2	qemuProcessHandleBlockThreshold: Report correct indexes The index returned by qemuDomainDiskLookupByNodename is the position in the backing chain rather than the index we report in the XML. Since with -blockdev they differ now and additionally the disk source also has an index we need to fix the 'threshold' events we report: 1) If it's the top level image we must always trigger the event without any suffix as we did until now 2) We must report the correct index 3) We must report the correct index also for the top level image, when blockdev is used. This means that we need to potentially emit 2 events, one for the device without the index and then when blockdev is used and the top level image has an index we must do it also with the index. This will fix it for blockdev cases, while also not removing previous semantics. https://bugzilla.redhat.com/show_bug.cgi?id=1857204 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-07-21 09:52:46 +02:00
Peter Krempa	4a19b7b832	qemuDomainDiskBackingStoreGetName: Remove unused argument Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-07-21 09:52:46 +02:00
Prathamesh Chavan	aca37c3fb2	qemu_domainjob: introduce `privateData` for `qemuDomainJob` To remove dependecy of `qemuDomainJob` on job specific paramters, a `privateData` pointer is introduced. To handle it, structure of callback functions is also introduced. Signed-off-by: Prathamesh Chavan <pc44800@gmail.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-07-20 15:34:58 +02:00
Michal Privoznik	824e349397	qemu: Use qemuSecuritySetSavedStateLabel() to label restore path Currently, when restoring from a domain the path that the domain restores from is labelled under qemuSecuritySetAllLabel() (and after v6.3.0-rc1~108 even outside transactions). While this grants QEMU the access, it has a flaw, because once the domain is restored, up and running then qemuSecurityDomainRestorePathLabel() is called, which is not real counterpart. In case of DAC driver the SetAllLabel() does nothing with the restore path but RestorePathLabel() does - it chown()-s the file back and since there is no original label remembered, the file is chown()-ed to root:root. While the apparent solution is to have DAC driver set the label (and thus remember the original one) in SetAllLabel(), we can do better. Turns out, we are opening the file ourselves (because it may live on a root squashed NFS) and then are just passing the FD to QEMU. But this means, that we don't have to chown() the file at all, we need to set SELinux labels and/or add the path to AppArmor profile. And since we want to restore labels right after QEMU is done loading the migration stream (we don't want to wait until qemuSecurityRestoreAllLabel()), the best way to approach this is to have separate APIs for labelling and restoring label on the restore file. I will investigate whether AppArmor can use the SavedStateLabel() API instead of passing the restore path to SetAllLabel(). Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1851016 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>	2020-07-10 14:18:07 +02:00
Daniel P. Berrangé	fd460ef561	qemu: stop checking virObjectUnref return value Some, but not all, of the monitor event handlers check the virObjectUnref return value to see if the domain was disposed. It should not be possible for this to happen, since the function already holds a lock on the domain and has only just acquired an extra reference on the domain a few lines earlier. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-06-03 10:20:17 +01:00
Daniel Henrique Barboza	9665b27dba	qemuProcessRefreshCPU: skip 'host-model' logic for pSeries guests Commit v3.10.0-182-g237f045d9a ("qemu: Ignore fallback CPU attribute on reconnect") forced CPU 'fallback' to ALLOW, regardless of user choice. This fixed a situation in which guests created with older Libvirt versions, which used CPU mode 'host-model' in runtime, would fail to launch in a newer Libvirt if the fallback was set to FORBID. This would lead to a scenario where the CPU was translated to 'host-model' to 'custom', but then the FORBID setting would make the translation process fail. PSeries can operate with 'host-model' in runtime due to specific PPC64 mechanics regarding compatibility mode. The update() implementation of the cpuDriverPPC64 driver is a NO-OP if CPU mode is 'host-model', and the driver does not implement translate(). The commit mentioned above is causing PSeries guests to get their 'fallback' setting to ALLOW, overwriting user choice, exposing a design problem in qemuProcessRefreshCPU() - for PSeries guests, handling 'host-model' as it is being done does not apply. All other cpuArchDrivers implements update() and changes guest mode to VIR_CPU_MODE_CUSTOM, meaning that PSeries is currently the only exception to this logic. Let's make it official. https://bugzilla.redhat.com/show_bug.cgi?id=1660711 Suggested-by: Jiri Denemark <jdenemar@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200525123945.4049591-2-danielhb413@gmail.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-05-25 16:20:25 +02:00
Daniel Henrique Barboza	f600c42627	qemu_process.c: modernize qemuProcessUpdateCPU code path Use automatic cleanup on qemuProcessUpdateCPU and the functions called by it. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200522195620.3843442-5-danielhb413@gmail.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-05-25 12:31:14 +02:00
Peter Krempa	78d30aa0bf	qemu: Prepare for testing of 'netdev_add' props via qemuxml2argvtest qemuxml2argv test suite is way more comprehensive than the hotplug suite. Since we share the code paths for monitor and command line hotplug we can easily test the properties of devices against the QAPI schema. To achieve this we'll need to skip the JSON->commandline conversion for the test run so that we can analyze the pure properties. This patch adds flags for the comand line generator and hook them into the JSON->commandline convertor for -netdev. An upcoming patch will make use of this new infrastructure. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-05-20 09:41:58 +02:00
Michal Privoznik	8fd2749b2d	qemuProcessStop: Reattach NVMe disks a domain is mirroring into If the mirror destination is not a file but a NVMe disk, then call qemuHostdevReAttachOneNVMeDisk() to reattach the NVMe back to the host. This would be done by blockjob code when the job finishes, but in this case the job won't finish - QEMU is killed meanwhile. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1825785 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2020-05-18 15:14:27 +02:00
Michal Privoznik	0230e38384	qemuProcessStop: Use XATTRs to restore seclabels on disks a domain is mirroring into In v5.10.0-rc1~42 (which was later fixed in v6.0.0-rc1~487) I am removing XATTRs for a file that QEMU is mirroring a disk into but it is killed meanwhile. Well, we can call qemuSecurityRestoreImageLabel() which will not only remove XATTRs but also use them to restore the original owner of the file. This would be done by blockjob code when the job finishes, but in this case the job won't finish - QEMU is killed meanwhile Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2020-05-18 15:13:14 +02:00
Ján Tomko	006782a8bc	qemu: only stop external devices after the domain A failure in qemuProcessLaunch would lead to qemuExtDevicesStop being called twice - once in the cleanup section and then again in qemuProcessStop. However, the first one is called while the QEMU process is still running, which is too soon for the swtpm process, because the swtmp_ioctl command can lock up: https://bugzilla.redhat.com/show_bug.cgi?id=1822523 Remove the first call and only leave the one in qemuProcessStop, which is called after the QEMU process is killed. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>	2020-05-13 15:29:37 +02:00
Peter Krempa	3bcbdc51da	qemu: process: Don't clear QEMU_CAPS_BLOCKDEV when SD card is present Help QEMU in deprecation of -drive if=none without the need to refactor all old boards. Stop masking out -blockdev support when -drive if=sd needs to be used. We achieve this by forbidding blockjobs and special-casing all other code paths. Blockjobs are sacrificed in this case as SD cards are a corner case for some ARM boards and are thus not used commonly. https://bugzilla.redhat.com/show_bug.cgi?id=1821692 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-05-12 06:55:00 +02:00
Peter Krempa	e664abb62e	qemu: Prepare for 'sd' card use together with blockdev SD cards need to be instantiated via -drive if=sd. This means that all cases where we use the blockdev path need to be special-cased for SD cards. Note that at this point QEMU_CAPS_BLOCKDEV is still cleared if the VM config has a SD card. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-05-12 06:55:00 +02:00
Peter Krempa	d876a93f05	qemu: Handle cases when 'qomName' isn't present Use the drive alias for all cases when we can't generate qomName. This is meant to handle disks on 'sd' bus which are instantiated via -drive if=sd as there isn't any specific QOM name for them. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-05-12 06:55:00 +02:00
Peter Krempa	cc4a277db2	qemu: Rename qemuDiskBusNeedsDriveArg to qemuDiskBusIsSD The function effectively boils down to whether the disk is 'SD'. Since we'll need to make more decisions based on the fact whether the disk is on the SD bus, rename the function. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-05-12 06:54:59 +02:00
Michal Privoznik	1d3a9ee9da	qemu: Make memory path generation embed driver aware So far, libvirt generates the following path for memory: $memoryBackingDir/$id-$shortName/ram-nodeN where $memoryBackingDir is the path where QEMU mmaps() memory for the guest (e.g. /var/lib/libvirt/qemu/ram), $id is domain ID and $shortName is shortened version of domain name. So for instance, the generated path may look something like this: /var/lib/libvirt/qemu/ram/1-QEMUGuest/ram-node0 While in case of embed driver the following path would be generated by default: $root/lib/qemu/ram/1-QEMUGuest/ram-node0 which is not clashing with other embed drivers, we allow users to override the default and have all embed drivers use the same prefix. This can create clashing paths. Fortunately, we can reuse the approach for machined name generation (v6.1.0-178-gc9bd08ee35) and include part of hash of the root in the generated path. Note, the important change is in qemuGetMemoryBackingBasePath(). The rest is needed to pass driver around. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-04-07 15:26:32 +02:00
Michal Privoznik	bf54784cb1	qemu: Make hugepages path generation embed driver aware So far, libvirt generates the following path for hugepages: $mnt/libvirt/qemu/$id-$shortName where $mnt is the mount point of hugetlbfs corresponding to hugepages of desired size (e.g. /dev/hugepages), $id is domain ID and $shortName is shortened version of domain name. So for instance, the generated path may look something like this: /dev/hugepages/libvirt/qemu/1-QEMUGuest But this won't work with embed driver really, because if there are two instances of embed driver, and they both want to start a domain with the same name and with hugepages, both drivers will generate the same path which is not desired. Fortunately, we can reuse the approach for machined name generation (v6.1.0-178-gc9bd08ee35) and include part of hash of the root in the generated path. Note, the important change is in qemuGetBaseHugepagePath(). The rest is needed to pass driver around. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-04-07 15:26:26 +02:00
Daniel P. Berrangé	d2954c0729	qemu: ensure domain event thread is always stopped In previous commit: commit e6afacb0feabd9bf58331aa0e5259a35378be74e Author: Daniel P. Berrangé <berrange@redhat.com> Date: Wed Feb 12 12:26:11 2020 +0000 qemu: start/stop an event loop thread for domains A bogus comment was added claiming we didn't need to shutdown the event thread in the qemuProcessStop method, because this would be done in the monitor EOF callback. This was wrong because the EOF callback only runs in the case of a QEMU crash or a guest initiated clean shutdown & poweroff. In the case where the libvirt admin calls virDomainDestroy, the EOF callback never fires because we have already unregistered the event callbacks. We must thus always attempt to stop the event thread in qemuProcessStop. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reported-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-03-30 16:48:15 +01:00
Marc-André Lureau	db670b8d67	qemu: prepare and stop the dbus daemon Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-03-24 15:57:33 +01:00
Michal Privoznik	a02c589886	qemuProcessStartManagedPRDaemon: Don't pass -f pidfile to the daemon Now, that our virCommandSetPidFile() is more intelligent we don't need to rely on the daemon to create and lock the pidfile and use virCommandSetPidFile() at the same time. NOTE that as advertised in the previous commit, this was temporarily broken, because both virCommand and qemuProcessStartManagedPRDaemon() would try to lock the pidfile. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2020-03-24 15:53:03 +01:00
Gaurav Agrawal	d2c43a5b51	qemu: convert DomainLogContext class to use GObject Signed-off-by: Gaurav Agrawal <agrawalgaurav@gnome.org> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-03-16 17:28:39 +01:00
Ján Tomko	b0eea635b3	Use g_strerror instead of virStrerror Remove lots of stack-allocated buffers. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-03-13 17:26:55 +01:00
Nikolay Shirokovskiy	b47e3b9b5c	qemu: agent: sync once if qemu has serial port event Sync was introduced in [1] to check for ga presence. This check is racy but in the era before serial events are available there was not better solution I guess. In case we have the events the sync function is different. It allows us to flush stateless ga channel from remnants of previous communications. But we need to do it only once. Until we get timeout on issued command channel state is ok. [1] qemu_agent: Issue guest-sync prior to every command Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-03-12 18:07:50 +01:00
Daniel P. Berrangé	a18f2c52ac	qemu: convert agent to use the per-VM event loop This converts the QEMU agent APIs to use the per-VM event loop, which involves switching from virEvent APIs to GMainContext / GSource APIs. A GSocket is used as a convenient way to create a GSource for a socket, but is not yet used for actual I/O. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-03-11 14:45:01 +00:00
Daniel P. Berrangé	436a56e37d	qemu: convert monitor to use the per-VM event loop This converts the QEMU monitor APIs to use the per-VM event loop, which involves switching from virEvent APIs to GMainContext / GSource APIs. A GSocket is used as a convenient way to create a GSource for a socket, but is not yet used for actual I/O. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-03-11 14:44:55 +00:00
Daniel P. Berrangé	92890fbfa1	qemu: start/stop an event thread for QMP probing In common with regular QEMU guests, the QMP probing will need an event loop for handling monitor I/O operations. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-03-11 14:44:47 +00:00
Daniel P. Berrangé	e6afacb0fe	qemu: start/stop an event loop thread for domains The event loop thread will be responsible for handling any per-domain I/O operations, most notably the QEMU monitor and agent sockets. We start this event loop when launching QEMU, but stopping the event loop is a little more complicated. The obvious idea is to stop it in qemuProcessStop(), but if we do that we risk loosing the final events from the QEMU monitor, as they might not have been read by the event thread at the time we tell the thread to stop. The solution is to delay shutdown of the event thread until we have seen EOF from the QEMU monitor, and thus we know there are no further events to process. Note that this assumes that we don't have events to process from the QEMU agent. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-03-11 14:44:44 +00:00
Michal Privoznik	13eb6c1468	qemu: Tell secdrivers which images are top parent When preparing images for block jobs we modify their seclabels so that QEMU can open them. However, as mentioned in the previous commit, secdrivers base some it their decisions whether the image they are working on is top of of the backing chain. Fortunately, in places where we call secdrivers we know this and the information can be passed to secdrivers. The problem is the following: after the first blockcommit from the base to one of the parents the XATTRs on the base image are not cleared and therefore the second attempt to do another blockcommit fails. This is caused by blockcommit code calling qemuSecuritySetImageLabel() over the base image, possibly multiple times (to ensure RW/RO access). A naive fix would be to call the restore function. But this is not possible, because that would deny QEMU the access to the base image. Fortunately, we can use the fact that seclabels are remembered only for the top of the backing chain and not for the rest of the backing chain. And thanks to the previous commit we can tell secdrivers which images are top of the backing chain. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1803551 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2020-03-09 14:14:55 +01:00
Daniel P. Berrangé	5bff668dfb	src: improve thread naming with human targetted names Historically threads are given a name based on the C function, and this name is just used inside libvirt. With OS level thread naming this name is now visible to debuggers, but also has to fit in 15 characters on Linux, so function names are too long in some cases. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-03-05 12:23:04 +00:00
Ján Tomko	b164eac5e1	qemuExtDevicesStart: pass logManager Pass logManager to qemuExtDevicesStart for future usage. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Tested-by: Andrea Bolognani <abologna@redhat.com>	2020-03-04 12:08:50 +01:00
Ján Tomko	feb69a19ac	conf: do not pass vm object to virDomainClearNetBandwidth This function only uses the domain definition. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-02-25 17:50:47 +01:00
Ján Tomko	7e0d11be5b	virsh: include virutil.h where used Include virutil.h in all files that use it, instead of relying on it being pulled in somehow. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-02-24 23:15:50 +01:00

1 2 3 4 5 ...

1413 Commits