libvirt

mirror of https://gitlab.com/libvirt/libvirt.git synced 2025-01-04 20:15:19 +00:00

Author	SHA1	Message	Date
Michal Privoznik	a658a4bdf7	qemuBuildMemoryBackendProps: Prealloc mem for memfd backend If a domain was using hugepages through memory-backend-file or via -mem-path, we would turn prealloc on. But we are not doing that for memory-backend-memfd. Fix this, because we need QEMU to fully allocate hugepages. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-10-01 12:03:06 +02:00
Michal Privoznik	0217c5a6b4	qemuBuildMemoryBackendProps: Respect //memoryBacking/allocation/@mode=immediate If user specifies immediate memory allocation in the domain XML, they want QEMU to fully allocate its memory. And if the memory was allocated using plain '-m' then we would honour it. But, if a memory backend is used, then we don't set the prealloc attribute of the backend. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-10-01 12:02:19 +02:00
Michal Privoznik	eda5cc7a62	qemuBuildMemoryBackendProps: Move @prealloc setting to backend agnostic part All three memory backends (-file, -ram and -memfd) have .prealloc attribute. Since we are setting it only for -file, the corresponding code lives only under if() that handles that specific backend. But in near future we will want to set the attribute for other backends too. Therefore, move the corresponding code outside of the if(). This causes some .argv files to be changed, but the only change happening there is move of the attribute (best viewed with: 'git show --color-words=.'). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-10-01 12:01:31 +02:00
Michal Privoznik	bfb1ab1df1	qemu: Use .hostdevice attribute for usb-host This originally started as bug 1595525 in which namespaces and libusb used in QEMU were not playing nicely with each other. The problem was that libusb built a cache of USB devices it saw (which was a very limited set because of the namespace) and then expected to receive udev events to keep the cache in sync. But those udev events didn't come because on hotplug when we mknod() devices in the namespace no udev event is generated. And what is worse - libusb failed to open a device that wasn't in the cache. Without going further into what the problem was, libusb added a new API for opening USB devices that avoids using cache which QEMU incorporated and exposes under "hostdevice" attribute. What is even nicer is that QEMU uses qemu_open() for path provided in the attribute and thus FD passing could be used. Except qemu_open() expects so called FD sets instead of `getfd' and these are not implemented in libvirt, yet. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1877218 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 10:59:35 +02:00
Michal Privoznik	66c5674e79	qemu_capabilities: Add QEMU_CAPS_USB_HOST_HOSTDEVICE This capability tracks whether "usb-host" device has "hostdevice" attribute. This attribute allows us to specify full path to the USB device ("/dev/bus/usb/$bus/$dev") but more importantly, since QEMU uses qemu_open() for this attribute it allows us to pass pre-opened FD and have QEMU not bother with opening the file at all. The attribute was added in v5.1.0-rc0~71^2~1 QEMU commit. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 10:50:43 +02:00
Peter Krempa	43f0944f66	qemu: migration: Rename qemuMigrationEatCookie to qemuMigrationCookieParse Use a more descriptive name and move the verb to the end so that the functions conform with the naming policy. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 10:01:05 +02:00
Peter Krempa	5b32815d1a	qemuMigrationCookieXMLFormatStr: Remove There is just one caller, inline the code. This also optimizes the code as we no longer have to calculate length of the output XML as it's actually stored in the buffer struct. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 10:01:05 +02:00
Peter Krempa	2d155e2348	qemuMigrationSrcBeginPhase: Use qemuMigrationCookieNew We need an empty cookie, so use qemuMigrationCookieNew instead of qemuMigrationEatCookie with NULL/0 arguments. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 10:01:05 +02:00
Peter Krempa	775296cbd6	qemuMigrationCookieNew: Export Allow direct use rather than going through qemuMigrationEatCookie with NULL/0 arguments. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 10:01:05 +02:00
Peter Krempa	4aef0fe324	qemuMigrationCookieNew: Refactor allocation and cleanup Move around some code so that we can get rid of the 'cleanup:' label. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 10:01:05 +02:00
Peter Krempa	6c8b68b312	qemu: migration: Rename qemuMigrationBakeCookie to qemuMigrationCookieFormat Use a more descriptive name and move the verb to the end so that the functions conform with the naming policy. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 10:01:05 +02:00
Masayoshi Mizuma	596c659b4e	qemu: validate: Allow <transient/> disks Extract the validation of transient disk option. We support transient disks in qemu under the following conditions: - -blockdev is used - the disk source is a local file - the disk type is 'disk' - the disk is not readonly Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Tested-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 09:55:02 +02:00
Masayoshi Mizuma	1c9227de5d	qemu: process: Handle transient disks on VM startup Add overlays after the VM starts before we start executing guest code. Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Tested-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 09:55:02 +02:00
Peter Krempa	e86b16ced7	qemu: hotplug: Remove overlay of <transient> disk on disk unplug Remove the overlay if the disk was <transient/>. Note that even if we'd forbid unplug of such a disk through the API, the disk can still be ejected from the guest. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Tested-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 09:55:02 +02:00
Masayoshi Mizuma	cb62c23ff7	qemu: Block migration when transient disk option is enabled Block migration when transient disk option is enabled to simplify the handling of the overlay files. Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Tested-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 09:55:02 +02:00
Masayoshi Mizuma	83182f0838	qemu: Block disk hotplug when transient disk option is enabled For now we disable disk hotplug of transient disk as it requires creating an overlay prior to adding the frontend. Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Tested-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 09:55:02 +02:00
Masayoshi Mizuma	b3c582623c	qemu: Block blockjobs when transient disk option is enabled For now we disallow blockjobs with transient disks to avoid dealing with obsoleted overlays. Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Tested-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 09:55:02 +02:00
Peter Krempa	117ff49db7	qemu: snapshot: Introduce helpers for creating overlays on <transient/> disks To implement <transient/> disks we'll need to install an overlay on top of the original disk image which will be discarded after the VM is turned off. This was initially implemented by qemu but libvirt never picked up this option as the overlays were created by qemu without libvirt involvement which didn't work with SELinux. With blockdev the qemu feature became unsupported so we need to do this via the snapshot code anyways. The helpers introduced in this patch prepare a fake snapshot disk definition for a disk which is configured as <transient/> and use it to create a snapshot (without actually modifying metadata or persistent def). Signed-off-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Tested-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 09:55:02 +02:00
Peter Krempa	afc25e8553	qemu: prepare cleanup for <transient/> disk overlays Later patches will implement support for <transient/> disks in libvirt by installing an overlay on top of the configured image. This will require cleanup after the VM will be stopped so that the state is correctly discarded. Since the overlay will be installed only during the startup phase of the VM we need to ensure that qemuProcessStop doesn't delete the original file on some previous failure. This is solved by adding 'inhibitDiskTransientDelete' VM private data member which is set prior to any startup step and will be cleared once transient disk overlays are established. Based on that we can then delete the overlays for any <transient/> disk. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Tested-by: Ján Tomko <jtomko@redhat.com>	2020-10-01 09:55:02 +02:00
Ján Tomko	a63b48c5ec	qemu: agent: set ifname to NULL after freeing CVE-2020-25637 Signed-off-by: Ján Tomko <jtomko@redhat.com> Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com> Fixes: `0977b8aa07` Reviewed-by: Mauro Matteo Cascella <mcascell@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-09-30 11:42:28 +02:00
Ján Tomko	e4116eaa44	rpc: require write acl for guest agent in virDomainInterfaceAddresses CVE-2020-25637 Add a requirement for domain:write if source is set to VIR_DOMAIN_INTERFACE_ADDRESSES_SRC_AGENT. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-09-30 11:42:28 +02:00
Peter Krempa	850f991897	qemuSnapshotDiskContextNew: Don't set 'ndd' 'ndd' tracks the actual number of snapshot disks filled into the structure and is incremented by the functions filling the context, thus it must not be set when initializing the context. Fixes: `8c2ecdf131` Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-24 13:20:45 +02:00
Peter Krempa	6e514ea27c	qemuSnapshotDiskContextCleanup: Don't leak snapctxt The container itself needs to be freed too. Fixes: `8c2ecdf131` Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-24 13:20:45 +02:00
Peter Krempa	4a927468fb	qemuSnapshotDiskPrepare: rename to qemuSnapshotDiskPrepareActiveExternal Make it obvious that the snapshot is prepared for the active external snapshot case. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-24 11:49:13 +02:00
Peter Krempa	ebdbd05aab	qemuSnapshotCreateActiveExternalDisks: Extract actual snapshot creation to 'qemuSnapshotDiskCreate' Extract the code which invokes the monitor and finalizes the snapshot into a separate function for easier reuse. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-24 11:49:13 +02:00
Peter Krempa	8c2ecdf131	qemu: snapshot: Introduce qemuSnapshotDiskContext Add a container struct which holds all data needed to create and clean up after a (for now external) snapshot. This will aggregate all the 'qemuSnapshotDiskDataPtr' the 'actions' of a transaction QMP command and everything needed for cleanup at any given point. This aggregation allows to simplify the arguments of the functions which prepare the snapshot data and additionally will simplify the code necessary for creating overlays on top of <transient/> disks. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-24 11:49:13 +02:00
Peter Krempa	a09c82cbd5	qemuSnapshotDiskPrepare/Cleanup: simplify passing of 'driver' and 'blockdev' Both can be fetched from 'vm'. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-24 11:49:13 +02:00
Peter Krempa	eb4aa7b109	qemuSnapshotDiskUpdateSource: Extract 'driver' and 'blockdev' from 'vm' Reduce the number of arguments by taking them from 'vm'. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-24 11:49:13 +02:00
Peter Krempa	8eacbeac74	qemu: snapshot: Rename 'qemuSnapshotCreateDiskActive' to 'qemuSnapshotCreateActiveExternalDisks' Be more specific about the role of the function. It's creating the disk portion of an external active snapshot. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-24 11:49:13 +02:00
Peter Krempa	1bb0faa51a	qemuSnapshotCreateInactiveExternal: Don't access 'idx' of snapshot After virDomainSnapshotAlignDisks is called the definitions of disks in the snapshot definition and in the domain definition are in the same order so they can be addressed using the same index. This frees up 'idx' to be removed later. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-23 22:37:56 +02:00
Peter Krempa	2b150c4d5f	qemuDomainBlockRebase: Replace ternary operator with if/else Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-23 22:37:56 +02:00
Peter Krempa	bc3a78f61a	virStorageSourceNew: Abort on failure Add an abort() on the class/object allocation failures so that virStorageSourceNew() always returns a virStorageSource and remove checks from all callers. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-23 22:37:56 +02:00
Collin Walling	9c6996124f	qemu: substitute missing model name for host-passthrough Before: $ uname -m s390x $ cat passthrough-cpu.xml <cpu check="none" mode="host-passthrough" /> $ virsh hypervisor-cpu-compare passthrough-cpu.xml error: Failed to compare hypervisor CPU with passthrough-cpu.xml error: internal error: unable to execute QEMU command 'query-cpu-model-comparison': Invalid parameter type for 'modelb.name', expected: string After: $ virsh hypervisor-cpu-compare passthrough-cpu.xml CPU described in passthrough-cpu.xml is identical to the CPU provided by hypervisor on the host Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Signed-off-by: Collin Walling <walling@linux.ibm.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-09-23 21:20:06 +02:00
Daniel Henrique Barboza	ace5931553	conf, qemu: move qemuDomainNVDimmAlignSizePseries to domain_conf.c We'll use the auto-alignment function during parse time, in domain_conf.c. Let's move the function to that file, renaming it to virDomainNVDimmAlignSizePseries(). This will also make it clearer that, although QEMU is the only driver that currently supports it, pSeries NVDIMM restrictions aren't tied to QEMU. Reviewed-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-09-23 13:19:47 -03:00
Ján Tomko	8e12a0b8fa	qemu: firmware: check virJSONValueObjectGet return value If the mapping is not present, we should not try to access its elements. Signed-off-by: Ján Tomko <jtomko@redhat.com> Fixes: `8b5b80f4c5` Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2020-09-23 13:26:34 +02:00
Daniel Henrique Barboza	63af8fdeb2	qemu: revert latest pSeries NVDIMM design changes In [1], changes were made to remove the existing auto-alignment for pSeries NVDIMM devices. That design promotes strange situations where the NVDIMM size reported in the domain XML is different from what QEMU is actually using. We removed the auto-alignment and relied on standard size validation. However, this goes against Libvirt design philosophy of not tampering with existing guest behavior, as pointed out by Daniel in [2]. Since we can't know for sure whether there are guests that are relying on the auto-alignment feature to work, the changes made in [1] are a direct violation of this rule. This patch reverts [1] entirely, re-enabling auto-alignment for pSeries NVDIMM as it was before. Changes will be made to ease the limitations of this design without hurting existing guests. This reverts the following commits: - commit `2d93cbdea9` Revert "formatdomain.html.in: mention pSeries NVDIMM 'align down' mechanic" - commit `0ee56369c8` qemu_domain.c: change qemuDomainMemoryDeviceAlignSize() return type - commit `07de813924` qemu_domain.c: do not auto-align ppc64 NVDIMMs - commit `0ccceaa57c` qemu_validate.c: add pSeries NVDIMM size alignment validation - commit `4fa2202d88` qemu_domain.c: make qemuDomainGetMemorySizeAlignment() public [1] https://www.redhat.com/archives/libvir-list/2020-July/msg02010.html [2] https://www.redhat.com/archives/libvir-list/2020-September/msg00572.html Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>	2020-09-22 12:25:34 +02:00
Roman Bogorodskiy	f787df9947	conf: add 'isa' controller type Introduce 'isa' controller type. In domain XML it looks this way: ... <controller type='isa' index='0'> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/> </controller> ... Currently, this is needed for the bhyve driver to allow choosing a specific PCI address for that. In bhyve, this controller is used to attach serial ports and a boot ROM. Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-21 20:01:12 +04:00
Nikolay Shirokovskiy	5756a7bf2a	qemu: fix concurrency crash bug in force snapshot revert This patch is just a revert of [1]. In order to acquire nested jobs correctly, we should NOT pass QEMU_ASYNC_JOB_NONE while we are in an async job, as that patch suggests. The patch tried to fix issues introduced by another patch [2], where jobs are mistakenly cleared out in qemuProcessStop. A later patch [3] fixed the issue introduced by patch [2]. Now we need to revert [1] as well, because we still have the same concurrency crash issues that [3] described, but for the force revert. [1] `0c4408c83`: qemu: Don't use asyncJob after stop during snapshot revert [2] `888aa4b6b`: qemuDomainObjPrivateDataClear: Don't leak @migParams [3] `d75f865fb`: qemu: fix concurrency crash bug in snapshot revert Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-09-16 11:45:41 +03:00
Peter Krempa	f2d90b558f	qemuBuildHostdevSCSIAttachPrepare: Propagate 'readonly' flag also for iSCSI The 'readonly' hostdev property is stored separately from the virStorageSource as some hostdevs are not described by a virStorage source. We need to propagate the flag to the virStorage source also for iSCSI backends as it's used to generate the backend properties. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1868856 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-09-16 09:08:56 +02:00
Peter Krempa	1a5f35dbd2	qemu: backup: Write TLS cert and secret object aliases into status XML We've put the aliases into the backup job definition after the status XML was already written so they didn't appear in the on-disk state. Move the code putting them into the private definition earlier, so that the status XML update done by saving blockjobs already writes them out. Also add a note notifying that the block job status update writes the status XML. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1870488 Fixes: `423576679a` Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-15 15:25:22 +02:00
Peter Krempa	5058062b5d	qemu: backup: Remove note that TLS should be implemented Commit `423576679a` implementing TLS forgot to remove the comment. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-15 15:25:22 +02:00
Peter Krempa	e5dc1427d7	qemuDomainPrepareHostdev: Don't base backend nodename on device alias QEMU's blockdev nodenames which are used to back SCSI/iSCSI hostdevs are limited to 32 characters. If a user passes a very long user alias as name of the host device it's easy to end up with a too-long nodename. To prevent this from happening don't base the nodename on the possibly user-specified alias but on the normal sequential node name generator. We then store the name in the status XML for further use. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-15 15:20:23 +02:00
Peter Krempa	a669d68336	qemuDomainPrepareHostdev: base hostdev secret object names on backend alias The secret object is used to pass data to the backend so it's better fitting to base the secret object name on the SCSI host device backend name. Since we store the object alias in the status XML this modification is safe in regards to existing guests. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-15 15:20:23 +02:00
Peter Krempa	ca495825a3	qemuDomainPrepareHostdev: Allocate backend nodenames in the prepare function Allocate the nodename in the setup function rather than in the command line generator. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-15 15:20:23 +02:00
Peter Krempa	c17e4907fe	qemuDomainSecretHostdevPrepare: remove The function is no longer used once we setup per-hostdev data. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-15 15:20:23 +02:00
Peter Krempa	3673bdbe13	qemu: domain: Extract preparation of hostdev specific data to a separate function Historically we've prepared secrets for all objects in one place. This doesn't make much sense and it's semantically more appealing to prepare everything for a single device type in one place. Move the setup of the (iSCSI|SCSI) hostdev secrets into a new function which will be used to setup other things as well in the future. This is a similar approach we do for disks. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-15 15:20:23 +02:00
Peter Krempa	82b60ec8ce	qemuBlockStorageSourceAttachData: remove 'storageNodeNameCopy' This was a hack when we were locally regenerating the nodename so that it's not leaked. Now that we use proper virStorageSource with persistence it's no longer required. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-15 15:20:23 +02:00
Peter Krempa	cca2dd4890	qemuBuildHostdevSCSI(A|De)tachPrepare: Use virStorageSource in def for SCSI hostdevs Modify the attach/detach data generators to actually use the virStorageSourceStructure embedded in the SCSI config data rather than creating an ad-hoc internal one. The modification will allow us to properly store the nodename used for the backend in the status XML rather than re-generating it all the time. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-15 15:20:23 +02:00
Peter Krempa	482c52b177	qemu: domain: Fill in (i)SCSI backend nodename if it is not present in status XML For upgrade reasons so that we can modify the used nodename we must generate the old version for all status XMLs which don't have it stored explicitly. The change will be required as using the user-provided alias may result in too-long nodenames which will be rejected by qemu. Add code which fills in the appropriate old value and add test cases to validate that it's added and also that existing nodenames are not overwritten. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-15 15:20:23 +02:00
Ján Tomko	af16e754cd	qemuProcessReconnect: clear 'oldjob' After we started copying the privateData pointer in qemuDomainObjRestoreJob, we should also free them once we're done with them. Register the clear function and use g_auto. Also add a check for job->cb to qemuDomainObjClearJob, to prevent freeing an uninitialized job. https://bugzilla.redhat.com/show_bug.cgi?id=1878450 Signed-off-by: Ján Tomko <jtomko@redhat.com> Fixes: `aca37c3fb2`	2020-09-14 18:10:56 +02:00
Ján Tomko	a3c340e05f	qemu: qemuDomainObjClearJob: use g_clear_pointer The function used g_clear_pointer for all but one pointer. Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-09-14 18:10:56 +02:00
Ján Tomko	ce66f9724c	qemu: rename qemuDomainObjFreeJob -> qemuDomainObjClearJob This function does not free the job. Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-09-14 18:10:56 +02:00
Lin Ma	10841b6cb6	qemu: Return perf status that affect next boot for shutoff domains When we set up perf events for a shutoff domain and check the settings, all perf events are reported as 'disabled' unless we add --config, which is redundant for a shutoff domain. # virsh domstate $GUEST shut off # virsh perf --domain $GUEST cmt : disabled mbmt : disabled mbml : disabled ...... # virsh perf --domain $GUEST --enable mbmt mbmt : enabled # virsh perf --domain $GUEST cmt : disabled mbmt : disabled mbml : disabled ...... Use virDomainObjGetOneDefState instead of virDomainObjGetOneDef to fix the issue. After the patch, the perf event status of a shutoff domain is reported correctly: # virsh domstate $GUEST shut off # virsh perf --domain $GUEST cmt : disabled mbmt : disabled mbml : disabled ...... # virsh perf --domain $GUEST --enable mbmt mbmt : enabled # virsh perf --domain $GUEST cmt : disabled mbmt : enabled mbml : disabled ...... Signed-off-by: Lin Ma <lma@suse.de> Reviewed-by: Erik Skultety <eskultet@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-09-12 12:49:31 +02:00
Lin Ma	308ec831bb	qemu: qemuDomainPMSuspendForDuration: Check availability of agent The API requires a guest agent configured and running in the domain's guest OS, so check for the QEMU agent in qemuDomainPMSuspendForDuration(). Signed-off-by: Lin Ma <lma@suse.de> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-09-12 12:49:31 +02:00
Tim Wiederhake	caf5a88e59	qemu: Use glib memory functions in qemuProcessReadLog Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-09-11 18:19:58 +02:00
Tim Wiederhake	3e60deeed3	qemu: Use glib memory functions in qemuDomainLogContextRead Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-09-11 18:19:58 +02:00
Tim Wiederhake	42459f0b01	qemu: Use glib memory functions in qemuDomainMasterKeyReadFile Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>	2020-09-11 18:19:58 +02:00
Erik Skultety	9824a82198	qemu: qemuDomainPMSuspendAgent: Don't assign to 'ret' in a conditional When the guest agent isn't running, we still report success on a PM suspend action even though we logged an error correctly, this is because we poisoned the 'ret' value a few lines above. Fixes: `a663a86081` Signed-off-by: Erik Skultety <eskultet@redhat.com>	2020-09-11 14:48:40 +02:00
Michal Privoznik	c43622f06e	qemuFirmwareFillDomain: Fill NVRAM template on migration too In `8e1804f9f6` I've tried to fix the following use case: domain is started with path to UEFI only and relies on libvirt to figure out corresponding NVRAM template to create a per-domain copy from. The fix consisted of having a check tailored exactly for this use case and if it's hit then using FW autoselection to figure it out. Unfortunately, the NVRAM template is not saved in the inactive XML (well, the domain might be transient anyway). Then, as a part of that check we see whether the per-domain copy doesn't exist already and if it does then no template is looked up hence no template will appear in the live XML. This works, until the domain is migrated. At the destination, the per-domain copy will not exist so we need to know the template to create the per-domain copy from. But we don't even get to the check because we are not starting a fresh new domain and thus the qemuFirmwareFillDomain() function quits early. The solution is to switch order of these two checks. That is evaluate the check for the old style before checking flags. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852910 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>	2020-09-09 14:47:51 +02:00
Michal Privoznik	ec46e6d44b	qemu_process: Separate VIR_PERF_EVENT_* setting into a function When starting a domain, qemuProcessLaunch() iterates over all VIR_PERF_EVENT_* values and (possibly) enables them. While there is nothing wrong with the code, the for loop where it's done makes it harder to jump onto next block of code. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-08 10:57:24 +02:00
Peter Krempa	de79fad40f	qemuBlockStorageSourceCreateDetectSize: Propagate cluster size for 'qcow2' Propagate the cluster size from the original image as the user might have configured a custom cluster size for performance reasons. Propagate the cluster size of a qcow2 image to the new overlay or copy. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-09-08 08:48:53 +02:00
Peter Krempa	e60620e28b	qemu: block: Allow specifying cluster size when using 'blockdev-create' 'blockdev-create' allows us to create the image with a custom cluster size if we wish to. Wire it up for 'qcow2'. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-09-08 08:48:53 +02:00
Peter Krempa	fd49364d8b	qemu: monitor: Detect image cluster size from 'query-named-block-nodes' Configuring the cluster size of an image may have performance implications. This patch allows us to detect cluster size for existing images so that we will be able to propagate it to new images which are based on existing images e.g. during snapshots/block-copy/etc. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-09-08 08:48:53 +02:00
Michal Privoznik	4a72b76b8a	qemu_namespace: Don't leak mknod items that are being skipped over When building and populating domain NS a couple of functions are called that append paths to a string list. This string list is then inspected, one item at the time by qemuNamespacePrepareOneItem() which gathers all the info for given path (stat buffer, possible link target, ACLs, SELinux label) using qemuNamespaceMknodItemInit(). If the path needs to be created in the domain's private /dev then it's added onto this qemuNamespaceMknodData list which is freed later in the process. But, if the path does not need to be created in the domain's private /dev, then the memory allocated by qemuNamespaceMknodItemInit() is not freed anywhere leading to a leak. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-07 16:27:25 +02:00
Martin Kletzander	f5b486daea	qemu: Allow setting affinity to fail and don't report error This is just a clean-up of commit `3791f29b08` using the new parameter of virProcessSetAffinity() introduced in commit `9514e24984` so that there is no error reported in the logs. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-07 14:48:57 +02:00
Martin Kletzander	9514e24984	Do not report error when setting affinity is allowed to fail Suggested-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-07 11:35:36 +02:00
Ján Tomko	7afc99ae2d	qemu: migration: remove unused variable ../src/qemu/qemu_migration.c:4091:36: error: unused variable 'cfg' [-Werror,-Wunused-variable] g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); Signed-off-by: Ján Tomko <jtomko@redhat.com> Fixes: `d92c2bbc65`	2020-09-07 11:03:54 +02:00
Michal Privoznik	d92c2bbc65	lib: Prefer g_autoptr() declaration of virQEMUDriverConfigPtr In the past we had to declare @cfg and then explicitly unref it. But now, with glib we can use g_autoptr() which will do the unref automatically and thus is more bulletproof. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2020-09-07 10:47:54 +02:00
Michal Privoznik	5befe4ee18	qemu_interface: Fix @cfg refcounting in qemuInterfacePrepareSlirp() In the qemuInterfacePrepareSlirp() function, the qemu driver config is obtained (via virQEMUDriverGetConfig()), but it is never unrefed leading to mangled refcounter. Fixes: `9145b3f1cc` Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>	2020-09-07 10:46:21 +02:00
Nikolay Shirokovskiy	399039a6b1	qemu: implement driver's shutdown/shutdown wait methods On shutdown we just stop accepting new jobs for worker thread so that on shutdown wait we can exit worker thread faster. Yes we basically stop processing of events for VMs but we are going to do so anyway in case of daemon shutdown. At the same time synchronous event processing that some API calls may require are still possible as per VM event loop is still running and we don't need worker thread for synchronous event processing. Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-09-07 09:33:59 +03:00
Nikolay Shirokovskiy	860a999802	qemu: avoid deadlock in qemuDomainObjStopWorker We are dropping the only reference here so that the event loop thread is going to be exited synchronously. In order to avoid deadlocks we need to unlock the VM so that any handler being called can finish execution and thus the event loop thread can finish too. Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-09-07 09:33:59 +03:00
Nikolay Shirokovskiy	5c0cd375d1	qemu: don't shutdown event thread in monitor EOF callback This hunk was introduced in [1] in order to avoid losing events from monitor on stopping qemu process. But as explained in [2] on destroy we will get neither EOF nor any other events as monitor is just closed. In case of crash/shutdown we won't get any more events as well and qemuDomainObjStopWorker will be called by qemuProcessStop eventually. Thus let's remove qemuDomainObjStopWorker from qemuProcessHandleMonitorEOF as it is not useful anymore. [1] `e6afacb0f`: qemu: start/stop an event loop thread for domains [2] `d2954c072`: qemu: ensure domain event thread is always stopped Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-09-07 09:33:59 +03:00
Martin Kletzander	fc7d53edf4	qemu: Fix comment in qemuProcessSetupPid This was supposed to be done in commit `3791f29b08`, but I missed a spot. Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2020-09-06 13:44:27 +02:00
Martin Kletzander	f51cbe92c0	qemu: Allow migration over UNIX socket This allows: a) migration without access to network b) complete control of the migration stream c) easy migration between containerised libvirt daemons on the same host Resolves: https://bugzilla.redhat.com/1638889 Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2020-09-05 07:55:45 +02:00
Martin Kletzander	3791f29b08	qemu: Do not error out when setting affinity failed Consider a host with 8 CPUs. There are the following possible scenarios 1. Bare metal; libvirtd has affinity of 8 CPUs; QEMU should get 8 CPUs 2. Bare metal; libvirtd has affinity of 2 CPUs; QEMU should get 8 CPUs 3. Container has affinity of 8 CPUs; libvirtd has affinity of 8 CPUs; QEMU should get 8 CPUs 4. Container has affinity of 8 CPUs; libvirtd has affinity of 2 CPUs; QEMU should get 8 CPUs 5. Container has affinity of 4 CPUs; libvirtd has affinity of 4 CPUs; QEMU should get 4 CPUs 6. Container has affinity of 4 CPUs; libvirtd has affinity of 2 CPUs; QEMU should get 4 CPUs Scenarios 1 & 2 always work unless systemd restricted libvirtd privs. Scenario 3 works because libvirt checks current affinity first and skips the sched_setaffinity call, avoiding the SYS_NICE issue Scenario 4 works only if CAP_SYS_NICE is available Scenarios 5 & 6 work only if CAP_SYS_NICE is present AND the cgroups cpuset is not set on the container. If libvirt blindly ignores the sched_setaffinity failure, then scenarios 4, 5 and 6 should all work, but with caveat in case 4 and 6, that QEMU will only get 2 CPUs instead of the possible 8 and 4 respectively. This is still better than failing. Therefore libvirt can blindly ignore the setaffinity failure, but ONLY ignore it when there was no affinity specified in the XML config. If user specified affinity explicitly, libvirt must report an error if it can't be honoured. Resolves: https://bugzilla.redhat.com/1819801 Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-09-04 14:44:21 +02:00
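The policy described in `3791f29b08` can be sketched in a few lines of C. This is only an illustration under stated assumptions: `set_affinity_tolerant()` and its `from_xml` flag are hypothetical names, not libvirt's `virProcessSetAffinity()`, and the sketch is Linux-specific.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE            /* for sched_setaffinity() and CPU_EQUAL() */
#endif
#include <errno.h>
#include <sched.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper (not a libvirt function).
 * from_xml: true when the user asked for this affinity explicitly. */
static int
set_affinity_tolerant(pid_t pid, const cpu_set_t *wanted, bool from_xml)
{
    cpu_set_t current;

    /* Scenario 3: if the affinity already matches, skip the syscall and
     * thereby avoid needing CAP_SYS_NICE at all. */
    if (sched_getaffinity(pid, sizeof(current), &current) == 0 &&
        CPU_EQUAL(&current, wanted))
        return 0;

    if (sched_setaffinity(pid, sizeof(*wanted), wanted) == 0)
        return 0;

    /* Explicitly requested affinity must be honoured or reported. */
    if (from_xml) {
        fprintf(stderr, "cannot set CPU affinity: %s\n", strerror(errno));
        return -1;
    }

    /* No affinity in the config: ignore the failure silently. */
    return 0;
}
```

With `from_xml` set to false the caller proceeds with whatever affinity the process inherited, which is the "still better than failing" outcome the commit message argues for.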
Martin Kletzander	49186372db	qemu: Allow NBD migration over UNIX socket Adds new typed param for migration and uses this as a UNIX socket path that should be used for the NBD part of migration. And also adds virsh support. Partially resolves: https://bugzilla.redhat.com/1638889 Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-09-04 10:20:49 +02:00
Martin Kletzander	e74d627bb3	qemu: Rework starting NBD server for migration Clean up the semantics by using one extra self-describing variable. This also fixes the port allocation when the port is specified. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-09-04 10:20:49 +02:00
Martin Kletzander	d17ece4dd4	qemu: Rework qemuMigrationSrcConnect Instead of saving some data from a union up front and changing an overlayed struct before using said data, let's just set the new values after they are decided. This will increase the readability of future commit(s). Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-09-04 10:20:49 +02:00
Martin Kletzander	ae200449fe	qemu: Use g_autofree in qemuMigrationSrcConnect Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-09-04 10:20:49 +02:00
Michal Privoznik	8abd1ffed1	qemu_namespace: Be tolerant to non-existent files when populating /dev In 6.7.0 release I've changed how domain namespace is built and populated. Previously it used to be done from a pre-exec hook (run in the forked off child, just before dropping all privileges and exec()-ing QEMU), which not only meant we had to have two different code paths for creating a node in domain's namespace (one for this pre-exec hook, the other for hotplug run from the daemon), it also proved problematic because it was leaking FDs into QEMU process. To mitigate this problem, we've not only ditched libdevmapper from the NS population process, I've also dropped the pre-exec code and let the NS be populated from the daemon (using the hotplug code). But, I was not careful when doing so, because the pre-exec code was tolerant to files that don't exist, while this new code isn't. For instance, the very first thing that is done when the new NS is created is it's populated with @defaultDeviceACL which contains files like /dev/null, /dev/zero, /dev/random and /dev/kvm (and others). While the rest will probably exist every time, /dev/kvm might not and thus the new code I wrote has to be tolerant to that. Of course, users can override the @defaultDeviceACL (by setting cgroup_device_acl in qemu.conf) and remove /dev/kvm (which is acceptable workaround), but we definitely want libvirt to work out of the box even on hosts without KVM. Fixes: `9048dc4e62` Reported-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-04 08:18:21 +02:00
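The tolerance that `8abd1ffed1` restores amounts to treating a missing path as "nothing to do" rather than as an error. A minimal sketch, assuming a hypothetical `prepare_one_item()` helper (the real code lives in qemuNamespacePrepareOneItem() and gathers far more than a stat buffer):

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper (not a libvirt function).
 * Returns 1 if the path exists and should be created in the namespace,
 *         0 if the path does not exist and is skipped,
 *        -1 on any other error. */
static int
prepare_one_item(const char *path, struct stat *sb)
{
    if (lstat(path, sb) < 0) {
        /* e.g. /dev/kvm on a host without KVM: skip it quietly */
        if (errno == ENOENT)
            return 0;

        fprintf(stderr, "unable to access %s: %s\n", path, strerror(errno));
        return -1;
    }

    return 1;
}
```

A caller iterating over the device list would continue with the next path whenever the helper returns 0.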
Han Han	be28a7fbd6	qemu_validate: Only allow none address for watchdog ib700 Since QEMU 1.5.3, the ib700 watchdog device has no options for address, and not address in device tree: $ /usr/libexec/qemu-kvm -version QEMU emulator version 1.5.3 (qemu-kvm-1.5.3-175.el7), Copyright (c) 2003-2008 Fabrice Bellard $ /usr/libexec/qemu-kvm -device ib700,\? $ virsh qemu-monitor-command seabios --hmp info qtree\|grep ib700 -A 2 dev: ib700, id "watchdog0" dev: isa-serial, id "serial0" index = 0 So only allow it to use none address. Fixes: `8a54cc1d08` Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1509908 Signed-off-by: Han Han <hhan@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-09-02 18:50:38 +02:00
Thomas Huth	f8333b3b0a	qemu: Fix domfsinfo for non-PCI device information from guest agent qemuAgentFSInfoToPublic() currently only sets the devAlias for PCI devices. However, the QEMU guest agent could also provide the device name in the "dev" field of the response for other devices instead (well, at least after fixing another problem in the current QEMU guest agent...). So if creating the devAlias from the PCI information failed, let's fall back to the name provided by the guest agent. This helps to fix the empty "Target" fields that occur when running "virsh domfsinfo" on s390x where CCW devices are used for the guest instead of PCI devices. Also add a proper debug message here in case we completely failed to set the device alias, since this problem here was very hard to debug: The only two error messages that I've seen were "Unable to get filesystem information" and "Unable to encode message payload" - which only indicates that something went wrong in the RPC call. No debug message indicated the real problem, so I had to learn the hard way why the RPC call failed (it apparently does not like devAlias left to be NULL) and where the real problem comes from. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1755075 Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2020-09-02 17:49:09 +01:00
Thomas Huth	2f5d8ffebe	qemu: Do not silently allow non-available timers on non-x86 systems libvirt currently silently allows <timer name="kvmclock"/> and some other timer tags in the guest XML definition for timers that do not exist on non-x86 systems. We should not silently ignore these tags since the users might not get what they expected otherwise. Note: The error is only generated if the timer is marked with present="yes" - otherwise we would suddenly refuse XML definitions that worked without problems before. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1754887 Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-09-02 18:48:14 +02:00
Michal Privoznik	95b9db4ee2	lib: Prefer WITH_* prefix for #if conditionals Currently, we are mixing: #if HAVE_BLAH with #if WITH_BLAH. Things got way better with Pavel's work on meson, but apparently, mixing these two leads to confusing and easy to miss bugs (see `31fb929eca` for instance). While we were forced to use HAVE_ prefix with autotools, we are free to choose our own prefix with meson and since WITH_ prefix appears to be more popular let's use it everywhere. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-09-02 10:28:10 +02:00
Patrick Magauran	69e3381626	qemu: Add e1000e/vmxnet3 IFF_VNET_HDR support Setting IFF_VNET_HDR for a tap device passes the whole packet to the host, reducing emulation overhead and improving performance. Libvirt bases its decision about applying IFF_VNET_HDR to the tap interface on whether or not the model of the emulated network device is virtio. Originally, virtio was the only model to support IFF_VNET_HDR in QEMU; however, the e1000e & vmxnet3 adapters have also supported it since their introductions - QEMU commit 786fd2b0f87 for vmxnet3, and QEMU commit 6f3fbe4ed0 for e1000e, so it should be set for those models too. Signed-off-by: Patrick Magauran <patmagauran.j@gmail.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Laine Stump <laine@redhat.com>	2020-09-01 18:48:21 -04:00
Jim Fehlig	9d15647dcb	Xen: Add writeFiltering option for PCI devices By default Xen only allows guests to write "known safe" values into PCI configuration space, yet many devices require writes to other areas of the configuration space in order to operate properly. To allow writing any values Xen supports the 'permissive' setting, see xl.cfg(5) man page. This change models Xen's permissive setting by adding a writeFiltering attribute on the <source> element of a PCI hostdev. When writeFiltering is set to 'no', the Xen permissive setting will be enabled and guests will be able to write any values into the device's configuration space. The permissive setting remains disabled in the absence of the writeFiltering attribute, or if it is explicitly set to 'yes'. Signed-off-by: Jim Fehlig <jfehlig@suse.com> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-09-01 14:29:17 -06:00
Jim Fehlig	2ad009eadd	qemu: Check for changes in qemu modules directory Add a configuration option for specifying location of the qemu modules directory, defaulting to /usr/lib64/qemu. Then use this location to check for changes in the directory, indicating that a qemu module has changed and capabilities need to be reprobed. Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-09-01 14:22:24 -06:00
Ján Tomko	daec478600	Prefer https: for Red Hat websites The list archives, people.redhat.com and bugzilla all support https. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com> Reviewed-by: Neal Gompa <ngompa13@gmail.com>	2020-09-01 21:58:46 +02:00
Laine Stump	95089f481e	util: assign tap device names using a monotonically increasing integer When creating a standard tap device, if provided with an ifname that contains "%d", rather than taking that literally as the name to use for the new device, the kernel will instead use that string as a template, and search for the lowest number that could be put in place of %d and produce an otherwise unused and unique name for the new device. For example, if there is no tap device name given in the XML, libvirt will always send "vnet%d" as the device name, and the kernel will create new devices named "vnet0", "vnet1", etc. If one of those devices is deleted, creating a "hole" in the name list, the kernel will always attempt to reuse the name in the hole first before using a name with a higher number (i.e. it finds the lowest possible unused number). The problem with this, as described in the previous patch dealing with macvtap device naming, is that it makes "immediate reuse" of a newly freed tap device name much more common, and in the aftermath of deleting a tap device, there is some other necessary cleanup of things which are named based on the device name (nwfilter rules, bandwidth rules, OVS switch ports, to name a few) that could end up stomping over the top of the setup of a new device of the same name for a different guest. Since the kernel "create a name based on a template" functionality for tap devices doesn't exist for macvtap, this patch for standard tap devices is a bit different from the previous patch for macvtap - in particular there was no previous "bitmap ID reservation system" or overly-complex retry loop that needed to be removed. We simply find an unused name, and pass that name on to the kernel instead of "vnet%d". This counter is also wrapped when either it gets to INT_MAX or if the full name would overflow IFNAMSIZ-1 characters. 
In the case of "vnet%d" and a 32 bit int, we would reach INT_MAX first, but possibly someday someone will change the name from vnet to something else. (NB: It is still possible for a user to provide their own parameterized template name (e.g. "mytap%d") in the XML, and libvirt will just pass that through to the kernel as it always has.) Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-09-01 14:16:44 -04:00
Laine Stump	d7f38beb2e	util: replace macvtap name reservation bitmap with a simple counter There have been some reports that, due to libvirt always trying to assign the lowest numbered macvtap / tap device name possible, a new guest would sometimes be started using the same tap device name as previously used by another guest that is in the process of being destroyed as the new guest is starting. In some cases this has led to, for example, the old guest's qemuProcessStop() code deleting a port from an OVS switch that had just been re-added by the new guest (because the port name is based on only the device name using the port). Similar problems can happen (and I believe have) with nwfilter rules and bandwidth rules (which are both instantiated based on the name of the tap device). A couple patches have been previously proposed to change the ordering of startup and shutdown processing, or to put a mutex around everything related to the tap/macvtap device name usage, but in the end no matter what you do there will still be possible holes, because the device could be deleted outside libvirt's control (for example, regular tap devices are automatically deleted when the qemu process terminates, and that isn't always initiated by libvirt but could instead happen completely asynchronously - libvirt then has no control over the ordering of shutdown operations, and no opportunity to protect it with a mutex.) But this only happens if a new device is created at the same time as one is being deleted. We can effectively eliminate the chance of this happening if we end the practice of always looking for the lowest numbered available device name, and instead just keep an integer that is incremented each time we need a new device name. 
At some point it will need to wrap back around to 0 (in order to avoid the IFNAMSIZ 15 character limit if nothing else), and we can't guarantee that the new name really will be the least recently used name, but "math" suggests that it will be much less common that we'll try to re-use the most recently used name. This patch implements such a counter for macvtap/macvlan, replacing the existing, and much more complicated, "ID reservation" system. The counter is set according to whatever macvtap/macvlan devices are already in use by guests when libvirtd is started, incremented each time a new device name is needed, and wraps back to 0 when either INT_MAX is reached, or when the resulting device name would be longer than IFNAMSIZ-1 characters (which actually is what happens when the template for the device name is "macvtap%d"). The result is that no macvtap name will be re-used until the host has created (and possibly destroyed) 99,999,999 devices. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-09-01 14:16:36 -04:00
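The wrapping counter described in `d7f38beb2e` and `95089f481e` can be sketched as follows. This is a simplified illustration, not the libvirt implementation: `next_ifname()`, `last_id` and `IFNAME_MAX` are hypothetical names, `IFNAME_MAX` stands in for IFNAMSIZ (16 on Linux), and the sketch omits both the locking and the check that the generated name is not already in use on the host.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

#define IFNAME_MAX 16       /* stand-in for IFNAMSIZ, including the NUL */

static int last_id = -1;    /* last ID handed out; -1 means none yet */

/* Hypothetical helper (not a libvirt function).
 * Writes "<prefix><id>" into buf and returns the ID used, or -1 if even
 * "<prefix>0" does not fit into buflen bytes. */
static int
next_ifname(const char *prefix, char *buf, size_t buflen)
{
    /* Wrap when the counter would exceed INT_MAX ... */
    int id = (last_id == INT_MAX) ? 0 : last_id + 1;
    int n = snprintf(buf, buflen, "%s%d", prefix, id);

    if (n < 0)
        return -1;

    /* ... or when the name would no longer fit into the buffer. */
    if ((size_t)n >= buflen) {
        id = 0;
        n = snprintf(buf, buflen, "%s%d", prefix, id);
        if (n < 0 || (size_t)n >= buflen)
            return -1;
    }

    last_id = id;
    return id;
}
```

With a 16-byte buffer, the prefix "macvtap" leaves room for eight digits, so the counter wraps after 99999999; the shorter prefix "vnet" reaches INT_MAX first.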
Michal Privoznik	fc19155819	qemu: Validate memory hotplug in domainValidateCallback instead of cmd line generator When editing a domain with hotplug enabled, I removed the only NUMA node it had and got no error. I got the error later though, when starting the domain. This is not as user friendly as it can be. Move the validation call out from command line generator and into domain validator (which is called prior to starting cmd line generation anyway). When doing this, I had to remove memory-hotplug-nonuma xml2xml test case because there is no way the test case can succeed, obviously. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-09-01 09:30:27 +02:00
Daniel Henrique Barboza	2ba0b7497c	virhostcpu.c: skip non x86 hosts in virHostCPUGetMicrocodeVersion() Non-x86 archs do not have a 'microcode' version like x86. This is covered already inside the function - just return 0 if no microcode is found. Regardless of that, a read of /proc/cpuinfo is always made. Each read will invoke the kernel to fill in the CPU details every time. Now let's consider a non-x86 host, like a Power 9 server with 128 CPUs. Each /proc/cpuinfo read will need to fetch data for each CPU and it won't even matter because we know beforehand that PowerPC chips don't have microcode information. We can do better for non-x86 hosts by skipping this process entirely. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-08-25 19:44:39 +02:00
Ján Tomko	52cd849e62	VIR_XPATH_NODE_AUTORESTORE: remove semicolon from users Since the macro no longer includes the 'ignore_value' statement, stop putting another empty statement after it. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-08-25 19:03:12 +02:00
Ján Tomko	96b4f38603	Move debug statements after declarations Many of our functions start with a DEBUG statement. Move the statements after declarations to appease our coding style. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-08-25 19:03:11 +02:00
Ján Tomko	0a37e0695b	Split declarations from initializations Split those initializations that depend on a statement above them. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-08-25 19:03:11 +02:00
Ján Tomko	a5152f23e7	Move declarations before statements Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>	2020-08-25 19:03:11 +02:00
Peter Krempa	14b895ad3a	qemuMigrationCapsToJSON: Refactor capability object formatting Use virJSONValueObjectCreate rather than creating the object piece-by-piece and use new accessors for bitmap to simplify the code. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>	2020-08-25 08:24:34 +02:00
Roman Bogorodskiy	9375bc7373	conf: allow to map sound device to host device Introduce a new device element "<audio>" which allows to map guest sound device specified using the "<sound>" element to specific audio backend. Example: <sound model='ich7'> <audio id='1'/> </sound> <audio id='1' type='oss'> <input dev='/dev/dsp0'/> <output dev='/dev/dsp0'/> </audio> This block maps to OSS audio backend on the host using /dev/dsp0 device for both input (recording) and output (playback). OSS is the only backend supported so far. Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-08-25 08:42:16 +04:00
Roman Bogorodskiy	9499521718	conf: add 'ich7' sound model Add 'ich7' sound model. This is a preparation for sound support in bhyve, as 'ich7' is the only model it supports. Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-08-25 08:42:16 +04:00
Laine Stump	5cad64ec03	qemu: remove unreachable code in qemuProcessStart() Back when the original version of this chunk of code was added (commit `41b087198` in libvirt-0.8.1 in April 2010), we used virExecDaemonize() to start the qemu process, and would continue on in the function (which at that time was called qemudStartVMDaemon()) even if a -1 was returned. So it was possible to get to this code with rv == -1 (it was called "ret" in that version of the code). In modern libvirt code, qemu is started with virCommandRun(); then we call virPidFileReadPath(); those are the only two ways of setting "rv" prior to this code being removed, and in either case if the new value of rv < 0, then we immediately skip over the rest of the code to the cleanup: label. This means that the code being removed by this patch is unreachable. Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2020-08-24 23:46:51 -04:00

10324 Commits