libvirt

mirror of https://gitlab.com/libvirt/libvirt.git synced 2025-01-05 20:45:18 +00:00

Author	SHA1	Message	Date
Andrea Bolognani	7e667664d2	qemu: Fix memory locking limit calculation For guests that use <memoryBacking><locked>, our only option is to remove the memory locking limit altogether. Partially-resolves: https://bugzilla.redhat.com/1431793	2017-03-28 10:54:49 +02:00
Andrea Bolognani	1f7661af8c	qemu: Remove qemuDomainRequiresMemLock() Instead of having a separate function, we can simply return zero from the existing qemuDomainGetMemLockLimitBytes() to signal the caller that the memory locking limit doesn't need to be set for the guest. Having a single function instead of two makes it less likely that we will use the wrong value, which is exactly what happened when we started applying the limit that was meant for VFIO-using guests to <memoryBacking><locked>-using guests.	2017-03-28 10:54:47 +02:00
Andrea Bolognani	4b67e7a377	Revert "qemu: Forbid <memoryBacking><locked> without <memtune><hard_limit>" This reverts commit `c2e60ad0e5`. Turns out this check is excessively strict: there are ways other than <memtune><hard_limit> to raise the memory locking limit for QEMU processes, one prominent example being tweaking /etc/security/limits.conf. Partially-resolves: https://bugzilla.redhat.com/1431793	2017-03-28 10:44:25 +02:00
Erik Skultety	c8e6775f30	qemu: Bump the memory locking limit for mdevs as well Since mdevs are just another type of VFIO devices, we should increase the memory locking limit the same way we do for VFIO PCI devices. Signed-off-by: Erik Skultety <eskultet@redhat.com>	2017-03-27 15:39:35 +02:00
Erik Skultety	de4e8bdbc7	qemu: cgroup: Adjust cgroups' logic to allow mediated devices As goes for all the other hostdev device types, grant the qemu process access to /dev/vfio/<mediated_device_iommu_group>. Signed-off-by: Erik Skultety <eskultet@redhat.com>	2017-03-27 15:39:35 +02:00
Erik Skultety	ec783d7c77	conf: Introduce new hostdev device type mdev A mediated device will be identified by a UUID (with 'model' now being a mandatory <hostdev> attribute to represent the mediated device API) of the user pre-created mediated device. We also need to make sure that if user explicitly provides a guest address for a mdev device, the address type will be matching the device API supported on that specific mediated device and error out with an incorrect XML message. The resulting device XML: <devices> <hostdev mode='subsystem' type='mdev' model='vfio-pci'> <source> <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> </source> </hostdev> </devices> Signed-off-by: Erik Skultety <eskultet@redhat.com>	2017-03-27 15:39:35 +02:00
Peter Krempa	9b93c4c264	qemu: domain: Add helper to look up disk soruce by the backing store string	2017-03-27 10:18:16 +02:00
Peter Krempa	4e1618ce72	qemu: domain: Add helper to generate indexed backing store names The code is currently simple, but if we later add node names, it will be necessary to generate the names based on the node name. Add a helper so that there's a central point to fix once we add self-generated node names.	2017-03-27 09:29:57 +02:00
Peter Krempa	1a5e2a8098	qemu: domain: Add helper to lookup disk by node name Looks up a disk and its corresponding backing chain element by node name.	2017-03-27 09:29:57 +02:00
John Ferlan	1a6b6d9a56	qemu: Set up the migration TLS objects for target If the migration flags indicate this migration will be using TLS, then set up the destination during the prepare phase once the target domain has been started to add the TLS objects to perform the migration. This will create at least an "-object tls-creds-x509,endpoint=server,..." for TLS credentials and potentially an "-object secret,..." to handle the passphrase response to access the TLS credentials. The alias/id used for the TLS objects will contain "libvirt_migrate". Once the objects are created, the code will set the "tls-creds" and "tls-hostname" migration parameters to signify usage of TLS. During the Finish phase we'll be sure to attempt to clear the migration parameters and delete those objects (whether or not they were created). We'll also perform the same reset during recovery if we've reached FINISH3. If the migration isn't using TLS, then be sure to check if the migration parameters exist and clear them if so.	2017-03-25 08:19:49 -04:00
Jiri Denemark	fcd56ce866	qemu: Set default values for CPU check attribute Signed-off-by: Jiri Denemark <jdenemar@redhat.com>	2017-03-17 11:50:48 +01:00
Michal Privoznik	7b89f857d9	qemu: Namespaces for NVDIMM Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-03-15 17:04:33 +01:00
Michal Privoznik	1bc173199e	qemu: Implement NVDIMM So, majority of the code is just ready as-is. Well, with one slight change: differentiate between dimm and nvdimm in places like device alias generation, generating the command line and so on. Speaking of the command line, we also need to append 'nvdimm=on' to the '-machine' argument so that the nvdimm feature is advertised in the ACPI tables properly. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-03-15 14:16:32 +01:00
Michal Privoznik	b4e8a49f8d	Introduce NVDIMM memory model NVDIMM is new type of memory introduced into QEMU 2.6. The idea is that we have a Non-Volatile memory module that keeps the data persistent across domain reboots. At the domain XML level, we already have some representation of 'dimm' modules. Long story short, NVDIMM will utilize the existing <memory/> element that lives under <devices/> by adding a new attribute 'nvdimm' to the existing @model and introduce a new <path/> element for <source/> while reusing other fields. The resulting XML would appear as: <memory model='nvdimm'> <source> <path>/tmp/nvdimm</path> </source> <target> <size unit='KiB'>523264</size> <node>0</node> </target> <address type='dimm' slot='0'/> </memory> So far, this is just a XML parser/formatter extension. QEMU driver implementation is in the next commit. For more info on NVDIMM visit the following web page: http://pmem.io/ Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-03-15 13:30:58 +01:00
Michal Privoznik	290a00e41d	qemuDomainBuildNamespace: Handle file mount points https://bugzilla.redhat.com/show_bug.cgi?id=1431112 Yeah, that's right. A mount point doesn't have to be a directory. It can be a file too. However, the code that tries to preserve mount points under /dev for new namespace for qemu does not count with that option. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-03-13 13:32:45 +01:00
Michal Privoznik	e915942b05	qemuProcessHandleMonitorEOF: Disable namespace for domain https://bugzilla.redhat.com/show_bug.cgi?id=1430634 If a qemu process has died, we get EOF on its monitor. At this point, since qemu process was the only one running in the namespace kernel has already cleaned the namespace up. Any attempt of ours to enter it has to fail. This really happened in the bug linked above. We've tried to attach a disk to qemu and while we were in the monitor talking to qemu it just died. Therefore our code tried to do some roll back (e.g. deny the device in cgroups again, restore labels, etc.). However, during the roll back (esp. when restoring labels) we still thought that domain has a namespace. So we used secdriver's transactions. This failed as there is no namespace to enter. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-03-10 16:02:34 +01:00
Pavel Hrdina	cd4a8b9304	conf: store "autoGenerated" for graphics listen in status XML When libvirtd is started we call qemuDomainRecheckInternalPaths to detect whether a domain has VNC socket path generated by libvirt based on option from qemu.conf. However if we are parsing status XML for running domain the existing socket path can be generated also if the config XML uses the new <listen type='socket'/> element without specifying any socket. The current code doesn't make difference how the socket was generated and always marks it as "fromConfig". We need to store the "autoGenerated" value in the status XML in order to preserve that information. The difference between "fromConfig" and "autoGenerated" is important for migration, because if the socket is based on "fromConfig" we don't print it into the migratable XML and we assume that user has properly configured qemu.conf on both hosts. However if the socket is based on "autoGenerated" it means that a new feature was used and therefore we need to leave the socket in migratable XML to make sure that if this feature is not supported on destination the migration will fail. Signed-off-by: Pavel Hrdina <phrdina@redhat.com>	2017-03-09 10:22:43 +01:00
John Ferlan	b2e5de96c7	qemu: Rename variable Rename 'secretUsageType' to 'usageType' since it's superfluous in an API qemuSecret	2017-03-08 14:37:05 -05:00
John Ferlan	7c2b7891cc	qemu: Introduce qemuDomainSecretInfoTLSNew Building upon the qemuDomainSecretInfoNew, create a helper which will build the secret used for TLS. Signed-off-by: John Ferlan <jferlan@redhat.com>	2017-03-08 14:31:09 -05:00
John Ferlan	c9a7b7b6ea	qemu: Introduce qemuDomainSecretInfoNew Create a helper which will create the secinfo used for disks, hostdevs, and chardevs. Signed-off-by: John Ferlan <jferlan@redhat.com>	2017-03-08 14:31:07 -05:00
Pavel Hrdina	3ffea19acd	qemu_domain: cleanup the controller post parse code Signed-off-by: Pavel Hrdina <phrdina@redhat.com>	2017-03-07 16:50:35 +01:00
Pavel Hrdina	57404ff7a7	qemu_domain: move controller post parse code into its own function Signed-off-by: Pavel Hrdina <phrdina@redhat.com>	2017-03-07 16:50:34 +01:00
Michal Privoznik	4da534c0b9	qemu: Enforce qemuSecurity wrappers Now that we have some qemuSecurity wrappers over virSecurityManager APIs, lets make sure everybody sticks with them. We have them for a reason and calling virSecurityManager API directly instead of wrapper may lead into accidentally labelling a file on the host instead of namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-03-06 08:54:28 +01:00
Marc Hartmayer	e22de286b1	qemu: Fix deadlock across fork() in QEMU driver The functions in virCommand() after fork() must be careful with regard to accessing any mutexes that may have been locked by other threads in the parent process. It is possible that another thread in the parent process holds the lock for the virQEMUDriver while fork() is called. This leads to a deadlock in the child process when 'virQEMUDriverGetConfig(driver)' is called and therefore the handshake never completes between the child and the parent process. Ultimately the virDomainObjectPtr will never be unlocked. It gets much worse if the other thread of the parent process, that holds the lock for the virQEMUDriver, tries to lock the already locked virDomainObject. This leads to a completely unresponsive libvirtd. It's possible to reproduce this case with calling 'virsh start XXX' and 'virsh managedsave XXX' in a tight loop for multiple domains. This commit fixes the deadlock in the same way as it is described in commit `61b52d2e38`. Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com> Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>	2017-02-21 15:47:32 +01:00
Michal Privoznik	5c74cf1f44	qemu: Allow @rendernode for virgl domains When enabling virgl, qemu opens /dev/dri/render*. So far, we are not allowing that in devices CGroup nor creating the file in domain's namespace and thus requiring users to set the paths in qemu.conf. This, however, is suboptimal as it allows access to ALL qemu processes even those which don't have virgl configured. Now that we have a way to specify render node that qemu will use we can be more cautious and enable just that. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-20 10:44:22 +01:00
Michal Privoznik	1bb787fdc9	qemuDomainGetHostdevPath: Report /dev/vfio/vfio less frequently So far, qemuDomainGetHostdevPath has no knowledge of the reasong it is called and thus reports /dev/vfio/vfio for every VFIO backed device. This is suboptimal, as we want it to: a) report /dev/vfio/vfio on every addition or domain startup b) report /dev/vfio/vfio only on last VFIO device being unplugged If a domain is being stopped then namespace and CGroup die with it so no need to worry about that. I mean, even when a domain that's exiting has more than one VFIO devices assigned to it, this function does not clean /dev/vfio/vfio in CGroup nor in the namespace. But that doesn't matter. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2017-02-20 07:21:59 +01:00
Michal Privoznik	b8e659aa98	qemuDomainGetHostdevPath: Create /dev/vfio/vfio iff needed So far, we are allowing /dev/vfio/vfio in the devices cgroup unconditionally (and creating it in the namespace too). Even if domain has no hostdev assignment configured. This is potential security hole. Therefore, when starting the domain (or hotplugging a hostdev) create & allow /dev/vfio/vfio too (if needed). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2017-02-20 07:21:58 +01:00
Michal Privoznik	9d92f533f8	qemuSetupHostdevCgroup: Use qemuDomainGetHostdevPath Since these two functions are nearly identical (with qemuSetupHostdevCgroup actually calling virCgroupAllowDevicePath) we can have one function call the other and thus de-duplicate some code. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2017-02-20 07:21:58 +01:00
Michal Privoznik	b57bd206b9	qemu_conf: Check for namespaces availability more wisely The bare fact that mnt namespace is available is not enough for us to allow/enable qemu namespaces feature. There are other requirements: we must copy all the ACL & SELinux labels otherwise we might grant access that is administratively forbidden or vice versa. At the same time, the check for namespace prerequisites is moved from domain startup time to qemu.conf parser as it doesn't make much sense to allow users to start misconfigured libvirt just to find out they can't start a single domain. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-15 12:43:23 +01:00
Andrea Bolognani	ee6ec7824d	qemu: Call chmod() after mknod() mknod() is affected my the current umask, so we're not guaranteed the newly-created device node will have the right permissions. Call chmod(), which is not affected by the current umask, immediately afterwards to solve the issue. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1421036	2017-02-14 19:23:05 +01:00
Ján Tomko	723fef99c0	qemu: enforce maximum ports value for nec-xhci This controller only allows up to 15 ports. https://bugzilla.redhat.com/show_bug.cgi?id=1375417	2017-02-13 16:34:09 +01:00
Michal Privoznik	c2130c0d47	qemu_security: Introduce ImageLabel APIs Just like we need wrappers over other virSecurityManager APIs, we need one for virSecurityManagerSetImageLabel and virSecurityManagerRestoreImageLabel. Otherwise we might end up relabelling device in wrong namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-09 08:04:57 +01:00
Michal Privoznik	b7feabbfdc	qemuDomainNamespaceSetupDisk: Simplify disk check Firstly, instead of checking for next->path the virStorageSourceIsEmpty() function should be used which also takes disk type into account. Secondly, not every disk source passed has the correct type set (due to our laziness). Therefore, instead of checking for virStorageSourceIsBlockLocal() and also S_ISBLK() the former can be refined to just virStorageSourceIsLocalStorage(). Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-08 15:56:21 +01:00
Michal Privoznik	786d8d91b4	qemuDomainDiskChainElement{Prepare,Revoke}: manage /dev entry Again, one missed bit. This time without this commit there is no /dev entry in the namespace of the qemu process when doing disk snapshots or block-copy. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-08 15:56:13 +01:00
Michal Privoznik	18ce9d139d	qemuDomainNamespace{Setup,Teardown}Disk: Don't pass pointer to full disk These functions do not need to see the whole virDomainDiskDef. Moreover, they are going to be called from places where we don't have access to the full disk definition. Sticking with virStorageSource is more than enough. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-08 15:56:05 +01:00
Michal Privoznik	76d491ef14	qemuDomainNamespaceSetupDisk: Drop useless @src variable Since its introduction in `81df21507b` this variable was never used. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-08 15:55:56 +01:00
Michal Privoznik	8dc867e978	qemu_domain: Don't pass virDomainDeviceDefPtr to ns helpers There is no need for this. None of the namespace helpers uses it. Historically it was used when calling secdriver APIs, but we don't to that anymore. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-08 15:55:52 +01:00
Andrea Bolognani	c2e60ad0e5	qemu: Forbid <memoryBacking><locked> without <memtune><hard_limit> In order for memory locking to work, the hard limit on memory locking (and usage) has to be set appropriately by the user. The documentation mentions the requirement already: with this patch, it's going to be enforced by runtime checks as well, by forbidding a non-compliant guest from being defined as well as edited and started. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1316774	2017-02-07 18:43:10 +01:00
Michal Privoznik	7f0b382522	qemuDomainAttachDeviceMknod: Don't loop endlessly When working with symlinks it is fairly easy to get into a loop. Don't. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-07 13:20:19 +01:00
Michal Privoznik	3f5fcacf89	qemuDomainAttachDeviceMknod: Deal with symlinks Similarly to one of the previous commits, we need to deal properly with symlinks in hotplug case too. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-07 13:20:17 +01:00
Michal Privoznik	4ac847f93b	qemuDomainCreateDevice: Don't loop endlessly When working with symlinks it is fairly easy to get into a loop. Don't. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-07 13:18:32 +01:00
Michal Privoznik	54ed672214	qemuDomainCreateDevice: Properly deal with symlinks Imagine you have a disk with the following source set up: /dev/disk/by-uuid/$uuid (symlink to) -> /dev/sda After `cbc45525cb` the transitive end of the symlink chain is created (/dev/sda), but we need to create any item in chain too. Others might rely on that. In this case, /dev/disk/by-uuid/$uuid comes from domain XML thus it is this path that secdriver tries to relabel. Not the resolved one. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-07 13:18:10 +01:00
Michal Privoznik	b621291f5c	qemuDomain{Attach,Detach}Device NS helpers: Don't relabel devices After previous commit this has become redundant step. Also setting up devices in namespace and setting their label later on are two different steps and should be not done at once. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-02-07 10:40:53 +01:00
Michal Privoznik	572eda12ad	qemu: Implement mtu on interface Not only we should set the MTU on the host end of the device but also let qemu know what MTU did we set. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-01-26 10:00:01 +01:00
Michal Privoznik	b020cf73fe	domain_conf: Introduce <mtu/> to <interface/> So far we allow to set MTU for libvirt networks. However, not all domain interfaces have to be plugged into a libvirt network and even if they are, they might want to have a different MTU (e.g. for testing purposes). Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-01-26 09:59:56 +01:00
Chen Hanxiao	980f2a35c7	qemu_domain: add timestamp in tainting of guests log We lacked of timestamp in tainting of guests log, which bring troubles for finding guest issues: such as whether a guest powerdown caused by qemu-monitor-command or others issues inside guests. If we had timestamp in tainting of guests log, it would be helpful when checking guest's /var/log/messages. Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>	2017-01-21 12:34:19 -05:00
Michal Privoznik	57b5e27d3d	qemu: set default vhost-user ifname Based on work of Mehdi Abaakouk <sileht@sileht.net>. When parsing vhost-user interface XML and no ifname is found we can try to fill it in in post parse callback. The way this works is we try to make up interface name from given socket path and then ask openvswitch whether it knows the interface. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-01-20 15:42:12 +01:00
Michal Privoznik	d0baf54e53	qemu: Actually unshare() iff running as root https://bugzilla.redhat.com/show_bug.cgi?id=1413922 While all the code that deals with qemu namespaces correctly detects whether we are running as root (and turn into NO-OP for qemu:///session) the actual unshare() call is not guarded with such check. Therefore any attempt to start a domain under qemu:///session shall fail as unshare() is reserved for root. The fix consists of moving unshare() call (for which we have a wrapper called virProcessSetupPrivateMountNS) into qemuDomainBuildNamespace() where the proper check is performed. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com>	2017-01-17 13:23:56 +01:00
Michal Privoznik	93a062c3b2	qemu: Copy SELinux labels for namespace too When creating new /dev/* for qemu, we do chown() and copy ACLs to create the exact copy from the original /dev. I though that copying SELinux labels is not necessary as SELinux will chose the sane defaults. Surprisingly, it does not leaving namespace with the following labels: crw-rw-rw-. root root system_u:object_r:tmpfs_t:s0 random crw-------. root root system_u:object_r:tmpfs_t:s0 rtc0 drwxrwxrwt. root root system_u:object_r:tmpfs_t:s0 shm crw-rw-rw-. root root system_u:object_r:tmpfs_t:s0 urandom As a result, domain is unable to start: error: internal error: process exited while connecting to monitor: Error in GnuTLS initialization: Failed to acquire random data. qemu-kvm: cannot initialize crypto: Unable to initialize GNUTLS library: Failed to acquire random data. The solution is to copy the SELinux labels as well. Reported-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-01-13 14:45:52 +01:00
Michal Privoznik	cbc45525cb	qemuDomainCreateDevice: Canonicalize paths So far the decision whether /dev/* entry is created in the qemu namespace is really simple: does the path starts with "/dev/"? This can be easily fooled by providing path like the following (for any considered device like disk, rng, chardev, ..): /dev/../var/lib/libvirt/images/disk.qcow2 Therefore, before making the decision the path should be canonicalized. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2017-01-11 18:08:13 +01:00

1 2 3 4 5 ...

626 Commits