libvirt

mirror of https://gitlab.com/libvirt/libvirt.git synced 2024-10-16 19:19:18 +00:00

Author	SHA1	Message	Date
Peter Krempa	714b38cb23	qemu: Enforce WWN to be unique among VM's disks Operating systems use the identifier to name the disks. As the name suggests the ID should be unique. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1208009	2015-04-14 08:44:36 +02:00
Michal Privoznik	ea576ee543	qemuProcessHook: Call virNuma*() only when needed https://bugzilla.redhat.com/show_bug.cgi?id=1198645 Once upon a time, there was a little domain. And the domain was pinned onto a NUMA node and hasn't fully allocated its memory: <memory unit='KiB'>2355200</memory> <currentMemory unit='KiB'>1048576</currentMemory> <numatune> <memory mode='strict' nodeset='0'/> </numatune> Oh little me, said the domain, what will I do with so little memory. If I only had a few megabytes more. But the old admin noticed the whimpering, barely audible to untrained human ear. And good admin he was, he gave the domain yet more memory. But the old NUMA topology witch forbade to allocate more memory on the node zero. So he decided to allocate it on a different node: virsh # numatune little_domain --nodeset 0-1 virsh # setmem little_domain 2355200 The little domain was happy. For a while. Until bad, sharp teeth shaped creature came. Every process in the system was afraid of him. The OOM Killer they called him. Oh no, he's after the little domain. There's no escape. Do you kids know why? Because when the little domain was born, her father, Libvirt, called numa_set_membind(). So even if the admin allowed her to allocate memory from other nodes in the cgroups, the membind() forbade it. So what's the lesson? Libvirt should rely on cgroups, whenever possible and use numa_set_membind() as the last ditch effort. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2015-04-08 11:54:31 +02:00
Michael Chapman	7578cc17f5	qemu: fix crash in qemuProcessAutoDestroy The destination libvirt daemon in a migration may segfault if the client disconnects immediately after the migration has begun: # virsh -c qemu+tls://remote/system list --all Id Name State ---------------------------------------------------- ... # timeout --signal KILL 1 \ virsh migrate example qemu+tls://remote/system \ --verbose --compressed --live --auto-converge \ --abort-on-error --unsafe --persistent \ --undefinesource --copy-storage-all --xml example.xml Killed # virsh -c qemu+tls://remote/system list --all error: failed to connect to the hypervisor error: unable to connect to server at 'remote:16514': Connection refused The crash is in: 1531 void 1532 qemuDomainObjEndJob(virQEMUDriverPtr driver, virDomainObjPtr obj) 1533 { 1534 qemuDomainObjPrivatePtr priv = obj->privateData; 1535 qemuDomainJob job = priv->job.active; 1536 1537 priv->jobs_queued--; Backtrace: #0 at qemuDomainObjEndJob at qemu/qemu_domain.c:1537 #1 in qemuDomainRemoveInactive at qemu/qemu_domain.c:2497 #2 in qemuProcessAutoDestroy at qemu/qemu_process.c:5646 #3 in virCloseCallbacksRun at util/virclosecallbacks.c:350 #4 in qemuConnectClose at qemu/qemu_driver.c:1154 ... qemuDomainRemoveInactive calls virDomainObjListRemove, which in this case is holding the last remaining reference to the domain. qemuDomainRemoveInactive then calls qemuDomainObjEndJob, but the domain object has been freed and poisoned by then. This patch bumps the domain's refcount until qemuDomainRemoveInactive has completed. We also ensure qemuProcessAutoDestroy does not return the domain to virCloseCallbacksRun to be unlocked in this case. There is similar logic in bhyveProcessAutoDestroy and lxcProcessAutoDestroy (which call virDomainObjListRemove directly). Signed-off-by: Michael Chapman <mike@very.puzzling.org>	2015-04-08 09:45:47 +02:00
Ján Tomko	5903378834	Allocate virtio-serial addresses when starting a domain Instead of always using controller 0 and incrementing port number, respect the maximum port numbers of controllers and use all of them. Ports for virtio consoles are quietly reserved, but not formatted (neither in XML nor on QEMU command line). Also rejects duplicate virtio-serial addresses. https://bugzilla.redhat.com/show_bug.cgi?id=890606 https://bugzilla.redhat.com/show_bug.cgi?id=1076708 Test changes: * virtio-auto.args Filling out the port when just the controller is specified. switched from using maxport + 1 to: first free port on the controller * virtio-autoassign.args Filling out the address when no <address> is specified. Started using all the controllers instead of 0, also discards the bus value. * xml -> xml output of virtio-auto The port assignment is no longer done as a part of XML parsing, so the unspecified values stay 0.	2015-04-02 15:00:13 +02:00
Peter Krempa	98f08aba8e	qemu: cgroup: Use priv->autoCpuset instead of using qemuPrepareCpumap() Two places would call to qemuPrepareCpumap() with priv->autoNodeset to convert it to a cpuset. Remove the function and use the prepared cpuset automatically.	2015-04-02 10:12:08 +02:00
Peter Krempa	c9f9fa25d3	qemu: cgroup: Store auto cpuset instead of re-creating it on demand The automatic cpuset can be stored along with automatic nodeset and it does not have to be recreated when used.	2015-04-02 10:12:08 +02:00
Peter Krempa	630ee5ac6c	qemu: blockjob: Synchronously update backing chain in XML on ABORT/PIVOT When the synchronous pivot option is selected, libvirt would not update the backing chain until the job was exited. Some applications then received invalid data as their job serialized first. This patch removes polling to wait for the ABORT/PIVOT job completion and replaces it with a condition. If a synchronous operation is requested the update of the XML is executed in the job of the caller of the synchronous request. Otherwise the monitor event callback uses a separate worker to update the backing chain with a new job. This is a regression since `1a92c71910` When the ABORT job is finished synchronously you get the following call stack: #0 qemuBlockJobEventProcess #1 qemuDomainBlockJobImpl #2 qemuDomainBlockJobAbort #3 virDomainBlockJobAbort While previously or while using the _ASYNC flag you'd get: #0 qemuBlockJobEventProcess #1 processBlockJobEvent #2 qemuProcessEventHandler #3 virThreadPoolWorker	2015-03-31 08:36:17 +08:00
Ján Tomko	9e48f6cf9f	Rename qemuMonitorIOThreadsInfo* to qemuMonitorIOThreadInfo* It only deals with a single thread.	2015-03-26 16:11:10 +01:00
Peter Krempa	5cdfaa31c4	qemu: memdev: Add infrastructure to load memory device information When using 'dimm' memory devices with qemu, some of the information like the slot number and base address need to be reloaded from qemu after process start so that it reflects the actual state. The state then allows to use memory devices across migrations.	2015-03-23 14:25:15 +01:00
Laine Stump	451547a422	util: clean up #includes of virnetdevopenvswitch.h virnetdevopenvswitch.h declares a few functions that can be called to add ports to and remove them from OVS bridges, and retrieve the migration data for a port. It does not contain any data definitions that are used by domain_conf.h. But for some reason, domain_conf.h #includes virnetdevopenvswitch.h, when the files that actually need it should be directly #including it. This adds a few lines to the project, but saves all the files that don't need it from the extra computing, and makes the dependencies more clear cut.	2015-03-18 14:43:47 -04:00
Jiri Denemark	18441ab914	Use PAUSED state for domains that are starting up When libvirt is starting a domain, it reports the state as SHUTOFF until it's RUNNING. This is not ideal because domain startup may take a long time (usually because of some configuration issues, firewalls blocking access to network disks, etc.) and domain lists provided by libvirt look awkward. One can see weird shutoff domains with IDs in a list of active domains or even shutoff transient domains. In any case, it looks more like a bug in libvirt than a normal state a domain goes through. Signed-off-by: Jiri Denemark <jdenemar@redhat.com>	2015-03-18 10:08:22 +01:00
Antoni Segura Puimedon	d490f47ba3	network: Add midonet virtual port type support to qemu Use the utilities introduced in the previous patches so the qemu driver is able to create tap devices that are bound (and unbound on domain destroyal) to Midonet virtual ports. Signed-off-by: Antoni Segura Puimedon <toni+libvirt@midokura.com>	2015-03-17 13:10:17 -04:00
Martin Kletzander	ad69e8be4a	conf: Use correct type for balloon stats period We're parsing memballoon status period as unsigned int, but when we're trying to set it, both we and qemu use signed int. That means large values will get wrapped around to negative one resulting in error. Basically the same problem as commit `e3a7b874` was dealing with when updating live domain. QEMU changed the accepted value to int64 in commit 1f9296b5, but even values as INT_MAX don't make sense since the value passed means seconds. Hence adding capability flag for this change isn't worth it. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140958 Signed-off-by: Luyao Huang <lhuang@redhat.com> Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2015-03-17 12:06:14 +01:00
John Ferlan	a8a89270ef	Convert virDomainVcpuPinFindByVcpu into virDomainPinFindByVcpu Since both Vcpu and IOThreads code use the same API's, alter the naming of the API's to remove the "Vcpu" specific reference	2015-03-16 11:54:57 -04:00
John Ferlan	59ba70237a	Convert virDomainVcpuPinDefPtr to virDomainPinDefPtr As pointed out by jtomko in his review of the IOThreads pinning code: http://www.redhat.com/archives/libvir-list/2015-March/msg00495.html there are some comments sprinkled in indicating IOThreads were using the same structure as the VcpuPin code... This is the first patch of a few that will change the virDomainVcpuPin* structures and code to just virDomainPin* - starting with the data structure naming...	2015-03-16 11:54:56 -04:00
Peter Krempa	4f9907cd11	conf: Replace access to def->mem.max_balloon with accessor functions As there are two possible approaches to define a domain's memory size - one used with legacy, non-NUMA VMs configured in the <memory> element and per-node based approach on NUMA machines - the user needs to make sure that both are specified correctly in the NUMA case. To avoid this burden on the user I'd like to replace the NUMA case with automatic totaling of the memory size. To achieve this I need to replace direct access to the virDomainMemtune's 'max_balloon' field with two separate getters depending on the desired size. The two sizes are needed as: 1) Startup memory size doesn't include memory modules in some hypervisors. 2) After startup these count as the usable memory size. Note that the comments for the functions are future aware and document state that will be present after a few later patches.	2015-03-16 14:26:51 +01:00
Peter Krempa	1a92c71910	qemu: event: Don't fiddle with disk backing trees without a job Surprisingly we did not grab a VM job when a block job finished and we'd happily rewrite the backing chain data. This made it possible to crash libvirt when queueing two backing chains tightly and other badness. To fix it, add yet another handler to the helper thread that handles monitor events that require a job.	2015-03-16 10:57:33 +01:00
Peter Krempa	5c634730b9	qemu: process: Export qemuProcessFindDomainDiskByAlias	2015-03-16 10:57:33 +01:00
Michal Privoznik	63889e0c77	qemuProcessReconnect: Fill in pid file path https://bugzilla.redhat.com/show_bug.cgi?id=1197600 So, libvirt uses pid file to track pid of started qemus. Whenever a domain is started, its pid is put into corresponding pid file. The pid file path is generated based on domain name and stored into domain object internals. However, it's not stored in the status XML and therefore lost on daemon restarts. Hence, later, when domain is being shut down, the daemon does not know which pid file to unlink, and the correct pid file is left behind. To avoid this, let's generate the pid file path again in qemuProcessReconnect(). Reported-by: Luyao Huang <lhuang@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2015-03-03 12:10:15 +01:00
Pavel Hrdina	a16e5f0a91	qemu: check defaultMode for spice graphics independently Instead of checking defaultMode for every channel that has no mode configured, test it only once outside of channel loop. This fixes a bug where, if all possible channels are for example set to insecure but defaultMode is set to secure, we wouldn't auto-generate a TLS port. This results in failure while starting a guest. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1143832 Signed-off-by: Pavel Hrdina <phrdina@redhat.com>	2015-03-03 11:42:33 +01:00
Pavel Hrdina	e4983952b4	qemu: remove duplicated code for allocating spice ports We have two different places that need to be updated while touching code for allocating spice ports. Add a bool option to 'qemuProcessSPICEAllocatePorts' function to switch between true and fake allocation so we can use this function also in qemu_driver to generate native domain definition. Signed-off-by: Pavel Hrdina <phrdina@redhat.com>	2015-03-03 11:41:46 +01:00
Martin Kletzander	2fd5880b3b	conf: De-duplicate scheduling policy enums Since adding the support for scheduler policy settings in commit `8680ea97`, there are two enums with the same information. That was caused by rewriting the patch since first draft. Found out thanks to clang, but there was no impact whatsoever. Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2015-03-03 09:26:59 +01:00
Peter Krempa	6bc80fa86d	conf: numa: Rename virDomainNumatune to virDomainNuma The structure will gradually become the only place for NUMA related config, thus rename it appropriately.	2015-02-20 17:43:04 +01:00
Michal Privoznik	37cf163ab2	virQEMUCapsCacheLookupCopy: Pass machine type It will come in handy in the near future when we will filter some capabilities based on it. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2015-02-20 13:27:59 +01:00
Michal Privoznik	76c61cdca2	qemuProcessHandleBlockJob: Take status into account Upon BLOCK_JOB_COMPLETED event delivery, we check if the job has completed (in qemuMonitorJSONHandleBlockJobImpl()). For better image, the event looks something like this: "timestamp": {"seconds": 1423582694, "microseconds": 372666}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len": 8412790784, "offset": 409993216, "speed": 8796093022207, "type": "mirror", "error": "No space left on device"}} If "len" does not equal "offset" it's considered an error, and we can clearly see "error" field filled in. However, later in the event processing this case was handled no differently to case of job being aborted via separate API. It's time that we start differentiating between these two because of the future work. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2015-02-19 14:12:38 +01:00
Michal Privoznik	c37943a068	qemuProcessHandleBlockJob: Set disk->mirrorState more often Currently, upon BLOCK_JOB_* event, disk->mirrorState is not updated each time. The callback code handling the events checks if a blockjob was started via our public APIs prior to setting the mirrorState. However, some block jobs may be started internally (e.g. during storage migration), in which case we don't bother with setting disk->mirror (there's nothing we can set it to anyway), or other fields. But it will come in handy if we update the mirrorState in these cases too. The event wasn't delivered just for fun - we've started the job after all. So, in this commit, the mirrorState is set to whatever job status we've obtained. Of course, there are some actions on some statuses that we want to perform. But instead of if {} else if {} else {} ... enumeration, let's move to switch(). Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2015-02-19 14:12:38 +01:00
Erik Skultety	c3d9d3bbc9	security: introduce virSecurityManagerCheckAllLabel function We do have a check for valid per-domain security model, however we still do permit an invalid security model for a domain's device (those which are specified with <source> element). This patch introduces a new function virSecurityManagerCheckAllLabel which compares user specified security model against currently registered security drivers. That being said, it also permits 'none' being specified as a device security model. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1165485 Signed-off-by: Ján Tomko <jtomko@redhat.com>	2015-02-13 14:37:54 +01:00
Daniel P. Berrange	a103bb105c	qemu: fix setting of VM CPU affinity with TCG In a previous commit I fixed the incorrect handling of vcpu pids for TCG mode QEMU: commit `b07f3d821d` Author: Daniel P. Berrange <berrange@redhat.com> Date: Thu Dec 18 16:34:39 2014 +0000 Don't setup fake CPU pids for old QEMU The code assumes that def->vcpus == nvcpupids, so when we setup fake CPU pids for old QEMU with nvcpupids == 1, we cause the later code to read off the end of the array. This has fun results like sched_setaffinity(0, ...) which changes libvirtd's own CPU affinity, or even better sched_setaffinity($RANDOM, ...) which changes the affinity of a random OS process. The intent was that this would merely disable the ability to set per-vCPU affinity. It should still have been possible to set VM level host CPU affinity. Unfortunately, when you set <vcpu cpuset='0-1'>4</vcpu>, the XML parser will internally take this & initialize an entry in the def->cputune.vcpupin array for every VCPU. IOW this is implicitly being treated as <cputune> <vcpupin cpuset='0-1' vcpu='0'/> <vcpupin cpuset='0-1' vcpu='1'/> <vcpupin cpuset='0-1' vcpu='2'/> <vcpupin cpuset='0-1' vcpu='3'/> </cputune> Even more fun, the faked cputune elements are hidden from view when querying the live XML, because their cpuset mask is the same as the VM default cpumask. The upshot was that it was impossible to set VM level CPU affinity. To fix this we must update qemuProcessSetVcpuAffinities so that it only reports a fatal error if the per-VCPU cpu mask is different from the VM level cpu mask. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2015-02-12 10:02:50 +00:00
Martin Kletzander	104ba5966a	qemu: Add support for setting vCPU and I/O thread scheduler setting Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1178986 Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2015-02-11 17:30:07 +01:00
Daniel P. Berrange	95fd6a91c6	qemu: include libvirt & QEMU versions in QEMU log files It is often helpful to know which version of libvirt and QEMU was present when a guest was first launched. Ensure this info is written into the QEMU log file for each guest.	2015-02-06 10:22:07 +00:00
Daniel P. Berrange	f7afeddce9	qemu: report TAP device indexes to systemd Record the index of each TAP device created and report them to systemd, so they show up in machinectl status for the VM.	2015-01-27 13:57:02 +00:00
Daniel P. Berrange	7b1ba9566b	Remove use of nwfilterPrivateData from nwfilter driver The nwfilter driver can rely on its global state instead of the connect private data.	2015-01-27 12:02:03 +00:00
Ján Tomko	5c703ca396	Always check return value of qemuDomainObjExitMonitor Depending on the context, either error out if the domain has disappeared in the meantime, or just ignore the value to allow marking the function as ATTRIBUTE_RETURN_CHECK.	2015-01-19 10:12:32 +01:00
Ján Tomko	6edb97f29a	Fix vmdef usage after domain crash in monitor on device detach https://bugzilla.redhat.com/show_bug.cgi?id=1161024 In the device type-specific functions, exit early if the domain has disappeared, because the cleanup should have been done by qemuProcessStop. Check the return value in processDeviceDeletedEvent and qemuProcessUpdateDevices. Skip audit and removing the device from live def because it has already been cleaned up.	2015-01-19 10:12:07 +01:00
Ján Tomko	c749eda4a2	Fix vmdef usage while in monitor in qemu process Make local copy of the disk alias in qemuProcessInitPasswords, instead of referencing the one in domain definition, which might get freed if the domain crashes while we're in monitor. Also copy the memballoon period value.	2015-01-14 19:30:32 +01:00
Pavel Hrdina	ce745914b3	qemu_process: detect updated video ram size values from QEMU QEMU internally updates the size of video memory if the domain XML had provided too low memory size or there are some dependencies for a QXL device's 'vgamem' and 'ram' sizes. We need to know about the changes and store them into the status XML to not break migration or managedsave through different libvirt versions. The values would be loaded only if the "vgamem_mb" property exists for the device. The presence of the "vgamem_mb" also tells that the "ram_size" and "vram_size" exist for QXL devices. Signed-off-by: Pavel Hrdina <phrdina@redhat.com>	2015-01-14 11:55:51 +01:00
Martin Kletzander	540c339a25	qemu: completely rework reference counting There is one problem that causes various errors in the daemon. When domain is waiting for a job, it is unlocked while waiting on the condition. However, if that domain is for example transient and being removed in another API (e.g. cancelling incoming migration), it gets unref'd. If the first call, that was waiting, fails to get the job, it unref's the domain object, and because it was the last reference, it causes clearing of the whole domain object. However, when finishing the call, the domain must be unlocked, but there is no way for the API to know whether it was cleaned or not (unless there is some ugly temporary variable, but let's scratch that). The root cause is that our APIs don't ref the objects they are using and all use the implicit reference that the object has when it is in the domain list. That reference can be removed when the API is waiting for a job. And because each domain doesn't do its ref'ing, it results in the ugly checking of the return value of virObjectUnref() that we have everywhere. This patch changes qemuDomObjFromDomain() to ref the domain (using virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which should be the only function in which the return value of virObjectUnref() is checked. This makes all reference counting deterministic and makes the code a bit clearer. Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2014-12-21 10:48:56 +01:00
Daniel P. Berrange	65686e5a81	disable vCPU pinning with TCG mode Although QMP returns info about vCPU threads in TCG mode, the data it returns is mostly lies. Only the first vCPU has a valid thread_id returned. The thread_id given for the other vCPUs is in fact the main emulator thread. All vCPUs actually run under the same thread in TCG mode. Our vCPU pinning code is not at all able to cope with this so if you try to set CPU affinity per-vCPU you end up with weird errors error: Failed to start domain instance-00000007 error: cannot set CPU affinity on process 24365: Invalid argument Since few people will care about the performance of TCG with strict CPU pinning, let's just disable that for now, so we get a clear error message error: Failed to start domain instance-00000007 error: Requested operation is not valid: cpu affinity is not supported	2014-12-19 11:32:21 +00:00
Daniel P. Berrange	b07f3d821d	Don't setup fake CPU pids for old QEMU The code assumes that def->vcpus == nvcpupids, so when we setup fake CPU pids for old QEMU with nvcpupids == 1, we cause the later code to read off the end of the array. This has fun results like sched_setaffinity(0, ...) which changes libvirtd's own CPU affinity, or even better sched_setaffinity($RANDOM, ...) which changes the affinity of a random OS process.	2014-12-19 11:32:21 +00:00
Martin Kletzander	c74d58ad47	qemu: Save numad advice into qemuDomainObjPrivate Thanks to that we don't need to drag the pointer everywhere and future code will get cleaner. Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2014-12-16 11:15:27 +01:00
Martin Kletzander	f801a81208	qemu: Remove unnecessary qemuSetupCgroupPostInit function Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2014-12-16 11:15:27 +01:00
Laine Stump	c5a54917d5	qemu: add a qemuInterfaceStopDevices(), called when guest CPUs stop We now have a qemuInterfaceStartDevices() which does the final activation needed for the host-side tap/macvtap devices that are used for qemu network connections. It will soon make sense to have the converse qemuInterfaceStopDevices() which will undo whatever was done during qemuInterfaceStartDevices(). A function to "stop" a single device has also been added, and is called from the appropriate place in qemuDomainDetachNetDevice(), although this is currently unnecessary - the device is going to immediately be deleted anyway, so any extra "deactivation" will be for naught. The call is included for completeness, though, in anticipation that in the future there may be some required action that isn't nullified by deleting the device. This patch is a part of a more complete fix for: https://bugzilla.redhat.com/show_bug.cgi?id=1081461	2014-12-13 22:20:28 -05:00
Laine Stump	879c13d6cc	qemu: always call qemuInterfaceStartDevices() when starting CPUs The patch that added qemuInterfaceStartDevices() (upstream commit `82977058f5`) had an extra conditional to prevent calling it if the reason for starting the CPUs was VIR_DOMAIN_RUNNING_UNPAUSED or VIR_DOMAIN_RUNNING_SAVE_CANCELED. This was put in by the author as the result of a reviewer asking if it was necessary to ifup the interfaces in all occasions (because these were the two cases where the CPU would have already been started (and stopped) once, so the interface would already be ifup'ed). It turns out that, as long as there is no corresponding qemuInterfaceStopDevices() to ifdown the interfaces anytime the CPUs are stopped, neglecting to ifup when reason is RUNNING_UNPAUSED or RUNNING_SAVE_CANCELED doesn't cause any problems (because it just happens that the interface will have already been ifup'ed by a prior call when the CPU was previously started for some other reason). However, it also doesn't help, and there will soon be a qemuInterfaceStopDevices() function which will ifdown these interfaces when the guest CPUs are stopped, and once that is done, the interfaces will be left down in some cases when they should be up (for example, if a domain is paused and then unpaused). So, this patch is removing the condition in favor of always calling qemuInterfaceStartDevices() when the guest CPUs are started. This patch (and the aforementioned patch) resolve: https://bugzilla.redhat.com/show_bug.cgi?id=1081461	2014-12-13 21:44:45 -05:00
Matthew Rosato	82977058f5	network: Bring netdevs online later Currently, MAC registration occurs during device creation, which is early enough that, during live migration, you end up with duplicate MAC addresses on still-running source and target devices, even though the target device isn't actually being used yet. This patch proposes to defer MAC registration until right before the guest can actually use the device -- In other words, right before starting guest CPUs. Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com> Signed-off-by: Laine Stump <laine@laine.org>	2014-12-10 15:09:01 -05:00
Peter Krempa	38bde5776a	qemu: process: Avoid uninitialized use of two vars when reconnecting to vm `3ecebf0711` breaks the build as it adds a way to jump to cleanup before the 'cfg' object is retrieved and 'priv' is initialized.	2014-12-04 16:24:25 +01:00
Peter Krempa	3ecebf0711	qemu: process: Refactor reconnecting to qemu processes Move entering the job into the thread to simplify the program flow. Also as the code holds a separate reference to the domain object some conditions can be simplified. After this patch qemuDomainObjTransferJob is no longer needed so this patch removes it.	2014-12-04 15:28:39 +01:00
Luyao Huang	f8c1fb3d2e	qemu: Make pid available for security managers in qemuProcessAttach There are some small issues in qemuProcessAttach: 1. Fix virSecurityManagerGetProcessLabel always getting pid = 0 by moving 'vm->pid = pid' before the call to virSecurityManagerGetProcessLabel. 2. Use virSecurityManagerGenLabel to get image label. 3. Fix always setting the selinux label for other security driver labels. Signed-off-by: Luyao Huang <lhuang@redhat.com>	2014-12-01 12:04:38 +01:00
Erik Skultety	8e23e0e977	qemu: fix block{commit,copy} abort handling When a block{commit,copy} job was aborted on a domain, block job handler did not process it correctly, leaving a phantom job in the background. Any further calls to any blockjob causes "block <jobtype> still active" error. This patch fixes the blockjob handler so that it checks not only for VIR_DOMAIN_BLOCK_JOB_FAILED status, but VIR_DOMAIN_BLOCK_JOB_CANCELED status as well, followed by our existing cleanup routine. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1135169 Signed-off-by: Jiri Denemark <jdenemar@redhat.com>	2014-12-01 10:09:03 +01:00
Michal Privoznik	6085d917d5	qemu: Don't track quiesced state of FSs https://bugzilla.redhat.com/show_bug.cgi?id=1160084 As of `b6d4dad11b` (1.2.5) we are trying to keep the status of FSFreeze in the guest. Even though I've tried to fix a couple of corner cases (`6ea54769ba`), it occurred to me just recently, that the approach is broken by design. Firstly, there are many other ways to talk to qemu-ga (even through libvirt) that filesystems can be thawed (e.g. qemu-agent-command) without libvirt noticing. Moreover, there are plenty of ways to thaw filesystems without even qemu-ga noticing (yes, qemu-ga keeps internal track of FSFreeze status). So, instead of keeping the track ourselves, or asking qemu-ga for stale state, it's the best to let qemu-ga deal with that (and possibly let guest kernel propagate an error). Moreover, there's one bug with the following approach, if fsfreeze command failed, we've executed fsthaw subsequently. So issuing domfsfreeze in virsh gave the following result: virsh # domfsfreeze gentoo Froze 1 filesystem(s) virsh # domfsfreeze gentoo error: Unable to freeze filesystems error: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance virsh # domfsfreeze gentoo Froze 1 filesystem(s) virsh # domfsfreeze gentoo error: Unable to freeze filesystems error: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2014-11-28 11:22:24 +01:00
Peter Krempa	b29f2436ac	qemu: Emit the guest agent lifecycle event Add code to emit the event on change of the channel state and reconnect to the qemu process.	2014-11-24 15:50:59 +01:00
Peter Krempa	21c676c2aa	qemu: process: Refresh virtio channel guest state when connecting to mon Use data provided by "query-chardev" to refresh the guest frontend state of virtio channels.	2014-11-24 08:58:30 +01:00
Peter Krempa	4d7eb90311	qemu: chardev: Extract more information about character devices Improve the monitor function to also retrieve the guest state of character device (if provided) so that we can refresh the state of virtio-serial channels and perhaps react to changes in the state in future patches. This patch changes the returned data from qemuMonitorGetChardevInfo to return a structure containing the pty path and the state for all the character devices. The change to the testsuite makes sure that the data is parsed correctly.	2014-11-24 08:58:30 +01:00
Peter Krempa	15bbaaf014	qemu: Add handling for VSERPORT_CHANGE event New qemu added a new event that is emitted when a virtio serial channel is opened in the guest OS. This allows us to update the state of the port in the output-only XML element. This patch implements the monitor callbacks and necessary handlers to update the state in the definition.	2014-11-21 11:00:11 +01:00
Peter Krempa	e9a4506963	qemu: monitor: Rename and improve qemuMonitorGetPtyPaths To unify future additions that require information from "query-chardev" rename qemuMonitorGetPtyPaths and friends to qemuMonitorGetChardevInfo and move the allocation of the returned hash into the top level function.	2014-11-21 11:00:10 +01:00
Peter Krempa	6692ba731b	qemu: process: report useful error if alias formatting fails When retrieving the paths for PTY devices the alias gets formatted into a static string. If it doesn't fit we wouldn't report an error.	2014-11-21 11:00:10 +01:00
Peter Krempa	7e130e8b35	storage: qemu: Fix security labelling of new image chain elements When creating a disk image snapshot the libvirt code would blindly copy the parents label to the newly created image. This runs into problems when you start a VM from an image hosted on NFS (or other storage system that doesn't support selinux labels) and the snapshot destination is on a storage system that does support selinux labels. Libvirt's code in that case generates a different security label for the image hosted on NFS. This label is valid only for NFS images and doesn't allow access in case of a locally stored image. To fix this issue libvirt needs to refrain from copying security information in cases where the default domain seclabel is a better choice. This patch repurposes the now unused @force argument of virStorageSourceInitChainElement to denote whether a copy of the security labelling stuff should be attempted or not. This allows to fine-control the copy operation for cases where we need to keep the label of the old disk vs. the cases where we need to keep the label unset to use the default domain imagelabel. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1151718	2014-11-21 09:28:26 +01:00
Anirban Chakraborty	22cff52a2b	network: Add network bandwidth support to ethernet interfaces Ethernet interfaces in libvirt currently do not support bandwidth setting. For example, following xml file for an interface will not apply these settings to corresponding qdiscs. <interface type="ethernet"> <mac address="02:36:1d:18:2a:e4"/> <model type="virtio"/> <script path=""/> <target dev="tap361d182a-e4"/> <bandwidth> <inbound average="984" peak="1024" burst="64"/> <outbound average="2000" peak="2048" burst="128"/> </bandwidth> </interface> Signed-off-by: Anirban Chakraborty <abchak@juniper.net> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2014-11-19 10:36:49 +01:00
Martin Kletzander	5cca4cd16f	Remove unnecessary curly brackets in src/qemu/ Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2014-11-14 17:13:01 +01:00
Pavel Hrdina	41127244fb	nwfilter: fix deadlock caused updating network device and nwfilter Commit `6e5c79a1` tried to fix deadlock between nwfilter{Define,Undefine} and starting of guest, but this same deadlock exists for updating/attaching network device to domain. The deadlock was introduced by removing global QEMU driver lock because nwfilter was counting on this lock and ensure that all driver locks are locked inside of nwfilter{Define,Undefine}. This patch extends usage of virNWFilterReadLockFilterUpdates to prevent the deadlock for all possible paths in QEMU driver. LXC and UML drivers still have global lock. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1143780 Signed-off-by: Pavel Hrdina <phrdina@redhat.com>	2014-11-13 10:45:19 +01:00
Michal Privoznik	54ddc08ddb	qemuPrepareNVRAM: Save domain conf only if domain's persistent In one of my previous patches (`3a3c3780b`) I've tried to fix the problem of nvram path disappearing on a domain that's been started and shut down again. I fixed this by explicitly saving domain's config file. However, I was a bit clumsy and did not realize that we have transient domains, for which we don't save the config file. Hence, any domain using UEFI became persistent. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2014-11-13 09:35:25 +01:00
Michal Privoznik	6ea54769ba	qemu: Update fsfreeze status on domain state transitions https://bugzilla.redhat.com/show_bug.cgi?id=1160084 As of `b6d4dad1` (1.2.5) libvirt keeps track if domain disks have been frozen. However, this falls into that set of information which don't survive domain restart. Therefore, we need to clear the flag upon some state transitions. Moreover, once we clear the flag we must update the status file too. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2014-11-06 15:20:01 +01:00
Martin Kletzander	c63ef0452b	numa: split util/ and conf/ and support non-contiguous nodesets This is a reaction to Michal's fix [1] for non-NUMA systems that also splits out conf/ out of util/ because libvirt_util shouldn't require libvirt_conf if it is the other way around. This particular use case worked, but we're trying to avoid it as mentioned [2], many times. The only functions from virnuma.c that needed numatune_conf were virDomainNumatuneNodesetIsAvailable() and virNumaSetupMemoryPolicy(). The first one should be in numatune_conf as it works with virDomainNumatune, the second one just needs nodeset and mode, both of which can be passed without the need of numatune_conf. Apart from fixing that, this patch also fixes recently added code (between commits d2460f85^..5c8515620) that doesn't support non-contiguous nodesets. It uses new function virNumaNodesetIsAvailable(), which doesn't need a stub as it doesn't use any libnuma functions, to check if every specified nodeset is available. [1] https://www.redhat.com/archives/libvir-list/2014-November/msg00118.html [2] http://www.redhat.com/archives/libvir-list/2011-June/msg01040.html Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2014-11-06 15:13:55 +01:00
Martin Kletzander	11a48758a7	qemu: make advice from numad available when building commandline Particularly in qemuBuildNumaArgStr(), there was a need for the advice due to memory backing, which needs to know the nodeset it will be pinned to. With newer qemu this caused the following error when starting domain: error: internal error: Advice from numad is needed in case of automatic numa placement even when starting perfectly valid domain, e.g.: ... <vcpu placement='auto'>4</vcpu> <numatune> <memory mode='strict' placement='auto'/> </numatune> <cpu> <numa> <cell id='0' cpus='0' memory='524288'/> <cell id='1' cpus='1' memory='524288'/> </numa> </cpu> ... Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138545 Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2014-11-03 16:43:22 +01:00
weiwei li	be598c5ff8	qemu: Release nbd port from migrationPorts instead of remotePorts commit `3e1e16aa8d` (Use a port from the migration range for NBD as well) changed nbd port allocation from remotePorts to migrationPorts, but did not change the port releasing process, which causes an error when migrating several times (above 64): error: internal error: Unable to find an unused port in range 'migration' (49152-49215) https://bugzilla.redhat.com/show_bug.cgi?id=1159245 Signed-off-by: Weiwei Li <nuonuoli@tencent.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>	2014-10-31 12:20:06 +01:00
Zhou yimin	411cea638f	qemu: move setting emulatorpin ahead of monitor showing up If VM is configured with many devices(including passthrough devices) and large memory, libvirtd will take seconds(in the worst case) to wait for monitor. In this period the qemu process may run on any PCPU though I intend to pin emulator to the specified PCPU in xml configuration. Actually qemu process takes high cpu usage during vm startup. So this is not the strict CPU isolation in this case. Signed-off-by: Zhou yimin <zhouyimin@huawei.com>	2014-10-21 12:26:38 +02:00
Laine Stump	b6bdda458a	qemu: setup infrastructure to handle NIC_RX_FILTER_CHANGED event NIC_RX_FILTER_CHANGED is sent by qemu any time a NIC driver in the guest modified the NIC's RX Filter (for example, if the MAC address of the NIC is changed by the guest). This patch doesn't do anything useful with that event; it just sets up all the plumbing to get news of the event into a worker thread with all proper locking/reference counting, and provide an easy place to add in desired functionality. See src/qemu/EVENTHANDLERS.txt for information/instructions on adding a libvirt-internal handler for a qemu event (using NIC_RX_FILTER_CHANGED as an example).	2014-10-06 13:50:57 -04:00
Guido Günther	4882618ed1	qemu: use systemd's TerminateMachine to kill all processes If we don't properly clean up all processes in the machine-<vmname>.scope systemd won't remove the cgroup and subsequent vm starts fail with 'CreateMachine: File exists' Additional processes can e.g. be added via echo $PID > /sys/fs/cgroup/systemd/machine.slice/machine-${VMNAME}.scope/tasks but there are other cases like http://bugs.debian.org/761521 Invoke TerminateMachine to be on the safe side since systemd tracks the cgroup anyway. This is a noop if all processes have terminated already.	2014-10-01 20:17:46 +02:00
Ján Tomko	ec5f817f2e	Don't verify CPU features with host-passthrough Commit `fba6bc4` introduced the non-migratable invtsc feature, breaking save/migration with host-model and host-passthrough. On hosts with this feature present it was automatically included in the CPU definition, regardless of QEMU support. Commit `de0aeaf` stopped including it by default for host-model, but failed to fix host-passthrough. This commit ignores checking of CPU features with host-passthrough, since we don't pass them to QEMU (only -cpu host is passed), allowing domains using host-passthrough that were saved with the broken version of libvirtd to be restored. https://bugzilla.redhat.com/show_bug.cgi?id=1147584	2014-09-30 10:47:02 +02:00
Michal Privoznik	3a3c3780b4	qemuPrepareNVRAM: Save domain after NVRAM path generation On a domain startup, the variable store path is generated if needed. The path is intended to be generated only once. However, the updated domain definition is saved only into the state XML, not into the config dir. So later, whenever the domain is destroyed and the daemon is restarted, the generated path is forgotten and the file may be left behind on virDomainUndefine() call. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2014-09-26 10:14:34 +02:00
Peter Krempa	639a00984a	qemu: Report better errors from broken backing chains Request erroring out from the backing chain traveller and drop qemu's internal backing chain integrity tester. The backing chain traveller reports errors by itself with possibly more detail than qemuDiskChainCheckBroken ever could. We also need to make sure that we reconnect to existing qemu instances even at the cost of losing the backing chain info (this really should be stored in the XML rather than reloaded from disk, but that needs some work).	2014-09-24 10:18:47 +02:00
John Ferlan	74eaa0918b	qemu: Process the hostdev "rawio" setting Mimic the "Disk" processing for 'rawio', but for a scsi_host hostdev lun device.	2014-09-19 07:49:06 -04:00
John Ferlan	320825b4ca	domain_conf: Change virDomainDiskDef 'rawio' to use virTristateBool Adjust disk definition for 'rawio' to use the TristateBool logic	2014-09-19 05:59:36 -04:00
John Ferlan	8921d48868	qemu: Add missing goto on rawio Commit id '9a2f36ec' added a build conditional of CAP_SYS_RAWIO in order to determine whether or not a disk definition using rawio should be allowed on platforms without CAP_SYS_RAWIO. If one was found, virReportError was used but the code didn't goto cleanup. This patch adds the goto.	2014-09-19 05:54:00 -04:00
Pavel Hrdina	da7799d879	Move the FIPS detection from capabilities We are not detecting the presence of FIPS from QEMU, but from procfs and that means it's not QEMU capability. It was decided that we will pass this flag to QEMU even if it's not supported by old QEMU binaries. This patch also reverts changes done by commit `a21cfb0f` to qemucapabilitestest and implements a new test case in qemuxml2argvtest. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1135431 Signed-off-by: Pavel Hrdina <phrdina@redhat.com>	2014-09-19 09:08:23 +02:00
Ján Tomko	c1480871bb	Fixes for domains with no iothreads Plug a memory leak and silence a warning.	2014-09-18 14:49:01 +02:00
Ján Tomko	b20d39a56f	Wire up the interface backend options Pass the user-specified tun path down when creating tap device when called from the qemu driver. Also honor the vhost device path specified by user.	2014-09-16 16:02:34 +02:00
John Ferlan	76a81b1d31	qemu: Need to check for capability before query Prior to trying the query-iothreads call - check if the qemu has the capability Signed-off-by: John Ferlan <jferlan@redhat.com>	2014-09-16 06:08:20 -04:00
John Ferlan	500c91c57d	qemu_cgroup: Adjust spacing around incrementor Change "i+1" to "i + 1"	2014-09-15 21:05:46 -04:00
John Ferlan	b66c950fb9	qemu: Fix iothreads issue If there are no iothreads, then return from qemuProcessDetectIOThreadPIDs without error; otherwise, the following occurs: error: Failed to start domain $dom error: An error occurred, but the cause is unknown	2014-09-15 21:05:46 -04:00
John Ferlan	9bef96ec50	qemu: Allow pinning specific IOThreads to a CPU Modify qemuProcessStart() in order to allow setting affinity to specific CPUs for IOThreads. The process followed is similar to that for the vCPUs. This involves adding a function to fetch the IOThread IDs via qemuMonitorGetIOThreads() and adding them to the iothreadpids[] list. Then making sure all the cgroup data has been properly set up and finally assigning affinity.	2014-09-15 13:18:56 -04:00
John Ferlan	35a50ea8c7	qemu: Resolve Coverity NEGATIVE_RETURNS In qemuProcessInitPCIAddresses() if qemuMonitorGetAllPCIAddresses() returns a negative (or zero) value, then no need to call the qemuProcessDetectPCIAddresses(). Signed-off-by: John Ferlan <jferlan@redhat.com>	2014-09-11 08:10:14 -04:00
Michal Privoznik	742b08e30f	qemu: Automatically create NVRAM store When using split UEFI image, it may come handy if libvirt manages per domain _VARS file automatically. While the _CODE file is RO and can be shared among multiple domains, you certainly don't want to do that on the _VARS file. This latter one needs to be per domain. So at the domain startup process, if it's determined that domain needs _VARS file it's copied from this master _VARS file. The location of the master file is configurable in qemu.conf. Temporary, on per domain basis the location of master NVRAM file can be overridden by this @template attribute I'm inventing to the <nvram/> element. All it does is holding path to the master NVRAM file from which local copy is created. If that's the case, the map in qemu.conf is not consulted. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Acked-by: Laszlo Ersek <lersek@redhat.com>	2014-09-10 09:38:07 +02:00
Jiri Denemark	eaee338ae6	qemu: Recompute downtime and total time when migration completes Total time of a migration and total downtime transferred from a source to a destination host do not count with the transfer time to the destination host and with the time elapsed before guest CPUs are resumed. Thus, source libvirtd remembers when migration started and when guest CPUs were paused. Both timestamps are transferred to destination libvirtd which uses them to compute total migration time and total downtime. Obviously, this requires the time to be synchronized between the two hosts. The reported times are useless otherwise but they would be equally useless if we didn't do this recomputation so we don't lose anything by doing it. Signed-off-by: Jiri Denemark <jdenemar@redhat.com>	2014-09-10 09:37:34 +02:00
Jiri Denemark	03890605dc	qemu: Propagate QEMU errors during incoming migrations When QEMU fails during incoming migration after we successfully started it (i.e., during Perform or Finish phase), we report a rather unhelpful message Unable to read from monitor: Connection reset by peer We already have a code that takes error messages from QEMU's error output but we disable it once QEMU successfully starts. This patch postpones this until the end of Finish phase during incoming migration so that we can report a much better error message: internal error: early end of file from monitor: possible problem: Unknown savevm section or instance '0000:00:05.0/virtio-balloon' 0 load of migration failed https://bugzilla.redhat.com/show_bug.cgi?id=1090093 Signed-off-by: Jiri Denemark <jdenemar@redhat.com>	2014-09-08 13:33:44 +02:00
Eric Blake	44e30277d8	maint: use consistent if-else braces in qemu I'm about to add a syntax check that enforces our documented HACKING style of always using matching {} on if-else statements. This commit focuses on the qemu driver. * src/qemu/qemu_command.c (qemuParseISCSIString) (qemuParseCommandLineDisk, qemuParseCommandLine) (qemuBuildSmpArgStr, qemuBuildCommandLine) (qemuParseCommandLineDisk, qemuParseCommandLineSmp): Correct use of {}. * src/qemu/qemu_capabilities.c (virQEMUCapsProbeCPUModels): Likewise. * src/qemu/qemu_driver.c (qemuDomainCoreDumpWithFormat) (qemuDomainRestoreFlags, qemuDomainGetInfo) (qemuDomainMergeBlkioDevice): Likewise. * src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Likewise. * src/qemu/qemu_monitor_text.c (qemuMonitorTextCreateSnapshot) (qemuMonitorTextLoadSnapshot, qemuMonitorTextDeleteSnapshot): Likewise. * src/qemu/qemu_process.c (qemuProcessStop): Likewise. Signed-off-by: Eric Blake <eblake@redhat.com>	2014-09-04 08:53:21 -06:00
Wang Rui	4f2ad084bc	qemu_process: Resolve Coverity RESOURCE_LEAK If virSecurityManagerClearSocketLabel() fails, 'agent' won't be freed before jumping to cleanup. Signed-off-by: Wang Rui <moon.wangrui@huawei.com>	2014-09-03 15:00:19 -04:00
Chunyan Liu	0e1a1a8c47	qemu: ensure sane umask for qemu process Add umask to _virCommand, allow user to set umask to command. Set umask(002) to qemu process to overwrite the default umask of 022 set by many distros, so that unix sockets created for virtio-serial has expected permissions. Fix problem reported here: https://sourceware.org/bugzilla/show_bug.cgi?id=13078#c11 https://bugzilla.novell.com/show_bug.cgi?id=888166 To use virtio-serial device, unix socket created for chardev with default umask(022) has insufficient permissions. e.g.: -device virtio-serial \ -chardev socket,path=/tmp/foo,server,nowait,id=foo \ -device virtserialport,chardev=foo,name=org.fedoraproject.port.0 srwxr-xr-x 1 qemu qemu 0 21. Jul 14:19 /tmp/somefile.sock Other users in the same group (like real user, test engines, etc) cannot write to this socket. Signed-off-by: Chunyan Liu <cyliu@suse.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2014-09-03 05:58:15 -06:00
Erik Skultety	36a0993a15	qemu: min_guarantee: Parameter 'min_guarantee' not supported The 'min_guarantee' is used by VMware ESX and OpenVZ drivers, with qemu however, libvirt should report error when starting a domain, because this element is not used. Resolves https://bugzilla.redhat.com/show_bug.cgi?id=1122455	2014-08-22 16:33:18 +02:00
Roman Bogorodskiy	8c170c9fe6	storage: make disk source pool translation generic Currently, qemu driver uses qemuTranslateDiskSourcePool() to translate disk volume information. This function is general enough and could be used for other drivers as well, so move it to conf/domain_conf.c along with its helpers. - qemuTranslateDiskSourcePool: move to storage/storage_driver.c and rename to virStorageTranslateDiskSourcePool, - qemuAddISCSIPoolSourceHost: move to storage/storage_driver.c and rename to virStorageAddISCSIPoolSourceHost, - qemuTranslateDiskSourcePoolAuth: move to storage/storage_driver.c and rename to virStorageTranslateDiskSourcePoolAuth, - Update users of qemuTranslateDiskSourcePool to use a new name.	2014-08-19 20:50:12 +04:00
Peter Krempa	482f4e596f	qemu: process: Pin on per-vcpu basis instead of per-vcpupin element Pin existing vcpus rather than existing vcpu pinning infos. This increases the complexity of the lookup, but avoids pinning cpus that are not enabled actually.	2014-08-18 17:43:05 +02:00
Peter Krempa	a821f1f028	qemu: process: Remove unnecessary argument and rename function We set just one affinity of the emulator and the virConnectPtr isn't needed for that function.	2014-08-18 17:43:05 +02:00
Erik Skultety	9b1759bbe9	qemu: Redundant listen address entry in guest xml When editing guest's XML (on QEMU), it was possible to add multiple listen elements into graphics parent element. However QEMU does not support listening on multiple addresses. Configuration is tested for multiple 'listen address' and if positive, an error is raised. https://bugzilla.redhat.com/show_bug.cgi?id=1119212	2014-08-18 14:45:37 +02:00
Pavel Hrdina	0c35a415f7	qemu_process: fix memleak found by coverity Signed-off-by: Pavel Hrdina <phrdina@redhat.com>	2014-08-14 19:33:06 +02:00
Sam Bobroff	f0f9eed843	qemu: Tidy up job handling during live migration During a QEMU live migration several warning messages about job handling could be written to syslog on the destination host: "entering monitor without asking for a nested job is dangerous" The messages are written because the job handling during migration uses hard coded asyncJob values in several places that are incorrect. This patch passes the required asyncJob value around and prevents the warnings as well as any issues that the warnings may be referring to. https://bugzilla.redhat.com/show_bug.cgi?id=1130089 Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>	2014-08-14 12:12:42 +02:00
Peter Krempa	e3f5af6a5f	qemu: process: Fix header format of qemuProcessSetVcpuAffinities Fix header alignment and remove the unused conn parameter.	2014-08-12 17:24:34 +02:00
Eric Blake	232a31bea3	blockcommit: track job type in xml A future patch is going to wire up qemu active block commit jobs; but as they have similar events and are canceled/pivoted in the same way as block copy jobs, it is easiest to track all bookkeeping for the commit job by reusing the <mirror> element. This patch adds domain XML to track which job was responsible for creating a mirroring situation, and adds a job='copy' attribute to all existing uses of <mirror>. Along the way, it also massages the qemu monitor backend to read the new field in order to generate the correct type of libvirt job (even though it requires a future patch to actually cause a qemu event that can be reported as an active commit). It also prepares to update persistent XML to match changes made to live XML when a copy completes. * docs/schemas/domaincommon.rng: Enhance schema. * docs/formatdomain.html.in: Document it. * src/conf/domain_conf.h (_virDomainDiskDef): Add a field. * src/conf/domain_conf.c (virDomainBlockJobType): String conversion. (virDomainDiskDefParseXML): Parse job type. (virDomainDiskDefFormat): Output job type. * src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Distinguish active from regular commit. * src/qemu/qemu_driver.c (qemuDomainBlockCopy): Set job type. (qemuDomainBlockPivot, qemuDomainBlockJobImpl): Clean up job type on completion. * tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-mirror-old.xml: Update tests. * tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Likewise. * tests/qemuxml2argvdata/qemuxml2argv-disk-active-commit.xml: New file. * tests/qemuxml2xmltest.c (mymain): Drive new test. Signed-off-by: Eric Blake <eblake@redhat.com>	2014-07-30 06:32:38 -06:00
Eric Blake	febf84c26a	blockjob: properly track blockcopy xml changes on disk We were not directly saving the domain XML to file after starting or finishing a blockcopy. Without the startup write, a libvirtd restart in the middle of a copy job would forget that the job was underway. Then at pivot, we were indirectly writing new XML in reaction to events that occur as we stop and restart the guest CPUs. But there was a race: since pivot is an async action, it is possible that libvirtd is restarted before the pivot completes, so if XML changes during the event, that change was not written. The original blockcopy code cleared out the <mirror> element prior to restarting the CPUs, but this is also a race, observed if a user does an async pivot and a dumpxml before the event occurs. Furthermore, this race will interfere with active commit in a future patch, because that code will rely on the <mirror> element at the time of the qemu event to determine whether to inform the user of a normal commit or an active commit. Fix things by saving state any time we modify live XML, while delaying XML disk modifications until after the event completes. We still need to teach libvirtd restarts to examine all existing <mirror> elements to see if the job completed in the meantime (that is, if libvirtd misses the event, the updated state still needs to be updated in live XML), but that will be a later patch, in part because we also need to start taking advantage of newer qemu's ability to keep the job around after completion rather than the current usage where the job disappears both on error and on success. * src/qemu/qemu_driver.c (qemuDomainBlockCopy): Track XML change on disk. (qemuDomainBlockJobImpl, qemuDomainBlockPivot): Move job-end XML rewrites... * src/qemu/qemu_process.c (qemuProcessHandleBlockJob): ...here. Signed-off-by: Eric Blake <eblake@redhat.com>	2014-07-29 15:36:30 -06:00
Eric Blake	9a212d6708	blockcopy: add more XML for state tracking Doing a blockcopy operation across a libvirtd restart is not very robust at the moment. In particular, we are clearing the <mirror> element prior to telling qemu to finish the job. Also, thanks to the ability to request async completion, the user can easily regain control prior to qemu actually finishing the effort, and they should be able to poll the domain XML to see if the job is still going. A future patch will fix things to actually wait until qemu is done before modifying the XML to reflect the job completion. But since qemu issues identical BLOCK_JOB_COMPLETE events regardless of whether the job was cancelled (kept the original disk) or completed (pivoted to the new disk), we have to track which of the two operations were used to end the job. Furthermore, we'd like to avoid attempts to end a job where we are already waiting on an earlier request to qemu to end the job. Likewise, if we miss the qemu event (perhaps because it arrived during a libvirtd restart), we still need enough state recorded to be able to determine how to modify the domain XML once we reconnect to qemu and manually learn whether the job still exists. Although this patch doesn't actually fix the problem, it is a preliminary step that makes it possible to track whether a job has already begun steps towards completion. * src/conf/domain_conf.h (virDomainDiskMirrorState): New enum. (_virDomainDiskDef): Convert bool mirroring to new enum. * src/conf/domain_conf.c (virDomainDiskDefParseXML) (virDomainDiskDefFormat): Handle new values. * src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Adjust client. * src/qemu/qemu_driver.c (qemuDomainBlockPivot) (qemuDomainBlockJobImpl): Likewise. * docs/schemas/domaincommon.rng (diskMirror): Expose new values. * docs/formatdomain.html.in (elementsDisks): Document it. * tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Test it. Signed-off-by: Eric Blake <eblake@redhat.com>	2014-07-29 15:36:30 -06:00
Michal Privoznik	136ad49740	domain: Introduce ./hugepages/page/[@size, @unit, @nodeset] <memoryBacking> <hugepages> <page size="1" unit="G" nodeset="0-3,5"/> <page size="2" unit="M" nodeset="4"/> </hugepages> </memoryBacking> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2014-07-29 12:02:34 +01:00
Michal Privoznik	725a211fc0	qemu: Utilize virFileFindHugeTLBFS Use better detection of hugetlbfs mount points. Yes, there can be multiple mount points each serving different huge page size. Since we already have ability to override the mount point in the qemu.conf file, this crazy backward compatibility code is brought in. Now we allow multiple mount points, so the "hugetlbfs_mount" option must take an list of strings (mount points). But previously, it was just a string, so we must accept both types now. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2014-07-29 11:58:35 +01:00

578 Commits