Add an abort() on the class/object allocation failures so that
virStorageSourceNew() always returns a virStorageSource and remove
checks from all callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Before:
$ uname -m
s390x
$ cat passthrough-cpu.xml
<cpu check="none" mode="host-passthrough" />
$ virsh hypervisor-cpu-compare passthrough-cpu.xml
error: Failed to compare hypervisor CPU with passthrough-cpu.xml
error: internal error: unable to execute QEMU command 'query-cpu-model-comp
arison': Invalid parameter type for 'modelb.name', expected: string
After:
$ virsh hypervisor-cpu-compare passthrough-cpu.xml
CPU described in passthrough-cpu.xml is identical to the CPU provided by hy
pervisor on the host
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
We'll use the auto-alignment function during parse time, in
domain_conf.c. Let's move the function to that file, renaming
it to virDomainNVDimmAlignSizePseries(). This will also make it
clearer that, although QEMU is the only driver that currently
supports it, pSeries NVDIMM restrictions aren't tied to QEMU.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
If the mapping is not present, we should not try to
access its elements.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: 8b5b80f4c5
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In [1], changes were made to remove the existing auto-alignment
for pSeries NVDIMM devices. That design promotes strange situations
where the NVDIMM size reported in the domain XML is different
from what QEMU is actually using. We removed the auto-alignment
and relied on standard size validation.
However, this goes against Libvirt design philosophy of not
tampering with existing guest behavior, as pointed out by Daniel
in [2]. Since we can't know for sure whether there are guests that
are relying on the auto-alignment feature to work, the changes
made in [1] are a direct violation of this rule.
This patch reverts [1] entirely, re-enabling auto-alignment for
pSeries NVDIMM as it was before. Changes will be made to ease
the limitations of this design without hurting existing
guests.
This reverts the following commits:
- commit 2d93cbdea9
Revert "formatdomain.html.in: mention pSeries NVDIMM 'align down' mechanic"
- commit 0ee56369c8
qemu_domain.c: change qemuDomainMemoryDeviceAlignSize() return type
- commit 07de813924
qemu_domain.c: do not auto-align ppc64 NVDIMMs
- commit 0ccceaa57c
qemu_validate.c: add pSeries NVDIMM size alignment validation
- commit 4fa2202d88
qemu_domain.c: make qemuDomainGetMemorySizeAlignment() public
[1] https://www.redhat.com/archives/libvir-list/2020-July/msg02010.html
[2] https://www.redhat.com/archives/libvir-list/2020-September/msg00572.html
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Introduce 'isa' controller type. In domain XML it looks this way:
...
<controller type='isa' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01'
function='0x0'/>
</controller>
...
Currently, this is needed for the bhyve driver to allow choosing a
specific PCI address for that. In bhyve, this controller is used to
attach serial ports and a boot ROM.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This patch is just revert of [1]. Actually we should NOT pass
QEMU_ASYNC_JOB_NONE as that patch suggests while we are in async job in order
to acquire nested jobs correctly. The patch tries to fix issues introduced by
another patch [2] where jobs are mistakenly cleared out in qemuProcessStop.
Later patch [3] fixed the issue introduced by patch [2]. Now we need to revert
[1] as well as we now still have same concurrency crash issues as [3] described
but for the force revert.
[1] 0c4408c83: qemu: Don't use asyncJob after stop during snapshot revert
[2] 888aa4b6b: qemuDomainObjPrivateDataClear: Don't leak @migParams
[3] d75f865fb: qemu: fix concurrency crash bug in snapshot revert
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The 'readonly' hostdev property is stored separately from the
virStorageSource as some hostdevs are not described by a virStorage
source. We need to propagate the flag to the virStorage source also for
iSCSI backends as it's used to generate the backend properties.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1868856
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We've put the aliases into the backup job definition after the status
XML was already written so they didn't appear in the on-disk state.
Move the code putting them into the private definition earlier, so that
the status XML update done by saving blockjobs already writes them out.
Also add a note notifying that the block job status update writes the
status XML.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1870488
Fixes: 423576679a
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Commit 423576679a implementing TLS forgot to remove the comment.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
QEMU's blockdev nodenames which are used to back SCSI/iSCSI hostdevs are
limited to 32 characters. If a user passes a very long user alias as
name of the host device it's easy to end up with a too-long nodename.
To prevent this from happening don't base the nodename on the possibly
user-specified alias but on the normal sequential node name generator.
We then store the name in the status XML for further use.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The secret object is used to pass data to the backend so it's better
fitting to base the secret object name on the SCSI host device backend
name.
Since we store the object alias in the status XML this modification is
safe in regards to existing guests.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Allocate the nodename in the setup function rather than in the command
line generator.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function is no longer used once we setup per-hostdev data.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Historically we've prepared secrets for all objects in one place. This
doesn't make much sense and it's semantically more appealing to prepare
everything for a single device type in one place.
Move the setup of the (iSCSI|SCSI) hostdev secrets into a new function
which will be used to setup other things as well in the future.
This is a similar approach we do for disks.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This was a hack when we were locally regenerating the nodename so that
it's not leaked. Now that we use proper virStorageSource with
persistence it's no longer required.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Modify the attach/detach data generators to actually use the
virStorageSourceStructure embedded in the SCSI config data rather than
creating an ad-hoc internal one.
The modification will allow us to properly store the nodename used for
the backend in the status XML rather than re-generating it all the
time.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For upgrade reasons so that we can modify the used nodename we must
generate the old version for all status XMLs which don't have it stored
explicitly.
The change will be required as using the user-provided alias may result
in too-long nodenames which will be rejected by qemu.
Add code which fills in the appropriate old value and add test cases to
validate that it's added and also that existing nodenames are not
overwritten.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
After we started copying the privateData pointer in
qemuDomainObjRestoreJob, we should also free them
once we're done with them.
Register the clear function and use g_auto.
Also add a check for job->cb to qemuDomainObjClearJob,
to prevent freeing an uninitialized job.
https://bugzilla.redhat.com/show_bug.cgi?id=1878450
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: aca37c3fb2
While we set up perf events for a shutoff domain and check the settings,
All of perf events are reported as 'disabled', unless we add --config,
This is redundant for a shutoff domain.
# virsh domstate $GUEST
shut off
# virsh perf --domain $GUEST
cmt : disabled
mbmt : disabled
mbml : disabled
......
# virsh perf --domain $GUEST --enable mbmt
mbmt : enabled
# virsh perf --domain $GUEST
cmt : disabled
mbmt : disabled
mbml : disabled
......
Use virDomainObjGetOneDefState instead of virDomainObjGetOneDef to fix
the issue. After patch, The perf event status of a shutoff domain is
reported correctly:
# virsh domstate $GUEST
shut off
# virsh perf --domain $GUEST
cmt : disabled
mbmt : disabled
mbml : disabled
......
# virsh perf --domain $GUEST --enable mbmt
mbmt : enabled
# virsh perf --domain $GUEST
cmt : disabled
mbmt : enabled
mbml : disabled
......
Signed-off-by: Lin Ma <lma@suse.de>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
It requires a guest agent configured and running in the domain's guest
OS, So check qemu agent during qemuDomainPMSuspendForDuration().
Signed-off-by: Lin Ma <lma@suse.de>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
When the guest agent isn't running, we still report success on a PM
suspend action even though we logged an error correctly, this is because
we poisoned the 'ret' value a few lines above.
Fixes: a663a86081
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In 8e1804f9f6 I've tried to fix the following use case: domain
is started with path to UEFI only and relies on libvirt to figure
out corresponding NVRAM template to create a per-domain copy
from. The fix consisted of having a check tailored exactly for
this use case and if it's hit then using FW autoselection to
figure it out. Unfortunately, the NVRAM template is not saved in
the inactive XML (well, the domain might be transient anyway).
Then, as a part of that check we see whether the per-domain copy
doesn't exist already and if it does then no template is looked
up hence no template will appear in the live XML.
This works, until the domain is migrated. At the destination, the
per-domain copy will not exist so we need to know the template to
create the per-domain copy from. But we don't even get to the
check because we are not starting a fresh new domain and thus the
qemuFirmwareFillDomain() function quits early.
The solution is to switch order of these two checks. That is
evaluate the check for the old style before checking flags.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852910
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
When starting a domain, qemuProcessLaunch() iterates over all
VIR_PERF_EVENT_* values and (possibly) enables them. While there
is nothing wrong with the code, the for loop where it's done makes
it harder to jump onto next block of code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Propagate the cluster size from the original image as the user might
have configured a custom cluster size for performance reasons. Propagate
the cluster size of a qcow2 image to the new overlay or copy.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
'blockdev-create' allows us to create the image with a custom cluster
size if we wish to. Wire it up for 'qcow2'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Configuring the cluster size of an image may have performance
implications. This patch allows us to detect cluster size for existing
images so that we will be able to propagate it to new images which are
based on existing images e.g. during snapshots/block-copy/etc.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
When building and populating domain NS a couple of functions are
called that append paths to a string list. This string list is
then inspected, one item at the time by
qemuNamespacePrepareOneItem() which gathers all the info for
given path (stat buffer, possible link target, ACLs, SELinux
label) using qemuNamespaceMknodItemInit(). If the path needs to
be created in the domain's private /dev then it's added onto this
qemuNamespaceMknodData list which is freed later in the process.
But, if the path does not need to be created in the domain's
private /dev, then the memory allocated by
qemuNamespaceMknodItemInit() is not freed anywhere leading to a
leak.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is just a clean-up of commit 3791f29b08 using the new parameter of
virProcessSetAffinity() introduced in commit 9514e24984 so that there is
no error reported in the logs.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In the past we had to declare @cfg and then explicitly unref it.
But now, with glib we can use g_autoptr() which will do the unref
automatically and thus is more bulletproof.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
In the qemuInterfacePrepareSlirp() function, the qemu driver
config is obtained (via virQEMUDriverGetConfig()), but it is
never unrefed leading to mangled refcounter.
Fixes: 9145b3f1cc
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
On shutdown we just stop accepting new jobs for worker thread so that on
shutdown wait we can exit worker thread faster. Yes we basically stop
processing of events for VMs but we are going to do so anyway in case of daemon
shutdown.
At the same time synchronous event processing that some API calls may require
are still possible as per VM event loop is still running and we don't need
worker thread for synchronous event processing.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
We are dropping the only reference here so that the event loop thread
is going to be exited synchronously. In order to avoid deadlocks we
need to unlock the VM so that any handler being called can finish
execution and thus even loop thread be finished too.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This hunk was introduced in [1] in order to avoid loosing
events from monitor on stopping qemu process. But as explained
in [2] on destroy we won't get neither EOF nor any other
events as monitor is just closed. In case of crash/shutdown
we won't get any more events as well and qemuDomainObjStopWorker
will be called by qemuProcessStop eventually. Thus let's
remove qemuDomainObjStopWorker from qemuProcessHandleMonitorEOF
as it is not useful anymore.
[1] e6afacb0f: qemu: start/stop an event loop thread for domains
[2] d2954c072: qemu: ensure domain event thread is always stopped
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This allows:
a) migration without access to network
b) complete control of the migration stream
c) easy migration between containerised libvirt daemons on the same host
Resolves: https://bugzilla.redhat.com/1638889
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Consider a host with 8 CPUs. There are the following possible scenarios
1. Bare metal; libvirtd has affinity of 8 CPUs; QEMU should get 8 CPUs
2. Bare metal; libvirtd has affinity of 2 CPUs; QEMU should get 8 CPUs
3. Container has affinity of 8 CPUs; libvirtd has affinity of 8 CPus;
QEMU should get 8 CPUs
4. Container has affinity of 8 CPUs; libvirtd has affinity of 2 CPus;
QEMU should get 8 CPUs
5. Container has affinity of 4 CPUs; libvirtd has affinity of 4 CPus;
QEMU should get 4 CPUs
6. Container has affinity of 4 CPUs; libvirtd has affinity of 2 CPus;
QEMU should get 4 CPUs
Scenarios 1 & 2 always work unless systemd restricted libvirtd privs.
Scenario 3 works because libvirt checks current affinity first and
skips the sched_setaffinity call, avoiding the SYS_NICE issue
Scenario 4 works only if CAP_SYS_NICE is availalbe
Scenarios 5 & 6 works only if CAP_SYS_NICE is present *AND* the cgroups
cpuset is not set on the container.
If libvirt blindly ignores the sched_setaffinity failure, then scenarios
4, 5 and 6 should all work, but with caveat in case 4 and 6, that
QEMU will only get 2 CPUs instead of the possible 8 and 4 respectively.
This is still better than failing.
Therefore libvirt can blindly ignore the setaffinity failure, but *ONLY*
ignore it when there was no affinity specified in the XML config.
If user specified affinity explicitly, libvirt must report an error if
it can't be honoured.
Resolves: https://bugzilla.redhat.com/1819801
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Adds new typed param for migration and uses this as a UNIX socket path that
should be used for the NBD part of migration. And also adds virsh support.
Partially resolves: https://bugzilla.redhat.com/1638889
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Clean up the semantics by using one extra self-describing variable.
This also fixes the port allocation when the port is specified.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Instead of saving some data from a union up front and changing an overlayed
struct before using said data, let's just set the new values after they are
decided. This will increase the readability of future commit(s).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
In 6.7.0 release I've changed how domain namespace is built and
populated. Previously it used to be done from a pre-exec hook
(ran in the forked off child, just before dropping all privileges
and exec()-ing QEMU), which not only meant we had to have two
different code paths for creating a node in domain's namespace
(one for this pre-exec hook, the other for hotplug ran from the
daemon), it also proved problematic because it was leaking FDs
into QEMU process.
To mitigate this problem, we've not only ditched libdevmapper
from the NS population process, I've also dropped the pre-exec
code and let the NS be populated from the daemon (using the
hotplug code). But, I was not careful when doing so, because the
pre-exec code was tolerant to files that doesn't exist, while
this new code isn't. For instance, the very first thing that is
done when the new NS is created is it's populated with
@defaultDeviceACL which contain files like /dev/null, /dev/zero,
/dev/random and /dev/kvm (and others). While the rest will
probably exist every time, /dev/kvm might not and thus the new
code I wrote has to be tolerant to that.
Of course, users can override the @defaultDeviceACL (by setting
cgroup_device_acl in qemu.conf) and remove /dev/kvm (which is
acceptable workaround), but we definitely want libvirt to work
out of the box even on hosts without KVM.
Fixes: 9048dc4e62
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since QEMU 1.5.3, the ib700 watchdog device has no options for address,
and not address in device tree:
$ /usr/libexec/qemu-kvm -version
QEMU emulator version 1.5.3 (qemu-kvm-1.5.3-175.el7), Copyright (c) 2003-2008 Fabrice Bellard
$ /usr/libexec/qemu-kvm -device ib700,\?
$ virsh qemu-monitor-command seabios --hmp info qtree|grep ib700 -A 2
dev: ib700, id "watchdog0"
dev: isa-serial, id "serial0"
index = 0
So only allow it to use none address.
Fixes: 8a54cc1d08
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1509908
Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>