AArch64 qemu behaves much like armv7l, e.g. in its use of mmio.
This patch adds to aarch64 the same bypass checks we already have for
armv7l: enabling the mmio transport for Nicdev, setting addDefaultUSB
and addDefaultMemballoon to false, and so on.
V3:
- Added the missing domain rng schema for aarch64 and a test case in
testutilsqemu.c, the lack of which was causing a test suite failure
when running make check.
V2:
- Added a test case to qemuxml2argvtest, as suggested in the review
comments on V1.
V1:
- Initial patch.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
https://bugzilla.redhat.com/show_bug.cgi?id=1047234
Add a range check for the supported numa memory placement modes
provided by the user before setting them in the domain definition.
Without the check the user is able to provide a (yet) unknown mode,
which is then stored in the domain definition. This potentially causes
a NULL dereference when the definition is formatted into XML.
To reproduce run:
virsh numatune DOMNAME --mode 6 --nodeset 0
The XML will then contain:
<numatune>
  <memory mode='(null)' nodeset='0'/>
</numatune>
With this fix, the command fails:
error: Unable to change numa parameters
error: invalid argument: unsupported numa_mode: '6'
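For illustration, a minimal sketch of the kind of range check involved
(the enum and function names here are made up, not the actual libvirt
internals): reject any value outside the known modes before it ever
reaches the definition.

#include <stdio.h>

/* Illustrative enum mirroring the supported numa memory modes; the
 * real libvirt enum and formatter use different names. */
typedef enum {
    MEM_MODE_STRICT = 0,
    MEM_MODE_PREFERRED,
    MEM_MODE_INTERLEAVE,
    MEM_MODE_LAST          /* always one past the last valid value */
} memMode;

static int
setNumaMode(int requested, memMode *out)
{
    /* Reject anything outside the known range instead of storing it;
     * formatting an unknown value would otherwise yield a NULL string. */
    if (requested < 0 || requested >= MEM_MODE_LAST) {
        fprintf(stderr, "unsupported numa_mode: '%d'\n", requested);
        return -1;
    }
    *out = requested;
    return 0;
}

int main(void)
{
    memMode mode;

    return setNumaMode(6, &mode) < 0 ? 1 : 0;   /* mode 6 is rejected */
}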
Add whitespace to separate logical code blocks, reformat error messages
and clean up code flow.
This patch changes the error handling in some cases: where the loop
would previously have continued, it now jumps to cleanup and errors
out rather than modifying the domain any further.
Do not leave the PCI address of the primary video card set
to the legacy default (0000:00:02.0) if we're doing two-pass
allocation.
Since QEMU 1.6 (QEMU_CAPS_VIDEO_PRIMARY) we allow the primary
video card to be on other slots than 0000:00:02.0 (as we use
-device instead of -vga).
However we fail to assign it an address if:
* another device explicitly uses 0000:00:02.0 and
* the primary video device has no address specified
On the first pass, we set the address to the default, then checked
whether it was available, but left it set even when it wasn't. This
address got picked up by the second pass, resulting in a conflict:
XML error: Attempted double use of PCI slot 0000:00:02.0
(may need "multifunction='on'" for device on function 0)
Also fix the test that was supposed to catch this.
This eliminates the misleading error message that was being logged
when a vfio hostdev hotplug failed:
error: unable to set user and group to '107:107' on '/dev/vfio/22':
No such file or directory
as documented in:
https://bugzilla.redhat.com/show_bug.cgi?id=1035490
Commit ee414b5d (pushed as a fix for Bug 1016511 and part of Bug
1025108) replaced the single call to
virSecurityManagerSetHostdevLabel() in qemuDomainAttachHostDevice()
with individual calls to that same function in each
device-type-specific attach function (for PCI, USB, and SCSI). It also
added a corresponding call to virSecurityManagerRestoreHostdevLabel()
in the error handling of the device-type-specific functions, but
forgot to remove the common call to that from
qemuDomainAttachHostDevice() - this resulted in a duplicate call to
virSecurityManagerRestoreHostdevLabel(), with the second occurrence
being after (e.g.) a PCI device has already been re-attached to the
host driver, thus destroying some of the device nodes / links that we
then attempted to re-label (e.g. /dev/vfio/22) and generating an error
log that obscured the original error.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1035490
virProcessSetMaxMemLock() (which is a wrapper over prlimit(3)) expects
the memory size in bytes, but libvirt's domain definition (which was
being used by qemuDomainAttachHostPciDevice()) stores all memory
tuning parameters in KiB. This was being accounted for when setting
MaxMemLock at domain startup time (so cold-plugged devices would
work), but not for hotplug.
This patch simplifies the few lines that call
virProcessSetMaxMemLock() and multiplies the amount by 1024 so that
we're locking the correct amount of memory.
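For illustration, a rough sketch of the unit handling, using prlimit()
directly since virProcessSetMaxMemLock() is described above as a
wrapper around it (the helper name below is made up):

#define _GNU_SOURCE
#include <sys/resource.h>
#include <sys/types.h>
#include <stdio.h>

/* The domain definition stores memory in KiB, but RLIMIT_MEMLOCK is
 * expressed in bytes, so the value must be scaled before the call. */
static int
setMaxMemLockKiB(pid_t pid, unsigned long long kib)
{
    struct rlimit rlim;

    rlim.rlim_cur = rlim.rlim_max = kib * 1024ULL;  /* KiB -> bytes */
    if (prlimit(pid, RLIMIT_MEMLOCK, &rlim, NULL) < 0) {
        perror("prlimit");
        return -1;
    }
    return 0;
}

int main(void)
{
    /* lock limit for the current process: 1 GiB expressed in KiB */
    return setMaxMemLockKiB(0, 1024 * 1024) < 0 ? 1 : 0;
}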
What remains a mystery to me is why hot-plug of a managed='no' device
would succeed (at least on my system) while managed='yes' would
fail. I guess in one case the memory was coincidentally already
resident and in the other it wasn't.
On a system that is enforcing FIPS, most libraries honor the
current mode by default. Qemu, on the other hand, refuses to
honor FIPS mode unless you add the '-enable-fips' command
line option; worse, this option is not discoverable via QMP,
and is only present on binaries built for Linux. So, if we
detect FIPS mode, then we unconditionally ask for FIPS; either
qemu is new enough to have the option and then correctly
cripple insecure VNC passwords, or it is so old that we are
correctly avoiding a FIPS violation by preventing qemu from
starting. Meanwhile, if we don't detect FIPS mode, then
omitting the argument is safe whether the qemu has the option
(but it would do nothing because FIPS is disabled) or whether
qemu lacks the option (including in the case where we are not
running on Linux).
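For reference, FIPS enforcement is exposed by the Linux kernel via
/proc/sys/crypto/fips_enabled; a minimal detection sketch (the
function name is illustrative):

#include <stdio.h>
#include <stdbool.h>

/* Returns true when the kernel reports that FIPS mode is enforced.
 * A missing file or a value other than '1' means FIPS is not enabled. */
static bool
fipsModeEnabled(void)
{
    FILE *fp = fopen("/proc/sys/crypto/fips_enabled", "r");
    int c;

    if (!fp)
        return false;
    c = fgetc(fp);
    fclose(fp);
    return c == '1';
}

int main(void)
{
    printf("FIPS mode: %s\n", fipsModeEnabled() ? "enabled" : "disabled");
    return 0;
}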
The testsuite was a bit interesting: we don't want our test
to depend on whether it is being run in FIPS mode, so I had
to tweak things to set the capability bit outside of our
normal interaction with capability parsing.
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1035474
* src/qemu/qemu_capabilities.h (QEMU_CAPS_ENABLE_FIPS): New bit.
* src/qemu/qemu_capabilities.c (virQEMUCapsInitQMP): Conditionally
set capability according to detection of FIPS mode.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Use it.
* tests/qemucapabilitiestest.c (testQemuCaps): Conditionally set
capability to test expected output.
* tests/qemucapabilitiesdata/caps_1.2.2-1.caps: Update list.
* tests/qemucapabilitiesdata/caps_1.6.0-1.caps: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The support for <boot rebootTimeout="12345"/> was added before we
started checking qemu command line options via QMP, so virQEMUCaps was
never properly adapted for that case, and we thus report the option as
unsupported even with new enough qemu.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1042690
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Recent changes to events (commit 8a29ffcf) resulted in new compile
failures on some targets (such as ARM OMAP5):
conf/domain_event.c: In function 'virDomainEventDispatchDefaultFunc':
conf/domain_event.c:1198:30: error: cast increases required alignment of
target type [-Werror=cast-align]
conf/domain_event.c:1314:34: error: cast increases required alignment of
target type [-Werror=cast-align]
cc1: all warnings being treated as errors
The error is due to alignment; the base class is merely aligned
to the worst of 'int' and 'void*', while the child class must
be aligned to a 'long long'. The solution is to include a
'long long' (and for good measure, a function pointer) in the
base class to ensure correct alignment regardless of what a
child class may add, but to wrap the inclusion in a union so
as to not incur any wasted space. On a typical x86_64 platform,
the base class remains 16 bytes; on i686, the base class remains
12 bytes; and on the impacted ARM platform, the base class grows
from 12 bytes to 16 bytes due to the increase of alignment from
4 to 8 bytes.
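A self-contained sketch of the technique (the struct names are
illustrative, not the exact libvirt definitions): the base object's
fields are wrapped in a union whose extra members exist only to force
worst-case alignment, so a child class can never have stricter
alignment than its parent.

#include <stdio.h>

typedef void (*disposeFunc)(void *obj);

/* Base class: the real fields live in 's'; the other union members
 * only force the alignment of the whole base object. */
typedef struct {
    union {
        struct {
            unsigned int magic;
            int refs;
        } s;
        long long dummy_align1;     /* worst-case integer alignment */
        disposeFunc dummy_align2;   /* worst-case pointer alignment */
    } u;
} baseObject;

/* Child class embedding the base; its long long member no longer has
 * stricter alignment than the parent, so base -> child casts are safe. */
typedef struct {
    baseObject parent;
    long long timestamp;
} childObject;

int main(void)
{
    printf("base:  size=%zu align=%zu\n",
           sizeof(baseObject), _Alignof(baseObject));
    printf("child: size=%zu align=%zu\n",
           sizeof(childObject), _Alignof(childObject));
    return 0;
}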
Reported by Michele Paolino and others.
* src/util/virobject.h (_virObject): Use a union to ensure that
subclasses never have stricter alignment than the parent.
* src/util/virobject.c (virObjectNew, virObjectUnref)
(virObjectRef): Adjust clients.
* src/libvirt.c (virConnectRef, virDomainRef, virNetworkRef)
(virInterfaceRef, virStoragePoolRef, virStorageVolRef)
(virNodeDeviceRef, virSecretRef, virStreamRef, virNWFilterRef)
(virDomainSnapshotRef): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorOpenInternal)
(qemuMonitorClose): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Map the new <panic> device in XML to the '-device pvpanic' command
line of qemu. Clients can then couple the <panic> device and the
<on_crash> directive to control behavior when the guest reports
a panic to qemu.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1035955
There's a window when starting a qemu process between fork() and exec()
during which we are doing things that may fail but not tunnelling the
error to the daemon. This is basically all within qemuProcessHook().
So whenever something fails, e.g. placing the process onto a numa
node, users are left with:
error: Child quit during startup handshake: Input/output error
while the original error is thrown into the domain log:
libvirt: error : internal error: NUMA memory tuning in 'preferred'
mode only supports single node
Hence, we should read the log file and search for the error message and
report it to users.
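A rough sketch of the idea (the path and the matching heuristic are
illustrative, not the actual libvirt helper): scan the domain log for
the last error line and report that instead of the generic handshake
failure.

#include <stdio.h>
#include <string.h>

/* Remember the last line in the domain log that looks like an error,
 * so it can be reported instead of the generic handshake message. */
static int
lastErrorFromLog(const char *path, char *out, size_t outlen)
{
    FILE *fp = fopen(path, "r");
    char line[1024];
    int found = 0;

    if (!fp)
        return -1;
    while (fgets(line, sizeof(line), fp)) {
        if (strstr(line, "error")) {
            snprintf(out, outlen, "%s", line);
            found = 1;
        }
    }
    fclose(fp);
    return found ? 0 : -1;
}

int main(void)
{
    char msg[1024];

    if (lastErrorFromLog("/var/log/libvirt/qemu/example.log",
                         msg, sizeof(msg)) == 0)
        fprintf(stderr, "startup failed: %s", msg);
    return 0;
}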
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For dead domains that have no memtune limits, we returned 0 instead of
"unlimited"; this patch fixes it to return PARAM_UNLIMITED.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We were unconditionally removing the device from the host list, when it
should only be done on error.
This fixes USB collision detection when hotplugging the same device to
two guests.
If we hit a collision, we free the USB device while it is still part
of our temporary USBDeviceList. When the list is unref'd, the device
is free'd again.
Make the initial device freeing dependent on whether it is present
in the temporary list or not.
Similar to what Jiri did for cgroup setup/teardown in 05e149f94, push
it all into the device handler functions so we can do the necessary prep
work before claiming the device.
This also fixes hotplugging USB devices by product/vendor (virt-manager's
default behavior):
https://bugzilla.redhat.com/show_bug.cgi?id=1016511
https://bugzilla.redhat.com/show_bug.cgi?id=1035108
When attempting to enable more vCPUs in the guest than are currently
enabled, but fewer than the maximum count for the VM, we reported an
unhelpful message:
error: internal error: guest agent reports less cpu than requested
This patch changes it to:
error: invalid argument: requested vcpu count is greater than the count
of enabled vcpus in the domain: 3 > 2
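A minimal sketch of the improved check (the names are illustrative;
the real code compares the request against what the guest agent
reports):

#include <stdio.h>

/* 'requested' is what the user asked for, 'enabled' is the vcpu count
 * the guest agent currently reports. */
static int
checkAgentVcpuRequest(unsigned int requested, unsigned int enabled)
{
    /* Asking for more vcpus than the agent can see is an invalid
     * argument, not an internal error. */
    if (requested > enabled) {
        fprintf(stderr,
                "requested vcpu count is greater than the count of "
                "enabled vcpus in the domain: %u > %u\n",
                requested, enabled);
        return -1;
    }
    return 0;
}

int main(void)
{
    return checkAgentVcpuRequest(3, 2) < 0 ? 1 : 0;
}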
When an error occurs in qemuAgentIO, it is saved in mon->lastError,
but it is never freed afterwards. Present since commit c160ce33;
compare with commit 9cc8a5af, which fixed the same problem in
qemu_monitor.c.
==22219== 54 bytes in 1 blocks are definitely lost in loss record 982 of 1,379
==22219== at 0x4C26B9B: malloc (vg_replace_malloc.c:263)
==22219== by 0x8520521: strdup (in /lib64/libc-2.11.3.so)
==22219== by 0x52E99CB: virStrdup (virstring.c:554)
==22219== by 0x52B44C4: virCopyError (virerror.c:195)
==22219== by 0x52B5123: virCopyLastError (virerror.c:312)
==22219== by 0x10905877: qemuAgentIO (qemu_agent.c:660)
==22219== by 0x52B6122: virEventPollDispatchHandles (vireventpoll.c:501)
==22219== by 0x52B7AEA: virEventPollRunOnce (vireventpoll.c:647)
==22219== by 0x52B5C1B: virEventRunDefaultImpl (virevent.c:274)
==22219== by 0x54181FD: virNetServerRun (virnetserver.c:1112)
==22219== by 0x11EF4D: main (libvirtd.c:1513)
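The fix boils down to releasing the copied error again. A hedged
sketch of the pattern using the public error API, where
virCopyLastError() duplicates the strings and virResetError() releases
them:

#include <libvirt/libvirt.h>
#include <libvirt/virterror.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    virError err;

    memset(&err, 0, sizeof(err));

    /* Provoke an error so there is something to copy. */
    if (virConnectOpen("bogus:///nowhere") != NULL)
        return 0;

    /* virCopyLastError() strdup()s the message strings into 'err'... */
    if (virCopyLastError(&err) > 0 && err.message)
        fprintf(stderr, "saved error: %s\n", err.message);

    /* ...so they must be released again, or they leak, which is
     * exactly what valgrind flagged in qemuAgentIO. */
    virResetError(&err);
    return 0;
}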
Signed-off-by: Zhou Yimin <zhouyimin@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
This patch fixes memory leaks, introduced in commit 0df53f04, that
valgrind reports when running qemuxml2argvtest.
Most of them are of the form:
==24777== 15 bytes in 1 blocks are definitely lost in loss record 39 of 129
==24777== at 0x4A0887C: malloc (vg_replace_malloc.c:270)
==24777== by 0x341F485E21: strdup (strdup.c:42)
==24777== by 0x4CADE5F: virStrdup (virstring.c:554)
==24777== by 0x4362B6: qemuBuildDriveStr (qemu_command.c:3848)
==24777== by 0x43EF73: qemuBuildCommandLine (qemu_command.c:8500)
==24777== by 0x426670: testCompareXMLToArgvHelper (qemuxml2argvtest.c:350)
==24777== by 0x427C01: virtTestRun (testutils.c:138)
==24777== by 0x41DDB5: mymain (qemuxml2argvtest.c:658)
==24777== by 0x4282A2: virtTestMain (testutils.c:593)
==24777== by 0x341F421A04: (below main) (libc-start.c:225)
==24777==
Signed-off-by: Eric Blake <eblake@redhat.com>
Ever since the sub-cpusets (vcpu, emulator) were introduced, the
parent cpuset cannot be modified to remove nodes that are in use by
the sub-cpusets.
The fix is to break the memory node modification into three steps
(sketched below):
1. assign new nodes into the parent,
2. change the nodes in the child nodes,
3. remove the old nodes on the parent node.
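A rough sketch of the three steps against the cgroup cpuset files (the
cgroup paths are illustrative and error handling is trimmed):

#include <stdio.h>

/* Write a nodeset string such as "0-1" into a cpuset.mems file. */
static int
writeMems(const char *path, const char *nodes)
{
    FILE *fp = fopen(path, "w");
    int ok;

    if (!fp)
        return -1;
    ok = fprintf(fp, "%s\n", nodes) >= 0;
    if (fclose(fp) == EOF)
        ok = 0;
    return ok ? 0 : -1;
}

int main(void)
{
    /* Illustrative paths for the parent cgroup and its vcpu/emulator
     * children; switching the domain from node 0 to node 1. */
    const char *parent   = "/sys/fs/cgroup/cpuset/machine/vm1/cpuset.mems";
    const char *vcpu     = "/sys/fs/cgroup/cpuset/machine/vm1/vcpu0/cpuset.mems";
    const char *emulator = "/sys/fs/cgroup/cpuset/machine/vm1/emulator/cpuset.mems";

    /* 1. widen the parent to the union of old and new nodes */
    if (writeMems(parent, "0-1") < 0)
        return 1;
    /* 2. move the children onto the new nodes */
    if (writeMems(vcpu, "1") < 0 || writeMems(emulator, "1") < 0)
        return 1;
    /* 3. shrink the parent back to the new nodes only */
    return writeMems(parent, "1") < 0 ? 1 : 0;
}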
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1009880
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1029732
The BZ asked for the capability to change the number of queues used by
a virtio-net device while the device is in use. Because the number of
queues can only be set at the time the device is created, that isn't
possible. However, libvirt also shouldn't be silently reporting
success when someone tries to change the number of queues. So this
patch flags that as an error (just as attempts to change any of the
other virtio-specific parameters already do).
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=888635
(which was already closed as CANTFIX because the qemu "-boot strict"
commandline option wasn't available at the time).
Problem: you couldn't have a domain that used PXE to boot, but also
had an un-bootable disk device *even if that disk wasn't listed in the
boot order*, because if PXE timed out (e.g. due to the bridge
forwarding delay), the BIOS would move on to the next target, which
would be the unbootable disk device (again - even though it wasn't
given a boot order), and get stuck at a "BOOT DISK FAILURE, PRESS ANY
KEY" message until a user intervened.
The solution, available since sometime around QEMU 1.5, is to add
"-boot strict=on" to *every* qemu command. When this is done, if any
devices have a boot order specified, then QEMU will *only* attempt to
boot from those devices that have an explicit boot order, ignoring the
rest.
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1035188
Commit f094aaac48 changed the PCI device assignment in qemu domains
to default to using VFIO rather than legacy KVM device assignment
(when VFIO is available). It didn't change which driver was used by
default for virNodeDeviceDetachFlags(), though, so that API (and the
virsh nodedev-detach command) was still binding to the pci-stub
driver, used by legacy KVM assignment, by default.
This patch publicizes (only within the qemu module, though, so no
additions to the symbol exports are needed) the functions that check
for presence of KVM and VFIO device assignment, then uses those
functions to decide what to do when no driver is specified for
virNodeDeviceDetachFlags(); if the vfio driver is loaded, the device
will be bound to vfio-pci, or if legacy KVM assignment is supported on
this system, the device will be bound to pci-stub; if neither method
is available, the detach will fail.
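For callers that do not want to rely on the default, the driver can
still be chosen explicitly through the public API; a minimal sketch
(the device name is a placeholder):

#include <libvirt/libvirt.h>
#include <stdio.h>

int main(void)
{
    virConnectPtr conn = virConnectOpen("qemu:///system");
    virNodeDevicePtr dev;
    int ret = 1;

    if (!conn)
        return 1;

    /* Placeholder name; pick one from 'virsh nodedev-list --cap pci'. */
    dev = virNodeDeviceLookupByName(conn, "pci_0000_03_00_0");
    if (dev) {
        /* Passing a driver name such as "vfio" overrides the default
         * choice; passing NULL lets libvirt pick, which this patch
         * changes to prefer vfio-pci when it is available. */
        if (virNodeDeviceDetachFlags(dev, "vfio", 0) == 0)
            ret = 0;
        virNodeDeviceFree(dev);
    }
    virConnectClose(conn);
    return ret;
}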
The snapshot code currently does not check whether it actually
supports snapshots on the various disk backends used by a domain. To
avoid future problems, add checkers that whitelist the supported
configurations.
This patch adds a function, qemuGetDriveSourceString, to produce
qemu-compatible disk source strings, enabling the code to be reused,
and refactors the building of the qemu command line for disks to use
this new helper.
Automatically assign the secret type from the disk source definition
and pull the addition of the comma into the helper, then update
callers so that the generated output stays the same.
Before this patch, the translation function still needed a second ugly
helper function to actually format the command line for qemu. If we do
the right things in the translation function, we don't have to bother
with the second function any more.
This patch removes the messy qemuBuildVolumeString function and changes
qemuTranslateDiskSourcePool to set stuff up correctly so that the
regular code paths meant for volumes can be used to format the command
line correctly.
For this purpose a new helper, qemuDiskGetActualType(), is introduced
to return the type of the volume in a pool.
As part of the refactor, the qemuTranslateDiskSourcePool function is
fixed to base its decisions on the pool type instead of the volume
type. This separates pool-type-specific code more clearly and will
ease the addition of other pool types that require different
operations to obtain the correct pool source.
The previously fixed tests should make sure that we don't break stuff
that was working before.
When doing an internal snapshot on a VM with sheepdog or RBD disks,
we would not set the flag marking that the domain is using internal
snapshots and might end up creating a mixed snapshot. Move the setting
of the variable to avoid this problem.
The virsh command 'domxml-to-native' (virConnectDomainXMLToNative())
converts all network devices to "type='ethernet'" in order to make it
more likely that the generated command could be run directly from a
shell (other libvirt network device types end up referencing file
descriptors for tap devices assumed to have been created by libvirt,
which can't be done in this case).
During this conversion, all of the netdev parameters are cleared out,
then specific items are filled in after changing the type. The MAC
address was not one of these preserved items, and the result was that
mac addresses in the generated commandlines were always
00:00:00:00:00:00.
This patch saves the mac address before the conversion, then
repopulates it afterwards, so the proper mac addresses show up in the
commandline.
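A self-contained sketch of the save/restore pattern (the types and
names are illustrative, not the actual libvirt structures):

#include <stdio.h>
#include <string.h>

#define MAC_LEN 6

/* Illustrative stand-in for the netdev definition that gets wiped and
 * repopulated during the type='ethernet' conversion. */
typedef struct {
    int type;
    unsigned char mac[MAC_LEN];
    /* ... other backend-specific fields ... */
} netDef;

static void
convertToEthernet(netDef *net)
{
    unsigned char mac[MAC_LEN];

    /* Save the MAC before the definition is cleared... */
    memcpy(mac, net->mac, MAC_LEN);

    memset(net, 0, sizeof(*net));   /* the conversion wipes everything */
    net->type = 1;                  /* pretend 1 == 'ethernet' */

    /* ...and put it back afterwards so the generated command line does
     * not end up with 00:00:00:00:00:00. */
    memcpy(net->mac, mac, MAC_LEN);
}

int main(void)
{
    netDef net = { 0, { 0x52, 0x54, 0x00, 0x12, 0x34, 0x56 } };

    convertToEthernet(&net);
    printf("%02x:%02x:%02x:%02x:%02x:%02x\n",
           net.mac[0], net.mac[1], net.mac[2],
           net.mac[3], net.mac[4], net.mac[5]);
    return 0;
}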
Signed-off-by: Bing Bu Cao <mars@linux.vnet.ibm.com>
Signed-off-by: Laine Stump <laine@laine.org>
In the 'directory' and 'netfs' storage pools, a user can see
both 'file' and 'dir' storage volume types, to know when they
can descend into a subdirectory. But in a network-based storage
pool, such as the upcoming 'gluster' pool, we use 'network'
instead of 'file', and did not have any counterpart for a
directory until this patch. Adding a new volume type
'network-dir' is better than reusing 'dir', because it makes
it clear that the only way to access 'network' volumes within
that container is through the network mounting (leaving 'dir'
for something accessible in the local file system).
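From the client side the new value shows up through the usual volume
info call; a minimal sketch (the pool and volume names are
placeholders):

#include <libvirt/libvirt.h>
#include <stdio.h>

int main(void)
{
    virConnectPtr conn = virConnectOpen("qemu:///system");
    virStoragePoolPtr pool = NULL;
    virStorageVolPtr vol = NULL;
    virStorageVolInfo info;
    int ret = 1;

    if (!conn)
        return 1;

    /* Placeholder pool/volume names for a network-backed (e.g. gluster)
     * pool containing a subdirectory. */
    if ((pool = virStoragePoolLookupByName(conn, "gluster-pool")) &&
        (vol = virStorageVolLookupByName(pool, "some-subdir")) &&
        virStorageVolGetInfo(vol, &info) == 0) {
        if (info.type == VIR_STORAGE_VOL_NETDIR)
            printf("volume is a directory inside a network pool\n");
        ret = 0;
    }

    if (vol)
        virStorageVolFree(vol);
    if (pool)
        virStoragePoolFree(pool);
    virConnectClose(conn);
    return ret;
}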
* include/libvirt/libvirt.h.in (virStorageVolType): Expand enum.
* docs/formatstorage.html.in: Document it.
* docs/schemas/storagevol.rng (vol): Allow new value.
* src/conf/storage_conf.c (virStorageVol): Use new value.
* src/qemu/qemu_command.c (qemuBuildVolumeString): Fix client.
* src/qemu/qemu_conf.c (qemuTranslateDiskSourcePool): Likewise.
* tools/virsh-volume.c (vshVolumeTypeToString): Likewise.
* src/storage/storage_backend_fs.c
(virStorageBackendFileSystemVolDelete): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Since bus type IDE is enum value zero, the bus type on pseries systems
appears as IDE for all -hda/-cdrom options and for disk drives with
if='none'. The pseries platform needs these to appear as SCSI instead
of IDE. As IDE is not supported there, explicit requests for IDE
devices will return an error.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
This nested job is canceled by the first ExitMonitor call (even though
it was not created by the corresponding EnterMonitor call), and
again in qemuMigrationPrepareAny if qemuProcessStart failed.
This can lead to a crash if the vm object was disposed of before calling
qemuDomainRemoveInactive:
0 ..62bc in virClassIsDerivedFrom (klass=0xdeadbeef,
parent=0x7ffce4cdd270) at util/virobject.c:166
1 ..6666 in virObjectIsClass at util/virobject.c:362
2 ..66b4 in virObjectLock at util/virobject.c:314
3 ..477e in virDomainObjListRemove at conf/domain_conf.c:2359
4 ..7a64 in qemuDomainRemoveInactive at qemu/qemu_domain.c:2087
5 ..956c in qemuMigrationPrepareAny at qemu/qemu_migration.c:2469
This was added by commit e4e2822, exposed by 5a4c237 and c7ac251.
https://bugzilla.redhat.com/show_bug.cgi?id=1018267
If a SCSI hostdev is included in an initial domain XML, without a
corresponding controller statement, one is created silently when the
guest is booted.
When hotplugging a SCSI hostdev, a presumption is that the controller
is already present in the domain either from the original XML, or via
an earlier hotplug.
[root@xxxxxxxx ~]# cat disk.xml
<hostdev mode='subsystem' type='scsi'>
  <source>
    <adapter name='scsi_host0'/>
    <address bus='0' target='3' unit='1088438288'/>
  </source>
</hostdev>
[root@xxxxxxxx ~]# virsh attach-device guest01 disk.xml
error: Failed to attach device from disk.xml
error: internal error: unable to execute QEMU command 'device_add': Bus 'scsi0.0' not found
Since the infrastructure is in place, we can also create a controller
silently for use by the hotplugged hostdev device.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>