The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
It's not desired to force users imagine path for a socket they
are not even supposed to connect to. On the other hand, we
already have a release where the qemu agent socket path is
exposed to XML, so we cannot silently drop it from there.
The new path is generated in form:
$LOCALSTATEDIR/lib/libvirt/qemu/channel/target/$domain.$name
for qemu system mode, and
$XDG_CONFIG_HOME/qemu/lib/channel/target/$domain.$name
for qemu session mode.
virPCIDeviceReattach and virPCIDeviceUnbindFromStub (called by
virPCIDeviceReattach) had previously required the name of the stub
driver as input. This is unnecessary, because the name of the driver
the device is currently bound to can be found by looking at the link:
/sys/bus/pci/dddd:bb:ss.ff/driver
Instead of requiring that the name of the expected stub driver name
and only unbinding if that one name is matched, we no longer take a
driver name in the arglist for either of these
functions. virPCIDeviceUnbindFromStub just compares the name of the
currently bound driver to a list of "well known" stubs (right now
contains "pci-stub" and "vfio-pci" for qemu, and "pciback" for xen),
and only performs the unbind if it's one of those devices.
This allows virsh nodedevice-reattach to work properly across a
libvirtd restart, and fixes a couple of cases where we were
erroneously still hard-coding "pci-stub" as the drive name.
For some unknown reason, virPCIDeviceReattach had been calling
modprobe on the stub driver prior to unbinding the device. This was
problematic because we no longer know the name of the stub driver in
that function. However, it is pointless to probe for the stub driver
at that time anyway - because the device is bound to the stub driver,
we are guaranteed that it is already loaded, and so that call to
modprobe has been removed.
For s390 we don't want to have a default USB device generated even
if QEMU is silently tolerating -usb on the command line. This may change
in the future.
Another reason to avoid the USB controller is that it implies a PCI
bus which might cause a regression at some later point in time.
The following change will set the USB controller model to 'none'
unless a model or address has been specified, which can be the case
if a legacy definition is loaded or the XML writer knows what
she/he's doing.
Requiring the user to explicitly disable USB on systems not supporting
it seems cumbersome.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Commit eca3fdf inadvertantly caused a failure to start for any domain
with the following in its config:
<graphics type='spice' autoport='yes'/>
The problem is that when tlsPort == 0 and defaultMode == "any" (which
is the default for defaultMode), this would be flagged in the code as
"needTLSPort", and if there was then no spice tls config, the new
error+fail would happen.
This patch checks for the case of defaultMode == "any", and in that
case simply doesn't allocate a TLS port (since that's probably not
what the user wanted, and it would have failed later anyway.). It does
leave the error in place for cases when the user specifically asked to
use tls in one way or another, though.
As a result of commit id '19c345f2', 'make -C tests valgrind' has the
following for qemuxml2argvtest:
==22482== 197 (80 direct, 117 indirect) bytes in 1 blocks are definitely lost in loss record 101 of 120
==22482== at 0x4A06B6F: calloc (vg_replace_malloc.c:593)
==22482== by 0x4C6F301: virAlloc (viralloc.c:124)
==22482== by 0x4C840FC: virSaveLastError (virerror.c:308)
==22482== by 0x431882: qemuBuildCommandLine (qemu_command.c:8204)
==22482== by 0x41E8F0: testCompareXMLToArgvHelper (qemuxml2argvtest.c:155)
==22482== by 0x41FE9F: virtTestRun (testutils.c:157)
==22482== by 0x419DEB: mymain (qemuxml2argvtest.c:654)
==22482== by 0x4204DA: virtTestMain (testutils.c:719)
==22482== by 0x39D0821A04: (below main) (libc-start.c:225)
==22482==
qemuBuildMemballoonDevStr returns NULL if memballoon doesn't have
the right address type, but it doesn't report an error, leading to:
error: An error occurred, but the cause is unknown
Report a helpful error message instead, e.g.:
error: XML error: memballoon unsupported with address type 'usb'
When a user requests auto-allocation of the spice TLS port but spice TLS
is disabled in qemu.conf, we start the machine and let qemu fail instead
of erroring out sooner.
Add an error message so that this doesn't happen.
The USB-specific cgroup setup had been inserted inline in
qemuDomainAttachHostUsbDevice and qemuSetupCgroup, but now there is a
common cgroup setup function called for all hostdevs, so it makes sens
to put the usb-specific setup there and just rely on that function
being called.
The one thing I'm uncertain of here (and a reason for not pushing
until after release) is that previously hostdev->missing was checked
only when starting a domain (and cgroup setup for the device skipped
if missing was true), but with this consolidation, it is now checked
in the case of hotplug as well. I don't know if this will have any
practical effect (does it make sense to hotplug a "missing" usb
device?)
PCIO device assignment using VFIO requires read/write access by the
qemu process to /dev/vfio/vfio, and /dev/vfio/nn, where "nn" is the
VFIO group number that the assigned device belongs to (and can be
found with the function virPCIDeviceGetVFIOGroupDev)
/dev/vfio/vfio can be accessible to any guest without danger
(according to vfio developers), so it is added to the static ACL.
The group device must be dynamically added to the cgroup ACL for each
vfio hostdev in two places:
1) for any devices in the persistent config when the domain is started
(done during qemuSetupCgroup())
2) at device attach time for any hotplug devices (done in
qemuDomainAttachHostDevice)
The group device must be removed from the ACL when a device it
"hot-unplugged" (in qemuDomainDetachHostDevice())
Note that USB devices are already doing their own cgroup setup and
teardown in the hostdev-usb specific function. I chose to make the new
functions generic and call them in a common location though. We can
then move the USB-specific code (which is duplicated in two locations)
to this single location. I'll be posting a followup patch to do that.
Don't reserve slot 2 for video if the machine has no PCI buses.
Error out when the user specifies a video device without
a PCI address when there are no PCI buses.
(This wouldn't work on a machine with no PCI bus anyway since
we do add PCI addresses for video devices to the command line)
In the past we automatically added a USB controller and assigned
it a PCI address (0:0:1.2) even on machines without a PCI bus.
This didn't break machines with no PCI bus because the command
line for it is just '-usb', with no mention of the PCI bus.
The implicit IDE controller (reserved address 0:0:1.1) has
no command line at all.
Commit b33eb0dc removed the ability to reserve PCI addresses
on machines without a PCI bus. This made them stop working,
since there would always be the implicit USB controller.
Skip the reservation of addresses for these controllers when
there is no PCI bus, instead of failing.
This isn't strictly speaking a bugfix, but I realized I'd gotten a bit
too verbose when I chose the names for
VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_*. This shortens them all a bit.
<source type='bridge'> uses a helper application to do the necessary
TUN/TAP setup to use an existing network bridge, thus letting
unprivileged users use TUN/TAP interfaces.
However, libvirt should be preventing QEMU from running any setuid
programs at all, which would include this helper program. From
a security POV, any setuid helper needs to be run by libvirtd itself,
not QEMU.
This is what this patch does. libvirt now invokes the setuid helper,
gets the TAP fd and then passes it to QEMU in the normal manner.
The path to the helper is specified in qemu.conf.
As a small advantage, this adds a <target dev='tap0'/> element to the
XML of an active domain using <interface type='bridge'>.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
VFIO requires all of the guest's memory and IO space to be lockable in
RAM. The domain's max_balloon is the maximum amount of memory the
domain can have (in KiB). We add a generous 1GiB to that for IO space
(still much better than KVM device assignment, where the KVM module
actually *ignores* the process limits and locks everything anyway),
and convert from KiB to bytes.
In the case of hotplug, we are changing the limit for the already
existing qemu process (prlimit() is used under the hood), and for
regular commandline additions of vfio devices, we schedule a call to
setrlimit() that will happen after the qemu process is forked.
These were previously being set in a custom hook function, but now
that virCommand directly supports setting them, we can eliminate that
part of the hook and call the APIs directly.
The differences from virNodeDeviceDettach are very minor:
1) Check that the flags are 0.
2) Set the virPCIDevice's stubDriver according to the driverName that
is passed in.
3) Call virPCIDeviceDetach with a NULL stubDriver, indicating it
should get the name of the stub driver from the virPCIDevice
object.
If the config for a device has specified <driver name='vfio'/>,
"backend" in the pci part of the hostdev object will be set to
..._VFIO. In this case, when creating a virPCIDevice set the
stubDriver to "vfio-pci", otherwise set it to "pci-stub". We will rely
on the lower levels to report an error if the vfio driver isn't
loaded.
The detach/attach functions in virpci.c will pay attention to the
stubDriver setting in the device, and bind/unbind the appropriate
driver when preparing hostdevs for the domain.
Note that we don't yet attempt to do anything to mark active any other
devices in the same vfio "group" as a single device that is being
marked active. We do need to do that, but in order to get basic VFIO
functionality testing sooner rather than later, initially we'll just
live with more cryptic errors when someone tries to do that.
The device option for vfio-pci is nearly identical to that for
pci-assign - only the configfd parameter isn't supported (or needed).
Checking for presence of the bootindex parameter is done separately
from constructing the commandline, similar to how it is done for
pci-assign.
This patch contains tests to check for proper commandline
construction. It also includes tests for parser-formatter-parser
roundtrips (xml2xml), because those tests use the same data files, and
would have failed had they been included before now.
qemu: xml/args tests for VFIO hostdev and <interface type='hostdev'/>
These should be squashed in with the patch that adds commandline
handling of vfio (they would fail at any earlier time).
There will soon be other items related to pci hostdevs that need to be
in the same part of the hostdevsubsys union as the pci address (which
is currently a single member called "pci". This patch replaces the
single member named pci with a struct named pci that contains a single
member named "addr".
QEMU_CAPS_DEVICE_VFIO_PCI is set if the device named "vfio-pci" is
supported in the qemu binary.
QEMU_CAPS_VFIO_PCI_BOOTINDEX is set if the vfio-pci device supports
the "bootindex" parameter; for some reason, the bootindex parameter
wasn't included in early versions of vfio support (qemu 1.4) so we
have to check for it separately from vfio itself.
Jim Fehlig reported on IRC that older gcc/glibc triggers this warning:
cc1: warnings being treated as errors
qemu/qemu_domain.c: In function 'qemuDomainDefFormatBuf':
qemu/qemu_domain.c:1297: error: declaration of 'remove' shadows a global declaration [-Wshadow]
/usr/include/stdio.h:157: error: shadowed declaration is here [-Wshadow]
make[3]: *** [libvirt_driver_qemu_impl_la-qemu_domain.lo] Error 1
Fix it like we have done in the past (such as commit 2e6322a).
* src/qemu/qemu_domain.c (qemuDomainDefFormatBuf): Avoid shadowing
a function name.
Signed-off-by: Eric Blake <eblake@redhat.com>
After 9d6e56db the syntax-check was unhappy due to wrong whitespacing:
src/qemu/qemu_command.c:1637: for ( ; a.slot < QEMU_PCI_ADDRESS_SLOT_LAST; a.slot++) {
maint.mk: incorrect whitespace around brackets, see HACKING for rules
make: *** [bracket-spacing-check] Error 1
After 78d7c3c5 we are strdup()-ing path to qemu-bridge-helper.
However, the check for its return value is missing. So it is
possible we've ignored the OOM error silently.
Add a "dry run" address allocation to figure out how many bridges
will be needed for all the devices without explicit addresses.
Auto-add just enough bridges to put all the devices on, or up to the
bridge with the largest specified index.
<controller type='pci' index='0' model='pci-root'/>
is auto-added to pc* machine types.
Without this controller PCI bus 0 is not available and
no PCI addresses are assigned by default.
Since older libvirt supported PCI bus 0 even without
this controller, it is removed from the XML when migrating.
Now we set the default disk driver name when parsing
the qemu command line too, hence all the test changes.
Assume format type is 'auto' when none is specified on
qemu command line.
Currently, if there has been an error in building command line
process after virtual interfaces has been created, the flow jumps
to 'error' label, where virDomainConfNWFilterTeardown() is
called. This may report an error as well, but should not
overwrite the original cause why we jumped to 'error' label.
Instead of making a choice between the underscore and camelCase, this
simply changes "num_queues" into "queues", which is also consistent
with Michal's multiple queue support for interface.
Improve error reporting and generating of SPICE command line arguments
according to the need to enable TLS. If TLS is disabled, there's no need
to pass the certificate dir to qemu.
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=953126
Ensure that all drivers implementing public APIs use a
naming convention for their implementation that matches
the public API name.
eg for the public API virDomainCreate make sure QEMU
uses qemuDomainCreate and not qemuDomainStart
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that the driver struct field names match the public
API names. For an API virXXXX we must have a driver struct
field xXXXX. ie strip the leading 'vir' and lowercase any
leading uppercase letters.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Switch the function from a bunch of ifs to a switch statement with
correct type and reflow some code.
Also fix comment in enum describing possible graphics types
Decrease size of qemuBuildGraphicsCommandLine() by splitting out
spice-related code into qemuBuildGraphicsVNCCommandLine().
This patch also fixes 2 possible memory leaks on error path in the code
that was split-out. The buffer containing the already generated options
and a listen address string could be leaked.
Also break a few very long lines and reflow code that fits now.
Decrease size of qemuBuildGraphicsCommandLine() by splitting out
spice-related code into qemuBuildGraphicsSPICECommandLine().
This patch also fixes 2 possible memory leaks on error path in the code
that was split-out. The buffer containing the already generated options
and a listen address string could be leaked.
Also break a few very long lines.
Refactoring done in 19c6ad9ac7e7eb2fd3c8262bff5f087b508ad07f didn't
correctly take into account the order cgroup limit modification needs to
be done in. This resulted into errors when decreasing the limits.
The operations need to take place in this order:
decrease hard limit
change swap hard limit
or
change swap hard limit
increase hard limit
This patch also fixes the check if the hard_limit is less than
swap_hard_limit to print better error messages. For this purpose I
introduced a helper function virCompareLimitUlong to compare limit
values where value of 0 is equal to unlimited. Additionally the check is
now applied also when the user does not provide all of the tunables
through the API and in that case the currently set values are used.
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=950478