The virNetSocketWriteSASL method has to encode the buffer it is given and then
write it to the underlying socket. This write is not guaranteed to send the
full amount of data that was encoded by SASL. We cache the SASL encoded data so
that on the next invocation of virNetSocketWriteSASL we carry on sending it.
The subtle problem is that the 'len' value passed into virNetSocketWriteSASL on
the 2nd call may be larger than the original value. So when we've completed
sending the SASL encoded data we previously cached, we must return the original
length we encoded, not the new length.
This flaw means we could potentially have been discarded queued data without
sending it. This would have exhibited itself as a libvirt client never receiving
the reply to a method it invokes, async events silently going missing, or worse
stream data silently getting dropped.
For this to be a problem libvirtd would have to be queued data to send to the
client, while at the same time the TCP socket send buffer is full (due to a very
slow client). This is quite unlikely so if this bug was ever triggered by a real
world user it would be almost impossible to reproduce or diagnose, if indeed it
was ever noticed at all.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
During migration, the lock process is paused in the perform phase
but not resumed if there is a subsequent failure, leaving the locked
resource unprotected.
The perform phase itself can fail, in which case the lock process
should be resumed before returning from perform. The finish phase
could also fail on the destination host, in which case the migration
is canceled in the confirm phase and the VM is resumed. The lock
process needs to be resumed there as well.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Basically the `cpus` and `tasks` files are not needed, and I've witnessed on a
real system that the schemata file may have spaces prepended to a line, so let's
adjust at least one test so that it reflects what can happen. Also `000`
allocation is invalid and a full mask means it's all free. So adjust for that
too.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We've been building up to this. This adds support for cputune/cachetune
settings for domains in the QEMU driver. The addition into
qemuProcessSetupVcpu() automatically adds support for hotplug. For hot-unplug
we need to remove the allocation only if all the vCPUs were unplugged. But
since the threads are left running, we can't really do much about it now.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1289368
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This test initializes capabilities from vircaps2xmldata (since it exists there
already) and then requests list of free bitmaps (all unallocated space) from
virresctrl.c
Desirable outputs are saved in virresctrldata.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
More info in the documentation, this is basically the XML parsing/formatting
support, schemas, tests and documentation for the new cputune/cachetune element
that will get used by following patches.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
With this commit we finally have a way to read and manipulate basic resctrl
settings. Locking is done only on exposed functions that read/write from/to
resctrlfs. Not in functions that are exposed in virresctrlpriv.h as those are
only supposed to be used from tests.
More information about how resctrl works:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/x86/intel_rdt_ui.txt
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This will make the current functions obsolete and it will provide more
information to the virresctrl module so that it can be used later.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Unfortunately, we have a number of aliases in virsh and even though
these are not visible any more, we have to support them. The problem is
that when trying to print help for the alias, we get SIGSEGV because
there isn't any @def structure anymore and we need to query the command
being aliased instead.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1538570
Signed-off-by: Erik Skultety <eskultet@redhat.com>
These helpers are called from a single place only - cmdHelp wrapper and
just before the wrapper invokes the helpers, it performs the search,
either for command group or for the command itself, except the result is
discarded and the helper therefore needs to do it again. Drop this
inefficient handling and pass the @def structure rather than a name,
thus preventing the helper from needing to perform the search again.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This wires up the previously added OEM strings XML schema to be able to
generate comamnd line args for QEMU. This requires QEMU >= 2.12 release
containing this patch:
commit 2d6dcbf93fb01b4a7f45a93d276d4d74b16392dd
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Sat Oct 28 21:51:36 2017 +0100
smbios: support setting OEM strings table
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The OEM strings table in SMBIOS allows the vendor to pass arbitrary
strings into the guest OS. This can be used as a way to pass data to an
application like cloud-init, or potentially as an alternative to the
kernel command line for OS installers where you can't modify the install
ISO image to change the kernel args.
As an example, consider if cloud-init and anaconda supported OEM strings
you could use something like
<oemStrings>
<entry>cloud-init:ds=nocloud-net;s=http://10.10.0.1:8000/</entry>
<entry>anaconda:method=http://dl.fedoraproject.org/pub/fedora/linux/releases/25/x86_64/os</entry>
</oemStrings>
use of a application specific prefix as illustrated above is
recommended, but not mandated, so that an app can reliably identify
which of the many OEM strings are targetted at it.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We can start qemu with a "cpu,+la57" to set 57-bit vitrual address
space. So VM can be aware that it need to enable 5-level paging.
Corresponding QEMU commits:
al57 6c7c3c21f95dd9af8a0691c0dd29b07247984122
Since
commit eee7bd4ecb
Author: Joao Martins <joao.m.martins@oracle.com>
Date: Tue Jul 26 00:45:14 2016 +0100
libxl: implement virDomainBlockStats
Introduce initial support for domainBlockStats API
the libxl driver calls a couple of xenstore APIs, so it must explicitly
link to this library rather than rely on indirect linkage via libxl or
other xen libraries.
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This is a slight change from previous patches since virSecret
does not have a name only UUID strings.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The virConnectListAllNWFilters() has no extra flags yet, which
simplifies things a bit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Yet again, we don't need listing by device capabilities, so flags
are unused.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This one is a bit simpler since virStoragePoolListAllVolumes()
has no flags yet.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Provide more details related to the requirement that setting one
of the values requires setting all of them.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The libvirtd daemon uses systemd-machined D-Bus API when manipulating
domains. The systemd-machined is D-Bus activated on demand.
However, during system shutdown systemd-machined is stopped concurrently
with libvirtd and virsh users also doing their final cleanup may
transitively fail due to unavailability of systemd-machined. Example
error message
> libvirtd[1390]: 2017-12-20 18:55:56.182+0000: 32700: error : virSystemdTerminateMachine:503 : Refusing activation, D-Bus is shutting down.
To circumvent this we need to explicitly specify both ordering and
requirement dependency (to avoid late D-Bus activation) on
systemd-machined. See [1] for the dependency debate.
[1] https://lists.freedesktop.org/archives/systemd-devel/2018-January/040095.html
We recently added a generic XHCI USB3 controller to QEMU, and libvirt
supports adding that controller rather than the NEC XHCI USB3
controller, but when auto-adding a USB controller to Q35 domains we
were still adding the vendor-specific NEC controller. This patch
changes to add the generic controller instead, if it's available in
the QEMU binary that will be used.
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Whenever a different kernel is booted, some capabilities related to KVM
(such as CPUID bits) may change. We need to refresh the cache to see the
changes.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
qemuDomainDefValidateVideo() (called from qemuDomainDefValidate()) is
just a loop performing various checks on each video device. Rather
than maintaining this separate function, just fold the validations
into qemuDomainDeviceDefValidateVideo(), which is called once for each
video device.
Commit 10c73bf1 fixed a bug that I had introduced back in commit
70249927 - if a vhost-scsi device had no manually assigned PCI
address, one wouldn't be assigned automatically. There was a slight
problem with the logic of the fix though - in the case of domains with
pcie-root (e.g. those with a q35 machinetype),
qemuDomainDeviceCalculatePCIConnectFlags() will attempt to determine
if the host-side PCI device is Express or legacy by examining sysfs
based on the host-side PCI address stored in
hostdev->source.subsys.u.pci.addr, but that part of the union is only
valid for PCI hostdevs, *not* for SCSI hostdevs. So we end up trying
to read sysfs for some probably-non-existent device, which fails, and
the function virPCIDeviceIsPCIExpress() returns failure (-1).
By coincidence, the return value is being examined as a boolean, and
since -1 is true, we still end up assigning the vhost-scsi device to
an Express slot, but that is just by chance (and could fail in the
case that the gibberish in the "hostside PCI address" was the address
of a real device that happened to be legacy PCI).
Since (according to Paolo Bonzini) vhost-scsi devices appear just like
virtio-scsi devices in the guest, they should follow the same rules as
virtio devices when deciding whether they should be placed in an
Express or a legacy slot. That's accomplished in this patch by
returning early with virtioFlags, rather than erroneously using
hostdev->source.subsys.u.pci.addr. It also adds a test case for PCIe
to assure it doesn't get broken in the future.
Commit 8708ca01c added virNetDevSwitchdevFeature() to check if a network
device has Switchdev capabilities. virNetDevSwitchdevFeature() attempts
to retrieve the PCI device associated with the network device, ignoring
non-PCI devices. It does so via the following call chain
virNetDevSwitchdevFeature()->virNetDevGetPCIDevice()->
virPCIGetDeviceAddressFromSysfsLink()
For non-PCI network devices (qeth, Xen vif, etc),
virPCIGetDeviceAddressFromSysfsLink() will report an error when
virPCIDeviceAddressParse() fails. virPCIDeviceAddressParse() also
logs an error. After commit 8708ca01c there are now two errors reported
for each non-PCI network device even though the errors are harmless.
To avoid the errors, introduce virNetDevIsPCIDevice() and use it in
virNetDevGetPCIDevice() before attempting to retrieve the associated
PCI device. virNetDevIsPCIDevice() uses the 'subsystem' property of the
device to determine if it is PCI. See the sysfs rules in kernel
documentation for more details
https://www.kernel.org/doc/html/latest/admin-guide/sysfs-rules.html
https://bugzilla.redhat.com/show_bug.cgi?id=1536461
This reverts commit aeda1b8c56.
Problem is that we need mon->lastError to be set because it's
used all over the place. Also, there's nothing wrong with
reporting error if one occurred. I mean, if there's a thread
executing an API and which currently is talking on monitor it
definitely wants the error reported.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When migrating a shutoff domain (i.e., offline migration), we have no
statistics to report and thus jobInfo will be NULL in
qemuMigrationFinish.
Broken by me in v3.10.0-183-ge8784e7868.
https://bugzilla.redhat.com/show_bug.cgi?id=1536351
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of EPYC with indirect branch prediction protection.
The only difference between EPYC and EPYC-IBPB is the added "ibpb"
feature.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
After the latest CPU additions, the build fails with clang:
cputest.c:905:1: error: stack frame size of 26136 bytes
in function 'mymain' [-Werror,-Wframe-larger-than=]
Raise the relaxed limit which is used for tests.
We read from QEMU until seeing a \r\n pair to indicate a completed reply
or event. To avoid memory denial-of-service though, we must have a size
limit on amount of data we buffer. 10 MB is large enough that it ought
to cope with normal QEMU replies, and small enough that we're not
consuming unreasonable mem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As usual, a bunch of changes slipped through the cracks during the
development cycle. Update the release notes to include at least the
most notable ones.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Commit 7a931a4204 refactored the code and probably forgot to add
this line.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
This is a variant of Skylake-Server with indirect branch prediction
protection. The only difference between Skylake-Server and
Skylake-Server-IBRS is the added "spec-ctrl" feature.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of Skylake-Client with indirect branch prediction
protection. The only difference between Skylake-Client and
Skylake-Client-IBRS is the added "spec-ctrl" feature.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of Broadwell with indirect branch prediction
protection. The only difference between Broadwell and Broadwell-IBRS is
the added "spec-ctrl" feature.
The Broadwell-IBRS model in QEMU is a bit different since Broadwell got
several additional features since we added it in cpu_map.xml:
abm, arat, f16c, rdrand, vme, xsaveopt
Adding them only to the -IBRS variant would confuse our CPU detection
code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>