Historically firewall rules for virtual networks were added straight
into the base chains. This works but has a number of bugs and design
limitations:
- It is inflexible for admins wanting to add extra rules ahead
of libvirt's rules, via hook scripts.
- It is not clear to the admin that the rules were created by
libvirt
- Each rule must be deleted by libvirt individually since they
are all directly in the builtin chains
- The ordering of rules in the forward chain is incorrect
when multiple networks are created, allowing traffic to
mistakenly flow between networks in one direction.
To address all of these problems, libvirt needs to move to creating
rules in its own private chains. In the top level builtin chains,
libvirt will add links to its own private top level chains.
Addressing the traffic ordering bug requires some extra steps. With
everything going into the FORWARD chain there was interleaving of rules
for outbound traffic and inbound traffic for each network:
-A FORWARD -d 192.168.3.0/24 -o virbr1 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.3.0/24 -i virbr1 -j ACCEPT
-A FORWARD -i virbr1 -o virbr1 -j ACCEPT
-A FORWARD -o virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -d 192.168.2.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.2.0/24 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
-A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
The rule allowing outbound traffic from virbr1 would mistakenly
allow packets from virbr1 to virbr0, before the rule denying input
to virbr0 gets a chance to run.
What we really need todo is group the forwarding rules into three
distinct sets:
* Cross rules - LIBVIRT_FWX
-A FORWARD -i virbr1 -o virbr1 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
* Incoming rules - LIBVIRT_FWI
-A FORWARD -d 192.168.3.0/24 -o virbr1 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -d 192.168.2.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable
* Outgoing rules - LIBVIRT_FWO
-A FORWARD -s 192.168.3.0/24 -i virbr1 -j ACCEPT
-A FORWARD -i virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -s 192.168.2.0/24 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
There is thus no risk of outgoing rules for one network mistakenly
allowing incoming traffic for another network, as all incoming rules
are evalated first.
With this in mind, we'll thus need three distinct chains linked from
the FORWARD chain, so we end up with:
INPUT --> LIBVIRT_INP (filter)
OUTPUT --> LIBVIRT_OUT (filter)
FORWARD +-> LIBVIRT_FWX (filter)
+-> LIBVIRT_FWO
\-> LIBVIRT_FWI
POSTROUTING --> LIBVIRT_PRT (nat & mangle)
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Some of the query callbacks want to know the firewall layer that was
being used for triggering the query to avoid duplicating that data.
Reviewed-by: Laine Stump <laine@laine.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Allow the platform driver impls to run logic before and after the
firewall reload process.
Reviewed-by: Laine Stump <laine@laine.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
disk->mirror would not be cleared while the local pointer was freed in
qemuDomainBlockCommit if qemuDomainObjExitMonitor or qemuBlockJobDiskNew
would return a failure.
Since block job handling is executed in the separate handler which needs
a qemu job, we don't need to pre-set the mirror state prior to starting
the job. Similarly the block copy job does not do that.
Move the setting of the data after starting the job so that we avoid
this problem.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
While this should not be necessary as we clear it in the event handler,
let's be sure and clear it prior to starting the job.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Switching a block job to some states (e.g. QEMU_BLOCKJOB_STATE_READY)
might not require a job, thus if it will become ready asynchronously we
should not overwrite the state any more.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
While the callers should make sure that they don't call
qemuBlockJobEmitEvents for any internal state or job, let's add checks
that prevents us from emitting wrong events altogether.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1665553
Ceph can be mounted just like any other filesystem and in fact is
a shared and cluster filesystem. The filesystem magic constant
was taken from kernel sources as it is not in magic.h yet.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
We have this very handy macro called VIR_STEAL_PTR() which steals
one pointer into the other and sets the other to NULL. The
following coccinelle patch was used to create this commit:
@ rule1 @
identifier a, b;
@@
- b = a;
...
- a = NULL;
+ VIR_STEAL_PTR(b, a);
Some places were clean up afterwards to make syntax-check happy
(e.g. some curly braces were removed where the body become a one
liner).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Implement support for passing custom command line arguments
to bhyve using the 'bhyve:commandline' element:
<bhyve:commandline>
<bhyve:arg value='-newarg'/>
</bhyve:commandline>
* Define virDomainXMLNamespace for the bhyve driver, which
at this point supports only the 'commandline' element
described above,
* Update command generation code to inject these command line
arguments between driver-generated arguments and the vmname
positional argument.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
networkMigrateStateFiles was added nearly 5 years ago when the network
state directory was moved from /var/lib/libvirt to /var/run/libvirt
just prior to libvirt-1.2.4). It was only required to maintain proper
state information for networks that were active during an upgrade that
didn't involve rebooting the host. At this point the likelyhood of
anyone upgrading their libvirt from pre-1.2.4 directly to 5.0.0 or
later *without rebooting the host* is probably so close to 0 that no
properly informed bookie would take *any* odds on it happening, so it
seems appropriate to remove this pointless code.
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Upcoming patches need an array of strings for use in QMP
block-dirty-bitmap-merge. A convenience wrapper cuts down
on the verbosity of creating the array, similar to the
existing virJSONValueObjectAppendString().
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
A function that returns -1 for multiple possible failures, but only
raises a libvirt error for some of those failures, can be hard to
use correctly. Yet both of our JSON object/array appenders fall in
that pattern. True, the silent errors represent coding bugs that
none of the callers should ever trigger, while the noisy errors
represent memory failures that can happen anywhere, so we happened
to never end up failing without an error. But it is better to
either use the _QUIET memory allocation variants, and make callers
decide to report failure; or make all failure paths noisy. This
patch takes the latter approach.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use qemuBuildControllersCommandLine since it builds the command line
for (nearly) all controllers, not just one.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Now that the inner loop does not require any other variables,
it can be easily separated. Apart from reducing the indentation
level this will allow it to be called from different code paths.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Now that it's no longer needed, remove the argument.
This removes the last helper variable in
qemuBuildControllerDevCommandLine.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
qemuBuildLegacyUSBControllerCommandLine is the only place where
we need to count the USB controllers.
Count them again instead of keeping track in a variable passed to
qemuBuildControllerDevStr.
This removes the need for another variable in the loop in
qemuBuildControllerDevCommandLine.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Count them in qemuBuildLegacyUSBControllerCommandLine to remove
yet another variable accessed from the loop in
qemuBuildControllerDevCommandLine.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This removes the need to mark it in the 'usbcontroller' variable.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Move out the code formatting "-usb" on the QEMU command line.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Similar to what commit 86dba8f3 did for virPortAllocatorRelease,
ignore port 0 in virPortAllocatorSetUsed.
For all the reasonable use cases the callers already check that
the port is non-zero, however if the port from the XML overflows
unsigned short and turns into 0, it can be set as used by
virPortAllocatorSetUsed but not released by virPortAllocatorRelease.
Also skip port '0' in virPortAllocatorSetUsed to make this behavior
symmetric.
The serenity was disturbed by commit 5dbda5e9 which started using
virPortAllocatorRelease instead of virPortAllocatorSetUsed (false).
https://bugzilla.redhat.com/show_bug.cgi?id=1591645
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Switch qemuBuildVirtioDevStr to use virDomainDeviceSetData: callers
pass in the virDomainDeviceType and the void * DefPtr. This will
save us from having to repeatedly extend the function argument
list in subsequent patches.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
This is essentially a wrapper for easily setting the variable
name in virDomainDeviceDef that matches its associated
VIR_DOMAIN_DEVICE_TYPE.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Current code essentially duplicates the same logic, but misses
some cases (like vhost-vsock-device).
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
The vhost-scsi device string should depend on the requested
address type, not strictly on the emulated arch. This is the
same logic used by qemuBuildVirtioDevStr, and this particular
path is already tested in the hostdev-scsi-vhost-scsi-ccw tests
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Move the rng->model == VIRTIO check to parse time. This also
allows us to remove similar checks throughout the qemu driver
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
If we validate that memballoon is NONE|VIRTIO at parse time,
we can drop similar checks elsewhere in the qemu driver
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
This will be extended in the future, so let's simplify things by
centralizing the checks.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
If the two sysfs_path are both NULL, there may be an incorrect
object returned for virNodeDeviceObjListFindBySysfsPath().
This check exists in old interface virNodeDeviceFindBySysfsPath().
e.g.
virNodeDeviceFindBySysfsPath(virNodeDeviceObjListPtr devs,
const char *sysfs_path)
{
...
if ((devs->objs[i]->def->sysfs_path != NULL) &&
(STREQ(devs->objs[i]->def->sysfs_path, sysfs_path))) {
...
}
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
13 bytes in 1 blocks are definitely lost in loss record 44 of 179
at 0x4C2EE6F: malloc (vg_replace_malloc.c:299)
by 0x9514A69: strdup (in /lib64/libc-2.27.so)
by 0x5E60C0B: virStrdup (virstring.c:956)
by 0x54C856F: virHostGetDRMRenderNode (qemuxml2argvmock.c:190)
by 0x57CB4E3: qemuProcessGraphicsSetupRenderNode (qemu_process.c:4860)
by 0x57CB571: qemuProcessSetupGraphics (qemu_process.c:4881)
by 0x57CE01B: qemuProcessPrepareDomain (qemu_process.c:6040)
by 0x57D102E: qemuProcessCreatePretendCmd (qemu_process.c:6975)
by 0x114C1C: testCompareXMLToArgv (qemuxml2argvtest.c:611)
by 0x134B90: virTestRun (testutils.c:174)
by 0x123478: mymain (qemuxml2argvtest.c:1697)
by 0x136BFA: virTestMain (testutils.c:1112)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
This partially reverts 00dc991ca1.
2,030 (1,456 direct, 574 indirect) bytes in 14 blocks are definitely lost in loss record 77 of 80
at 0x4C30E96: calloc (vg_replace_malloc.c:711)
by 0x50F83AA: virAlloc (viralloc.c:143)
by 0x5178DFA: virPCIDeviceNew (virpci.c:1753)
by 0x51753E9: virPCIDeviceIterDevices (virpci.c:468)
by 0x5175EB5: virPCIDeviceGetParent (virpci.c:759)
by 0x517AB55: virPCIDeviceIsBehindSwitchLackingACS (virpci.c:2476)
by 0x517AC24: virPCIDeviceIsAssignable (virpci.c:2494)
by 0x10BF27: testVirPCIDeviceIsAssignable (virpcitest.c:229)
by 0x10D14C: virTestRun (testutils.c:174)
by 0x10C535: mymain (virpcitest.c:422)
by 0x10F1B6: virTestMain (testutils.c:1112)
by 0x10CF93: main (virpcitest.c:455)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
This is a return argument that is to be compared against NULL on
successful return. However, it is not initialized and therefore
relies on callers setting it to NULL prior calling the function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Asserting the value we set four lines earlier in qemuBlockjobState
doesn't buy us any safety (if the public header adds a value, we end
up skipping that value without the compiler warning us of our gap);
what we really want is to assert that the value auto-assigned by the
compiler matches the actual last value in the public headers (as was
done below for qemuBlockJobType). Add useful comments while at it.
Signed-off-by: Eric Blake <eblake@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Upstream apparmor is switching to named profiles. In short,
/usr/sbin/dnsmasq {
becomes
profile dnsmasq /usr/sbin/dnsmasq {
Consequently, any profiles that reference profiles in a peer= condition
need to be updated if the referenced profile switches to a named profile.
Apparmor commit 9ab45d81 switched dnsmasq to a named profile. ATM it is
the only named profile switch that has affected libvirt. Add rules to the
libvirtd profile to reference dnsmasq in peer= conditions by profile name.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
The libxl driver does not set the new memory value in the active domain def
after a successful balloon. This results in the old memory value in
<currentMemory>. E.g.
virsh dumpxml test | grep currentMemory
<currentMemory unit='KiB'>20971520</currentMemory>
virsh setmem test 16777216 --live
virsh dumpxml test | grep currentMemory
<currentMemory unit='KiB'>20971520</currentMemory>
Set the new memory value in active domain def after a successful call to
libxl_set_memory_target().
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Hanlde all the possible failure codes as per ACPI standard documented in
the function header.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1660410
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We forgot to document the specific fields for the 0x103 and 0x200
sources which are tied to device removal and device hotplug
respectively.
The value description is based on the ACPI 6.2A standard Table 6-207 and
Table 6-208. At the time of writing of this patch the standard can be
accessed e.g. at:
https://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The @linkdev is In/Out function parameter as second order
reference pointer so requires first order dereference for
checking NULL which can be the result of virPCIGetNetName().
Fixes: d6ee56d723 (util: change virPCIGetNetName() to not return error if device has no net name)
Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org>
Signed-off-by: dann frazier <dann.frazier@canonical.com>
The device xml parser code does not set "model" while parsing the
following XML:
<interface type='hostdev'>
<source>
<address type='pci' domain='0x0002' bus='0x01' slot='0x00' function='0x2'/>
</source>
</interface>
The net->model can be NULL and therefore must be compared using
STREQ_NULLABLE instead of plain STREQ.
Fixes: ac47e4a622 (qemu: replace "def->nets[i]" with "net" and "def->sounds[i]" with "sound")
Fixes: c7fc151eec (qemu: assign virtio devices to PCIe slot when appropriate)
Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org>
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Removing redundant sections of the code
Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org>
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
libvirt wrongly assumes that VF netdev has to have the
netdev assigned to PF. There is no such requirement in SRIOV standard.
This patch change the virNetDevSwitchdevFeature() function to deal
with SRIOV devices which does not have netdev on PF. Also corrects
one comment about PF netdev assumption.
One example of such devices is ThunderX VNIC.
By applying this change, VF device is used for virNetlinkCommand() as
it is the only netdev assigned to VNIC.
Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org>
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This adds the virt-aa-helper support for gl enabled graphics devices to
generate rules for the needed rendernode paths.
Example in domain xml:
<graphics type='spice'>
<gl enable='yes' rendernode='/dev/dri/bar'/>
</graphics>
results in:
"/dev/dri/bar" rw,
Special cases are:
- multiple devices with rendernodes -> all are added
- non explicit rendernodes -> follow recently added virHostGetDRMRenderNode
- rendernode without opengl (in egl-headless for example) -> still add
the node
Fixes: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1757085
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Acked-by: Jamie Strandboge <jamie@canonical.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Add a capability check to qemuDomainDefValidate and refuse to start
a domain with VNC graphics if the TLS secret was set in qemu.conf
and it's not supported.
Note that qemuDomainSecretGraphicsPrepare does not generate any
secret data if the capability is not present and qemuBuildTLSx509BackendProps
is not called at all.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Use the password stored in the secret driver under
the uuid specified by the vnc_tls_x509_secret_uuid
option in qemu.conf.
https://bugzilla.redhat.com/show_bug.cgi?id=1602418
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Add an option that lets the user specify the secret
that unlocks the server TLS key.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Be generic instead of trying to enumerate all the involved
device types.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Instead of hardcoding the TLS creds alias in
qemuBuildGraphicsVNCCommandLine, store it
in the domain private data.
Given that we only support one VNC graphics
and thus have only one alias per-domain,
this is overengineered, but it will allow us
to prepare the secret upfront when we start
supporting encrypted server TLS keys.
Note that the alias is not formatted anywhere
since we won't need to access it after domain
startup.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
A helper function for allocating the virDomainGraphicsDef structure.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Switch the function to use VIR_AUTOFREE and VIR_AUTOPTR macros
to get rid of the cleanup section.
Requested-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Switch the function to use VIR_AUTOFREE and VIR_AUTOPTR macros
to get rid of the cleanup section.
Requested-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Switch the function to use VIR_AUTOFREE and VIR_AUTOPTR macros
to get rid of the cleanup section.
Requested-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Switch the function to use VIR_AUTOFREE and VIR_AUTOPTR macros
to get rid of the cleanup section.
Requested-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
If a -drive has no image, using image properties makes qemu whine that
they should not be used.
This patch stops formating cache/readonly/... for empty drives
for the pre-blockdev syntax. Unfortunately those parameters can't be
added later when inserting media, but on the other hand qemu will start
with an empty drive.
Since we already were able to start a VM with such config previously due
to qemu ignoring them I've opted just to skip formatting them.
Additionally with -blockdev support it will work as expected as the
image properties will be formatted when adding the image itself which is
not possible without it.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1651457
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When commit 361c8dc17 added support for hotplugging the i6300esb
watchdog device (first in libvirt-3.9.0), it accidentally contstructed
the commandline for the device_add command before allocating a PCI
address for the device. With no PCI address specified in the command,
the watchdog would simply be placed at the lowest unused PCI slot.
On a 440fx guest, this doesn't cause a problem, because libvirt's PCI
address allocation algorithm would most likely give the same address
anyway (usually a slot on pci-root), so nobody noticed the omission of
address from the command.
But on a Q35 guest, the lowest unused PCI slot is on pcie-root, which
doesn't support hotplug; libvirt knows enough to assign a PCI address
that is on a pcie-to-pci-bridge (because its slots *do* support
hotplug), but qemu doesn't, so if there is no PCI address in the
command, qemu just tries to plug the new device into pcie-root, and
fails because it doesn't support hotplug, e.g.:
error: Failed to attach device from watchdog.xml
error: internal error: unable to execute QEMU command 'device_add':
Bus 'pcie.0' does not support hotplugging
The solution is simply to build the command string after assigning a
PCI address, not before.
Resolves: https://bugzilla.redhat.com/1666559
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: John Ferlan <jferlan@redhat.com>
If code in the @actualType switch needs to have/know which PCI
Address is being used, then we must assign it earlier. In particular
a vhost-user device needs to call qemuDomainSupportsNicdev which
requires an address to be defined.
Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
This is the only patch that mixes various augeas entry
groups in one function.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Split out parts of the config parsing code to make
the parent function easier to read.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
if virNetClientNew finishes with error before sock is set
to client object then sock does not get unrefed. This is
unexpected by function clients like virNetClientNewUNIX.
Let's make sure sock gets unrefed on any error path.
Next some clients like virNetClientNewLibSSH2 try to unref
sock on virNetClientNew errors. This is not correct even
before this patch because in some cases virNetClientNew
unrefed sock on error path by itself. Let's give up
sock managment to virNetClientNew entirely.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Currently the job name corresponds to the disk the job belongs to. For
jobs which will not correspond to disks we'll need to track the name
separately.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that the data is per-job, we don't really need to bother with
finishing the synchronous job handling if the job is already terminated.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rather than storing the presence of the blockjob in a flag we can bind
together the lifecycle of the job with the lifecycle of the object which
is tracking the data for it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Instead of passing in the disk information, pass in the job and name the
function accordingly.
Few callers needed to be modified to have the job pointer handy.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The processing function modifies the job state so it should make sure
that the variable holding the new state is cleared properly and not the
caller. The caller should only deal with the job state and not the
transition that happened.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The job error can be safely accessed in the job structure, so we don't
need to propagate it through qemuBlockJobUpdateDisk.
Drop the propagation and refactor any caller that pased non-NULL error.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The same message is reported in 3 distinct places. Move it out into a
single function.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add a field tracking the current state of job so that it can be queried
later. Until now the job state e.g. that the job is _READY for
finalizing was tracked only for mirror jobs. Add tracking of state for
all jobs.
Similarly to 'qemuBlockJobType' this maps the existing states of the
blockjob from virConnectDomainEventBlockJobStatus to
'qemuBlockJobState' so that we can track some internal states as well.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Modify qemuBlockJobSyncBeginDisk to operate on qemuBlockt sJobDataPtr and
rename it accordingly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We can properly track the job type when starting the job so that we
don't have to infer it later.
This patch also adds an enum of block job types specific to qemu
(qemuBlockjobType) which mirrors the public block job types
(virDomainBlockJobType) but allows for other types to be added later
which will not be public.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Block jobs can also happen on objects which are not a disk at a given
point (e.g. the frontend was not hotplugged yet) and thus will be
eventually kept separately. Add a reference back to the disk for
blockjobs which do correspond to a disk.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If the job wasn't started, we don't need to end the synchronous job. Add
a note and drop the unnecessary calls.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rather than directly modifying fields in the qemuBlockJobDataPtr
structure add a bunch of fields which allow to do the transitions.
This will help later when adding more complexity to the job handling.
APIs introduced in this patch are:
qemuBlockJobDiskNew - prepare for starting a new blockjob on a disk
qemuBlockJobDiskGetJob - get the block job data structure for a disk
For individual job state manipulation the following APIs are added:
qemuBlockJobStarted - Sets the job as started with qemu. Until that
the job can be cancelled without asking qemu.
qemuBlockJobStartupFinalize - finalize job startup. If the job was
started in qemu already, just releases
reference to the job object. Otherwise
clears everything as if the job was never
started.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Extract the disk mirroring startup code from the loop into a separate
function to allow cleaner cleanup paths.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The field is used to note the state the job has transitioned to while
handling the blockjob state change event. Rename the field so that it's
obvious that this is the new state and not the general state of the
blockjob.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reference counting will simplify semantics of the lifecycle of the
object.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When cancelling job after a reconnect we can now use the disk block job
state rather than having to re-detect it in the migration code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that we reprobe the status of blockjobs when reconnecting in
addition to handling job status events, the status reprobing can be
removed as we always track the correct status internally.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Block job state was widely untracked by libvirt across restarts which
was allowed by a stateless block job finishing handler which discarded
disk state and redetected it. This is undesirable since we'll need to
track more information for individual blockjobs due to -blockdev
integration requirements.
In case of legacy blockjobs we can recover whether the job is present at
reconnect time by querying qemu. Adding tracking whether a job is
present will allow simplification of the non-shared-storage cancellation
code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Internally we do a 'block-copy' to accomodate non-shared storage
migration but the code did not fill in that the block job was active on
the disk when starting the copy job. Since we handle block jobs finishes
regardless of having it registered it's not a problem but soon will
become one.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuBlockJobEventProcessLegacy was getting too big. Remove handling of
completed jobs in a separate function.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This will handle blockjob finalizing for the old approach so rename it
accordingly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'cleanup' label was accessed only from a jump to 'error'. Consolidate
everyting into 'cleanup'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Struct qemuDomainDiskPrivate was holding multiple variables connected to
a disk block job. Consolidate them into a new struct qemuBlockJobData.
This will also allow simpler extensions to the block job mechanisms.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The blockjob module uses 'qemuDomainAsyncJob' in it's public headers.
As I plan adding a new structure containing job data which will need to
be included in "qemu_domain.h" it's necessary to break the circular
dependency.
Convert 'qemuDomainAsyncJob' type to 'int' as it's an enum.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All the public APIs of the qemu_blockjob module operate on a 'disk'.
Since I'll be adding APIs which operate on a job later let's rename the
existing ones.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function is now only called locally. Some code movement was
necessary to avoid forward declarations.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Replace use of qemuBlockJobEventProcess with the general helper. A small
tweak is required to pass in the 'type' and 'status' of the job via the
appropriate private data variables.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The event reports the disk path to identify the disk which makes sense
only for local disks. Additionally network backed disks like NBD don't
need to have a path so the callback would return NULL.
Report VIR_DOMAIN_EVENT_ID_BLOCK_JOB only for non-empty local disks.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Put the emitting of VIR_DOMAIN_EVENT_ID_BLOCK_JOB and
VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2 into a separate function.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Instead of copying the default default values upfront
and then wondering whether the user has given us a new default,
leave the per-usage TLS certdirs and secrets empty during
parsing and only fill them afterwards if they weren't provided
by the user.
This means that instead of looking whether the specific certdir
paths match the default default, the Validate function (which
is called in between parsing and setting the defaults) can error
out for missing directories if the value is present, because
it must've come from the user.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Introduce a set of bool variables with the 'present' suffix
to track whether the value was actually specified.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
According to the GNU Make manual, "double-colon rules are
somewhat obscure and not often very useful". Looking at
the few instances we have in libvirt, that certainly seems
to be the case, so just drop them.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Commit 7282f455a got rid of the VIR_WARNINGS_NO_CAST_ALIGN macro
when refactoring the code and broke the build with clang.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Turns out, that there are few bugs that are not that trivial to
fix (e.g. around block jobs). Instead of rushing in not
thoroughly tested fixes disable the feature temporarily for the
release.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
I had intended to make these changes to commit d40b820c before
pushing, but forgot about it during the day between the initial review
and ACK.
Neither change is significant - just returning immediately when
virNetDevGetName() fails (instead of logging a debug message first)
and eliminating a comment that adds to confusion rather than
eliminating it. Still, the changes should be made to be more
consistent with nearly identical code just a few lines up (added in
commit 7282f455)
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When checking the setting of accept_ra, we have assumed that all
routes have a single nexthop, so the interface of the route would be
in the RTA_OIF attribute of the netlink RTM_NEWROUTE message. But
multipath routes don't have an RTA_OIF; instead, they have an
RTA_MULTIPATH attribute, which is an array of rtnexthop, with each
rtnexthop having an interface. This patch adds a loop to look at the
setting of accept_ra of the interface for every rtnexthop in the
array.
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
When commit 1d94b3e7 added code to walk the [n]hostdevs list looking
to add shared hostdevs, it should've filtered any hostdevs that were
not SCSI hostdev's.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is about the same number of code lines, but is simpler, and more
consistent with what will be added to check another attribute in a
coming patch.
As a side effect, it
Resolves: https://bugzilla.redhat.com/1583131
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
This same operation needs to be done in multiple places, so move the
inline code into a separate function.
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
This is problematic if a callback function wants to send the nlmsghdr
to a library function that has no "const" in its prototype
(e.g. nlmsg_find_attr())
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
These files need to be installed on the system for apparmor
support to work, so they don't belong with examples.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Instead of defining targets conditionally and depending on
them unconditionally, define a couple of variables and
conditionally add targets to them.
In addition to removing a bunch of useless code, this has
the nice effect of no longer requiring the main Makefile.am
to have any knowledge about the contents of the various
snippets it includes.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
This is consistent with the way we already handle
configuration for other init systems such as upstart and
systemd.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
The feature was added to QEMU in 3.1.0 and it is currently blocking
migration, which is expected to change in the future. Luckily 3.1.0 is
new enough to give us migratability hints on each feature via
query-cpu-model-expension, which means we don't need to use the
"migratable" attribute on the CPU map XML.
The kernel calls this feature arch_capabilities and RHEL/CentOS 7.* use
arch-facilities. Apparently some CPU test files were gathered with the
RHEL version of QEMU. Let's update the test files to avoid possible
confusion about the correct naming.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The session daemon is unable to set XATTRs in 'trusted'
namespace because it doesn't run as privileged process.
Therefore, when creating the default qemu config enable
rememberOwner only when running as privileged process.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since its introduction in commit 0977b8aa07 (released in v1.2.14)
qemuAgentGetInterfaces calls qemuAgentCommand with needReply=false,
which allows qemuAgentCommand to return 0 even when it did not get
any reply from the agent.
Set needReply to true, since we dereference it right after.
This can be hit if libvirt is waiting for an event from the agent
(e.g. shutdown) and the agent cannot reply in time (e.g. due to
the guest being shut down), as reported in:
https://bugzilla.redhat.com/show_bug.cgi?id=1663051
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The use of 'lxc://' was mistakenly broken in:
commit 4c8574c85c
Author: Daniel P. Berrangé <berrange@redhat.com>
Date: Wed Mar 28 12:49:29 2018 +0100
driver: ensure NULL URI isn't passed to drivers with whitelisted URIs
Allow it again for historical compatibility.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
In the previous commit we are using uint64_t for storing subnet
prefix and interface id that qemu reports in
RDMA_GID_STATUS_CHANGED event. We also report them in some debug
messages. This poses a problem because uint64_t can be UL or ULL
depending on the host architecture and hence we wouldn't know
which format to use. Switch to ULL which is big enough and
doesn't suffer from the issue.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This event is emitted on the monitor when a GID table in pvrdma device
is modified and the change needs to be propagate to the backend RDMA
device's GID table.
The control over the RDMA device's GID table is done by updating the
device's Ethernet function addresses.
Usually the first GID entry is determine by the MAC address, the second
by the first IPv6 address and the third by the IPv4 address. Other
entries can be added by adding more IP addresses. The opposite is the
same, i.e. whenever an address is removed, the corresponding GID entry
is removed.
The process is done by the network and RDMA stacks. Whenever an address
is added the ib_core driver is notified and calls the device driver's
add_gid function which in turn update the device.
To support this in pvrdma device we need to hook into the create_bind
and destroy_bind HW commands triggered by pvrdma driver in guest.
Whenever a changed is made to the pvrdma device's GID table a special
QMP messages is sent to be processed by libvirt to update the address of
the backend Ethernet device.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
These were not caught by our current regular expressions
but will be caught by the improved ones we're about to
introduce, so fix them ahead of time.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Essentially, bring back the old behaviour as of commit eba36a38 which
was later changed by commit ae06048bf5. Even though all the stderr
messages will eventually end up in the journal, we're not making use of
the fields journald provides.
https://bugzilla.redhat.com/show_bug.cgi?id=1592644
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
Our use of INCLUDES in Makefile.am hearkens back to when we had to
cater to automake 1.9.6 (thanks, RHEL 5) which lacked AM_CPPFLAGS.
Modern Automake flags a warning that INCLUDES is deprecated, and
now that we mandate RHEL 7 or better (see commit c1bc9c66), we no
longer have to cater to the old spelling. This change will also
make it easier to do per-binary CPPFLAGS.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Commit c0a8ea45 removed the use of gettextize, and the setting of
GETTEXT_CPPFLAGS, but did not scrub the now-unused variable from
Makefile.am snippets.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In 600462834f we've tried to remove Author(s): lines
from comments at the beginning of our source files. Well, in some
files while we removed the "Author" line we did not remove the
actual list of authors.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
According to the result parsing from xml, add the unarmed property
into QEMU command line:
-device nvdimm,...[,unarmed=on]
Signed-off-by: Luyao Zhong <luyao.zhong@intel.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
According to the result parsing from xml, add pmem property
into QEMU command line:
-object memory-backend-file,...[,pmem=on]
Signed-off-by: Luyao Zhong <luyao.zhong@intel.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
According to the result parsing from xml, add align property
into QEMU command line:
-object memory-backend-file,...[,align=xxx]
Signed-off-by: Luyao Zhong <luyao.zhong@intel.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This capability tracks if nvdimm has the unarmed attribute or not
for the nvdimm readonly xml attribute.
Signed-off-by: Luyao Zhong <luyao.zhong@intel.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This capability tracks if memory-backend-file has the pmem
attribute or not.
Signed-off-by: Luyao Zhong <luyao.zhong@intel.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This capability tracks if memory-backend-file has the align
attribute or not.
Signed-off-by: Luyao Zhong <luyao.zhong@intel.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
NVDIMM emulation will mmap the backend file, it uses host pagesize
as the alignment of mapping address before, but some backends may
require alignments different from the pagesize. So the 'alignsize'
option is introduced to allow specification of the proper alignment:
<devices>
...
<memory model='nvdimm' access='shared'>
<source>
<path>/dev/dax0.0</path>
<alignsize unit='MiB'>2</alignsize>
</source>
<target>
<size unit='MiB'>4094</size>
<node>0</node>
<label>
<size unit='MiB'>2</size>
</label>
</target>
</memory>
...
</devices>
Signed-off-by: Luyao Zhong <luyao.zhong@intel.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Before launching a SEV guest we take the base64-encoded guest owner's
data specified in launchSecurity and create files with the same content
under /var/lib/libvirt/qemu/<domain>. The reason for this is that we
need to pass these files on to QEMU which then uses them to communicate
with the SEV firmware, except when it doesn't have permissions to open
those files since we don't relabel them.
https://bugzilla.redhat.com/show_bug.cgi?id=1658112
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
Since SEV operates on a per domain basis, it's very likely that all
SEV launch-related data will be created under
/var/lib/libvirt/qemu/<domain_name>. Therefore, when calling into
qemuProcessSEVCreateFile we can assume @libDir as the directory prefix
rather than passing it explicitly.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
The @con type security_context_t is actually a "char *", so the
correct check should be to dereference one more level; otherwise,
we could return/use the NULL pointer later in a subsequent
virSecuritySELinuxSetFileconImpl call (using @fcon).
Suggested-by: Michal Prívozník <mprivozn@redhat.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If virSecuritySELinuxRestoreFileLabel returns 0 or -1 too soon, then
the @newpath will be leaked.
Suggested-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Because missing optional storage source is not error. The patch
address only local files. Fixing other cases is a bit ugly.
Below is example of error notice in log now:
error: virStorageFileReportBrokenChain:427 :
Cannot access storage file '/path/to/missing/optional/disk':
No such file or directory
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Every time we call all domain stats for inactive domain with
unavailable storage source we get error message in logs [1]. It's a bit noisy.
While it's arguable whether we need such message or not for mandatory
disks we would like not to see messages for optional disks. Let's
filter at least for cases of local files. Fixing other cases would
require passing flag down the stack to .backendInit of storage
which is ugly.
Stats for active domain are fine because we either drop disks
with unavailable sources or clean source which is handled
by virStorageSourceIsEmpty in qemuDomainGetStatsOneBlockFallback.
We have these logs for successful stats since 25aa7035d (version 1.2.15)
which in turn fixes 596a13713 (version 1.2.12 )which added substantial
stats for offline disks.
[1] error message example:
qemuOpenFileAs:3324 : Failed to open file '/path/to/optional/disk': No such file or directory
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Introduce caching whether /dev/kvm is usable as the QEMU user:QEMU
group. This reduces the overhead of the QEMU capabilities cache
lookup. Before this patch there were many fork() calls used for
checking whether /dev/kvm is accessible. Now we store the result
whether /dev/kvm is accessible or not and we only need to re-run the
virFileAccessibleAs check if the ctime of /dev/kvm has changed.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This test checks if security label remembering works correctly.
It uses qemuSecurity* APIs to do that. And some mocking (even
though it's not real mocking as we are used to from other tests
like virpcitest). So far, only DAC driver is tested.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We are setting label on kernel, initrd, dtb and slic_table files.
But we never restored it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It helps whe trying to match calls with virSecuritySELinuxSetAllLabel
if the order in which devices are set/restored is the same in
both functions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When iterating over list of paths/disk sources to relabel it may
happen that the process fails at some point. In that case, for
the sake of keeping seclabel refcount (stored in XATTRs) in sync
with reality we have to perform rollback. However, if that fails
too the only thing we can do is warn user.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It's important to keep XATTRs untouched (well, in the same state
they were in when entering the function). Otherwise our
refcounting would be messed up.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Similarly to what I did in DAC driver, this also requires the
same SELinux label to be used for shared paths. If a path is
already in use by a domain (or domains) then and the domain we
are starting now wants to access the path it has to have the same
SELinux label. This might look too restrictive as the new label
can still guarantee access to already running domains but in
reality it is very unlikely and usually an admin mistake.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It is going to be important to know if the current transaction we
are running is a restore operation or set label operation so that
we know whether to call virSecurityGetRememberedLabel() or
virSecuritySetRememberedLabel(). That is, whether we are in a
restore and therefore have to fetch the remembered label, or we
are in set operation and therefore have to store the original
label.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that we have seclabel remembering we can safely restore
labels for shared and RO disks. In fact we need to do that to
keep seclabel refcount stored in XATTRs in sync with reality.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This also requires the same DAC label to be used for shared
paths. If a path is already in use by a domain (or domains) then
and the domain we are starting now wants to access the path it
has to have the same DAC label. This might look too restrictive
as the new label can still guarantee access to already running
domains but in reality it is very unlikely and usually an admin
mistake.
This requirement also simplifies seclabel remembering, because we
can store only one seclabel and have a refcounter for how many
times the path is in use. If we were to allow different labels
and store them in some sort of array the algorithm to match
labels to domains would be needlessly complicated.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Because the implementation that will be used for label
remembering/recall is not atomic we have to give callers a chance
to enable or disable it. That is, enable it if and only if
metadata locking is enabled. Otherwise the feature MUST be turned
off.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We are setting label on kernel, initrd, dtb and slic_table files.
But we never restored it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It helps whe trying to match calls with virSecurityDACSetAllLabel
if the order in which devices are set/restored is the same in
both functions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When iterating over list of paths/disk sources to relabel it may
happen that the process fails at some point. In that case, for
the sake of keeping seclabel refcount (stored in XATTRs) in sync
with reality we have to perform rollback. However, if that fails
too the only thing we can do is warn user.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It's important to keep XATTRs untouched (well, in the same state
they were in when entering the function). Otherwise our
refcounting would be messed up.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This file implements wrappers over XATTR getter/setter. It
ensures the proper XATTR namespace is used.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For consistency, handle the @data "char **" (or remote_string)
assignments and processing similarly between various APIs
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Using a combination of VIR_ALLOC and VIR_STRDUP into a local
variable and then jumping to error on the VIR_STRDUP before
assiging it into the @data would cause a memory leak. Let's
just avoid that by assiging directly into @data.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
The virtualization driver has two connections to the virtlogd daemon,
one pipe fd for writing to the log file, and one socket fd for making
RPC calls. The typical sequence is to write some data to the pipe fd and
then make an RPC call to determine the current log file offset.
Unfortunately these two operations are not guaranteed to be handling in
order by virtlogd. The event loop for virtlogd may identify an incoming
event on both the pipe fd and socket fd in the same iteration of the
event loop. It is then entirely possible that it will process the socket
fd RPC call before reading the pending log data from the pipe fd.
As a result the virtualization driver will get an outdated log file
offset reported back.
This can be seen with the QEMU driver where, when a guest fails to
start, it will randomly include too much data in the error message it
has fetched from the log file.
The solution is to ensure we have drained all pending data from the pipe
fd before reporting the log file offset. The pipe fd is always in
blocking mode, so cares needs to be taken to avoid blocking. When
draining this is taken care of by using poll(). The extra complication
is that they might already be an event loop dispatch pending on the pipe
fd. If we have just drained the pipe this pending event will be invalid
so must be discarded.
See also https://bugzilla.redhat.com/show_bug.cgi?id=1356108
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The arguments to the N_() macro must only ever be a literal string. It
is not possible to use macro arguments, or use macro string
concatenation in this context. The N_() macro is a no-op whose only
purpose is to act as a marker for xgettext when it extracts translatable
strings from the source code. Anything other than a literal string will
be silently ignored by xgettext.
Unfortunately this means that the clever MSG, MSG2 & MSG_EXISTS macros
used for building up error message strings have prevented any of the
error messages getting marked for translation. We must sadly, revert to
a more explicit listing of strings for now.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The autostart under session daemon might not behave as you'd
expect it to behave. This patch is inspired by latest
libvirt-users discussion:
https://www.redhat.com/archives/libvirt-users/2018-December/msg00047.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The driver is unmaintained, untested and severely broken for
quite some time now. Since nobody even reported any issue with it
let us drop it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
QEMU can report how many times during post-copy migration the domain
running on the destination host tried to access a page which has not
been migrated yet.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The QEMU command line arguments are very long and currently all written
on a single line to /var/log/libvirt/qemu/$GUEST.log. This introduces
logic to add line breaks after every env variable and "-" optional
argument, and every positional argument. This will create a clearer log
file, which will in turn present better in bug reports when people cut +
paste from the log into a bug comment.
An example log file entry now looks like this:
2018-12-14 12:57:03.677+0000: starting up libvirt version: 5.0.0, qemu version: 3.0.0qemu-3.0.0-1.fc29, kernel: 4.19.5-300.fc29.x86_64, hostname: localhost.localdomain
LC_ALL=C \
PATH=/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin \
HOME=/home/berrange \
USER=berrange \
LOGNAME=berrange \
QEMU_AUDIO_DRV=none \
/usr/bin/qemu-system-ppc64 \
-name guest=guest,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/home/berrange/.config/libvirt/qemu/lib/domain-33-guest/master-key.aes \
-machine pseries-2.10,accel=tcg,usb=off,dump-guest-core=off \
-m 1024 \
-realtime mlock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid c8a74977-ab18-41d0-ae3b-4041c7fffbcd \
-display none \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=23,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc \
-no-shutdown \
-boot strict=on \
-device qemu-xhci,id=usb,bus=pci.0,addr=0x1 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x2 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
2018-12-14 12:57:03.730+0000: shutting down, reason=failed
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The virCommand APIs do not expect to be given a NULL value for an arg
name or value. Such a mistake can lead to execution of the wrong
command, as the NULL may prematurely terminate the list of args.
Detect this and report suitable error messages.
This identified a flaw in the storage test which was passing a NULL
instead of the volume path. This flaw was then validated by an incorrect
set of qemu-img args as expected data.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Simplify adding of new errors by just adding them to the array of
messages rather than having to add conversion code.
Additionally most of the messages add the format string part as a suffix
so we can avoid some of the duplication by using a macro which adds the
suffix to the original string. This way most messages fit into the 80
column limit and only 3 exceed 100 colums.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Erik Skultety <eskultet@redhat.com>
Clarify how @info is used and what the returned values look like.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Simplify wording of the error string for VIR_ERR_OPEN_FAILED and
VIR_ERR_CALL_FAILED. The error codes itself are currently unused so it
will not impact any client.
This will simplify upcomming patch which refactors how we convert these.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Few error codes were missing the version of the message with additional
info. In case of the modified messages it's not very likely they'll ever
report any additional data, but for the sake of consistency we should
provide them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Additional information for an error message is either in form of a
string or empty. Fix two offenders. One used %d as the format modifier
and the second one always expected a string.
Thankfully, neither of the offenders are currently in effect.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
We do have one for the error domain but not for the error number itself.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Require that all headers are guarded by a symbol named
LIBVIRT_$FILENAME
where $FILENAME is the uppercased filename, with all characters
outside a-z changed into '_'.
Note we do not use a leading __ because that is technically a
namespace reserved for the toolchain.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This introduces a syntax-check script that validates header files use a
common layout:
/*
...copyright header...
*/
<one blank line>
#ifndef SYMBOL
# define SYMBOL
....content....
#endif /* SYMBOL */
For any file ending priv.h, before the #ifndef, we will require a
guard to prevent bogus imports:
#ifndef SYMBOL_ALLOW
# error ....
#endif /* SYMBOL_ALLOW */
<one blank line>
The many mistakes this script identifies are then fixed.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
For some reason, xdr_free uses char * instead of void * for its 2nd
argument which is passed to a custom free routine. Commit
dc54b3ec missed this detail which made the build fail on a number of
platforms. Fix it by explicitly casting the object pointer to char *
just like we do in other places throughout the code base.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When libvirt is reconnecting to running domain that uses cgroup v2
the QEMU process reports cgroup for the emulator directory because the
main thread is in that cgroup. We need to remove the "/emulator" part
in order to match with the root cgroup directory name for that domain.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The rewrite to support cgroup v2 missed this function. In cgroup v2
we have different files to track tasks.
We would fail to remove cgroup on non-systemd OSes if there is any
extra process assigned to guest cgroup because we would not kill any
process form the guest cgroup.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Turns out there some build platforms that must not define MOUNT
or VGCHANGE in config.h... So moving the commands from the storage
backend specific module into a common storage_util module causes
issues for those platforms.
So instead of assuming they are there, let's just pass the command
string to the storage util API's from the storage backend specific
code (as would have been successful before). Also modify the test
to determine whether the MOUNT and/or VGCHANGE doesn't exist and
just define it to (for example) what Fedora has for the path. Could
have just used "mount" and "vgchange" in the call, but that defeats
the purpose of adding the call to virTestClearCommandPath.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The make_nonnull_XXX methods can all fail due to OOM but this was being
silently ignored and thus also not checked by callers. Make the methods
propagate errors and use ATTRIBUTE_RETURN_CHECK to force callers to deal
with it.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
In many files there are header comments that contain an Author:
statement, supposedly reflecting who originally wrote the code.
In a large collaborative project like libvirt, any non-trivial
file will have been modified by a large number of different
contributors. IOW, the Author: comments are quickly out of date,
omitting people who have made significant contribitions.
In some places Author: lines have been added despite the person
merely being responsible for creating the file by moving existing
code out of another file. IOW, the Author: lines give an incorrect
record of authorship.
With this all in mind, the comments are useless as a means to identify
who to talk to about code in a particular file. Contributors will always
be better off using 'git log' and 'git blame' if they need to find the
author of a particular bit of code.
This commit thus deletes all Author: comments from the source and adds
a rule to prevent them reappearing.
The Copyright headers are similarly misleading and inaccurate, however,
we cannot delete these as they have legal meaning, despite being largely
inaccurate. In addition only the copyright holder is permitted to change
their respective copyright statement.
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Support for nested KVM is handled via a kernel module configuration
parameters values for kvm_intel, kvm_amd, kvm_hv (PPC), or kvm (s390).
While it's possible to fetch the kmod config values via virKModConfig,
unfortunately that is the static value and we need to get the
current/dynamic value from the kernel file system.
So this patch adds a new API virHostKVMSupportsNesting that will
search the 3 kernel modules to get the nesting value and check if
it is 'Y' (or 'y' just in case) to return a true/false whether
the KVM kernel supports nesting.
We need to do this in order to handle cases where adjustments to
the value are made after libvirtd is started to force a refetch of
the latest QEMU capabilities since the correct CPU settings need
to be made for a guest to add the "vmx=on" to/for the guest config.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1656255
If virSecretGetSecretString is using by secretLookupByUUID,
then it's possible the found sec->usageType doesn't match the
desired @secretUsageType. If this occurs for the encrypted
volume creation processing and a subsequent pool refresh is
executed, then the secret used to create the volume will not
be found by the storageBackendLoadDefaultSecrets which expects
to find secrets by VIR_SECRET_USAGE_TYPE_VOLUME.
Add a check to virSecretGetSecretString to avoid the possibility
along with an error indicating the incorrect matched types.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Add the logical storage pool startup validation (xml2argv) tests.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
It's only pass as 0 or 1 and used as a bool, let's just use a bool
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Let's create helpers for each style of command line created. This
primarily is easier on the eyes rather than the large multi line
if-then-else-else clause used, but may also be useful if in the
future any particular pool needs to add to the command line based
on pool xml format.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Move virStorageBackendFileSystemMountCmd to storage_util so that
it can be used by the test harness.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Extract out the code that is used to create the MOUNT command
for starting the pool. We can use this for Storage Pool XML
to Argv testing to ensure code changes don't alter how a
storage pool is started.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1624223
There are two ways to request memory preallocation on cmd line:
-mem-prealloc and .prealloc attribute for a memory-backend-file.
However, as it turns out it's not safe to use both at the same
time. If -mem-prealloc is used then qemu will fully allocate the
memory (this is done by actually touching every page that has
been allocated). Then, if .prealloc=yes is specified,
mbind(flags = MPOL_MF_STRICT | MPOL_MF_MOVE) is called which:
a) has to (possibly) move the memory to a different NUMA node,
b) can have no effect when hugepages are in play (thus ignoring user
request to place memory on desired NUMA nodes).
Prefer -mem-prealloc as it is more backward compatible
compared to switching to "-numa node,memdev= + -object
memory-backend-file".
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
So far we have two arguments that we are passing to
qemuBuildMemoryBackendProps() and that are taken from domain
private data: @qemuCaps and @autoNodeset. In the next commit I
will use one more item from there. Therefore, instead of having
it as yet another argument to the function, pass pointer to the
private data object.
There is one change in qemuDomainAttachMemory() where previously
@autoNodeset was NULL but now is priv->autoNodeset (which may be
set). This is safe to do as @autoNodeset is advisory only.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1624336
Add a check during virDomainDefCompatibleDevice whether the
domain supports cold/hotplug of a memory module even though
this duplicates the qemuDomainDefValidateMemoryHotplug check.
Without this check, the cold/hot plug would fail on the
subsequent mem_memory check (since it's 0). Adding a check
for max_memory > 0 would allow the subsequent hotplug check
to fail, but would cause coldplug to fail with the somewhat
opaque message "no free memory device slot available".
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
If virDomainDefCompatibleDevice fails because there is insufficient
domain def->mem.max_memory, then let's also print out that value in
the error message.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
The QEMU validation code for graphics has been in place for a while, but
because it is only executed from virDomainDeviceInfoIterateInternal, it
was never run, since the iterator expects the device to have boot info
which graphics don't have. The unfortunate side effect of this whole mess
was that a few capabilities were missing from the test suite (as commit
d8266ebe1 demonstrated with graphics-spice-invalid-egl-headless test),
which in turn meant that a few graphics tests which expected a failure
happily accepted any failure the test runtime returned which made them
succeed. The impact of this was that we then allowed to start a domain
with multiple OpenGL-enabled graphics devices.
This patch enables iteration over graphics devices. Unsurprisingly,
a few tests started to fail as a result, so fix those too.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Validation of domain devices is accomplished via a generic device
iterator which takes a callback, iterates over all kinds of supported
device types and invokes the callback on every single device. However,
there might be cases when we need to alter the behaviour of the
iteration (most notably skip or include a group of devices). Therefore,
this patch introduces iterator flags.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Since the code was never run, it would have been very hard to spot this
mistake, especially since the compiler can't really warn about it.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This commit fixes a bug when you have multiple network settings defined.
Basically, if you set an IPv6 or IPv4 gateway, it carries on next
network settings. It is happening because the data is not being
initialized when a new network type is defined. So, the old data still
persists into the pointer. Another way to initialized the data was
introduced using memset() to avoid missing attributes from the struct.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Commit 255e0732 introduced a few graphics-related helpers. The problem
is that virDomainGraphicsNeedsAutoRenderNode returns true if it gets
NULL as a response from virDomainGraphicsNeedsAutoRenderNode. That's
okay for egl-headless because that one always needs a DRM render node,
the same is not true for SPICE though, and unless the XML specifies
<gl enable='yes'> for SPICE, there's no need for any renderer.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Disable external snapshot of a readonly disk for domains as
this operation is not very useful. Such a snapshot is not
possible for active domains but the error message from QEMU
is more cryptic:
error: internal error: unable to execute QEMU command 'transaction':
Could not create file: Permission denied
This error at least makes the error more understandable for
active domains and disallows for inactive domains as well.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
If domain is killed with `xl destroy`, libvirt will not notice it and
still report the domain as running. Also trying to destroy the domain
through libvirt will fail. The only way to recover from such a situation
is to restart libvirt daemon. The problem is that even though libxl
report LIBXL_EVENT_TYPE_DOMAIN_DEATH, libvirt ignore it as all the
domain cleanup is done in a function actually destroying the domain. If
destroy is done outside of libvirt, there is no place where it would be
handled.
Fix this by doing domain cleanup in LIBXL_EVENT_TYPE_DOMAIN_DEATH too.
To avoid doing it twice, add a ignoreDeathEvent flag
libxlDomainObjPrivate, set when the domain death is triggered by libvirt
itself.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Since domain was suspended before and on failed wakeup is destroyed,
send an event.
Also, add missing libxlDomainCleanup.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Commit 017dfa27d changed a few switch statements in the LXC code to
have all possible enum values, and in the process changed the switch
statement in virLXCControllerGetNICIndexes() to return an error status
for unsupported interface types, but it erroneously put type='direct'
on the list of unsupported types.
type='direct' (implemented with a macvlan interface) is supported on
LXC, but it's interface shouldn't be placed on the list of interfaces
given to CreateMachineWithNetwork() because the interface is put
inside the container, while CreateMachineWithNetwork() only wants to
know about the parent veths of veth pairs (the parent veth remains on
the host side, while the child veth is put into the container).
Resolves: https://bugzilla.redhat.com/1656463
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virLXCControllerGetNICIndexes() was deciding whether or not to add the
ifindex for an interface's ifname to the list of ifindexes sent to
CreateMachineWithNetwork based on the interface type stored in the
config. This would be incorrect in the case of <interface
type='network'> where the network was giving out macvlan interfaces
tied to a physical device (i.e. when the actual interface type was
"direct").
Instead of checking the setting of "net->type", we should be checking
the setting of virDomainNetGetActualType(net).
I don't think this caused any actual misbehavior, it was just
technically wrong.
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Most of the iptables APIs share code for the add/delete paths, but a
couple were separated. Merge the remaining APIs to facilitate future
changes.
Reviewed-by: Laine Stump <laine@laine.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Add support for converting openvswitch interface configuration
to/from libvirt domXML and xl.cfg(5). The xl config syntax for
virtual interfaces is described in detail in the
xl-network-configuration(5) man page. The Xen Networking wiki
also contains information and examples for using openvswitch
in xl.cfg config format
https://wiki.xenproject.org/wiki/Xen_Networking#Open_vSwitch
Tests are added to check conversions of openvswitch tagged and
trunked VLAN configuration.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
It is currently possible to use <interface>s of type openvswitch
with the libxl driver in a non-standard way, e.g.
<interface type='bridge'>
<source bridge='ovsbr0'/>
<mac address='00:16:3e:7a:35:ce'/>
<script path='vif-openvswitch'/>
</interface>
This patch adds support for openvswitch <interface>s specified
in typical libvirt config
<interface type='bridge'>
<source bridge='ovsbr0'/>
<mac address='00:16:3e:7a:35:ce'/>
<virtualport type='openvswitch'/>
</interface>
VLAN tags and trunking are also supported using the extended
syntax for specifying an openvswitch bridge in libxl
BRIDGE_NAME[.VLAN][:TRUNK:TRUNK]
See Xen's networking wiki for more details on openvswitch support
https://wiki.xenproject.org/wiki/Xen_Networking#Open_vSwitch
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Commit 212dc9286 made a generic qemuDomainGetIOThreadsMon which
would fail if the QEMU_CAPS_OBJECT_IOTHREAD didn't exist. Then
commit d1eac927 used that helper for the collection of all domain
stats. However, if the capability doesn't exist, then the entire
stats collection fails. Since the IOThread stats were meant to be
if available only, thus rather than failing if the capability
doesn't exist, let's just not collect the stats. Restore the caps
failure logic for qemuDomainGetIOThreadsLive.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
During qemuConnectGetAllDomainStats if qemuDomainGetStats causes
a failure, then when collecting more than one domain's worth of
statistics the loop in virDomainStatsRecordListFree would call
virDomainFree which would call virResetLastError effectively wiping
out the reason we failed leaving the caller with no idea why the
collection failed.
To fix this, let's Preserve the error and Restore it prior to return
so that a caller such as 'virsh domstats' doesn't get the generic
"error: An error occurred, but the cause is unknown".
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We already have a way stricter check in the code which is doing the
snapshot so duplicating it in the parser does not make much sense. Also
gets rid of an ugly ternary operator.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We are preparing a certain disk source passed in as '@src' so the
individual functions should use that rather than disk->src which
corresponds to the top level element of the chain only.
Without this change TLS and persistent reservations would not work for
backing images of a chain when using -blockdev.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function clears and frees the passed buffers on success, but not in
one case of failure. Modify the control flow that the args are always
consumed, record it in the docs and remove few pointless cleanup paths
in callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1656014
An RNG device can consists of more devices than RND device
itself. For instance, in case of EGD there is a chardev that
connects to EGD daemon and feeds the qemu with random data. When
doing RNG device removal we have to remove the associated chardev
as well.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The way that the code is currently written makes my eyes hurt.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There are two functions called from syncNicRxFilterMultiMode:
virNetDevSetRcvAllMulti() and virNetDevSetRcvMulti(). Both of
them return 0 on success and -1 on error. However, currently
their return value is checked for != 0 which conflicts with our
assumptions on retvals: a positive value is still considered
success but with current check it would lead to failure.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Depending on whether QEMU actually supports the option, we can put the
'rendernode' on the '-display egl-headless' cmdline.
https://bugzilla.redhat.com/show_bug.cgi?id=1628892
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Just like for SPICE, we need to change the permissions on the DRI device
used as the @rendernode for egl-headless graphics type.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Just like for SPICE, we need to put the render node DRI device into the
device cgroup list so that users don't need to add it manually via
qemu.conf file.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Just like for SPICE, we need to put the DRI device into the namespace,
otherwise it will be left out from the DAC relabeling process.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Unlike with SPICE and SDL which use the <gl> subelement to enable OpenGL
acceleration, specifying egl-headless graphics in the XML has
essentially the same meaning, thus in case of egl-headless we don't have
a need for the 'enable' element attribute and we'll only be interested
in the 'rendernode' one further down the road.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>