41307 Commits

Author SHA1 Message Date
Daniel P. Berrangé
cff6236105 qemu: use on|off for -vnc boolean option values
The preferred syntax for boolean options is to set the value "on" or
"off". QEMU 7.1.0 will deprecate the short format we currently use.

The long format has been supported with -vnc since the change to use
QemuOpts in 2.2.0, so we check based on the new capability flag.

Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-16 14:03:11 +00:00
Daniel P. Berrangé
a4f57fa37d qemu: probe for -vnc supporting use of QemuOpts syntax
This was introduced in QEMU 2.2.0, and is visible by -vnc appearing in
the "query-command-line-options" data.

Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-16 14:02:59 +00:00
Michal Privoznik
2642cc0f66 qemu: Don't lie about @ndevAlias when translating FSInfo
When virDomainGetFSInfo() is called over a QEMU/KVM domain it
results into calling of 'guest-get-fsinfo' guest agent command to
which it replies with info on guest (mounted) filesystems. When
filling return structure we also try to do basic lookup and
translate guest agent provided disk address into disk target (as
seen in domain XML). This can of course fail - guest can have
variety of disks not recorded in domain XML (iSCSI, scsi_debug,
NFS to name a few). If that's the case, a debug message is logged
and no disk target is added into the return structure.

However, due to the way our code is written the caller is led to
believe that the target was added into the structure. This may
lead to a situation where the array of disk targets (strings)
contains NULL. But our RPC structure says the array contains only
non-NULL strings. This results in somewhat 'cryptic' (at least to
users) error message:

  error: Unable to get filesystem information
  error: Unable to encode message payload

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1919783
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-02-16 14:06:31 +01:00
Michal Privoznik
9ad1e1d897 qemu: Bring if() outside from loop in virDomainFSInfoFormat()
After previous commit, the freeing of @info_ret inside of
virDomainFSInfoFormat() looks like this:

  for () {
    if (info_ret)
      virDomainFSInfoFree(info_ret[i]);
  }

It is needless to compare @info_ret against NULL in each
iteration. We can switch the order and do the comparison first
followed by the loop.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-02-16 14:06:31 +01:00
Michal Privoznik
59c80e9fd0 qemu: Move qemuAgentFSInfo array free into qemuDomainGetFSInfo()
When qemuDomainGetFSInfo() is called it calls
qemuDomainGetFSInfoAgent() which executes 'guest-get-fsinfo'
guest agent command, parses returned JSON and returns an array of
qemuAgentFSInfo structures (well, pointers to those structs).
Then it grabs a domain job and tries to do some matching of guest
returned info against domain definition. This matching is done in
virDomainFSInfoFormat() which also frees the array of
qemuAgentFSInfo structures allocated earlier.

But this is not just. If acquiring the domain job fails (or
domain activeness check executed right after that fails) then
virDomainFSInfoFormat() is not called, leaking the array of
structs.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-02-16 14:06:31 +01:00
Michal Privoznik
5922b2b104 qemu: Drop needless check in virDomainFSInfoFormat()
As the very first thing, this function checks whether the number
of items inside @agentinfo array is not negative. This is
redundant as the only caller - qemuDomainGetFSInfo() already
checked for that and would not even call this function if that
was the case.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-02-16 14:06:31 +01:00
Daniel P. Berrangé
67f8ccb4e2 qemu: use long on|off syntax for -spice boolean option values
The preferred syntax for boolean options is to set the value "on" or
"off". QEMU 7.1.0 will deprecate the short format we currently use.

The long format has been supported with -spice since at least 1.5.3,
so we don't need to check for it.

Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-16 12:38:30 +00:00
Daniel P. Berrangé
43c9c0859f qemu: use long on|off syntax for -chardev boolean option values
The preferred syntax for boolean options is to set the value "on" or
"off". QEMU 7.1.0 will deprecate the short format we currently use.

The long format has been supported with -chardev since at least 1.5.3,
so we don't need to check for it.

Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-16 12:38:20 +00:00
Peter Krempa
ce8670ef46 qemuSnapshotFSFreeze: Don't return -2
The -2 value is misleading because if 'qemuAgentFSFreeze' fails it
doesn't necessarily mean that the command was sent to the agent.

Since callers don't care about the -2 value specifically, remove it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-02-16 12:25:30 +01:00
Peter Krempa
ee91c82533 qemuSnapshotCreateActiveExternal: Don't thaw filesystems when freeze fails
If we didn't freeze any filesystems we should not even attempt thawing
them. Additionally 'guest-fsfreeze-freeze' fails if the filesystems are
already frozen, where thawing them may break users data integrity if
they used VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE accidentally after an
explicit virDomainFSFreeze and the next snapshot without that flag would
be taken with already thawed filesystems.

This effectively reverts 7c736bab06479ccec59df69fb79a5c06d112d8fb .
Libvirt nowadays checks whether the guest agent is connected and pings
it before issuing an command so it's very unlikely that we'd end up in a
situation where qemuSnapshotCreateActiveExternal froze filesystems and
didn't thaw them.

Additionally we now discourage the use of
VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE since users have better control if
they freeze the FS themselves.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-02-16 12:25:30 +01:00
Peter Krempa
ec86b8fa29 api: Discourage use of VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE
The flag creates additional points of failure which are hard to recover
from, such as when thawing of the filesystems fails after an otherwise
successful snapshot.

Encourage use of explicit virDomainFSFreeze/virDomainFSThaw.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-02-16 12:25:30 +01:00
Peter Krempa
4079144836 storagevolxml2argvdata: Rewrap all output files
Use scripts/test-wrap-argv.py to rewrap the output files so that any
further changes don't introduce churn since we are rewrapping the output
automatically now.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2021-02-16 12:25:30 +01:00
Peter Krempa
ba2369d2fe testutils: virTestRewrapFile: Rewrap also '.argv' files
The suffix is used for output files of 'storagevolxml2argvtest.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2021-02-16 12:25:30 +01:00
Peter Krempa
b805ff66d4 qemuMigrationSrcPerformPeer2Peer3: Don't leak 'dom_xml' on cleanup
Use g_autofree for 'dom_xml' to free it on some of the (unlikely) code
paths jumping to cleanup prior to the deallocation which is done right
after it's not needed any more since it's a big string.

Noticed when running under valgrind:

==2204780== 8,192 bytes in 1 blocks are definitely lost in loss record 2,539 of 2,551
==2204780==    at 0x483BCE8: realloc (vg_replace_malloc.c:834)
==2204780==    by 0x4D890DF: g_realloc (in /usr/lib64/libglib-2.0.so.0.6600.4)
==2204780==    by 0x4DA3AF0: g_string_append_vprintf (in /usr/lib64/libglib-2.0.so.0.6600.4)
==2204780==    by 0x4917293: virBufferAsprintf (virbuffer.c:307)
==2204780==    by 0x49B0B75: virDomainChrDefFormat (domain_conf.c:26109)
==2204780==    by 0x49E25EF: virDomainDefFormatInternalSetRootName (domain_conf.c:28956)
==2204780==    by 0x15F81D24: qemuDomainDefFormatBufInternal (qemu_domain.c:6204)
==2204780==    by 0x15F8270D: qemuDomainDefFormatXMLInternal (qemu_domain.c:6229)
==2204780==    by 0x15F8270D: qemuDomainDefFormatLive (qemu_domain.c:6279)
==2204780==    by 0x15FD8100: qemuMigrationSrcBeginPhase (qemu_migration.c:2395)
==2204780==    by 0x15FE0F0D: qemuMigrationSrcPerformPeer2Peer3 (qemu_migration.c:4640)
==2204780==    by 0x15FE0F0D: qemuMigrationSrcPerformPeer2Peer (qemu_migration.c:5093)
==2204780==    by 0x15FE0F0D: qemuMigrationSrcPerformJob (qemu_migration.c:5168)
==2204780==    by 0x15FE280E: qemuMigrationSrcPerform (qemu_migration.c:5372)
==2204780==    by 0x15F9BA3D: qemuDomainMigratePerform3Params (qemu_driver.c:11841)

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2021-02-16 12:25:30 +01:00
Peter Krempa
fb1fb62db6 virDomainMigrateVersion3Full: Don't set 'cancelled' to the same value
It's already initialized to '1'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2021-02-16 12:25:30 +01:00
Andrea Bolognani
3ab9b399bc ci: Build on macOS 11 instead of macOS 10.15
macOS builder capacity on Cirrus CI is quite limited, and so we
can't afford to keep the old build job around after adding the
new one like we do for FreeBSD.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-02-16 11:35:44 +01:00
Andrea Bolognani
cb5defccb1 ci: Update package list on Cirrus CI
While pkgng on FreeBSD updates the package list automatically
when it's run, homebrew on macOS doesn't do the same thing, which
can result in stale packages being installed. Explicitly call
'brew update' before 'brew install' to avoid that scenario.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-02-16 11:35:38 +01:00
Michal Privoznik
a1229335f6 qemu_hotplug: Don't dereference NULL pointer @newb in qemuDomainChangeNet()
In one of my previous commits I've made an attempt to restore the
noqueue qdisc on a TAP corresponding to domain's <interface/> if
QoS is cleared out. The commit consisted of two almost identical
hunks. In both the pointer is dereferenced. But in one of them,
the pointer to new bandwidth can't be NULL while in the other it
can leading to a crash.

Fixes: d53b09235398c1320ed2f1b45b640823171467ed
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1919619
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2021-02-16 09:05:33 +01:00
Ville Skyttä
97f99b4bd4 docs: tlscerts: Fix a few broken links
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-02-16 08:35:09 +01:00
Pavel Hrdina
6a1f5e8a4f vircgroup: correctly free nested virCgroupPtr
Fixes: 184245f53b94fc84f727eb6e8a2aa52df02d69c0

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2021-02-15 19:21:35 +01:00
Andrea Bolognani
ee095c2312 ci: Build on FreeBSD 12.2
The FreeBSD 12.1 image on Cirrus CI is currently broken, but
that's okay because a FreeBSD 12.2 image is also available and
we'd rather build on the more up-to-date target anyway.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-02-15 12:11:28 +01:00
Andrea Bolognani
2d92970d8f ci: Refresh Dockerfiles
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-02-15 12:11:23 +01:00
Andrea Bolognani
cca0f9db42 news: Mention Apple Silicon support
After the recent fixes, it's now confirmed to work.

https://gitlab.com/libvirt/libvirt/-/issues/121

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-02-15 11:18:44 +01:00
Ricky Tigg
97c57c6785 Translated using Weblate (Finnish)
Currently translated at 14.6% (1530 of 10451 strings)

Translation: libvirt/libvirt
Translate-URL: https://translate.fedoraproject.org/projects/libvirt/libvirt/fi/

Co-authored-by: Ricky Tigg <ricky.tigg@gmail.com>
Signed-off-by: Ricky Tigg <ricky.tigg@gmail.com>
2021-02-13 10:40:13 +01:00
Laine Stump
d2b0ee0aff vmware: convert VIR_FREE to g_free in other functions that free their arg
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-12 12:10:38 -05:00
Laine Stump
04e90f72a7 util: convert VIR_FREE to g_free in other functions that free their arg
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-12 12:10:38 -05:00
Laine Stump
4abf2d5d74 qemu: convert VIR_FREE to g_free in other functions that free their arg
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-12 12:10:38 -05:00
Laine Stump
d835c0affe remote: convert VIR_FREE to g_free in other functions that free their arg
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-12 12:10:38 -05:00
Laine Stump
c80efb9b60 openvz: convert VIR_FREE to g_free in other functions that free their arg
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-12 12:10:38 -05:00
Laine Stump
f42be9ae48 locking: convert VIR_FREE to g_free in other functions that free their arg
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-12 12:10:38 -05:00
Laine Stump
11c48fe6eb conf: convert VIR_FREE to g_free in other functions that free their arg
Previous patches have converted VIR_FREE to g_free in functions with
names ending in Free() and Dispose(), but there are a few similar
functions with names that don't fit that pattern, but server the same
purpose (and thus can survive the same conversion). in particular
*Free*(), and *Unref().

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-12 12:10:38 -05:00
Laine Stump
e5339e38ca esx: replace VIR_FREE with g_free in any ESX_VI__TEMPLATE__FREE
Invocations of the macro ESX_VI__TEMPLATE__FREE() will free the main
object (referenced as "item") that's pointing to all the things being
VIR_FREEd in the body, so it is safe for all the pointers in item to
just be g_freed rather that VIR_FREEd.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-12 12:10:38 -05:00
Michal Privoznik
859f7e2072 qemu_shim: URI escape root directory
The root directory can be provided by user (or a temporary one is
generated) and is always formatted into connection URI for both
secret driver and QEMU driver, like this:

  qemu:///embed?root=$root

But if it so happens that there is an URI unfriendly character in
root directory or path to it (say a space) then invalid URI is
formatted which results in unexpected results. We can trust
g_dir_make_tmp() to generate valid URI but we can't trust user.
Escape user provided root directory. Always.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1920400
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 17:59:42 +01:00
Erik Skultety
f28a652a32 ci: Makefile: Expose the CI_USER_LOGIN variable for users to use
More often than not I find myself debugging in the containers which
means that I need to have root inside, but without manually tweaking
the Makefile each time the execution would simply fail thanks to the
uid/gid mapping we do. What if we expose the CI_USER_LOGIN variable, so
that when needed, the root can be simply passed with this variable and
voila - you have a root shell inside the container with CWD=~root.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2021-02-12 17:01:41 +01:00
Erik Skultety
321293e2a3 ci: Drop the prepare.sh script
The purpose of this script was to prepare a customized environment in
the container, but was actually never used and it required the usage of
sudo to switch the environment from root's context to a regular user's
one.
The thing is that once someone needs a custom script they would very
likely to debug something and would also benefit from root privileges
in general, so the usage of 'sudo' in such case was a bit cumbersome.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2021-02-12 17:01:41 +01:00
Erik Skultety
ee07bffacc ci: Run podman command directly without wrapping it with prepare.sh
The prepare.sh script isn't currently used and forces us to make use
of sudo to switch the user inside the container from root to $USER
which created a problem on our Debian Slim-based containers which don't
have the 'sudo' package installed.
This patch removes the sudo invocation and instead runs the CMD
directly with podman.

Summary of the changes:
- move the corresponding env variables which we need to be set in the
  environment from the sudo invocation to the podman invocation
- pass --workdir to podman to retain the original behaviour we had with
  sudo spawning a login shell.
- MESON_OPTS env variable doesn't need to propagated to the execution
  environment anymore (like we had to do with sudo), because it's
  defined in the Dockerfile

Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2021-02-12 17:01:41 +01:00
Erik Skultety
3ca7299a00 ci: Specify the shebang sequence for build.sh
This is necessary for the follow up patch, because the default
entrypoint for a Dockerfile is exec.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2021-02-12 17:01:41 +01:00
Andrea Bolognani
025ac65ada ci: Move ppc64le build from Debian sid to Debian 10
Debian sid is currently broken on ppc64le, so move the build to
Debian 10; do the opposite for the aarch64 and mips64el builds to
try and restore the 10/sid balance.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 16:35:15 +01:00
Andrea Bolognani
c56dac1c54 ci: Mark container build jobs as required/optional correctly
Whether a container build job is considered required depends on
whether the corresponding cross-build job exists, and in a few
cases the two got out of sync over time.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 16:35:09 +01:00
Andrea Bolognani
52a6cd5f9e ci: Shuffle cross-building jobs around
Keep them ordered by architecture, the same way the corresponding
container jobs are, to make it easier to jump between the two
sections and compare them.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 16:34:39 +01:00
Daniel P. Berrangé
46783e6307 tools: report messages for 'dominfo' command
$ virsh dominfo demo
Id:             2
Name:           demo
UUID:           eadf8ef0-bf14-4c5f-9708-4a19bacf9e81
OS Type:        hvm
State:          running
CPU(s):         2
CPU time:       15.8s
Max memory:     1536000 KiB
Used memory:    1536000 KiB
Persistent:     yes
Autostart:      disable
Managed save:   no
Security model: selinux
Security DOI:   0
Security label: unconfined_u:unconfined_r:svirt_t:s0:c443,c956 (permissive)
Messages:       tainted: custom monitor control commands issued
                tainted: use of deprecated configuration settings
                deprecated configuration: machine type 'pc-1.2'

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 09:19:13 +00:00
Daniel P. Berrangé
970a59d746 qemu: implement virDomainGetMessages API
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 09:19:13 +00:00
Daniel P. Berrangé
07308b9789 remote: add RPC support for the virDomainGetMessages API
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 09:19:13 +00:00
Daniel P. Berrangé
c80911f2de src: define virDomainGetMessages API
This API allows fetching a list of informational messages recorded
against the domain. This provides a way to give information about
tainting of the guest due to undesirable actions/configs, as well
as provide details of deprecated features.

The output of this API is explicitly targetted at humans, not
machines, so it is inappropriate to attempt to pattern match on
the strings and take action off them, not least because the messages
are marked for translation.

Should there be a demand for machine targetted information, this
would have to be addressed via a new API, and is not planned at
this point in time.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 09:19:12 +00:00
Daniel P. Berrangé
17f001c451 qemu: record deprecation messages against the domain
These messages are only valid while the domain is running.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 09:19:12 +00:00
Daniel P. Berrangé
842900dc1e conf: record deprecation messages against the domain
These messages will be stored in the live status XML.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 09:19:12 +00:00
Laine Stump
bebaafd6b4 news: document support for <teaming> in <hostdev>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-11 17:21:59 -05:00
Laine Stump
010ed0856b qemu: plug <teaming> config from <hostdev> into qemu commandline
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-11 17:21:59 -05:00
Laine Stump
db64acfbda conf: parse/format <teaming> element in plain <hostdev>
The <teaming> element in <interface> allows pairing two interfaces
together as a simple "failover bond" network device in a guest. One of
the devices is the "transient" interface - it will be preferred for
all network traffic when it is present, but may be removed when
necessary, in particular during migration, when traffic will instead
go through the other interface of the pair - the "persistent"
interface. As it happens, in the QEMU implementation of this teaming
pair (called "virtio failover" in QEMU) the transient interface is
always a host network device assigned to the guest using VFIO (aka
"hostdev"); the persistent interface is always an emulated virtio NIC.

When support was initially added for <teaming>, it was written to
require that the transient/hostdev device be defined using <interface
type='hostdev'>; this was done because the virtio failover
implementation in QEMU and the virtio guest driver demands that the
two interfaces in the pair have matching MAC addresses, and the only
way libvirt can guarantee the MAC address of a hostdev network device
is to use <interface type='hostdev'>, whose main purpose is to
configure the device's MAC address before handing the device to
QEMU. (note that <interface type='hostdev'> in turn requires that the
network device be an SRIOV VF (Virtual Function), as that is the only
type of network device whose MAC address we can set in a way that will
survive the device's driver init in the guest).

It has recently come up that some users are unable to use <teaming>
because they are running in a container environment where libvirt
doesn't have the necessary privileges or resources to set the VF's MAC
address (because setting the VF MAC is done via the same device's PF
(Physical Function), and the PF is not exposed to libvirt's container).

At the same time, these users *are* able to set the VF's MAC address
themselves in advance of staring up libvirt in the container. So they
could theoretically use the <teaming> feature if libvirt just skipped
the "setting the MAC address" part.

Fortunately, that is *exactly* the difference between <interface
type='hostdev'> (which must be a "hostdev VF") and <hostdev> (a "plain
hostdev" - it could be *any* PCI device; libvirt doesn't know what type
of PCI device it is, and doesn't care).

But what is still needed is for libvirt to provide a small bit of
information on the QEMU commandline argument for the hostdev, telling
QEMU that this device will be part of a team ("failover pair"), and
the id of the other device in the pair.

To make both of those goals simultaneously possible, this patch adds
support for the <teaming> element to plain <hostdev> - libvirt doesn't
try to set any MAC addresses, and QEMU gets the extra commandline
argument it needs)

(actually, this patch adds only the parsing/formatting of the
<teaming> element in <hostdev>. The next patch will actually wire that
into the qemu driver.)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-11 17:15:34 -05:00
Laine Stump
5cea59b2b3 schema: separate teaming element definition from interface element
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-11 16:31:52 -05:00