conf: parse/format <teaming> element in plain <hostdev>
The <teaming> element in <interface> allows pairing two interfaces
together as a simple "failover bond" network device in a guest. One of
the devices is the "transient" interface - it will be preferred for
all network traffic when it is present, but may be removed when
necessary, in particular during migration, when traffic will instead
go through the other interface of the pair - the "persistent"
interface. As it happens, in the QEMU implementation of this teaming
pair (called "virtio failover" in QEMU) the transient interface is
always a host network device assigned to the guest using VFIO (aka
"hostdev"); the persistent interface is always an emulated virtio NIC.
When support was initially added for <teaming>, it was written to
require that the transient/hostdev device be defined using <interface
type='hostdev'>; this was done because the virtio failover
implementation in QEMU and the virtio guest driver demand that the
two interfaces in the pair have matching MAC addresses, and the only
way libvirt can guarantee the MAC address of a hostdev network device
is to use <interface type='hostdev'>, whose main purpose is to
configure the device's MAC address before handing the device to
QEMU. (Note that <interface type='hostdev'> in turn requires that the
network device be an SRIOV VF (Virtual Function), as that is the only
type of network device whose MAC address we can set in a way that will
survive the device's driver init in the guest.)
It has recently come up that some users are unable to use <teaming>
because they are running in a container environment where libvirt
doesn't have the necessary privileges or resources to set the VF's MAC
address (because setting the VF MAC is done via the same device's PF
(Physical Function), and the PF is not exposed to libvirt's container).
At the same time, these users *are* able to set the VF's MAC address
themselves in advance of starting up libvirt in the container. So they
could theoretically use the <teaming> feature if libvirt just skipped
the "setting the MAC address" part.
Fortunately, that is *exactly* the difference between <interface
type='hostdev'> (which must be a "hostdev VF") and <hostdev> (a "plain
hostdev" - it could be *any* PCI device; libvirt doesn't know what type
of PCI device it is, and doesn't care).
But what is still needed is for libvirt to provide a small bit of
information on the QEMU commandline argument for the hostdev, telling
QEMU that this device will be part of a team ("failover pair"), and
the id of the other device in the pair.
To make both of those goals simultaneously possible, this patch adds
support for the <teaming> element to plain <hostdev> - libvirt doesn't
try to set any MAC addresses, and QEMU gets the extra commandline
argument it needs.
(Actually, this patch adds only the parsing/formatting of the
<teaming> element in <hostdev>. The next patch will wire that into
the qemu driver.)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-02-11 00:58:29 -05:00
<domain type='qemu'>
  <name>QEMUGuest1</name>
  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
  <memory unit='KiB'>219100</memory>
  <currentMemory unit='KiB'>219100</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='i686' machine='pc'>hvm</type>
    <boot dev='hd'/>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-i386</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/HostVG/QEMUGuest1'/>
      <target dev='hda' bus='ide'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <interface type='user'>
      <mac address='00:11:22:33:44:55'/>
      <model type='virtio'/>
      <teaming type='persistent'/>
      <alias name='ua-backup0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='user'>
      <mac address='66:44:33:22:11:00'/>
      <model type='virtio'/>
      <teaming type='persistent'/>
      <alias name='ua-backup1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </interface>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
2021-02-24 14:24:10 +00:00
    <audio id='1' type='none'/>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x07' function='0x1'/>
      </source>
      <teaming type='transient' persistent='ua-backup0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x07' function='0x2'/>
      </source>
      <teaming type='transient' persistent='ua-backup1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
</domain>
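The pairing expressed in the domain XML above (each plain <hostdev> with
<teaming type='transient'> names, in its persistent attribute, the alias of
a virtio <interface> carrying <teaming type='persistent'/>) can be checked
with a short script. This is only an illustrative sketch, not libvirt's own
validation code: the function name teaming_pairs, the trimmed DOMAIN_XML
copy, and the "dddd:bb:ss.f" key format are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <devices> section above; only teaming-related parts kept.
DOMAIN_XML = """
<domain type='qemu'>
  <devices>
    <interface type='user'>
      <mac address='00:11:22:33:44:55'/>
      <model type='virtio'/>
      <teaming type='persistent'/>
      <alias name='ua-backup0'/>
    </interface>
    <interface type='user'>
      <mac address='66:44:33:22:11:00'/>
      <model type='virtio'/>
      <teaming type='persistent'/>
      <alias name='ua-backup1'/>
    </interface>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x07' function='0x1'/>
      </source>
      <teaming type='transient' persistent='ua-backup0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x07' function='0x2'/>
      </source>
      <teaming type='transient' persistent='ua-backup1'/>
    </hostdev>
  </devices>
</domain>
"""


def teaming_pairs(domain_xml):
    """Map each transient hostdev's host PCI address to its persistent alias.

    Hypothetical helper: raises ValueError if a transient hostdev refers to
    an alias that no persistent interface in the same domain carries.
    """
    devices = ET.fromstring(domain_xml).find("devices")

    # Collect aliases of interfaces marked as the persistent half of a pair.
    persistent = set()
    for iface in devices.findall("interface"):
        teaming = iface.find("teaming")
        alias = iface.find("alias")
        if (teaming is not None and alias is not None
                and teaming.get("type") == "persistent"):
            persistent.add(alias.get("name"))

    pairs = {}
    for hostdev in devices.findall("hostdev"):
        teaming = hostdev.find("teaming")
        if teaming is None or teaming.get("type") != "transient":
            continue  # plain hostdev that is not part of a team
        target = teaming.get("persistent")
        if target not in persistent:
            raise ValueError(f"no persistent interface with alias {target!r}")
        addr = hostdev.find("source/address")
        # Render the host PCI address as dddd:bb:ss.f for use as a key.
        key = "{:04x}:{:02x}:{:02x}.{:x}".format(
            int(addr.get("domain"), 16), int(addr.get("bus"), 16),
            int(addr.get("slot"), 16), int(addr.get("function"), 16))
        pairs[key] = target
    return pairs


if __name__ == "__main__":
    for pci, alias in teaming_pairs(DOMAIN_XML).items():
        print(f"{pci} -> {alias}")
```

Note that the script never looks at MAC addresses: with plain <hostdev>,
making the VF's MAC match the persistent interface's MAC is left to whoever
configures the VF before libvirt starts, as the commit message explains.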