As sketched in previous commits, imagine the following scenario:
virsh # domiftune gentoo vnet0
inbound.average: 100
inbound.peak : 0
inbound.burst : 0
outbound.average: 100
outbound.peak : 0
outbound.burst : 0
virsh # domiftune gentoo vnet0 --inbound 0
virsh # shutdown gentoo
Domain gentoo is being shutdown
virsh # list --all
error: Failed to list domains
error: Cannot recv data: Connection reset by peer
Program received signal SIGSEGV, Segmentation fault.
0x00007fffe80ea221 in networkUnplugBandwidth (net=0x7fff9400c1a0, iface=0x7fff940ea3e0) at network/bridge_driver.c:4881
4881 net->floor_sum -= ifaceBand->in->floor;
This is rather unfortunate. We should not SIGSEGV here. The
problem is, that while in the second step the inbound QoS was
cleared out, the network part of it was not updated (moreover, we
don't report that vnet0 had inbound.floor set). Internal
structure therefore still had some fragments left (e.g.
class_id). So when qemuProcessStop() started to clean up the
environment it got to networkUnplugBandwidth(). Here, class_id is
set therefore function assumes that there is an inbound QoS. This
actually is a fair assumption to make, there's no need for a
special QoS box in network's QoS when there's no QoS to set.
Anyway, the problem is not the networkUnplugBandwidth() rather
than qemuDomainSetInterfaceParameters() which completely forgot
about QoS being disperse (some parts are set directly on
interface itself, some on bridge the interface is plugged into).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So, if a domain vNIC's bandwidth has been successfully set, it's
possible that because @floor is set on network's bridge, this
part may need updating too. And that's exactly what this function
does. While the previous commit introduced a function to check if
@floor can be satisfied, this does all the hard work. In general,
there may be three, well four possibilities:
1) No change in @floor value (either it remain unset, or its
value hasn't changed)
2) The @floor value has changed from a non-zero to a non-zero
value
3) New @floor is to be set
4) Old @floor must be cleared out
The difference between 2), 3) and 4) is, that while in 2) the QoS
tree on the network's bridge already has a special class for the
vNIC, in 3) the class must be created from scratch. In 4) it must
be removed. Fortunately, we have helpers for all three
interesting cases.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When a domain vNIC's bandwidth is to be changed (at runtime) it is
possible that guaranteed minimal bandwidth (@floor) will change too.
Well, so far it is, because we still don't have an implementation that
allows setting it dynamically, so it's effectively erased on:
#virsh domiftune $dom vnet0 --inbound 0
However, that's slightly unfortunate. We do some checks on domain
startup to see if @floor can be guaranteed. We ought do the same if
QoS is changed at runtime.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is no functional change. It's just that later in the series we
will need to pass class_id as an integer.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There is no guarantee that an enum start it mapped onto a value
of zero. However, we are guaranteed that enum items are
consecutive integers. Moreover, it's a pity to define an enum to
avoid using magical constants but then using them anyway.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit a6f9af8292 added checking for address colisions between
starting and ending addresses of forwarding addresses, but forgot that
there might be no addresses set at all.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Unlike what happens on x86, on ppc64 you can't mix and match CPU
features to obtain the guest CPU you want regardless of the host
CPU, so the concept of model fallback doesn't apply.
Make sure CPU definitions emitted by the driver, eg. as output of
the cpuBaseline() and cpuUpdate() calls, reflect this fact.
All previously recognized CPU models (POWER7_v2.1, POWER7_v2.3,
POWER7+_v2.1 and POWER8_v1.0) are internally converted to the
corrisponding generation name so that existing guests don't stop
working.
Use multiple PVRs per CPU model to reduce the number of models we
need to keep track of.
Remove specific CPU models (eg. POWER7+_v2.1): the corresponding
generic CPU model (eg. POWER7) should be used instead to ensure
the guest can be booted on any compatible host.
Get rid of all the entries that did not match any of the CPU
models supported by QEMU, like power8 and power8e.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1250977
This will allow us to perform PVR matching more broadly, eg. consider
both POWER8 and POWER8E CPUs to be the same even though they have
different PVR values.
The upcoming commits will make heavy modifications to the ppc64
driver, split so that it's easier to review the changes.
Instead of updating the test cases so that they pass, possibly
only to update them again with the following commit, disable them
for the time being.
Another commit will update them all in one go once all required
changes are in place.
This ensures comparison of two CPU definitions will be consistent
regardless of the fact that it is performed using cpuCompare() or
cpuGuestData(). The x86 driver uses the same exact code.
Limitations of the POWER architecture mean that you can't run
eg. a POWER7 guest on a POWER8 host when using KVM. This applies
to all guests, not just those using VIR_CPU_MATCH_STRICT in the
CPU definition; in fact, exact and strict CPU matching are
basically the same on ppc64.
This means, of course, that hosts using different CPUs have to be
considered incompatible as well.
Change ppc64Compute(), called by cpuGuestData(), to reflect this
fact and update test cases accordingly.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1250977
ppc64Compute(), called by cpuNodeData(), is used not only to retrieve
the driver-specific data associated to a guest CPU definition, but
also to check whether said guest CPU is compatible with the host CPU.
If the user is not interested in the CPU data, it's perfectly fine
to pass a NULL pointer instead of a return location, and the
compatibility data returned should not be affected by this. One of
the checks, specifically the one on CPU model name, was however
only performed if the return location was non-NULL.
A test is considered successful if the obtained result matches
the expected result: if that's not the case, whether because a
test that was expected to succeed failed or because a test that
was supposed to fail succeeded, then something's not right and
we want the user to know about this.
On the other hand, if a failure that's unrelated to the bits
we're testing occurs, then the user should be notified even if
the test was expected to fail.
Use different values to tell these two situations apart.
Fix a test case that was wrongly expected to fail as well.
Use briefer checks, eg. (!model) instead of (model == NULL), and
avoid initializing to NULL a pointer that would be assigned in
the first line of the function anyway.
Also remove a pointless NULL assignment.
No functional changes.
Use the ppc64Driver prefix for all functions that are used to
fill in the cpuDriverPPC64 structure, ie. those that are going
to be called by the generic CPU code.
This makes it clear which functions are exported and which are
implementation details; it also gets rid of the ambiguity that
affected the ppc64DataFree() function which, despite what the
name suggested, was not related to ppc64DataCopy() and could
not be used to release the memory allocated for a
virCPUppc64Data* instance.
No functional changes.
This is a public library, it shouldn't include anything that is
internal. Including the library in it's current state to an example
application fails the preprocessor phase.
nwfilter uses iptables and ebtables, which only work properly on
tap-based network connections (*not* on macvtap, for example), but we
just ignore any <filterref> elements for other types of networks,
potentially giving users a false sense of security.
This patch checks the network type and fails/logs an error if any
domain <interface> has a <filterref> when the connection isn't using a
tap device.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1180011
This patch modifies virSocketAddrGetRange() to function properly when
the containing network/prefix of the address range isn't known, for
example in the case of the NAT range of a virtual network (since it is
a range of addresses on the *host*, not within the network itself). We
then take advantage of this new functionality to validate the NAT
range of a virtual network.
Extra test cases are also added to verify that virSocketAddrGetRange()
works properly in both positive and negative cases when the network
pointer is NULL.
This is the *real* fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=985653
Commits 1e334a and 48e8b9 had earlier been pushed as fixes for that
bug, but I had neglected to read the report carefully, so instead of
fixing validation for the NAT range, I had fixed validation for the
DHCP range. sigh.
https://bugzilla.redhat.com/show_bug.cgi?id=1022292
The following XML really does not make any sense:
<inbound average="-1" burst="-2" peak="-3" floor="-4"/>
There can't be a negative packet rate. Well, so far we haven't
assigned any meaning to it. So reject it unless users harm themselves,
because otherwise we turn the negative numbers into really big values.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There's no need to set mon->fd to a dummy value since
it's initialized to proper value just a few lines below.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Since its introduction in 2011 (particularly in commit f4324e3292),
the option doesn't work. It just effectively disables all incoming
connections. That's because the client private data that contain the
'keepalive_supported' boolean, are initialized to zeroes so the bool is
false and the only other place where the bool is used is when checking
whether the client supports keepalive. Thus, according to the server,
no client supports keepalive.
Removing this instead of fixing it is better because a) apparently
nobody ever tried it since 2011 (4 years without one month) and b) we
cannot know whether the client supports keepalive until we get a ping or
pong keepalive packet. And that won't happen until after we dispatched
the ConnectOpen call.
Another two reasons would be c) the keepalive_required was tracked on
the server level, but keepalive_supported was in private data of the
client as well as the check that was made in the remote layer, thus
making all other instances of virNetServer miss this feature unless they
all implemented it for themselves and d) we can always add it back in
case there is a request and a use-case for it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
So the API takes @dumpformat argument. This is what makes it special
when compared to virDomainCoreDump. The argument is there so that
users can choose the format of resulting core dump file. And to ease
them the choosing process we even have an enum with supported values
across all the hypervisors. But we don't mention the enum in the
function description anywhere. Fix it!
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
By specifying parentIndex in a call to virNetworkUpdate(), it was
possible to direct libvirt to add a dhcp range or static host of a
non-matching address family to the <dhcp> element of an <ip>. For
example, given:
<ip address='192.168.122.1' netmask='255.255.255.0'/>
<ip family='ipv6' address='2001:db6:ca3:45::1' prefix='64'/>
you could provide a static host entry with an IPv4 address, and
specify that it be added to the 2nd <ip> element (index 1):
virsh net-update default add ip-dhcp-host --parent-index 1 \
'<host mac="52:54:00:00:00:01" ip="192.168.122.45"/>'
This would be happily added with no error (and no concern of any
possible future consequences).
This patch checks that any dhcp range or host element being added to a
network ip's <dhcp> subelement has addresses of the same family as the
ip element they are being added to.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1184736
This controller can be connected only to a port on a
pcie-switch-upstream-port. It provides a single hotpluggable port that
will accept any PCI or PCIe device, as well as any device requiring a
pcie-*-port (the only current example of such a device is the
pcie-switch-upstream-port).
The downstream ports of an x3130-upstream switch can each have one of
these plugged into them (and that is the only place they can be
connected). Each xio3130-downstream provides a single PCIe port that
can have PCI or PCIe devices hotplugged into it. Apparently an entire
set of x3130-upstream + several xio3130-downstreams can be hotplugged
as a unit, but it's not clear to me yet how that would be done, since
qemu only allows attaching a single device at a time.
This device will be used to implement the
"pcie-switch-downstream-port" model of pci controller.
This controller can be connected only to a pcie-root-port or a
pcie-switch-downstream-port (which will be added in a later patch),
which is the reason for the new connect type
VIR_PCI_CONNECT_TYPE_PCIE_PORT. A pcie-switch-upstream-port provides
32 ports (slot=0 to slot=31) on the downstream side, which can only
have pci controllers of model "pcie-switch-downstream-port" plugged
into them, which is the reason for the other new connect type
VIR_PCI_CONNECT_TYPE_PCIE_SWITCH.
This is the upstream part of a PCIe switch. It connects to a PCIe port
(but not PCI) on the upstream side, and can have up to 31
xio3130-downstream controllers (but no other types of devices)
connected to its downstream side.
This device will be used to implement the "pcie-switch-upstream-port"
model of pci controller.
This is backed by the qemu device ioh3420.
chassis and port from the <target> subelement are used to store/set the
respective qemu device options for the ioh3420. Currently, chassis is
set to be the index of the controller, and port is set to
"(slot << 3) + function" (per suggestion from Alex Williamson).
This controller can be connected (at domain startup time only - not
hotpluggable) only to a port on the pcie root complex ("pcie-root" in
libvirt config), hence the new connect type
VIR_PCI_CONNECT_TYPE_PCIE_ROOT. It provides a hotpluggable port that
will accept any PCI or PCIe device.
New attributes must be added to the controller <target> subelement for
this - chassis and port are guest-visible option values that will be
set by libvirt with values derived from the controller's index and pci
address information.
This is a PCIE "root port". It connects only to a port of the
integrated pcie.0 bus of a Q35 machine (can't be hotplugged), and
provides a single PCIe port that can have PCI or PCIe devices
hotplugged into it.
This device will be used to implement the "pcie-root-port" model of
pci controller.
This uses the new subelement/attribute in two ways:
1) If a "pci-bridge" pci controller has no chassisNr attribute, it
will automatically be set to the controller's index as soon as the
controller's PCI address is known (during
qemuDomainAssignPCIAddresses()).
2) when creating the commandline for a pci-bridge device, chassisNr
will be used to set qemu's chassis_nr option (rather than the previous
practice of hard-coding it to the controller's index).
There are some configuration options to some types of pci controllers
that are currently automatically derived from other parts of the
controller's configuration. For example, in qemu a pci-bridge
controller has an option that is called "chassis_nr"; up until now
libvirt has always set chassis_nr to the index of the pci-bridge. So
this:
<controller type='pci' model='pci-bridge' index='2'/>
will always result in:
-device pci-bridge,chassis_nr=2,...
on the qemu commandline. In the future we may decide there is a better
way to derive that option, but even in that case we will need for
existing domains to retain the same chassis_nr they were using in the
past - that is something that is visible to the guest so it is part of
the guest ABI and changing it would lead to problems for migrating
guests (or just guests with very picky OSes).
The <target> subelement has been added as a place to put the new
"chassisNr" attribute that will be filled in by libvirt when it
auto-generates the chassisNr; it will be saved in the config, then
reused any time the domain is started:
<controller type='pci' model='pci-bridge' index='2'>
<model type='pci-bridge'/>
<target chassisNr='2'/>
</controller>
The one oddity of all this is that if the controller configuration
is changed (for example to change the index or the pci address
where the controller is plugged in), the items in <target> will
*not* be re-generated, which might lead to conflict. I can't
really see any way around this, but fortunately if there is a
material conflict qemu will let us know and we will pass that on
to the user.
This patch provides qemu support for the contents of <model> in
<controller> for the two existing PCI controller types that need it
(i.e. the two controller types that are backed by a device that must
be specified on the qemu commandline):
1) pci-bridge - sets <model> name attribute default as "pci-bridge"
2) dmi-to-pci-bridge - sets <model> name attribute default as
"i82801b11-bridge".
These both match current hardcoded practice.
The defaults are set at the end of qemuDomainAssignPCIAddresses().
This can't be done earlier because some of the options that will be
autogenerated need full PCI address info for the controller, and
because qemuDomainAssignPCIAddresses() might create extra controllers
which would need default settings added, and that hasn't yet been done
at the time the PostParse callbacks are being run.
qemuDomainAssignPCIAddresses() is still called prior to the XML being
written to disk, though, so the autogenerated defaults are persistent.
qemu capabilities bits aren't checked when the domain is defined, but
rather when the commandline is actually created (so the domain can
possibly be defined on a host that doesn't yet have support for the
given device, or a host different from the one where it will
eventually be run). When the commandline is being generated we compare
the modelName to known qemu device names implementing the given type
of controller, and check the capabilities bit for that device.
This new subelement is used in PCI controllers: the toplevel
*attribute* "model" of a controller denotes what kind of PCI
controller is being described, e.g. a "dmi-to-pci-bridge",
"pci-bridge", or "pci-root". But in the future there will be different
implementations of some of those types of PCI controllers, which
behave similarly from libvirt's point of view (and so should have the
same model), but use a different device in qemu (and present
themselves as a different piece of hardware in the guest). In an ideal
world we (i.e. "I") would have thought of that back when the pci
controllers were added, and used some sort of type/class/model
notation (where class was used in the way we are now using model, and
model was used for the actual manufacturer's model number of a
particular family of PCI controller), but that opportunity is long
past, so as an alternative, this patch allows selecting a particular
implementation of a pci controller with the "name" attribute of the
<model> subelement, e.g.:
<controller type='pci' model='dmi-to-pci-bridge' index='1'>
<model name='i82801b11-bridge'/>
</controller>
In this case, "dmi-to-pci-bridge" is the kind of controller (one that
has a single PCIe port upstream, and 32 standard PCI ports downstream,
which are not hotpluggable), and the qemu device to be used to
implement this kind of controller is named "i82801b11-bridge".
Implementing the above now will allow us in the future to add a new
kind of dmi-to-pci-bridge that doesn't use qemu's i82801b11-bridge
device, but instead uses something else (which doesn't yet exist, but
qemu people have been discussing it), all without breaking existing
configs.
(note that for the existing "pci-bridge" type of PCI controller, both
the model attribute and <model> name are 'pci-bridge'. This is just a
coincidence, since it turns out that in this case the device name in
qemu really is a generic 'pci-bridge' rather than being the name of
some real-world chip)