The pxb device is a PCIe expander bus that can be added to any
Q35-based machinetype. A single PCIe port (*not* hotpluggable) is
provided; if more than one device is desired, or if hotplug
support is needed, either a pcie-root-port, or some combination of
pcie-switch-upstream-port and pcie-swith-downstream-ports must be
added to it. It can have a NUMA node number associated with it, as
well as a bus number.
This is backed by the qemu device "pxb".
The pxb device always includes a pci-bridge that is at the bus number
of the pxb + 1.
busNr and <node> from the <target> subelement are used to set the
bus_nr and numa_node options for pxb.
During post-parse we validate that the domain's machinetype is
440fx-based (since the pxb device only works on 440fx-based machines),
and <node> also gets a sanity check to assure that the NUMA node
specified for the pxb (if any - it's optional) actually exists on the
guest.
This is a standard PCI root bus (not a bridge) that can be added to a
440fx-based domain. Although it uses a PCI slot, this is *not* how it
is connected into the PCI bus hierarchy, but is only used for
control. Each pci-expander-bus provides 32 slots (0-31) that can
accept hotplug of standard PCI devices.
The usefulness of pci-expander-bus relative to a pci-bridge is that
the NUMA node of the bus can be specified with the <node> subelement
of <target>. This gives guest-side visibility to the NUMA node of
attached devices (presuming that management apps only assign a device
to a bus that has a NUMA node number matching the node number of the
device on the host).
Each pci-expander-bus also has a "busNr" attribute. The expander-bus
itself will take the busNr specified, and all buses that are connected
to this bus (including the pci-bridge that is automatically added to
any expander bus of model "pxb" (see the next commit)) will use
busNr+1, busNr+2, etc, and the pci-root (or the expander-bus with next
lower busNr) will use bus numbers lower than busNr.
The pxb device is a PCI expander bus that can be added to any
440fx-based machinetype. The PCI bus that is created has 32 standard
PCI slots (hotpluggable). It can have a NUMA node number associated
with it, as well as a bus number.
There are two places in qemu_domain_address.c where we have a switch
statement to convert PCI controller models
(VIR_DOMAIN_CONTROLLER_MODEL_PCI*) into the connection type flag that
is matched when looking for an upstream connection for that model of
controller (VIR_PCI_CONNECT_TYPE_*). This patch makes a utility
function in conf/domain_addr.c to do that, so that when a new PCI
controller is added, we only need to add the new model-->connect-type
in a single place.
The flags used to determine which devices could be plugged into which
controllers were quite confusing, as they tried to create classes of
connections, then put particular devices into possibly multiple
classes, while sometimes setting multiple flags for the controllers
themselves. The attempt to have a single flag indicate, e.g. that a
root-port or a switch-downstream-port could connect was not only
confusing, it was leading to a situation where it would be impossible
to specify exactly the right combinations for a new controller.
The solution is for the VIR_PCI_CONNECT_TYPE_* flags to have a 1:1
correspondence with each type of PCI controller, plus a flag for a PCI
endpoint device and another for a PCIe endpoint device (the only
exception to this is that pci-bridge and pcie-expander-bus controllers
have their upstream connection classified as
VIR_PCI_CONNECT_TYPE_PCI_DEVICE since they can be plugged into
*exactly* the same ports as any endpoint device). Each device then
has a single flag for connect type (plus the HOTPLUG flag if that
device can e hotplugged), and each controller sets the CONNECT bits
for all controllers that can be plugged into it, as well as for either
type of endpoint device that can be plugged in (and the HOTPLUG flag
if it can accept hotplugged devices).
With this change, it is *slightly* easier to understand the matching
of connections (as long as you remember that the flag for a
device/upstream-facing connection of a controller is the same as that
device's type, while the flags for a controller's downstream
connections is the OR of all device types that can be plugged into
that controller). More importantly, it will be possible to correctly
specify what can be plugged into a pcie-switch-expander-bus, when
support for it is added.
When support for dmi-to-pci-bridge was added, it was assumed that,
just as with the pci-root bus, slot 0 was reserved. This is not the
case - it can be used to connect a device just like any other slot, so
remove the restriction and update the test cases that auto-assign an
address on a dmi-to-pci-bridge.
Every other maxSlot was either set to 0 or to
VIR_PCI_ADDRESS_SLOT_LAST, but this one was for some reason set to the
literal value 31 (which is the same as VIR_PCI_ADDRESS_SLOT_LAST).
This makes them all consistent.
This is especially useful for "bus", since the bus of a device's pci
address is matched to the "index" of a controller to determine which
bus it will be connected to, and "index" is always specified in
decimal - being able to specify both in decimal at least makes it
easier to assure a device is being assigned to the correct bus when it
is added. For the other attributes, it is just a convenience.
(MB: the parser already allows for any of these attributes to be given
in decimal, and there are even examples floating around on the
internet that give them in decimal rather than hex (written in the
days before virsh did schema validation on all XML). This only updates
the schema to match the parser.)
nwfilter.rng defines uint16range and uint32range, but in a different
manner (it also allows a variable name as the value, rather than just
a decimal or hex number). I wanted to add uint16range to
basictypes.rng, but my desired definition was parallel to those for
uint8range and uint24range which are defined in basictypes.rng - they
*don't* allow a variable name for the value.
The simplest path to make everyone happy is to make the "plain"
versions in basictypes.rng have simpler names - "uint8", "uint16", and
"uint24". This patch renames uint8range and uint24range to uint8 and
uint24, while the next patch will add uint16.
The pcie-switch-downstream-port and pcie-root-port controllers have
only a single slot, numbered 0, and the greate majority of all guest
PCI devices are plugged into function 0 of whatever slot they're
using. The parser makes these optional, setting them to 0 when not
specified, and it's logical for the schema to also make them optional.
We use device-mapper to enumerate all dm devices, and filter out
the list of multipath devices by checking the target_type string
name. The code however cancels all scanning if we encounter
target_type=NULL
I don't know how to reproduce that situation, but a user was hitting
it in their setup, and inspecting the lvm2/device-mapper code shows
many places where !target_type is explicitly ignored and processing
continues on to the next device. So I think we should do the same
https://bugzilla.redhat.com/show_bug.cgi?id=1069317
The watchdog cli refactoring in 4666b762 dropped the temporary variable
we use to convert to action=dump to action=pause for the qemu cli, and
stored the converted value in the domain structure. Our other watchdog
handling code then treated it as though the user requested action=pause,
which broke action=dump handling.
Revive the temporary variable to fix things.
* Add a test for listen=XXX and <listen address=YYY/> collision error
* Add an explicit test for listen=XXX duplicated to <listen address=XXX/>
We implicitly test it elsewhere but I figure it's better to be explicit,
and this test case can be extended in the future for additional listen
back compat if/when we support <listen type='socket'/> syntax
In qemuHotplugCreateObjects, the ret variable was filled by
the value returned by qemuTestCapsCacheInsert.
If any of the functions after this assignment failed, we would still
return success.
Also adjust testCompareXMLToArgvHelper, where this change is just
cosmetic, because the value was overwritten right away.
Upcoming compression options for migration command patch
series hits current limit of 32 possible options for a command.
Lets take one step further and support 64 possible options.
And all it takes is moving from 32 bit integers to 64 bit ones.
The only less then trivial change i found is moving from
'ffs' to 'ffsl'.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
I tried compiling libvirt with older gcc and probably because I used
different configure options I got some shadowed declarations.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Add codes to support creating domain with network defition of assigning
SRIOV VF from a pool.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
commit 30c61901 added new functions to libvirt_private.syms
not alpabetically sorted and erroneously added vz sources to
STATEFUL_DRIVER_SOURCE_FILES, which triggered check-aclrules
running while vz driver isn't ready for it yet.
Pushing under build-breaker rule.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
SDK does not allocate memory when getting strings thus we
need to call every function that returns string twice.
First to obtain string length, second to obtain string
itself. It is tedious so let's create helper functions
for cases when we know length of the result beforehand
and we are not.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
remove unnecessary vzConnectClose prototype and make
local structure vzDomainDefParserConfig be static
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
We don't need them anymore as all pointers within vzDriver structure
are not changed during the time it exists.
Where we still need to synchronize we use virObjectLock/Unlock as far
as vzDriver is lockable object.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
Lock driver when a new domain is created in prlsdkNewDomainByHandle
and try to find it in the list under lock again because it can race
with vzDomainDefineXMLFlags when a domain with the same uuid is added
via vz dispatcher directly and libvirt define.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
This patch introduces a new 'vzDriver' lockable object and provides
helper functions to allocate/destroy it and we pass it to prlsdkXxx
functions instead of virConnectPtr.
Now we store domain related objects such as domain list, capabitilies
etc. within a single vz_driver vzDriver structure, which is shared by
all driver connections. It is allocated during daemon initialization or
in a lazy manner when a new connection to 'vz' driver is established.
When a connection to vz daemon drops, vzDestroyConnection is called,
which in turn relays disconnect event to all connection to 'vz' driver.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
Make it possible to build vz driver as a module and don't link it with
libvirt.so statically.
Remove registering it on client's side as far as we start relying on daemon
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
GCC in RHEL-6 complains about listen:
../../src/conf/domain_conf.c:23718: error: declaration of 'listen' shadows a global declaration [-Wshadow]
/usr/include/sys/socket.h:204: error: shadowed declaration is here [-Wshadow]
This renames all the listen to gListen.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Trying to reload/SIGUSR1 virtlogd or virtlockd fails with:
error : virNetDaemonRun:747 : internal error: Not all servers restored, cannot run server
Commit 252610f7 changed the daemon state json to allow tracking
multiple servers. However it missed clearing dmn->srvObject after
the json is empty, like the previous code paths handled. Later on in
virNewDaemonRun, dmn->srvObject is expected to be empty otherwise we
throw the above error.
https://bugzilla.redhat.com/show_bug.cgi?id=1311013
Since qemu is now able to notify us that the guest rejected the memory
unplug operation we can relay this to the user and make the API fail
right away.
Additionally document the possible values from the ACPI docs for future
reference.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1320447
The event is emitted on ACPI OSPM Status Indication events.
ACPI standard documentation describes the method as:
This object is an optional control method that is invoked by OSPM to
indicate processing status to the platform. During device ejection,
device hot add, or other event processing, OSPM may need to perform
specific handshaking with the platform. OSPM may also need to indicate
to the platform its inability to complete a requested operation; for
example, when a user presses an ejection button for a device that is
currently in use or is otherwise currently incapable of being ejected.
In this case, the processing of the ACPI Eject Request notification by
OSPM fails. OSPM may indicate this failure to the platform through the
invocation of the _OST control method. As a result of the status
notification indicating ejection failure, the platform may take certain
action including reissuing the notification or perhaps turning on an
appropriate indicator light to signal the failure to the user.
Since we didn't opt to use one single event for device lifecycle for a
VM we are missing one last event if the device removal failed. This
event will be emitted once we asked to eject the device but for some
reason it is not possible.
Similarly to the DEVICE_DELETED event we will be able to tell when
unplug of certain device types will be rejected by the guest OS. Wire up
the device deletion signalling code to allow handling this.
Neither of the callers cares whether the DEVICE_DELETED event isn't
supported or the event was received. Simplify the code and callers by
unifying the two values and changing the return value constants so that
a temporary variable can be omitted.
Callers ignore if this function returns -1 and continue as though the
DEVICE_DELETED event was not received. Since we can't be sure that the
event was not received we should behave as if the event was not
supported and remove the device definition right away. The error
fortunately won't really happen here.