To ensure the libvirt libxl driver will build with future versions
of Xen where the libxl API may change in incompatible ways,
explicitly use LIBXL_API_VERSION 0x040200. The libxl driver
does use new libxl APIs that have been added since Xen 4.2, but
currently it does not make use of any changes made to existing
APIs such as libxl_domain_create_restore or libxl_set_vcpuaffinity.
The version can be bumped if/when the libxl driver consumes the
changed APIs.
Further details can be found in the following discussion thread
https://www.redhat.com/archives/libvir-list/2016-April/msg00178.html
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
When creating the master key, we used mode 0600 (which we should) but
because we were creating it as root, the file is not readable by any
qemu running as non-root. Fortunately, it's just a matter of labelling
the file. We are generating the file path few times already, so let's
label it in the same function that has access to the path already.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In a few places in libvirt we busy-wait for events, for example qemu
creating a monitor socket. This is problematic because:
- We need to choose a sufficiently small polling period so that
libvirt doesn't add unnecessary delays.
- We need to choose a sufficiently large polling period so that
the effect of busy-waiting doesn't affect the system.
The solution to this conflict is to use an exponential backoff.
This patch adds two functions to hide the details, and modifies a few
places where we currently busy-wait.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
In case of ploop volume, target path of the volume is the path to the
directory that contains image file named root.hds and DiskDescriptor.xml.
While using uploadVol and downloadVol callbacks we need to open root.hds
itself.
Upload or download operations with ploop volume are only allowed when
images do not have snapshots. Otherwise operation fails.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Refreshes meta-information such as allocation, capacity, format, etc.
Ploop volumes differ from other volume types. Path to volume is the path
to directory with image file root.hds and DiskDescriptor.xml.
https://openvz.org/Ploop/format
Due to this fact, operations of opening the volume have to be done once
again. get the information.
To decide whether the given volume is ploops one, it is necessary to check
the presence of root.hds and DiskDescriptor.xml files in volumes' directory.
Only in this case the volume can be manipulated as the ploops one.
Such strategy helps us to resolve problems that might occure, when we
upload some other volume type from ploop source.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Recursively deletes whole directory of a ploop volume.
To delete ploop image it has to be unmounted.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
These callbacks let us to create ploop volumes in dir, fs and etc. pools.
If a ploop volume was created via buildVol callback, then this volume
is an empty ploop device with DiskDescriptor.xml.
If the volume was created via .buildFrom - then its content is similar to
input volume content.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Ploop image consists of directory with two files: ploop image itself,
called root.hds and DiskDescriptor.xml that contains information about
ploop device: https://openvz.org/Ploop/format.
Such volume are difficult to manipulate in terms of existing volume types
because they are neither a single files nor a directory.
This patch introduces new volume type - ploop. This volume type is used
by ploop volume's exclusively.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Rather than trying some magic calculations on our side query the monitor
for the current size of the memory balloon both on hotplug and
hotunplug.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1220702
Now that there is just one format of the memory balloon command line
used the code can be merged into a single function.
Additionally with some tweaks to the control flow the code is easier to
read.
The change that made qemu not add the memballoon by default happened
prior to 0.12.0. Additionaly the comment was misleading due to the code
that was added below. Since we always need to add a balloon on the
commandline drop the comment.
The only caller of this code is:
for (i = 0; i < dom->def->ngraphics; i++) {
if (dom->def->graphics[i]->type == VIR_DOMAIN_GRAPHICS_TYPE_SPICE) {
if (!(mig->graphics =
qemuMigrationCookieGraphicsAlloc(driver, dom->def->graphics[i])))
return -1;
mig->flags |= QEMU_MIGRATION_COOKIE_GRAPHICS;
break;
}
}
So this is never triggered for VNC, and in fact VNC has no support for
seamless migration anyways so that seems correct. Drop the dead VNC
handling.
Since vz driver is now lives as a part of daemon we can benefit from
this fact and allow vz clients to use shared drivers API like storage,
network, nwfilter etc.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
This is backed by the qemu device pxb-pcie, which will be available in
qemu 2.6.0.
As with pci-expander-bus (which uses qemu's pxb device), the busNr
attribute and <node> subelement of <target> are used to set the bus_nr
and numa_node options.
During post-parse we validate that the domain's machinetype is
q35-based (since the device shows up for 440fx-based machinetypes, but
is unusable), as well as checking that <node> specifies a node that is
actually configured on the guest.
This controller provides a single PCIe port on a new root. It is
similar to pci-expander-bus, intended to provide a bus that can be
associated with a guest-identifiable NUMA node, but is for
machinetypes with PCIe rather than PCI (e.g. q35-based machinetypes).
Aside from PCIe vs. PCI, the other main difference is that a
pci-expander-bus has a companion pci-bridge that is automatically
attached along with it, but pcie-expander-bus has only a single port,
and that port will only connect to a pcie-root-port, or to a
pcie-switch-upstream-port. In order for the bus to be of any use in
the guest, it must have either a pcie-root-port or a
pcie-switch-upstream-port attached (and one or more
pcie-switch-downstream-ports attached to the
pcie-switch-upstream-port).
The pxb device is a PCIe expander bus that can be added to any
Q35-based machinetype. A single PCIe port (*not* hotpluggable) is
provided; if more than one device is desired, or if hotplug
support is needed, either a pcie-root-port, or some combination of
pcie-switch-upstream-port and pcie-swith-downstream-ports must be
added to it. It can have a NUMA node number associated with it, as
well as a bus number.
This is backed by the qemu device "pxb".
The pxb device always includes a pci-bridge that is at the bus number
of the pxb + 1.
busNr and <node> from the <target> subelement are used to set the
bus_nr and numa_node options for pxb.
During post-parse we validate that the domain's machinetype is
440fx-based (since the pxb device only works on 440fx-based machines),
and <node> also gets a sanity check to assure that the NUMA node
specified for the pxb (if any - it's optional) actually exists on the
guest.
This is a standard PCI root bus (not a bridge) that can be added to a
440fx-based domain. Although it uses a PCI slot, this is *not* how it
is connected into the PCI bus hierarchy, but is only used for
control. Each pci-expander-bus provides 32 slots (0-31) that can
accept hotplug of standard PCI devices.
The usefulness of pci-expander-bus relative to a pci-bridge is that
the NUMA node of the bus can be specified with the <node> subelement
of <target>. This gives guest-side visibility to the NUMA node of
attached devices (presuming that management apps only assign a device
to a bus that has a NUMA node number matching the node number of the
device on the host).
Each pci-expander-bus also has a "busNr" attribute. The expander-bus
itself will take the busNr specified, and all buses that are connected
to this bus (including the pci-bridge that is automatically added to
any expander bus of model "pxb" (see the next commit)) will use
busNr+1, busNr+2, etc, and the pci-root (or the expander-bus with next
lower busNr) will use bus numbers lower than busNr.
The pxb device is a PCI expander bus that can be added to any
440fx-based machinetype. The PCI bus that is created has 32 standard
PCI slots (hotpluggable). It can have a NUMA node number associated
with it, as well as a bus number.
There are two places in qemu_domain_address.c where we have a switch
statement to convert PCI controller models
(VIR_DOMAIN_CONTROLLER_MODEL_PCI*) into the connection type flag that
is matched when looking for an upstream connection for that model of
controller (VIR_PCI_CONNECT_TYPE_*). This patch makes a utility
function in conf/domain_addr.c to do that, so that when a new PCI
controller is added, we only need to add the new model-->connect-type
in a single place.
The flags used to determine which devices could be plugged into which
controllers were quite confusing, as they tried to create classes of
connections, then put particular devices into possibly multiple
classes, while sometimes setting multiple flags for the controllers
themselves. The attempt to have a single flag indicate, e.g. that a
root-port or a switch-downstream-port could connect was not only
confusing, it was leading to a situation where it would be impossible
to specify exactly the right combinations for a new controller.
The solution is for the VIR_PCI_CONNECT_TYPE_* flags to have a 1:1
correspondence with each type of PCI controller, plus a flag for a PCI
endpoint device and another for a PCIe endpoint device (the only
exception to this is that pci-bridge and pcie-expander-bus controllers
have their upstream connection classified as
VIR_PCI_CONNECT_TYPE_PCI_DEVICE since they can be plugged into
*exactly* the same ports as any endpoint device). Each device then
has a single flag for connect type (plus the HOTPLUG flag if that
device can e hotplugged), and each controller sets the CONNECT bits
for all controllers that can be plugged into it, as well as for either
type of endpoint device that can be plugged in (and the HOTPLUG flag
if it can accept hotplugged devices).
With this change, it is *slightly* easier to understand the matching
of connections (as long as you remember that the flag for a
device/upstream-facing connection of a controller is the same as that
device's type, while the flags for a controller's downstream
connections is the OR of all device types that can be plugged into
that controller). More importantly, it will be possible to correctly
specify what can be plugged into a pcie-switch-expander-bus, when
support for it is added.
When support for dmi-to-pci-bridge was added, it was assumed that,
just as with the pci-root bus, slot 0 was reserved. This is not the
case - it can be used to connect a device just like any other slot, so
remove the restriction and update the test cases that auto-assign an
address on a dmi-to-pci-bridge.
Every other maxSlot was either set to 0 or to
VIR_PCI_ADDRESS_SLOT_LAST, but this one was for some reason set to the
literal value 31 (which is the same as VIR_PCI_ADDRESS_SLOT_LAST).
This makes them all consistent.
We use device-mapper to enumerate all dm devices, and filter out
the list of multipath devices by checking the target_type string
name. The code however cancels all scanning if we encounter
target_type=NULL
I don't know how to reproduce that situation, but a user was hitting
it in their setup, and inspecting the lvm2/device-mapper code shows
many places where !target_type is explicitly ignored and processing
continues on to the next device. So I think we should do the same
https://bugzilla.redhat.com/show_bug.cgi?id=1069317
The watchdog cli refactoring in 4666b762 dropped the temporary variable
we use to convert to action=dump to action=pause for the qemu cli, and
stored the converted value in the domain structure. Our other watchdog
handling code then treated it as though the user requested action=pause,
which broke action=dump handling.
Revive the temporary variable to fix things.
I tried compiling libvirt with older gcc and probably because I used
different configure options I got some shadowed declarations.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Add codes to support creating domain with network defition of assigning
SRIOV VF from a pool.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
commit 30c61901 added new functions to libvirt_private.syms
not alpabetically sorted and erroneously added vz sources to
STATEFUL_DRIVER_SOURCE_FILES, which triggered check-aclrules
running while vz driver isn't ready for it yet.
Pushing under build-breaker rule.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
SDK does not allocate memory when getting strings thus we
need to call every function that returns string twice.
First to obtain string length, second to obtain string
itself. It is tedious so let's create helper functions
for cases when we know length of the result beforehand
and we are not.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
remove unnecessary vzConnectClose prototype and make
local structure vzDomainDefParserConfig be static
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
We don't need them anymore as all pointers within vzDriver structure
are not changed during the time it exists.
Where we still need to synchronize we use virObjectLock/Unlock as far
as vzDriver is lockable object.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
Lock driver when a new domain is created in prlsdkNewDomainByHandle
and try to find it in the list under lock again because it can race
with vzDomainDefineXMLFlags when a domain with the same uuid is added
via vz dispatcher directly and libvirt define.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
This patch introduces a new 'vzDriver' lockable object and provides
helper functions to allocate/destroy it and we pass it to prlsdkXxx
functions instead of virConnectPtr.
Now we store domain related objects such as domain list, capabitilies
etc. within a single vz_driver vzDriver structure, which is shared by
all driver connections. It is allocated during daemon initialization or
in a lazy manner when a new connection to 'vz' driver is established.
When a connection to vz daemon drops, vzDestroyConnection is called,
which in turn relays disconnect event to all connection to 'vz' driver.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
Make it possible to build vz driver as a module and don't link it with
libvirt.so statically.
Remove registering it on client's side as far as we start relying on daemon
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
GCC in RHEL-6 complains about listen:
../../src/conf/domain_conf.c:23718: error: declaration of 'listen' shadows a global declaration [-Wshadow]
/usr/include/sys/socket.h:204: error: shadowed declaration is here [-Wshadow]
This renames all the listen to gListen.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Trying to reload/SIGUSR1 virtlogd or virtlockd fails with:
error : virNetDaemonRun:747 : internal error: Not all servers restored, cannot run server
Commit 252610f7 changed the daemon state json to allow tracking
multiple servers. However it missed clearing dmn->srvObject after
the json is empty, like the previous code paths handled. Later on in
virNewDaemonRun, dmn->srvObject is expected to be empty otherwise we
throw the above error.
https://bugzilla.redhat.com/show_bug.cgi?id=1311013
Since qemu is now able to notify us that the guest rejected the memory
unplug operation we can relay this to the user and make the API fail
right away.
Additionally document the possible values from the ACPI docs for future
reference.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1320447
The event is emitted on ACPI OSPM Status Indication events.
ACPI standard documentation describes the method as:
This object is an optional control method that is invoked by OSPM to
indicate processing status to the platform. During device ejection,
device hot add, or other event processing, OSPM may need to perform
specific handshaking with the platform. OSPM may also need to indicate
to the platform its inability to complete a requested operation; for
example, when a user presses an ejection button for a device that is
currently in use or is otherwise currently incapable of being ejected.
In this case, the processing of the ACPI Eject Request notification by
OSPM fails. OSPM may indicate this failure to the platform through the
invocation of the _OST control method. As a result of the status
notification indicating ejection failure, the platform may take certain
action including reissuing the notification or perhaps turning on an
appropriate indicator light to signal the failure to the user.
Since we didn't opt to use one single event for device lifecycle for a
VM we are missing one last event if the device removal failed. This
event will be emitted once we asked to eject the device but for some
reason it is not possible.
Similarly to the DEVICE_DELETED event we will be able to tell when
unplug of certain device types will be rejected by the guest OS. Wire up
the device deletion signalling code to allow handling this.
Neither of the callers cares whether the DEVICE_DELETED event isn't
supported or the event was received. Simplify the code and callers by
unifying the two values and changing the return value constants so that
a temporary variable can be omitted.
Callers ignore if this function returns -1 and continue as though the
DEVICE_DELETED event was not received. Since we can't be sure that the
event was not received we should behave as if the event was not
supported and remove the device definition right away. The error
fortunately won't really happen here.
The address assigning code might add new pci bridges.
We need them to have an alias when building the command line.
In real word usage, this is not a problem because all the code
paths already call qemuDomainAssignAddresses. However moving
this call lets us remove one extra call from qemuxml2argvtest.
Essentially revert commit 3a6204c which added these to allow the test
suite to pass without depending on the host system state.
Since commit 4b527c1 we already mock virSCSIDeviceGetSgName, so these
callbacks are useless.
Instead of calling the virDomainGraphicsListensParseXML function for all
graphics types and ignore the wrong ones move the call only to graphics
types where we supports listen elements.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Those are the last two places that uses the getter functions. Use a
direct access instead and remove those getters.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Removes the check for graphics type, it's not a public API and developer
know what he's doing and this check makes no sense. It also removes
the ability to allocate a new array if there is none. This was used by
the virDomainGraphicsListenAdd* functions and isn't used anymore.
This is now a simple getter with simple check for listens array presence
and whether the index in out of bounds.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This effectively removes virDomainGraphicsListenSetAddress which was
used only to change the address of listen structure and possible change
the listen type. The new function will auto-expand the listens array
and append a new listen.
The old function was used on pre-allocated array of listens and in most
cases it only "add" a new listen. The two remaining uses can access the
listen structure directly.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Rest of the fields of the iotune data structure did not check for
malformed integers. Use the previously defined macro to extract them
which will simplify the code and add error reporting.
Since the structure was pre-initialized to 0 we don't need to set every
single member to 0 if it's not present in the XML. Additionally if we
put the name of the field into the error message the code can be
simplified using a macro to parse the members.
No need to remember connection name and have corresponding
domain type to keep backward compatibility with former
'parallels' driver. It is enough to be able to accept 'parallels'
uri and domain types.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
we don't need to allocate macstr at all as it is an array
and already has the the space it needs.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
If we encounter a video device with primary=yes, we insert it
at def->videos[0].
There is no need to record this in a separate variable,
just check if there already is a primary video at def->videos[0].
We call VIR_INSERT_ELEMENT_INPLACE either with 0 (for primary video)
or def->nvideos (for the rest).
Use a variable with more semantic name, since j is usually used
for iterating.
We start with both i and def->nvideos at 0 and increment both
after every successful iteration.
Use i directly, instead of passing the def->nvideos value through j.
Commit 119cd06 started setting the primary bool for the first
user-specified video even if user omitted the 'primary' attribute.
However this was done before the addition of the implicit device.
This broke startup of transient qemu domains with no <video>:
https://bugzilla.redhat.com/show_bug.cgi?id=1325757
Move this default to virDomainDefPostParseInternal,
after the addition of the implicit video device, to catch the implicit
video as well.
Quite straigthforward as vz sdk memory setting function makes
just what we want to that is set "amount of physical memory
allocated to a domain".
'useflags' is introduced for non flag function implementation.
We can't just use combination of flags like "live | config" or
we fail for inactive domains. Other combinations have drawbacks
too.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Actually this is not pure refactoring. Part of common code is
replaced with virDomainObjUpdateModificationImpact and this
a good replacement. It includes removed check of inactive
domain and active flags set. Additionally we resolve
current flag in accordance with current state of domain.
Thus it becames possible to attach/detach devices for
inactive domains if this flag is set.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Commit dc98a5bc refactored the code a lot and forget about checking if
listen attribute is specified. This ensures that listen attribute and
first listen element are compared only if both exist.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
fdstream.c: In function 'virFDStreamWrite':
fdstream.c:390:29: error: logical 'or' of equal expressions [-Werror=logical-op]
if (errno == EAGAIN || errno == EWOULDBLOCK) {
^~
Fedora rawhide now uses gcc 6.0 and there is a bug with -Wlogical-op
producing false warnings.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69602
Use GCC pragma push/pop and ignore -Wlogical-op for GCC that supports
push/pop pragma and also has this bug.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Replace the nonsensical debug statement by adding the expected event
code into the existing debug statement.
Since the monitor code always notifies the agent on guest
reboot/shutdown even if that was not initiated by the agent the warning
emitted later is bogus and pollutes the logs in such cases. Delete it
and keep just the original debug message where this info can be
inferred.
Move including of gnutls/gnutls.h in qemu/qemu_domain.c under the
"ifdef WITH_GNUTLS" check because otherwise it fails like this:
CC qemu/libvirt_driver_qemu_impl_la-qemu_domain.lo
qemu/qemu_domain.c:50:10: fatal error: 'gnutls/gnutls.h' file not found
in case if gnutls is not installed on the system.
Some places already check for "virt-" prefix as well as plain "virt".
virQEMUCapsHasPCIMultiBus did not, resulting in multiple PCI devices
having assigned the same unnumbered "pci" alias.
Add a test for the "virt-2.6" machine type which also omits the
<model type='virtio'/> in <interface>, to check if
qemuDomainDefaultNetModel works too.
https://bugzilla.redhat.com/show_bug.cgi?id=1325085
virSocketAddrFormat() wants a single pointer, not a double pointer.
Fixes the following compilation error on FreeBSD:
util/virnetdev.c:1448:72: error: incompatible pointer types passing
'virSocketAddr **' to parameter of type 'const virSocketAddr *';
remove & [-Werror,-Wincompatible-pointer-types]
if (VIR_SOCKET_ADDR_VALID(peer) && !(peerstr = virSocketAddrFormat(&peer)))
^~~~~
./util/virsocketaddr.h:92:48: note: passing argument to parameter 'addr' here
char *virSocketAddrFormat(const virSocketAddr *addr);
^
FreeBSD lacks ENODATA, and viruuid.c redefines it to EIO, but it's not
actually using it. On the other hand, we have virrandom.c that's using
ENODATA. So make this re-definition common by moving it to internal.h,
so all the current and possible future users don't need to care about
that.
In the latest libxenlight code, libxl_domain_create_restore accepts a
new argument. Update libvirt's libxl driver for that. Use the macro
provided by libxenlight to detect which version should be used.
The new parameter (send_back_fd) is set to -1 because libvirt provides
no such fd.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Message-id: 1459866012-27081-1-git-send-email-wei.liu2@citrix.com
Our use of gnutls_rnd(), introduced with commit ad7520e8, is
conditional to the availability of the <gnutls/crypto.h> header
file.
Such check, however, turns out not to be strict enough, as there
are some versions of GnuTLS (eg. 2.8.5 from CentOS 6) that provide
the header file, but not the function itself, which was introduced
only in GnuTLS 2.12.0.
Introduce an explicit check for the function.
As usual we try to deal correctly with vz domains that were
created by other means and thus can have all range of SDK domain
parameters. If vz domain boot order can't be represented
in libvirt os boot section let's give warning and make os boot section
represent SDK to some extent.
1. Os boot section supports up to 4 boot devices. Here we just
cut SDK boot order up to this limit. Not too bad.
2. If there is a floppy in boot order let's just skip it.
Anyway we don't show it in the xml. Not too bad too.
3. SDK boot order with unsupported disks order. Say we have "hdb, hda" in
SDK. We can not present this thru os boot order. Well let's just
give warning but leave double <boot dev='hd'/> in xml. It's
kind of misleading but we warn you!
SDK boot order have an extra parameters 'inUse' and 'sequenceIndex'
which makes our task more complicated. In realitly however 'inUse'
is always on and 'sequenceIndex' is not less than 'boot position index'
which simplifies out task back again! To be on a safe side let's explicitly
check for this conditions!
We have another exercise here. We want to check for unrepresentable
condition 3 (see above). The tricky part is that in contrast to
domains defined thru this driver 3-rd party defined domains can
have device ordering different from default. Thus we need
some id to check that N-th boot disk of os boot section is same as
N-th boot disk of SDK boot. This is what prlsdkBootOrderCheck
for. It uses disks sources paths as id for disks and iface names
for network devices.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
We want to report boot order in dumpxml for vz domains.
Thus we want disks devices to be sorted in output compatible with boot
ordering specification. So let's just use virDomainDiskInsert
which makes appropriate sorting.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
The patch makes some refactoring of the existing code. Current boot order spec code
makes very simple thing in somewhat obscure way. In case of VMs
it sets the first hdd as the only bootable device. In case of CTs it
doesn't touch the boot order at all if one of the filesystems is mounted to root.
Otherwise like in case of VMs it sets the first hdd as the only bootable
device and additionally sets this device mount point to root. Refactored
code makes all this explicit.
The actual boot order support is simple. Common libvirt domain xml parsing
code makes the exact ordering of disks devices as described in docs
for boot ordering (disks are sorted by bus order first, device target
second. Bus order is the order of disk buses appearence in original
xml. Device targets order is alphabetical). We add devices in the
same order and SDK designates device indexes sequentially for each
device type. Thus device index is equal to its boot index. For
example N-th cdrom in boot specification refers to sdk cdrom with
it's device index N.
If there is no boot spec in xml the parsing code will add <boot dev='hdd'>
for HVMs automatically and we backward compatibly set fist hdd as
bootable.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
The new perf code didn't bother to clear a pointer in 'priv' causing a
double free or other memory corruption goodness if a VM failed to start.
Clear the pointer after freeing the memory.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1324757
For device hotplug, the new alias ID needs to be checked in the list
rather than using the count of devices. Unplugging a device that is not
last in the array will make further hotplug impossible due to alias
collision.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1324551
For device hotplug, the new alias ID needs to be checked in the list
rather than using the count of devices. Unplugging a device that is not
last in the array will make further hotplug impossible due to alias
collision.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1324551
Commit id 'fb2bd208' essentially copied the qemuGetSecretString
creating an libxlGetSecretString. Rather than have multiple copies
of the same code, create src/secret/secret_util.{c,h} files and
place the common function in there.
Modify the the build in order to build the module as a library
which is then pulled in by both the qemu and libxl drivers for
usage from both qemu_command.c and libxl_conf.c
If the -object secret capability exists, then get the path to the
masterKey file and provide that to qemu. Checking for the existence
of the file before passing to qemu could be done, but causes issues
in mock test environment.
Since the qemuDomainObjPrivate is not available when building the
command line, the qemuBuildHasMasterKey API will have to suffice
as the primary arbiter for whether the capability exists in order
to find/return the path to the master key for usage.
Created the qemuDomainGetMasterKeyAlias API which will be used by
later patches to define the 'keyid' (eg, masterKey) to be used by
other secrets to provide the id to qemu for the master key.
Add a masterKey and masterKeyLen to _qemuDomainObjPrivate to store a
random domain master key and its length in order to support the ability
to encrypt/decrypt sensitive data shared between libvirt and qemu. The
key will be base64 encoded and written to a file to be used by the
command line building code to share with qemu.
New API's from this patch:
qemuDomainGetMasterKeyFilePath:
Return a path to where the key is located
qemuDomainWriteMasterKeyFile: (private)
Open (create/trunc) the masterKey path and write the masterKey
qemuDomainMasterKeyReadFile:
Using the master key path, open/read the file, and store the
masterKey and masterKeyLen. Expected use only from qemuProcessReconnect
qemuDomainGenerateRandomKey: (private)
Generate a random key using available algorithms
The key is generated either from the gnutls_rnd function if it
exists or a less cryptographically strong mechanism using
virGenerateRandomBytes
qemuDomainMasterKeyRemove:
Remove traces of the master key, remove the *KeyFilePath
qemuDomainMasterKeyCreate:
Generate the domain master key and save the key in the location
returned by qemuDomainGetMasterKeyFilePath.
This API will first ensure the QEMU_CAPS_OBJECT_SECRET is set
in the capabilities. If not, then there's no need to generate
the secret or file.
The creation of the key will be attempted from qemuProcessPrepareHost
once the libDir directory structure exists.
The removal of the key will handled from qemuProcessStop just prior
to deleting the libDir tree.
Since the key will not be written out to the domain object XML file,
the qemuProcessReconnect will read the saved file and restore the
masterKey and masterKeyLen.
Using the existing virUUIDGenerateRandomBytes, move API to virrandom.c
rename it to virRandomBytes and add it to libvirt_private.syms.
This will be used as a fallback for generating a domain master key.
Add a capability bit for the qemu secret object.
Adjust the 2.6.0-1 caps/replies to add the secret object. For the
.replies it's take from the '{"execute":"qom-list-types"}' output.
When a hostdev is attached to the guest (and removed from the host),
the order of operations is call qemuHostdevPreparePCIDevices to remove
the device from the host, call qemuSetupHostdevCgroup to setup the cgroups,
and virSecurityManagerSetHostdevLabel to set the labels.
When the device is removed from the guest, the code didn't use the
reverse order leading to possible issues (especially if the path to
the device no longer exists). This patch will move the call to
qemuTeardownHostdevCgroup to prior to reattaching the device to
the host.
When a hostdev is attached to the guest (and removed from the host),
the order of operations is call qemuHostdevPreparePCIDevices to remove
the device from the host, call qemuSetupHostdevCgroup to setup the cgroups,
and virSecurityManagerSetHostdevLabel to set the labels.
When the device is removed from the guest, the code didn't use the
reverse order leading to possible issues (especially if the path to
the device no longer exists). This patch will move the call to
virSecurityManagerRestoreHostdevLabel to prior to reattaching the
device to the host.
Commit id '3992ff14' added the prototype for networkGetActualType
with 1 parameter, but added 2 ATTRIBUTE_NONNULL's (assume from a
cut-n-paste), just remove (2).
Commit d77ffb6876 added not only reporting of the PCI header type, but
also parsing of that information. However, because there was no parsing
done for the other sub-PCI capabilities, if there was any other
capability then a valid header type name (like phys_function or
virt_functions) the parsing would fail. This prevented passing node
device XMLs that we generated into our own functions when dealing with,
e.g. with SRIOV cards.
Instead of reworking the whole parsing, just fix this one occurence and
remove a test for it for the time being. Future patches will deal with
the rest.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Starting with commit f8e712fe, if you start a domain that has an
<interface type='hostdev' (or that has <interface type='network'>
where the network is a pool of devices for hostdev assignment), when
you later try to add *another* interface (of any kind) with hotplug,
the function qemuAssignDeviceNetAlias() fails as soon as it sees a
"hostdevN" alias in the list of interfaces), causing the attach to
fail.
This is because (starting with f8e712fe) the device alias names are
assigned during the new function qemuProcessPrepareDomain(), which is
called *before* networkAllocateActualDevice() (which is called from
qemuProcessPrepareHost(), which is called from
qemuProcessLaunch()). Prior to that commit,
networkAllocateActualDevice() was called first.
The problem with this is that the alias for interfaces that are really
a hostdev (<interface type='hostdev'>) is of the form "hostdevN" (just
like other hostdevs), while other interfaces are "netN". But if you
don't know that the interface is going to be a hostdev at the time you
assign the alias name, you can't name it differently. (As far as I've
seen so far, the change in name by itself wouldn't have been a problem
(other than just an outwardly noticeable change in behavior) except
for the abovementioned failure to attach/detach new interfaces.
Rather than take the chance that there may be other not-yet-revealed
problems associated with changing the alias name, this patch changes
the way that aliases are assigned to restore the old behavior.
Old: In the past, assigning an alias to an interface was skipped if it
was seen that the interface was type='hostdev' - we knew that the
hostdev part of the interface was also in the list of hostdevs (that's
part of what happens in networkAllocateActualDevice()) and it would be
assigned when all the other hostdev aliases were assigned.
New: When assigning an alias to an interface, we haven't yet called
networkAllocateActualDevice() to construct the hostdev part of the
interface, so we can't just wait for the loop that creates aliases for
all the hostdevs (there's nothing on that list for this device
yet!). Instead we handle it immediately in the loop creating interface
aliases, by calling the new function networkGetActualType() to
determine if it is going to be hostdev, and if so calling
qemuAssignDeviceHostdevAlias() instead.
Some adjustments have to be made to both
qemuAssignDeviceHostdevAlias() and to qemuAssignDeviceNetAlias() to
accommodate this. In both of them, an error return from
qemuDomainDeviceAliasIndex() is no longer considered an error; instead
it's just ignored (because it almost certainly means that the alias
string for the device was "net" when we expected "hostdev" or vice
versa). in qemuAssignDeviceHostdevAlias() we have to look at all
interface aliases for hostdevN in addition to looking at all hostdev
aliases (this wasn't necessary in the past, because both the interface
entry and the hostdev entry for the device already pointed at the
device info; no longer the case since the hostdev entry hasn't yet
been setup).
Fortunately the buggy behavior hasn't yet been in any official release
of libvirt.
In certain cases, we need to assign a hostdevN-style alias in a case
when we don't have a virDomainHostdevDefPtr (instead we have a
virDomainNetDefPtr). Since qemuAssignDeviceHostdevAlias() doesn't use
anything in the virDomainHostdevDef except the alias string itself
anyway, this patch just changes the arguments to pass a pointer to the
alias pointer instead.
There are times when it's necessary to learn the actual type of a
network connection before any resources have been allocated
(e.g. during qemuProcessPrepareDomain()), but in the past it was
necessary to call networkAllocateActualDevice() in order to have the
actual type filled in.
This new function returns the type of network that *will be* setup
once it actually happens, but without making any changes on the host.
The paths have the domain ID in them. Without cleaning them, they would
contain the same ID even after multiple restarts. That could cause
various problems, e.g. with access.
Add function qemuDomainClearPrivatePaths() for this as a counterpart of
qemuDomainSetPrivatePaths().
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The directory name changed in a89f05ba8d.
This unbreaks launching QEMU/KVM VMs with apparmor enabled. It also adds
the directory for the qemu guest-agent socket which is not known when
parsing the domain XML.
This reverts commit ee4cfb5643.
Since we're still not persisting our bookkeeping lists across
daemon restarts, we might have lost some information
virPCIDeviceReattach() relies on, for example whether the
device needs to be unbound from the stub driver.
As a result, if the daemon has been restarted in the meantime,
the device might end up remaining bound to the stub driver even
after 'virsh nodedev-reattach' or similar has been called, with
no way of giving it back to the host short of messing with
sysfs behind libvirt's back.
Revert back to the previous behavior of always trying to bind
the device to the host driver, regardless of its status when it
was detached, until persistent bookkeeping lists have been
implemented.
Commit 08cc14f moved the conversion of MiB/s to B/s out of the
qemuMonitor APIs, but forgot to adjust the qemuMigrationDriveMirror
caller.
This patch will convert the migrate_speed value from MiB/s to its
mirror_speed equivalent in bytes/s.
Signed-off-by: Rudy Zhang <rudyflyzhang@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
@flags have a valid modification impact only after calling
virDomainObjUpdateModificationImpact. virDomainObjGetOneDef calls it but
doesn't update them in the caller.
Chunyan sent a nice cleanup patch for libxlDomainDetachNetDevice
https://www.redhat.com/archives/libvir-list/2016-March/msg00926.html
which I incorrectly modified before pushing as commit b5534e53. My
modification caused network devices of type hostdev to no longer
be removed. This patch changes b5534e53 to resemble Chunyan's
original, correct patch.
Chunyan sent a correct patch to fix a resource leak on error in
libxlDomainAttachNetDevice
https://www.redhat.com/archives/libvir-list/2016-March/msg00924.html
I made what was thought to be an improvement and pushed the patch as
commit e6336442. As it turns out, my change broke adding net devices
that are actually hostdevs to the list of nets in virDomainDef. This
patch changes e6336442 to resemble Chunyan's original, correct
patch.
Compilation for xdg-app failed due to a buggy SASL headers present on
the used runtime (org.gnome.Sdk 3.18).
In file included from rpc/virnetsaslcontext.h:24:0,
from rpc/virnetsaslcontext.c:25:
/usr/include/sasl/sasl.h:230:38: error: unknown type name 'size_t'
typedef void *sasl_realloc_t(void *, size_t);
^
/usr/include/sasl/sasl.h:235:5: error: unknown type name 'sasl_realloc_t'
sasl_realloc_t *,
Use the same workaround as commit 1be3dfd did.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
We use _LAST items in enums to mark the last position in given
enum. Now, if and enum is passed to switch(), compiler checks
that all the values from enum occur in 'case' enumeration.
Including _LAST. But coverity spots it's a dead code. And it
really is. So to resolve this, we tend to put a comment just
above 'case ..._LAST' notifying coverity that we know this is a
dead code but we want to have it that way.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Do I really need to explain why?
Well, if read() is interrupted int the middle of reading, we will
never read the rest (even though it's highly unlikely as we are
reading just 8 bytes).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Even though we have the machine locked throughout whole APIs we
are querying/modifying domain internal state. We should grab a
job whilst doing that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have @flags we can support changing perf events just
in active or inactive configuration regardless of the other.
Previously, calling virDomainSetPerfEvents set events in both
active and inactive configuration at once. Even though we allow
users to set perf events that are to be enabled once domain is
started up. The virDomainGetPerfEvents API was flawed too. It
returned just runtime info.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I've noticed that these APIs are missing @flags argument. Even
though we don't have a use for them, it's our policy that every
new API must have @flags.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
They recently were extracted to a separate function. They don't belong
together though. Since -numa formatting is pretty compact, move it to
the main function and rename qemuBuildNumaCommandLine to
qemuBuildMemoryDeviceCommandLine.
When starting up a VM libvirtd asks numad to place the VM in case of
automatic nodeset. The nodeset would not be passed to the memory device
formatter and the user would get an error.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1269715
Commit 7068b56c introduced several hyperv features. Not all hyperv
features are supported by old enough kernels and we shouldn't allow to
start a guest if kernel doesn't support any of the hyperv feature.
There is one exception, for backward compatibility we cannot error out
if one of the RELAXED, VAPIC or SPINLOCKS isn't supported, for the same
reason we ignore invtsc, to not break restoring saved domains with older
libvirt.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This check is there to allow restore saved domain with older libvirt
where we included invtsc by default for host-passthrough model. Don't
skip the whole function, but only the part that checks for invtsc.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Remove disabling domain death events from libxlDomainStart error
path. The domain death event is already disabled in libxlDomainCleanup.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
libxlDomainStart allocates and reserves resources that were not
being released in error paths. libxlDomainCleanup already handles
the job of releasing resources, and libxlDomainStart should call
it when encountering a failure.
Change the error handling logic to call libxlDomainCleanup on
failure. This includes acquiring the lease sooner and allowing
it to be released in libxlDomainCleanup on failure, similar to
the way other resources are reclaimed. With the lease now
released in libxlDomainCleanup, the release_dom label can be
renamed to cleanup_dom to better reflect its changed semantics.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Create a bitmap of iothreads that have scheduler info set so that the
transformation algorithm does not have to iterate the empty bitmap many
times. By reusing self-expanding bitmaps the bitmap size does not need
to be pre-calculated.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1264008
In some cases it's impractical to use the regular APIs as the bitmap
size needs to be pre-declared. These new APIs allow to use bitmaps that
self expand.
The new code adds a property to the bitmap to track the allocation of
memory so that VIR_RESIZE_N can be used.
Since commit v1.3.2-119-g1e34a8f which enabled debug-threads in QEMU
qemuGetProcessInfo would fail to parse stats for any thread with a space
in its name.
https://bugzilla.redhat.com/show_bug.cgi?id=1316803
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemu won't ever add those functions directly to QMP. They will be
replaced with 'blockdev-add' and 'blockdev-del' eventually. At this time
there's no need to keep the stubs around.
Additionally the drive_del stub in JSON contained dead code in the
attempt to report errors. (VIR_ERR_OPERATION_UNSUPPORTED was never
reported). Since the text impl does have the same message it is reported
anyways.
This patch adds new xml element, and so we can have the option of
also having perf events enabled immediately at startup.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Message-id: 1459171833-26416-6-git-send-email-qiaowei.ren@intel.com
This patch implement a set of interfaces for perf event. Based on
these interfaces, we can implement internal driver API for perf,
and get the results of perf conuter you care about.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Message-id: 1459171833-26416-4-git-send-email-qiaowei.ren@intel.com
API agreed on in
https://www.redhat.com/archives/libvir-list/2015-October/msg00872.html
* include/libvirt/libvirt-domain.h (virDomainGetPerfEvents,
virDomainSetPerfEvents): New declarations.
* src/libvirt_public.syms: Export new symbols.
* src/driver-hypervisor.h (virDrvDomainGetPerfEvents,
virDrvDomainSetPerfEvents): New typedefs.
* src/libvirt-domain.c: Implement virDomainGetPerfEvents and
virDomainSetPerfEvents.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Message-id: 1459171833-26416-2-git-send-email-qiaowei.ren@intel.com
If the pool creation thread happens to detect the luns in
the scsi target, the size parameters will be calculated as
part of the refreshPool called from storagePoolCreate().
This means the virStoragePoolFCRefreshThread (commit id
'512b874') waiting to run and "refresh" the pool will
essentially double the allocation and capacity values.
A separate refresh would correct the values.
To avoid this, the FCRefreshThread needs to reinitialize
the pool size values prior to calling virStorageBackendSCSIFindLUs
which eventually calls virStorageBackendSCSINewLun and
updates the size values for each volume found.
After the recent commits the build didn't work for me. Fix it by
using size_t as the callback argument is using and the correct
formatter. The attempted fixup to use %llu as a formatter was wrong.
Commit e6336442 changed the 'out:' label to 'cleanup' in
libxlDomainAttachNetDevice(), but missed a comment referencing
the 'out:' label. Remove it from the comment since it is no
longer accurate anyhow.
This patch adds support for "vpindex", "runtime", "synic",
"stimer", and "vendor_id" features available in qemu 2.5+.
- When Hyper-V "vpindex" is on, guest can use MSR HV_X64_MSR_VP_INDEX
to get virtual processor ID.
- Hyper-V "runtime" enlightement feature allows to use MSR
HV_X64_MSR_VP_RUNTIME to get the time the virtual processor consumes
running guest code, as well as the time the hypervisor spends running
code on behalf of that guest.
- Hyper-V "synic" stands for Synthetic Interrupt Controller, which is
lapic extension controlled via MSRs.
- Hyper-V "stimer" switches on Hyper-V SynIC timers MSR's support.
Guest can setup and use fired by host events (SynIC interrupt and
appropriate timer expiration message) as guest clock events
- Hyper-V "reset" allows guest to reset VM.
- Hyper-V "vendor_id" exposes hypervisor vendor id to guest.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
1. All hyperv features are tristate ones. So make tristate generating part common.
2. Reduce nesting on spinlocks.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
1. All hyperv features are tristate ones. So make tristate parsing code common.
2. Reindent switch statement.
3. Reduce nesting in spinlocks parsing.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
After the patches that added tracking of in-use macvtap names (commit
370608, first appearing in libvirt-1.3.2), if the function to allocate
a new macvtap device came to a device name created outside libvirt, it
would retry the same device name MACVLAN_MAX_ID (8191) times before
finally giving up in failure.
The problem was that virBitmapNextClearBit was always being called
with "0" rather than the value most recently checked (which would
increment each time through the loop), so it would always return the
same id (since we dutifully release that id after failing to create a
new device using it).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1321546
Signed-off-by: Laine Stump <laine@laine.org>
For those VF allocated from a network pool, we need to set its backend
to be VIR_DOMAIN_HOSTDEV_PCI_BACKEND_XEN so that later work can be
correct.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
This reverts commit bb5f2dc91f.
The "if (vol->target.format != VIR_STORAGE_FILE_RAW)" check in the
createVol backend. This check is bogus because virStorageVolDefParseXML()
in conf/storage_conf.c sets target.format only if volOptions in
virStoragePoolTypeInfo has formatFromString set, and that's not the
case the zfs backend.
So the check always fails and breaks volume creation.
This reverts commit 6682d6219d.
The "if (vol->target.format != VIR_STORAGE_FILE_RAW)" check in the
createVol backend. This check is bogus because virStorageVolDefParseXML()
in conf/storage_conf.c sets target.format only if volOptions in
virStoragePoolTypeInfo has formatFromString set, and that's not the
case the logical backend.
So the check always fails and breaks volume creation.
When hostdev parent is network device, should call
libxlDomainDetachNetDevice to detach the device from a higher level.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
When AttachNetDevice failed, should call networkReleaseActualDevice
to release actual device, and if actual device is hostdev, should
remove the hostdev from vm->def->hostdevs.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
networkStartNetwork() and networkShutdownNetwork() were calling the
wrong type-specific function in the case of networks that were
configured for macvtap ("direct") bridge mode - they were instead
calling the functions for a tap+bridge network. Currently none of
these functions does anything (they just return 0) so it hasn't
created any problems, but that could change in the future.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1316465
An attempt to simplify the code for the VIR_NETWORK_FORWARD_BRIDGE
case of networkUpdateState in commit b61db335 (first in release
1.2.14) resulted in networks based on macvtap bridge mode being
erroneously marked as inactive any time libvirtd was restarted.
The problem is that the original code had differentiated between a
network using tap devices to connect to an existing host-bridge device
(forward mode of VIR_NETWORK_FORWARD_BRIDGE and a non-NULL
def->bridge), and one using macvtap bridge mode to connect to any
ethernet device (still forward mode VIR_NETWORK_FORWARD_BRIDGE, but
null def->bridge), but the changed code assumed that all networks with
VIR_NETWORK_FORWARD_BRIDGE were tap + host-bridge networks, so a null
def->bridge was interpreted as "inactive".
This patch restores the original code in networkUpdateState
The adminDispatchConnectListServers() function is generated by
our great perl script. However, it has a tiny flaw: if
adminConnectListServers() it calls fails, the control jumps onto
cleanup label where we try to free any list of servers built so
far. However, in the loop @i is unsigned (size_t) while @nresults
is signed (int). Currently, it does no harm because of the check
for @result being non-NULL. But if that ever changes in the
future, this bug will be hard to chase.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If a user specify network type ethernet, then create it via libvirt and run
script if it provided. After this commit user does not need to
run external script to create tap device or add root permissions to qemu
process.
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Instead of forcing the values for the unbind_from_stub, remove_slot
and reprobe properties, look up the actual device and use that when
calling virPCIDeviceReattach().
This ensures the device is restored to its original state after
reattach: for example, if it was not bound to any driver before
detach, it will not be bound forcefully during reattach.
We would be just fine looking up the information in pcidevs most
of the time; however, some corner cases would not be handled
properly, so look up the actual device instead.
After this patch, ownership of virPCIDevice instances is very easy
to keep track of: for each host PCI device, the only instance that
actually matters is the one inside one of the bookkeeping list.
Whenever some operation needs to be performed on a PCI device, the
actual device is looked up first; when this is not the case, a
comment explains the reason.
Unmanaged devices, as the name suggests, are not detached
automatically from the host by libvirt before being attached to a
guest: it's the user's responsability to detach them manually
beforehand. If that preliminary step has not been performed, the
attach operation can't complete successfully.
Instead of relying on the lower layers to error out with cryptic
messages such as
error: Failed to attach device from /tmp/hostdev.xml
error: Path '/dev/vfio/12' is not accessible: No such file or directory
prevent the situation altogether and provide the user with a more
useful error message.
Unmanaged devices are attached to guests in two steps: first,
the device is detached from the host and marked as inactive;
subsequently, it is marked as active and attached to the guest.
If the daemon is restarted between these two operations, we lose
track of the inactive device.
Steps 5 and 6 of virHostdevPreparePCIDevices() already subtly
take care of this situation, but some planned changes will make
it so that's no longer the case. Plus, explicit is always better
than implicit.
This will skip few steps from qemuProcessStart in order to create only
qemu CMD. Use a VIR_QEMU_PROCESS_START_PRETEND for all the qemuProcess*
functions called by this one to not modify or check host.
This new function will be used later on for XMLToNative API and also for
qemuxml2argvtest to make sure that both API and test uses the same code
as qemuProcessStart.
We need also update qemuProcessInit to wrap few lines of code with check
that VIR_QEMU_PROCESS_START_PRETEND that makes sense only for
qemuProcessStart.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Move all code that checks host and domain. Do not check host if we use
VIR_QEMU_PROCESS_START_PRETEND flag.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Move all code that modifies only live XML to this function. The new
VIR_QEMU_PROCESS_START_PRETEND flag will be used by qemuXMLToNative and
qemuxml2argvtest later in order to reuse the same code as
qemuProcessStart uses.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The postParse callback is the correct place to generate default values
that should be present in offline XML.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
QEMU changed the error message to:
"Tray of device 'drive-sata0-0-1' is not open"
and they may change the error massage in the future.
This updates the code to not depend on the text from the error message
but only on error itself.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
When reading in an XML definition for a SCSI target device, the name
property of struct scsi_target refers to the @target element.
Let's fix this obvious typo and also extend the XML schema to provide
validation.
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Until now, the libxl driver ignored any <hap> setting in domain XML
and deferred to libxl, which enables hap if not specified. While
this is a good default, it prevents disabling hap if desired.
This change allows disabling hap with <hap state='off'/>. hap is
explicitly enabled with <hap/> or <hap state='on/>. Absense of <hap>
retains current behavior of deferring default state to libxl.
hap is enabled by default in xm and xl config and usually only
specified when it is desirable to disable hap (hap = 0). Change
the xm,xl <-> xml converter to behave similarly. I.e. only
produce 'hap = 0' when <hap state='off'/> and vice versa.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Most hypervisors use Hardware Assisted Paging by default and don't
require specifying the feature in domain conf. But some hypervisors
support disabling HAP on a per-domain basis. To enable HAP by default
yet provide a knob to disable it, extend the <hap> feature with a
'state=on|off' attribute, similar to <pvspinlock> and <vmport> features.
In the absence of <hap>, the hypervisor default (on) is used. <hap>
without the state attribute would be the same as <hap state='on'/> for
backwards compatibility. And of course <hap state='off'/> disables hap.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
The function already takes two bool arguments, switching to flags makes
it a lot easier to read. Especially in case we need to add another
boolean in the future.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In post-copy mode none of the hosts has a complete guest state and
rolling back migration is impossible. Thus aborting it would be
equivalent to destroying the domain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When migration fails in the post-copy mode, it's impossible to just kill
the destination domain and resume the source since the source no longer
contains current guest state. Let's mark domains on both sides as
VIR_DOMAIN_PAUSED_POSTCOPY_FAILED to let the upper layer decide what to
do with them.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When destination libvirtd is restarted during migration in Finish phase
just after the point we started guest CPUs, we should not kill the
domain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Migration enters "postcopy-active" state after QEMU switches to
post-copy and pauses guest CPUs. From libvirt's point of view this state
is similar to "completed" because we need to transfer guest execution to
the destination host.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
To use post-copy one has to start the migration with
VIR_MIGRATE_POSTCOPY flag and, while migration is in progress, call
virDomainMigrateStartPostCopy() to switch from pre-copy to post-copy.
Signed-off-by: Cristian Klein <cristiklein@gmail.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY and VIR_DOMAIN_PAUSED_POSTCOPY are
used on the source host once migration enters post-copy mode (which
means the domain gets paused on the source. After the destination host
takes over the execution of the domain, its virtual CPUs are resumed and
the domain enters VIR_DOMAIN_RUNNING_POSTCOPY state and
VIR_DOMAIN_EVENT_RESUMED_POSTCOPY event is emitted.
In case migration fails during post-copy mode and none of the hosts have
complete state of the domain, both domains will remain paused with
VIR_DOMAIN_PAUSED_POSTCOPY_FAILED reason and an upper layer may decide
what to do.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This allows setting the address in host and/or network order and makes
the naming consistent. Now you don't need to call [hn]to[nh]l()
functions as that is taken care of by these functions. Also, now
the *NetOrder take the address in network order, the other functions in
host order so the naming and usage is consistent. Some places were
having the address in network order and calling ntohl() just so the
original function can call htonl() again. This makes it nicer to read.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If a <graphics type='spice'> has no port nor tlsPort set, the generated
QEMU command line will contain -spice port=0.
This is later going to be ignored by spice-server, but it's better not
to add it at all in this situation.
As an empty -spice is not allowed, we still need to append port=0 if we
did not add any other argument.
The end goal is to avoid adding -spice port=0,addr=127.0.0.1 to QEMU command
line when no SPICE port is specified in libvirt XML.
Currently, the code relies on port=xx to always be present, so subsequent
args can be unconditionally appended with a leading ','. Since port=0
will no longer be added in a subsequent commit, we append a ',' to every
arg instead of prepending, and remove the last one before adding it to
the arg list.
It's just a combination of AddImplicitControllers, and AddConsoleCompat.
Every caller that wants ImplicitControllers also wants the ConsoleCompat
AFAICT, so lump them together. We also need it for future patches.
Judging by how the whitelist has skewed quite far from the original
error message, I think it's better to just drop these.
If someone wants to revive this check I suggest implementing it on
a per-HV driver basis with PostParse callbacks.