A new function virNetDevOpenvswitchUpdateVlan has been created to instruct
OVS of the changes. qemuDomainChangeNet has been modified to handle the
update of the VLAN configuration for a running guest and rely on
virNetDevOpenvswitchUpdateVlan to do the actual update if needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Preparation for switching to virFileCache where there are two callbacks,
one to get a new data and second one to load a cached data.
This also removes virQEMUCapsReset which is no longer required.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
It's not required and following patches will change the code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Cleanups the code a little bit and reduces amount of arguments passed
throughout the functions.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
While searching for an element using a function it may be
desirable to know the element key for future operation.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
At present shared disks can be migrated with either readonly or cache=none. But
cache=directsync should be safe for migration, because both cache=directsync and cache=none
don't use the host page cache, and cache=direct write through qemu block layer cache.
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Reviewed-by: Wang Yechao <wang.yechao255@zte.com.cn>
Use virStorageSource accessors to check the file and call
virStorageFileAccess before even attempting to stat the target. This
will be helpful once we try to add network destinations for block copy,
since there will be no need to stat them.
When copying to a block device, the block device will already exist. To
allow users using a block device without any preparation, they need to
use the block copy without VIR_DOMAIN_BLOCK_COPY_REUSE_EXT.
This means that if the target is an existing block device we don't need
to prepare it, but we can't reject it as being existing.
To avoid breaking this feature, explicitly assume that existing block
devices will be reused even without that flag explicitly specified,
while skipping attempts to create it.
qemuMonitorDriveMirror still needs to honor the flag as specified by the
user, since qemu overwrites the metadata otherwise.
The refactor to split up storage driver into modules broke the apparmor
helper program, since that did not initialize the storage driver
properly and thus detection of the backing chain could not work.
Register the storage driver backends explicitly. Unfortunately it's now
necessary to link with the full storage driver to satisfy dependencies
of the loadable modules.
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reported-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
The purpose of this function is to tell if the current position
in given FD is in data section or a hole and how much bytes there
is remaining until the end of the section. This is achieved by
couple of lseeks(). The most important part is that we reposition
the FD back, so that the position is unchanged from the caller
POV. And until now the final lseek() back to the original
position was done with no check for errors. And I was convinced
that that's okay since nothing can go wrong. However, review
feedback from a related series persuaded me, that it's better to
be safe than sorry. Therefore, lets check if the final lseek()
succeeded and if it doesn't report an error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
All the pieces are now in place, so we can finally start
using isolation groups to achieve our initial goal, which is
separating hostdevs from emulated PCI devices while keeping
hostdevs that belong to the same host IOMMU group together.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1280542
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
These rules will make it possible for libvirt to
automatically assign PCI addresses in a way that
respects any isolation constraints devices might
have.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Isolation groups will eventually allow us to make sure certain
devices, eg. PCI hostdevs, are assigned to guest PCI buses in
a way that guarantees improved isolation, error detection and
recovery for machine types and hypervisors that support it,
eg. pSeries guest on QEMU.
This patch merely defines storage for the new information
we're going to need later on and makes sure it is passed from
the hypervisor driver (QEMU / bhyve) down to the generic PCI
address allocation code.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Now that we have a bit more control, let's convert our object into
a lockable object and let that magic handle the create and lock/unlock.
This also involves creating a virNodeDeviceEndAPI in order to handle
the object cleanup for API's that use the Add or Find API's in order
to get a locked/reffed object. The EndAPI will unlock and unref the
object returning NULL to indicate to the caller to not use the obj.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the structures to withing virnodedeviceobj.c
Move the typedefs from node_device_conf to virnodedeviceobj.h
Signed-off-by: John Ferlan <jferlan@redhat.com>
In an overall effort to privatize access to virNodeDeviceObj and
virNodeDeviceObjList into the virnodedeviceobj module, move the
object list parsing from node_device_driver and replace with a
call to a virnodedeviceobj helper. This follows other similar
APIs/helpers which peruse the object list looking for some specific
data in order to get/return an @device (virNodeDevice) object to
the caller.
Signed-off-by: John Ferlan <jferlan@redhat.com>
We're about to move the call to nodeDeviceSysfsGetSCSIHostCaps from
node_device_driver into virnodedeviceobj, so move the guts of the code
from the driver specific node_device_linux_sysfs into its own API
since virnodedeviceobj cannot callback into the driver.
Nothing in the code deals with sysfs anyway, as that's hidden by the
various virSCSIHost* and virVHBA* utility function calls.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create local @obj and @def for the API's rather than referencing the
devs->objs[i][->def->]. It'll make future patches easier to read.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Ensure that any function that walks the node device object list is prefixed
by virNodeDeviceObjList.
Also, modify the @filter param name for virNodeDeviceObjListExport to
be @aclfilter.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation to make things private, make the ->devs be pointers to a
virNodeDeviceObjList and then manage everything inside virnodedeviceobj
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create an allocator for the virNodeDeviceObjPtr - include setting up
the mutex, saving the virNodeDeviceDefPtr, and locking the return object.
Signed-off-by: John Ferlan <jferlan@redhat.com>
- In testDestroyVport rather than use a cleanup label, just return -1
immediately since nothing else is needed.
- In testStoragePoolDestroy, if !privpool, then just return -1 since
nothing else will happen anyway.
- Rather than "goto cleanup;" on failure to virNodeDeviceObjFindByName
an @obj, just return directly. This then allows the cleanup: label code
to not have to check "if (obj)" before calling virNodeDeviceObjUnlock.
This also simplifies some exit logic...
- In testNodeDeviceObjFindByName use an error: label to handle the failure
and don't do the ncaps++ within the VIR_STRDUP() source target index.
Only increment ncaps after success. Easier on eyes at error label too.
- In testNodeDeviceDestroy use "cleanup" rather than "out" for the goto
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than passing the object to be removed by reference, pass by value
and then let the caller decide whether or not the object should be free'd
and how to handle the logic afterwards. This includes free'ing the object
and/or setting the local variable to NULL to prevent subsequent unexpected
usage (via something like virNodeDeviceObjRemove in testNodeDeviceDestroy).
For now this function will just handle the remove of the object from the
list for which it was placed during virNodeDeviceObjAssignDef.
This essentially reverts logic from commit id '61148074' that free'd the
device entry on list, set *dev = NULL and returned. Thus fixing a bug in
node_device_hal.c/dev_refresh() which would never call dev_create(udi)
since @dev would have been set to NULL.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Recent refactors made it so that the function may use uninitialized
pointer, but it actually wanted to use a different variable and value
at all.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This reverts commit b3e71a8830.
As it turns out this ends up very badly as the @def could be Free'd
even though it's owned by @obj as a result of the AssignDef.
New API will be virNWFilterInstantiateFilterInternal as it's called from
the virNWFilterInstantiateFilter and virNWFilterUpdateInstantiateFilter.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rename to virNWFilterDoInstantiate to better describe the action.
Also fix the @vmuuid parameter to not have the ATTRIBUTE_UNUSED since it
is used in the call to virNWFilterDHCPSnoopReq.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When looking for slots suitable for a PCI device, libvirt
might need to add an extra PCI controller: for pSeries guests,
we want that extra controller to be a PHB (pci-root) rather
than a PCI bridge.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
PCI bus has to be numbered sequentially, and no index can be
missing, so libvirt will fill in the blanks automatically for
the user.
Up until now, it has done so using either pci-bridge, for machine
types based on legacy PCI, or pcie-root-port, for machine types
based on PCI Express. Neither choice is good for pSeries guests,
where PHBs (pci-root) should be used instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Now that the multi-phb support series is in, work on the TODO at
qemuDomainGetMemLockLimitBytes() to arrive at the correct memlock limit
value.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Additional PHBs (pci-root controllers) will be created for
the guest using the spapr-pci-host-bridge QEMU device, if
available; the implicit default PHB, while present in the
guest configuration, will be skipped.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1431193
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Usually, a controller with alias 'x' will create a bus with the
same name; however, the bus created by a PHBs with alias 'x' will
be named 'x.0' instead, so we need to account for that.
As an exception to the exception, the implicit PHB that's added
automatically to every pSeries guest creates the 'pci.0' bus.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
This new capability can be used to detect whether a QEMU
binary supports the spapr-pci-host-bridge controller.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
pSeries guests will soon need the new information; luckily,
we can figure it out automatically most of the time, so
users won't have to worry about it.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Adding it to the virDomainControllerPCIModelName enumeration
is enough for existing code to handle it, so parsing and
formatting will work without further tweaking.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
pSeries guests will soon be allowed to have multiple
PHBs (pci-root controllers), meaning the current check
on the controller index no longer applies to them.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
pSeries guests will soon be allowed to have multiple
PHBs (pci-root controllers), which of course means that
all but one of them will have a non-zero index; hence,
we'll need to relax the current check.
However, right now the check is performed in the conf
module, which is generic rather than tied to the QEMU
driver, and where we don't have information such as the
guest machine type available.
To make this change of behavior possible down the line,
we need to move the check from the XML parser to the
drivers. Luckily, only QEMU and bhyve are using PCI
controllers, so this doesn't result in much duplication.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Moving the check and rewriting it this way doesn't alter
the current behavior, but will allow us to special-case
pci-root down the line.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
We will soon need to be able to return a NULL pointer
without the caller considering that an error: to make
it possible, change the return type to int and use
an out parameter for the string instead.
Add some documentation for the function as well.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
The current algorithm for slot allocation tries to be clever
and avoid looking at buses / slots more than once unless it's
necessary. Unfortunately that makes the code more complex,
and it will cause problem later on in some situations unless
even more complex code is added.
Since the performance gains are going to be pretty modest
anyway, we can just get rid of the extra complexity and use a
completely straighforward implementation instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
We use hostdev->info frequently enough that having
a shorter name for it makes the code more readable.
We will also be adding even more uses later on.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Move @function after @flags to match other functions in the
same module like virDomainPCIAddressReserveNextAddr().
Also move virDomainPCIAddressReserveNextAddr() closer to
virDomainPCIAddressReserveAddr() in the header file.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
This function was private to the QEMU driver and was,
accordingly, called qemuDomainPCIBusFullyReserved().
However the function is really not QEMU-specific at
all, so it makes sense to move it closer to the
virDomainPCIAddressBus struct it operates on.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
The virDomainDeviceInfoIsSet() function no longer exists.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Modify code to have two spaces between functions, follow function more
recent function formatting w/r/t args per line and function return type
and name on separate lines.
Signed-off-by: John Ferlan <jferlan@redhat.com>
New name is qemuBlockStorageSourceGetGlusterProps and also hardcode the
protocol name rather than calling the ToString function, since this
function can't be made universal.
New name is qemuBlockStorageSourceBuildHostsJSONSocketAddress since it
formats the JSON object in accordance with qemu's SocketAddress type.
Since the new naming in qemu uses 'inet' instead of 'tcp' add a
compatibility layer for gluster which uses the old name.
Rename it to qemuBlockStorageSourceGetBackendProps and refactor it to
return the JSON object instead of filling a pointer since now it's
always expected to return data.
Add logic which will call qemuGetDriveSourceProps only in cases where we
need the JSON representation. This will allow qemuGetDriveSourceProps to
generate the JSON representation for all possible disk sources.
The command line generators for the protocols above hardcoded a default
port number. Since we now always assign it when parsing the source
definition, this ad-hoc code is not required any more.
Fill them in right away rather than having to figure out at runtime
whether they are necessary or not.
virStorageSourceNetworkDefaultPort does not need to be exported any
more.
Since we're storing a virUUIDFormat'd string in our Hash Table, let's
modify the Lookup API to receive a formatted string as well.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This reverts commit e4b980c853.
When a binary links against a .a archive (as opposed to a shared library),
any symbols which are marked as 'weak' get silently dropped. As a result
when the binary later runs, those 'weak' functions have an address of
0x0 and thus crash when run.
This happened with virtlogd and virtlockd because they don't link to
libvirt.so, but instead just libvirt_util.a and libvirt_rpc.a. The
virRandomBits symbols was weak and so left out of the virtlogd &
virtlockd binaries, despite being required by virHashTable functions.
Various other binaries like libvirt_lxc, libvirt_iohelper, etc also
link directly to .a files instead of libvirt.so, so are potentially
at risk of dropping symbols leading to a later runtime crash.
This is normal linker behaviour because a weak symbol is not treated
as undefined, so nothing forces it to be pulled in from the .a You
have to force the linker to pull in weak symbols using -u$SYMNAME
which is not a practical approach.
This risk is silent bad linkage that affects runtime behaviour is
not acceptable for a fix that was merely trying to fix the test
suite. So stop using __weak__ again.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When libvirt starts a new QEMU domain, it replaces host-model CPUs with
the appropriate custom CPU definition. However, when reconnecting to a
domain started by older libvirt (< 2.3), the domain would still have a
host-model CPU in its active definition.
https://bugzilla.redhat.com/show_bug.cgi?id=1463957
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
qemuProcessReconnect will need to call additional functions which were
originally defined further in qemu_process.c.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Separated from qemuProcessUpdateAndVerifyCPU to handle updating of an
active guest CPU definition according to live data from QEMU.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
In addition to updating a guest CPU definition the function verifies
that all required features are provided to the guest. Let's make it
obvious by calling it qemuProcessUpdateAndVerifyCPU.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Separated from qemuProcessUpdateLiveGuestCPU. The function makes sure
a guest CPU provides all features required by a domain definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Separated from qemuProcessUpdateLiveGuestCPU. Its purpose is to fetch
guest CPU data from a running QEMU process. The data can later be used
to verify and update the active guest CPU definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
CPU features unknown to a hypervisor will not be present in dataDisabled
even though the features won't naturally be enabled because.
Thus any features we asked for which are not in dataEnabled should be
considered disabled.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When checking ABI stability between two domain definitions, we first
make migratable copies of them. However, we also asked for the guest CPU
to be updated, even though the updated CPU is supposed to be already
included in the original definitions. Moreover, if we do this on the
destination host during migration, we're potentially updating the
definition with according to an incompatible host CPU.
While updating the CPU when checking ABI stability doesn't make any
sense, it actually just worked because updating the CPU doesn't do
anything for custom CPUs (only host-model CPUs are affected) and we
updated both definitions in the same way.
Less then a year ago commit v2.3.0-rc1~42 stopped updating the CPU in
the definition we got internally and only the user supplied definition
was updated. However, the same commit started updating host-model CPUs
to custom CPUs which are not affected by the request to update the CPU.
So it still seemed to work right, unless a user upgraded libvirt 2.2.0
to a newer version while there were some domains with host-model CPUs
running on the host. Such domains couldn't be migrated with a user
supplied XML since libvirt would complain:
Target CPU mode custom does not match source host-model
The fix is pretty straightforward, we just need to stop updating the CPU
when checking ABI stability.
https://bugzilla.redhat.com/show_bug.cgi?id=1463957
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Currently the scan of the /proc/mounts file used to find cgroup mount
points doesn't take into account that mount points may hidden by other
mount points. For, example in certain Kubernetes environments the
/proc/mounts contains the following lines:
cgroup /sys/fs/cgroup/net_prio,net_cls cgroup ...
tmpfs /sys/fs/cgroup tmpfs ...
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup ...
In this particular environment the first mount point is hidden by the
second one. The correct mount point is the third one, but libvirt will
never process it because it only checks the first mount point for each
controller (net_cls in this case). So libvirt will try to use the first
mount point, which doesn't actually exist, and the complete detection
process will fail.
To avoid that issue this patch changes the virCgroupDetectMountsFromFile
function so that when there are duplicates it takes the information from
the last line in /proc/mounts. This requires removing the previous
explicit condition to skip duplicates, and adding code to free the
memory used by the processing of duplicated lines.
Related-To: https://bugzilla.redhat.com/1468214
Related-To: https://github.com/kubevirt/libvirt/issues/4
Signed-off-by: Juan Hernandez <jhernand@redhat.com>
After 426dc5eb2 qemuCaps and virDomainDefPtr are unused here,
remove it from the call stack
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Obviously, old gcc-s ale sad when a variable shares the name with
a function. And we do have such variable (added in 4d8a914be0):
@mount. Rename it to @mountpoint so that compiler's happy again.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The way we create devices under /dev is highly linux specific.
For instance we do mknod(), mount(), umount(), etc. Some
platforms are even missing some of these functions. Then again,
as declared in qemuDomainNamespaceAvailable(): namespaces are
linux only. Therefore, to avoid obfuscating the code by trying to
make it compile on weird platforms, just provide a non-linux stub
for qemuDomainAttachDeviceMknodRecursive(). At the same time,
qemuDomainAttachDeviceMknodHelper() which actually calls the
non-existent functions is moved under ifdef __linux__ block since
its only caller is in that block too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1467826
Commit id 'b9b1aa639' was supposed to add logic to set the allocation
for sparse files when wr_highest_offset was zero; however, an unconditional
setting was done just prior. For block devices, this means allocation is
always returning 0 since 'actual-size' will be zero.
Remove the unconditional setting and add the note about it being possible
to still be zero for block devices. As soon as the guest starts writing to
the volume, the allocation value will then be obtainable from qemu via
the wr_highest_offset.
The Win32 platform will fail to link if you use weak symbols
because it is incompatible with exporting symbols in a DLL:
Cannot export virRandomGenerateWWN: symbol wrong type (2 vs 3)
We only need weak symbols for our test suite to do LD_PRELOAD
and this doesn't work on Win32, so we can just drop the hack
for Win32
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If we exceed a fixed limit in RPC code we get a horrible message
like this, if the parameter type is a 'string', because we forgot
to initialize the error message type field:
$ virsh snapshot-list ostack1
error: too many remote undefineds: 1329 > 1024
It would also be useful to know which RPC call and field was
exceeded. So this patch makes us report:
$ virsh snapshot-list ostack1
error: too many remote undefineds: 1329 > 1024,
in parameter 'names' for 'virDomainSnapshotListNames'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
On domain startup, bind host or bind service can be omitted
and we will format a working command line.
Extend this to hotplug as well and specify the service to QEMU
even if the host is missing.
https://bugzilla.redhat.com/show_bug.cgi?id=1452441
Currently all mockable functions are annotated with the 'noinline'
attribute. This is insufficient to guarantee that a function can
be reliably mocked with an LD_PRELOAD. The C language spec allows
the compiler to assume there is only a single implementation of
each function. It can thus do things like propagating constant
return values into the caller at compile time, or creating
multiple specialized copies of the function body each optimized
for a different caller. To prevent these optimizations we must
also set the 'noclone' and 'weak' attributes.
This fixes the test suite when libvirt.so is built with CLang
with optimization enabled.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The TODO macro expands to an fprintf() call and is used in several
places in the Xen driver. Anything that wishes to print such debug
messages should use the logging macros. In this case though, all the
places in the Xen driver should have been raising a formal libvirt
error instead. Add proper error handling and delete the TODO macro
to prevent future misuse.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The HOST_NAME_MAX, INET_ADDRSTRLEN and VIR_LOOPBACK_IPV4_ADDR
constants are only used by a handful of files, so are better
kept in virsocketaddr.h or the source file that uses them.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We only ever test libvirt with GCC or CLang which provides a
GCC compatible compilation environment. Between them, these
compilers cover every important operating system platform,
even Windows.
Mandate their use to make it explicit that we don't care about
compilers like Microsoft VCC or other UNIX vendor C compilers.
GCC 4.4 was picked as the baseline, since RHEL-6 ships 4.4.7
and that lets us remove a large set of checks. There is a slight
issue that CLang reports itself as GCC 4.2, so we must also check
if __clang__ is defined. We could check a particular CLang version
too, but that would require someone to figure out a suitable min
version which is fun because OS-X reports totally different CLang
version numbers from CLang builds on Linux/BSD
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Back in this commit:
commit b436a8ae5c
Author: Fabian Freyer <fabian.freyer@physik.tu-berlin.de>
Date: Thu Jun 9 00:50:35 2016 +0000
gnulib: add getopt module
config-post.h was modified to define __GNUC_PREREQ, but the
original definition was never removed from internal.h, and
that is now dead code since config.h is always the first file
included.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently, the only type of chardev that we create the backend
for in the namespace is type='dev'. This is not enough, other
backends might have files under /dev too. For instance channels
might have a unix socket under /dev (well, bind mounted under
/dev from a different place).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1462060
Just like in the previous commit, when attaching a file based
device which has its source living under /dev (that is not a
device rather than a regular file), calling mknod() is no help.
We need to:
1) bind mount device to some temporary location
2) enter the namespace
3) move the mount point to desired place
4) umount it in the parent namespace from the temporary location
At the same time, the check in qemuDomainNamespaceSetupDisk makes
no longer sense. Therefore remove it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1462060
When building a qemu namespace we might be dealing with bare
regular files. Files that live under /dev. For instance
/dev/my_awesome_disk:
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2'/>
<source file='/dev/my_awesome_disk'/>
<target dev='vdc' bus='virtio'/>
</disk>
# qemu-img create -f qcow2 /dev/my_awesome_disk 10M
So far we were mknod()-ing them which is
obviously wrong. We need to touch the file and bind mount it to
the original:
1) touch /var/run/libvirt/qemu/fedora.dev/my_awesome_disk
2) mount --bind /dev/my_awesome_disk /var/run/libvirt/qemu/fedora.dev/my_awesome_disk
Later, when the new /dev is built and replaces original /dev the
file is going to live at expected location.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Currently, we silently assume that file we are creating in the
namespace is either a link or a device (character or block one).
This is not always the case. Therefore instead of doing something
wrong, claim about unsupported file type.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Currently, we silently assume that file we are creating in the
namespace is either a link or a device (character or block one).
This is not always the case. Therefore instead of doing something
wrong, claim about unsupported file type.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This function is going to be used on other places, so
instead of copying code we can just call the function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1459592
In 290a00e41d I've tried to fix the process of building a
qemu namespace when dealing with file mount points. What I
haven't realized then is that we might be dealing not with just
regular files but also special files (like sockets). Indeed, try
the following:
1) socat unix-listen:/tmp/soket stdio
2) touch /dev/socket
3) mount --bind /tmp/socket /dev/socket
4) virsh start anyDomain
Problem with my previous approach is that I wasn't creating the
temporary location (where mount points under /dev are moved) for
anything but directories and regular files.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
It comes very handy to have source path for chardevs. We already
have such function: virDomainAuditChardevPath() but it's static
and has name not suitable for exposing. Moreover, while exposing
it change its name slightly to virDomainChrSourceDefGetPath.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
If a value of the first level object contains more objects needing
deflattening which would be wrapped in an actual object the function
would not recurse into them.
By this simple addition we can fully deflatten the objects.
As it turns out sometimes users pass in an arbitrarily nested structure
e.g. for the qemu backing chains JSON pseudo protocol. This new
implementation deflattens now a single object fully even with nested
keys.
Additionally it's not necessary now to stick with the "file." prefix for
the properties.
Currently the function would deflatten the object by dropping the 'file'
prefix from the attributes. This does not really scale well or adhere to
the documentation.
Until we refactor the worker to properly deflatten everything we at
least simulate it by adding the "file" wrapper object back.
Users may want to run the init command of a container as a special
user / group. This is achieved by adding <inituser> and <initgroup>
elements. Note that the user can either provide a name or an ID to
specify the user / group to be used.
This commit also fixes a side effect of being able to run the command
as a non-root user: the user needs rights on the tty to allow shell
job control.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Some containers may want the application to run in a special directory.
Add <initdir> element in the domain configuration to handle this case
and use it in the lxc driver.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
virCommand is a version of virExec that doesn't fork, however it is
just calling execve and doesn't honors setting uid/gid and pwd.
This commit extrac those pieces from virExec() to a virExecCommon()
function that is called from both virExec() and virCommandExec().
When running an application container, setting environment variables
could be important.
The newly introduced <initenv> tag in domain configuration will allow
setting environment variables to the init program.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
All of these four functions (virStreamRecvAll, virStreamSendAll,
virStreamSparseRecvAll, virStreamSparseSendAll) take one or more
callback functions that handle various aspects of streams.
However, if any of them fails no error is reported therefore
caller does not know what went wrong.
At the same time, we silently presumed callbacks to set errno on
failure. With this change we should document it explicitly as the
error is not properly reported.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
If one these four functions fail (virStreamRecvAll,
virStreamSendAll, virStreamSparseRecvAll, virStreamSparseSendAll)
the stream is aborted by calling virStreamAbort(). This is a
public API; therefore, the first thing it does is error reset. At
that point any error that caused us to abort stream in the first
place is gone.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Our documentation to the virStreamRecvAll, virStreamSendAll,
virStreamSparseRecvAll, and virStreamSparseSendAll functions
indicates that if these functions fail, then virStreamAbort is
called. But that is not necessarily true. For instance all of
these functions allocate a buffer to work with. If the allocation
fails, no virStreamAbort() is called despite -1 being returned.
It's the same story with argument sanity checks and a lot of
other checks.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Problem with our error reporting is that the error object is a
thread local variable. That means if there's an error reported
within the I/O thread it gets logged and everything, but later
when the event loop aborts the stream it doesn't see the original
error. So we are left with some generic error. We can do better
if we copy the error message between the threads.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When the I/O thread quits (e.g. due to an I/O error, lseek()
error, whatever), any subsequent virFDStream API should return
error too. Moreover, when invoking stream event callback, we must
set the VIR_STREAM_EVENT_ERROR flag so that the callback knows
something bad happened.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This is only used in qemu_command.c, so move it, and clarify that
it's really about identifying if the serial config is a platform
device or not.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Some qemu arch/machine types have built in platform devices that
are always implicitly available. For platform serial devices, the
current code assumes that only old style -serial config can be
used for these devices.
Apparently though since -chardev was introduced, we can use -chardev
in these cases, like this:
-chardev pty,id=foo
-serial chardev:foo
Since -chardev enables all sorts of modern features, use this method
for platform devices.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Every qemu version we support has QEMU_CAPS_CHARDEV, so stop
explicitly tracking it and blacklist it like we've done for many
other feature flags.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
AFAIK there aren't any cases where we will/should hit the old code
path for our supported qemu versions, so drop the old code.
Massive test suite churn follows
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
AFAIK there aren't any cases where we should fail these checks with
supported qemu versions, so just drop them.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
AFAIK there aren't any qemu arch/machine types with platform parallel
devices that would require old style -parallel config, so we shouldn't
ever need this nowadays.
Remove a now redundant test
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Rather than try to whitelist all device configs that can't use
-chardev, blacklist the only one that really can't, which is the
default serial/console target type=isa case.
ISA specifically isn't a valid config for arm/aarch64, but we've
always implicitly treated it to mean 'default platform device'.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
vcpu properties gathered from query-hotpluggable cpus need to be passed
back to qemu. As qemu did not use the node-id property until now and
libvirt forgot to pass it back properly (it was parsed but not passed
around) we did not honor this.
This patch adds node-id to the structures where it was missing and
passes it around as necessary.
The test data was generated with a VM with following config:
<numa>
<cell id='0' cpus='0,2,4,6' memory='512000' unit='KiB'/>
<cell id='1' cpus='1,3,5,7' memory='512000' unit='KiB'/>
</numa>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1452053
Introduces support for virDomainSetMemory. This also serves an an
example for how to use the new method invocation API with a more
complicated method, this time including an EPR and embedded param.
This commit adds support for virDomainSendKey. It also serves as an
example of how to use the new method invocation APIs with a single
"simple" type parameter.
Update the generator to generate basic property type information for
each CIM object representation. Right now, it generates arrays of
hypervCimType structs:
struct _hypervCimType {
const char *name;
const char *type;
bool isArray;
};
This commit introduces functionality for creating and working with
invoke parameters. This commit does not include any code for serializing
and actually performing the method invocations; it merely defines the
functions and API for using invocation parameters in driver code.
HYPERV_DEFAULT_PARAM_COUNT was chosen because almost no method
invocations have more than 4 parameters.
Functions added:
* hypervInitInvokeParamsList
* hypervFreeInvokeParams
* hypervAddSimpleParam
* hypervAddEprParam
* hypervCreateEmbeddedParam
* hypervSetEmbeddedProperty
* hypervAddEmbeddedParam
* hypervFreeEmbeddedParam
The log category for virnetdaemon.c was mistakenly set
to rpc.netserver. Some useful info about the inhibitor
file descriptor was also never logged.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The DBus conditional was renamed way back:
commit da77f04ed5
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Thu Sep 20 15:05:39 2012 +0100
Convert HAVE_DBUS to WITH_DBUS
but the shutdown inhibit code was not updated. Thus libvirt
was never inhibiting shutdown by a logged in user when VMs
are running.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
It's obvious that unsigned long long is 64 bit and also our web page
generator would misplace the comment after the return value due to the
way it's parsing them.
vcpu 0 must be always enabled and non-hotpluggable, thus you can't
modify it using the vcpu hotplug APIs. Disallow it so that users can't
create invalid configurations.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1459785
The default conf files, for example libvirtd.conf, virtlockd.conf, and
virtlogd.conf, should be located under the directory "/etc/libvirt" when
root as root, rather than "/etc". When run as non-root, the configuration
files should be located under "$XDG_CONFIG_HOME/libvirt/", rather than
"XDG_CONFIG_HOME".
Signed-off-by: Lily Zhu <lizhu@redhat.com>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When one has a non-blocking stream and aborts or finishes it without
removing the callback, any event loop invocation will trigger that
callback, but it cannot be removed any more. We cannot remove the
callback automatically from virStream{Abort,Finish} functions due to
forward-compatibility. So let's at least document this behaviour,
because it is not easy to find out the reason for.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Add support for vgaconf driver configuration. In domain xml it looks like
this:
<video>
<driver vgaconf='io|on|off'>
<model .../>
</video>
It was added with bhyve gop video in mind to allow users control how the
video device is exposed to the guest, specifically, how VGA I/O is
handled.
One can refer to the bhyve manual page to get more detailed description
of the possible VGA configuration options:
https://www.freebsd.org/cgi/man.cgi?query=bhyve&manpath=FreeBSD+12-current
The relevant part could be found using the 'vgaconf' keyword.
Also, add some tests for this new feature.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Commit 54fa1b44af added virDomainDeviceInfo::loadparm
and updated virDomainDeviceInfoClear() accordingly, but
omitted the necessary virDomainDeviceInfoCopy() changes.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
While qemuProcessIncomingDefNew takes an fd argument and stores it in
qemuProcessIncomingDef structure, the caller is still responsible for
closing the file descriptor.
Introduced by commit v1.2.21-140-ge7c6f4575.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Since qemu commit 3ef6c40ad0b it can fail if trying to hotplug a
disk that is not qcow2 despite us saying it is. We need to error
out in that case.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
If a remote call fails during event registration (more than likely from
a network failure or remote libvirtd restart timed just right), then when
calling the virObjectEventStateDeregisterID we don't want to call the
registered @freecb function because that breaks our contract that we
would only call it after succesfully returning. If the @freecb routine
were called, it could result in a double free from properly coded
applications that free their opaque data on failure to register, as seen
in the following details:
Program terminated with signal 6, Aborted.
#0 0x00007fc45cba15d7 in raise
#1 0x00007fc45cba2cc8 in abort
#2 0x00007fc45cbe12f7 in __libc_message
#3 0x00007fc45cbe86d3 in _int_free
#4 0x00007fc45d8d292c in PyDict_Fini
#5 0x00007fc45d94f46a in Py_Finalize
#6 0x00007fc45d960735 in Py_Main
#7 0x00007fc45cb8daf5 in __libc_start_main
#8 0x0000000000400721 in _start
The double dereference of 'pyobj_cbData' is triggered in the following way:
(1) libvirt_virConnectDomainEventRegisterAny is invoked.
(2) the event is successfully added to the event callback list
(virDomainEventStateRegisterClient in
remoteConnectDomainEventRegisterAny returns 1 which means ok).
(3) when function remoteConnectDomainEventRegisterAny is hit,
network connection disconnected coincidently (or libvirtd is
restarted) in the context of function 'call' then the connection
is lost and the function 'call' failed, the branch
virObjectEventStateDeregisterID is therefore taken.
(4) 'pyobj_conn' is dereferenced the 1st time in
libvirt_virConnectDomainEventFreeFunc.
(5) 'pyobj_cbData' (refered to pyobj_conn) is dereferenced the
2nd time in libvirt_virConnectDomainEventRegisterAny.
(6) the double free error is triggered.
Resolve this by adding a @doFreeCb boolean in order to avoid calling the
freeCb in virObjectEventStateDeregisterID for any remote call failure in
a remoteConnect*EventRegister* API. For remoteConnect*EventDeregister* calls,
the passed value would be true indicating they should run the freecb if it
exists; whereas, it's false for the remote call failure path.
Patch based on the investigation and initial patch posted by
fangying <fangying1@huawei.com>.
The function to check if -chardev is supported by QEMU was written a
long time ago, where adding chardevs did not make sense on the fixed ARM
platforms. Since then, we now have a general purpose virt platform,
which should support plugging in any device over PCIe which is supported
in a similar fashion on x86.
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1371892
As it turns out the volume create, build, and refresh path was not peeking
at the meta data, so immediately after a create operation the value displayed
for capacity was still incorrect. However, if a pool refresh was done the
correct value was fetched as a result of a meta data peek.
The reason is it seems historically if the file type is RAW then peeking
at the file just took the physical value for the capacity. However, since
we know if it's an encrypted file, then peeking at the meta data will be
required in order to get a true capacity value.
So check for encryption in the source and if present, use the meta data
in order to fill in the capacity value and set the payload_offset.
Similarly to commit 5da28cc306 this check
actually does not make sense since duplicate WWNs are used e.g. when
multipathing disks.
This reverts commit 780fe4e4ba.
https://bugzilla.redhat.com/show_bug.cgi?id=1461270
When fetching stats for a vhost-user type of interface, we run
couple of ovs-vsctl commands and parse their output. However, not
all stats exist at all times, for instance "rx_dropped" or
"tx_errors" can be missing. Thing is, we ask for a bulk of
statistics and if one of them is missing an error is reported
instead of returning the rest. Since we ignore errors, we fail to
set statistics. Fix this by asking for each piece alone.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit 5c54d29aae forgot to do that when moving the only function
using it and it broke the build on some platforms.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Our commit e13e8808f9 was way too generic. Currently, virtlogd is
used only for chardevs type of file and nothing else. True, we
must not relabel the path in this case, but we have to in all
other cases. For instance, if you want to have a physical console
attached to your guest:
<console type='dev'>
<source path='/dev/ttyS0'/>
<target type='virtio' port='1'/>
</console>
Starting such domain fails because qemu doesn't have access to
/dev/ttyS0 because we haven't relabelled the path.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This commit fixes a locale problem with locales that use comma as a mantissa
separator. Example: 12.34 en_US = 12,34 pt_BR. Since strtod() is a non-safe
function, virStrToDouble() will have problems to parse double numbers from
kernel settings and other double numbers from static files (XMLs, JSONs, etc).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1457634
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1457481
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
The function virDoubleToStr() is defined in virutil.* and virStrToDouble() is
defined in virstring.*. Joining both functions into the same file makes more
sense.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Even though we got both the original CPU (used for starting a domain)
and the updated version (the CPU really provided by QEMU) during
incoming migration, restore, or snapshot revert, we still need to update
the CPU according to the data we got from the freshly started QEMU.
Otherwise we don't know whether the CPU we got from QEMU matches the one
before migration. We just need to keep the original CPU in
priv->origCPU.
Messed up by me in v3.4.0-58-g8e34f4781.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This function is called unconditionally from qemuProcessStop to
make sure we leave no dangling dirs behind. However, whenever the
directory we want to rmdir() is not there (e.g. because it hasn't
been created in the first place because domain doesn't use
hugepages at all), we produce a warning like this:
2017-06-20 15:58:23.615+0000: 32638: warning :
qemuProcessBuildDestroyHugepagesPath:3363 : Unable to remove
hugepage path: /dev/hugepages/libvirt/qemu/1-instance-00000001
(errno=2)
Fix this by not producing the warning on ENOENT.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Similarly to how we specify the groups of 5 capabilities in the header
file move the labels to separate line also for the VIR_ENUM_IMPL part.
This simplifies rebase conflict resolution in the capability file since
only lines have to be shuffled around, but they don't need to be edited.
Commit 7456c4f5f introduced a regression by not reloading the backing
chain of a disk after snapshot. The regression was caused as
src->relPath was not set and thus the block commit code could not
determine the relative path.
This patch adds code that will load the backing store string if
VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT and store it in the correct place
when a snapshot is successfully completed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1461303
It is necessary for some parts of the code to refresh just data
based on the based on the backing store string. Add a convenience
function that will retrieve this data.
Changing labelling of the images does not need to happen after setting
the labeling and lock manager access. This saves the cleanup of the
labeling if the relative path can't be determined.
Check for the LOADPARM capabilility and potentially add a loadparm=x to
the "-machine" string for the QEMU command line.
Also add xml2argv test cases for loadparm.
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Add new capability for the "-machine loadparm" QEMU option.
Add the capabilities replies/xml for s390x for QEMU 2.9.50.
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Update the per device boot schema to add an optional loadparm parameter.
eg: <boot order='1' loadparm='2'/>
Extend the virDomainDeviceInfo to support loadparm option.
Modify the appropriate functions to parse loadparm from boot device xml.
Add the xml2xml test to validate the field.
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Starting from qemu 2.9, more granular options are supported. Add parser
for the relevant bits.
With this patch libvirt is able to parse the host and target IQN of from
the JSON pseudo-protocol specification.
This corresponds to BlockdevOptionsIscsi in qemu qapi.
'SocketAddress' structure was changed to contain 'inet' instead of
'tcp' since qemu commit c5f1ae3ae7b. Existing entries have a backward
compatibility layer.
Libvirt will parse 'inet' and 'tcp' as equivalents.
It was added in commit 6c2e4c3856
so that Coverity would not complain about passing -1 to
qemuDomainDetachThisHostDevice(), but the function in question
has changed since and so the annotation doesn't apply anymore.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The same json strucutre is used for NBD and sheepdog volumes for
specifying of the host. Rename the function and fix up error messages to
be more universal.
When added in multiple previous commits, it was used only with -device
qxl(-vga), but for some QEMUs (< 1.6) we need to add this
functionality when using -vga qxl as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1283207
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In the case that virtlogd is used as stdio handler we pass to QEMU
only FD to a PIPE connected to virtlogd instead of the file itself.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1430988
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Improve the code to decide whether to use virtlogd or not by checking
the same variable that is updated in qemuProcessPrepareDomain().
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
In QEMU driver we can use virtlogd as stdio handler for source backend
of char devices if current QEMU is new enough and it's enabled in
qemu.conf. We should store this information while starting a guest
because the config option may change while the guest is running.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1431112
Imagine a FS mounted on /dev/blah/blah2. Our process of creating
suffix for temporary location where all the mounted filesystems
are moved is very simplistic. We want:
/var/run/libvirt/qemu/$domName.$suffix\
were $suffix is just the mount point path stripped of the "/dev/"
prefix. For instance:
/var/run/libvirt/qemu/fedora.mqueue for /dev/mqueue
/var/run/libvirt/qemu/fedora.pts for /dev/pts
and so on. Now if we plug /dev/blah/blah2 into the example we see
some misbehaviour:
/var/run/libvirt/qemu/fedora.blah/blah2
Well, misbehaviour if /dev/blah/blah2 is a file, because in that
case we call virFileTouch() instead of virFileMakePath().
The solution is to replace all the slashes in the suffix with say
dots. That way we don't have to care about nested directories.
IOW, the result we want for given example is:
/var/run/libvirt/qemu/fedora.blah.blah2
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1431112
There can be nested mount points. For instance /dev/shm/blah can
be a mount point and /dev/shm too. It doesn't make much sense to
return the former path because callers preserve the latter (and
with that the former too). Therefore prune nested mount points.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1431112
After 290a00e41d we know how to deal with file mount points.
However, when cleaning up the temporary location for preserved
mount points we are still calling rmdir(). This won't fly for
files. We need to call unlink(). Now, since we don't really care
if the cleanup succeeded or not (it's the best effort anyway), we
can call both rmdir() and unlink() without need for
differentiation between files and directories.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
On some platforms the number of bits in the cbm_mask might not be
divisible by 4 (and not even by 2), so we need to properly count the
bits. Similar file, min_cbm_bits, is properly parsed and used, but if
the number is greater than one, we lose the information about
granularity when reporting the data in capabilities. For that matter
always report granularity, but if it is not the same as the minimum,
add that information in there as well.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The split firmware and variables files introduced by
https://bugs.debian.org/764918 are in a different directory for
some reason. Let the virtual machine read both.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Change the settings from qemuDomainUpdateDeviceLive() as otherwise the
call would succeed even though nothing has changed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1414627
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Most places which want to check ABI stability for an active domain need
to call this API rather than the original
qemuDomainDefCheckABIStability. The only exception is in snapshots where
we need to decide what to do depending on the saved image data.
https://bugzilla.redhat.com/show_bug.cgi?id=1460952
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When making ABI stability checks for an active domain, we need to make
sure we use the same migratable definition which virDomainGetXMLDesc
with the MIGRATABLE flag provides, otherwise the ABI check will fail.
This is implemented in the new qemuDomainCheckABIStability which takes a
domain object and generates the right migratable definition from it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This patch separates the actual ABI checks from getting migratable defs
in qemuDomainDefCheckABIStability so that we can create another wrapper
which will use different methods to get the migratable defs.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The main goal of this function is to enable reusing the parsing code
from qemuDomainDefCopy.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1214369
My fix 671d18594f was incomplete. If domain doesn't have
hugepages enabled, because of missing condition we would still be
putting hugepages path onto qemu cmd line. Clean up the
conditions so that it's more visible next time.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
With glibc >= 2.25.90 writev() is only available if you explicitly
include sys/uio.h. This matches the documented requirements, but
older glibc and other *NIX pulled in writev indirectly so the bug
wasn't noticed previously.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
With the current logic, we only free @tlsalias as part of the error
label and would have to free it explicitly earlier in the code. Convert
the error label to cleanup, so that we have only one sink, where we
handle all frees. Since JSON object append operation consumes pointers,
make sure @backend is cleared before we hit the cleanup label.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1214369
Consider the following XML:
<memoryBacking>
<hugepages>
<page size='2048' unit='KiB' nodeset='1'/>
</hugepages>
<source type='file'/>
<access mode='shared'/>
</memoryBacking>
<numa>
<cell id='0' cpus='0-3' memory='512000' unit='KiB'/>
<cell id='1' cpus='4-7' memory='512000' unit='KiB'/>
</numa>
The following cmd line is generated:
-object
memory-backend-file,id=ram-node0,mem-path=/var/lib/libvirt/qemu/ram,
share=yes,size=524288000 -numa node,nodeid=0,cpus=0-3,memdev=ram-node0
-object
memory-backend-file,id=ram-node1,mem-path=/var/lib/libvirt/qemu/ram,
share=yes,size=524288000 -numa node,nodeid=1,cpus=4-7,memdev=ram-node1
This is obviously wrong as for node 1 hugepages should have been
used. The hugepages configuration is more specific than <source
type='file'/>.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1455819
It may happen that a domain is started without any huge pages.
However, user might try to attach a DIMM module later. DIMM
backed by huge pages (why would somebody want to mix regular and
huge pages is beyond me). Therefore we have to create the dir if
we haven't done so far.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1455819
Currently, the per-domain path for huge pages mmap() for qemu is
created iff domain has memoryBacking and hugepages in it
configured. However, this alone is not enough because there can
be a DIMM module with hugepages configured too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Commit v3.4.0-44-gac793bd71 fixed a memory leak, but failed to return
the special -3 value. Thus an attempt to start a domain with corrupted
managed save file would removed the corrupted file and report
"An error occurred, but the cause is unknown" instead of starting the
domain from scratch.
https://bugzilla.redhat.com/show_bug.cgi?id=1460962
Use ATTRIBUTE_FALLTHROUGH, introduced by commit
5d84f5961b, instead of comments to
indicate that the fall through is an intentional behavior.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Add a comment for mon->watch to make clear what's the purpose of this
value.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
The virDomainUSBAddressEnsure returns 0 or -1, so commit id 'de325472'
checking for 1 like qemuDomainAttachChrDeviceAssignAddr was wrong.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Commit 824272cb28 attempted to fix escaping of characters in unix
socket path but it was wrong. We need to escape only ',', there is
no escape character for '='.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1447618
Currently, any attempt to change MTU on an interface that is
plugged to a running domain is silently ignored. We should either
do what's asked or error out. Well, we can update the host side
of the interface, but we cannot change 'host_mtu' attribute for
the virtio-net device. Therefore we have to error out.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
https://bugzilla.redhat.com/show_bug.cgi?id=1408701
While implementing MTU (572eda12ad and friends), I've forgotten
to actually set MTU on the host NIC in case of hotplug. We
correctly tell qemu on the monitor what the MTU should be, but we
are not actually setting it on the host NIC.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
https://bugzilla.redhat.com/show_bug.cgi?id=1459091
We try to get the last element of the passed path by calling
strrch(path, '/'). However, the pointer that strrchr() returns
points at the slash, We want string that starts right after that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1459091
Currently, we are querying for vhostuser interface name in post
parse callback. At that time interface might not yet exist.
However, it has to exist when starting domain. Therefore it makes
more sense to query its name at that point. This partially
reverts 57b5e27.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While reworking client side of streams, I had to postpone payload
decoding so that stream holes and stream data can be
distinguished in virNetClientStreamRecvPacket. That's merely what
18944b7aea does. However, I accidentally removed one important
bit: when server sends us an empty STREAM packet (with no
payload) - meaning end of stream - st->incomingEOF flag needs to
be set. It used to be before I touched the code. After I removed
it, virNetClientStreamRecvPacket will try to fetch more data from
the stream, but it will never come.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
In 9cb891141c we've introduced some logic to clearing suggested
macvtap/macvlan ifnames. The logic consists of comparing ifname
string with strings that libvirt would generate. However, due to
a typo only VIR_NET_GENERATED_MACVTAP_PREFIX was compared. Twice.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When adding the aliased serial stub console, the structure wasn't
properly allocated (VIR_ALLOC instead of virDomainChrDefNew) which then
resulted in SIGSEGV in virDomainChrSourceIsEqual during a serial device
coldplug.
https://bugzilla.redhat.com/show_bug.cgi?id=1434278
Signed-off-by: Erik Skultety <eskultet@redhat.com>
If QEMU is new enough and we have the live updated CPU definition in
either save or migration cookie, we can use it to enforce ABI. The
original guest CPU from domain XML will be stored in private data.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Since the domain XML saved in a snapshot or saved image uses the
original guest CPU definition but we still want to enforce ABI when
restoring the domain if libvirt and QEMU are new enough, we save the
live updated CPU definition in a save cookie.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Since the domain XML send during migration uses the original guest CPU
definition but we still want the destination to enforce ABI if it is new
enough, we send the live updated CPU definition in a migration cookie.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When persistent migration of a transient domain is requested but no
custom XML is passed to the migration API we would just let the
destination daemon make a persistent definition from the live definition
itself. This is not a problem now, but once the destination daemon
starts replacing the original CPU definition with the one from migration
cookie before starting a domain, it would need to add more ugly hacks to
reverse the operation. Let's just always send the persistent definition
in the cookie to make things a bit cleaner.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The destination host may not be able to start a domain using the live
updated CPU definition because either libvirt or QEMU may not be new
enough. Thus we need to send the original guest CPU definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When starting a domain we update the guest CPU definition to match what
QEMU actually provided (since it is allowed to add or removed some
features unless check='full' is specified). Let's store the original CPU
in domain private data so that we can use it to provide a backward
compatible domain XML.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The following patches will add an actual content in the cookie and use
the data when restoring a domain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This patch implements a new save cookie object and callbacks for qemu
driver. The actual useful content will be added in the object later.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
virDomainXMLOption gains driver specific callbacks for parsing and
formatting save cookies.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The code will be used by snapshots and domain save/restore code to store
additional data for a saved running domain. It is analogous to migration
cookies, but simple and one way only.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The new structure encapsulates save image header and associated data
(domain XML).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The function is now called virQEMUSaveDataWrite and it is now doing
everything it needs to save both the save image header and domain XML to
a file. Be it a new file or an existing file in which a user wants to
change the domain XML.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The function is supposed to update the save image header after a
successful migration to the save image file.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a preparation for creating a new virQEMUSaveData structure which
will encapsulate all save image header data.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Since virQEMUSaveHeader will be followed by more than just domain XML,
the old name would be confusing as it was designed to describe the
length of all data following the save image header.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This will be used later when a save cookie will become part of the
snapshot XML using new driver specific parser/formatter functions.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The function will be used in paths where mismatching CPU defs are not an
error.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Allow starting the block-copy job for a persistent domain if a user
declares by using a flag that the job will not be recovered if the VM is
switched off while the job is active.
This allows to use the block-copy job with persistent VMs under the same
conditions as would apply to transient domains.
Without this patch libvirt would just report the operation of a
completed job as "unknown" instead of "incoming migration".
https://bugzilla.redhat.com/show_bug.cgi?id=1457052
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Now that we have a bit more control, let's convert our object into
a lockable object and let that magic handle the create and lock/unlock.
This commit also introduces virInterfaceObjEndAPI in order to handle the
lock unlock and object unref in one call for consumers returning a NULL
obj upon return. This removes the need for virInterfaceObj{Lock|Unlock}
external API's.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the consumption of @def in virInterfaceObjNew and then handle that
in the error path of virInterfaceObjListAssignDef since it's caller expects
to need to free @def when NULL is returned and so would virInterfaceObjFree.
Signed-off-by: John Ferlan <jferlan@redhat.com>
vCPU ordering information would not be updated if a vCPU emerged or
disappeared during the time libvirtd is not running. This allowed to
create invalid configuration like:
[...]
<vcpu id='56' enabled='yes' hotpluggable='yes' order='57'/>
<vcpu id='57' enabled='yes' hotpluggable='yes' order='58'/>
<vcpu id='58' enabled='yes' hotpluggable='yes'/>
Call the function that records the information on reconnect.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1451251
There's a problem with current streams after I switched them from
iohelper to thread implementation. Previously, iohelper made sure
not to exceed specified @length resulting in the pipe EOF
appearing at the exact right moment (the pipe was used to tunnel
the data from the iohelper to the daemon). Anyway, when switching
to thread I had to write the I/O code from scratch. Whilst doing
that I took an inspiration from the iohelper code, but since the
usage of pipe switched to slightly different meaning, there was
no 1:1 relationship between the codes.
Moreover, after introducing VIR_FDSTREAM_MSG_TYPE_HOLE, the
condition that should made sure we won't exceed @length was
completely wrong.
The fix is to:
a) account for holes for @length
b) cap not just data sections but holes too (if @length would be
exceeded)
For this purpose, the condition needs to be brought closer to the
code that handles holes and data sections.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Make the decision based on the usage of childBuf buffer.
This fixes the oddity in the test case introduced by commit c1c4d0d
where we would format an empty pair tag.
We need to decide whether to format <controller> as a single tag
or if it has any subelements.
Rewrite the function to use a separate buffer for subelements,
to make adding new options easier.
After some discussion on and off the linux-audit mailing list, we
should use different fields for the audit messages.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1218603
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
They do the same thing with only one difference. Let's put them
together (like we already do with virFDStreamCloseInt) so that future
changes don't miss one of the implementations. Also to clean up the
code.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1450349
Problem is, qemu fails to load guest memory image if these
attribute change on migration/restore from an image.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While checking for ABI stability, drivers might pose additional
checks that are not valid for general case. For instance, qemu
driver might check some memory backing attributes because of how
qemu works. But those attributes may work well in other drivers.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
It was only ever used in node_device_hal.c which really never used it
anyway since the NODE_DEV_UDI was never referenced. Remove free_udi()
and @privData as well as the references to obj->privateData & obj->privateFree.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation for privatizing the virNodeDeviceObj - create an accessor
for the @def field and then use it for various callers.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create nodeDeviceObjFindByName which will perform the corresponding
virNodeDeviceObjFindByName call for various node_device_driver callers
rather than having the same repetitive code.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than taking an virNodeDeviceObjPtr and dereffing the obj->def,
just pass the def.
Also check for an error in the function to have the calling function goto
cleanup on error.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Alter the node_device_driver source and prototypes to follow more
recent code style guidelines w/r/t spacing between functions, format
of the function, and the prototype definitions.
While the new names for nodeDeviceUpdateCaps, nodeDeviceUpdateDriverName,
and nodeDeviceGetTime don't follow exactly w/r/t a "vir" prefix, they
do follow other driver nomenclature style.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In order to ensure that whenever something is added to virNodeDevCapType
that both functions are considered for processing of a new capability,
change the if-then-else construct into a switch statement.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When searching for an NPIV capable fc_host, not only does there need to
be an "fc_host" capability with the specified wwnn/wwpn or fabric_wwn,
but that scsi_host must be vport capable; otherwise, one could end up
picking an exising vHBA/NPIV which wouldn't be good.
Currently not a problem since scsi_hosts are in an as found forward linked
list and the vport capable scsi_hosts will always appear before a vHBA by
definition. However, in the near term future a hash table will be used to
lookup the devices and that could cause problems for these algorithms.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Alter the algorithm to return a list of matching names rather than a
list of match virInterfaceObjPtr which are then just dereferenced
extracting the def->name and def->mac. Since the def->mac would be
the same as the passed @mac, just return a list of names and as long
as there's only one, extract the [0] entry from the passed list.
Also alter the error message on failure to include the mac that wasn't
found.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the structs into virinterfaceobj.c, create necessary accessors, and
initializers.
This also includes reworking virInterfaceObjListClone to handle receiving
a source interfaces list pointer, creating the destination interfaces object,
and copying everything from source into dest.
Signed-off-by: John Ferlan <jferlan@redhat.com>
We're about to make the obj much more private, so make it easier to
see future changes which will require accessors for the obj->def
This also includes modifying some interfaces->objs[i]->X references to be
obj = interfaces->objs[i]; and then def = obj->def
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than using goto cleanup on object find failure and having cleanup
need to check if the obj was present before unlocking, just return immediately.
Signed-off-by: John Ferlan <jferlan@redhat.com>
qemuDomainGetBlockInfo would error out if qemu did not report
'wr_highest_offset'. This usually does not happen, but can happen
briefly during active layer block commit. There's no need to report the
error, we can simply report that the disk is fully alocated at that
point.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1452045
The host address or the socket path have already been checked at the
begining of the function virStorageSourceParseNBDColonString(). So,
when the parameter is not a unix socket, there is no reason to check
the address again because if it does not exists, the logic will fail
in the first IF conditional.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
VIR_STRDUP returns -1 if the string copy was not successful. So, the
current comparison/logic is throwing an error when VIR_STRDUP() returns
1. Only when source is NULL, it is considering as a success which is
not right.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
In 48d9e6cdcc and friends we've allowed users to back guest
memory by a file inside the host. And in order to keep things
manageable the memory_backing_dir variable was introduced to
qemu.conf to specify the directory where the files are kept.
However, libvirt's policy is that directories are created on
domain startup if they don't exist. We've missed this one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
commit a8eba5036 added further checking of the guest shutdown cause, but
this enhancement is available since qemu 2.10, causing a crash because
of a NULL pointer dereference on older qemus.
Thread 1 "libvirtd" received signal SIGSEGV, Segmentation fault.
0x00007ffff72441af in virJSONValueObjectGet (object=0x0,
key=0x7fffd5ef11bf "guest")
at util/virjson.c:769
769 if (object->type != VIR_JSON_TYPE_OBJECT)
(gdb) bt
0 in virJSONValueObjectGet
1 in virJSONValueObjectGetBoolean
2 in qemuMonitorJSONHandleShutdown
3 in qemuMonitorJSONIOProcessEvent
4 in qemuMonitorJSONIOProcessLine
5 in qemuMonitorJSONIOProcess
6 in qemuMonitorIOProcess
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Callers expect the return value to be the total number of vcpus in the
host (including offline vcpus). The refactor in c67e04e25f
broke this assumption by using virHostCPUGetOnlineBitmap which only
creates a bitmap long enough to hold the last online vcpu.
Report the full number of host vcpus by returning value from
virHostCPUGetCount().
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
When a number of SRIOV VFs (up to 128 on Intel XL710) is created:
for i in `seq 0 1`; do
echo 63 > /sys/class/net/<interface>/device/sriov_numvfs
done
libvirtd will then report "udev_monitor_receive_device returned NULL"
error because the netlink socket buffer is not big enough (using GDB on
libudev confirmed this with ENOBUFFS) and thus some udev events were
dropped. This results in some devices being missing in the nodedev-list
output. This patch overrides the system's rmem_max limit but for that,
we need to make sure we've got root privileges.
https://bugzilla.redhat.com/show_bug.cgi?id=1450960
Signed-off-by: ning.bo <ning.bo9@zte.com.cn>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The @cpus is allocated by virFileReadValueBitmap() but never
freed:
==21274== 40 (32 direct, 8 indirect) bytes in 1 blocks are definitely lost in loss record 808 of 1,004
==21274== at 0x4C2E080: calloc (vg_replace_malloc.c:711)
==21274== by 0x54BA561: virAlloc (viralloc.c:144)
==21274== by 0x54BC604: virBitmapNewEmpty (virbitmap.c:126)
==21274== by 0x54BD059: virBitmapParseUnlimited (virbitmap.c:570)
==21274== by 0x54EECE9: virFileReadValueBitmap (virfile.c:4113)
==21274== by 0x5563132: virCapabilitiesInitCaches (capabilities.c:1548)
==21274== by 0x2BB86E59: virQEMUCapsInit (qemu_capabilities.c:1132)
==21274== by 0x2BBEC067: virQEMUDriverCreateCapabilities (qemu_conf.c:928)
==21274== by 0x2BC3DEAA: qemuStateInitialize (qemu_driver.c:845)
==21274== by 0x5625AAC: virStateInitialize (libvirt.c:770)
==21274== by 0x124519: daemonRunStateInit (libvirtd.c:881)
==21274== by 0x554C927: virThreadHelper (virthread.c:206)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Similar to scsi_host and fc_host, there is a relation between a
scsi_target and its transport specific fc_remote_port. Let's expose this
relation and relevant information behind it.
An example for a virsh nodedev-dumpxml:
virsh # nodedev-dumpxml scsi_target0_0_0
<device>
<name>scsi_target0_0_0</name>
<path>/sys/devices/[...]/host0/rport-0:0-0/target0:0:0</path>
<parent>scsi_host0</parent>
<capability type='scsi_target'>
<target>target0:0:0</target>
<capability type='fc_remote_port'>
<rport>rport-0:0-0</rport>
<wwpn>0x9d73bc45f0e21a86</wwpn>
</capability>
</capability>
</device>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
We will need some convenient helper functions for managing sysfs-entries
for fibre channel-backed devices. Let's implement them and make them
available in the private API.
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Now that the node_device driver is aware of CCW devices, let's hook up
virsh so that we can filter them properly.
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Make CCW devices available to the node_device driver. The devices are
already seen by udev so let's implement necessary code for detecting
them properly.
Topologically, CCW devices are similar to PCI devices, e.g.:
+- ccw_0_0_1a2b
|
+- scsi_host0
|
+- scsi_target0_0_0
|
+- scsi_0_0_0_0
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Unlock @obj in case of an error too.
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Since the switch statement is already using the deref'd @cap variable
and the VIR_NODE_DEV_CAP_NET case uses it, the SCSI_HOST and PCI_DEV
cases may as well use it too.
Suggested-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
QEMU will likely report the details of it shutting down, particularly
whether the shutdown was initiated by the guest or host. We should
forward that information along, at least for shutdown events. Reset
has that as well, however that is not a lifecycle event and would add
extra constants that might not be used. It can be added later on.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1384007
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The number of records that virConnectGetAllDomainStats can return per
domain is currently limited to 4096. This is quite low -- for
example, a single guest with ~320 disks will hit this limit. This
increases the limit to make it much larger. Note that
VIR_NET_MESSAGE_MAX still protects the total message size in the case
where there are many domains and many disks per domain.
I tested this using a guest with 500 disks with no issues.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1440683
When increasing the buffer size up to VIR_NET_MESSAGE_MAX, we
currently quadruple it each time. This unfortunately means that we
cannot allow certain buffer sizes -- for example the current
VIR_NET_MESSAGE_MAX == 33554432 can never be "hit" since ‘newlen’
jumps from 16MB to 64MB.
Instead of quadrupling, double it each time.
Thanks: Daniel Berrange.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Commit 4337bc57be introduced code that would in certain error paths
unref the last reference of a pointer, but return it.
Clear the pointers before returning them.
@tmp is leaked after the second call to virVHBAGetConfig within
virVHBAIsVportCapable code block because it wasn't freed after making the
first call to the function.
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The @ipv6_host allocated in virAsprintf may be lost when virAsprintf
addrstr failed.
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
There is a VIR_FREE after a return statement. That code section is never
executed and for this reason the "tty" variable is not being freed. This
commit rearranges the logic.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Many vendor id's and product id's have leading zeros. We should show
them in the logs.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Reviewed-by: Laine Stump <laine@laine.org>
The @meminfo allocated in qemuMonitorGetMemoryDeviceInfo() may be
lost when qemuDomainObjExitMonitor() failed.
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1420740
Testing found an inventive way to cause an error at shutdown by providing the
parent name for the fc host creation using the "same name" as the HBA. Since
the code thus assumed the parent host name provided was the parent HBA and
just extracted out the host number and sent that along to the vport_destroy
this avoided checks made for equality.
So just add the equality check to that path to resolve.
While most of the APIs are okay with 16M messages, the bulk stats API
can run into the limit in big configurations. Before we devise a new
plan for this, bump this limit slightly to accomodate some more configs.
If the first console is just a copy of the first serial device we
don't need to iterate over the same device twice in order to perform
actions like security labeling, cgroup configuring, etc.
Currently only security SELinux manager was aware of this fact.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Put domain access after acquiring job condition, otherwise
another job can change it meanwhile.
Signed-off-by: Konstantin Neumoin <kneumoin@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We have to use waitDomainJob instead of waitJob, because of it
unlock the domain until job has finished, so domain will be available
for other clients.
Signed-off-by: Konstantin Neumoin <kneumoin@virtuozzo.com>
Setting the 'group_name' for a disk would falsely trigger a error path
as in commit 4b57f76502 we did not properly check the return value of
VIR_STRDUP.
Attempting to start a domain with USB hostdevs but no USB controllers
fails with the rather cryptic error
libxl: error: libxl_qmp.c:287:qmp_handle_error_response: received an
error message from QMP server: Bus 'xenusb-0.0' not found
This can be fixed by creating default USB controllers. When no USB
controllers are defined, create the number of 8 port controllers
necessary to accommodate the number of defined USB devices.
Note that USB controllers are already created as needed in the
domainAttachDevice code path. E.g. a USB controller will be created,
if necessary, when attaching a USB device with
'virsh attach-device dom usbdev.xml'.
This reverts commit 2841e675.
It turns out that adding the host_mtu field to the PCI capabilities in
the guest bumps the length of PCI capabilities beyond the 32 byte
boundary, so the virtio-net device gets 64 bytes of ioport space
instead of 32, which offsets the address of all the other following
devices. Migration doesn't work very well when the location and length
of PCI capabilities of devices is changed between source and
destination.
This means that we need to make sure that the absence/presence of
host_mtu on the qemu commandline always matches between source and
destination, which means that we need to make setting of host_mtu an
opt-in thing (it can't happen automatically when the bridge being used
has a non-default MTU, which is what commit 2841e675 implemented).
I do want to re-implement this feature with an <mtu auto='on'/>
setting, but probably won't backport that to any stable branches, so
I'm first reverting the original commit, and that revert can be pushed
to the few releases that have been made since the original (3.1.0 -
3.3.0)
Resolves: https://bugzilla.redhat.com/1449346
If a VNC listen address is not specified in domXML, libxl
will default to 127.0.0.1, but this is never reflected in the domXML.
In the case of spice, a missing listen address resulted in listening
on all interfaces, i.e. '0.0.0.0'. If not specified, set the listen
address in virDomainGraphicsDef struct to the libxl default when
creating the frame buffer device. Additionally, set default spice
listen address to 127.0.0.1.
There's a slight problem with the current function. Assume we are
currently in a data section and we have say 42 bytes until next
section. Therefore, just before (handler) is called to fill up
the buffer with data, @want is changed to 42 to match the amount
of data left in the current section. However, after hole is
processed, we are back in data section but with incredibly small
@want size. Nobody will ever reset it back. This results in
incredible data fragmentation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
To every virDirOpen we must have VIR_DIR_CLOSE otherwise FD is
leaked.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The error message would contain first vcpu id after the list of vcpus
selected for modification. To print the proper vcpu id remember the
first vcpu selected to be modified.
@reply is a DBusMessage object returned by virDBusCallMethod in
get machine object call path, dereference it before calling
virDBusCallMethod again to get machine name.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
It is possible to crash libvirtd when converting xl native config to
domXML when the xl config contains an empty disk source, e.g. an empty
CDROM. Fix by checking that the disk source is non-NULL before parsing it.
Signed-off-by: Wim ten Have <wim.ten.have@oracle.com>
Some older systems (such as RHEL6) lack SEEK_HOLE and SEEK_DATA
which virFileInData relies on. Provide a stub for these systems.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The code causes the 'offset' variable to be overwritten (possibly with
NULL if neither of the vCPUs is halted) which causes a crash since the
variable is still used after that part.
Additionally there's a bug, since strstr() would look up the '(halted)'
string in the whole string rather than just the currently processed line
the returned data is completely bogus.
Rather than switching to single line parsing let's remove the code
altogether since it has a commonly used JSON monitor alternative and
the data itself is not very useful to report.
The code was introduced in commit cc5e695bde
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1452106
There is a wrong 'return' statement after a 'goto' statement inside the
function virConnectCloseCallbackDataRegister(). This commit only removes
the 'return'.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Currently, we don't assign any meaning to that. Our current view
on virStream is that it's merely a pipe. And pipes don't support
seeking.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit e3ba4025 introduced srv->handles and VIR_RESIZE_N to allocate
@handles as necessary, but did not free the handles during when calling
virNetlinkEventServiceStop.
Commit 15a71e60 introduced the virNetlinkEventServiceStopAll function, and
the code in virNetlinkEventServiceStop is copied to this function, so just
call virNetlinkEventServiceStop instead.
Start discovering the mediated devices on the host system and format the
attributes for the mediated device into the XML. Compared to the parent
device which reports generic information about the abstract mediated
devices types, a child device only reports the type name it has been
instantiated from and the IOMMU group number, since that's device
specific compared to the rest of the info that can be gathered about
mediated devices at the moment.
This patch introduces both the formatting and parsing routines, updates
nodedev.rng schema, adding a testcase as well.
The resulting mdev child device XML:
<device>
<name>mdev_4b20d080_1b54_4048_85b3_a6a62d165c01</name>
<path>/sys/devices/.../4b20d080-1b54-4048-85b3-a6a62d165c01</path>
<parent>pci_0000_06_00_0</parent>
<driver>
<name>vfio_mdev</name>
</driver>
<capability type='mdev'>
<type id='vendor_supplied_type_id'/>
<iommuGroup number='NUM'/>
<capability/>
<device/>
https://bugzilla.redhat.com/show_bug.cgi?id=1452072
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The parent device needs to report the generic stuff about the supported
mediated devices types, like device API, available instances, type name,
etc. Therefore this patch introduces a new nested capability element of
type 'mdev_types' with the resulting XML of the following format:
<device>
...
<capability type='pci'>
...
<capability type='mdev_types'>
<type id='vendor_supplied_id'>
<name>optional_vendor_supplied_codename</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>NUM</availableInstances>
</type>
...
<type>
...
</type>
</capability>
</capability>
...
</device>
https://bugzilla.redhat.com/show_bug.cgi?id=1452072
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The reason for introducing two capabilities, one for the device itself
(cap 'mdev') and one for the parent device listing the available types
('mdev_types'), is that we should be able to do
'virsh nodedev-list --cap' not only for existing mdev devices but also
for devices that support creation of mdev devices, since one day libvirt
might be actually able to create the mdev devices in an automated way
(just like we do for NPIV/vHBA).
https://bugzilla.redhat.com/show_bug.cgi?id=1452072
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since there's at least SRIOV and MDEV sub-capabilities to be parsed,
let's make the code more readable by splitting it to several logical
blocks.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Namely, this patch is about virMediatedDeviceGetIOMMUGroup{Dev,Num}
functions. There's no compelling reason why these functions should take
an object, on the contrary, having to create an object every time one
needs to query the IOMMU group number, discarding the object afterwards,
seems odd.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Commit 8e09663 "pci: recognize/report GEN4 (PCIe 4.0) card 16GT/s Link
speed" introduced another speed into enum, but mistakenly also altered
field width, so one bit of link width was included there.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
These flags to APIs will tell if caller wants to use sparse
stream for storage transfer. At the same time, it's safe to
enable them in storage driver frontend and rely on our backends
checking the flags. This way we can enable specific flags only on
some specific backends, e.g. enable
VIR_STORAGE_VOL_DOWNLOAD_SPARSE_STREAM for filesystem backend but
not iSCSI backend.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now, not all APIs are going to support sparse streams. To some it
makes no sense at all, e.g. virDomainOpenConsole() or
virDomainOpenChannel(). To others, we will need a special flag to
indicate that client wants to enable sparse streams. Instead of
having to write RPC dispatchers by hand we can just annotate in
our .x files that a certain flag to certain RPC call enables this
feature. For instance:
/**
* @generate: both
* @readstream: 1
* @sparseflag: VIR_SPARSE_STREAM
* @acl: storage_vol:data_read
*/
REMOTE_PROC_DOMAIN_SOME_API = XXX,
Therefore, whenever client calls virDomainSomeAPI(..,
VIR_SPARSE_STREAM); daemon will mark that down and send stream
skips when possible.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Basically, what is needed here is to introduce new message type
for the messages passed between the event loop callbacks and the
worker thread that does all the I/O. The idea is that instead of
a queue of read buffers we will have a queue where "hole of size
X" messages appear. That way the event loop callbacks can just
check the head of the queue and see if the worker thread is in
data or a hole section and how long the section is.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Whenever server sends a client stream packet (either regular with
actual data or stream skip one) it is queued on @st->rx. So the
list is a mixture of both types of stream packets. So now that we
have all the helpers needed we can wire their processing up. But
since virNetClientStreamRecvPacket doesn't support
VIR_STREAM_RECV_STOP_AT_HOLE flag yet, let's turn all received
skips into zeroes repeating requested times.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have RPC wrappers over VIR_NET_STREAM_HOLE we can
start wiring them up. This commit wires up situation when a
client wants to send a hole to daemon.
To keep stream offsets synchronous, upon successful call on the
daemon skip the same hole in local part of the stream.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is a function that handles an incoming STREAM_HOLE packet.
Even though it is not wired up yet, it will be soon. At the
beginning do couple of checks whether server plays nicely and
sent us a STREAM_HOLE packed only after we've enabled sparse
streams. Then decodes the message payload to see how big the hole
is and stores it in passed @length argument.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While the previous commit implemented a helper for sending a
STREAM_HOLE packet for daemon, this is a client's counterpart.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is just a helper function that takes in a length value,
encodes it into XDR and sends to client.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is a special type of stream packet, that is bidirectional
and contains information regarding how many bytes each side will
be skipping in the stream.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add a new argument to daemonCreateClientStream in order to allow for
future expansion to mark that a specific stream can be used to skip
data, such as the case with sparsely populated files. The new flag will
be the eventual decision point between client/server to decide whether
both ends can support and want to use sparse streams.
A new bool 'allowSkip' is added to both _virNetClientStream and
daemonClientStream in order to perform the tracking.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add a virStreamPtr pointer to the _virNetClientStream
in order to reverse track the parent stream.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is just an internal API, that calls corresponding function
in stream driver. This function will set @data = 1 if the
underlying file is in data section, or @data = 0 if it is in a
hole. At any rate, @length is set to number of bytes remaining in
the section the file currently is.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is just a wrapper over new function that have been just
introduced: virStreamSendHole() . It's very similar to
virStreamSendAll() except it handles sparse streams well.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is just a wrapper over new functions that have been just
introduced: virStreamRecvFlags(), virStreamRecvHole(). It's very
similar to virStreamRecvAll() except it handles sparse streams
well.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add a new flag to virStreamRecvFlags in order to handle being able to
stop reading from the stream so that the consumer can generate a "hole"
in stream target. Generation of a hole replaces the need to receive and
handle a sequence of zero bytes for sparse stream targets.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This function is basically a counterpart for virStreamSendHole().
If one side of a stream called virStreamSendHole() the other
should call virStreamRecvHole() to get the size of the hole.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This API is used to tell the other side of the stream to skip
some bytes in the stream. This can be used to create a sparse
file on the receiving side of a stream.
It takes @length argument, which says how big the hole is. This
skipping is done from the current point of stream. Since our
streams are not rewindable like regular files, we don't need
@whence argument like seek(2) has.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There are three virStreamDriver's currently supported:
* virFDStream
* remote driver
* ESX driver
For now, backend virStreamRecvFlags support for only remote driver and
ESX driver is sufficient. Future patches will update virFDStream.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch is adding the virStreamRecvFlags as a variant to the
virStreamRecv function in order to allow for future expansion of
functionality for processing sparse streams using a @flags
argument.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This function takes a FD and determines whether the current
position is in data section or in a hole. In addition to that,
it also determines how much bytes are there remaining till the
current section ends.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
One big downside of using the pipe to transfer the data is that
we can really transfer just bare data. No metadata can be carried
through unless some formatted messages are introduced. That would
be quite painful to achieve so let's use a message queue. It's
fairly easy to exchange info between threads now that iohelper is
no longer used.
The reason why we cannot use the FD for plain files directly is
that despite us setting noblock flag on the FD, any
read()/write() blocks regardless (which is a show stopper since
those parts of the code are run from the event loop) and poll()
reports such FD as always readable/writable - even though the
subsequent operation might block.
The pipe is still not gone though. It is used to signal the event
loop that an event occurred (e.g. data is available for reading
in the queue, or vice versa).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since we allow active layer block commit the users are allowed to commit
the top of the chain (e.g. vda) into the backing image. The API would
not accept that parameter, as it tried to look up the image in the
backing chain.
Add the ability to use the top level image target name explicitly as the
top image of the block commit operation.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1451394
The QEMU default is GICv2, and some of the code in libvirt
relies on the exact value. Stop pretending that's not the
case and use GICv2 explicitly where needed.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
There are currently some limitations in the emulated GICv3
that make it unsuitable as a default. Use GICv2 instead.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1450433
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Currently we consider all UNIX paths with specific prefix as generated
by libvirt, but that's a wrong assumption. Let's make the detection
better by actually checking whether the whole path matches one of the
paths that we generate or generated in the past.
The UNIX path isn't stored in config XML since libvirt-1.3.1.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1446980
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The debian etch distro was end-of-life a long time ago so we no
longer need the ULLONG_MAX hack. In any case gnulib now provides
an equivalent fix by default, and so our definition now triggers
syntax-check rule failure
src/internal.h:# define ULLONG_MAX ULONG_LONG_MAX
maint.mk: define the above via some gnulib .h file
maint.mk:843: recipe for target 'sc_prohibit_always-defined_macros' failed
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add kernel_irqchip=split/on to the QEMU command line
and a capability that looks for it in query-command-line-options
output. For the 'split' option, use a version check
since it cannot be reasonably probed.
https://bugzilla.redhat.com/show_bug.cgi?id=1427005
Add a new <ioapic> element with a driver attribute.
Possible values are qemu and kvm. With 'qemu', the I/O
APIC can be put in the userspace even for KVM domains.
https://bugzilla.redhat.com/show_bug.cgi?id=1427005
There should be no need to make dir based pools world/group readable.
So use 0711, not 0755, as the default perms for storage dirs.
Updates in v2:
- adapt commit wording to mention dropping group readable as well
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
The metadata libvirt cares about is identical for version 3
as for previous versions, so we merely need list the new
version number.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If a shutdown is expected because it was triggered via libvirt we can
also expect the monitor to close. In those cases do not report an
internal error like:
"internal error: End of file from qemu monitor"
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Simply tries to match the provided regex on a string and returns
the result. Useful if caller don't care about the matched substring
and want to just test if some pattern patches a string.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The @type from virFileReadValueString needs to be VIR_FREE each time
through the loop since it's not saved and since cleanup can be reached
prior to decoding it for @kernel_type amd bank->type, the cleanup code
needs to also have a VIR_FREE
Found by Coverity
Adjust the current message to make it clear, that it is the hotplug
operation that is unsupported with the given host device type.
https://bugzilla.redhat.com/show_bug.cgi?id=1450072
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When running on a NUMA machine, populate the sibling node
and distance information using data supplied by Xen.
With locality distances information, under Xen, new host
capabilities would like:
<topology>
<cells num='4'>
<cell id='0'>
<memory unit='KiB'>263902380</memory>
<distances>
<sibling id='0' value='10'/>
<sibling id='1' value='21'/>
</distances>
...
</cell>
...
</cells>
...
</topology>
Signed-off-by: Wim ten Have <wim.ten.have@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
GCC complains that inlining virStringTrimOptionalNewline is not
likely on some platforms:
cc1: warnings being treated as errors
../../src/util/virfile.c: In function 'virFileReadValueBitmap':
../../src/util/virstring.h:292: error: inlining failed in call to 'virStringTrimOptionalNewline': call is unlikely and code size would grow [-Winline]
../../src/util/virfile.c:3987: error: called from here [-Winline]
Inlining this function is not going to be a measurable performance
benefit either, since the time required to execute it is going to
be dominated by running of strlen() over the string, not by the
function call overhead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Added only in drivers that were already calling
virCapabilitiesInitNUMA(). Instead of refactoring all the callers to
behave the same way in case of error, just follow what the callers are
doing for all the functions.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We're only adding only info about L3 caches, we can add more
later (just by changing one line), but for now that's more than enough
without overwhelming anyone.
XML snippet of how this should look like (also seen as part of the commit):
<cache>
<bank id='0' level='3' type='both' size='8192' unit='KiB' cpus='0-7'/>
</cache>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
It is no longer needed thanks to the great virfilewrapper.c. And this
way we don't have to add a new set of functions for each prefixed
path.
While on that, add two functions that weren't there before, string and
scaled integer reading ones. Also increase the length of the string
being read by one to accompany for the optional newline at the
end (i.e. change INT_STRLEN_BOUND to INT_BUFSIZE_BOUND).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
So, because mingw is somehow OK with dereferencing a pointer within a
VIR_DEBUG macro, compared to outside of it to which it complained with a
"potential NULL pointer dereference" error (still a false positive), we
can make the code a tiny bit cleaner.
Sighed-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When adding a nwfilter onto the list in
virNWFilterObjListAssignDef() this array is re-allocated to match
demand for new size. However, it is never freed leading to a
leak:
==26535== 136 bytes in 1 blocks are definitely lost in loss record 1,079 of 1,250
==26535== at 0x4C2E2BE: realloc (vg_replace_malloc.c:785)
==26535== by 0x54BA28E: virReallocN (viralloc.c:245)
==26535== by 0x54BA384: virExpandN (viralloc.c:294)
==26535== by 0x54BA657: virInsertElementsN (viralloc.c:436)
==26535== by 0x55DB011: virNWFilterObjListAssignDef (virnwfilterobj.c:362)
==26535== by 0x55DB530: virNWFilterObjListLoadConfig (virnwfilterobj.c:503)
==26535== by 0x55DB635: virNWFilterObjListLoadAllConfigs (virnwfilterobj.c:539)
==26535== by 0x2AC5A28B: nwfilterStateInitialize (nwfilter_driver.c:250)
==26535== by 0x5621C64: virStateInitialize (libvirt.c:770)
==26535== by 0x124379: daemonRunStateInit (libvirtd.c:881)
==26535== by 0x554AC78: virThreadHelper (virthread.c:206)
==26535== by 0x8F5F493: start_thread (in /lib64/libpthread-2.23.so)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
After bdcf6e481 there is a crasher in libvirt. The commit assumes
that priv->perf is always set. That is not true. For inactive
domains, the priv->perf is not allocated as it is set in
qemuProcessLaunch(). Now, usually we differentiate between
accesses to inactive and active definition and it works just
fine. Except for 'domstats'. There priv->perf is accessed without
prior check for domain inactivity. While we could check for that,
more robust solution is to make virPerfEventIsEnabled() accept
NULL.
How to reproduce:
1) ensure you have at least one inactive domain
2) virsh domstats
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
This patch fixes the following MinGW error (although actually being a
false positive):
../../src/util/virmdev.c: In function 'virMediatedDeviceListMarkDevices':
../../src/util/virmdev.c:453:21: error: potential null pointer
dereference [-Werror=null-dereference]
const char *mdev_path = mdev->path;
^~~~~~~~~
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The problem resides in virHostdevUpdateActiveMediatedDevices which gets
called during qemuProcessReconnect. The issue here is that
virMediatedDeviceListAdd takes a pointer to the item to be added to the
list to which VIR_APPEND_ELEMENT is used, which also clears the pointer.
However, in this case only the local copy of the pointer got cleared,
leaving the original pointing to valid memory. To sum it up, during
cleanup phase, the original pointer is freed and the daemon crashes
basically any time it would access it.
Backtrace:
0x00007ffff3ccdeba in __strcmp_sse2_unaligned
0x00007ffff72a444a in virMediatedDeviceListFindIndex
0x00007ffff7241446 in virHostdevReAttachMediatedDevices
0x00007fffc60215d9 in qemuHostdevReAttachMediatedDevices
0x00007fffc60216dc in qemuHostdevReAttachDomainDevices
0x00007fffc6046e6f in qemuProcessStop
0x00007fffc6091596 in processMonitorEOFEvent
0x00007fffc6091793 in qemuProcessEventHandler
0x00007ffff7294bf5 in virThreadPoolWorker
0x00007ffff7294184 in virThreadHelper
0x00007ffff3fdc3c4 in start_thread () from /lib64/libpthread.so.0
0x00007ffff3d269cf in clone () from /lib64/libc.so.6
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1446455
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Use a local variable to hold data, rather than accessing the pointer
after calling virMediatedDeviceListAdd (therefore VIR_APPEND_ELEMENT).
Although not causing an issue at the moment, this change is a necessary
prerequisite for tweaking virMediatedDeviceListAdd in a separate patch,
which will take a reference for the source pointer (instead of pointer
value) and will clear it along the way.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Even though there are several checks before calling this function
and for some scenarios we don't call it at all (e.g. on disk hot
unplug), it may be possible to sneak in some weird files (e.g. if
domain would have RNG with /dev/shm/some_file as its backend). No
matter how improbable, we shouldn't unlink it as we would be
unlinking a file from the host which we haven't created in the
first place.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cedric Bosdonnat <cbosdonnat@suse.com>
Just like in previous commit, this fixes the same issue for
hotplug.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cedric Bosdonnat <cbosdonnat@suse.com>
While the code allows devices to already be there (by some
miracle), we shouldn't try to create devices that don't belong to
us. For instance, we shouldn't try to create /dev/shm/file
because /dev/shm is a mount point that is preserved. Therefore if
a file is created there from an outside (e.g. by mgmt application
or some other daemon running on the system like vhostmd), it
exists in the qemu namespace too as the mount point is the same.
It's only /dev and /dev only that is different. The same
reasoning applies to all other preserved mount points.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cedric Bosdonnat <cbosdonnat@suse.com>
Currently, all we need to do in qemuDomainCreateDeviceRecursive() is to
take given @device, get all kinds of info on it (major & minor numbers,
owner, seclabels) and create its copy at a temporary location @path
(usually /var/run/libvirt/qemu/$domName.dev), if @device live under
/dev. This is, however, very loose condition, as it also means
/dev/shm/* is created too. Therefor, we will need to pass more arguments
into the function for better decision making (e.g. list of mount points
under /dev). Instead of adding more arguments to all the functions (not
easily reachable because some functions are callback with strictly
defined type), lets just turn this one 'const char *' into a 'struct *'.
New "arguments" can be then added at no cost.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cedric Bosdonnat <cbosdonnat@suse.com>
When setting up mount namespace for a qemu domain the following
steps are executed:
1) get list of mountpoints under /dev/
2) move them to /var/run/libvirt/qemu/$domName.ext
3) start constructing new device tree under /var/run/libvirt/qemu/$domName.dev
4) move the mountpoint of the new device tree to /dev
5) restore original mountpoints from step 2)
Note the problem with this approach is that if some device in step
3) requires access to a mountpoint from step 2) it will fail as
the mountpoint is not there anymore. For instance consider the
following domain disk configuration:
<disk type='file' device='disk'>
<driver name='qemu' type='raw'/>
<source file='/dev/shm/vhostmd0'/>
<target dev='vdb' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
</disk>
In this case operation fails as we are unable to create vhostmd0
in the new device tree because after step 2) there is no /dev/shm
anymore. Leave aside fact that we shouldn't try to create devices
living in other mountpoints. That's a separate bug that will be
addressed later.
Currently, the order described above is rearranged to:
1) get list of mountpoints under /dev/
2) start constructing new device tree under /var/run/libvirt/qemu/$domName.dev
3) move them to /var/run/libvirt/qemu/$domName.ext
4) move the mountpoint of the new device tree to /dev
5) restore original mountpoints from step 3)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cedric Bosdonnat <cbosdonnat@suse.com>
When we get a POLLHUP or VIR_EVENT_HANDLE_HANGUP event for a client, we
still want to read from the socket to process any accumulated data. But
doing so inevitably results in an error and a call to
virNetClientMarkClose before we get to processing the hangup event (and
another call to virNetClientMarkClose). However the close reason passed
to the second virNetClientMarkClose call is ignored because another one
was already set. We need to pass the correct close reason when marking
the socket to be closed for the first time.
https://bugzilla.redhat.com/show_bug.cgi?id=1373859
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
While fixing a bug with incorrectly freed memory in commit
v3.1.0-399-g5498aa29a, I accidentally broke persistent migration of
transient domains. Before adding qemuDomainDefCopy in the path, the code
just took NULL from vm->newDef and used it as the persistent def, which
resulted in no persistent XML being sent in the migration cookie. This
scenario is perfectly valid and the destination correctly handles it by
using the incoming live definition and storing it as the persistent one.
After the mentioned commit libvirtd would just segfault in the described
scenario.
https://bugzilla.redhat.com/show_bug.cgi?id=1446205
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
If we are encoding a block of data that is 16 bytes in length,
we cannot leave it as 16 bytes, we must pad it out to the next
block boundary, 32 bytes. Without this padding, the decoder will
incorrectly treat the last byte of plain text as the padding
length, as it can't distinguish padded from non-padded data.
The problem exhibited itself when using a 16 byte passphrase
for a LUKS volume
$ virsh secret-set-value 55806c7d-8e93-456f-829b-607d8c198367 \
$(echo -n 1234567812345678 | base64)
Secret value set
$ virsh start demo
error: Failed to start domain demo
error: internal error: process exited while connecting to monitor: >>>>>>>>>>Len 16
2017-05-02T10:35:40.016390Z qemu-system-x86_64: -object \
secret,id=virtio-disk1-luks-secret0,data=SEtNi5vDUeyseMKHwc1c1Q==,\
keyid=masterKey0,iv=zm7apUB1A6dPcH53VW960Q==,format=base64: \
Incorrect number of padding bytes (56) found on decrypted data
Notice how the padding '56' corresponds to the ordinal value of
the character '8'.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When creating v3.2.0-77-g8be3ccd04 commit, I completely forgot that one
migration capability is very special. It's the "events" capability which
tells QEMU to report "MIGRATION" events. Since libvirt always wants the
events, it is enabled in qemuConnectMonitor and the rest of the code
should not touch it.
https://bugzilla.redhat.com/show_bug.cgi?id=1439841https://bugzilla.redhat.com/show_bug.cgi?id=1441165
Messed-up-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
... with VIR_NET_GENERATED_MACV???_PREFIX, which is defined in
util/virnetdevmacvlan.h.
Since VIR_NET_GENERATED_PREFIX is used for plain tap devices, it is
renamed to VIR_NET_GENERATED_TAP_PREFIX and moved to virnetdev.h
The parser had been clearing out *all* suggested device names for
type='direct' (aka macvtap) interfaces. All of the code implementing
macvtap allows for a user-specified device name, so we should allow
it. In the case that an interface name starts with "macvtap" or
"macvlan" though, we do still clear it out, just as we do with "vnet"
(which is the prefix used for automatically generated tap device
names), since those are the prefixes for the names we autogenerate for
macvtap and macvlan devices.
Resolves: https://bugzilla.redhat.com/1335798