Unify the NumOfVolumes API into virstorageobj.c from storage_driver and
test_driver. The only real difference between the two is the test driver
doesn't call using the aclfilter API.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Unify the *ListDevice API into virnodedeviceobj.c from node_device_driver
and test_driver. The only real difference between the two is that the test
driver doesn't call the aclfilter API. The name of the new API follows that
of other drivers to "GetNames".
NB: Change some variable names to match what they really are - consistency
with other drivers. Also added a clear of the input names.
This also allows virNodeDeviceObjHasCap to be static to virnodedeviceobj
Signed-off-by: John Ferlan <jferlan@redhat.com>
Unify the NumOfDevices API into virnodedeviceobj.c from node_device_driver
and test_driver. The only real difference between the two is that the test
driver doesn't call the aclfilter API.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Unlike other drivers, this is a test driver only API. Still combining
the logic of testConnectListInterfaces and testConnectListDefinedInterfaces
makes things a bit easier in the long run.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Unlike other drivers, this is a test driver only API. Still combining
the logic of testConnectNumOfInterfaces and testConnectNumOfDefinedInterfaces
makes things a bit easier in the long run.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This new internal API makes a copy of virCPUDef while removing all
features which would block migration. It uses cpu_map.xml as a database
of such features, which should only be used as a fallback when we cannot
get the data from a hypervisor. The main goal of this API is to decouple
this filtering from virCPUUpdate so that the hypervisor driver can
filter the features according to the hypervisor.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Currently, if we want to zero out disk source (e,g, due to
startupPolicy when starting up a domain) we use
virDomainDiskSetSource(disk, NULL). This works well for file
based storage (storage type file, dir, or block). But it doesn't
work at all for other types like volume and network.
So imagine that you have a domain that has a CDROM configured
which source is a volume from an inactive pool. Because it is
startupPolicy='optional', the CDROM is empty when the domain
starts. However, the source element is not cleared out in the
status XML and thus when the daemon restarts and tries to
reconnect to the domain it refreshes the disks (which fails - the
storage pool is still not running) and thus the domain is killed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Having to use cpuBaseline with VIR_CONNECT_BASELINE_CPU_EXPAND_FEATURES
flag to expand CPU features is strange. Not to mention that cpuBaseline
can only expand host CPU definitions (i.e., it completely ignores
feature policies). The new virCPUExpandFeatures API is designed to work
with both host and guest CPU definitions.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The global functions virNetDevReplaceMacAddress(),
virNetDevReplaceNetConfig(), virNetDevRestoreMacAddress(), and
virNetDevRestoreNetConfig() are no longer used, as their functionality
has been replaced by virNetDev(Save|Read|Set)NetConfig().
The static functions virNetDevReplaceVfConfig() and
virNetDevRestoreVfConfig() were only used by the above-named global
functions that were removed.
These three functions are destined to replace
virNetDev(Replace|Restore)NetConfig() and
virNetDev(Replace|Restore)MacAddress(), which both do the save and set
together as a single step. We need to separate the save, read, and set
steps because there will be situations where we need to do something
else in between (in particular, we will need to rebind a VF's driver
after save but before set).
This patch creates the new functions, but doesn't call them - that
will come in a subsequent patch. Note that the new functions to
read/write the file that stores the original network config now uses
JSON rather than plaintext (it still recognizes the old format as well
though, so it won't get confused during an upgrade).
Keep track of the assigned mediated devices the same way we do it for
the rest of hostdevs. Methods like 'Prepare', 'Update', and 'ReAttach'
are introduced by this patch.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Beside creation, disposal, getter, and setter methods the module exports
methods to work with lists of mediated devices.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This way more drivers can utilize the functionality without copying
the code. And we can therefore test it in one place for all of them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
That file has only two exported files and each one of them has
different naming. virNode is what all the other files use, so let's
use it. It wasn't used before because the clash with public API
naming, so let's fix that by shortening the name (there is no other
private variant of it anyway).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is no "node driver" as there was before, drivers have to do
their own ACL checking anyway, so they all specify their functions and
nodeinfo is basically just extending conf/capablities. Hence moving
the code to src/conf/ is the right way to go.
Also that way we can de-duplicate some code that is in virsysfs and/or
virhostcpu that got duplicated during the virhostcpu.c split. And
Some cleanup is done throughout the changes, like adding the vir*
prefix etc.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is no reason for it not to be in the utils, all global symbols
under that file already have prefix vir* and there is no reason for it
to be part of DRIVER_SOURCES because that is just a leftover from
older days (pre-driver modules era, I believe).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
While on that, drop support for kernels from RHEL-5 era (missing
cpu/present file). Also add some useful functions and export them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
By using this we are able to easily switch the sysfs path being
used (fake it). This will not only help tests in the future but can
be also used from files where the code is duplicated currently.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
These helpers are doing just a read and covert the value, but they
properly size the read limit, handle additional whitespace characters,
and unify error reporting.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commits eaf18f4c2b and 86dd9fac0f separated util/host{cpu,mem}
stuff from nodeinfo, but did not adjust the syms file.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
It is everywhere else. I even remember one of our scripts failing if
the newline is missing, but it doesn't happen currently.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Guests are handled in callers, but if something goes wrong (when it
cannot be added to virCapabilities, for example), there's no way for
them to free it properly.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Both QEMU and bhyve are using the same function for setting up the CPU
in virCapabilities, so de-duplicate it, save code and time, and help
other drivers adopt it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When using thin provisioning, management tools need to resize the disk
in certain cases. To avoid having them to poll disk usage introduce an
event which will be fired when a given offset of the storage is written
by the hypervisor. Together with the API which will be added later, it
will allow registering thresholds for given storage backing volumes and
this event will then notify management if the threshold is exceeded.
The function has very specific semantics. Split out the part that parses
the backing store specification string into a separate helper so that it
can be reused later while keeping the wrapper with existing semantics.
Note that virStorageFileParseChainIndex is pretty well covered by the
test suite.
Given an SRIOV PF netdev name (e.g. "enp2s0f0") and VF#, this new
function returns the netdev name of the referenced VF device
(e.g. "enp2s11f6"), or NULL if the device isn't bound to a net driver.
This function unbinds a device from its driver, then immediately
rebinds it to its driver again. The code for this new function is just
the 2nd half of virPCIDeviceBindWithDriverOverride(), so that
function's 2nd half is replaced with a call to virPCIDeviceRebind().
This patch splits out the part of virNetDevTapCreateInBridgePort()
that would need to be re-done if an existing tap device had to be
re-attached to a bridge, and puts it into a separate function. This
can be used both when an existing domain interface config is updated
to change its connection, and also to re-attach to the "same" bridge
when a network has been stopped and restarted. So far it is used for
nothing.
This function provides the bridge/bond device that the given network
device is attached to. The return value is 0 or -1, and the master
device is a char** argument to the function - this is needed in order
to allow for a "success" return from a device that has no master.
When enabling IPv6 on all interfaces, we may get the host Router
Advertisement routes discarded. To avoid this, the user needs to set
accept_ra to 2 for the interfaces with such routes.
See https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
on this topic.
To avoid user mistakenly losing routes on their hosts, check
accept_ra values before enabling IPv6 forwarding. If a RA route is
detected, but neither the corresponding device nor global accept_ra
is set to 2, the network will fail to start.
virNetlinkCommand() processes only one response message, while some
netlink commands, like route dumping, need to process several.
Add virNetlinkDumpCommand() as a virNetlinkCommand() sister.
Use "virStoragePoolObj" as a prefix for any external API in virstorageobj.
Also a couple of functions were local to virstorageobj.c, so remove their
external defs iin virstorageobj.h.
NB: The virStorageVolDef* API's won't change.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move all the StoragePoolObj related API's into their own module
virstorageobj from the storage_conf
Purely code motion at this point, plus adjustments to cleanly build
Signed-off-by: John Ferlan <jferlan@redhat.com>
When starting a domain with custom guest CPU specification QEMU may add
or remove some CPU features. There are several reasons for this, e.g.,
QEMU/KVM does not support some requested features or the definition of
the requested CPU model in libvirt's cpu_map.xml differs from the one
QEMU is using. We can't really avoid this because CPU models are allowed
to change with machine types and libvirt doesn't know (and probably
doesn't even want to know) about such changes.
Thus when we want to make sure guest ABI doesn't change when a domain
gets migrated to another host, we need to update our live CPU definition
according to the CPU QEMU created. Once updated, we will change CPU
checking to VIR_CPU_CHECK_FULL to make sure the virtual CPU created
after migration exactly matches the one on the source.
https://bugzilla.redhat.com/show_bug.cgi?id=822148https://bugzilla.redhat.com/show_bug.cgi?id=824989
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Move the bulk of createVport and rename to virNodeDeviceCreateVport.
Remove the deleteVport entirely and replace with virNodeDeviceDeleteVport
Signed-off-by: John Ferlan <jferlan@redhat.com>
The function is actually in virutil.c, but prototyped in virfile.h.
This patch fixes that by renaming the function to virWaitForDevices,
adding the prototype in virutil.h and libvirt_private.syms, and then
changing the callers to use the new name.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the virStoragePoolSourceAdapter from storage_conf.h and rename
to virStorageAdapter.
Continue with code realignment for brevity and flow.
Signed-off-by: John Ferlan <jferlan@redhat.com>
cpuNodeData has always been followed by cpuDecode as no hypervisor
driver is really interested in raw CPUID data for a host CPU. Let's
create a new CPU driver API which returns virCPUDefPtr directly.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Move all the NWFilterObj API's into their own module virnwfilterobj
from the nwfilter_conf
Purely code motion at this point, plus adjustments to cleanly build.
Whole implementations along with helper totalling screens of code were
conditionally compiled. That made the code totally unreadable and
untestable. Rename functions to have the architecture in the name so
that all can be compiled at the same time and introduce header to allow
testing them all.
After the system has been booted, it should not change.
Cache the return value of virSystemdHasMachined.
Allow starting and terminating machines with just one
DBus call, instead of three, reducing the chance of
the call timing out.
Also introduce a small function for resetting the cache
to be used in tests.
Move all the NodeDeviceObj API's into their own module virnodedeviceobj
from the node_device_conf
Purely code motion at this point, plus adjustments to cleanly build.
The API is useful for creating virCPUData in a hypervisor driver from
data we got by querying the hypervisor.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The API is useful for creating virCPUData in a hypervisor driver from
data we got by querying the hypervisor.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The API is useful for creating virCPUData in a hypervisor driver from
data we got by querying the hypervisor.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The new API is called virCPUDataFree. Individual CPU drivers are no
longer required to implement their own freeing function unless they need
to free architecture specific data from virCPUData.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This will eventually replace virQEMUBuildBufferEscapeComma, however
it's not possible right now. Some parts of the code that uses the
old function needs to be refactored.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Rework the code to perform the various searches by parent, parent_wwnn/
parent_wwpn, parent_fabric_wwn, or vport capable in order to return the
'parent_host' number that is vHBA capable.
The former virNodeDeviceGetParentHost is renamed to add the ByParent
on it fixes an issue where if no parent was supplied in the XML to
create the vHBA, then virNodeDeviceFindByName was called with a NULL
second parameter which had bad results.
The reworked code will make the various calls to fetch the NPIV host
by the passed parameter options or if none are provided find a vport
capable NPIV HBA to perform the create. If the call is from the delete
path, then this option won't be allowed.
Each of virNodeDeviceGetParentHostBy* functions is now static, so
remove them external definitions.
A secondary benefit of this is the test_driver now can make use of
the new API to add some new tests to test the various creation options.
Create a virscsihost.c and place the functions there. That removes the
last #ifdef __linux__ from virutil.c.
Take the opporunity to also change the function names and in one case
the parameters slightly
Use the new virNodeDeviceGetParentName instead. Modify the callers to
build the node device scsi_host# name string in order to call the new
function so that proper lookup occurs.
Rather than have them mixed in with the virutil apis, create a separate
virvhba.c module and move the vHBA related calls into there. Soon there
will be more added.
Also modify the names of the functions and some arguments to be more
indicative of what is really happening. Adjust the callers respectively.
While I was changing fchosttest, rather than the non-descriptive names
test1...test6, rename them to match what the test is doing.
So rather than comparing 2 paths (strings) as they are, which can very
easily lead to unnecessary errors (e.g. in storage driver) that the paths
are not the same when in fact they'd be e.g. just symlinks to the same
location, we should put our best effort into resolving any symlinks and
canonicalizing the path and only then compare the 2 paths for equality.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This patchs allows to set the timeout value used for all
openvswitch calls. The default timeout value remains as
before at 5 seconds.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We will need to traverse the symlinks one step at the time.
Therefore we need to see where a symlink is pointing to.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Based on work of Mehdi Abaakouk <sileht@sileht.net>.
When parsing vhost-user interface XML and no ifname is found we
can try to fill it in in post parse callback. The way this works
is we try to make up interface name from given socket path and
then ask openvswitch whether it knows the interface.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is a simple wrapper over mount(). However, not every system
out there is capable of moving a mount point. Therefore, instead
of having to deal with this fact in all the places of our code we
can have a simple wrapper and deal with this fact at just one
place.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Other drivers (like qemu) would like to know if the namespaces
are available therefore it makes sense to move this function to
a shared module.
At the same time, this function had some default namespaces that
are checked with every call. It is not necessary - let callers
pass just those namespaces they are interested in.
With the move the function is renamed to
virProcessNamespaceAvailable.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Surprisingly there was a virDomainPCIAddressReleaseAddr() function
already, but it was completely unused. Since we don't reserve entire
slots at once any more, there is no need to release entire slots
either, so we just replace the single call to
virDomainPCIAddressReleaseSlot() with a call to
virDomainPCIAddressReleaseAddr() and remove the now unused function.
The keen observer may be concerned that ...Addr() doesn't call
virDomainPCIAddressValidate(), as ...Slot() did. But really the
validation was pointless anyway - if the device hadn't been suitable
to be connected at that address, it would have failed validation
before every being reserved in the first place, so by definition it
will pass validation when it is being unplugged. (And anyway, even if
something "bad" happened and we managed to have a device incorrectly
at the given address, we would still want to be able to free it up for
use by a device that *did* validate properly).
Since we don't actually reserve an entire slot at a time anymore, the
name of this function is just confusing, and it's almost identical in
operation to virDomainPCIAddressReserveNextAddr() anyway, so remove
the *Slot() function and replace calls to it with calls to *Addr(...,
-1).
This utility function iterates through all devices looking for any
with a PCI address that has function != 0 (which implies that multiple
functions are in use on that slot), then uses an inner iterator to
find the device that's on function 0 of that same slot and sets the
"multi" in its virDomainDeviceInfo (as long as it hasn't already been
set explicitly by someone who presumably has better information than
we do).
It isn't yet called from anywhere, so will have no functional effect.
https://bugzilla.redhat.com/show_bug.cgi?id=1373711
Add support and documentation for the [NO_]OVERWRITE flags for the
logical backend.
Update virsh.pod with a description of the process for usage of
the flags and building of the pool's volume group.
Signed-off-by: John Ferlan <jferlan@redhat.com>
With our new qemu namespace code in place, the relabelling of
devices is done not as good is it could: a child process is
spawned, it enters the mount namespace of the qemu process and
then runs desired API of the security driver.
Problem with this approach is that internal state transition of
the security driver done in the child process is not reflected in
the parent process. While currently it wouldn't matter that much,
it is fairly easy to forget about that. We should take the extra
step now while this limitation is still fresh in our minds.
Three new APIs are introduced here:
virSecurityManagerTransactionStart()
virSecurityManagerTransactionCommit()
virSecurityManagerTransactionAbort()
The Start() is going to be used to let security driver know that
we are starting a new transaction. During a transaction no
security labels are actually touched, but rather recorded and
only at Commit() phase they are actually updated. Should
something go wrong Abort() aborts the transaction freeing up all
memory allocated by transaction.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The public virSecret object has a single "usage_id" field
but the virSecretDef object has a different 'char *' field
for each usage type, but the code all assumes every usage
type has a corresponding single string. Get rid of the
pointless union in virSecretDef and just use "usage_id"
everywhere. This doesn't impact public XML format, only
the internal handling.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When changing the metadata via virDomainSetMetadata, we now
emit an event to notify the app of changes. This is useful
when co-ordinating different applications read/write of
custom metadata.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently when spawning containers with systemd, the container PID 1
will get moved into the systemd machine slice. Libvirt then manually
moves the libvirt_lxc and qemu-nbd processes into the cgroups associated
with the slice, but skips the systemd controller cgroup. This means that
from systemd's POV, libvirt_lxc and qemu-nbd are still part of the
libvirtd.service unit.
On systemctl daemon-reload, it will notice that libvirt_lxc & qemu-nbd
are in the libvirtd.service unit for the systemd controller, but in the
machine cgroups for resources. Systemd will thus move them back into
the libvirtd.service resource cgroups next time libvirtd is restarted.
This causes libvirtd to kill off the container due to incorrect cgroup
placement.
The solution is to ensure that when moving libvirt_lxc & qemu-nbd, we
also move the systemd cgroup controller placement. Normally this is
not something we ever want todo, but this is a special case as we are
intentionally wanting to move them to a different systemd unit.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1349696
When creating a vHBA, the process is to feed XML to nodeDeviceCreateXML
that lists the <parent> scsi_hostX to use to create the vHBA. However,
between reboots, it's possible that the <parent> changes its scsi_hostX
to scsi_hostY and saved XML to perform the creation will either fail or
create a vHBA using the wrong parent.
So add the ability to provide "wwnn" and "wwpn" or "fabric_wwn" to
the <parent> instead of a name of the scsi_hostN that is the parent.
The allowed XML will thus be:
<parent>scsi_host3</parent> (current)
or
<parent wwnn='$WWNN' wwpn='$WWPN'/>
or
<parent fabric_wwn='$WWNN'/>
Using the wwnn/wwpn or fabric_wwn ensures the same 'scsi_hostN' is
selected between hardware reconfigs or host reboots. The fabric_wwn
Using the wwnn/wwpn pair will provide the most specific search option,
while fabric_wwn will at least ensure usage of the same SAN, but maybe
not the same scsi_hostN.
This patch will add the new fields to the nodedev.rng for input purposes
only since the input XML is essentially thrown away, no need to Format
the values since they'd already be printed as part of the scsi_host
data block.
New API virNodeDeviceGetParentHostByWWNs will take the parent "wwnn" and
"wwpn" in order to search the list of devices for matching capability
data fields wwnn and wwpn.
New API virNodeDeviceGetParentHostByFabricWWN will take the parent "fabric_wwn"
in order to search the list of devices for matching capability data field
fabric_wwn.
If a <parent> is not supplied in the XML used to create a non-persistent
vHBA, then instead of failing, let's try to find a "vports" capable node
device and use that.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Clang 3.9 refuses to compile the existing code with the
following error:
util/virfirewall.c:425:20: error: passing an object that undergoes
default argument promotion to 'va_start'
has undefined behavior [-Werror,-Wvarargs]
va_start(args, layer);
^
util/virfirewall.c:420:37: note: parameter of type 'virFirewallLayer'
is declared here
virFirewallLayer layer,
^
This happens because 'layer' is of type virFirewallLayer, which
is an enum type and not a standard type such as eg. void* or int.
To solve the issue, turn virFirewallAddRule() from a very thin
wrapper around virFirewallAddRuleFullV() to a macro that expands
to a call to virFirewallAddRuleFull() - itself a very thin wrapper
around the aforementioned virFirewallAddRuleFullV() - with no loss
of functionality or type safety.
This function will be needed by the QEMU driver in an upcoming
patch. Additionally, removed a useless empty line.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
The API creates PTR domain which corresponds to a given addr/prefix.
Both IPv4 and IPv6 addresses are supported, but the prefix must be
divisible by 8 for IPv4 and divisible by 4 for IPv6.
The generated PTR domain has the following format
IPv4: 1.2.3.4.in-addr.arpa
IPv6: 0.1.2.3.4.5.6.7.8.9.a.b.c.d.e.f.0.1.2.3.4.5.6.7.8.9.a.b.c.d.e.f.ip6.arpa
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
These helpers will manage the log destination defaults (fetch/set). The reason
for this is to stay consistent with the current daemon's behaviour with respect
to /etc/libvirt/<daemon>.conf file, since both assignment of an empty string
or not setting the log output variable at all trigger the daemon's decision on
the default log destination which depends on whether the daemon runs daemonized
or not.
This patch also changes the logic of the selection of the default
logging output compared to how it is done now. The main difference though is
that we should only really care if we're running daemonized or not, disregarding
the fact of (not) having a TTY completely (introduced by commit eba36a3878) as
that should be of the libvirtd's parent concern (what FD it will pass to it).
Before:
if (godaemon || !hasTTY):
if (journald):
use journald
if (godaemon):
if (privileged):
use SYSCONFIG/libvirtd.log
else:
use XDG_CONFIG_HOME/libvirtd.log
else:
use stderr
After:
if (godaemon):
if (journald):
use journald
else:
if (privileged):
use SYSCONFIG/libvirtd.log
else:
use XDG_CONFIG_HOME/libvirtd.log
else:
use stderr
Signed-off-by: Erik Skultety <eskultet@redhat.com>
We will need this function in near future so that we know what
/dev device corresponds to the SCSI device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We will need this function in near future so that we know what
/dev device corresponds to the SCSI device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We will need this function in near future so that we know what
/dev device corresponds to the USB device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Namely, virFileGetACLs, virFileSetACLs, virFileFreeACLs and
virFileCopyACLs. These functions are going to be required when we
are creating /dev for qemu. We have copy anything that's in
host's /dev exactly as is. Including ACLs.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The functions to retrieve online and present host CPU information
are only supported on Linux for the time being.
This leads to runtime errors if these function are used on other
platforms. To avoid that, code in higher levels using the functions
must replicate the conditional compilation in higher level which
is error prone (and is plainly spoken ugly).
Adding a function virHostCPUHasBitmap that can be used to check
for host CPU bitmap support.
NB: There are other functions including the host CPU count that
are lacking support on all platforms, but they are too essential
in order to be bypassed.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Instead of having duplicated code in qemuStorageLimitsRefresh and
virStorageBackendUpdateVolTargetInfo to get capacity specific data
about the storage backing source or volume -- create a common API
to handle the details for both.
As a side effect, virStorageFileProbeFormatFromBuf returns to being
a local/static helper to virstoragefile.c
For the QEMU code - if the probe is done, then the format is saved so
as to avoid future such probes.
For the storage backend code, there is no need to deal with the probe
since we cannot call the new API if target->format == NONE.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Instead of having duplicated code in qemuStorageLimitsRefresh and
virStorageBackendUpdateVolTargetInfoFD to fill in the storage backing
source or volume allocation, capacity, and physical values - create a
common API that will handle the details for both.
The common API will fill in "default" capacity values as well - although
those more than likely will be overridden by subsequent code. Having just
one place to make the determination of what the values should be will
make things be more consistent.
For the QEMU code - the data filled in will be for inactive domains
for the GetBlockInfo and DomainGetStatsOneBlock API's. For the storage
backend code - the data will be filled in during the volume updates.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Commit id '8dc27259' introduced virStorageSourceUpdateBlockPhysicalSize
in order to retrieve the physical size for a block backed source device
for an active domain since commit id '15fa84ac' changed to use the
qemuMonitorGetAllBlockStatsInfo and qemuMonitorBlockStatsUpdateCapacity
API's to (essentially) retrieve the "actual-size" from a 'query-block'
operation for the source device.
However, the code only was made functional for a BLOCK backing type
and it neglected to use qemuOpenFile, instead using just open. After
the open the block lseek would find the end of the block and set the
physical value, close the fd and return.
Since the code would return 0 immediately if the source device wasn't
a BLOCK backed device, the physical would be displayed incorrectly,
such as follows in domblkinfo for a file backed source device:
Capacity: 1073741824
Allocation: 0
Physical: 0
This patch will modify the algorithm to get the physical size for other
backing types and it will make use of the qemuDomainStorageOpenStat
helper in order to open/stat the source file depending on its type.
The qemuDomainGetStatsOneBlock will no longer inhibit printing errors,
but it will still ignore them leaving the physical value set to 0.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation to the code move to virnetdevtap.c, this change:
* renames virNetInterfaceStats to virNetDevTapInterfaceStats
* changes 'path' to 'ifname', to use the same vocable as other
method in virnetdevtap.c.
* Add the attributes checker
When vhostuser interfaces are used, the interface statistics
are not available in /proc/net/dev.
This change looks at the openvswitch interfaces statistics
tables to provide this information for vhostuser interface.
Note that in openvswitch world drop/error doesn't always make sense
for some interface type. When these informations are not available we
set them to 0 on the virDomainInterfaceStats.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since its introduction in 2012 this internal API did nothing.
Moreover we have the same API that does exactly the same:
virSecurityManagerDomainSetPathLabel.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This module will be used to track:
<domain, mac address list>
pairs. It will be important to know these mappings without
libvirt connection (that is from a JSON file), because NSS
module will use those to provide better host name translation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There are couple of places where we have a string and want to
save it to a file. Atomically. In all those places we use
virFileRewrite() but also implement the very same callback which
takes the string and write it into temp file. This makes no
sense. Unify the callbacks and move them to one place.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The path to the config file for a PCI device is conventiently stored
in a virPCIDevice object, but that object's contents aren't directly
visible outside of virpci.c, so we need to have an accessor function
for it if anyone needs to look at it.
This new function just calls fstat() (if provided with a valid fd) or
stat() (if fd is -1) and returns st_size (or -1 if there is an
error). We may decide we want this function to be more complex, and
handle things like block devices - this is a placeholder (that works)
for any more complicated function.
We have couple of functions that operate over NULL terminated
lits of strings. However, our naming sucks:
virStringJoin
virStringFreeList
virStringFreeListCount
virStringArrayHasString
virStringGetFirstWithPrefix
We can do better:
virStringListJoin
virStringListFree
virStringListFreeCount
virStringListHasString
virStringListGetFirstWithPrefix
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For a new hostdev type='scsi_host' we have a number of
required functions for managing, adding, and removing the
host device to/from guests. Provide the basic infrastructure
for these tasks.
The name "SCSIVHost" (and its variants) is chosen to avoid
conflicts with existing code named "SCSIHost" to refer to
a hostdev type='scsi' protcol='none'.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Add the function virHostdevIsSCSIDevice() which detects whether a
hostdev is a SCSI device or not.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
PPC driver needs to convert POWERx_v* legacy CPU model names into POWERx
to maintain backward compatibility with existing domains. This patch
adds a new step into the guest CPU configuration work flow which CPU
drivers can use to convert legacy CPU definitions.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When starting a new domain, we allocate the USB addresses and keep
an address cache in the domain object's private data.
However this data is lost on libvirtd restart.
Also generate the address cache if all the addresses have been
specified, so that devices hotplugged after libvirtd restart
also get theirs assigned.
https://bugzilla.redhat.com/show_bug.cgi?id=1387666
This time do not require an address cache as a parameter.
Simplify qemuDomainAttachChrDeviceAssignAddr to not generate
the virtio serial address cache for devices of other types.
Partially reverts commit 925fa4b.
Commit 19a148b dropped the cache from QEMU's private domain object.
Assume the callers do not have the cache by default and use
a longer name for the internal ones that do.
This makes the shorter 'virDomainVirtioSerialAddrAutoAssign'
name availabe for a function that will not require the cache.
There is an existing virDomainPCIAddressReserveNextSlot() which will
reserve all functions of the next available PCI slot. One place in the
qemu PCI address assignment code requires reserving a *single*
function of the next available PCI slot. This patch modifies and
renames virDomainPCIAddressReserveNextSlot() so that it can fulfill
both the original purpose and the need to reserve a single function.
(This is being done so that the abovementioned code in qemu can have
its "kind of open coded" solution replaced with a call to this new
function).
This new function can be used to check if e.g. name of XML
node don't contains forbidden chars like "/" or "\n".
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Make sure that the topology results into a sane number of cpus (up to
UINT_MAX) so that it can be sanely compared to the vcpu count of the VM.
Additionally the helper added in this patch allows to fetch the total
number the topology results to so that it does not have to be
reimplemented later.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1378290
This is mainly virLogAddOutputTo* which were replaced by virLogNewOutputTo* and
the previously poorly named ones virLogParseAndDefine* functions. All of these
are unnecessary now, since all the original callers were transparently switched
to the new model of separate parsing and defining logic.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This method will eventually replace virLogParseAndDefineFilters which
currently does both parsing and defining.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This API is the entry point to output modification of the logger. Currently,
everything is done by virLogParseAndDefineOutputs. Parsing and defining will be
split into two operations both handled by this method transparently.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Abstraction added over parsing a single filter. The method parses potentially a
set of logging filters, while adding each filter logging object to a
caller-provided array.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Another abstraction added on the top of parsing a single logging output. This
method takes and parses the whole set of outputs, adding each single output
that has already been parsed into a caller-provided array. If the user-supplied
string contained duplicate outputs, only the last occurrence is taken into
account (all the others are removed from the list), so we silently avoid
duplicate logs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Same as for outputs, introduce a new method, that is basically the same as
virLogParseAndDefineFilter with the difference that it does not define the
filter. It rather returns a newly created object that needs to be inserted into
a list and then defined separately.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Introduce a method to parse an individual logging output. The difference
compared to the virLogParseAndDefineOutput is that this method does not define
the output, instead it makes use of the virLogNewOutputTo* methods introduced
in the previous patch and just returns the virLogOutput object that has to be
added to a list of object which then can be defined as a whole via
virLogDefineOutputs. The idea remains still the same - split parsing and
defining of the logging primitives (outputs, filters).
Additionally, since virLogNewOutputTo* methods are now finally used,
ATTRIBUTE_UNUSED can be successfully removed from the methods' definitions,
since that was just to avoid compiler complaints about unused static functions.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Prepare a method that only defines a set of filters. It takes a list of
filters, preferably created by virLogParseFilters. The original set of filters
is reset and replaced by the new user-provided set of filters.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Prepare a method that only defines a set of outputs. It takes a list of
outputs, preferably created by virLogParseOutputs. The original set of outputs
is reset and replaced by the new user-provided set of outputs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Outputs are a bit trickier than filters, since the user(config)-specified
set of outputs can contain duplicates. That would lead to logging the same
message twice. For compatibility reasons, we cannot just error out and forbid
the daemon to start if we find duplicate outputs which do not make sense.
Instead, we could silently take into account only the last occurrence of the
duplicate output and remove all the previous ones, so that the logger will not
try to use them when it is looping over all of its registered outputs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In order to later split output parsing and output defining, introduce a new
function which will create a new virLogOutput object which the parser will
insert into a list with the list being eventually defined.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Right now virLogParse* functions are doing both parsing and defining of filters
and outputs which should be two separate operations. Since the naming is
apparently a bit poor this patch renames these functions to
virLogParseAndDefine* which eventually will be replaced by virLogSet*.
Additionally, virLogParse{Filter,Output} will be later (after the split) reused,
so that these functions do exactly what the their name suggests.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Provide the Steal API for any code paths that will desire to grab the
object array and then free it afterwards rather than relying to freeing
the whole chain from the reply.
Certain operations may make the vcpu order information invalid. Since
the order is primarily used to ensure migration compatibility and has
basically no other user benefits, clear the order prior to certain
operations and document that it may be cleared.
All the operations that would clear the order can still be properly
executed by defining a new domain configuration rather than using the
helper APIs.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1370357
Both cpuCompare* APIs are renamed to virCPUCompare*. And they should now
work for any guest CPU definition, i.e., even for host-passthrough
(trivial) and host-model CPUs. The implementation in x86 driver is
enhanced to provide a hint about -noTSX Broadwell and Haswell models
when appropriate.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function is similar to virCPUDataCheckFeature, but it works directly
on CPU definition rather than requiring it to be transformed into CPU
data first.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The API is supposed to make sure the provided CPU definition does not
use a CPU model which is not supported by the hypervisor (if at all
possible, of course).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The reworked API is now called virCPUUpdate and it should change the
provided CPU definition into a one which can be consumed by the QEMU
command line builder:
- host-passthrough remains unchanged
- host-model is turned into custom CPU with a model and features
copied from host
- custom CPU with minimum match is converted similarly to host-model
- optional features are updated according to host's CPU
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function filters all CPU features through a given callback while
copying CPU model related parts of a CPU definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function moves CPU model related parts from one CPU definition to
another. It can be used to avoid unnecessary copies from a temporary CPU
definitions which will be freed anyway.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Useful for copying a CPU definition without model related parts (i.e.,
without model name, feature list, vendor).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In case a hypervisor is able to tell us a list of supported CPU models
and whether each CPU models can be used on the current host, we can
propagate this to domain capabilities. This is a better alternative
to calling virConnectCompareCPU for each supported CPU model.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Listing all CPU models supported by QEMU in domain capabilities makes
little sense when libvirt will refuse any model it doesn't know about.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The patch adds <cpu> element to domain capabilities XML:
<cpu>
<mode name='host-passthrough' supported='yes'/>
<mode name='host-model' supported='yes'/>
<mode name='custom' supported='yes'>
<model>Broadwell</model>
<model>Broadwell-noTSX</model>
...
</mode>
</cpu>
Applications can use it to inspect what CPU configuration modes are
supported for a specific combination of domain type, emulator binary,
guest architecture and machine type.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We will need this function shortly when implementing
nodeGetCPUStats in the test driver.
Signed-off-by: Tomáš Ryšavý <tom.rysavy.0@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Name it virNumaGetHostMemoryNodeset and return only NUMA nodes which
have memory installed. This is necessary as the kernel is not very happy
to set the memory cgroup setting for nodes which do not have any memory.
This would break vcpu hotplug with following message on such
configruation:
Invalid value '0,8' for 'cpuset.mems': Invalid argument
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375268
The code for replacing domain's transient definition with the persistent
one is repeated in several places and we'll need to add one more. Let's
make a nice helper for it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Currently the QEMU processes inherit their core dump rlimit
from libvirtd, which is really suboptimal. This change allows
their limit to be directly controlled from qemu.conf instead.
Since the domain lock is not held during preparation of an external XML
config, it is possible that the value can change resulting in unexpected
failures during ABI consistency checking for some save and migrate
operations.
This patch adds a new flag to skip the checking of the cur_balloon value
and then sets the destination value to the source value to ensure
subsequent checks without the skip flag will succeed.
This way it is protected from forges and is keeped up to date too.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Finding an USB device from the vendor/device values will be needed
by libxl driver to convert from vendor/device to bus/dev addresses.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
This event is emitted when a nodedev XML definition is updated,
like when cdrom media is changed in a cdrom block device.
Also includes node device update event implementation for udev
backend, virsh nodedev-event support, and event-test support
https://bugzilla.redhat.com/show_bug.cgi?id=1356436
According to RFC 3721 (https://www.ietf.org/rfc/rfc3721.txt), there are
two ways to "discover" targets in/for the iSCSI environment. Discovery
is the process which allows the initiator to find the targets to which
it has access and at least one address at which each target may be
accessed.
The method currently implemented in libvirt using the virISCSIScanTargets
API is known as "SendTargets" discovery. This method is more useful when
the target IP Address and TCP port information are available, e.g. in
libvirt terms the "portal". It returns a list of targets for the portal.
From that list, the target can be found. This operation can also fill an
iSCSI node table into which iSCSI logins may occur. Commit id '56057900'
altered that filling by adding the "--op nonpersistent" since it was
not necessarily desired to perform that for non libvirt related targets.
The second method is "Static Configuration". This method not only needs
the IP Address and TCP port (e.g. portal), but also the iSCSI target name.
In libvirt terms this would be the device path field from the iSCSI pool
<source> XML. This patch implements the second methodology using that
required device path as the targetname.
To allow richer definitions of disk sources add infrastructure that will
allow to register functionst generating a JSON object based definition.
This infrastructure will then convert the definition to the proper
command line syntax and use it in cases where it's necessary. This will
allow to keep legacy definitions for back-compat when possible and use
the new definitions for the configurations requiring them.
Add support for converting objects nested in arrays with a numbering
discriminator on the command line. This syntax is used for the
object-based specification of disk source properties.
For use with memory hotplug virQEMUBuildCommandLineJSONRecurse attempted
to format JSON arrays as bitmap on the command line. Make the formatter
function configurable so that it can be reused with different syntaxes
of arrays such as numbered arrays for use with disk sources.
This patch extracts the code and adds a parameter for the function that
will allow to plug in different formatters.
Refactor the command line generator by adding a wrapper (with
documentation) that will handle the outermost object iteration.
This patch also renames the functions and tweaks the error message for
nested arrays to be more universal.
The new function is then reused to simplify qemucommandutiltest.
As we already test that the extraction of the backing store string works
well additional tests for the backing store string parser can be made
simpler.
Export virStorageSourceNewFromBackingAbsolute and use it to parse the
backing store strings, format them using virDomainDiskSourceFormat and
match them against expected XMLs.
Nothing in the code path after the removed call has needs/uses the alias
anyway (as would be the case for command line building or talking to monitor).
The alias is VIR_FREE'd in virDomainDeviceInfoClear which is called for any
device that needs/uses an alias via virDomainDeviceDefFree or virDomainDefFree
as well as during virDomainDeviceInfoFree for host devices.
For persistent domains, the domain definition (including aliases) gets
freed a few screens later when it's replaced with newDef.
For transient domains, the definition is freed/unref'd along with the
virDomainObj a few moments later.
The address sets (pci, ccw, virtio serial) are currently cached
in qemu private data, but all the information required to recreate
these sets is in the domain definition. Therefore I am removing
the redundant data and adding a way to recalculate these sets.
Add a function that calculates the virtio serial address set
from the domain definition.
Credit goes to Cole Robinson.
When parsing a command line with USB devices that have
no address specified, QEMU automatically adds a USB hub
if the device would fill up all the available USB ports.
To help most of the users, add one hub if there are more
USB devices than available ports. For wilder configurations,
expect the user to provide us with more hubs and/or controllers.
A new type to track USB addresses.
Every <controller type='usb' index='i'/> is represented by an
object of type virDomainUSBAddressHub located at buses[i].
Each of these hubs has up to 'nports' ports.
If a port is occupied, it has the corresponding bit set in
the 'ports' bitmap, e.g. port 1 would have the 0th bit set.
If there is a hub on this port, then hubs[i] will point
to this hub.
Partially resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1301021
If the volume xml was looking to create a luks volume take the necessary
steps in order to make that happen.
The processing will be:
1. create a temporary file (virStorageBackendCreateQemuImgSecretPath)
1a. use the storage driver state dir path that uses the pool and
volume name as a base.
2. create a secret object (virStorageBackendCreateQemuImgSecretObject)
2a. use an alias combinding the volume name and "_luks0"
2b. add the file to the object
3. create/add luks options to the commandline (virQEMUBuildLuksOpts)
3a. at the very least a "key-secret=%s" using the secret object alias
3b. if found in the XML the various "cipher" and "ivgen" options
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation to tracking which USB addresses are occupied.
Introduce two helper functions for printing the port path
as a string and appending it to a virBuffer.
Currently many users of virConf APIs are defining the same
macros for calling virConfValue() and then doing type
checking. To remove this repeated code, add a set of
typesafe accessor methods.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Libxl is the last user and I don't have the toolchain prepared to
compile the libxl driver. Move it to the libxl driver to avoid having to
refactor the code.
Provide a separate method to free a logging filter object. This will come handy
once a method to create an individual logging filter object is introduced.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This is just a convenience method for discarding a list of outputs instead of
using a 'for' loop everywhere. It is safe to pass -1 as the number of elements
in the list as well as passing NULL as list reference.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Provide a separate method to free a logging output object. This will come handy
once a method to create an individual logging output object is introduced.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This patch takes the code out of
lxcContainerRenameAndEnableInterfaces() that adds all IP addresses and
IP routes to the interface, and puts it into a utility function
virNetDevIPInfoAddToDev() in virnetdevip.c so that it can be used by
anyone.
One small change in functionality -
lxcContainerRenameAndEnableInterfaces() previously would add all IP
addresses to the interface while it was still offline, then set the
interface online, and then add the routes. Because I don't want the
utility function to set the interface online, I've moved this up so
the interface is first set online, then IP addresses and routes are
added. This is the same order that the network service from
initscripts (in ifup-ether) does it, so it shouldn't pose any problem
(and hasn't, in the tests that I've run).
(This patch had been pushed earlier in commit
f1e0d0da11, but was reverted in commit
05eab47559 because it had been
accidentally pushed during the freeze for release 2.0.0)
A helper that will execute a callback on every USB device
in the domain definition.
With an ability to skip USB hubs, since we will want to treat
them differently in some cases.
This patch takes the code out of
lxcContainerRenameAndEnableInterfaces() that adds all IP addresses and
IP routes to the interface, and puts it into a utility function
virNetDevIPInfoAddToDev() in virnetdevip.c so that it can be used by
anyone.
One small change in functionality -
lxcContainerRenameAndEnableInterfaces() previously would add all IP
addresses to the interface while it was still offline, then set the
interface online, and then add the routes. Because I don't want the
utility function to set the interface online, I've moved this up so
the interface is first set online, then IP addresses and routes are
added. This is the same order that the network service from
initscripts (in ifup-ether) does it, so it shouldn't pose any problem
(and hasn't, in the tests that I've run).
There are currently two places in the domain where this combination is
used, and there is about to be another. This patch puts them together
for brevity and uniformity.
As with the newly-renamed virNetDevIPAddr and virNetDevIPRoute
objects, the new virNetDevIPInfo object will need to be accessed by a
utility function that calls low level Netlink functions (so we don't
want it to be in the conf directory) and will be called from multiple
hypervisor drivers (so it can't be in any hypervisor directory); the
most appropriate place is thus once again the util directory.
The parse and format functions are in conf/domain_conf.c because only
the domain XML (i.e. *not* the network XML) has this exact combination
of IP addresses plus routes. Note that virDomainNetIPInfoFormat() will
end up being the only caller to virDomainNetRoutesFormat() and
virDomainNetIPsFormat(), so it will just subsume those functions in a
later patch, but we can't do that until they are no longer called.
(It would have been nice to include the interface name within the
virNetDevIPInfo object (with a slight name change), but that can't
be done cleanly, because in each case the interface name is provided
in a different place in the XML relative to the routes and IP
addresses, so putting it in this object would actually make the code
more confused rather than simpler).
These functions all need to be called from a utility function that
must be located in the util directory, so we move them all into
util/virnetdevip.[ch] now that it exists.
Function and struct names were appropriately changed for the new
location, but all code is unchanged aside from motion and renaming.
This patch splits virnetdev.[ch] into multiple files, with the new
virnetdevip.[ch] containing all the functions related to setting and
retrieving IP-related info for a device (both addresses and routes).
I'm tired of mistyping this all the time, so let's do it the same all
the time (similar to how we changed all "Pci" to "PCI" awhile back).
(NB: I've left alone some things in the esx and vbox drivers because
I'm unable to compile them and they weren't obviously *not* a part of
some API. I also didn't change a couple of variables named,
e.g. "somethingIptables", because they were derived from the name of
the "iptables" command)
These had been declared in conf/device_conf.h, but then used in
util/virnetdev.c, meaning that we had to #include conf/device_conf.h
in virnetdev.c (which we have for a long time said shouldn't be done.
This caused a bigger problem when I tried to #include util/virnetdev.h
in a file in src/conf (which is allowed) - for some reason the
"device_conf.h: File not found" error.
The solution is to move the data types and functions used in util
sources from conf to util. Some names were adjusted during the move
("virInterface" --> "virNetDevIf", and "VIR_INTERFACE" -->
"VIR_NETDEV_IF")
virNetDevLinkDump should have been in virnetlink.c, but that file
didn't exist yet when the function was created. It didn't really
matter until now - I found that having virnetlink.h included by
virnetdev.h caused build problems when trying to #include virnetdev.h
in a .c file in src/conf (due to missing directory in -I). Rather than
fix that to further institutionalize the incorrect placement of this
one function, this patch moves the function.
The VIR_STORAGE_POOL_EVENT_REFRESHED constant does not
reflect any change in the lifecycle of the storage pool.
It should thus not be part of the storage pool lifecycle
event set, but rather be a top level event in its own
right. Thus we introduce VIR_STORAGE_POOL_EVENT_ID_REFRESH
to replace it.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This kvmGetMaxVCPUs() needs to be used at two different places
so move it to utils with appropriate name and mark it as private
global now.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Move to virsecret.c and rename to virSecretLookupParseSecret. Also convert
to usage xmlNodePtr and virXMLPropString rather than virXPathString.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the enum into a new src/util/virsecret.h, rename it to be
virSecretLookupType. Add a src/util/virsecret.h in order to perform
a couple of simple operations on the secret XML and virSecretLookupTypeDef
for clearing and copying.
This includes quite a bit of collateral damage, but the goal is to remove
the "virStorage*" and replace with the virSecretLookupType so that it's
easier to to add new lookups that aren't necessarily storage pool related.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This will be used for the caller that needs to specify a separator.
Currently identical to virBitmapParse.
Also change one test case to use the new function.
Basically, there are just two functions introduced here:
virDomainRedirdevDefFind which looks up given redirdev in domain
definition, and virDomainRedirdevDefRemove which removes the
device at given index in the array of devices.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While we need to know the difference between the total memory stored in
<memory> and the actual size not included in the possible memory modules
we can't pre-calculate it reliably. This is due to the fact that
libvirt's XML is copied via formatting and parsing the XML and the
initial memory size can be reliably calculated only when certain
conditions are met due to backwards compatibility.
This patch removes the storage of 'initial_memory' and fixes the helpers
to recalculate the initial memory size all the time from the total
memory size. This conversion is possible when we also make sure that
memory hotplug accounts properly for the update of the total memory size
and thus the helpers for inserting and removing memory devices need to
be tweaked too.
This fixes a bug where a cold-plug and cold-remove of a memory device
would increase the size reported in <memory> in the XML by the size of
the memory device. This would happen as the persistent definition is
copied before attaching the device and this would lead to the loss of
data in 'initial_memory'.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1344892
This option allows or disallows detection of zero-writes if it is set to
"on" or "off", respectively. It can be also set to "unmap" in which
case it will try discarding that part of image based on the value of the
"discard" option.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The virQEMUDriverConfig object contains lists of
loader:nvram pairs to advertise firmwares supported by
by the driver, and qemu_conf.c contains code to populate
the lists, all of which is useful for other drivers too.
To avoid code duplication, introduce a virFirmware object
to encapsulate firmware details and switch the qemu driver
to use it.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
In libxl driver we do virObjectRef in libxlDomainObjBeginJob,
If virCondWaitUntil failed, it goes to error, do virObjectUnref,
There's a chance that someone undefine the vm at the same time,
and refs unref to zero, vm is freed in libxlDomainObjBeginJob.
But the vm outside function is not Null, we do virObjectUnlock(vm).
That's how we overwrite the vm memory after it's freed. I fix it.
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In the 162efa1a commit the function was introduced, but the
commit forgot to update livirt_private.syms accordingly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In preparation for moving all the CPU related APIs out of
the nodeinfo file, give them a virHostCPU name prefix.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In preparation for moving all the memory related APIs out of
the nodeinfo file, give them a virHostMem name prefix.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
virCPUData, virCPUx86Feature, and virCPUx86Model all contained a pointer
to virCPUx86Data, which was not very nice since the real CPUID data were
accessible by yet another pointer from virCPUx86Data. Moreover, using
virCPUx86Data directly will make static definitions of internal CPU
features a bit easier.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Since it will not be called from outside of conf we can unexport it too
if we move it to the appropriate place.
Test suite change is necessary since the error will be reported sooner
now.
Until now we weren't able to add checks that would reject configuration
once accepted by the parser. This patch adds a new callback and
infrastructure to add such checks. In this patch all the places where
rejecting a now-invalid configuration wouldn't be a good idea are marked
with a new parser flag.
Move the module from qemu_command.c to a new module virqemu.c and
rename the API to virQEMUBuildObjectCommandline.
This API will then be shareable with qemu-img and the need to build
a security object for luks support.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This function is not used anywhere. Moreover, the code that would
use lives in virperf.c and therefore has access to the FD anyway.
Well, for instance virPerfReadEvent is doing just that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Make virDomainControllerFindUnusedIndex() a global function so that it
can be used outside domain_conf.c (as well as higher up in
domain_conf.c itself)/ Also make its DomainDef arg a const* so that
functions which only have a const* to the domain can use it.
Move the logic from qemuDomainGenerateRandomKey into this new
function, altering the comments, variable names, and error messages
to keep things more generic.
NB: Although perhaps more reasonable to add soemthing to virrandom.c.
The virrandom.c was included in the setuid_rpc_client, so I chose
placement in vircrypto.
Introduce virCryptoHaveCipher and virCryptoEncryptData to handle
performing encryption.
virCryptoHaveCipher:
Boolean function to determine whether the requested cipher algorithm
is available. It's expected this API will be called prior to
virCryptoEncryptdata. It will return true/false.
virCryptoEncryptData:
Based on the requested cipher type, call the specific encryption
API to encrypt the data.
Currently the only algorithm support is the AES 256 CBC encryption.
Adjust tests for the API's
This solution does not keep snapshots cache because vz sdk lacks good support
for snapshot related events.
Libvirt and vz sdk has different approach to snapshot ids. vz sdk always
auto generate them while libvirt has ability to specify id by user.
Thus I have no other choice rather than simply ignore ids set by user
or generated by libvirt.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
For a few cases where we handle secret information it's good to clear
the buffers containing sensitive data before freeing them.
Introduce VIR_DISPOSE, VIR_DISPOSE_N and VIR_DISPOSE_STRING that allow
simple clearing fo the buffers holding sensitive information on cleanup
paths.
Move some parts of virStorageFileRemoveLastPathComponent
into a separate function so they can be reused.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For disks sources described by a libvirt volume we don't need to do a
complicated check since virStorageTranslateDiskSourcePool already
correctly determines the actual disk type.
Replace the checks using a new accessor that does not open-code the
whole logic.
We had both and the only difference was that the latter also included
information about multifunction setting. The problem with that was that
we couldn't use functions made for only one of the structs (e.g.
parsing). To consolidate those two structs, use the one in virpci.h,
include that in domain_conf.h and add the multifunction member in it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When the domain definition describes a machine with NUMA, setting the
maximum vCPU count via the API might lead to an invalid config.
Add a check that will forbid this until we add more advanced cpu config
capabilities.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1327499
Add virDomainObjGetShortName() and use it. For now that's used in one
place, but we should expose it so that future patches can use it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Introduce the final accessor's to _virSecretObject data and move the
structure from virsecretobj.h to virsecretobj.c
The virSecretObjSetValue logic will handle setting both the secret
value and the value_size. Some slight adjustments to the error path
over what was in secretSetValue were made.
Additionally, a slight logic change in secretGetValue where we'll
check for the internalFlags and error out before checking for
and erroring out for a NULL secret->value. That way, it won't be
obvious to anyone that the secret value wasn't set rather they'll
just know they cannot get the secret value since it's private.
Move and rename the secretRewriteFile, secretSaveDef, and secretSaveValue
from secret_driver to virsecretobj
Need to make some slight adjustments since the secretSave* functions
called secretEnsureDirectory, but otherwise mostly just a move of code.
Move and rename secretDeleteSaved from secret_driver into virsecretobj and
split it up into two parts since there is error path code that looks to
just delete the secret data file
Move to secret_conf.c and rename to virSecretLoadAllConfigs. Also includes
moving/renaming the supporting virSecretLoad, virSecretLoadValue, and
virSecretLoadValidateUUID.
This patch replaces most of the guts of secret_driver.c with recently
added secret_conf.c APIs in order manage secret lists and objects
using the hashed virSecretObjList* lookup API's.
Since threadpool increments the current number of threads according to current
load, i.e. how many jobs are waiting in the queue. The count however, is
constrained by max and min limits of workers. The logic of this new API works
like this:
1) setting the minimum
a) When the limit is increased, depending on the current number of
threads, new threads are possibly spawned if the current number of
threads is less than the new minimum limit
b) Decreasing the minimum limit has no possible effect on the current
number of threads
2) setting the maximum
a) Icreasing the maximum limit has no immediate effect on the current
number of threads, it only allows the threadpool to spawn more
threads when new jobs, that would otherwise end up queued, arrive.
b) Decreasing the maximum limit may affect the current number of
threads, if the current number of threads is less than the new
maximum limit. Since there may be some ongoing time-consuming jobs
that would effectively block this API from killing any threads.
Therefore, this API is asynchronous with best-effort execution,
i.e. the necessary number of workers will be terminated once they
finish their previous job, unless other workers had already
terminated, decreasing the limit to the requested value.
3) setting priority workers
- both increase and decrease in count of these workers have an
immediate impact on the current number of workers, new ones will be
spawned or some of them get terminated respectively.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In order for the client to see all thread counts and limits, current total
and free worker count getters need to be introduced. Client might also be
interested in the job queue length, so provide a getter for that too. As with
the other getters, preparing for the admin interface, mutual exclusion is used
within all getters.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In a few places in libvirt we busy-wait for events, for example qemu
creating a monitor socket. This is problematic because:
- We need to choose a sufficiently small polling period so that
libvirt doesn't add unnecessary delays.
- We need to choose a sufficiently large polling period so that
the effect of busy-waiting doesn't affect the system.
The solution to this conflict is to use an exponential backoff.
This patch adds two functions to hide the details, and modifies a few
places where we currently busy-wait.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
There are two places in qemu_domain_address.c where we have a switch
statement to convert PCI controller models
(VIR_DOMAIN_CONTROLLER_MODEL_PCI*) into the connection type flag that
is matched when looking for an upstream connection for that model of
controller (VIR_PCI_CONNECT_TYPE_*). This patch makes a utility
function in conf/domain_addr.c to do that, so that when a new PCI
controller is added, we only need to add the new model-->connect-type
in a single place.
commit 30c61901 added new functions to libvirt_private.syms
not alpabetically sorted and erroneously added vz sources to
STATEFUL_DRIVER_SOURCE_FILES, which triggered check-aclrules
running while vz driver isn't ready for it yet.
Pushing under build-breaker rule.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
Make it possible to build vz driver as a module and don't link it with
libvirt.so statically.
Remove registering it on client's side as far as we start relying on daemon
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
Since we didn't opt to use one single event for device lifecycle for a
VM we are missing one last event if the device removal failed. This
event will be emitted once we asked to eject the device but for some
reason it is not possible.
Those are the last two places that uses the getter functions. Use a
direct access instead and remove those getters.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Removes the check for graphics type, it's not a public API and developer
know what he's doing and this check makes no sense. It also removes
the ability to allocate a new array if there is none. This was used by
the virDomainGraphicsListenAdd* functions and isn't used anymore.
This is now a simple getter with simple check for listens array presence
and whether the index in out of bounds.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This effectively removes virDomainGraphicsListenSetAddress which was
used only to change the address of listen structure and possible change
the listen type. The new function will auto-expand the listens array
and append a new listen.
The old function was used on pre-allocated array of listens and in most
cases it only "add" a new listen. The two remaining uses can access the
listen structure directly.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Commit id 'fb2bd208' essentially copied the qemuGetSecretString
creating an libxlGetSecretString. Rather than have multiple copies
of the same code, create src/secret/secret_util.{c,h} files and
place the common function in there.
Modify the the build in order to build the module as a library
which is then pulled in by both the qemu and libxl drivers for
usage from both qemu_command.c and libxl_conf.c
Using the existing virUUIDGenerateRandomBytes, move API to virrandom.c
rename it to virRandomBytes and add it to libvirt_private.syms.
This will be used as a fallback for generating a domain master key.
In some cases it's impractical to use the regular APIs as the bitmap
size needs to be pre-declared. These new APIs allow to use bitmaps that
self expand.
The new code adds a property to the bitmap to track the allocation of
memory so that VIR_RESIZE_N can be used.
This patch implement a set of interfaces for perf event. Based on
these interfaces, we can implement internal driver API for perf,
and get the results of perf conuter you care about.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Message-id: 1459171833-26416-4-git-send-email-qiaowei.ren@intel.com
This allows setting the address in host and/or network order and makes
the naming consistent. Now you don't need to call [hn]to[nh]l()
functions as that is taken care of by these functions. Also, now
the *NetOrder take the address in network order, the other functions in
host order so the naming and usage is consistent. Some places were
having the address in network order and calling ntohl() just so the
original function can call htonl() again. This makes it nicer to read.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
It's just a combination of AddImplicitControllers, and AddConsoleCompat.
Every caller that wants ImplicitControllers also wants the ConsoleCompat
AFAICT, so lump them together. We also need it for future patches.
If we expose this information, which is one byte in every PCI config
file, we let all mgmt apps know whether the device itself is an endpoint
or not so it's easier for them to decide whether such device can be
passed through into a VM (endpoint) or not (*-bridge).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1317531
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This is a missing counterpart for virSocketAddrSetIPv4Addr()
and is going to be needed later in the tests.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
These functions are going to be reused very shortly. So instead
of duplicating the code, lets move them into utils module.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Introduce a helper to check supported device and domain config and move
the memory hotplug checks to it.
The advantage of this approach is that by default all new features are
considered unsupported by all hypervisors unless specifically changed
rather than the previous approach where every hypervisor would need to
declare that a given feature is unsupported.
The VIR_DOMAIN_EVENT_ID_JOB_COMPLETED event will be triggered once a job
(such as migration) finishes and it will contain statistics for the job
as one would get by calling virDomainGetJobStats. Thanks to this event
it is now possible to get statistics of a completed migration of a
transient domain on the source host.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
While trying to build with -Os I've encountered some build
failures.
util/vircommand.c: In function 'virCommandAddEnvFormat':
util/vircommand.c:1257:1: error: inlining failed in call to 'virCommandAddEnv': call is unlikely and code size would grow [-Werror=inline]
virCommandAddEnv(virCommandPtr cmd, char *env)
^
util/vircommand.c:1308:5: error: called from here [-Werror=inline]
virCommandAddEnv(cmd, env);
^
This function is big enough for the compiler to be not inlined.
This is the error message I'm seeing:
Then virDomainNumatuneNodeSpecified is exported and called from
other places. It shouldn't be inlined then.
In file included from network/bridge_driver_platform.h:30:0,
from network/bridge_driver_platform.c:26:
network/bridge_driver_linux.c: In function 'networkRemoveRoutingFirewallRules':
./conf/network_conf.h:350:1: error: inlining failed in call to 'virNetworkDefForwardIf.constprop': call is unlikely and code size would grow [-Werror=inline]
virNetworkDefForwardIf(const virNetworkDef *def, size_t n)
^
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
qemuProcessSetupEmulator runs at a point in time where there is only
the qemu main thread. Use virCgroupAddTask to put just that one task
into the emulator cgroup. That patch makes virCgroupMoveTask and
virCgroupAddTaskStrController obsolete.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Introduce virPolkitAgentCreate and virPolkitAgentDestroy
virPolkitAgentCreate will run the polkit pkttyagent image as an asynchronous
command in order to handle the local agent authentication via stdin/stdout.
The code makes use of the pkttyagent --notify-fd mechanism to let it know
when the agent is successfully registered.
virPolkitAgentDestroy will close the command effectively reaping our
child process
Since commit 47e5b5ae virCgroupAllowDevice allows to pass -1 as either
the minor or major device number and it automatically uses '*' in place
of that. Reuse the new approach through the code and drop the duplicated
functions.
Apparently we are not the only ones with dumb free functions
because dbus_message_unref() does not accept NULL either. But if
I were to vote, this one is even more evil. Instead of returning
an error just like we do it immediately dereference any pointer
passed and thus crash you app. Well done DBus!
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f878ebda700 (LWP 31264)]
0x00007f87be4016e5 in ?? () from /usr/lib64/libdbus-1.so.3
(gdb) bt
#0 0x00007f87be4016e5 in ?? () from /usr/lib64/libdbus-1.so.3
#1 0x00007f87be3f004e in dbus_message_unref () from /usr/lib64/libdbus-1.so.3
#2 0x00007f87bf6ecf95 in virSystemdGetMachineNameByPID (pid=9849) at util/virsystemd.c:228
#3 0x00007f879761bd4d in qemuConnectCgroup (driver=0x7f87600a32a0, vm=0x7f87600c7550) at qemu/qemu_cgroup.c:909
#4 0x00007f87976386b7 in qemuProcessReconnect (opaque=0x7f87600db840) at qemu/qemu_process.c:3386
#5 0x00007f87bf6edfff in virThreadHelper (data=0x7f87600d5580) at util/virthread.c:206
#6 0x00007f87bb602334 in start_thread (arg=0x7f878ebda700) at pthread_create.c:333
#7 0x00007f87bb3481bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) frame 2
#2 0x00007f87bf6ecf95 in virSystemdGetMachineNameByPID (pid=9849) at util/virsystemd.c:228
228 dbus_message_unref(reply);
(gdb) p reply
$1 = (DBusMessage *) 0x0
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Move the logic from virDomainDiskDefDstDuplicates into
virDomainDiskDefCheckDuplicateInfo so that we don't have to loop
multiple times through the array of disks. Since the original function
was called in qemuBuildDriveDevStr, it was actually called for every
single disk which was quite wasteful.
Additionally the target uniqueness check needed to be duplicated in
the disk hotplug case, since the disk was inserted into the domain
definition after the device string was formatted and thus
virDomainDiskDefDstDuplicates didn't do anything in that case.
In the reverted commit d2e5538b1, the libxl driver was changed to copy
interface names autogenerated by libxl to the corresponding network def
in the domain's virDomainDef object. The copied name is freed when the
domain transitions to the shutoff state. But when migrating a domain,
the autogenerated name is included in the XML sent to the destination
host. It is possible an interface with the same name already exists on
the destination host, causing migration to fail.
This patch defines a new capability for setting the network device
prefix that will be used in the driver. Valid prefixes are
VIR_NET_GENERATED_PREFIX or the one announced by the driver.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Same as for deserializer, this method might get handy for admin one day.
The major reason for this patch is to stay consistent with idea, i.e.
when deserializer can be shared, why not serializer as well. The only
problem to be solved was that the daemon side serializer uses a code
snippet which handles sparse arrays returned by some APIs as well as
removes any string parameters that can't be returned to older clients.
This patch makes of the new virTypedParameterRemote datatype introduced
by one of the pvious patches.
Since the method is static to remote_driver, it can't even be used by our
daemon. Other than that, it would be useful to be able to use it with admin as
well. This patch uses the new virTypedParameterRemote datatype introduced in
one of previous patches.
Currently, the deserializer is hardcoded into remote_driver which makes
it impossible for admin to use it. One way to achieve a shared implementation
(besides moving the code to another module) would be pass @ret_params_val as a
void pointer as opposed to the remote_typed_param pointer and add a new extra
argument specifying which of those two protocols is being used and typecast
the pointer at the function entry. An example from remote_protocol:
struct remote_typed_param_value {
int type;
union {
int i;
u_int ui;
int64_t l;
uint64_t ul;
double d;
int b;
remote_nonnull_string s;
} remote_typed_param_value_u;
};
typedef struct remote_typed_param_value remote_typed_param_value;
struct remote_typed_param {
remote_nonnull_string field;
remote_typed_param_value value;
};
That would leave us with a bunch of if-then-elses that needed to be used across
the method. This patch takes the other approach using the new datatype
introduced in one of earlier commits.
A pretty nasty deadlock occurs while trying to rename a VM in parallel
with virDomainObjListNumOfDomains.
The short description of the problem is as follows:
Thread #1:
qemuDomainRename:
------> aquires domain lock by qemuDomObjFromDomain
---------> waits for domain list lock in any of the listed functions:
- virDomainObjListFindByName
- virDomainObjListRenameAddNew
- virDomainObjListRenameRemove
Thread #2:
virDomainObjListNumOfDomains:
------> aquires domain list lock
---------> waits for domain lock in virDomainObjListCount
Introduce generic virDomainObjListRename function for renaming domains.
It aquires list lock in right order to avoid deadlock. Callback is used
to make driver specific domain updates.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In some cases it may be better to have a bitmap representing state of
individual vcpus rather than iterating the definition. The new helper
creates a bitmap representing the state from the domain definition.
The name is confusing, and there are just two uses: one is a test case,
and the other will be removed as part of an upcoming refactoring of
the hostdev code.
This patch creates two bitmaps, one for macvlan device names and one
for macvtap. The bitmap position is used to indicate that libvirt is
currently using a device with the name macvtap%d/macvlan%d, where %d
is the position in the bitmap. When requested to create a new
macvtap/macvlan device, libvirt will now look for the first clear bit
in the appropriate bitmap and derive the device name from that rather
than just starting at 0 and counting up until one works.
When libvirtd is restarted, the qemu driver code that reattaches to
active domains calls the appropriate function to "re-reserve" the
device names as it is scanning the status of running domains.
Note that it may seem strange that the retry counter now starts at
8191 instead of 5. This is because we now don't do a "pre-check" for
the existence of a device once we've reserved it in the bitmap - we
move straight to creating it; although very unlikely, it's possible
that someone has a running system where they have a large number of
network devices *created outside libvirt* named "macvtap%d" or
"macvlan%d" - such a setup would still allow creating more devices
with the old code, while a low retry max in the new code would cause a
failure. Since the objective of the retry max is just to prevent an
infinite loop, and it's highly unlikely to do more than 1 iteration
anyway, having a high max is a reasonable concession in order to
prevent lots of new failures.
On the host when we start a container, it will be
placed in a cgroup path of
/machine.slice/machine-lxc\x2ddemo.scope
under /sys/fs/cgroup/*
Inside the containers' namespace we need to setup
/sys/fs/cgroup mounts, and currently will bind
mount /machine.slice/machine-lxc\x2ddemo.scope on
the host to appear as / in the container.
While this may sound nice, it confuses applications
dealing with cgroups, because /proc/$PID/cgroup
now does not match the directory in /sys/fs/cgroup
This particularly causes problems for systems and
will make it create repeated path components in
the cgroup for apps run in the container eg
/machine.slice/machine-lxc\x2ddemo.scope/machine.slice/machine-lxc\x2ddemo.scope/user.slice/user-0.slice/session-61.scope
This also causes any systemd service that uses
sd-notify to fail to start, because when systemd
receives the notification it won't be able to
identify the corresponding unit it came from.
In particular this break rabbitmq-server startup
Future kernels will provide proper cgroup namespacing
which will handle this problem, but until that time
we should not try to play games with hiding parent
cgroups.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The VIR_DOMAIN_EVENT_ID_MIGRATION_ITERATION event will be triggered
whenever VIR_DOMAIN_JOB_MEMORY_ITERATION changes its value, i.e.,
whenever a new iteration over guest memory pages is started during
migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This new function will add a single controller of the given model,
except the case of ich9-usb-ehci1 (the master controller for a USB2
controller set) in which case a set of related controllers will be
added (EHCI1, UHCI1, UHCI2, UHCI3). These controllers will not be
given PCI addresses, but should be otherwise ready to use.
"-1" is allowed for controller model, and means "default for this
machinetype". This matches the existing practice in
qemuDomainDefPostParse(), which always adds the default controller
with model = -1, and relies on the commandline builder to set a model
(that is wrong, but will be fixed later).
This replaces the virPCIKnownStubs string array that was used
internally for stub driver validation.
Advantages:
* possible values are well-defined
* typos in driver names will be detected at compile time
* avoids having several copies of the same string around
* no error checking required when setting / getting value
The names used mirror those in the
virDomainHostdevSubsysPCIBackendType enumeration.
We only support hotplugging SCSI controllers.
The USB and virtio-serial related code was never reachable because
this function was only called for VIR_DOMAIN_CONTROLLER_TYPE_SCSI
controllers.
This reverts commit ee0d97a and parts of commits 16db8d2
and d6d54cd1.
This function can be used to retrieve the current locked memory
limit for a process, so that the setting can be later restored.
Add a configure check for getrlimit(), which we now use.
On the very first log message we send to any output, we include
the libvirt version number and package string. In some bug reports
we have been given libvirtd.log files that came from a different
host than the corresponding /var/log/libvirt/qemu log files. So
extend the initial log message to include the hostname too.
eg on first log message we would now see:
$ libvirtd
2015-12-04 17:35:36.610+0000: 20917: info : libvirt version: 1.3.0
2015-12-04 17:35:36.610+0000: 20917: info : hostname: dhcp-1-180.lcy.redhat.com
2015-12-04 17:35:36.610+0000: 20917: error : qemuMonitorIO:687 : internal error: End of file from monitor
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Once more stuff will be moved into the vCPU data structure it will be
necessary to get a specific one in some ocasions. Add a helper that will
simplify this task.
Our domain_conf.* files are big enough. Not only they contain XML
parsing code, but they served as a storage of all functions whose
name is virDomain prefixed. This is just wrong as it gathers not
related functions (and modules) into one big file which is then
harder to maintain. Split virDomainObjList module into a separate
file called virdomainobjlist.[ch].
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As we need to provide support for URI aliases in libvirt-admin as well, URI
alias matching needs to be internally visible. Since
virConnectOpenResolveURIAlias does have a compatible signature, it could be
easily reused by libvirt-admin. This patch moves URI alias matching to util,
renaming it accordingly.
virConnectGetConfig and virConnectGetConfigPath were static libvirt
methods, merely because there hasn't been any need for having them
internally exported yet. Since libvirt-admin also needs to reference
its config file, 'xGetConfig' should be exported.
Besides moving, this patch also renames the methods accordingly,
as they are libvirt config specific.
Add the virLogManager API which allows for communication with
the virtlogd daemon to RPC program. This provides the client
side API to open log files for guest domains.
The virtlogd daemon is setup to auto-spawn on first use when
running unprivileged. For privileged usage, systemd socket
activation is used instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add virRotatingFileReader and virRotatingFileWriter objects
which allow reading & writing from/to files with automation
rotation to N backup files when a size limit is reached. This
is useful for guest logging when a guaranteed finite size
limit is required. Use of external tools like logrotate is
inadequate since it leaves the possibility for guest to DOS
the host in between invokations of logrotate.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce a new helper function "virDiskNameParse" which extends
virDiskNameToIndex but handling both disk index and partition index.
Also rework virDiskNameToIndex to be based on virDiskNameParse.
A test is also added for this function testing both valid and
invalid disk names.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
commit db488c79 assumed that dnsmasq would complete IPv6 DAD before
daemonizing, but in reality it doesn't wait, which creates problems
when libvirt's bridge driver sets the matching "dummy tap device" to
IFF_DOWN prior to DAD completing.
This patch waits for DAD completion by periodically polling the kernel
using netlink to check whether there are any IPv6 addresses assigned
to bridge which have a 'tentative' state (if there are any in this
state, then DAD hasn't yet finished). After DAD is finished, execution
continues. To avoid an endless hang in case something was wrong with
the kernel's DAD, we wait a maximum of 5 seconds.
Let's check to ensure we can find the Partition Table in the label
and that libvirt actually recognizes that type; otherwise, when we
go to read the partitions during a refresh operation we may not be
reading what we expect.
This will expand upon the types of errors or reason that a build
would fail, so we can create more direct error messages.
Add 'initial_memory' member to struct virDomainMemtune so that the
memory size can be pre-calculated once instead of inferring it always
again and again.
Separating of the fields will also allow finer granularity of decisions
in later patches where it will allow to keep the old initial memory
value in cases where we are handling incomming migration from older
versions that did not always update the size from NUMA as the code did
previously.
The change also requires modification of the qemu memory alignment
function since at the point where we are modifying the size of NUMA
nodes the total size needs to be recalculated too.
The refactoring done in this patch also fixes a crash in the hyperv
driver that did not properly initialize def->numa and thus
virDomainNumaGetMemorySize(def->numa) crashed.
In summary this patch should have no functional impact at this point.
Similar to commit id '35847860', it's possible to attempt to create
a 'netfs' directory in an NFS root-squash environment which will cause
the 'vol-delete' command to fail. It's also possible error paths from
the 'vol-create' would result in an error to remove a created directory
if the permissions were incorrect (and disallowed root access).
Thus rename the virFileUnlink to be virFileRemove to match the C API
functionality, adjust the code to following using rmdir or unlink
depending on the path type, and then use/call it for the VIR_STORAGE_VOL_DIR
I always felt like this function is qemu specific rather than
libvirt-wide. Other drivers may act differently on virDomainDef
change and in fact may require talking to underlying hypervisor
even if something else's than disk->src has changed. I know that
the function is still incomplete, but lets break that into two
commits that are easier to review. This one is pure code
movement.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
These functions were made static as a part of commit cbfe38c since
they were no longer called from outside virnetdev.c. We once again
need to call them from another file, so this patch makes them once
again public.
In an NFS root-squashed environment the 'vol-delete' command will fail to
'unlink' the target volume since it was created under a different uid:gid.
This code continues the concepts introduced in virFileOpenForked and
virDirCreate[NoFork] with respect to running the unlink command under
the uid/gid of the child. Unlike the other two, don't retry on EACCES
(that's why we're here doing this now).
That function takes string list and returns first string in that list
that starts with the @prefix parameter with that prefix being skipped as
the caller knows what it starts with (also for easier manipulation in
future).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In order to share as much virsh' logic as possible with upcomming
virt-admin client we need to split virsh logic into virsh specific and
client generic features.
Since majority of virsh methods should be generic enough to be used by
other clients, it's much easier to rename virsh specific data to virshX
than doing this vice versa. It moved generic virsh commands (including info
and opts structures) to generic module vsh.c.
Besides renaming methods and structures, this patch also involves introduction
of a client specific control structure being referenced as private data in the
original control structure, introduction of a new global vsh Initializer,
which currently doesn't do much, but there is a potential for added
functionality in the future.
Lastly it introduced client hooks which are especially necessary during
client connecting phase.
We just need to update the entry in the second hash table. Since commit 8728a56
we have two hash tables for the domain list so that we can do O(1) lookup
regardless of looking up by UUID or name. Since with renaming a domain UUID does
not change, we only need to update the second hash table, where domains are
referenced by their name.
We will call both functions from the qemuDomainRename().
Signed-off-by: Tomas Meszaros <exo@tty.sk>
This new subelement is used in PCI controllers: the toplevel
*attribute* "model" of a controller denotes what kind of PCI
controller is being described, e.g. a "dmi-to-pci-bridge",
"pci-bridge", or "pci-root". But in the future there will be different
implementations of some of those types of PCI controllers, which
behave similarly from libvirt's point of view (and so should have the
same model), but use a different device in qemu (and present
themselves as a different piece of hardware in the guest). In an ideal
world we (i.e. "I") would have thought of that back when the pci
controllers were added, and used some sort of type/class/model
notation (where class was used in the way we are now using model, and
model was used for the actual manufacturer's model number of a
particular family of PCI controller), but that opportunity is long
past, so as an alternative, this patch allows selecting a particular
implementation of a pci controller with the "name" attribute of the
<model> subelement, e.g.:
<controller type='pci' model='dmi-to-pci-bridge' index='1'>
<model name='i82801b11-bridge'/>
</controller>
In this case, "dmi-to-pci-bridge" is the kind of controller (one that
has a single PCIe port upstream, and 32 standard PCI ports downstream,
which are not hotpluggable), and the qemu device to be used to
implement this kind of controller is named "i82801b11-bridge".
Implementing the above now will allow us in the future to add a new
kind of dmi-to-pci-bridge that doesn't use qemu's i82801b11-bridge
device, but instead uses something else (which doesn't yet exist, but
qemu people have been discussing it), all without breaking existing
configs.
(note that for the existing "pci-bridge" type of PCI controller, both
the model attribute and <model> name are 'pci-bridge'. This is just a
coincidence, since it turns out that in this case the device name in
qemu really is a generic 'pci-bridge' rather than being the name of
some real-world chip)
This function should return the greatest CPU number set in
/domain/cpu/numa/cell/@cpus. The idea is that we should compare
the returned value against /domain/vcpu value. Yes, there exist
users who think the following is a good idea:
<vcpu placement='static'>4</vcpu>
<cpu mode='host-model'>
<model fallback='allow'/>
<numa>
<cell id='0' cpus='0-1' memory='1048576' unit='KiB'/>
<cell id='1' cpus='9-10' memory='2097152' unit='KiB'/>
</numa>
</cpu>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Qemu reports physical size 0 for block devices. As 15fa84acbb
changed the behavior of qemuDomainGetBlockInfo to just query the monitor
this created a regression since we didn't report the size correctly any
more.
This patch adds code to refresh the physical size of a block device by
opening it and seeking to the end and uses it both in
qemuDomainGetBlockInfo and also in qemuDomainGetStatsOneBlock that was
broken since it was introduced in this respect.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1250982
The nodeinfo is reporting incorrect number of cpus and incorrect host
topology on PPC64 KVM hosts. The KVM hypervisor on PPC64 needs only
the primary thread in a core to be online, and the secondaries offlined.
While scheduling a guest in, the kvm scheduler wakes up the secondaries to
run in guest context.
The host scheduling of the guests happen at the core level(as only primary
thread is online). The kvm scheduler exploits as many threads of the core
as needed by guest. Further, starting POWER8, the processor allows splitting
a physical core into multiple subcores with 2 or 4 threads each. Again, only
the primary thread in a subcore is online in the host. The KVM-PPC
scheduler allows guests to exploit all the offline threads in the subcore,
by bringing them online when needed.
(Kernel patches on split-core http://www.spinics.net/lists/kvm-ppc/msg09121.html)
Recently with dynamic micro-threading changes in ppc-kvm, makes sure
to utilize all the offline cpus across guests, and across guests with
different cpu topologies.
(https://www.mail-archive.com/kvm@vger.kernel.org/msg115978.html)
Since the offline cpus are brought online in the guest context, it is safe
to count them as online. Nodeinfo today discounts these offline cpus from
cpu count/topology calclulation, and the nodeinfo output is not of any help
and the host appears overcommited when it is actually not.
The patch carefully counts those offline threads whose primary threads are
online. The host topology displayed by the nodeinfo is also fixed when the
host is in valid kvm state.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
The new name makes it clear that the returned bitmap contains the
information about which CPUs are online, not eg. which CPUs are
present.
No behavioral change.
If one calls update-device with information that is not updatable,
libvirt reports success even though no data were updated. The example
used in the bug linked below uses updating device with <boot order='2'/>
which, in my opinion, is a valid thing to request from user's
perspective. Mainly since we properly error out if user wants to update
such data on a network device for example.
And since there are many things that might happen (update-device on disk
basically knows just how to change removable media), check for what's
changing and moreover, since the function might be usable in other
drivers (updating only disk path is a valid possibility) let's abstract
it for any two disks.
We can't possibly check for everything since for many fields our code
does not properly differentiate between default and unspecified values.
Even though this could be changed, I don't feel like it's worth the
complexity so it's not the aim of this patch.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1007228
This is a self-locking wrapper around virHashTable. Only a limited set
of APIs are implemented now (the ones which are used in the following
patch) as more can be added on demand.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
While working in qemu_monitor_json, I repeatedly found myself
getting a value then checking if it was an object. Add some
wrappers to make this task easier.
* src/util/virjson.c (virJSONValueObjectGetByType)
(virJSONValueObjectGetObject, virJSONValueObjectGetArray): New
functions.
(virJSONValueObjectGetString, virJSONValueObjectGetNumberInt)
(virJSONValueObjectGetNumberUint)
(virJSONValueObjectGetNumberLong)
(virJSONValueObjectGetNumberUlong)
(virJSONValueObjectGetNumberDouble)
(virJSONValueObjectGetBoolean): Simplify.
(virJSONValueIsNull): Change return type.
* src/util/virjson.h: Reflect changes.
* src/libvirt_private.syms (virjson.h): Export them.
* tests/jsontest.c (testJSONLookup): New test.
Signed-off-by: Eric Blake <eblake@redhat.com>
The wrapper is useful for calling qemuBlockJobEventProcess with the
event details stored in disk's privateData, which is the most likely
usage of qemuBlockJobEventProcess.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Complex jobs, such as migration, need to monitor several events at once,
which is impossible when each of the event uses its own condition
variable. This patch adds a single condition variable to each domain
object. This variable can be used instead of the other event specific
conditions.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Add multikey API:
* virTypedParamsFilter that filters all the parameters with specified name.
* virTypedParamsGetStringList that returns a list with all the values for
specified name and string type.
Signed-off-by: Pavel Boldin <pboldin@mirantis.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
virDomainObjGetOneDef will help to retrieve the correct definition
pointer from @vm in cases where VIR_DOMAIN_AFFECT_LIVE and
VIR_DOMAIN_AFFECT_CONFIG are mutually exclusive. The function simply
returns the correct pointer. This similarly to virDomainObjGetDefs will
greatly simplify the code.
https://bugzilla.redhat.com/show_bug.cgi?id=1220527
This type of information defines attributes of a system
baseboard. With one exception: board type is yet not implemented
in qemu so it's not introduced here either.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Move all the system_* fields into a separate struct. Not only this
simplifies the code a bit it also helps us to identify whether BIOS
info is present. We don't have to check all the four variables for
being not-NULL, but we can just check the pointer to the struct.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Move all the bios_* fields into a separate struct. Not only this
simplifies the code a bit it also helps us to identify whether BIOS
info is present. We don't have to check all the four variables for
being not-NULL, but we can just check the pointer to the struct.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
virDomainLiveConfigHelperMethod that is used for this job now does
modify the flags but still requires the callers to extract the correct
definition objects.
In addition coverity and other static analyzers are usually unhappy as
they don't grasp the fact that @flags are upadted according to the
correct def to be present.
To work this issue around and simplify the calling chain let's add a new
helper that will work only on drivers that always copy the persistent
def to a transient at start of a vm. This will allow to drop a few
arguments. The new function syntax will also fill two definition
pointers rather than modifying the @flags parameter.
Since some functions can be optimized by reusing the buffers that they
already have instead of allocating and copying new ones, lets split
virBitmapToData to two functions where one only converts the data and
the second one is a wrapper that allocates the buffer if necessary.
Store the emulator pinning cpu mask as a pure virBitmap rather than the
virDomainPinDef since it stores only the bitmap and refactor
qemuDomainPinEmulator to do the same operations in a much saner way.
As a side effect virDomainEmulatorPinAdd and virDomainEmulatorPinDel can
be removed since they don't add any value.
Sometimes the only thing we need is the pointer to virDomainDiskDef and
having to call virDomainDiskIndexBy* APIs, storing the disk index, and
looking it up in the disks array is ugly. After this patch, we can just
call virDomainDiskBy* and get the pointer in one step.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Two new domain configuration XML elements are added to enable/disable
the protected key management operations for a guest:
<domain>
...
<keywrap>
<cipher name='aes|dea' state='on|off'/>
</keywrap>
...
</domain>
Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Because there are multiple potential reasons for an error, this
function logs any errors before returning NULL (since the caller won't
have the information needed to determine which was the reason for
failure).
The APIs take the memory value in KiB and we store it in KiB
internally, but we cannot parse the whole ULONG_MAX range
on 64-bit systems, because virDomainParseScaledValue
needs to fit the value in bytes in an unsigned long long.
https://bugzilla.redhat.com/show_bug.cgi?id=1176739
Until now the virDomainListAllDomains API would lock the domain list and
then every single domain object to access and filter it. This would
potentially allow a unresponsive VM to block the whole daemon if a
*listAllDomains call would get stuck.
To avoid this problem this patch collects a list of referenced domain
objects first from the list and then unlocks it right away. The
expensive operation requiring locking of the domain object is executed
after the list lock is dropped. While a single blocked domain will still
lock up a listAllDomains call, the domain list won't be held locked and
thus other APIs won't be blocked.
Additionally this patch also fixes the lookup code, where we'd ignore
the vm->removing flag and thus potentially return domain objects that
would be deleted very soon so calling any API wouldn't make sense.
As other clients also could benefit from operating on a list of domain
objects rather than the public domain descriptors a new intermediate
API - virDomainObjListCollect - is introduced by this patch.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1181074
Extend it to a universal helper used for clearing lists of any objects.
Note that the argument type is specifically void * to allow implicit
typecasting.
Additionally add a helper that works on non-NULL terminated arrays once
we know the length.
We already check that any auto-assigned bridge device name for a
virtual network (e.g. "virbr1") doesn't conflict with the bridge name
for any existing libvirt network (via virNetworkSetBridgeName() in
conf/network_conf.c).
We also want to check that the name doesn't conflict with any bridge
device created on the host system outside the control of libvirt
(history: possibly due to the ploriferation of references to libvirt's
bridge devices in HOWTO documents all around the web, it is not
uncommon for an admin to manually create a bridge in their host's
system network config and name it "virbrX"). To add such a check to
virNetworkBridgeInUse() (which is called by virNetworkSetBridgeName())
we would have to call virNetDevExists() (from util/virnetdev.c); this
function calls ioctl(SIOCGIFFLAGS), which everyone on the mailing list
agreed should not be done from an XML parsing function in the conf
directory.
To remedy that problem, this patch removes virNetworkSetBridgeName()
from conf/network_conf.c and puts an identically functioning
networkBridgeNameValidate() in network/bridge_driver.c (because it's
reasonable for the bridge driver to call virNetDevExists(), although
we don't do that yet because I wanted this patch to have as close to 0
effect on function as possible).
There are a couple of inevitable changes though:
1) We no longer check the bridge name during
virNetworkLoadConfig(). Close examination of the code shows that
this wasn't necessary anyway - the only *correct* way to get XML
into the config files is via networkDefine(), and networkDefine()
will always call networkValidate(), which previously called
virNetworkSetBridgeName() (and now calls
networkBridgeNameValidate()). This means that the only way the
bridge name can be unset during virNetworkLoadConfig() is if
someone edited the config file on disk by hand (which we explicitly
prohibit).
2) Just on the off chance that somebody *has* edited the file by hand,
rather than crashing when they try to start their malformed
network, a check for non-NULL bridge name has been added to
networkStartNetworkVirtual().
(For those wondering why I don't instead call
networkValidateBridgeName() there to set a bridge name if one
wasn't present - the problem is that during
networkStartNetworkVirtual(), the lock for the network being
started has already been acquired, but the lock for the network
list itself *has not* (because we aren't adding/removing a
network). But virNetworkBridgeInuse() iterates through *all*
networks (including this one) and locks each network as it is
checked for a duplicate entry; it is necessary to lock each network
even before checking if it is the designated "skip" network because
otherwise some other thread might acquire the list lock and delete
the very entry we're examining. In the end, permitting a setting of
the bridge name during network start would require that we lock the
entire network list during any networkStartNetwork(), which
eliminates a *lot* of parallelism that we've worked so hard to
achieve (it can make a huge difference during libvirtd startup). So
rather than try to adjust for someone playing against the rules, I
choose to instead give them the error they deserve.)
3) virNetworkAllocateBridge() (now removed) would leak any "template"
string set as the bridge name. Its replacement
networkFindUnusedBridgeName() doesn't leak the template string - it
is properly freed.
Add qemuDomainAddIOThread and qemuDomainDelIOThread in order to add or
remove an IOThread to/from the host either for live or config optoins
The implementation for the 'live' option will use the iothreadpids list
in order to make decision, while the 'config' option will use the
iothreadids list. Additionally, for deletion each may have to adjust
the iothreadpin list.
IOThreads are implemented by qmp objects, the code makes use of the existing
qemuMonitorAddObject or qemuMonitorDelObject APIs.
Signed-off-by: John Ferlan <jferlan@redhat.com>
We're about to allow IOThreads to be deleted, but an iothreadid may be
included in some domain thread sched, so add a new API to allow removing
an iothread from some entry.
Then during the writing of the threadsched data and an additional check
to determine whether the bitmap is all clear before writing it out.
Since it's only ever referenced in domain_conf.c, make the function
static, but also will need to move it to somewhere before it's referenced
rather than forward referencing it.
Adding a new XML element 'iothreadids' in order to allow defining
specific IOThread ID's rather than relying on the algorithm to assign
IOThread ID's starting at 1 and incrementing to iothreads count.
This will allow future patches to be able to add new IOThreads by
a specific iothread_id and of course delete any exisiting IOThread.
Each iothreadids element will have 'n' <iothread> children elements
which will have attribute "id". The "id" will allow for definition
of any "valid" (eg > 0) iothread_id value.
On input, if any <iothreadids> <iothread>'s are provided, they will
be marked so that we only print out what we read in.
On input, if no <iothreadids> are provided, the PostParse code will
self generate a list of ID's starting at 1 and going to the number
of iothreads defined for the domain (just like the current algorithm
numbering scheme). A future patch will rework the existing algorithm
to make use of the iothreadids list.
On output, only print out the <iothreadids> if they were read in.
This is basically turning qemuDomObjEndAPI into a more general
function. Other drivers which gets a reference to domain objects may
benefit from this function too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1113474
When we set the MAC address of a network device as a part of setting
up macvtap "passthrough" mode (where the domain has an emulated netdev
connected to a host macvtap device that has exclusive use of the
physical device, and sets the device MAC address to match its own,
i.e. "<interface type='direct'> <source mode='passthrough' .../>"), we
use ioctl(SIOCSIFHWADDR) giving it the name of that device. This is
true even if it is an SRIOV Virtual Function (VF).
But, when we are setting the MAC address / vlan ID of a VF in
preparation for "hostdev network" passthrough (this is where we set
the MAC address and vlan id of the VF after detaching the host net
driver and before assigning the device to the domain with PCI
passthrough, i.e. "<interface type='hostdev'>", we do the setting via
a netlink RTM_SETLINK message for that VF's Physical Function (PF),
telling it the VF# we want to change. This sets an "administratively
changed MAC" flag for that VF in the PF's driver, and from that point
on (until the PF driver is reloaded, *not* merely the VF driver) that
VF's MAC address can't be changed using ioctl(SIOCSIFHWADDR) - the
only way to change it is via the PF with RTM_SETLINK.
This means that if a VF is used for hostdev passthrough, it will have
the admin flag set, and future attempts to use that VF for macvtap
passthrough will fail.
The solution to this problem is to check if the device being used for
macvtap passthrough is actually a VF; if so, we use the netlink
RTM_SETLINK message to the PF to set the VF's mac address instead of
ioctl(SIOCSIFHWADDR) directly to the VF; if not, behavior does not
change from previously.
There are three pieces to making this work:
1) virNetDevMacVLan(Create|Delete)WithVPortProfile() now call
virNetDev(Replace|Restore)NetConfig() rather than
virNetDev(Replace|Restore)MacAddress() (simply passing -1 for VF#
and vlanid).
2) virNetDev(Replace|Restore)NetConfig() check to see if the device is
a VF. If so, they find the PF's name and VF#, allowing them to call
virNetDev(Replace|Restore)VfConfig().
3) To prevent mixups when detaching a macvtap passthrough device that
had been attached while running an older version of libvirt,
virNetDevRestoreVfConfig() is potentially given the preserved name
of the VF, and if the proper statefile for a VF can't be found in
the stateDir (${stateDir}/${pfname}_vf${vfid}),
virNetDevRestoreMacAddress() is called instead (which will look in
the file named ${stateDir}/${vfname}).
This problem has existed in every version of libvirt that has both
macvtap passthrough and interface type='hostdev'. Fortunately people
seem to use one or the other though, so it hasn't caused any real
world problem reports.