This controller can be connected (at domain startup time only - not
hotpluggable) only to a port on the pcie root complex ("pcie-root" in
libvirt config), hence the new connect type
VIR_PCI_CONNECT_TYPE_PCIE_ROOT. It provides a hotpluggable port that
will accept any PCI or PCIe device.
New attributes must be added to the controller <target> subelement for
this - chassis and port are guest-visible option values that will be
set by libvirt with values derived from the controller's index and pci
address information.
There are some configuration options to some types of pci controllers
that are currently automatically derived from other parts of the
controller's configuration. For example, in qemu a pci-bridge
controller has an option that is called "chassis_nr"; up until now
libvirt has always set chassis_nr to the index of the pci-bridge. So
this:
<controller type='pci' model='pci-bridge' index='2'/>
will always result in:
-device pci-bridge,chassis_nr=2,...
on the qemu commandline. In the future we may decide there is a better
way to derive that option, but even in that case we will need for
existing domains to retain the same chassis_nr they were using in the
past - that is something that is visible to the guest so it is part of
the guest ABI and changing it would lead to problems for migrating
guests (or just guests with very picky OSes).
The <target> subelement has been added as a place to put the new
"chassisNr" attribute that will be filled in by libvirt when it
auto-generates the chassisNr; it will be saved in the config, then
reused any time the domain is started:
<controller type='pci' model='pci-bridge' index='2'>
<model type='pci-bridge'/>
<target chassisNr='2'/>
</controller>
The one oddity of all this is that if the controller configuration
is changed (for example to change the index or the pci address
where the controller is plugged in), the items in <target> will
*not* be re-generated, which might lead to conflict. I can't
really see any way around this, but fortunately if there is a
material conflict qemu will let us know and we will pass that on
to the user.
This new subelement is used in PCI controllers: the toplevel
*attribute* "model" of a controller denotes what kind of PCI
controller is being described, e.g. a "dmi-to-pci-bridge",
"pci-bridge", or "pci-root". But in the future there will be different
implementations of some of those types of PCI controllers, which
behave similarly from libvirt's point of view (and so should have the
same model), but use a different device in qemu (and present
themselves as a different piece of hardware in the guest). In an ideal
world we (i.e. "I") would have thought of that back when the pci
controllers were added, and used some sort of type/class/model
notation (where class was used in the way we are now using model, and
model was used for the actual manufacturer's model number of a
particular family of PCI controller), but that opportunity is long
past, so as an alternative, this patch allows selecting a particular
implementation of a pci controller with the "name" attribute of the
<model> subelement, e.g.:
<controller type='pci' model='dmi-to-pci-bridge' index='1'>
<model name='i82801b11-bridge'/>
</controller>
In this case, "dmi-to-pci-bridge" is the kind of controller (one that
has a single PCIe port upstream, and 32 standard PCI ports downstream,
which are not hotpluggable), and the qemu device to be used to
implement this kind of controller is named "i82801b11-bridge".
Implementing the above now will allow us in the future to add a new
kind of dmi-to-pci-bridge that doesn't use qemu's i82801b11-bridge
device, but instead uses something else (which doesn't yet exist, but
qemu people have been discussing it), all without breaking existing
configs.
(note that for the existing "pci-bridge" type of PCI controller, both
the model attribute and <model> name are 'pci-bridge'. This is just a
coincidence, since it turns out that in this case the device name in
qemu really is a generic 'pci-bridge' rather than being the name of
some real-world chip)
If a pci address had a function number out of range, the error message
would be:
Insufficient specification for PCI address
which is logged by virDevicePCIAddressParseXML() after
virDevicePCIAddressIsValid returns a failure.
This patch enhances virDevicePCIAddressIsValid() to optionally report
the error itself (since it is the place that decides which part of the
address is "invalid"), and uses that feature when calling from
virDevicePCIAddressParseXML(), so that the error will be more useful,
e.g.:
Invalid PCI address function=0x8, must be <= 7
Previously, virDevicePCIAddressIsValid didn't check for the
theoretical limits of domain or bus, only for slot or function. While
adding log messages, we also correct that ommission. (The RNG for PCI
addresses already enforces this limit, which by the way means that we
can't add any negative tests for this - as far as I know our
domainschematest has no provisions for passing XML that is supposed to
fail).
Note that virDevicePCIAddressIsValid() can only check against the
absolute maximum attribute values for *any* possible PCI controller,
not for the actual maximums of the specific controller that this
device is attaching to; fortunately there is later more specific
validation for guest-side PCI addresses when building the set of
assigned PCI addresses. For host-side PCI addresses (e.g. for
<hostdev> and for network device pools), we rely on the error that
will be logged when it is found that the device doesn't actually
exist.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1004596
https://bugzilla.redhat.com/show_bug.cgi?id=1176020
Some users think this is a good idea:
<vcpu placement='static'>4</vcpu>
<cpu mode='host-model'>
<model fallback='allow'/>
<numa>
<cell id='0' cpus='0-1' memory='1048576' unit='KiB'/>
<cell id='1' cpus='9-10' memory='2097152' unit='KiB'/>
</numa>
</cpu>
It's not. Lets therefore introduce a check and discourage them in
doing so.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This function should return the greatest CPU number set in
/domain/cpu/numa/cell/@cpus. The idea is that we should compare
the returned value against /domain/vcpu value. Yes, there exist
users who think the following is a good idea:
<vcpu placement='static'>4</vcpu>
<cpu mode='host-model'>
<model fallback='allow'/>
<numa>
<cell id='0' cpus='0-1' memory='1048576' unit='KiB'/>
<cell id='1' cpus='9-10' memory='2097152' unit='KiB'/>
</numa>
</cpu>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The recent changes to perform SCSI device address checks during the
post parse callbacks ran afoul of the Coverity checker since the changes
assumed that the 'xmlopt' parameter to virDomainDeviceDefPostParse
would be non NULL (commit id 'ca2cf74e87'); however, what was missed
is there was an "if (xmlopt &&" check being made, so Coverity believed
that it could be possible for a NULL 'xmlopt'.
Checking the various calling paths seemingly disproves that. If called
from virDomainDeviceDefParse, there were two other possible calls that
would end up dereffing, so that path could not be NULL. If called via
virDomainDefPostParseDeviceIterator via virDomainDefPostParse there
are two callers (virDomainDefParseXML and qemuParseCommandLine)
which deref xmlopt either directly or through another call.
So I'm removing the check for non-NULL xmlopt.
Rather than provide a somewhat generic error message when the API
returns false, allow the caller to supply a "report = true" option
in order to cause virReportError's to describe which of the 3 paths
that can cause failure.
Some callers don't care about what caused the failure, they just want
to have a true/false - for those, calling with report = false should
be sufficient.
Rather than calling virDomainDiskDefAssignAddress during the parsing of
the XML, moving the setting of disk addresses into the domain/device post
processing.
Commit id '37588b25' which introduced VIR_DOMAIN_DEF_PARSE_DISK_SOURCE
in order to avoid generating the address which wasn't required will not
be affected by this as all it cared about was processing the source XML.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than calling virDomainHostdevAssignAddress during the parsing
of the XML, move the setting of a default hostdev address to domain/
device post processing.
Since the parse code no longer generates an address, we can remove
the virDomainDefMaybeAddHostdevSCSIcontroller since the call to
virDomainHostdevAssignAddress will attempt to add the controllers
that were not already defined in the XML.
This patch will also enforce that the address type is type 'drive'
when a SCSI subsystem <hostdev> element is provided with an <address>.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If virDomainControllerSCSINextUnit failed to find a slot on the current
VIR_DOMAIN_CONTROLLER_TYPE_SCSI controller(s), try to add a new controller;
otherwise, there may be multiple unit=0 entries for the same "next"
controller.
Signed-off-by: John Ferlan <jferlan@redhat.com>
While searching the hostdevs the drive type can be either *_TYPE_DRIVE
or *_TYPE_NONE. If the type is _TYPE_NONE on the first scsi_host, then
there is an erroneous "match" that the address already exists.
Although this works by chance currently because hostdev's are added one
at a time and 'nhostdevs' would be zero, thus returning false for the
first hostdev added, a future patch will move the hostdev address
assignment into post processing resulting in the bad match.
This code is only called by path's expecting either drive or none.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add the xmlopt parameter that was saved during virDomainDefPostParse
to the parameters. A future patch will use it.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Modify virDomainDriveAddressIsUsedBy{Disk|Hostdev} and
virDomainSCSIDriveAddressIsUsed to take 'bus' and 'target'
parameters. Will be used by future patches for more complete
address conflict checks
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since the only way virDomainHostdevAssignAddress can be called is from
within virDomainHostdevDefParseXML when hostdev->source.subsys.type is
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI, thus there's no need for redundancy.
Signed-off-by: John Ferlan <jferlan@redhat.com>
There are some non-0 default values in virDomainControllerDef (and
will soon be more) that are easier to not forget if the remembering is
done by a single initializer function (rather than inline code after
allocating the obejct with generic VIR_ALLOC().
The function that auto-assigns PCI addresses was written with the
hardcoded assumptions that any PCI bus would have slots available
starting at 1 and ending at 31. This isn't true for many types of
controllers (some have a single slot/port at 0, some have slots/ports
from 0 to 31). This patch updates that function to remove the
hardcoded assumptions. It will properly find/assign addresses for
devices that can only connect to pcie-(root|downstream)-port (which
have minSlot/maxSlot of 0/0) or a pcie-switch-upstream-port (0/31).
It still will not auto-create a new bus of the proper kind for these
connections when one doesn't exist, that task is for another day.
This makes the range and static host array management in
virNetworkDHCPDefParseXML() more similar to what is done in
virNetworkDefUpdateIPDHCPRange() and virNetworkDefUpdateIPDHCPHost() -
they use VIR_APPEND_ELEMENT rather than a combination of
VIR_REALLOC_N() and separate incrementing of the array size.
The one functional change here is that a memory leak of the contents
of the last (unsuccessful) virNetworkDHCPHostDef was previously leaked
in certain failure conditions, but it is now properly cleaned up.
Adding functionality to libvirt that will allow
it query the interface for the availability of RDMA and
tx-udp_tnl-segmentation Offloading NIC capabilities
Here is an example of the feature XML definition:
<device>
<name>net_eth4_90_e2_ba_5e_a5_45</name>
<path>/sys/devices/pci0000:00/0000:00:03.0/0000:08:00.1/net/eth4</path>
<parent>pci_0000_08_00_1</parent>
<capability type='net'>
<interface>eth4</interface>
<address>90:e2:ba:5e:a5:45</address>
<link speed='10000' state='up'/>
<feature name='rx'/>
<feature name='tx'/>
<feature name='sg'/>
<feature name='tso'/>
<feature name='gso'/>
<feature name='gro'/>
<feature name='rxvlan'/>
<feature name='txvlan'/>
<feature name='rxhash'/>
<feature name='rdma'/>
<feature name='txudptnl'/>
<capability type='80203'/>
</capability>
</device>
If one calls update-device with information that is not updatable,
libvirt reports success even though no data were updated. The example
used in the bug linked below uses updating device with <boot order='2'/>
which, in my opinion, is a valid thing to request from user's
perspective. Mainly since we properly error out if user wants to update
such data on a network device for example.
And since there are many things that might happen (update-device on disk
basically knows just how to change removable media), check for what's
changing and moreover, since the function might be usable in other
drivers (updating only disk path is a valid possibility) let's abstract
it for any two disks.
We can't possibly check for everything since for many fields our code
does not properly differentiate between default and unspecified values.
Even though this could be changed, I don't feel like it's worth the
complexity so it's not the aim of this patch.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1007228
In commit 714b38cb23 I tried to avoid
having two disks with the same WWN in a VM. I forgot to check the
hotplug paths though which make it possible bypass that check. Reinforce
the fix by checking the wwn when attaching the disk.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1208009
We should distinguish between success and timeout, to let the user
handle those two events differently.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1142631
Commit id 'e0e290552' added a check to determine if the same bus
had the same target value. It seems that's not quite good enough
as the check should check the target name value regardless of bus type.
Also added a DO_TEST_DIFFERENT to exhibit the issue
As the backend of shmem server is a unix type chr device, save it in
virDomainChrSourceDef, so we can reuse the existing code for chr device.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Using a custom device tree image may cause unexpected behavior in
architectures that use this approach to detect platform devices. Since
usually the device tree is generated by qemu and thus it's not normally
used let's taint VMs using it to make it obvious as a possible source of
problems.
Since the balloon driver does not guarantee that it returns memory to
the host, using the value in the audit message is not a good idea.
This patch removes auditing from updating the balloon size and reports
the total physical size at startup.
https://bugzilla.redhat.com/show_bug.cgi?id=1232606
Since an mpath pool contains all the Multipath devices on a host, allowing
more than one defined on a host at a time should be disallowed under the
policy of disallowing duplicate source pools for the host.
Adjust to docs to clarify the Multipath target path value usage for both
the storage driver (only 1 pool per host) and formatstorage references
(ignore the target element in favor of the default target mapping of
/dev/mapper).
https://bugzilla.redhat.com/show_bug.cgi?id=1201143
The formatdomain.html description for <disk> device 'lun' indicates that
it must be either a type 'block' or type 'network' with protocol 'iscsi';
however, we did not make that check until domain startup.
This caused issues for virt-manager which had an unexpected failure at
run time rather config time.
This patch adds a check in post part disk device checking for the specific
and supported lun types as well as adjusting the test failure to be for
parse config rather than run time.
Certain PCI buses don't support hotplug, and when automatically
assigning PCI addresses for devices, libvirt is very conservative in
its assumptions about whether or not a device will need to be
hotplugged/unplugged in the future. But if the user manually assigns
an address, they likely are aware of any hotplug requirements of the
device (or at least they should be).
In short, after this patch, automatically PCI address assignment will
assume that the device must be plugged in to a hot-pluggable slot, but
manually assignment can place the device in any bus that is
compatible, regardless of whether or not it supports hotplug. If the
user makes a mistake and plugs the device into a bus that doesn't
support hotplug, then later tries to do a hot-unplug, qemu will give
an appropriate error.
(in the future we may want to add a "hotpluggable" attribute to all
devices, with default being "yes" for autoassign, and "no" for manual
assign).
When support for the pcie-root and dmi-to-pci-bridge buses on a Q35
machinetype was added, I was concerned that even though qemu at the
time allowed plugging a PCI device into a PCIe port, that it might not
be supported in the future. To prevent painful backtracking in the
possible future where this happened, I disallowed such connections
except in a few specific cases requested by qemu developers (indicated
in the code with the flag VIR_PCI_CONNECT_TYPE_EITHER_IF_CONFIG).
Now that a couple years have passed, there is a clear message from
qemu that there is no danger in allowing PCI devices to be plugged
into PCIe ports. This patch eliminates
VIR_PCI_CONNECT_TYPE_EITHER_IF_CONFIG and changes the code to always
allow PCI->PCIe or PCIe->PCI connection *when the PCI address is
specified in the config. (For newly added devices that haven't yet
been given a PCI address, the auto-placement still prefers using the
correct type of bus).
https://bugzilla.redhat.com/show_bug.cgi?id=1235116
According to our XML definition, zero is as valid as any other value.
Mainly because it should be kernel-agnostic.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
This patch provides support for the new watchdog model "diag288".
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
This patch provides support for a new watchdog action "inject-nmi" which
allows to define an inject of a non-maskable interrupt into a guest.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
Commit id '1feaccf0' attempted to handle an empty secrettype value; however,
it made a mistake by processing the secretType as if it was the original
secrettype string. The 'secretType' is actually whether 'usage' or 'uuid'
was used.
Thus adjust part of the change to make the same check for def->src->type !=
VIR_STORAGE_TYPE_VOLUME before setting auth_secret_usage from the
secrettype field.
Luckily the aforementioned commits misdeed would be overwritten by the
call to virStorageTranslateDiskSourcePool
Just refactor existing code to use a child buf instead of
check all element before format <blkiotune> and <cputune>.
This will avoid the more and more bigger element check during
we introduce new elements in <blkiotune> and <cputune> in the
future.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
The SCSI Architecture Model defines a logical unit address
as 64-bits in length, so change the field accordingly so
that the entire value could be stored.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
The address elements are all unsigned integers, so we should
use the appropriate print directive when printing it.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
The SCSI address element attributes bus, target, and unit are expected
to be positive values, so make sure no one provides a negative value since
the value is stored as an unsigned.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
So that they can format private data (e.g., disk private data) stored
elsewhere in the domain object.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Complex jobs, such as migration, need to monitor several events at once,
which is impossible when each of the event uses its own condition
variable. This patch adds a single condition variable to each domain
object. This variable can be used instead of the other event specific
conditions.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
virDomainObjGetOneDef will help to retrieve the correct definition
pointer from @vm in cases where VIR_DOMAIN_AFFECT_LIVE and
VIR_DOMAIN_AFFECT_CONFIG are mutually exclusive. The function simply
returns the correct pointer. This similarly to virDomainObjGetDefs will
greatly simplify the code.
If @flags contains only VIR_DOMAIN_AFFECT_CONFIG and @vm is active, the
function would return the active config rather than the persistent one
that it should return. This happened due to the fact that
virDomainObjGetDefs was checking the updated flags which may not contain
VIR_DOMAIN_AFFECT_LIVE if it is not requested even if @vm is active.
Additionally the function would not take the flags into account when
setting the pointers which was later used to determine whether the code
needs to update the given configuration.
The mistake was caught by the virt-test suite.
https://bugzilla.redhat.com/show_bug.cgi?id=1220527
This type of information defines attributes of a system
baseboard. With one exception: board type is yet not implemented
in qemu so it's not introduced here either.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1200206
Commit id '1b4eaa61' added the ability to have a mode='direct' for
an iscsi disk volume. It relied on virStorageTranslateDiskSourcePool
in order to copy any disk source pool authentication information to
the direct disk volume, but it neglected to also copy the 'secrettype'
field which ends up being used in the domain volume formatting code.
Adding a secrettype for this case will allow for proper formatting later
and allow disk snapshotting to work properly
Additionally libvirtd restart processing would fail to find the domain
since the translation processing code is run after domain xml processing,
so handle the the case where the authdef could have an empty secrettype
field when processing the auth and additionally ignore performing the
actual and expected auth secret type checks for a DISK_VOLUME since that
data will be reassembled later during translation processing of the
running domain.
During a review, I've noticed this error message that was eventually
produced when I was trying to define a domain:
error: invalid argument: could not find capabilities for arch=mips64el
domaintype=(null)
Look at the (null). Why is it there? Well, during XML parsing, we try
to look up the default emulator for given OS type and possibly virt
type too. And this is the problem, because if we don't want to look up
by virt type, a -1 is passed to note this fact. Later, the code
handles -1 just right. Except for error message. When it is
constructed (in a very fabulous way I must say), the value is compared
to zero, not -1. And since we don't have any translation from -1 to a
virt type string, we just print (null).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
A variable can't be named system, obviously. Well, it can if the
compiler is new enough to distinguish a variable named system and a
function call system(). And some older systems, don't have wise
compiler.
CC util/libvirt_util_la-virsysinfo.lo
cc1: warnings being treated as errors
../../src/util/virsysinfo.c: In function 'virSysinfoParseSystem':
../../src/util/virsysinfo.c:649: error: declaration of 'system' shadows a global declaration [-Wshadow]
/usr/include/stdlib.h:717: error: shadowed declaration is here [-Wshadow]
make[3]: *** [util/libvirt_util_la-virsysinfo.lo] Error 1
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Move all the system_* fields into a separate struct. Not only this
simplifies the code a bit it also helps us to identify whether BIOS
info is present. We don't have to check all the four variables for
being not-NULL, but we can just check the pointer to the struct.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Move all the bios_* fields into a separate struct. Not only this
simplifies the code a bit it also helps us to identify whether BIOS
info is present. We don't have to check all the four variables for
being not-NULL, but we can just check the pointer to the struct.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Multi != One. And indeed, libvirt behaves the same way for queues='1'
as without such setting. Let's make it clear in the XML.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Coverity rightfully determined that in commit 3d021381c7
I made a mistake in the first check if @persDef is not NULL is
dereferencing it rather than checking.
Additionally if the vm is online the code would set @liveDef twice
rather than modifying @persDef. Fix both mistakes.
virDomainLiveConfigHelperMethod that is used for this job now does
modify the flags but still requires the callers to extract the correct
definition objects.
In addition coverity and other static analyzers are usually unhappy as
they don't grasp the fact that @flags are upadted according to the
correct def to be present.
To work this issue around and simplify the calling chain let's add a new
helper that will work only on drivers that always copy the persistent
def to a transient at start of a vm. This will allow to drop a few
arguments. The new function syntax will also fill two definition
pointers rather than modifying the @flags parameter.
While we probably won't see machines with more than 65536 cpus for a
while lets store the cpu count as an integer so that we can avoid quite
a lot of overflow checks in our code.
We have been formatting the first serial device also
as a console device, but only if there were no other consoles.
If there is a <serial> device present in the XML, but no serial
<console>, or if there isn't any <console> at all but the domain
definition hasn't gone through a parse->format->parse round-trip,
the <console> device would not be formatted.
Change the code to always add the stub device for the first
serial device.
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1089914
Console/channel devices have their pty devices assigned when the emulator is
actually started. If time is spent in guest preparation, someone attempts
to open the console/channel, the libvirt crashes in virChrdevLockFilePath().
The patch attempts to fix the crash by adding a check before attempting to
open.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Store the emulator pinning cpu mask as a pure virBitmap rather than the
virDomainPinDef since it stores only the bitmap and refactor
qemuDomainPinEmulator to do the same operations in a much saner way.
As a side effect virDomainEmulatorPinAdd and virDomainEmulatorPinDel can
be removed since they don't add any value.
As soon as we keep backward compatibility we treat this constant
as synonym to VIR_DOMAIN_VIRT_PARALLELS.
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com>
There are now many more reasons that virSocketAddrGetRange() could
fail, so it is much more informative to report the error there instead
of in the caller. (one of the two callers was previously assuming
success, which is almost surely safe based on the parsing that has
already happened to the config by that time, but it still is nicer to
account for an error "just in case")
Part of fix for: https://bugzilla.redhat.com/show_bug.cgi?id=985653
virSocketAddrGetRange() has been updated to take the network address
and prefix, and now checks that both the start and end of the range
are within that network, thus validating that the entire range of
addresses is in the network. For IPv4, it also checks that ranges to
not start with the "network address" of the subnet, nor end with the
broadcast address of the subnet (this check doesn't apply to IPv6,
since IPv6 doesn't have a broadcast or network address)
Negative tests have been added to the network update and socket tests
to verify that bad ranges properly generate an error.
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=985653
Use xmlFreeDoc instead of plain xmlFree.
4 bytes in 1 blocks are definitely lost in loss record 9 of 1,084
at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x70730D6: xmlStrndup (in /usr/lib64/libxml2.so.2.9.2)
by 0x701E3DC: xmlNewDoc (in /usr/lib64/libxml2.so.2.9.2)
by 0x70C39F8: xmlSAX2StartDocument (in /usr/lib64/libxml2.so.2.9.2)
by 0x7017245: xmlParseDocument (in /usr/lib64/libxml2.so.2.9.2)
by 0x7017606: xmlDoRead (in /usr/lib64/libxml2.so.2.9.2)
by 0x5309DAD: virXMLParseHelper (virxml.c:742)
by 0x5367584: virStoragePoolLoadState (storage_conf.c:1863)
It's not a problem at all and causes virt-manager to break down.
Note: netcf 0.2.8 and earlier generates invalid XML for a bond with no
interfaces anyway, so in that case this error in libvirt is never
reached since we fail earlier.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
If the redirfilter has no usbdev sub-elements, then do not format anything
rather than formatting an empty pair of elements:
<redirfilter>
</redirfilter>
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Commit id '73eda710' added virDomainKeyWrapDefParseXML which uses
virXPathNodeSet, but does not handle a -1 return thus causing a possible
loop condition exit problem later when the return value is used.
Change the logic to return the value from virXPathNodeSet if <= 0
The XML parser sets a default <mode> if none is explicitly passed in.
This is then used at pool/vol creation time, and unconditionally reported
in the XML.
The problem with this approach is that it's impossible for other code
to determine if the user explicitly requested a storage mode. There
are some cases where we want to make this distinction, but we currently
can't.
Handle <mode> parsing like we handle <owner>/<group>: if no value is
passed in, set it to -1, and adjust the internal consumers to handle
it.
https://bugzilla.redhat.com/show_bug.cgi?id=998813
Like usb-serial, the pci-serial device allows a serial device to be
attached to PCI bus. An example XML looks like this:
<serial type='dev'>
<source path='/dev/ttyS2'/>
<target type='pci-serial' port='0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</serial>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Sometimes the only thing we need is the pointer to virDomainDiskDef and
having to call virDomainDiskIndexBy* APIs, storing the disk index, and
looking it up in the disks array is ugly. After this patch, we can just
call virDomainDiskBy* and get the pointer in one step.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
After parsing the memory device XML the function would not restore the
XML parser context causing invalid XPath starting point for the rest of
the elements. This is a regression since 3e4230d2.
The test case addition uses the <idmap> element that is currently unused
by qemu, but parsed after the memory device definition and formatted
always.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1223631
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
virDomainParseMemory parses the size and then rounds up while converting
it to kibibytes. Since the number is limit-checked before the rounding
it's possible to use a number that would be correctly parsed the first
time, but not the second time. For numbers not limited to 32 bit systems
the magic is 9223372036854775807 bytes. That number then can't be parsed
back in kibibytes.
To solve the issue add a second overflow check for the few values that
would cause the problem. Since virDomainParseMemory is used in config
parsing, this avoids vanishing VMs.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1221504
So far, we are not reporting if numatune was even defined. The
value of zero is blindly returned (which maps onto
VIR_DOMAIN_NUMATUNE_MEM_STRICT). Unfortunately, we are making
decisions based on this value. Instead, we should not only return
the correct value, but report to the caller if the value is valid
at all.
For better viewing of this patch use '-w'.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=976387
For a domain configured using the host cdrom, we should taint the domain
due to problems encountered when the host and guest try to control the tray.
For some reason a union (_virNodeDevCapData) that had only been
declared inside the toplevel struct virNodeDevCapsDef was being used
as an argument to functions all over the place. Since it was only a
union, the "type" attribute wasn't necessarily sent with it. While
this works, it just seems wrong.
This patch creates a toplevel typedef for virNodeDevCapData and
virNodeDevCapDataPtr, making it a struct that has the type attribute
as a member, along with an anonymous union of everything that used to
be in union _virNodeDevCapData. This way we only have to change the
following:
s/union _virNodeDevCapData */virNodeDevCapDataPtr /
and
s/caps->type/caps->data.type/
This will make me feel less guilty when adding functions that need a
pointer to one of these.
Two new domain configuration XML elements are added to enable/disable
the protected key management operations for a guest:
<domain>
...
<keywrap>
<cipher name='aes|dea' state='on|off'/>
</keywrap>
...
</domain>
Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Because there are multiple potential reasons for an error, this
function logs any errors before returning NULL (since the caller won't
have the information needed to determine which was the reason for
failure).
The APIs take the memory value in KiB and we store it in KiB
internally, but we cannot parse the whole ULONG_MAX range
on 64-bit systems, because virDomainParseScaledValue
needs to fit the value in bytes in an unsigned long long.
https://bugzilla.redhat.com/show_bug.cgi?id=1176739
Since 'autofill'd iothreadid entries are not written during XML format
processing, it is possible that if an iothreadid in the middle of an
autofilled list would then change it's id on a subsequent restart.
Thus during the iothreadid deletion, if we determine the delete is not
the "last" thread, then clear the autofill bit for all iothreadid's
following the one being deleted (either the first or one in the middle).
This way, iothreadid's will be printed/saved.
https://bugzilla.redhat.com/show_bug.cgi?id=1171984https://bugzilla.redhat.com/show_bug.cgi?id=1188463
Remove the check for the source host name for iSCSI source XML processing
declaring duplicate sources when the source device path and if present the
initiator of a proposed storage pool matches an existing storage pool.
The backend iSCSI storage driver uses 'iscsiadm --mode session' to query
available iscsid target sessions. The output displayed is the IP address
and the IQN (target path) of known targets. The displayed IP address
is a resolved address based on the session --login. Additionally, iscsid
keeps track of the various ways to define the host name (IPv4 Address,
IPv6 Address, /etc/hosts, etc.) for that IQN (see output of an 'iscsiadm
--mode node'). If an incoming IQN matches and the host name provided by
libvirt is resolved to the existing IQN, then iscsid will "reuse" the
session. Although libvirt could do the same name resolution, if there
is a difference, iscsid could still declare two seemingly different sources
to be the same and not create a new session which means libvirt now has
two storage pools looking at the same source. Thus to avoid any strange
host name resolution issues, just rely on iscsid for that and do not
allow multiple pools on the same host to use the same device path (IQN).
Only perform the port number check if the incoming definition actually
provides it. Since the port number is optional we could erroneously pass
a duplicate source host check since some storage pool backends which fill
in the default port number (e.g., iSCSI and sheepdog) for the started pool.
There is a lot of places, were it's pretty easy for user to enter some
characters that we need to escape to create a valid XML description.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1197580
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1220265
Passing the return value to an enum directly is not safe. Fix this by
comparing the true integer result of virTristateSwitchTypeFromString().
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Specifying a balloon size more than the memory size of a guest isn't
something that should be rejected when parsing the XML. Truncate the
size to the maximum memory size.
Until now the virDomainListAllDomains API would lock the domain list and
then every single domain object to access and filter it. This would
potentially allow a unresponsive VM to block the whole daemon if a
*listAllDomains call would get stuck.
To avoid this problem this patch collects a list of referenced domain
objects first from the list and then unlocks it right away. The
expensive operation requiring locking of the domain object is executed
after the list lock is dropped. While a single blocked domain will still
lock up a listAllDomains call, the domain list won't be held locked and
thus other APIs won't be blocked.
Additionally this patch also fixes the lookup code, where we'd ignore
the vm->removing flag and thus potentially return domain objects that
would be deleted very soon so calling any API wouldn't make sense.
As other clients also could benefit from operating on a list of domain
objects rather than the public domain descriptors a new intermediate
API - virDomainObjListCollect - is introduced by this patch.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1181074
Extend it to a universal helper used for clearing lists of any objects.
Note that the argument type is specifically void * to allow implicit
typecasting.
Additionally add a helper that works on non-NULL terminated arrays once
we know the length.
My commit 747761a79 (v1.2.15 only) dropped this bit of logic when filling
in a default arch in the XML:
- /* First try to find one matching host arch */
- for (i = 0; i < caps->nguests; i++) {
- if (caps->guests[i]->ostype == ostype) {
- for (j = 0; j < caps->guests[i]->arch.ndomains; j++) {
- if (caps->guests[i]->arch.domains[j]->type == domain &&
- caps->guests[i]->arch.id == caps->host.arch)
- return caps->guests[i]->arch.id;
- }
- }
- }
That attempt to match host.arch is important, otherwise we end up
defaulting to i686 on x86_64 host for KVM, which is not intended.
Duplicate it in the centralized CapsLookup function.
Additionally add some testcases that would have caught this.
https://bugzilla.redhat.com/show_bug.cgi?id=1219191
https://bugzilla.redhat.com/show_bug.cgi?id=1176020
We had a check for the vcpu count total number in <numa>
before, however this check is not good enough. There are
some examples:
1. one of cpu id is out of maxvcpus, can set success(cpu count = 5 < 10):
<vcpu placement='static'>10</vcpu>
<cell id='0' cpus='0-3,100' memory='512000' unit='KiB'/>
2. use the same cpu in 2 cell, can set success(cpu count = 8 < 10):
<vcpu placement='static'>10</vcpu>
<cell id='0' cpus='0-3' memory='512000' unit='KiB'/>
<cell id='1' cpus='0-3' memory='512000' unit='KiB'/>
3. use the same cpu in 2 cell, cannot set success(cpu count = 11 > 10):
<vcpu placement='static'>10</vcpu>
<cell id='0' cpus='0-6' memory='512000' unit='KiB'/>
<cell id='1' cpus='0-3' memory='512000' unit='KiB'/>
Add a check for numa cpus, check if duplicate use one cpu in more
than one cell.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Some platforms, like aarch64, don't have APIC but GIC. So there's
no reason to have <apic/> feature turned on. However, we are
still missing <gic/> feature. This commit introduces the feature
to XML parser and formatter, adds documentation and updates RNG
schema.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The phyp driver stuffed it into a DomainDefPtr during its attachdevice
routine, but the value is never advertised via capabilities so it should
be safe to drop.
Have the phyp driver use OSTYPE_LINUX, which is what it advertises via
capabilities.
Build with clang fails with:
CC conf/libvirt_conf_la-domain_conf.lo
conf/domain_conf.c:13377:9: error: variable 'cpumask' is used
uninitialized whenever 'if' condition is true
[-Werror,-Wsometimes-uninitialized]
if (!(tmp = virXMLPropString(node, "cpuset"))) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
and many other similar errors regarding the 'cpuset' variable.
Fix by explicitly initializing it with NULL.
If someone has updated a network to change its bridge name, but the
network is still active (so that bridge name hasn't taken effect yet),
we still want to disallow another network from taking that new name.
We already check that any auto-assigned bridge device name for a
virtual network (e.g. "virbr1") doesn't conflict with the bridge name
for any existing libvirt network (via virNetworkSetBridgeName() in
conf/network_conf.c).
We also want to check that the name doesn't conflict with any bridge
device created on the host system outside the control of libvirt
(history: possibly due to the ploriferation of references to libvirt's
bridge devices in HOWTO documents all around the web, it is not
uncommon for an admin to manually create a bridge in their host's
system network config and name it "virbrX"). To add such a check to
virNetworkBridgeInUse() (which is called by virNetworkSetBridgeName())
we would have to call virNetDevExists() (from util/virnetdev.c); this
function calls ioctl(SIOCGIFFLAGS), which everyone on the mailing list
agreed should not be done from an XML parsing function in the conf
directory.
To remedy that problem, this patch removes virNetworkSetBridgeName()
from conf/network_conf.c and puts an identically functioning
networkBridgeNameValidate() in network/bridge_driver.c (because it's
reasonable for the bridge driver to call virNetDevExists(), although
we don't do that yet because I wanted this patch to have as close to 0
effect on function as possible).
There are a couple of inevitable changes though:
1) We no longer check the bridge name during
virNetworkLoadConfig(). Close examination of the code shows that
this wasn't necessary anyway - the only *correct* way to get XML
into the config files is via networkDefine(), and networkDefine()
will always call networkValidate(), which previously called
virNetworkSetBridgeName() (and now calls
networkBridgeNameValidate()). This means that the only way the
bridge name can be unset during virNetworkLoadConfig() is if
someone edited the config file on disk by hand (which we explicitly
prohibit).
2) Just on the off chance that somebody *has* edited the file by hand,
rather than crashing when they try to start their malformed
network, a check for non-NULL bridge name has been added to
networkStartNetworkVirtual().
(For those wondering why I don't instead call
networkValidateBridgeName() there to set a bridge name if one
wasn't present - the problem is that during
networkStartNetworkVirtual(), the lock for the network being
started has already been acquired, but the lock for the network
list itself *has not* (because we aren't adding/removing a
network). But virNetworkBridgeInuse() iterates through *all*
networks (including this one) and locks each network as it is
checked for a duplicate entry; it is necessary to lock each network
even before checking if it is the designated "skip" network because
otherwise some other thread might acquire the list lock and delete
the very entry we're examining. In the end, permitting a setting of
the bridge name during network start would require that we lock the
entire network list during any networkStartNetwork(), which
eliminates a *lot* of parallelism that we've worked so hard to
achieve (it can make a huge difference during libvirtd startup). So
rather than try to adjust for someone playing against the rules, I
choose to instead give them the error they deserve.)
3) virNetworkAllocateBridge() (now removed) would leak any "template"
string set as the bridge name. Its replacement
networkFindUnusedBridgeName() doesn't leak the template string - it
is properly freed.
Add qemuDomainAddIOThread and qemuDomainDelIOThread in order to add or
remove an IOThread to/from the host either for live or config optoins
The implementation for the 'live' option will use the iothreadpids list
in order to make decision, while the 'config' option will use the
iothreadids list. Additionally, for deletion each may have to adjust
the iothreadpin list.
IOThreads are implemented by qmp objects, the code makes use of the existing
qemuMonitorAddObject or qemuMonitorDelObject APIs.
Signed-off-by: John Ferlan <jferlan@redhat.com>
We're about to allow IOThreads to be deleted, but an iothreadid may be
included in some domain thread sched, so add a new API to allow removing
an iothread from some entry.
Then during the writing of the threadsched data and an additional check
to determine whether the bitmap is all clear before writing it out.
With iothreadid's allowing any 'id' value for an iothread_id, the
iothreadsched code needs a slight adjustment to allow for "any"
unsigned int value in order to create the bitmap of ids that will
have scheduler adjustments. Adjusted the doc description as well.
Remove the iothreadspin array from cputune and replace with a cpumask
to be stored in the iothreadids list.
Adjust the test output because our printing goes in order of the iothreadids
list now.
Since it's only ever referenced in domain_conf.c, make the function
static, but also will need to move it to somewhere before it's referenced
rather than forward referencing it.
Add 'thread_id' to the virDomainIOThreadIDDef as a means to store the
'thread_id' as returned from the live qemu monitor data.
Remove the iothreadpids list from _qemuDomainObjPrivate and replace with
the new iothreadids 'thread_id' element.
Rather than use the default numbering scheme of 1..number of iothreads
defined for the domain, use the iothreadid's list for the iothread_id
Since iothreadids list keeps track of the iothread_id's, these are
now used in place of the many places where a for loop would "know"
that the ID was "+ 1" from the array element.
The new tests ensure usage of the <iothreadid> values for an exact number
of iothreads and the usage of a smaller number of <iothreadid> values than
iothreads that exist (and usage of the default numbering scheme).
Adding a new XML element 'iothreadids' in order to allow defining
specific IOThread ID's rather than relying on the algorithm to assign
IOThread ID's starting at 1 and incrementing to iothreads count.
This will allow future patches to be able to add new IOThreads by
a specific iothread_id and of course delete any exisiting IOThread.
Each iothreadids element will have 'n' <iothread> children elements
which will have attribute "id". The "id" will allow for definition
of any "valid" (eg > 0) iothread_id value.
On input, if any <iothreadids> <iothread>'s are provided, they will
be marked so that we only print out what we read in.
On input, if no <iothreadids> are provided, the PostParse code will
self generate a list of ID's starting at 1 and going to the number
of iothreads defined for the domain (just like the current algorithm
numbering scheme). A future patch will rework the existing algorithm
to make use of the iothreadids list.
On output, only print out the <iothreadids> if they were read in.
use virNetworkRouteDefFree() instead of VIR_FREE to free routes, otherwise
the element 'family' would not be freed.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
use cleanup instead of error, so that the allocated strings could also get freed
when there's no error.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
virBufferContentAndReset() doesn't free buf contents, we should use
virBufferFreeAndReset() to get buf freed.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
This hash table will contain the same data as already existing one.
The only difference is that while the first table uses domain uuid as
key, the new table uses domain name. This will allow much faster (and
lockless) lookups by domain name.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Every domain that grabs a domain object to work over should
reference it to make sure it won't disappear meanwhile.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is basically turning qemuDomObjEndAPI into a more general
function. Other drivers which gets a reference to domain objects may
benefit from this function too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In one of my previous patches (b68a56bcfe) I made class_id to
format more frequently. Well, now it's formatting way too
frequent - even for regular active XML. Users don't need to see
it, so lets format it only for the status XML where it's really
needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This needs to specified in way too many places for a simple validation
check. The ostype/arch/virttype validation checks later in
DomainDefParseXML should catch most of the cases that this was covering.
This revealed that GuestDefaultEmulator was a bit buggy, capable
of returning an emulator that didn't match the passed domain type. Fix
up the test suite input to continue to pass.
This is a helper function to look up all capabilities data for all
the OS bits that are relevant to <domain>. This is
- os type
- arch
- domain type
- emulator
- machine type
This will be used to replace several functions in later commits.
But the internal API stays the same, and we just convert the value as
needed. Not useful yet, but this is the beginning step of using an enum
for ostype throughout the code.
When parsing XML, we validate the passed ostype + arch combo against
the detected hypervisor capabilities. This has led to the following
problem:
- Define x86 qemu guest
- qemu is inadvertently removed from the host
- libvirtd is restarted. fails to parse VM config since arch is removed
- 'virsh list --all' is now empty, user is wondering where their VMs went
Add a new internal flag VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS. Use
it when loading VM and snapshot configs from disk.
https://bugzilla.redhat.com/show_bug.cgi?id=1043572
If no <os><type> was specified:
before: unknown OS type no OS type
after : xml error: an os <type> must be specified
If an <os><type> is specified that's not in our capabiliities data:
before: unknown OS type: $type
after : unsupported configuration: no support found for os <type> '$type'
VIR_ERR_OS_TYPE is now unused (as it should be frankly) so drop its strings
as well to save our translators some effort.
After a360912179 the formatting of virDomainActualNetDefPtr was
changed a bit. However, during the function rewrite, iface's class_id
is not formatted as frequently as it could be. In fact, after rewrite
it's formatted only for iface of type VIR_DOMAIN_NET_TYPE_DIRECT where
it makes no sense and is unused. While where needed (_TYPE_NETWORK) is
not formatted at all. This makes the daemon forget it upon daemon
restart resulting in bad behaviour.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Check the proposed pool source host XML definition against existing gluster
pools to ensure the incoming definition doesn't use the same source dir and
soure host XML definition as an existing pool.
Check the proposed pool source host XML definition against existing sheepdog
pools to ensure the incoming definition doesn't use the same source host XML
definition as an existing pool.
Rather than have duplicate code doing the same check, have the netfs
matching processing code use the new virStoragePoolSourceMatchSingleHost.
Signed-off-by: John Ferlan <jferlan@redhat.com>