Even though hal doesn't make use of it, the privileged flag is related
to the daemon/driver rather than the backend actually used.
While at it, get rid of some tab indentation in the driver state struct.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
There were a bunch of commentary blocks that were literally useless in
terms of describing what the code following them does, since most of
them were documenting "the obvious" or it just wouldn't help at all.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
So we have a syntax-check rule to catch all tab indents but it naturally
can't catch tab spacing, i.e. as a delimiter. This patch is a result of
running 'vim -en +retab +wq'
(using tabstop=8 softtabstop=4 shiftwidth=4 expandtab) on each file from
a list generated by the following:
find . -regextype gnu-awk \
-regex ".*\.(rng|syms|html|s?[ch]|py|pl|php(\.code)?)(\.in)?" \
| xargs git grep -lP "\t"
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When formatting an inactive or migratable XML we will need to suppress
backing chain members which were detected from the disk to keep
semantics straight. This means we need to record, whether a
virStorageSource originates from autodetection.
The file object is needed when formatting the command line, but it makes
nesting of the objects less easy for use with blockdev. Separate the
wrapping into the 'file' object into a helper used specifically for disk
sources in the old code path.
Move qemuFreeKeywords into qemu_parse_command.c as
qemuParseKeywordsFree and call it rather than inline code
in multiple places.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
Even though only family and model are used for matching CPUID data with
CPU models from cpu_map.xml, stepping is used by x86DataFilterTSX which
is supposed to disable TSX on CPU models with broken TSX support. Thus
we need to start parsing stepping from QEMU to make sure we don't
disable TSX on CPUs which provide working TSX implementation. See the
following patch for a real world example of such CPU.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
If same boot order is specified twice (or more) in domain xml
we call free for uninitiaziled loadparm on cleanup in virDomainDeviceBootParseXML
and SIGABRT (or similar) as a result.
When libvirt older than 3.9.0 reconnected to a running domain started by
old libvirt it could have messed up the expansion of host-model by
adding features QEMU does not support (such as cmt). Thus whenever we
reconnect to a running domain, revert to an active snapshot, or restore
a saved domain we need to check the guest CPU model and remove the
CPU features unknown to QEMU. We can do this because we know the domain
was successfully started, which means the CPU did not contain the
features when libvirt started the domain.
https://bugzilla.redhat.com/show_bug.cgi?id=1495171
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When reconnecting to a domain started with a host-model CPU which was
started by old libvirt that did not replace host-model with the real CPU
definition, libvirt replaces the host-model CPU with the CPU from
capabilities (because this is what the old libvirt did when it started
the domain). Without this patch libvirt could use features unknown to
QEMU in the CPU definition which replaced the original host-model CPU.
Such domain would keep running just fine, but any attempt to migrate it
will fail and once the domain is saved or snapshotted, restoring it
would fail too.
In other words whenever we want to use the CPU definition from host
capabilities as a guest CPU definition, we have to filter the unknown
features.
https://bugzilla.redhat.com/show_bug.cgi?id=1495171
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When migration fails, QEMU may provide a description of the error in
the reply to query-migrate QMP command. We can fetch this error and use
it instead of the generic "unexpectedly failed" message.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Express a properly terminated backing chain by putting a
virStorageSource of type VIR_STORAGE_TYPE_NONE in the chain. The newly
used helpers simplify this greatly.
The change fixes a bug as formatting an incomplete backing chain and
parsing it back would end up in expressing a terminated chain since
src->backingStoreRaw was not populated. By relying on the terminator
object this can be now processed appropriately.
Add helpers that will simplify checking if a backing file is valid or
whether it has backing store. The helper virStorageSourceIsBacking
returns true if the given virStorageSource is a valid backing store
member. virStorageSourceHasBacking returns true if the virStorageSource
has a backing store child.
Adding these functions creates a central points for further refactors.
Storage driver uses virStorageSource only partially to store it's
configuration but fully when parsing backing files of storage volumes.
This patch sets the 'type' field to a value other than
VIR_STORAGE_TYPE_NONE so that further patches can add a terminator
element to backing chains without breaking iteration.
The backing store indexes were not bound to the storage sources in any
way. To allow us to bind a given alias to a given storage source we need
to save the index in virStorageSource. The backing store ids are now
generated when detecting the backing chain.
Since we don't re-detect the backing chain after snapshots, the
numbering needs to be fixed there.
Existing qemuParseCommandLineMem() will parse "-m 4G" format string.
This patch allows it to parse "-m size=8126464k,slots=32,maxmem=33554432k"
format along with existing format. And adds a testcase to validate the changes.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
Hyper-V uses its own specific memory management so no mapping is going to
be perfect. However, it is more correct to map Limit to max_memory (it
really is the upper limit of what the VM may potentially use) and keep
cur_balloon equal to total_memory.
The typical value returned from Hyper-V in Limit is 1 TiB, which is not
really going to work if interpreted as "startup memory" to be ballooned
away later.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
The code was vulnerable to SQL injection. Likely not a security issue due to
WMI SQL and other constraints but still lame. For example:
virsh # dominfo \"
error: failed to get domain '"'
error: internal error: SOAP fault during enumeration: code 's:Sender', subcode
'n:CannotProcessFilter', reason 'The data source could not process the filter.
The filter might be missing or it might be invalid. Change the filter and try
the request again. ', detail 'The WS-Management service cannot process the
request. The WQL query is invalid. '
This commit fixes the Hyper-V driver by escaping all WMI SQL string parameters.
The same command with the fix:
virsh # dominfo \"
error: failed to get domain '"'
error: Domain not found: No domain with name "
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
"%s is not a Hyper-V server" is not a correct generalization of all possible
error conditions of hypervEnumAndPull. For example:
$ virsh --connect hyperv://localhost/?transport=http
Enter username for localhost [administrator]:
Enter administrator's password for localhost: <enters incorrect password>
error: failed to connect to the hypervisor
error: internal error: localhost is not a Hyper-V server
This commit removes the general virReportError from hypervInitConnection and
also the "Invalid query" virReportError from hypervSerializeEprParam, which
does not correctly describe the error either (virBufferCheckError has
already set a meaningful error message at that point).
The same scenario with the fix:
$ virsh --connect hyperv://localhost/?transport=http
Enter username for localhost [administrator]:
Enter administrator's password for localhost: <enters incorrect password>
error: failed to connect to the hypervisor
error: internal error: Transport error during enumeration: User, password or
similar was not accepted (26)
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
The default_tls_x509_verify (and related) parameters in qemu.conf
control whether the QEMU TLS servers request & verify certificates
from clients. This works as a simple access control system for
servers by requiring the CA to issue certs to permitted clients.
This use of client certificates is disabled by default, since it
requires extra work to issue client certificates.
Unfortunately the code was using this configuration parameter when
setting up both TLS clients and servers in QEMU. The result was that
TLS clients for character devices and disk devices had verification
turned off, meaning they would ignore errors while validating the
server certificate.
This allows for trivial MITM attacks between client and server,
as any certificate returned by the attacker will be accepted by
the client.
This is assigned CVE-2017-1000256 / LSN-2017-0002
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Somewhere around commit 9ff9d9f reserving entire PCI slots was
eliminated, as demonstrated by commit 6cc2014.
Reserve the functions required by the implicit devices:
00:01.0 ISA Bridge
00:01.1 IDE Controller
00:01.2 USB Controller (unless USB is disabled)
00:01.3 Bridge
https://bugzilla.redhat.com/show_bug.cgi?id=1460143
Gather query-cpu-definitions results and use them for testing CPU model
usability blockers in CPUID to virCPUDef translation.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When decoding CPUID data to virCPUDef we need to be careful about using
a CPU model which cannot be directly used on the current host. Normally,
libvirt would notice the features which prevent the model from being
usable and it would disable them in the computed virCPUDef, but this
won't work in case the definition of the CPU model in QEMU contains more
features than what we have in cpu_map.xml. We need to count with the
usability blockers we got from QEMU and explicitly disable all of them
to make the computed virCPUDef usable.
https://bugzilla.redhat.com/show_bug.cgi?id=1464832
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This internal API can be used to find a specific CPU model in
virDomainCapsCPUModels list.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The "preferred" parameter is not used by any caller of cpuDecode
anymore. It's only used internally in cpu_x86 to implement cpuBaseline.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
All APIs which expect a list of CPU models supported by hypervisors were
switched from char **models and int models to just accept a pointer to
virDomainCapsCPUModels object stored in domain capabilities. This avoids
the need to transform virDomainCapsCPUModelsPtr into a NULL-terminated
list of model names and also allows the various cpu driver APIs to
access additional details (such as its usability) about each CPU model.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
query-cpu-definitions QMP command returns a list of unavailable features
which prevent CPU models from being usable on the current host. So far
we only checked whether the list was empty to mark CPU models as
(un)usable. This patch parses all unavailable features for each CPU
model and stores them in virDomainCapsCPUModel as a list of usability
blockers.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When a hypervisor marks a CPU model as unusable on the current host, it
may also give us a list of features which prevent the model from being
usable. Storing this list in virDomainCapsCPUModel will help the CPU
driver with creating a host-model CPU configuration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The API makes a deep copy of a NULL-terminated string list.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Currently, if parsing of device info fails info->alias is freed.
It doesn't make much sense to leave the rest of the struct
behind.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There's one 'return' in the middle of the function body. It's
very easy to miss and so it makes adding new code harder. Also
the function doesn't follow our style 100%.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1497396
In 0d3d020ba6 I've added capability to accept MAC addresses
for the API too. However, the implementation was faulty. It needs
to lookup the corresponding interface in the domain definition
and pass the ifname instead of MAC address.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Commit id '8708ca01c' added a check to determine whether the NIC had
Switchdev capabilities; however, in doing so inadvertently would cause
network devices without a PCI device to not be added to the node device
database. Thus, network devices having a "computer" as a parent, such
as "net_lo*", "net_virbr*", "net_tun*", "net_vnet*", etc. were not added.
Alter the check to not even check for Switchdev bits if no PCI device found.
https://bugzilla.redhat.com/show_bug.cgi?id=1497396
The other APIs accept both, ifname and MAC address. There's no
reason virDomainInterfaceStats can't do the same.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Every caller reports the error themselves. Might as well move it
into the function and thus unify it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Let's use the RWObjectLockable for the various list lock mgmt.
Only time need Write lock will be for Add/Remove logic.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The command "info migrate" of qemu outputs the dirty-pages-rate during
migration, but page size is different in different architectures. So
page size should be output to calculate dirty pages in bytes.
Page size is already implemented with commit
030ce1f8612215fcbe9d353dfeaeb2937f8e3f94 in qemu.
Now Implement the counter-part in libvirt.
Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
libvirtd throws unhandled signal 11 on ppc while running
virsh cpu-compare with missing model tag in the xml. This
patch errors out in such situation.
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Parse the -M (or -machine) command line option before starting
processing in earnest and have a fallback ready in case it's not
present, so that while parsing other options we can rely on
def->os.machine being initialized.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1379218
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
This commit fixes the deadlock introduced by commit
0980764dee. The call getgrouplist() of
the glibc library isn't safe to be called in between fork and
exec (see commit 75c125641a).
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Fixes: 0980764dee ("util: share code between virExec and virCommandExec")
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
These functions are used by an upcoming commit.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Rename virDomainNumaDefCPUFormat to virDomainNumaDefCPUFormatXML,
matching its peer virDomainNumaDefCPUParseXML and the general
vir*{Format,Parse}XML conventions.
Signed-off-by: Wim ten Have <wim.ten.have@oracle.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Generating libvirt packages per make rpm, "with-libxl=1" and "with-xen=1",
adds strict runtime dependencies per libxenlight for xen-libs package from
core libvirt-libs package. This is not necessary and unfortunate since
those dependencies set demand to "xen-libs" package even when there's no
need for libvirt xen or libxl driver components.
This patch is to have two separate xenconfig lib tool libraries: one for
core libvirt (without XL), and a another that contains xl for libxl driver
(libvirt_driver_libxl_impl.la) which when loading the driver, loads the
remaining symbols (xen{Format,Parse}XL. For the user/sysadmin, this means
the xen dependencies are moved into libxl driver, instead of core libvirt.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Wim ten Have <wim.ten.have@oracle.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
In preparation for privatizing the object, use the accessor to fetch
the obj->def instead of the direct reference.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Modify virStoragePoolObjGetAutostartLink and
virStoragePoolObjGetConfigFile to return "const char *"
since that's how both are used and to ensure no one
tries to VIR_FREE the result.
To avoid any issues later on if paths ever change (unlikely but
possible) and to match the style of other generated rules the paths
of the static rules have to be quoted as well.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
libvirt allows spaces in vm names, there were issues in the past but it
seems not removed so the assumption has to be that spaces are continuing
to be allowed.
Therefore virt-aa-helper should not reject spaces in vm names anymore if
it is going to be refused causing issues then the parser or xml schema
should do so.
Apparmor rules are in quotes, so a space in a path based on the name works.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If users only specified vendor&product (the common case) then parsing
the xml via virDomainHostdevSubsysUSBDefParseXML would only set these.
Bus and Device would much later be added when the devices are prepared
to be added.
Due to that a hot-add of a usb hostdev works as the device is prepared
and virt-aa-helper processes the new internal xml. But on an initial
guest start at the time virt-aa-helper renders the apparmor rules the
bus/device id's are not set yet:
p ctl->def->hostdevs[0]->source.subsys.u.usb
$12 = {autoAddress = false, bus = 0, device = 0, vendor = 1921, product
= 21888}
That causes rules to be wrong:
"/dev/bus/usb/000/000" rw,
The fix calls virHostdevFindUSBDevice after reading the XML from
virt-aa-helper to only add apparmor rules for devices that could be found
and now are fully known to be able to write the rule correctly.
It uncondtionally sets virHostdevFindUSBDevice mandatory attribute as
adding an apparmor rule for a device not found makes no sense no matter
what startup policy it has set.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Skip purging the backing chain and redetecting it when it was not going
to change during the time we were not present.
The decision is based on the new flag which records whether there were
blockjobs running to the status XML.
https://bugzilla.redhat.com/show_bug.cgi?id=1447169
Since domain can have at most one watchdog it simplifies things a
bit. However, since we must be able to set the watchdog action as
well, new monitor command needs to be used.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Currently we don't do it. Therefore we accept senseless
combinations of models and buses they are attached to.
Moreover, diag288 watchdog is exclusive to s390(x).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
For VMs with persistent config the config may change upon successful
completion of a job. Save it always if a persistent VM finishes a
blockjob. This will simplify further additions.
The status XML would be saved only for the copy job (in case of success)
or on failure even for other jobs. As the status contains the backing
chain data, which change after success we should always save it on
block job completion.
Checking of disk presence accesses storage on the host so it should be
done from the host setup function. Move the code to new function called
qemuProcessPrepareHostStorage and remove qemuDomainCheckDiskPresence.
Introduce a new function to prepare domain disks which will also do the
volume source to actual disk source translation.
The 'pretend' condition is not transferred to the new location since it
does not help in writing tests and also no tests abuse it.
Pass flags to the function rather than just whether we have incoming
migration. This also enforces correct startup policy for USB devices
when reverting from a snapshot.
qemuMigrationPrepareAny called multiple of the functions starting the
qemu process for incoming migration by adding the flags explicitly.
Extract them to a variable so that they can be easily used for other
calls or changed in the future.
Interestingly enough, we don't document the point of view of the
interface statistics. Therefore it's unknown to users if for
instance rx_packets is the number of packets received by domain or
received by host (from domain). Document this explicitly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Similarly to previous patch, for some types of interface domain
and host are on the same side of RX/TX barrier. In that case, we
need to set up the QoS differently. Well, swapped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1497410
The comment in virNetDevTapInterfaceStats() implementation for
Linux states that packets transmitted by domain are received by
the host and vice versa. Well, this is true but not for all types
of interfaces. For instance, for macvtaps when TAP device is
hooked right onto a physical device any packet that domain sends
looks also like a packet sent to the host. Therefore, we should
allow caller to chose if the stats returned should be straight
copy or swapped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Small wrapper to lookup interface in domain definition by its
name.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Users might have configured interface so that it's type of
network, but the corresponding network plugs interfaces into an
OVS bridge. Therefore, we have to check for the actual type of
the interface instead of the configured one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This code compiles only on Linux. Therefore the condition we
check is always true.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
qemu 2.7.0 introduces multiqueue virtio-blk(commit 2f27059).
This patch introduces a new attribute "queues". An example of
the XML:
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' queues='4'/>
The corresponding QEMU command line:
-device virtio-blk-pci,scsi=off,num-queues=4,id=virtio-disk0
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
When detaching an <interface/> from a domain, the MAC address is
parsed and if not present one is generated. If no corresponding
interface is found in the domain, the following error is
reported:
error: operation failed: no device matching mac address 52:54:00:75:32:5b found
where the MAC address is the auto generated one. This might be
very confusing. Solution to this is to ignore auto generated MAC
address when looking up the device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
It will come handy to know if the MAC address was generated (e.g.
during XML parse) or if it was parsed since provided by user in
the XML.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Found by Coverity. If virNWFilterHashTablePut, then the 3rd arg @val
must be free'd since it would be leaked.
This also fixes potential problem on the error path where the caller
could assume the virNWFilterHashTablePut was successful when in fact
it failed leading to other issues.
Rather than using loop break;'s in order to force a return
of rc = -1, let's just return -1 immediately on the various
error paths and then return 0 on the success path.
virDomainDiskSourceParse got to the point of being an ugly spaghetti
mess by adding more and more stuff into it. Split out parsing of network
disk information into a separate function so that it stays contained.
On pure success paths, virNWFilterIPAddrMapAddIPAddr was validly
consuming the input @addr; however, on failure paths it was possible
that virNWFilterVarValueCreateSimple succeed, but virNWFilterHashTablePut
failed resulting in virNWFilterVarValueFree being called to clean
up @val which also cleaned up the input @addr. Thus the caller had
no way to determine on failure whether it too should clean up the
passed parameter.
Instead, let's create a copy of the input @addr, then handle that
properly in the API allowing/forcing the caller to free it's own
copy of the input parameter.
Alter qemu command line generation in order to possibly add TLS for
a suitably configured domain.
Sample TLS args generated by libvirt -
-object tls-creds-x509,id=objvirtio-disk0_tls0,dir=/etc/pki/qemu,\
endpoint=client,verify-peer=yes \
-drive file.driver=vxhs,file.tls-creds=objvirtio-disk0_tls0,\
file.vdisk-id=eb90327c-8302-4725-9e1b-4e85ed4dc251,\
file.server.type=tcp,file.server.host=192.168.0.1,\
file.server.port=9999,format=raw,if=none,\
id=drive-virtio-disk0,cache=none \
-device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,\
id=virtio-disk0
Update the qemuxml2argvtest with a couple of examples. One for a
simple case and the other a bit more complex where multiple VxHS disks
are added where at least one uses a VxHS that doesn't require TLS
credentials and thus sets the domain disk source attribute "tls = 'no'".
Update the hotplug to be able to handle processing the tlsAlias whether
it's to add the TLS object when hotplugging a disk or to remove the TLS
object when hot unplugging a disk. The hot plug/unplug code is largely
generic, but the addition code does make the VXHS specific checks only
because it needs to grab the correct config directory and generate the
object as the command line would do.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Introduce a function to setup any TLS needs for a disk source.
If there's a configuration or other error setting up the disk source
for TLS, then cause the domain startup to fail.
For VxHS, follow the chardevTLS model where if the src->haveTLS hasn't
been configured, then take the system/global cfg->haveTLS setting for
the storage source *and* mark that we've done so via the tlsFromConfig
setting in storage source.
Next, if we are using TLS, then generate an alias into a virStorageSource
'tlsAlias' field that will be used to create the TLS object and added to
the disk object in order to link the two together for QEMU.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add an optional virTristateBool haveTLS to virStorageSource to
manage whether a storage source will be using TLS.
Sample XML for a VxHS disk:
<disk type='network' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source protocol='vxhs' name='eb90327c-8302-4725-9e1b-4e85ed4dc251' tls='yes'>
<host name='192.168.0.1' port='9999'/>
</source>
<target dev='vda' bus='virtio'/>
</disk>
Additionally add a tlsFromConfig boolean to control whether the TLS
setting was due to domain configuration or qemu.conf global setting
in order to decide whether to Format the haveTLS setting for either
a live or saved domain configuration file.
Update the qemuxml2xmltest in order to add a test to show the proper
parsing.
Also update the docs to describe the tls attribute.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a new TLS X.509 certificate type - "vxhs". This will handle the
creation of a TLS certificate capability for properly configured
VxHS network block device clients.
The following describes the behavior of TLS for VxHS block device:
(1) Two new options have been added in /etc/libvirt/qemu.conf
to control TLS behavior with VxHS block devices
"vxhs_tls" and "vxhs_tls_x509_cert_dir".
(2) Setting "vxhs_tls=1" in /etc/libvirt/qemu.conf will enable
TLS for VxHS block devices.
(3) "vxhs_tls_x509_cert_dir" can be set to the full path where the
TLS CA certificate and the client certificate and keys are saved.
If this value is missing, the "default_tls_x509_cert_dir" will be
used instead. If the environment is not configured properly the
authentication to the VxHS server will fail.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
The virNWFilterIPAddrMapAddIPAddr code can consume the @addr parameter
on success when the @ifname is found in the ipAddressMap->hashTable
hash table in the call to virNWFilterVarValueAddValue; however, if
not found in the hash table, then @addr is formatted into a @val
which is stored in the table and on return the caller would be
expected to free @addr.
Thus, the caller has no way to determine on success whether @addr was
consumed, so in order to fix this create a @tmp variable which will
be stored/consumed when virNWFilterVarValueAddValue succeeds. That way
the caller can free @addr whether the function returns success or failure.
The packet with passed FD has the following format:
--------------------------
| len | header | payload |
--------------------------
where "payload" has an additional count of FDs before the actual data:
------------------
| nfds | payload |
------------------
When the packet is received we parse the "header", which as a side
effect updates msg->bufferOffset to point to the beginning of "payload".
If the message call contains FDs, we need to also parse the count of
FDs, which also updates the msg->bufferOffset.
The issue here is that when we attempt to read the FDs data from the
socket and we receive EAGAIN we finish the reading and call poll()
to wait for the data the we need. When the data arrives we already have
the packet in our buffer so we read the "header" again but this time
we don't read the count of FDs because we already have it stored.
That means that the msg->bufferOffset is not updated to point to the
actual beginning of the payload data, but it points to the count of
FDs. After all FDs are processed we dispatch the message to process
it and decode the payload. Since the msg->bufferOffset points to wrong
data, we decode the wrong payload and the API call fails with
error messages:
Domain not found: no domain with matching uuid '67656e65-7269-6300-0c87-5003ca6941f2' ()
Broken by commit 133c511b52 which fixed a FD and memory leak.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
VM private data is cleared when the VM is turned off and also when the
VM object is being freed. Some of the clearing code was duplicated.
Extract it to a separate function.
This also removes the now unnecessary function
qemuDomainClearPrivatePaths.
Calling fallocate on the new (smaller) capacity ensures
that the whole file is allocated, but it does not reduce
the file size.
Also call ftruncate after fallocate.
https://bugzilla.redhat.com/show_bug.cgi?id=1366446
We have been trying to implement the ALLOCATE flag to mean
"the volume should be fully allocated after the resize".
Since commit b0579ed9 we do not allocate from the existing
capacity, but from the existing allocation value.
However this value is a total of all the allocated bytes,
not an offset.
For a sparsely allocated file:
$ perl -e 'print "x"x8192;' > vol1
$ fallocate -p -o 0 -l 4096 vol1
$ virsh vol-info vol1 default
Capacity: 8.00 KiB
Allocation: 4.00 KiB
Treating allocation as an offset would result in an incompletely
allocated file:
$ virsh vol-resize vol1 --pool default 16384 --allocate
Capacity: 16.00 KiB
Allocation: 12.00 KiB
Call fallocate from zero on the whole requested capacity to fully
allocate the file. After that, the volume is fully allocated
after the resize:
$ virsh vol-resize vol1 --pool default 16384 --allocate
$ virsh vol-info vol1 default
Capacity: 16.00 KiB
Allocation: 16.00 KiB
Introduce a new function virFileAllocate that will call the
non-destructive variants of safezero, essentially reverting
my commit 1390c268
safezero: fall back to writing zeroes even when resizing
back to the state as of commit 18f0316
virstoragefile: Have virStorageFileResize use safezero
This means that _ALLOCATE flag will no longer work on platforms
without the allocate syscalls, but it will not overwrite data
either.
Use an empty string to let qemu fill out the default.
This matches what's done in qemuBuildChrChardevStr.
https://bugzilla.redhat.com/show_bug.cgi?id=1454671
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This reverts commit edaf4ebe95.
This uses "reconnect" as attribute for <source> element, but we already
have a <reconnect> element for <source> element for chardev devices.
Since this is the same feature for different device it should be
presented in XML the same way.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Some values we read from the qemu monitor may be changed with the actual
state by the incoming migration. This means that we should refresh
certain things only after the migration has finished.
This is mostly visible in the cdrom tray state, which is by default
closed but may be opened by the guest OS. This would be refreshed before
qemu transferred the actual state and thus libvirt would think that the
tray is closed.
Note that this patch moves only a few obvious query commands. Others may
be moved later after individual assessment.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1463168
When the vcpu is successfully removed libvirt would remove the cgroup.
In cases when removal of the cgroup fails libvirt would report an error.
This does not make much sense, since the vcpu was removed and we can't
really do anything with the cgroup. This patch silences the errors from
cgroup removal.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1462092
It is possible (although possibly not very useful) to leave out
the service attribute when using <source mode='bind'/>
Fix the formatter bug introduced by commit 4a0da34 and format
the host when its present (checked for non-NULL inside
virBufferEscapeString) instead of basing it on the presence
of the service attribute.
https://bugzilla.redhat.com/show_bug.cgi?id=1455825
The alias recorded in disk->info.alias is the alias for the frontend
device but we are interested in the backend drive. This messed up the
disk node name extraction code as qemu reports the drive alias in the
block query commands. This was broken in the node name detector
refactoring done in commit 0175dc6ea0
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1494327
Seeing a log message saying 'flags=93' is ambiguous & confusing unless
you happen to know that libvirt always prints flags as hex. Change our
debug messages so that they always add a '0x' prefix when printing flags,
and '0' prefix when printing mode. A few other misc places gain a '0x'
prefix in error messages too.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some distros (see diff) chose to backport QMP support rather than rebase
to newer version of qemu. As a hack they added the string 'libvirt' to
the qemu -help output. Remove this as downstream-only hacks should be
carried by downstream and not litter upstream.
This effectively reverts commit ff88cd5905
Because qemuDomainDefCopy needs a string representation of a domain
definition, there's no reason for calling the lower level
qemuDomainDefFormatBuf API.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
virDomainDefFormatInternal (called by qemuDomainDefFormatXMLInternal)
already checks for buffer errors and properly resets the buffer on
failure.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
instead of only unloading it. This makes sure old profiles don't pile up
in /etc/apparmor.d/libvirt and we get updates to modified templates on
VM restart.
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
After commit 8708ca01c0 libvirtd consistently aborts with "stack
smashing detected" when nodedev driver is initialized.
This is caused by nlmsg_parse() being told that its array of nlattr*
has CTRL_CMD_MAX (10) entries, when in fact it is declared to have
CTRL_ATTR_MAX (8) entries. Since all the entries are initialized to
NULL, the result is that nlmsg_parse is overwriting 2*(sizof(nlattr*))
bytes outside the array.
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit id '5604c056' used the wrong API to generate the
<secret type='%s'..." field. The previous code used the
correct API as was done in commit id '6887af39'. The data
is actually a usage type not an auth type even though the
result is the same.
Passing a NULL value for the argument secAlias to the function
qemuDomainGetTLSObjects would cause a segmentation fault in
libvirtd.
Changed code to check before dereferencing a NULL secAlias.
Signed-off-by: Ashish Mittal <ashmit602@gmail.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1448268
When migrating to a file (e.g. when doing 'virsh save file'),
couple of things are happening in the thread that is executing
the API:
1) the domain obj is locked
2) iohelper is spawned as a separate process to handle all I/O
3) the thread waits for iohelper to finish
4) the domain obj is unlocked
Now, the problem is that while the thread waits in step 3 for
iohelper to finish this may take ages because iohelper calls
fdatasync(). And unfortunately, we are waiting the whole time
with the domain locked. So if another thread wants to jump in and
say copy the domain name ('virsh list' for instance), they are
stuck.
The solution is to unlock the domain whenever waiting for I/O and
lock it back again when it finished.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1471225
Commit id '99a2d6af2' was a bit too aggressive with determining whether
the provided path was a "physical" cd-rom in order to generate a taint
message due to the possibility of some guest and host trying to control
the tray. For cd-rom guest devices backed to some VIR_STORAGE_TYPE_FILE
storage, this wouldn't be a problem and as such it shouldn't be a problem
for guest devices using some sort of block device on the host such as
iSCSI, LVM, or a Disk pool would present.
So before issuing a taint message, let's check if the provided path of
the VIR_STORAGE_TYPE_BLOCK backed device is a "known" physical cdrom name
by comparing the beginning of the path w/ "/dev/cdrom" and "/dev/sr".
Also since it's possible the provided path could resolve to some /dev/srN
device, let's get that path as well and perform the same check.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The virSocketAddrFormat() allocates the string and it's caller
responsibility to free it afterwards.
==28857== 11 bytes in 1 blocks are definitely lost in loss record 37 of 168
==28857== at 0x4C2BEDF: malloc (vg_replace_malloc.c:299)
==28857== by 0x9A81D79: strdup (in /lib64/libc-2.23.so)
==28857== by 0x5DA3BF0: virStrdup (virstring.c:902)
==28857== by 0x5D96182: virSocketAddrFormatFull (virsocketaddr.c:427)
==28857== by 0x5D95E13: virSocketAddrFormat (virsocketaddr.c:352)
==28857== by 0x5706890: qemuBuildHostNetStr (qemu_command.c:3891)
==28857== by 0x57138D3: qemuBuildInterfaceCommandLine (qemu_command.c:8597)
==28857== by 0x5713D6A: qemuBuildNetCommandLine (qemu_command.c:8699)
==28857== by 0x57176F6: qemuBuildCommandLine (qemu_command.c:10027)
==28857== by 0x5769D61: qemuProcessCreatePretendCmd (qemu_process.c:6004)
==28857== by 0x4056EC: testCompareXMLToArgv (qemuxml2argvtest.c:502)
==28857== by 0x41DF40: virTestRun (testutils.c:180)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Since commit v2.2.0-199-g7ce711a30e libvirt stores an updated guest CPU
in domain's live definition and there's no need to update it every time
we want to format the definition. The commit itself tried to address
this in qemuDomainFormatXML, but forgot to fix qemuDomainDefFormatLive.
Not to mention that masking a previously set flag is only acceptable if
the flag was set by a public API user. Internally, libvirt should have
never set the flag in the first place.
https://bugzilla.redhat.com/show_bug.cgi?id=1485022
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When a user requested a domain XML description with
VIR_DOMAIN_XML_UPDATE_CPU flag, libvirt would use the host CPU
definition from host capabilities rather than the one which will
actually be used once the domain is started.
https://bugzilla.redhat.com/show_bug.cgi?id=1481309
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In the past we updated host-model CPUs with host CPU data by adding a
model and features, but keeping the host-model mode. And since the CPU
model is not normally formatted for host-model CPU defs, we had to pass
the updateCPU flag to the formatting code to be able to properly output
updated host-model CPUs. Libvirt doesn't do this anymore, host-model
CPUs are turned into custom mode CPUs once updated with host CPU data
and thus there's no reason for keeping the hacks inside CPU XML
formatters.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This will be used to improve the validation for this type of devices.
The former @def parameter is renamed to @dev, leaving @def for the
virDomainDef (following the style used elsewhere).
Signed-off-by: Pino Toscano <ptoscano@redhat.com>
The iohelper currently calls saferead() to get data from the
underlying file. This has a problem with O_DIRECT when hitting
end-of-file. saferead() is asked to read 1MB, but the first
read() it does may return only a few KB, so it'll try another
read() to fill the remaining buffer. Unfortunately the buffer
pointer passed into this 2nd read() is likely not aligned
to the extent that O_DIRECT requires, so rather than seeing
'0' for end-of-file, we'll get -1 + EINVAL due to misaligned
buffer.
The way the iohelper is currently written, it already handles
getting short reads, so there is actually no need to use
saferead() at all. We can simply call read() directly. The
benefit of this is that we can now write() the data immediately
so when we go into the subsequent reads() we'll always have a
correctly aligned buffer.
Technically the file position ought to be aligned for O_DIRECT
too, but this does not appear to matter when at end-of-file.
Tested-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
For vhost-user ports, Open vSwitch acts as the server and QEMU the client.
When OVS crashed or restart, QEMU shoule be reconnect to OVS.
Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This commit adds new events for two methods and operations: *PoolBuild() and
*PoolDelete(). Using the event-test and the commands set below we have the
following outputs:
$ sudo ./event-test
Registering event callbacks
myStoragePoolEventCallback EVENT: Storage pool test Defined 0
myStoragePoolEventCallback EVENT: Storage pool test Created 0
myStoragePoolEventCallback EVENT: Storage pool test Started 0
myStoragePoolEventCallback EVENT: Storage pool test Stopped 0
myStoragePoolEventCallback EVENT: Storage pool test Deleted 0
myStoragePoolEventCallback EVENT: Storage pool test Undefined 0
Another terminal:
$ sudo virsh pool-define test.xml
Pool test defined from test.xml
$ sudo virsh pool-build test
Pool test built
$ sudo virsh pool-start test
Pool test started
$ sudo virsh pool-destroy test
Pool test destroyed
$ sudo virsh pool-delete test
Pool test deleted
$ sudo virsh pool-undefine test
Pool test has been undefined
This commits can be a solution for RHBZ #1475227.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1475227
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The patch passes the reconnect timeout to QEMU by monitor on
chardev hotplug.
Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The util/vircrypto.c file uses gnutls, so we must directly link
libvirt_util.la with gnutls to avoid errors on OS which do not
resolve symbols against indirectly linked libraries.
This fixes a build failure on Ubuntu Trusty
CCLD storagevolxml2argvtest
/usr/bin/ld: ../src/.libs/libvirt_util.a(libvirt_util_la-vircrypto.o): undefined reference to symbol 'gnutls_strerror@@GNUTLS_1_4'
//usr/lib/x86_64-linux-gnu/libgnutls.so.26: error adding symbols: DSO missing from command line
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The VxHS block device will only use the newer formatting options and
avoid the legacy URI syntax.
An excerpt for a sample QEMU command line is:
-drive file.driver=vxhs,file.vdisk-id=eb90327c-8302-4725-9e1b-4e85ed4dc251,\
file.server.type=tcp,file.server.host=192.168.0.1,\
file.server.port=9999,format=raw,if=none,id=drive-virtio-disk0,cache=none \
-device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,\
id=virtio-disk0
Update qemuxml2argvtest with a simple test.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Extract out the "guts" of building a server entry into it's own
separately callable/usable function in order to allow building
a server entry for a consumer with src->nhosts == 1.
Add the backing parse and a test case to verify parsing of VxHS
backing storage.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a new virStorageNetProtocol for Veritas HyperScale (VxHS) disks
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Using the query-qmp-schema introspection - look for the 'vxhs'
blockdevOptions type.
NB: This is a "best effort" type situation as there is not a
mechanism to determine whether the running QEMU has been
built with '--enable-vxhs'. All we can do is check if the
option to use vxhs for a blockdev-add exists in the command
infrastructure which does not take that into account when
building its table of commands and options.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The mlx4 (Mellanox) netdev driver implements the sysfs phys_port_id
file for both VFs and PFs, so you can find the VF netdev plugged into
the same physical port as any given PF netdev by comparing the
contents of phys_port_id of the respective netdevs. That's what
libvirt does when attempting to find the PF netdev for a given VF
netdev (or vice versa).
Most other netdev's drivers don't implement phys_port_id, so the file
is visible in sysfs directory listing, but attempts to read it result
in ENOTSUPP. In these cases, libvirt is unable to read phys_port_id of
either the PF or the VF, so it just returns the first entry in the
PF/VF's list of netdevs.
But we've found that the i40e driver is in between those two
situations - it implements phys_port_id for PF netdevs, but doesn't
implement it for VF netdevs. So libvirt would successfully read the
phys_port_id of the PF netdev, then try to find a VF netdev with
matching phys_port_id, but would fail because phys_port_id is NULL for
all VFs. This would result in a message like the following:
Could not find network device with phys_port_id '3cfdfe9edc39'
under PCI device at /sys/class/net/ens4f1/device/virtfn0
To solve this problem in a way that won't break functionality for
anyone else, this patch saves the first netdev name we find for the
device, and returns that if we fail to find a netdev with the desired
phys_port_id.
Documentation states:
"'offset' and 'size' represent an area which must lie entirely within
the device or file." Enforce the that the buffer lies within fully.
Commit 3956af495e broke the blockPeek API since virStorageFileRead
allocates a return buffer and fills it with the data, while the API
fills a user-provided buffer. This did not get caught by the compiler
since the API prototype uses a 'void *'.
Fix it by transferring the data from the allocated buffer to the user
provided buffer.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1491217
This is particularly useful on operating systems that don't ship
Python as part of the base system (eg. FreeBSD) while still working
just as well as it did before on Linux.
While at it, make it explicit that our scripts are only going to
work with Python 2, and remove the usage of unbuffered I/O, which
as far as I can tell has no effect on the output files.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
This is particularly useful on operating systems that don't ship
Perl as part of the base system (eg. FreeBSD) while still working
just as well as it did before on Linux.
In one case (src/rpc/genprotocol.pl) the interpreter path was
missing altogether.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
I don't want to mask the real problem, but one can advocate
that we should be marking graphics ports as already in use on
qemuProcessReconnect anyway, because we already know that they
are taken.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Instead of checking for all possible constants that every
kernel header with devlink support should have (and defining
HAVE_DECL_DEVLINK as 1 if any of them is present due to the
way AC_CHECK_DECLS works), only check for DEVLINK_CMD_ESWITCH_GET.
This is the name of the constant since kernel 4.11. Between 4.8
and 4.11, the now deprecated spelling DEVLINK_CMD_ESWITCH_MODE_GET
was used.
Assume DEVLINK_ESWITCH_MODE_SWITCHDEV is available, since it was
introduced along with the deprecated spelling.
Introduce virStoragePoolObjForEachVolume to scan each volume
calling the passed callback function until all volumes have been
processed in the storage pool volume list, unless the callback
function returns an error.
Introduce virStoragePoolObjSearchVolume to search each volume
calling the passed callback function until it returns true
indicating that the desired volume was found.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create/use virStoragePoolObjAddVol in order to add volumes onto list.
Create/use virStoragePoolObjRemoveVol in order to remove volumes from list.
Create/use virStoragePoolObjGetVolumesCount to get count of volumes on list.
For the storage driver, the logic alters when the volumes.obj list grows
to after we've fetched the volobj. This is an optimization of sorts, but
also doesn't "needlessly" grow the volumes.objs list and then just decr
the count if the virGetStorageVol fails.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create/use a helper to perform object allocation.
Adjust storagevolxml2argvtest.c in order to use the allocator and
setting of the obj->def.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation for making a private object, create accessor API's for
consumer storage functions to use:
virStoragePoolObjGetDef
virStoragePoolObjSetDef
virStoragePoolObjGetNewDef
virStoragePoolObjDefUseNewDef
virStoragePoolObjGetConfigFile
virStoragePoolObjSetConfigFile
virStoragePoolObjGetAutostartLink
virStoragePoolObjIsActive
virStoragePoolObjSetActive
virStoragePoolObjIsAutostart
virStoragePoolObjSetAutostart
virStoragePoolObjGetAsyncjobs
virStoragePoolObjIncrAsyncjobs
virStoragePoolObjDecrAsyncjobs
Signed-off-by: John Ferlan <jferlan@redhat.com>
Available since QEMU 2.10.0 (specifically commit
v2.9.0-2233-g53f9a6f45f).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The features were added to QEMU by commit v2.4.0-1690-gf7fda28094 as
Skylake Server features.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Adding functionality to libvirt that will allow querying the interface
for the availability of switchdev Offloading NIC capabilities.
The switchdev mode was introduced in kernel 4.8, the iproute2-devlink
command to retrieve the switchdev NIC feature with command example:
devlink dev eswitch show pci/0000:03:00.0
This feature is needed for Openstack so we can do a scheduling decision
if the NIC is in Hardware Offload (switchdev) or regular SR-IOV (legacy) mode.
And select the appropriate hypervisors with the requested capability see [1].
[1] - https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/enable-sriov-nic-features.html
Reviewed-by: Laine Stump <laine@laine.org>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1075520
Apart from generic checks, we need to constrain netmask/prefix
length a bit. Thing is, with current implementation QEMU needs to
be able to 'assign' some IP addresses to the virtual network. For
instance, the default gateway is at x.x.x.2, dns is at x.x.x.3,
the default DHCP range is x.x.x.15-x.x.x.30. Since we don't
expose these settings yet, it's safer to require shorter prefix
to have room for the defaults.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: laine@laine.org
https://bugzilla.redhat.com/show_bug.cgi?id=1075520
Currently, all that users can specify for an interface type of
'user' is the common attributes: PCI address, NIC model (and
that's basically it). However, some need to configure other
address range than the default one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: laine@laine.org
Only feature policy is checked on s390, which was previously done in
virCPUUpdate, but that's not the correct place for the check once we
have virCPUValidateFeatures.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This new API may be used to check whether all features used in a CPU
definition are valid (e.g., libvirt knows their name, their policy is
supported, etc.). Leaving this API unimplemented in an arch subdriver
means libvirt does not restrict CPU features usable on the associated
architectures.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The host CPU definitions reported in the capabilities XML may contain
CPU features unknown to QEMU, but the result of virConnectBaselineCPU is
supposed to be directly usable as a guest CPU definition and thus it
should only contain features QEMU knows about.
https://bugzilla.redhat.com/show_bug.cgi?id=1450317
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The filter only needs to know the CPU architecture. Passing
virQEMUCapsPtr as opaque is a bit overkill.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The implementation of virConnectBaselineCPU may be different for each
hypervisor. Thus it shouldn't really be implmented in the cpu code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit id 'e02ff020cac' neglected to use the attrBuf and childBuf
in the virDomainDiskSourceFormatNetwork call.
So make the necessary alterations to allow usage.
Rather than checking during XML processing, move the check for
valid <encryption> into virDomainDiskDefParseValidate and alter
the text of the message slightly to be a bit more correct.
Rather than checking during XML processing, move the checks for correct
and valid auth into virDomainDiskDefParseValidate. This will introduce
virDomainDiskSourceDefParseAuthValidate to validate that the authdef
stored for the virStorageSource is valid. This can then be expanded
to service backingStore sources as well.
Alter the message text slightly as well to distinguish between an
unknown name and an incorrectly used name. Since type is not a
mandatory field, add the NULLSTR() around the output of the unknown
error. NB, a config using unknown formatting would fail virschematest
since it only accepts 'iscsi' and 'ceph' as "valid" types.
Some operations done to rollback disk image labelling and locking might
overwrite (or clear) the actual error. Remember the original error when
tearing down disk access so that it's not obscured.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1461301
Some cleanup paths overwrite a usefull error message with a less useful
one and we then try to preserve the original message. The handlers added
in this patch will simplify the operations since they are designed right
for the purpose.
Block job QMP commands with underscores rather than dashes were never
released in upstream qemu, (they were added, but modified in the same
release [1]), but a certain distro managed to backport the version in the
middle.
The change also slightly modified semantics for the abort command, which
made us have a lot of code which was only ever present in certain
downstream distros.
Clean the upstream code from the legacy cruft and support only the
upstream implementations.
[1] See qemu commit v1.0-2176-gdb58f9c060
Reviewed-by: Eric Blake <eblake@redhat.com>
No need to pass a @driver parameter since all that's done is deref
the @cfg especially since the only caller can just pass an already
referenced @cfg.
Also, looks like commit id '0298531b' at one time had a different
name for the API, so I took the liberty of fixing the comments too
since I would already be updating them for the @cfg variable.
For a logged in user this a path like /dev/dri/renderD128 will have
default ownership root:video which won't work for the qemu:qemu user,
so we need to chown it.
We only do this when mount namespaces are enabled in the qemu driver,
so the chown'ing doesn't interfere with other users of the shared
render node path
https://bugzilla.redhat.com/show_bug.cgi?id=1460804
The VIR_SECURITY_MANAGER_MOUNT_NAMESPACE flag informs the DAC driver
if mount namespaces are in use for the VM. Will be used for future
changes.
Wire it up in the qemu driver
https://bugzilla.redhat.com/show_bug.cgi?id=1464313
If a Disk pool was defined/created using XML that either didn't
specify a specific format or specified format type='unknown', then
restarting a pool after an initial disk backend build with overwrite
would fail after a libvirtd restart for a non-autostarted pool.
This is because the persistent pool data is not updated during pool
build w/ overwrite processing to have the VIR_STORAGE_POOL_DISK_DOS
default format.
So in addition to the alteration done during disk build processing,
alter the default expectation for disk startup to be DOS if nothing
has been defined yet. That will either succeed if the pool had been
successfully built previously using the default DOS format or fail
with a message indicating the format is something else that does not
match the expect format 'dos'.
https://bugzilla.redhat.com/show_bug.cgi?id=1477880
If the "/#" is missing from the provided iSCSI path, then we need
to provide the default LUN of /0; otherwise, QEMU will fail to parse
the URL causing a failure to either create the guest or hotplug
attach the storage.
During post parse, for any iSCSI disk or hostdev, scan the source
path looking for the presence of '/', if found, then we can assume
the LUN is provided. If not found, alter the input XML to add the
"/0". This will cause the generated XML to have the generated
value when the domain config is saved after post parse.
Commit 703abf1d7 changed the logic so that we don't attempt to re-create
the image if it's a block device. This was done by modifying the
'reuse' variable. Unfortunately after modifying it one of the uses was
to infer whether we should probe the disk format. After changes in the
commit mentioned above we would attempt the probe if the target of the
copy is a block device and the format was not provided explicitly rather
than using the format of the disk.
Fix it by explicitly checking whether the user requested a reuse of the
disk rather than the modified boolean flag.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1490826
https://bugzilla.redhat.com/show_bug.cgi?id=1447169
With this patch users can cold plug a watchdog. Things are pretty
simple because a domain can have at most one watchdog device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If there was an error when constructing the buffer, NULL is
returned. The buffer is never freed though.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When destroying a domain libvirt marks it internally with a
beingDestroyed flag to make sure the qemuDomainDestroyFlags API itself
cleans up after the domain rather than letting an uninformed EOF handler
do it. However, when the domain is being started at the moment libvirt
was asked to destroy it, only the starting thread can properly clean up
after the domain and thus it ignores the beingDestroyed flag. Once
qemuDomainDestroyFlags finally gets a job, the domain may not be running
anymore, which should not be reported as an error if the domain has been
starting up.
https://bugzilla.redhat.com/show_bug.cgi?id=1445600
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
This option requires:
<ioapic driver='qemu'/>
Report an error in case someone tries to combine
it with different ioapic setting.
Setting 'eim' on without enabling 'intremap' does not make sense.
https://bugzilla.redhat.com/show_bug.cgi?id=1457610
Add a 'cleanup' label and improve the readability of one of the
checks by making it conform to our formatting standard and moving
the corresponding comment.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
TPM 2 does not implement sysfs files for cancellation of commands.
We therefore use /dev/null for the cancel path passed to QEMU.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Tested-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Add a new CPU model called 'EPYC' to model processors from AMD EPYC
family (which includes EPYC 76xx,75xx,74xx, 73xx and 72xx).
The following features bits have been added/removed compare to Opteron_G5
Added: monitor, movbe, rdrand, mmxext, ffxsr, rdtscp, cr8legacy, osvw,
fsgsbase, bmi1, avx2, smep, bmi2, rdseed, adx, smap, clfshopt, sha
xsaveopt, xsavec, xgetbv1, arat
Removed: xop, fma4, tbm
The patch is depend on EPYC CPU model supported introduced in qemu [1]
[1] https://patchwork.kernel.org/patch/9902205/
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
In case of real migration (not migrating to file on save, dump etc)
migration info is not complete at time qemu finishes migration
in normal (non postcopy) mode. We need to update disks stats,
downtime info etc. Thus let's not expose this job status as
completed.
To archive this let's set status to 'qemu completed' after
qemu reports migration is finished. It is not visible as complete
job to clients. Cookie code on confirm phase will finally turn
job into completed. As we don't need more things to do when
migrating to file status is set to 'completed' as before
in this case.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When getting job info in case mirror does not reach ready phase
fetch mirror stats from qemu. Otherwise mirror stats are already
saved in current job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Looks like it is more simple to drop this optimization as we are
going to add getting disks stats during migration via quering qemu
process and checking if we have to acquire job condition becomes
more complicate.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Instead of checking stat.status let's set status to migrating
as soon as migrate command is send (waiting for completion
is a good place too).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Setting status to none has little value - getting job status
will not return even elapsed time.
After this patch getting job stats stays correct in a sence
it will not fetch migration stats because it consults
stats.status before doing the fetch.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Querying destination migration statistics may result in getting
a failure or getting a elapsed time value depending on stats.status
value which is odd. Instead let's always fail. Clients should
be ready to handle this as currently getting failure period
can be considerable.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuMigrationFetchJobStatus is rather inconvinient. Some of its
callers don't need status to be updated, some don't need to update
elapsed time right away. So let's update status or elapsed time
in callers instead.
This patch drops updating job status on getting job stats by
client. This way we will not provide status 'completed' while
it is not yet updated by migration routine.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This way we get stats only in one place. The former code waits for
complete/postcopy status basically and don't need to mess with stats.
The patch drops raising an error on stats updates failure. This
does not make much sense anyway.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Let's introduce QEMU_DOMAIN_JOB_STATUS_POSTCOPY state for job.current->status
instead of checking job.current->stats.status. The latter can be changed
when fetching migration statistics. Moving state function from the variable
and leave only store function seems more managable.
This patch removes all state checking usage of stats except for
qemuDomainGetJobStatsInternal. This place will be handled separately.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch simply switches code from using VIR_DOMAIN_JOB_* to
introduced QEMU_DOMAIN_JOB_STATUS_*. Later this gives us freedom
to introduce states for postcopy and mirroring phases.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1439991
Whenever a device is being updated via
virDomainUpdateDeviceFlags() API, we parse the device XML and
ideally run some generic checks to validate the configuration
(e.g. if device defines per-device boot order but the domain has
os/boot element already). Well, that's the theory - due to a
missing check we've jumped early from that check function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Neither @cfg nor (now) @driver is used in the API, so remove them
and mark @opaque as UNUSED.
NB: Commit id 'fa3c558596' dropped the unused @qemuCaps which was the
last consumer of @driver other than @cfg, but even @cfg was never used
even in the original implementation from commit id 'd987f63a'.
arm/aarch64 -M virt on KVM doesn't and will never work with standard
VGA card emulation. The recommended method is to use type=virtio, so
let's make it the default for video devices without an explicit type
set by the user.
https://bugzilla.redhat.com/show_bug.cgi?id=1404112
Signed-off-by: Cole Robinson <crobinso@redhat.com>
This allows drivers to set their own default. But if a driver neglects
to fill one in, we still error like we previously would at parse time.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Will be needed for future patches to pull the default video type
setting out of XML parsing routines.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
There were a few places in our code where the following pattern in 'if'
condition occurred:
if ((foo = bar() < 0))
do something;
This patch adjusts the conditions to the expected format:
if ((foo = bar()) < 0)
do something;
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1488192
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Suggested-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Although not previously explicitly documented, the expectation for
the libvirt event loop is that an implementation is registered early
in application startup, before calling any libvirt APIs and then
run forever after. Replacing a previously registered event loop is
not safe & subject to races even if virConnectClose has been called
on open handles, due to delayed deregistration of callbacks during
conenction close.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Funny thing. So when initializing LXC driver's capabilities,
firstly the virLXCDriverGetCapabilities() is called. This creates
new capabilities, stores them under driver->caps, ref() them and
return them. However, the return value is ignored. Secondly, the
function is called yet again and since we have driver->caps set,
they are ref()-ed again an returned. So in the end, driver's
capabilities have refcount of three when in fact they should have
refcount of one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If you use the VDDK library to access virtual machines remotely, you
really need to know the Managed Object Reference ("moref") of the VM.
This must be passed each time you connect to the API.
For example nbdkit's VDDK plugin requires a moref to be passed to
mount up a VM's disk remotely:
nbdkit vddk user=root password=+/tmp/rootpw \
server=esxi.example.com thumbprint=xx:xx:xx:... \
vm=moref=2 \
file="[datastore1] Fedora/Fedora.vmdk"
Getting the moref is a huge pain. To get some idea of what it is, why
it is needed, and how much trouble it is to get it, see:
https://blogs.vmware.com/vsphere/2012/02/uniquely-identifying-virtual-machines-in-vsphere-and-vcloud-part-1-overview.htmlhttps://blogs.vmware.com/vsphere/2012/02/uniquely-identifying-virtual-machines-in-vsphere-and-vcloud-part-2-technical.html
However the moref is available conveniently in the internals of the
libvirt VMX driver. This patch exposes it as a custom XML element
using the same "vmware:" namespace which was previously used for the
datacenterpath (see libvirt commit 636a990587).
It appears in the XML like this:
<domain type='vmware' xmlns:vmware='http://libvirt.org/schemas/domain/vmware/1.0'>
<name>Fedora</name>
...
<vmware:datacenterpath>ha-datacenter</vmware:datacenterpath>
<vmware:moref>2</vmware:moref>
</domain>
Note that the moref can appear as either a simple ID (for esx://
connections) or as a "vm-<ID>" (for vpx:// connections). It should be
treated by users as an opaque string.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1487322
In ace45e67ab I tried to fix a problem that we get the reply to
a D-Bus call while we were sleeping. In that case the callback
was never set. So I changed the code that the callback is called
directly in this case. However, I hadn't realized that since the
callback is called out of order it locks the virNetDaemon.
Exactly the very same virNetDaemon object that we are dealing
with right now and that we have locked already (in
virNetDaemonAddShutdownInhibition())
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We call qemuDomainGetMachineName on domain start. On first
start (after daemon start) pid is 0 and virSystemdGetMachineNameByPID
don't get called. But after domain shutting down pid became -1 so
on next start virSystemdGetMachineNameByPID is called and returned an error.
Error is ignored so it is not critical. But at least on my system
(systemd-219 with extra patches) systemd-machined is crashed on
this request.
This behaviour is triggered by eaf2c9f89.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1484230
When updating a virtio enabled vNIC and trying to change either
of rx_queue_size or tx_queue_size success is reported although no
operation is actually performed. Moreover, there's no way how to
change these on the fly. This is due to way we check for changes:
explicitly for each struct member. Therefore it's easy to miss
one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1437797
Rather than using refreshVol which essentially only updates the
allocation, capacity, and permissions for the volume, but not
the format which does get updated in a pool refresh - let's use
the same helper that pool refresh uses in order to update the
volume target.
Currently while parsing domain XML we clear the UNIX path if it matches
one of the auto-generated paths by libvirt. After that when the guest
is started new path is generated but the mode is also changed to "bind".
In the real-world use-case the mode should not change, it only happens
if a user provides a mode='connect' and path that matches one of the
auto-generated path or not provides a path at all.
Before *reconnect* feature was introduced there was no issue, but with
the new feature we need to make sure that it's used only with "connect"
mode, therefore we need to move the mode change into parsing in order
to have a proper error reported by validation code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Inspired by the recent GIT / Mercurial security flaws
(http://blog.recurity-labs.com/2017-08-10/scm-vulns),
consider someone/something manages to feed libvirt a bogus
URI such as:
virsh -c qemu+ssh://-oProxyCommand=gnome-calculator/system
In this case, the hosname "-oProxyCommand=gnome-calculator"
will get interpreted as an argument to ssh, not a hostname.
Fortunately, due to the set of args we have following the
hostname, SSH will then interpret our bit of shell script
that runs 'nc' on the remote host as a cipher name, which is
clearly invalid. This makes ssh exit during argv parsing and
so it never tries to run gnome-calculator.
We are lucky this time, but lets be more paranoid, by using
'--' to explicitly tell SSH when it has finished seeing
command line options. This forces it to interpret
"-oProxyCommand=gnome-calculator" as a hostname, and thus
see a fail from hostname lookup.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When recreating folders with namespaces, the directory type was not
being handled at all. It's not special, we probably just didn't know
that that can be used as a volume path as well. The code failed
gracefully, but we want to allow that so that we can use <disk
type='dir'> in domains again.
Partially-resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1443434
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Our backing probing code handles directory file types properly in
virStorageFileGetMetadataRecurse(), by that I mean it leaves them
alone. However its caller, the virStorageFileGetMetadata() resets the
type to raw before probing, without even checking the type. We need
to special-case TYPE_DIR in order to achieve desired results.
Also, in order to properly test this, we need to stop resetting format
of volumes in tests for TYPE_DIR (probably the reason why we didn't
catch that and why the test data didn't need to be modified).
Partially-resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1443434
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This commit adds qemu driver implementation to edit xml
configuration of managed save state file of a domain.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
This commit adds qemu driver implementation to get xml description
for managed save state domain.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
Similar to domainSaveImageDefineXML this commit adds domainManagedSaveDefineXML
API which allows to edit domain's managed save state xml configuration.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
Similar to domainSaveImageGetXMLDesc this commit adds domainManagedSaveGetXMLDesc
API which allows to get the xml of managed save state domain.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1476866
For some reason, we completely ignore <on_reboot/> setting for
domains. The implementation is simply not there. It never was.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This API is definitely modifying state of @vm. Therefore it
should grab a job.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
At some places we either already have synchronous job or we just
released it. Also, some APIs might want to use this code without
having to release their job. Anyway, the job acquire code is
moved out to qemuDomainRemoveInactiveJob so that
qemuDomainRemoveInactive does just what it promises.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
We can now check for the error and not care about the return value as
it will be properly handled in virBufferContentAndReset() anyway.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The function is useful even without using the return value. And if
needed, the return value can be obtained by other calls as well. The
potential for clean-up can be seen in the following patch.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Otherwise longer domain names might generate paths that are too long
to be created. This follows what other parts of the code do as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1453194
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We always truncated the name at 20 bytes instead of characters. In
case 20 bytes were in the middle of a multi-byte character, then the
string became invalid and various parts of the code would error
out (e.g. XML parsing of that string). Let's instead properly
truncate it after 20 characters instead.
We cannot test this in our test suite because we would need to know
what locales are installed on the system where the tests are ran and
if there is supported one (most probably there will be, but we cannot
be 100% sure), we could initialize gettext in qemuxml2argvtest, but
there would still be a chance of getting two different (both valid,
though) results.
In order to test this it is enough to start a machine with a name for
which trimming it after 20 bytes would create invalid sequence (e.g.
1234567890123456789č where č is any multi-byte character). Then start
the domain and restart libvirtd. The domain would disappear because
such illegal sequence will not go through the XML parser. And that's
not a bug of the parser, it should not be in the XML in the first
place, but since we don't use any sophisticated formatter, just
mash some strings together, the formatting succeeds.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1448766
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The reconnect attribute for chardev devices in QEMU is used to
configure the reconnect timeout in seconds. Setting '0' value disables
the reconnect functionality thus we don't allow to set '0' for QEMU.
To disable the reconnect user should use <reconnect enabled='no'/>.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1254971
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This reverts commit 92840eb3a7.
More recent reviews/changes don't have the vir*ObjNew APIs
consuming the @def, so remove from Interface as well. Changes
needed to also deal with conflicts from commit id '46f5eca4'.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Commit 94c465d0 refactored the logging setup phase but introduced an
issue, where the daemon ignores verbose mode when there are no outputs
defined and the default must be used. The problem is that the default
output was determined too early, thus ignoring the potential '--verbose'
option taking effect. This patch postpones the creation of the default
output to the very last moment when nothing else can change. Since the
default output is only created during the init phase, it's safe to leave
the pointer as NULL for a while, but it will be set eventually, thus not
affecting runtime.
Patch also adjusts both the other daemons.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1442947
Signed-off-by: Erik Skultety <eskultet@redhat.com>
We can't retrieve the isolation group of a device that's not present
in the system. However, it's very common for VFs to be created late
in the boot, so they might not be present yet when libvirtd starts,
which would cause the guests using them to disappear.
Moreover, for other architectures and even ppc64 before isolation
groups were introduced, it's considered perfectly fine to configure a
guest to use a device that's not yet (or no longer) available to the
host, with the obvious caveat that such a guest won't be able to
start before the device is available.
In order to be consistent, when a device's isolation group can't be
determined fall back to not isolating it rather than erroring out or,
worse, making the guest disappear.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1484254
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
While formatting disk or chardev element they both uses
virDomainDiskSourceDefFormatSeclabel() function which also closes
the source element. This is not extendable.
Use the new virXMLFormatElement() to properly format the source
element with possible child elements.
As a side effect it fixes a bug in disk source formatting.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This helper allows you to better structurize the code if some element
may or may not contains attributes and/or child elements.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
To handle setting a default heads value. Convert callers that were
doing it by hand
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
And into DeviceDefValidate which is the expected place
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
The ram/vram = 0 bits aren't needed, and PostParse will fill in the
needed QXL default
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Both of these are dead code: qemu_command.c explicitly rejects
VIRT_XEN earlier in the call chain, and qemu_parse_command.c
will never set VIRT_XEN anymore
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
This is more user-friendly because the error will be
displayed directly instead of being buried in the log.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Using a variable named 'stat' clashes with the system function
'stat()' causing compiler warnings on some platforms:
libxl/libxl_driver.c: In function 'libxlDomainBlockStatsVBD':
libxl/libxl_driver.c:5387: error: declaration of 'stat' shadows a global declaration [-Wshadow]
/usr/include/sys/stat.h:455: error: shadowed declaration is here [-Wshadow]
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
When parsing the config, we look for the SCSI controllers one by one,
remembering their models, then let virDomainDefAddImplicitDevices
add them if any SCSI disk is using them.
Since these controllers are not really implicit (they are present
in the source config), add them explicitly.
This patch maintains the behavior of not adding a controller
if it was present in the config, but no disk was using it.
This also resolves the memory leak of virVMXParseConfig overwriting
the video device added by calling virDomainDefAddImplicitDevices
before the parsing is finished.
Reported-by: Michal Privoznik <mprivozn@redhat.com>
For selected hostdev types, we validate that the address type
matches the subsystem type when parsing the XML.
Move it to the validation phase, to allow extending the checks
to other subsystem types without making existing domains disappear.
At the time the check was written virtuozzo did not use disabled items in boot
order configuration. Boot items were always enabled. Now they can be disabled
as well. Supporting such items is easy - they just should be ignored.
When parsing bootable devices, we maintain a bitmap of used
<boot order=""> elements. Use it in the post-parse function
to figure out whether the user tried to mix per-device and
per-domain boot elements.
This removes the need to count them twice.
These functions contain the post-parse steps common for all drivers.
Rename it to use the 'Common' prefix, instead of the vagueness
of 'Internal', leaving 'Internal' available for other vague uses.
Since the source element is parsed only once for these type of
character devices we don't have to use temporary variable and
check whether the variable was already set.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The extra check whether (connect|bind)(Host|Service) was set is
required because for UDP chardev there can be two source elements.
Without the check there could be a memory leak.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
In order to ensure that the default protocol is RAW, explicitly
assigning VIR_DOMAIN_CHR_TCP_PROTOCOL_RAW = 0.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Currently we accept and correctly parse this chardev XML:
...
<channel type='tcp'>
<source mode='connect'/>
<source mode='bind' host='localhost'/>
<source service='4567'/>
<target type='virtio' name='test'/>
</channel>
...
The parsed formatted XML is:
...
<channel type='tcp'>
<source mode='connect' host='localhost' service='4567'/>
<target type='virtio' name='test'/>
</channel>
...
That behavior is super wrong and should not be allowed. If you notice
the current parse takes the first found attribute and uses that value,
so for example from the "<source mode='bind' host='localhost'/>" only
the "host" attribute is used. It works the same way for all possible
attributes that we are able to parse for source element.
This patch enforces providing only one source element for all character
devices, only for UDP type we allow to provide two source elements
since you can specify both modes.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Since its introduction in commit 874e65aa, if someone requests:
<os><bios useserial="yes"/><os/>
we report an error if we cannot successfully count the number
of serial devices via an XPath query.
Instead of fixing the check (and moving it to the validation phase,
to prevent existing domains from disappearing), drop it completely.
For QEMU, the number of serials is checked when building the command
line.
When security drivers are active but confinement is not enabled,
there is no need to autogenerate <seclabel> elements when starting
a domain def that contains no <seclabel> elements. In fact,
autogenerating the elements can result in needless save/restore and
migration failures when the security driver is not active on the
restore/migration target.
This patch changes the virSecurityManagerGenLabel function in
src/security_manager.c to only autogenerate a <seclabel> element
if none is already defined for the domain *and* default
confinement is enabled. Otherwise the needless <seclabel>
autogeneration is skipped.
Resolves: https://bugzilla.opensuse.org/show_bug.cgi?id=1051017
I mistakenly thought pSeries guests supported 32 PHBs,
but it turns out they only support 31. Validate the
target index accordingly.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1479647
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Validation should happen after parsing, so the proper
location for it is virDomainControllerDefValidate()
rather than virDomainControllerDefParseXML().
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Use the new facility which allows to ignore failures in post parse
callbacks if they are not fatal so that VM configs are not lost if the
emulator binary is missing.
If qemuCaps can't be populated on daemon restart skip certain portions
of the post parse callbacks during config reload and re-run the callback
during VM startup.
This fixes VMs vanishing if the emulator binary was broken or
uninstalled and libvirtd was restarted.
qemuDomainControllerDefPostParse assigns the default USB controller
model when it was not specified by the user. Skip this step if @qemuCaps
is missing so that we don't fill wrong data. This will then be fixes by
re-running the post parse callback.
Report the given GIC version as unsupported if @qemuCapsi is NULL. This
will be helpful to run post parse callbacks even if qemu is not
currently installed.
If qemuCaps are not present, just return the original machine type name.
This will help in situations when qemuCaps is not available in the post
parse callback.
Some failures of the post parse callback can be tolerated. This is
specifically desired when loading the configs of existing VMs. In such
case the post parse callback should not really be modifying anything
in the definition.
This patch adds a parse flag VIR_DOMAIN_DEF_PARSE_ALLOW_POST_PARSE_FAIL
which will allow the callbacks to report non-fatal failures by returning
a positive return value. In such case the field 'postParseFailed' in the
domain definition is set to true, to notify the drivers that the
callback failed and possibly needs to be re-run.
Post parse callbacks will need to be able to signal that they failed
non-fatally. This means that we need to return the value returned by the
callback without modification.
The domain post parse callback, domain address callback and the domain
device callback (for every single device) would each grab qemuCaps for
the current emulator. This is quite wasteful. Use the new callback to do
this just once.
Some drivers use def-specific private data across callbacks (e.g.
qemuCaps in the qemu driver). Currently it's mostly allocated in every
single callback. This is rather wasteful, given that every single call
to the device callback allocates it.
The new callback will allocate the data (if not provided externally) and
then use it for the VM, address and device post parse callbacks.
Add yet another post parse callback, which is executed prior the real
one without @parseOpaque. This is meant to set basics before
@parseOpaque (in case of the qemu driver qemuCaps) can be allocated.
This callback will allow to optimize passing of custom parseOpaque
through the callbacks.
The helper returns true if a string contains any of the given chars.
virStringHasControlChars can be reimplemented using that helper.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Let this new method handle the device object we obtained from the
monitor in order to enhance readability.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
So we have a sanity check for the udev monitor fd. Theoretically, it
could happen that the udev monitor fd changes (due to our own wrongdoing,
hence the 'sanity' here) and if that happens it means we are handling an
event from a different entity than we think, thus we should remove the
handle if someone somewhere somehow hits this hypothetical case.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
It might happen that virFileResolveLinkHelper fails on the lstat system
call. virFileResolveLink expects the caller to report an error when it
fails, however this wasn't the case for udevProcessMediatedDevice.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Testing qemu-2.10-rc3 shows issues like:
qemu-system-aarch64: -drive file=/home/ubuntu/vm-start-stop/vms/
7936-0_CODE.fd,if=pflash,format=raw,unit=1: Failed to unlock byte 100
There is an apparmor deny due to qemu now locking those files:
apparmor="DENIED" operation="file_lock" [...]
name="/home/ubuntu/vm-start-stop/vms/7936-0_CODE.fd"
name="/var/lib/uvtool/libvirt/images/kvmguest-artful-normal.qcow"
[...] comm="qemu-system-aarch64" requested_mask="k" denied_mask="k"
The profile needs to allow locking for loader and nvram files via
the locking (k) rule.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Testing qemu-2.10-rc2 shows issues like:
qemu-system-x86_64: -drive file=/var/lib/uvtool/libvirt/images/kvmguest- \
artful-normal.qcow,format=qcow2,if=none,id=drive-virtio-disk0:
Failed to lock byte 100
It seems the following qemu commit changed the needs for the backing
image rules:
(qemu) commit 244a5668106297378391b768e7288eb157616f64
Author: Fam Zheng <famz@redhat.com>
file-posix: Add image locking to perm operations
The block appears as:
apparmor="DENIED" operation="file_lock" [...]
name="/var/lib/uvtool/libvirt/images/kvmguest-artful-normal.qcow"
[...] comm="qemu-system-x86" requested_mask="k" denied_mask="k"
With that qemu change in place the rules generated for the image
and backing files need the allowance to also lock (k) the files.
Disks are added via add_file_path and with this fix rules now get
that permission, but no other rules are changed, example:
- "/var/lib/uvtool/libvirt/images/kvmguest-artful-normal-a2.qcow" rw,
+ "/var/lib/uvtool/libvirt/images/kvmguest-artful-normal-a2.qcow" rwk
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
It's equivalent of calling virXPathString("string(.)", ctxt) but it
doesn't have to use the XPath resolving and parsing.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The virXMLPropStringLimit is an equivalent of virXPathStringLimit
which should be preferred if you already have a XML dom node or
if you need to parse more than one property.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Back in the day when I was implementing QoS for networks there
were no self inflating virBitmaps. Only the static ones.
Therefore, I had to allocate the whole 8KB of memory in order to
keep track of used/unused class IDs. This is rather wasteful
because nobody is ever gonna use that much classes (kernel
overhead would drastically lower the bandwidth). Anyway, now that
we have self inflating bitmaps we can start small and allocate
more if there's need for it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Rename the variable, recent review requested just use of @filter,
so be consistent throughout.
NB: Also change the virNWFilterPtr to be @nwfilter to not conflict
with the renamed variable.
Use the structure names in the @data setup - makes it easier than
going back to find the struct fields to make sure the order of the
data is correct.
Signed-off-by: John Ferlan <jferlan@redhat.com>
To be consistent with the API definition, use the @maxnames instead
of @nnames when describing/comparing against the maximum names to
be provided for the *ConnectList[Defined]Networks APIs.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the virObjectRef in virNetworkObjAssignDefLocked to after
the virHashAddEntry to make it "clearer" why the @ref is being
incremented. Upon return from the ObjNew we will have 1 ref on
the object already, adding it to the hash table requires the
increment.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation to privatize the virNetworkObj - create an accessor function
to get the current @persistent value. Also change the value to a bool rather
than an unsigned int (since that's how it's generated anyway).
Signed-off-by: John Ferlan <jferlan@redhat.com>
In order to privatize the virNetworkObj create accessors in virnetworkobj
in order to handle the get/set of the active value.
Also rather than an unsigned int, convert it to a boolean to match other
drivers representation and the reality of what it is.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation for making the object private, create a couple of API's
to get the obj->def & obj->newDef and set the obj->def.
While altering networkxml2conftest.c to use the virNetworkObjSetDef
API, fix the name of the variable from @dev to @def
Signed-off-by: John Ferlan <jferlan@redhat.com>
Change the variable name to be a bit more descriptive and less confusing
when used with the data.network.actual->class_id.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation for making the object private, create/use a couple of API's
to get/set the obj->dnsmasqPid and obj->radvdPid.
NB: Since the pid's can sometimes changed based on intervening functions,
be sure to always fetch the latest value.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since we can only ever have one reference to obj->macmap, rather
than only clearing obj->macmap during virNetworkObjUnrefMacMap
(e.g. virtual network from networkShutdownNetwork), let's just
unconditionally clear the obj->macmap to ensure that some future
change that created it's own reference to obj->macmap wouldn't
have that reference disappear if virNetworkObjDispose got called.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation for having a private virNetworkObj - let's create/move some
API's that handle the obj->macmap. The API's will be renamed to have a
virNetworkObj prefix to follow conventions and the arguments slightly
modified to accept what's necessary to complete their task.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move networkMacMgrFileName into src/util/virmacmap.c and rename to
virMacMapFileName. We're about to move some more MacMgr processing
files into virnetworkobj and it doesn't make sense to have this helper
in the driver or in virnetworkobj.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If an environment specific _tls_x509_cert_dir is provided, then
do not VIR_STRDUP the defaultTLSx509secretUUID as that would be
for the "default" environment and not the vnc, spice, chardev, or
migrate environments. If the environment needs a secret to decode
it's certificate, then it must provide the secret. If the secrets
happen to be the same, then configuration would use the same UUID
as the default (but we cannot assume that nor can we assume that
the secret would be necessary).
Rather than assuming that what's passed to virObject{Ref|Unref}
would be a virObjectPtr as long as it's not NULL, let's do the
similar checks virObjectIsClass in order to prevent a possible
increment or decrement to some field at the obj->u.s.refs offset.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The virObjectIsClass API has only ever checked object validity
based on if the @obj is not NULL and it was derived from some class.
While this has worked well in general, there is one additional
check that could be made prior to calling virClassIsDerivedFrom
which loops through the classes checking the magic number against
the klass expected magic number.
If by chance a non virObject is passed, rather than assuming the
void * @obj is a _virObject and thus offsetting to obj->klass,
obj->magic, and obj->parent, let's check that the void * @obj
has at least the "base part" of the magic number in the right
place and generate a more specific VIR_WARN message if not.
There are many consumers to virObjectIsClass, include the locking
primitives virObject{Lock|Unlock}, virObjectRWLock{Read|Write},
and virObjectRWUnlock. For those callers, the locking call will
not fail, but it also will not attempt a virMutex* call which
will "most likely" fail since the &obj->lock is used.
In order to avoid some possible future wrap on the 0xCAFExxxx
value, add a check during initialization that some new class
won't cause the wrap. Should be good for a few years at least!
It is still left up to the caller to handle the failed API calls
just as it would be if it passed a NULL opaque pointer anyobj.
If virObjectIsClass fails "internally" to virobject.c, create a
macro to generate the VIR_WARN describing what the problem is.
Also improve the checks and message a bit to indicate which was
the failure - whether the obj was NULL or just not the right class
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than overload virObjectUnlock as commit id '77f4593b' has
done, create a separate virObjectRWUnlock API that will force the
consumers to make the proper decision regarding unlocking the
RWLock's. Similar to the RWLockRead and RWLockWrite, use the
virObjectGetRWLockableObj helper. This restores the virObjectUnlock
code to using the virObjectGetLockableObj.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Introduce a helper to handle the error path more cleanly. The same
as virObjectGetLockableObj in order to essentially follow the original
logic of commit 'b545f65d' to ensure that the input argument at least
has some validity before using.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Now that virObjectRWLockWrite exists to handle the virObjectRWLockable
objects, let's restore virObjectLock to only handle virObjectLockable
class locks. There still exists the possibility that the input @anyobj
isn't a valid object and the resource isn't truly locked, but that
also exists before commit id '77f4593b'.
This also restores some logic that commit id '77f4593b' removed
with respect to a common code path that commit id '10c2bb2b' had
introduced as virObjectGetLockableObj. This code path merely does
the same checks as the original virObjectLock commit 'b545f65d',
but in callable/reusable helper to ensure the @obj at least has
some validity before using.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Instead of making virObjectLock be the entry point for two
different types of locks, let's create a virObjectRWLockWrite API
which will only handle the virObjectRWLockableClass objects.
Use the new virObjectRWLockWrite for the virdomainobjlist code
in order to handle the Add, Remove, Rename, and Load operations
that need to be very synchronous.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since the class it represents is based on virObjectRWLockableClass
and in order to make sure we differentiate just in case anyone somehow
believes they could use virObjectLockRead for a virObjectLockableClass,
let's rename the API to use the RW in the name. Besides the RW locks
refer to pthread_rwlock_{init|rdlock|wrlock|unlock|destroy} while the
other locks refer to pthread_mutex_{init|lock|unlock|destroy}.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The correct lock order is:
nwfilter driver lock (not used in this code path)
nwfilter update lock
virt driver lock (not used in this code path)
domain object lock
but the current code have this order:
domain object lock
nwfilter update lock
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This way later patches can add another structures with virResctrl
prefix without the meaning being even more confusing than it needs to
be.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
That means that returning negative values means error and non-negative
values differ in meaning, but are all successful.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
It doesn't access anything from conf/ and ti will be needed to use
from other util/ places. This split makes the separation clearer.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit 81fb440b further qualified an if statement by adding the
boolean saveVlan to the condition. Coverity pointed out that this
change in the logic eliminated the need to check saveVlan in an
argument to virAsprintf().
Commit 9a94af6d restructured virHostdevReadNetConfig() so that it
would manually set ret = 0 after successfully reading the device's
config, but Coverity pointed out that "ret = 0" was erroneously placed
outside of an "else" clause, meaning that the the value of ret set in
the "if" clause was unnecessarily and incorrectly overwritten.
This patch moves ret = 0 into the else clause, which should silence
Coverity.
When using a VF from an SRIOV-capable network card in a guest (either
in macvtap passthrough mode, or via VFIO PCI device assignment), The
associated PF netdev must be online in order for the VF to be usable
by the guest. The guest, however, is not able to change the state of
the PF. And libvirt *could* set the PF online as needed, but that
could lead to the host receiving unexpected IPv6 traffic (since the
default for an unconfigured interface is to participate in IPv6
autoconf). For this reason, before assigning a VF to a guest, libvirt
verifies that the related PF netdev is online - if it isn't, then we
log an error and don't allow the guest startup to continue.
Until now, this check was done during virNetDevSetNetConfig(). This
works nicely because the same function is called both for macvtap
passthrough and for VFIO device assignment. But in the case of VFIO,
the VF has already been unbound from its netdev driver by the time we
get to virNetDevSetNetConfig(), and in the case of dual port Mellanox
NICs that have their VFs setup in single port mode, the *only* way to
determine the proper PF netdev to query for online status is via the
"phys_port_id" file that is in the VF netdev's sysfs directory. *BUT*
if we've unbound the VF from the netdev driver, then it doesn't *have*
a netdev sysfs directory.
So, in order to check the correct PF netdev for online status, this
patch moved the check earlier in the setup, into
virNetDevSaveNetConfig(), which is called *before* unbinding the VF
from its netdev driver.
(Note that this implies that if you are using VFIO device assignment
for the VFs of a Mellanox NIC that has the VFs programmed in single
port mode, you must let the VFs be bound to their net driver and use
"managed='yes'" in the device definition. To be more specific, this is
only true if the VFs in single port mode are using port *2* of the PF
- if the VFs are using only port 1, then the correct PF netdev will be
arrived at by default/chance))
This resolves: https://bugzilla.redhat.com/267191
virHostdevRestoreNetConfig() calls virNetDevReadNetConfig() to try and
read the "original config" of a netdev, and if that fails, it tries
again with a different directory/netdev name. This achieves the
desired effect (we end up finding the config wherever it may be), but
for each failure, virNetDevReadNetConfig() places a nice error message
in the system logs. Experience has shown that false-positive error
logs like this lead to erroneous bug reports, and can often mislead
those searching for *real* bugs.
This patch changes virNetDevReadNetConfig() to explicitly check if the
file exists before calling virFileReadAll(); if it doesn't exist,
virNetDevReadNetConfig() returns a success, but leaves all the
variables holding the results as NULL. (This makes sense if you define
the purpose of the function as "read a netdev's config from its config
file *if that file exists*).
To take advantage of that change, the caller,
virHostdevRestoreNetConfig() is modified to fail immediately if
virNetDevReadNetConfig() returns an error, and otherwise to try the
different directory/netdev name if adminMAC & vlan & MAC are all NULL
after the preceding attempt.
Mellanox ConnectX-3 dual port SRIOV NICs present a bit of a challenge
when assigning one of their VFs to a guest using VFIO device
assignment.
These NICs have only a single PCI PF device, and that single PF has
two netdevs sharing the single PCI address - one for port 1 and one
for port 2. When a VF is created it can also have 2 netdevs, or it can
be setup in "single port" mode, where the VF has only a single netdev,
and that netdev is connected either to port 1 or to port 2.
When the VF is created in dual port mode, you get/set the MAC
address/vlan tag for the port 1 VF by sending a netlink message to the
PF's port1 netdev, and you get/set the MAC address/vlan tag for the
port 2 VF by sending a netlink message to the PF's port 2 netdev. (Of
course libvirt doesn't have any way to describe MAC/vlan info for 2
ports in a single hostdev interface, so that's a bit of a moot point)
When the VF is created in single port mode, you can *set* the MAC/vlan
info by sending a netlink message to *either* PF netdev - the driver
is smart enough to understand that there's only a single netdev, and
set the MAC/vlan for that netdev. When you want to *get* it, however,
the driver is more accurate - it will return 00:00:00:00:00:00 for the
MAC if you request it from the port 1 PF netdev when the VF was
configured to be single port on port 2, or if you request if from the
port 2 PF netdev when the VF was configured to be single port on port
1.
Based on this information, when *getting* the MAC/vlan info (to save
the original setting prior to assignment), we determine the correct PF
netdev by matching phys_port_id between VF and PF.
(IMPORTANT NOTE: this implies that to do PCI device assignment of the
VFs on dual port Mellanox cards using <interface type='hostdev'>
(i.e. if you want the MAC address/vlan tag to be set), not only must
the VFs be configured in single port mode, but also the VFs *must* be
bound to the host VF net driver, and libvirt must use managed='yes')
By the time libvirt is ready to set the new MAC/vlan tag, the VF has
already been unbound from the host net driver and bound to
vfio-pci. This isn't problematic though because, as stated earlier,
when a VF is created in single port mode, commands to configure it can
be sent to either the port 1 PF netdev or the port 2 PF netdev.
When it is time to restore the original MAC/vlan tag, again the VF
will *not* be bound to a host net driver, so it won't be possible to
learn from sysfs whether to use the port 1 or port 2 PF netdev for the
netlink commands. And again, it doesn't matter which netdev you
use. However, we must keep in mind that we saved the original settings
to a file called "${PF}_${VFNUM}". To solve this problem, we just
check for the existence of ${PF1}_${VFNUM} and ${PF2}_${VFNUM}, and
use whichever one we find (since we know that only one can be there)
This patch updates functions in netdev.c to pay attention to
phys_port_id. It uses the new function virNetDevGetPhysPortID() to
learn the phys_port_id of a VF or PF, then sends that info to
virPCIGetNetName(), which has newly been modified to take an optional
phys_port_id.
A single PCI device may have multiple netdevs associated with it. Each
of those netdevs will have a different phys_port_id entry in
sysfs. This patch modifies virPCIGetNetName() to allow selecting one
of the potential many netdevs in two different ways:
1) by setting the "idx" argument, the caller can select the 1st (0),
2nd (1), etc. netdev from the PCI device's net subdirectory.
2) If the physPortID arg is set (to a null-terminated string) then
virPCIGetNetName() returns the netdev that has that phys_port_id in
the sysfs file of the same name in the netdev's directory.
On Linux each network device *can* (but not necessarily *does*) have
an attribute called phys_port_id which can be read from the file of
that name in the netdev's sysfs directory. The examples I've seen have
been a many-digit hexadecimal number (as an ASCII string).
This value can be useful when a single PCI device is associated with
multiple netdevs (e.g a dual port Mellanox SR-IOV NIC - this card has
a single PCI Physical Function (PF), and that PF has two netdevs
associated with it (the "net" subdirectory of the PF in sysfs has two
links rather than the usual single link to a netdev directory). Each
of the PF netdevs has a different phys_port_id. The Virtual Functions
(VF) are similar - the PF (a PCI device) has "n" VFs (also each of
these is a PCI device), each VF has two netdevs, and each of the VF
netdevs points back to the VF PCI device (with the "device" entry in
its sysfs directory) as well as having a phys_port_id matching the PF
netdev it is associated with.
virNetDevGetPhysPortID() simply attempts to read the phys_port_id for
the given netdev and return it to the caller. If this particular
netdev driver doesn't support phys_port_id, it returns NULL (*not* a
NULL-terminated string, but a NULL pointer) but still counts it as a
success.
https://bugzilla.redhat.com/show_bug.cgi?id=1458638
This code is so complicated because we allow enabling the same
bits at many places. Just like in this case: huge pages can be
enabled by global <hugepages/> element under <memoryBacking> or
on per <memory/> basis. To complicate things a bit more, users
are allowed to omit the page size which case the default page
size is used. And this is what is causing this bug. If no page
size is specified, @pagesize is keeping value of zero throughout
whole function. Therefore we need yet another boolean to hold
[use, don't use] information as we can't sue @pagesize for that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
In virDomainNetDefParseXML() the def->coalesce is parsed and
allocated by virDomainNetDefCoalesceParseXML() but in fact it's
never freed .
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1467245
Currently, there's a bug when undefining a domain with NVRAM
store. Basically, the unlink() of the NVRAM store file happens
during the undefine procedure iff domain is inactive. So, if
domain is running and undefine is called the file is left behind.
It won't be removed in the domain cleanup process either
(qemuProcessStop). One of the solutions is to remove if
regardless of the domain state and rely on qemu having the file
opened. This still has a downside that if the domain is defined
back the NVRAM store file is going to be new, any changes to the
current one are lost (just like with any other file that is
deleted while a process has it opened). But is it really a
downside?
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Slight refactor of the WMI serialization code to minimize mixing
openwsman and libxml2 APIs that triggered clang alignment warnings.
The only usage of libxml2 APIs now is in creating CDATA blocks,
because the openwsman API does not provide that functionality. The
clang alignment warning in this case is silenced by casting to a
void pointer first.
The API docs for the various vir$OBJECTGetConnect functions
contain a warning
WARNING: When writing libvirt bindings in other languages, do
not use this function. Instead, store the connection and
the domain object together.
There is no reason why language bindings should not use this
method, and indeed the Perl, Python, and Go bindings all use
these methods.
This warning was originally added back in
commit 3edb4bc9fb
Author: Daniel Veillard <veillard@redhat.com>
Date: Tue Jul 24 15:32:55 2007 +0000
* libvirt.spec.in NEWS docs/* po/*: preparing release 0.3.1
* src/libvirt.c python/generator.py: some cleanup and warnings
from Richard W.M. Jones
IIUC, the rational was that these APIs do not need to be
directly exposed to the non-C language, as the language
can expose the same concept itself by storing the original
virConnectPtr object alongside the virDomainPtr. There's
no reason to mandate such an approach though - it is valid
for languages to expose this directly if that suits their
needs better.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We're storing the machine name in @priv but free it just in
qemuProcessStop, Therefore this may leak.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When parsing boot options from domain XML in
virDomainDefParseBootOptions() initenv id stored to:
def->os.initenv[i]->name
def->os.initenv[i]->value
But these are never freed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch adds support for automatic VNC port assignment for bhyve guests.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Most other top level objects have already had their limits increased
to 16384. Increase the storage pool, nwfilter & snapshot object
limits to match. For snapshots at least, we have seen hosts which
exceeded the current limit
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The code only currently handles writing an x86 default -cpu
argument, and doesn't know anything about other architectures.
Let's make this explicit rather than leaving ex. qemu ppc64 to
throw an error about -cpu qemu64
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Certain XML features that aren't in the <cpu> block map to -cpu
flags on the qemu cli. If one of these is specified but the user
didn't explicitly pass an XML <cpu> model, we need to format a
default model on the command line.
The current code handles this by sprinkling this default cpu handling
among all the different flag string formatting. Instead, switch it
to do this just once.
This alters some test output slightly: the previous code would
write the default -cpu in some cases when no flags were actually
added, so the output was redundant.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
My commit 0c1d863 broke formatting of passthrough smartcard devices:
<smartcard mode='passthrough' type='spicevmc'/>
resulted in invalid XML:
<smartcard mode='passthrough'>
type='spicevmc'>
<address type='ccid' controller='0' slot='0'/>
</smartcard>
Split out chardev source formatting function into two -
one formatting the attributes and other formatting the subelements.
Reported-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Disk serial schema has extra '.+' allowed characters in comparison
with check in code. Looks like there is no reason for that as qemu
allows any character AFAIK for serial. This discrepancy is originated
in commit id '85d15b51' where the ability to add serial was added.
Alter the disk-serial test to add a disk with all the possible
characters listed as the serial value.
https://bugzilla.redhat.com/show_bug.cgi?id=1458630
Introduce virQEMUDriverConfigTLSDirResetDefaults in order to check
if the defaultTLSx509certdir was changed, then change the default
for any other *TLSx509certdir that was not set to the default default.
Introduce virQEMUDriverConfigValidate to validate the existence of
any of the *_tls_x509_cert_dir values that were uncommented/set,
incuding the default.
Update the qemu.conf description for default to describe the consequences
if the default directory path does not exist.
Signed-off-by: John Ferlan <jferlan@redhat.com>
After an OOM error, virBuffer* APIs set buf->use to zero.
Adding a buffer to the parent buffer only if use is non-zero
would quietly drop data on error.
Check the error beforehand to make sure buf->use is zero
because we have not attempted to add anything to it.
Convert virDomainSmartcardDefFormat to use a separate buffer
for possible subelements, to avoid the need for duplicated
formatting logic in virDomainDeviceInfoNeedsFormat.
This function has grown to format more than just the address.
Delete the comment completely to avoid failing to update it
in the future.
Also, the indentation is now handled by the virBuffer APIs,
so the comment about indentation no longer makes sense.
This function returns false if virDomainDeviceInfoFormat
would not format anything.
Using it as the sole condition to decide whether to call
virDomainDeviceInfoFormat or not is pointless, since
the conditions are repeated in virDomainDeviceInfoFormat.
Not every platform is guaranteed to have dlopen/dlsym, so we should
conditionalize its use. Suprisingly it is actually present for Win32
via the mingw-dlfcn add on, but we should still conditionalize it.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
These functions were made exportable back in 3aa3e072 when I was
splitting network code into parsing and list management parts.
Since then the split is finished now and these two functions do
not need to be exported anymore.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This member allows us to store a pointer to some private data.
However, the comment says it's used in both domain driver and
network driver. Well, it is not. It's just one pointer and domain
driver uses it directly. Network driver has a global driver
variable. Update the comment to not confuse others.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since the introduction of shmem, there was a split of preparation code
from the formatting code from qemuBuildCommandLine() into
qemuProcessPrepareDomain(). Let's fix shmem in this regard, so that
we can slowly get to a cleaner codebase.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
So there are couple of issues here. Firstly, we never unref the
@pendingReply and thus it leaks.
==13279== 144 (72 direct, 72 indirect) bytes in 1 blocks are definitely lost in loss record 1,095 of 1,259
==13279== at 0x4C2E080: calloc (vg_replace_malloc.c:711)
==13279== by 0x781FA97: _dbus_pending_call_new_unlocked (in /usr/lib64/libdbus-1.so.3.14.11)
==13279== by 0x7812A4C: dbus_connection_send_with_reply (in /usr/lib64/libdbus-1.so.3.14.11)
==13279== by 0x56BEDF3: virNetDaemonCallInhibit (virnetdaemon.c:514)
==13279== by 0x56BEF18: virNetDaemonAddShutdownInhibition (virnetdaemon.c:536)
==13279== by 0x12473B: daemonInhibitCallback (libvirtd.c:742)
==13279== by 0x1249BD: daemonRunStateInit (libvirtd.c:823)
==13279== by 0x554FBCF: virThreadHelper (virthread.c:206)
==13279== by 0x8F913D3: start_thread (in /lib64/libpthread-2.23.so)
==13279== by 0x928DE3C: clone (in /lib64/libc-2.23.so)
Secondly, while we send the message, we are suspended ('cos we're
talking to a UNIX socket). However, until we are resumed back
again the reply might have came therefore subsequent
dbus_pending_call_set_notify() has no effect and in fact the
virNetDaemonGotInhibitReply() callback is never called. Thirdly,
the dbus_connection_send_with_reply() has really stupid policy
for return values. To cite the man page:
Returns
FALSE if no memory, TRUE otherwise.
Yes, that's right. If anything goes wrong and it's not case of
OOM then TRUE is returned, i.e. you're trying to pass FDs and
it's not supported, or you're not connected, or anything else.
Therefore, checking for return value of
dbus_connection_send_with_reply() is not enoguh. We also have to
check if @pendingReply is not NULL before proceeding any further.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We are given a string in @machinename, we never allocate it, just
merely use it for reading. We should not free it otherwise it
leads to double free:
==32191== Thread 17:
==32191== Invalid free() / delete / delete[] / realloc()
==32191== at 0x4C2D1A0: free (vg_replace_malloc.c:530)
==32191== by 0x54BBB84: virFree (viralloc.c:582)
==32191== by 0x2BC04499: qemuProcessStop (qemu_process.c:6313)
==32191== by 0x2BC500FF: processMonitorEOFEvent (qemu_driver.c:4724)
==32191== by 0x2BC502FC: qemuProcessEventHandler (qemu_driver.c:4769)
==32191== by 0x5550640: virThreadPoolWorker (virthreadpool.c:167)
==32191== by 0x554FBCF: virThreadHelper (virthread.c:206)
==32191== by 0x8F913D3: start_thread (in /lib64/libpthread-2.23.so)
==32191== by 0x928DE3C: clone (in /lib64/libc-2.23.so)
==32191== Address 0x31893d70 is 0 bytes inside a block of size 1,100 free'd
==32191== at 0x4C2D1A0: free (vg_replace_malloc.c:530)
==32191== by 0x54BBB84: virFree (viralloc.c:582)
==32191== by 0x54C1936: virCgroupValidateMachineGroup (vircgroup.c:343)
==32191== by 0x54C4B29: virCgroupNewDetectMachine (vircgroup.c:1550)
==32191== by 0x2BBDDA29: qemuConnectCgroup (qemu_cgroup.c:972)
==32191== by 0x2BC05DA7: qemuProcessReconnect (qemu_process.c:6822)
==32191== by 0x554FBCF: virThreadHelper (virthread.c:206)
==32191== by 0x8F913D3: start_thread (in /lib64/libpthread-2.23.so)
==32191== by 0x928DE3C: clone (in /lib64/libc-2.23.so)
==32191== Block was alloc'd at
==32191== at 0x4C2BE80: malloc (vg_replace_malloc.c:298)
==32191== by 0x4C2E35F: realloc (vg_replace_malloc.c:785)
==32191== by 0x54BB492: virReallocN (viralloc.c:245)
==32191== by 0x54BEDF2: virBufferGrow (virbuffer.c:150)
==32191== by 0x54BF3B9: virBufferVasprintf (virbuffer.c:408)
==32191== by 0x54BF324: virBufferAsprintf (virbuffer.c:381)
==32191== by 0x55BB271: virDomainGenerateMachineName (domain_conf.c:27078)
==32191== by 0x2BBD5B8F: qemuDomainGetMachineName (qemu_domain.c:9595)
==32191== by 0x2BBDD9B4: qemuConnectCgroup (qemu_cgroup.c:966)
==32191== by 0x2BC05DA7: qemuProcessReconnect (qemu_process.c:6822)
==32191== by 0x554FBCF: virThreadHelper (virthread.c:206)
==32191== by 0x8F913D3: start_thread (in /lib64/libpthread-2.23.so)
Moreover, make the @machinename 'const char *' to mark it
explicitly that we are not changing the passed string.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit @4cb719b2dc moved the driver locks around since these have become
unnecessary at spots where the code handles now self-lockable object
list, but missed the possible double unlock if udevEnumerateDevices
fails, because at that point the driver lock had been already dropped.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In commit 5e515b542d I've attempted to fix the inability to access
storage from the apparmor helper program by linking with the storage
driver. By linking with the .so the linker complains that it's not
portable. Fix this by loading the module dynamically as we are supposed
to do.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Driver modules proved to be reliable for a long time. Since support for
not building modules complicates the code and makefiles drop it.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
If a domain name contains a '=' and the unix socket path is
auto-generated or socket path provided by user contains '=' QEMU
is unable to properly parse the command line. In order to make it
work we need to use the new command line syntax for VNC if it's
available, otherwise we can use the old syntax.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1352529
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Remove the complex and unreliable code which inferred the node name
hierarchy only from data returned by 'query-named-block-nodes'. It turns
out that query-blockstats contain the full hierarchy of nodes as
perceived by qemu so the inference code is not necessary.
In query blockstats, the 'parent' object corresponds to the storage
behind a storage volume and 'backing' corresponds to the lower level of
backing chain. Since all have node names this data can be really easily
used to detect node names.
In addition to the code refactoring the one remaining test case needed
to be fixed along.
Reviewed-by: Eric Blake <eblake@redhat.com>
The same operation will become useful in other places so rename the
function to be more generic and move it to the top so that it can be
reused earlier in the file.
Reviewed-by: Eric Blake <eblake@redhat.com>
Allow getting the raw data from query-blockstats, so that we can use it
to detect the backing chain later on.
Reviewed-by: Eric Blake <eblake@redhat.com>
Disallow providing the wwnn/wwpn of the HBA in the adapter XML:
<adapter type='fc_host' [parent='scsi_hostN'] wwnn='HBA_wwnn'
wwpn='HBA_wwpn'/>
This should be considered a configuration error since a vHBA
would not be created. In order to use the HBA as the backing the
following XML should be used:
<adapter type='scsi_host' name='scsi_hostN'/>
So add a check prior to the checkParent call to validate that
the provided wwnn/wwpn resolves to a vHBA and not an HBA.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since commit 2e6ecba1bc, the pointer to the qemu driver is saved in
domain object's private data and hence does not have to be passed as
yet another parameter if domain object is already one of them.
This is a first (example) patch of this kind of clean up, others will
hopefully follow.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The switch contains considerable amount of changes:
virQEMUCapsRememberCached() is removed because this is now handled
by virFileCacheSave().
virQEMUCapsInitCached() is removed because this is now handled by
virFileCacheLoad().
virQEMUCapsNewForBinary() is split into two functions,
virQEMUCapsNewData() which creates new data if there is nothing
cached and virQEMUCapsLoadFile() which loads the cached data.
This is now handled by virFileCacheNewData().
virQEMUCapsCacheValidate() is removed because this is now handled by
virFileCacheValidate().
virQEMUCapsCacheFree() is removed because it's no longer required.
Add virCapsPtr into virQEMUCapsCachePriv because for each call of
virFileCacheLookup*() we need to use current virCapsPtr.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This is a preparation for following patches where we switch to
virFileCache for QEMU capabilities cache
The host arch will always remain the same but virCaps may change. Now
the host arch is stored while creating new qemu capabilities cache.
It removes the need to pass virCaps into virQEMUCapsCache*() functions.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This will store private data that will be used by following patches
when switching to virFileCache.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The new virFileCache will nicely handle the caching logic for any data
that we would like to cache. For each type of data we will just need
to implement few handlers that will take care of creating, validating,
loading and saving the cached data.
The cached data must be an instance of virObject.
Currently we cache QEMU capabilities which will start using
virFileCache.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
It's possible to have more than one unnamed virtio-serial unix channel.
We need to generate a unique name for each channel. Currently, we use
".../unknown.sock" for all of them. Better practice would be to specify
an explicit target path name; however, in the absence of that, we need
uniqueness in the names we generate internally.
Before the changes we'd get /var/lib/libvirt/qemu/channel/target/unknown.sock
for each instance of
<channel type='unix'>
<source mode='bind'/>
<target type='virtio'/>
</channel>
Now, we get vioser-00-00-01.sock, vioser-00-00-02.sock, etc.
Signed-off-by: Scott Garfinkle <seg@us.ibm.com>
It is more related to a domain as we might use it even when there is
no systemd and it does not use any dbus/systemd functions. In order
not to use code from conf/ in util/ pass machineName in cgroups code
as a parameter. That also fixes a leak of machineName in the lxc
driver and cleans up and de-duplicates some code.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This way the function can work as a central point of clean-up code and
we don't have to duplicate code. And it works similarly to the qemu
driver.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Rather than rely on virSecretObjEndAPI to make the final virObjectUnref
after the call to virSecretObjListRemove, be more explicit by calling
virObjectUnref and setting @obj to NULL for secretUndefine and in
the error path of secretDefineXML. Calling EndAPI will end up calling
Unlock on an already unlocked object which has indeteriminate results
(usually an ignored error).
The virSecretObjEndAPI will both Unref and Unlock the object; however,
the virSecretObjListRemove would have already Unlock'd the object so
calling Unlock again is incorrect. Once the virSecretObjListRemove
is called all that's left is to Unref our interest since that's the
corrollary to the virSecretObjListAdd which returned our ref interest
plus references for each hash table in which the object resides. In math
terms, after an Add there's 2 refs on the object (1 for the object and
1 for the list). After calling Remove there's just 1 ref on the object.
For the Add callers, calling EndAPI removes the ref for the object and
unlocks it, but since it's in a list the other 1 remains.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If the virSecretLoadValue fails, the code jumped to cleanup without
setting @ret = obj, thus calling virSecretObjListRemove which only
accounts for the object reference related to adding the object to
the list during virSecretObjListAdd, but does not account for the
reference to the object itself as the return of @ret would be NULL
so the caller wouldn't call virSecretObjEndAPI on the object recently
added thus reducing the refcnt to zero.
This patch will perform the ObjListRemove in the failure path of
virSecretLoadValue and Unref @obj in order to perform clean up
and return @obj as NULL. The @def will be freed as part of the
virObjectUnref.
Since the virSecretObjListAdd technically consumes @def on success,
the secretDefineXML should set @def = NULL immediately and process
the remaining calls using a new @objDef variable. We can use use
VIR_STEAL_PTR since we know the Add function just stores @def in
obj->def.
Because we steal @def into @objDef, if we jump to restore_backup:
and @backup is set, then we need to ensure the @def would be
free'd properly, so we'll steal it back from @objDef. For the other
condition this fixes a double free of @def if the code had jumped to
@backup == NULL thus calling virSecretObjListRemove without setting
@def = NULL. In this case, the subsequent call to DefFree would
succeed and free @def; however, the call to EndAPI would also
call DefFree because the Unref done would be the last one for
the @obj meaning the obj->def would be used to call DefFree,
but it's already been free'd because @def wasn't managed right
within this error path.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than assign to a local variable, let's just assign directly to the
object using the error path for cleanup.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This reverts commit 328bd24443.
As it turns out, this is not portable and very Linux & glibc
specific. Worse, this may lead to not starving writers on Linux
but everywhere else. Revert this and if the starvation occurs
resolve it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
The original name didn't hint at the fact that PHBs are
a pSeries-specific concept.
Suggested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Recent commits made it so that pci-root controllers for
pSeries guests are automatically assigned the
spapr-pci-host-bridge model name; however, that prevents
guests to migrate to older versions of libvirt which don't
know about that model name at all, which at the moment is
all of them :)
To avoid the issue, just strip the model name from PHBs
when formatting the migratable XML; guests that use more
than one PHB are not going to be migratable anyway.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1458708
If the parent provided for the storage pool adapter is not vHBA
capable, then issue a configuration error even though the provided
wwnn/wwpn were found.
It is a configuration error to provide a mismatched parent to
the wwnn/wwpn. The @parent is optional and is used as a means to
perform duplicate pool source checks.
https://bugzilla.redhat.com/show_bug.cgi?id=1472277
Commit id '106930aaa' altered the order of checking for an existing
vHBA (e.g something created via nodedev-create functionality outside
of the storage pool logic) which inadvertantly broke the code to
decide whether to alter/force the fchost->managed field to be 'yes'
because the storage pool will be managing the created vHBA in order
to ensure when the storage pool is destroyed that the vHBA is also
destroyed.
This patch moves the check (and checkParent helper) for an existing
vHBA back into the createVport in storage_backend_scsi. It also
adjusts the checkParent logic to more closely follow the intentions
prior to commit id '79ab0935'. The changes made by commit id '08c0ea16f'
are only necessary to run the virStoragePoolFCRefreshThread when
a vHBA was really created because there's a timing lag such that
the refreshPool call made after a startPool from storagePoolCreate*
wouldn't necessarily find LUNs, but the thread would. For an already
existing vHBA, using the thread is unnecessary since the vHBA already
exists and the lag to configure the LUNs wouldn't exist.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since virnodedeviceobj now has a self-lockable hash table, there's no
need to lock the table from the driver for processing. Thus remove the
locks from the driver for NodeDeviceObjList mgmt.
This includes the test driver as well.
Rather than use a forward linked list of elements, it'll be much more
efficient to use a hash table to reference the elements by unique name
and to perform hash searches.
This patch does all the heavy lifting of converting the list object to
use a self locking list that contains the hash table. Each of the FindBy
functions that do not involve finding the object by it's key (name) is
converted to use virHashSearch in order to find the specific object.
When searching for the key (name), it's possible to use virHashLookup.
For any of the list perusal functions that are required to evaluate
each object, the virHashForEach function is used.
Alter the node device deletion logic to make use of the parent field
from the obj->def rather than call virNodeDeviceObjListGetParentHost.
As it turns out the saved @def won't have parent_wwnn/wwpn or
parent_fabric_wwn, so the only logical path would be to call
virNodeDeviceObjListGetParentHostByParent which we can accomplish
directly via virNodeDeviceObjListFindByName.
There is no reason why two threads trying to look up two domains
should mutually exclude each other. Utilize new
virObjectRWLockable that was just introduced.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Up until now we only had virObjectLockable which uses mutexes for
mutually excluding each other in critical section. Well, this is
not enough. Future work will require RW locks so we might as well
have virObjectRWLockable which is introduced here.
Moreover, polymorphism is introduced to our code for the first
time. Yay! More specifically, virObjectLock will grab a write
lock, virObjectLockRead will grab a read lock then (what a
surprise right?). This has great advantage that an object can be
made derived from virObjectRWLockable in a single line and still
continue functioning properly (mutexes can be viewed as grabbing
write locks only). Then just those critical sections that can
grab a read lock need fixing. Therefore the resulting change is
going to be way smaller.
In order to avoid writer starvation, the object initializes RW
lock that prefers writers.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
We already have virRWLockInit. But this uses pthread defaults
which prefer reader to initialize the RW lock. This may lead to
writer starvation. Therefore we need to have the counterpart that
prefers writers. Now, according to the
pthread_rwlockattr_setkind_np() man page setting
PTHREAD_RWLOCK_PREFER_WRITER_NP attribute is no-op. Therefore we
need to use PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP
attribute. So much for good enum value names.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
It was observed while adding new property that there should be a space
before closing a curly brace in intel-iommu object property definition.
Fixing it as a separate patch.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Currently, @port is type of string. Well, that's overkill and
waste of memory. Port is always an integer. Use it as such.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Split out parsing of one host into a separate function and add a new
function to loop through all the host XML nodes.
This change removes multiple levels of nesting due to the old XML
parsing approach used.
So the way we format this huge virQEMUCaps enum is we group the
values in groups of five. And then at the beginning of each group
we have a small comment that says what's the number of the first
item in the group. Well, the last commit of 11b2ebf3e1 does not
follow this formatting.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Alter the virStoragePoolObjNumOfVolumes, virStoragePoolObjVolumeGetNames,
and virStoragePoolObjVolumeListExport APIs to take a virStoragePoolObjPtr
instead of the &obj->volumes and obj->def.
Signed-off-by: John Ferlan <jferlan@redhat.com>
A virStoragePoolObjPtr will be an 'obj'.
A virStoragePoolPtr will be a 'pool'.
A virStorageVolPtr will be a 'vol'.
A virStorageVolDefPtr will be a 'voldef'.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than have repetitive code - create/use a couple of helpers:
testStoragePoolObjFindActiveByName
testStoragePoolObjFindInactiveByName
This will also allow for the reduction of some cleanup path logic.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rework some of the test driver API's to remove the need to return
failure when testStoragePoolObjFindByName returns NULL rather than
going to cleanup. This removes the need for check for "if (obj)" and in
some instances the need to for a cleanup label and a local ret variable.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a path for UEFI VMs for AArch32 VMs, based on the path Debian is using.
libvirt is the de facto canonical location for defining where distros
should place these firmware images, so let's define this path here to try
and minimize distro fragmentation.
The call to qemuBuildDeviceAddressStr() happens no matter
what, so we can move it to the outer possible scope inside
the function.
We can also move the call to virBufferAsprintf() after all
the checks have been performed, where it makes more sense.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
The virDomainDeviceInfo struct is defined in device_conf,
so generic functions that operate on it should also be
defined there rather than in domain_conf.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
This patch addresses the same aspects on PPC the bug 1103314 addressed
on x86.
PCI expander bus creates multiple primary PCI busses, where each of these
busses can be assigned a specific NUMA affinity, which, on x86 is
advertised through ACPI on a per-bus basis.
For SPAPR, a PHB's NUMA affinities are assigned on a per-PHB basis, and
there is no mechanism for advertising NUMA affinities to a guest on a
per-bus basis. So, even if qemu-ppc manages to get some sort of multi-bus
topology working using PXB, there is no way to expose the affinities
of these busses to the guest. It can only be exposed on a per-PHB/per-domain
basis.
So patch enables NUMA node tag in pci-root controller on PPC.
The way to set the NUMA node is through the numa_node option of
spapr-pci-host-bridge device. However for the implicit PHB, the only way
to set the numa_node is from the -global option. The -global option applies
to all the PHBs unless explicitly specified with the option on the
respective PHB of CLI. The default PHB has the emulated devices only, so
the patch prevents setting the NUMA node for the default PHB.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The patch adds a capability for spapr-pci-host-bridge.numa_node.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Instead of going through two completely different code paths,
one of which repeats the same hardcoded bit of information
three times in rapid succession, depending on whether or not
a firmware list has been provided at configure time, just
provide a reasonable default value and remove the extra code.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
'numad' may return a nodeset which contains NUMA nodes without memory
for certain configurations. Since cgroups code will not be happy using
nodes without memory we need to store only numa nodes with memory in
autoNodeset.
On the other hand autoCpuset should contain cpus also for nodes which
do not have any memory.
A new function virNetDevOpenvswitchUpdateVlan has been created to instruct
OVS of the changes. qemuDomainChangeNet has been modified to handle the
update of the VLAN configuration for a running guest and rely on
virNetDevOpenvswitchUpdateVlan to do the actual update if needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Preparation for switching to virFileCache where there are two callbacks,
one to get a new data and second one to load a cached data.
This also removes virQEMUCapsReset which is no longer required.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
It's not required and following patches will change the code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Cleanups the code a little bit and reduces amount of arguments passed
throughout the functions.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
While searching for an element using a function it may be
desirable to know the element key for future operation.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
At present shared disks can be migrated with either readonly or cache=none. But
cache=directsync should be safe for migration, because both cache=directsync and cache=none
don't use the host page cache, and cache=direct write through qemu block layer cache.
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Reviewed-by: Wang Yechao <wang.yechao255@zte.com.cn>
Use virStorageSource accessors to check the file and call
virStorageFileAccess before even attempting to stat the target. This
will be helpful once we try to add network destinations for block copy,
since there will be no need to stat them.
When copying to a block device, the block device will already exist. To
allow users using a block device without any preparation, they need to
use the block copy without VIR_DOMAIN_BLOCK_COPY_REUSE_EXT.
This means that if the target is an existing block device we don't need
to prepare it, but we can't reject it as being existing.
To avoid breaking this feature, explicitly assume that existing block
devices will be reused even without that flag explicitly specified,
while skipping attempts to create it.
qemuMonitorDriveMirror still needs to honor the flag as specified by the
user, since qemu overwrites the metadata otherwise.
The refactor to split up storage driver into modules broke the apparmor
helper program, since that did not initialize the storage driver
properly and thus detection of the backing chain could not work.
Register the storage driver backends explicitly. Unfortunately it's now
necessary to link with the full storage driver to satisfy dependencies
of the loadable modules.
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reported-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
The purpose of this function is to tell if the current position
in given FD is in data section or a hole and how much bytes there
is remaining until the end of the section. This is achieved by
couple of lseeks(). The most important part is that we reposition
the FD back, so that the position is unchanged from the caller
POV. And until now the final lseek() back to the original
position was done with no check for errors. And I was convinced
that that's okay since nothing can go wrong. However, review
feedback from a related series persuaded me, that it's better to
be safe than sorry. Therefore, lets check if the final lseek()
succeeded and if it doesn't report an error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
All the pieces are now in place, so we can finally start
using isolation groups to achieve our initial goal, which is
separating hostdevs from emulated PCI devices while keeping
hostdevs that belong to the same host IOMMU group together.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1280542
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
These rules will make it possible for libvirt to
automatically assign PCI addresses in a way that
respects any isolation constraints devices might
have.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Isolation groups will eventually allow us to make sure certain
devices, eg. PCI hostdevs, are assigned to guest PCI buses in
a way that guarantees improved isolation, error detection and
recovery for machine types and hypervisors that support it,
eg. pSeries guest on QEMU.
This patch merely defines storage for the new information
we're going to need later on and makes sure it is passed from
the hypervisor driver (QEMU / bhyve) down to the generic PCI
address allocation code.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>