When libvirt older than 3.9.0 reconnected to a running domain started by
old libvirt it could have messed up the expansion of host-model by
adding features QEMU does not support (such as cmt). Thus whenever we
reconnect to a running domain, revert to an active snapshot, or restore
a saved domain we need to check the guest CPU model and remove the
CPU features unknown to QEMU. We can do this because we know the domain
was successfully started, which means the CPU did not contain the
features when libvirt started the domain.
https://bugzilla.redhat.com/show_bug.cgi?id=1495171
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When reconnecting to a domain started with a host-model CPU which was
started by old libvirt that did not replace host-model with the real CPU
definition, libvirt replaces the host-model CPU with the CPU from
capabilities (because this is what the old libvirt did when it started
the domain). Without this patch libvirt could use features unknown to
QEMU in the CPU definition which replaced the original host-model CPU.
Such domain would keep running just fine, but any attempt to migrate it
will fail and once the domain is saved or snapshotted, restoring it
would fail too.
In other words whenever we want to use the CPU definition from host
capabilities as a guest CPU definition, we have to filter the unknown
features.
https://bugzilla.redhat.com/show_bug.cgi?id=1495171
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When migration fails, QEMU may provide a description of the error in
the reply to query-migrate QMP command. We can fetch this error and use
it instead of the generic "unexpectedly failed" message.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Express a properly terminated backing chain by putting a
virStorageSource of type VIR_STORAGE_TYPE_NONE in the chain. The newly
used helpers simplify this greatly.
The change fixes a bug as formatting an incomplete backing chain and
parsing it back would end up in expressing a terminated chain since
src->backingStoreRaw was not populated. By relying on the terminator
object this can be now processed appropriately.
Add helpers that will simplify checking if a backing file is valid or
whether it has backing store. The helper virStorageSourceIsBacking
returns true if the given virStorageSource is a valid backing store
member. virStorageSourceHasBacking returns true if the virStorageSource
has a backing store child.
Adding these functions creates a central points for further refactors.
Storage driver uses virStorageSource only partially to store it's
configuration but fully when parsing backing files of storage volumes.
This patch sets the 'type' field to a value other than
VIR_STORAGE_TYPE_NONE so that further patches can add a terminator
element to backing chains without breaking iteration.
The backing store indexes were not bound to the storage sources in any
way. To allow us to bind a given alias to a given storage source we need
to save the index in virStorageSource. The backing store ids are now
generated when detecting the backing chain.
Since we don't re-detect the backing chain after snapshots, the
numbering needs to be fixed there.
Existing qemuParseCommandLineMem() will parse "-m 4G" format string.
This patch allows it to parse "-m size=8126464k,slots=32,maxmem=33554432k"
format along with existing format. And adds a testcase to validate the changes.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
Hyper-V uses its own specific memory management so no mapping is going to
be perfect. However, it is more correct to map Limit to max_memory (it
really is the upper limit of what the VM may potentially use) and keep
cur_balloon equal to total_memory.
The typical value returned from Hyper-V in Limit is 1 TiB, which is not
really going to work if interpreted as "startup memory" to be ballooned
away later.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
The code was vulnerable to SQL injection. Likely not a security issue due to
WMI SQL and other constraints but still lame. For example:
virsh # dominfo \"
error: failed to get domain '"'
error: internal error: SOAP fault during enumeration: code 's:Sender', subcode
'n:CannotProcessFilter', reason 'The data source could not process the filter.
The filter might be missing or it might be invalid. Change the filter and try
the request again. ', detail 'The WS-Management service cannot process the
request. The WQL query is invalid. '
This commit fixes the Hyper-V driver by escaping all WMI SQL string parameters.
The same command with the fix:
virsh # dominfo \"
error: failed to get domain '"'
error: Domain not found: No domain with name "
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
"%s is not a Hyper-V server" is not a correct generalization of all possible
error conditions of hypervEnumAndPull. For example:
$ virsh --connect hyperv://localhost/?transport=http
Enter username for localhost [administrator]:
Enter administrator's password for localhost: <enters incorrect password>
error: failed to connect to the hypervisor
error: internal error: localhost is not a Hyper-V server
This commit removes the general virReportError from hypervInitConnection and
also the "Invalid query" virReportError from hypervSerializeEprParam, which
does not correctly describe the error either (virBufferCheckError has
already set a meaningful error message at that point).
The same scenario with the fix:
$ virsh --connect hyperv://localhost/?transport=http
Enter username for localhost [administrator]:
Enter administrator's password for localhost: <enters incorrect password>
error: failed to connect to the hypervisor
error: internal error: Transport error during enumeration: User, password or
similar was not accepted (26)
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
The default_tls_x509_verify (and related) parameters in qemu.conf
control whether the QEMU TLS servers request & verify certificates
from clients. This works as a simple access control system for
servers by requiring the CA to issue certs to permitted clients.
This use of client certificates is disabled by default, since it
requires extra work to issue client certificates.
Unfortunately the code was using this configuration parameter when
setting up both TLS clients and servers in QEMU. The result was that
TLS clients for character devices and disk devices had verification
turned off, meaning they would ignore errors while validating the
server certificate.
This allows for trivial MITM attacks between client and server,
as any certificate returned by the attacker will be accepted by
the client.
This is assigned CVE-2017-1000256 / LSN-2017-0002
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Somewhere around commit 9ff9d9f reserving entire PCI slots was
eliminated, as demonstrated by commit 6cc2014.
Reserve the functions required by the implicit devices:
00:01.0 ISA Bridge
00:01.1 IDE Controller
00:01.2 USB Controller (unless USB is disabled)
00:01.3 Bridge
https://bugzilla.redhat.com/show_bug.cgi?id=1460143
Gather query-cpu-definitions results and use them for testing CPU model
usability blockers in CPUID to virCPUDef translation.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When decoding CPUID data to virCPUDef we need to be careful about using
a CPU model which cannot be directly used on the current host. Normally,
libvirt would notice the features which prevent the model from being
usable and it would disable them in the computed virCPUDef, but this
won't work in case the definition of the CPU model in QEMU contains more
features than what we have in cpu_map.xml. We need to count with the
usability blockers we got from QEMU and explicitly disable all of them
to make the computed virCPUDef usable.
https://bugzilla.redhat.com/show_bug.cgi?id=1464832
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This internal API can be used to find a specific CPU model in
virDomainCapsCPUModels list.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The "preferred" parameter is not used by any caller of cpuDecode
anymore. It's only used internally in cpu_x86 to implement cpuBaseline.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
All APIs which expect a list of CPU models supported by hypervisors were
switched from char **models and int models to just accept a pointer to
virDomainCapsCPUModels object stored in domain capabilities. This avoids
the need to transform virDomainCapsCPUModelsPtr into a NULL-terminated
list of model names and also allows the various cpu driver APIs to
access additional details (such as its usability) about each CPU model.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
query-cpu-definitions QMP command returns a list of unavailable features
which prevent CPU models from being usable on the current host. So far
we only checked whether the list was empty to mark CPU models as
(un)usable. This patch parses all unavailable features for each CPU
model and stores them in virDomainCapsCPUModel as a list of usability
blockers.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When a hypervisor marks a CPU model as unusable on the current host, it
may also give us a list of features which prevent the model from being
usable. Storing this list in virDomainCapsCPUModel will help the CPU
driver with creating a host-model CPU configuration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The API makes a deep copy of a NULL-terminated string list.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Currently, if parsing of device info fails info->alias is freed.
It doesn't make much sense to leave the rest of the struct
behind.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There's one 'return' in the middle of the function body. It's
very easy to miss and so it makes adding new code harder. Also
the function doesn't follow our style 100%.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1497396
In 0d3d020ba6 I've added capability to accept MAC addresses
for the API too. However, the implementation was faulty. It needs
to lookup the corresponding interface in the domain definition
and pass the ifname instead of MAC address.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Commit id '8708ca01c' added a check to determine whether the NIC had
Switchdev capabilities; however, in doing so inadvertently would cause
network devices without a PCI device to not be added to the node device
database. Thus, network devices having a "computer" as a parent, such
as "net_lo*", "net_virbr*", "net_tun*", "net_vnet*", etc. were not added.
Alter the check to not even check for Switchdev bits if no PCI device found.
https://bugzilla.redhat.com/show_bug.cgi?id=1497396
The other APIs accept both, ifname and MAC address. There's no
reason virDomainInterfaceStats can't do the same.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Every caller reports the error themselves. Might as well move it
into the function and thus unify it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Let's use the RWObjectLockable for the various list lock mgmt.
Only time need Write lock will be for Add/Remove logic.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The command "info migrate" of qemu outputs the dirty-pages-rate during
migration, but page size is different in different architectures. So
page size should be output to calculate dirty pages in bytes.
Page size is already implemented with commit
030ce1f8612215fcbe9d353dfeaeb2937f8e3f94 in qemu.
Now Implement the counter-part in libvirt.
Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
libvirtd throws unhandled signal 11 on ppc while running
virsh cpu-compare with missing model tag in the xml. This
patch errors out in such situation.
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Parse the -M (or -machine) command line option before starting
processing in earnest and have a fallback ready in case it's not
present, so that while parsing other options we can rely on
def->os.machine being initialized.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1379218
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
This commit fixes the deadlock introduced by commit
0980764dee. The call getgrouplist() of
the glibc library isn't safe to be called in between fork and
exec (see commit 75c125641a).
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Fixes: 0980764dee ("util: share code between virExec and virCommandExec")
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
These functions are used by an upcoming commit.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Rename virDomainNumaDefCPUFormat to virDomainNumaDefCPUFormatXML,
matching its peer virDomainNumaDefCPUParseXML and the general
vir*{Format,Parse}XML conventions.
Signed-off-by: Wim ten Have <wim.ten.have@oracle.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Generating libvirt packages per make rpm, "with-libxl=1" and "with-xen=1",
adds strict runtime dependencies per libxenlight for xen-libs package from
core libvirt-libs package. This is not necessary and unfortunate since
those dependencies set demand to "xen-libs" package even when there's no
need for libvirt xen or libxl driver components.
This patch is to have two separate xenconfig lib tool libraries: one for
core libvirt (without XL), and a another that contains xl for libxl driver
(libvirt_driver_libxl_impl.la) which when loading the driver, loads the
remaining symbols (xen{Format,Parse}XL. For the user/sysadmin, this means
the xen dependencies are moved into libxl driver, instead of core libvirt.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Wim ten Have <wim.ten.have@oracle.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
In preparation for privatizing the object, use the accessor to fetch
the obj->def instead of the direct reference.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Modify virStoragePoolObjGetAutostartLink and
virStoragePoolObjGetConfigFile to return "const char *"
since that's how both are used and to ensure no one
tries to VIR_FREE the result.
To avoid any issues later on if paths ever change (unlikely but
possible) and to match the style of other generated rules the paths
of the static rules have to be quoted as well.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
libvirt allows spaces in vm names, there were issues in the past but it
seems not removed so the assumption has to be that spaces are continuing
to be allowed.
Therefore virt-aa-helper should not reject spaces in vm names anymore if
it is going to be refused causing issues then the parser or xml schema
should do so.
Apparmor rules are in quotes, so a space in a path based on the name works.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If users only specified vendor&product (the common case) then parsing
the xml via virDomainHostdevSubsysUSBDefParseXML would only set these.
Bus and Device would much later be added when the devices are prepared
to be added.
Due to that a hot-add of a usb hostdev works as the device is prepared
and virt-aa-helper processes the new internal xml. But on an initial
guest start at the time virt-aa-helper renders the apparmor rules the
bus/device id's are not set yet:
p ctl->def->hostdevs[0]->source.subsys.u.usb
$12 = {autoAddress = false, bus = 0, device = 0, vendor = 1921, product
= 21888}
That causes rules to be wrong:
"/dev/bus/usb/000/000" rw,
The fix calls virHostdevFindUSBDevice after reading the XML from
virt-aa-helper to only add apparmor rules for devices that could be found
and now are fully known to be able to write the rule correctly.
It uncondtionally sets virHostdevFindUSBDevice mandatory attribute as
adding an apparmor rule for a device not found makes no sense no matter
what startup policy it has set.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Skip purging the backing chain and redetecting it when it was not going
to change during the time we were not present.
The decision is based on the new flag which records whether there were
blockjobs running to the status XML.
https://bugzilla.redhat.com/show_bug.cgi?id=1447169
Since domain can have at most one watchdog it simplifies things a
bit. However, since we must be able to set the watchdog action as
well, new monitor command needs to be used.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Currently we don't do it. Therefore we accept senseless
combinations of models and buses they are attached to.
Moreover, diag288 watchdog is exclusive to s390(x).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
For VMs with persistent config the config may change upon successful
completion of a job. Save it always if a persistent VM finishes a
blockjob. This will simplify further additions.
The status XML would be saved only for the copy job (in case of success)
or on failure even for other jobs. As the status contains the backing
chain data, which change after success we should always save it on
block job completion.
Checking of disk presence accesses storage on the host so it should be
done from the host setup function. Move the code to new function called
qemuProcessPrepareHostStorage and remove qemuDomainCheckDiskPresence.
Introduce a new function to prepare domain disks which will also do the
volume source to actual disk source translation.
The 'pretend' condition is not transferred to the new location since it
does not help in writing tests and also no tests abuse it.
Pass flags to the function rather than just whether we have incoming
migration. This also enforces correct startup policy for USB devices
when reverting from a snapshot.
qemuMigrationPrepareAny called multiple of the functions starting the
qemu process for incoming migration by adding the flags explicitly.
Extract them to a variable so that they can be easily used for other
calls or changed in the future.
Interestingly enough, we don't document the point of view of the
interface statistics. Therefore it's unknown to users if for
instance rx_packets is the number of packets received by domain or
received by host (from domain). Document this explicitly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Similarly to previous patch, for some types of interface domain
and host are on the same side of RX/TX barrier. In that case, we
need to set up the QoS differently. Well, swapped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1497410
The comment in virNetDevTapInterfaceStats() implementation for
Linux states that packets transmitted by domain are received by
the host and vice versa. Well, this is true but not for all types
of interfaces. For instance, for macvtaps when TAP device is
hooked right onto a physical device any packet that domain sends
looks also like a packet sent to the host. Therefore, we should
allow caller to chose if the stats returned should be straight
copy or swapped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Small wrapper to lookup interface in domain definition by its
name.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Users might have configured interface so that it's type of
network, but the corresponding network plugs interfaces into an
OVS bridge. Therefore, we have to check for the actual type of
the interface instead of the configured one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This code compiles only on Linux. Therefore the condition we
check is always true.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
qemu 2.7.0 introduces multiqueue virtio-blk(commit 2f27059).
This patch introduces a new attribute "queues". An example of
the XML:
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' queues='4'/>
The corresponding QEMU command line:
-device virtio-blk-pci,scsi=off,num-queues=4,id=virtio-disk0
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
When detaching an <interface/> from a domain, the MAC address is
parsed and if not present one is generated. If no corresponding
interface is found in the domain, the following error is
reported:
error: operation failed: no device matching mac address 52:54:00:75:32:5b found
where the MAC address is the auto generated one. This might be
very confusing. Solution to this is to ignore auto generated MAC
address when looking up the device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
It will come handy to know if the MAC address was generated (e.g.
during XML parse) or if it was parsed since provided by user in
the XML.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Found by Coverity. If virNWFilterHashTablePut, then the 3rd arg @val
must be free'd since it would be leaked.
This also fixes potential problem on the error path where the caller
could assume the virNWFilterHashTablePut was successful when in fact
it failed leading to other issues.
Rather than using loop break;'s in order to force a return
of rc = -1, let's just return -1 immediately on the various
error paths and then return 0 on the success path.
virDomainDiskSourceParse got to the point of being an ugly spaghetti
mess by adding more and more stuff into it. Split out parsing of network
disk information into a separate function so that it stays contained.
On pure success paths, virNWFilterIPAddrMapAddIPAddr was validly
consuming the input @addr; however, on failure paths it was possible
that virNWFilterVarValueCreateSimple succeed, but virNWFilterHashTablePut
failed resulting in virNWFilterVarValueFree being called to clean
up @val which also cleaned up the input @addr. Thus the caller had
no way to determine on failure whether it too should clean up the
passed parameter.
Instead, let's create a copy of the input @addr, then handle that
properly in the API allowing/forcing the caller to free it's own
copy of the input parameter.
Alter qemu command line generation in order to possibly add TLS for
a suitably configured domain.
Sample TLS args generated by libvirt -
-object tls-creds-x509,id=objvirtio-disk0_tls0,dir=/etc/pki/qemu,\
endpoint=client,verify-peer=yes \
-drive file.driver=vxhs,file.tls-creds=objvirtio-disk0_tls0,\
file.vdisk-id=eb90327c-8302-4725-9e1b-4e85ed4dc251,\
file.server.type=tcp,file.server.host=192.168.0.1,\
file.server.port=9999,format=raw,if=none,\
id=drive-virtio-disk0,cache=none \
-device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,\
id=virtio-disk0
Update the qemuxml2argvtest with a couple of examples. One for a
simple case and the other a bit more complex where multiple VxHS disks
are added where at least one uses a VxHS that doesn't require TLS
credentials and thus sets the domain disk source attribute "tls = 'no'".
Update the hotplug to be able to handle processing the tlsAlias whether
it's to add the TLS object when hotplugging a disk or to remove the TLS
object when hot unplugging a disk. The hot plug/unplug code is largely
generic, but the addition code does make the VXHS specific checks only
because it needs to grab the correct config directory and generate the
object as the command line would do.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Introduce a function to setup any TLS needs for a disk source.
If there's a configuration or other error setting up the disk source
for TLS, then cause the domain startup to fail.
For VxHS, follow the chardevTLS model where if the src->haveTLS hasn't
been configured, then take the system/global cfg->haveTLS setting for
the storage source *and* mark that we've done so via the tlsFromConfig
setting in storage source.
Next, if we are using TLS, then generate an alias into a virStorageSource
'tlsAlias' field that will be used to create the TLS object and added to
the disk object in order to link the two together for QEMU.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add an optional virTristateBool haveTLS to virStorageSource to
manage whether a storage source will be using TLS.
Sample XML for a VxHS disk:
<disk type='network' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source protocol='vxhs' name='eb90327c-8302-4725-9e1b-4e85ed4dc251' tls='yes'>
<host name='192.168.0.1' port='9999'/>
</source>
<target dev='vda' bus='virtio'/>
</disk>
Additionally add a tlsFromConfig boolean to control whether the TLS
setting was due to domain configuration or qemu.conf global setting
in order to decide whether to Format the haveTLS setting for either
a live or saved domain configuration file.
Update the qemuxml2xmltest in order to add a test to show the proper
parsing.
Also update the docs to describe the tls attribute.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a new TLS X.509 certificate type - "vxhs". This will handle the
creation of a TLS certificate capability for properly configured
VxHS network block device clients.
The following describes the behavior of TLS for VxHS block device:
(1) Two new options have been added in /etc/libvirt/qemu.conf
to control TLS behavior with VxHS block devices
"vxhs_tls" and "vxhs_tls_x509_cert_dir".
(2) Setting "vxhs_tls=1" in /etc/libvirt/qemu.conf will enable
TLS for VxHS block devices.
(3) "vxhs_tls_x509_cert_dir" can be set to the full path where the
TLS CA certificate and the client certificate and keys are saved.
If this value is missing, the "default_tls_x509_cert_dir" will be
used instead. If the environment is not configured properly the
authentication to the VxHS server will fail.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
The virNWFilterIPAddrMapAddIPAddr code can consume the @addr parameter
on success when the @ifname is found in the ipAddressMap->hashTable
hash table in the call to virNWFilterVarValueAddValue; however, if
not found in the hash table, then @addr is formatted into a @val
which is stored in the table and on return the caller would be
expected to free @addr.
Thus, the caller has no way to determine on success whether @addr was
consumed, so in order to fix this create a @tmp variable which will
be stored/consumed when virNWFilterVarValueAddValue succeeds. That way
the caller can free @addr whether the function returns success or failure.
The packet with passed FD has the following format:
--------------------------
| len | header | payload |
--------------------------
where "payload" has an additional count of FDs before the actual data:
------------------
| nfds | payload |
------------------
When the packet is received we parse the "header", which as a side
effect updates msg->bufferOffset to point to the beginning of "payload".
If the message call contains FDs, we need to also parse the count of
FDs, which also updates the msg->bufferOffset.
The issue here is that when we attempt to read the FDs data from the
socket and we receive EAGAIN we finish the reading and call poll()
to wait for the data the we need. When the data arrives we already have
the packet in our buffer so we read the "header" again but this time
we don't read the count of FDs because we already have it stored.
That means that the msg->bufferOffset is not updated to point to the
actual beginning of the payload data, but it points to the count of
FDs. After all FDs are processed we dispatch the message to process
it and decode the payload. Since the msg->bufferOffset points to wrong
data, we decode the wrong payload and the API call fails with
error messages:
Domain not found: no domain with matching uuid '67656e65-7269-6300-0c87-5003ca6941f2' ()
Broken by commit 133c511b52 which fixed a FD and memory leak.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
VM private data is cleared when the VM is turned off and also when the
VM object is being freed. Some of the clearing code was duplicated.
Extract it to a separate function.
This also removes the now unnecessary function
qemuDomainClearPrivatePaths.
Calling fallocate on the new (smaller) capacity ensures
that the whole file is allocated, but it does not reduce
the file size.
Also call ftruncate after fallocate.
https://bugzilla.redhat.com/show_bug.cgi?id=1366446
We have been trying to implement the ALLOCATE flag to mean
"the volume should be fully allocated after the resize".
Since commit b0579ed9 we do not allocate from the existing
capacity, but from the existing allocation value.
However this value is a total of all the allocated bytes,
not an offset.
For a sparsely allocated file:
$ perl -e 'print "x"x8192;' > vol1
$ fallocate -p -o 0 -l 4096 vol1
$ virsh vol-info vol1 default
Capacity: 8.00 KiB
Allocation: 4.00 KiB
Treating allocation as an offset would result in an incompletely
allocated file:
$ virsh vol-resize vol1 --pool default 16384 --allocate
Capacity: 16.00 KiB
Allocation: 12.00 KiB
Call fallocate from zero on the whole requested capacity to fully
allocate the file. After that, the volume is fully allocated
after the resize:
$ virsh vol-resize vol1 --pool default 16384 --allocate
$ virsh vol-info vol1 default
Capacity: 16.00 KiB
Allocation: 16.00 KiB
Introduce a new function virFileAllocate that will call the
non-destructive variants of safezero, essentially reverting
my commit 1390c268
safezero: fall back to writing zeroes even when resizing
back to the state as of commit 18f0316
virstoragefile: Have virStorageFileResize use safezero
This means that _ALLOCATE flag will no longer work on platforms
without the allocate syscalls, but it will not overwrite data
either.
Use an empty string to let qemu fill out the default.
This matches what's done in qemuBuildChrChardevStr.
https://bugzilla.redhat.com/show_bug.cgi?id=1454671
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This reverts commit edaf4ebe95.
This uses "reconnect" as attribute for <source> element, but we already
have a <reconnect> element for <source> element for chardev devices.
Since this is the same feature for different device it should be
presented in XML the same way.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Some values we read from the qemu monitor may be changed with the actual
state by the incoming migration. This means that we should refresh
certain things only after the migration has finished.
This is mostly visible in the cdrom tray state, which is by default
closed but may be opened by the guest OS. This would be refreshed before
qemu transferred the actual state and thus libvirt would think that the
tray is closed.
Note that this patch moves only a few obvious query commands. Others may
be moved later after individual assessment.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1463168
When the vcpu is successfully removed libvirt would remove the cgroup.
In cases when removal of the cgroup fails libvirt would report an error.
This does not make much sense, since the vcpu was removed and we can't
really do anything with the cgroup. This patch silences the errors from
cgroup removal.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1462092
It is possible (although possibly not very useful) to leave out
the service attribute when using <source mode='bind'/>
Fix the formatter bug introduced by commit 4a0da34 and format
the host when its present (checked for non-NULL inside
virBufferEscapeString) instead of basing it on the presence
of the service attribute.
https://bugzilla.redhat.com/show_bug.cgi?id=1455825
The alias recorded in disk->info.alias is the alias for the frontend
device but we are interested in the backend drive. This messed up the
disk node name extraction code as qemu reports the drive alias in the
block query commands. This was broken in the node name detector
refactoring done in commit 0175dc6ea0
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1494327
Seeing a log message saying 'flags=93' is ambiguous & confusing unless
you happen to know that libvirt always prints flags as hex. Change our
debug messages so that they always add a '0x' prefix when printing flags,
and '0' prefix when printing mode. A few other misc places gain a '0x'
prefix in error messages too.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some distros (see diff) chose to backport QMP support rather than rebase
to newer version of qemu. As a hack they added the string 'libvirt' to
the qemu -help output. Remove this as downstream-only hacks should be
carried by downstream and not litter upstream.
This effectively reverts commit ff88cd5905
Because qemuDomainDefCopy needs a string representation of a domain
definition, there's no reason for calling the lower level
qemuDomainDefFormatBuf API.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
virDomainDefFormatInternal (called by qemuDomainDefFormatXMLInternal)
already checks for buffer errors and properly resets the buffer on
failure.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
instead of only unloading it. This makes sure old profiles don't pile up
in /etc/apparmor.d/libvirt and we get updates to modified templates on
VM restart.
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
After commit 8708ca01c0 libvirtd consistently aborts with "stack
smashing detected" when nodedev driver is initialized.
This is caused by nlmsg_parse() being told that its array of nlattr*
has CTRL_CMD_MAX (10) entries, when in fact it is declared to have
CTRL_ATTR_MAX (8) entries. Since all the entries are initialized to
NULL, the result is that nlmsg_parse is overwriting 2*(sizof(nlattr*))
bytes outside the array.
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit id '5604c056' used the wrong API to generate the
<secret type='%s'..." field. The previous code used the
correct API as was done in commit id '6887af39'. The data
is actually a usage type not an auth type even though the
result is the same.
Passing a NULL value for the argument secAlias to the function
qemuDomainGetTLSObjects would cause a segmentation fault in
libvirtd.
Changed code to check before dereferencing a NULL secAlias.
Signed-off-by: Ashish Mittal <ashmit602@gmail.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1448268
When migrating to a file (e.g. when doing 'virsh save file'),
couple of things are happening in the thread that is executing
the API:
1) the domain obj is locked
2) iohelper is spawned as a separate process to handle all I/O
3) the thread waits for iohelper to finish
4) the domain obj is unlocked
Now, the problem is that while the thread waits in step 3 for
iohelper to finish this may take ages because iohelper calls
fdatasync(). And unfortunately, we are waiting the whole time
with the domain locked. So if another thread wants to jump in and
say copy the domain name ('virsh list' for instance), they are
stuck.
The solution is to unlock the domain whenever waiting for I/O and
lock it back again when it finished.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1471225
Commit id '99a2d6af2' was a bit too aggressive with determining whether
the provided path was a "physical" cd-rom in order to generate a taint
message due to the possibility of some guest and host trying to control
the tray. For cd-rom guest devices backed to some VIR_STORAGE_TYPE_FILE
storage, this wouldn't be a problem and as such it shouldn't be a problem
for guest devices using some sort of block device on the host such as
iSCSI, LVM, or a Disk pool would present.
So before issuing a taint message, let's check if the provided path of
the VIR_STORAGE_TYPE_BLOCK backed device is a "known" physical cdrom name
by comparing the beginning of the path w/ "/dev/cdrom" and "/dev/sr".
Also since it's possible the provided path could resolve to some /dev/srN
device, let's get that path as well and perform the same check.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The virSocketAddrFormat() allocates the string and it's caller
responsibility to free it afterwards.
==28857== 11 bytes in 1 blocks are definitely lost in loss record 37 of 168
==28857== at 0x4C2BEDF: malloc (vg_replace_malloc.c:299)
==28857== by 0x9A81D79: strdup (in /lib64/libc-2.23.so)
==28857== by 0x5DA3BF0: virStrdup (virstring.c:902)
==28857== by 0x5D96182: virSocketAddrFormatFull (virsocketaddr.c:427)
==28857== by 0x5D95E13: virSocketAddrFormat (virsocketaddr.c:352)
==28857== by 0x5706890: qemuBuildHostNetStr (qemu_command.c:3891)
==28857== by 0x57138D3: qemuBuildInterfaceCommandLine (qemu_command.c:8597)
==28857== by 0x5713D6A: qemuBuildNetCommandLine (qemu_command.c:8699)
==28857== by 0x57176F6: qemuBuildCommandLine (qemu_command.c:10027)
==28857== by 0x5769D61: qemuProcessCreatePretendCmd (qemu_process.c:6004)
==28857== by 0x4056EC: testCompareXMLToArgv (qemuxml2argvtest.c:502)
==28857== by 0x41DF40: virTestRun (testutils.c:180)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Since commit v2.2.0-199-g7ce711a30e libvirt stores an updated guest CPU
in domain's live definition and there's no need to update it every time
we want to format the definition. The commit itself tried to address
this in qemuDomainFormatXML, but forgot to fix qemuDomainDefFormatLive.
Not to mention that masking a previously set flag is only acceptable if
the flag was set by a public API user. Internally, libvirt should have
never set the flag in the first place.
https://bugzilla.redhat.com/show_bug.cgi?id=1485022
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When a user requested a domain XML description with
VIR_DOMAIN_XML_UPDATE_CPU flag, libvirt would use the host CPU
definition from host capabilities rather than the one which will
actually be used once the domain is started.
https://bugzilla.redhat.com/show_bug.cgi?id=1481309
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In the past we updated host-model CPUs with host CPU data by adding a
model and features, but keeping the host-model mode. And since the CPU
model is not normally formatted for host-model CPU defs, we had to pass
the updateCPU flag to the formatting code to be able to properly output
updated host-model CPUs. Libvirt doesn't do this anymore, host-model
CPUs are turned into custom mode CPUs once updated with host CPU data
and thus there's no reason for keeping the hacks inside CPU XML
formatters.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This will be used to improve the validation for this type of devices.
The former @def parameter is renamed to @dev, leaving @def for the
virDomainDef (following the style used elsewhere).
Signed-off-by: Pino Toscano <ptoscano@redhat.com>
The iohelper currently calls saferead() to get data from the
underlying file. This has a problem with O_DIRECT when hitting
end-of-file. saferead() is asked to read 1MB, but the first
read() it does may return only a few KB, so it'll try another
read() to fill the remaining buffer. Unfortunately the buffer
pointer passed into this 2nd read() is likely not aligned
to the extent that O_DIRECT requires, so rather than seeing
'0' for end-of-file, we'll get -1 + EINVAL due to misaligned
buffer.
The way the iohelper is currently written, it already handles
getting short reads, so there is actually no need to use
saferead() at all. We can simply call read() directly. The
benefit of this is that we can now write() the data immediately
so when we go into the subsequent reads() we'll always have a
correctly aligned buffer.
Technically the file position ought to be aligned for O_DIRECT
too, but this does not appear to matter when at end-of-file.
Tested-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
For vhost-user ports, Open vSwitch acts as the server and QEMU the client.
When OVS crashed or restart, QEMU shoule be reconnect to OVS.
Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This commit adds new events for two methods and operations: *PoolBuild() and
*PoolDelete(). Using the event-test and the commands set below we have the
following outputs:
$ sudo ./event-test
Registering event callbacks
myStoragePoolEventCallback EVENT: Storage pool test Defined 0
myStoragePoolEventCallback EVENT: Storage pool test Created 0
myStoragePoolEventCallback EVENT: Storage pool test Started 0
myStoragePoolEventCallback EVENT: Storage pool test Stopped 0
myStoragePoolEventCallback EVENT: Storage pool test Deleted 0
myStoragePoolEventCallback EVENT: Storage pool test Undefined 0
Another terminal:
$ sudo virsh pool-define test.xml
Pool test defined from test.xml
$ sudo virsh pool-build test
Pool test built
$ sudo virsh pool-start test
Pool test started
$ sudo virsh pool-destroy test
Pool test destroyed
$ sudo virsh pool-delete test
Pool test deleted
$ sudo virsh pool-undefine test
Pool test has been undefined
This commits can be a solution for RHBZ #1475227.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1475227
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The patch passes the reconnect timeout to QEMU by monitor on
chardev hotplug.
Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The util/vircrypto.c file uses gnutls, so we must directly link
libvirt_util.la with gnutls to avoid errors on OS which do not
resolve symbols against indirectly linked libraries.
This fixes a build failure on Ubuntu Trusty
CCLD storagevolxml2argvtest
/usr/bin/ld: ../src/.libs/libvirt_util.a(libvirt_util_la-vircrypto.o): undefined reference to symbol 'gnutls_strerror@@GNUTLS_1_4'
//usr/lib/x86_64-linux-gnu/libgnutls.so.26: error adding symbols: DSO missing from command line
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The VxHS block device will only use the newer formatting options and
avoid the legacy URI syntax.
An excerpt for a sample QEMU command line is:
-drive file.driver=vxhs,file.vdisk-id=eb90327c-8302-4725-9e1b-4e85ed4dc251,\
file.server.type=tcp,file.server.host=192.168.0.1,\
file.server.port=9999,format=raw,if=none,id=drive-virtio-disk0,cache=none \
-device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,\
id=virtio-disk0
Update qemuxml2argvtest with a simple test.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Extract out the "guts" of building a server entry into it's own
separately callable/usable function in order to allow building
a server entry for a consumer with src->nhosts == 1.
Add the backing parse and a test case to verify parsing of VxHS
backing storage.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a new virStorageNetProtocol for Veritas HyperScale (VxHS) disks
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Using the query-qmp-schema introspection - look for the 'vxhs'
blockdevOptions type.
NB: This is a "best effort" type situation as there is not a
mechanism to determine whether the running QEMU has been
built with '--enable-vxhs'. All we can do is check if the
option to use vxhs for a blockdev-add exists in the command
infrastructure which does not take that into account when
building its table of commands and options.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The mlx4 (Mellanox) netdev driver implements the sysfs phys_port_id
file for both VFs and PFs, so you can find the VF netdev plugged into
the same physical port as any given PF netdev by comparing the
contents of phys_port_id of the respective netdevs. That's what
libvirt does when attempting to find the PF netdev for a given VF
netdev (or vice versa).
Most other netdev's drivers don't implement phys_port_id, so the file
is visible in sysfs directory listing, but attempts to read it result
in ENOTSUPP. In these cases, libvirt is unable to read phys_port_id of
either the PF or the VF, so it just returns the first entry in the
PF/VF's list of netdevs.
But we've found that the i40e driver is in between those two
situations - it implements phys_port_id for PF netdevs, but doesn't
implement it for VF netdevs. So libvirt would successfully read the
phys_port_id of the PF netdev, then try to find a VF netdev with
matching phys_port_id, but would fail because phys_port_id is NULL for
all VFs. This would result in a message like the following:
Could not find network device with phys_port_id '3cfdfe9edc39'
under PCI device at /sys/class/net/ens4f1/device/virtfn0
To solve this problem in a way that won't break functionality for
anyone else, this patch saves the first netdev name we find for the
device, and returns that if we fail to find a netdev with the desired
phys_port_id.
Documentation states:
"'offset' and 'size' represent an area which must lie entirely within
the device or file." Enforce the that the buffer lies within fully.
Commit 3956af495e broke the blockPeek API since virStorageFileRead
allocates a return buffer and fills it with the data, while the API
fills a user-provided buffer. This did not get caught by the compiler
since the API prototype uses a 'void *'.
Fix it by transferring the data from the allocated buffer to the user
provided buffer.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1491217
This is particularly useful on operating systems that don't ship
Python as part of the base system (eg. FreeBSD) while still working
just as well as it did before on Linux.
While at it, make it explicit that our scripts are only going to
work with Python 2, and remove the usage of unbuffered I/O, which
as far as I can tell has no effect on the output files.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
This is particularly useful on operating systems that don't ship
Perl as part of the base system (eg. FreeBSD) while still working
just as well as it did before on Linux.
In one case (src/rpc/genprotocol.pl) the interpreter path was
missing altogether.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
I don't want to mask the real problem, but one can advocate
that we should be marking graphics ports as already in use on
qemuProcessReconnect anyway, because we already know that they
are taken.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Instead of checking for all possible constants that every
kernel header with devlink support should have (and defining
HAVE_DECL_DEVLINK as 1 if any of them is present due to the
way AC_CHECK_DECLS works), only check for DEVLINK_CMD_ESWITCH_GET.
This is the name of the constant since kernel 4.11. Between 4.8
and 4.11, the now deprecated spelling DEVLINK_CMD_ESWITCH_MODE_GET
was used.
Assume DEVLINK_ESWITCH_MODE_SWITCHDEV is available, since it was
introduced along with the deprecated spelling.
Introduce virStoragePoolObjForEachVolume to scan each volume
calling the passed callback function until all volumes have been
processed in the storage pool volume list, unless the callback
function returns an error.
Introduce virStoragePoolObjSearchVolume to search each volume
calling the passed callback function until it returns true
indicating that the desired volume was found.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create/use virStoragePoolObjAddVol in order to add volumes onto list.
Create/use virStoragePoolObjRemoveVol in order to remove volumes from list.
Create/use virStoragePoolObjGetVolumesCount to get count of volumes on list.
For the storage driver, the logic alters when the volumes.obj list grows
to after we've fetched the volobj. This is an optimization of sorts, but
also doesn't "needlessly" grow the volumes.objs list and then just decr
the count if the virGetStorageVol fails.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create/use a helper to perform object allocation.
Adjust storagevolxml2argvtest.c in order to use the allocator and
setting of the obj->def.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation for making a private object, create accessor API's for
consumer storage functions to use:
virStoragePoolObjGetDef
virStoragePoolObjSetDef
virStoragePoolObjGetNewDef
virStoragePoolObjDefUseNewDef
virStoragePoolObjGetConfigFile
virStoragePoolObjSetConfigFile
virStoragePoolObjGetAutostartLink
virStoragePoolObjIsActive
virStoragePoolObjSetActive
virStoragePoolObjIsAutostart
virStoragePoolObjSetAutostart
virStoragePoolObjGetAsyncjobs
virStoragePoolObjIncrAsyncjobs
virStoragePoolObjDecrAsyncjobs
Signed-off-by: John Ferlan <jferlan@redhat.com>
Available since QEMU 2.10.0 (specifically commit
v2.9.0-2233-g53f9a6f45f).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The features were added to QEMU by commit v2.4.0-1690-gf7fda28094 as
Skylake Server features.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Adding functionality to libvirt that will allow querying the interface
for the availability of switchdev Offloading NIC capabilities.
The switchdev mode was introduced in kernel 4.8, the iproute2-devlink
command to retrieve the switchdev NIC feature with command example:
devlink dev eswitch show pci/0000:03:00.0
This feature is needed for Openstack so we can do a scheduling decision
if the NIC is in Hardware Offload (switchdev) or regular SR-IOV (legacy) mode.
And select the appropriate hypervisors with the requested capability see [1].
[1] - https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/enable-sriov-nic-features.html
Reviewed-by: Laine Stump <laine@laine.org>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1075520
Apart from generic checks, we need to constrain netmask/prefix
length a bit. Thing is, with current implementation QEMU needs to
be able to 'assign' some IP addresses to the virtual network. For
instance, the default gateway is at x.x.x.2, dns is at x.x.x.3,
the default DHCP range is x.x.x.15-x.x.x.30. Since we don't
expose these settings yet, it's safer to require shorter prefix
to have room for the defaults.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: laine@laine.org
https://bugzilla.redhat.com/show_bug.cgi?id=1075520
Currently, all that users can specify for an interface type of
'user' is the common attributes: PCI address, NIC model (and
that's basically it). However, some need to configure other
address range than the default one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: laine@laine.org
Only feature policy is checked on s390, which was previously done in
virCPUUpdate, but that's not the correct place for the check once we
have virCPUValidateFeatures.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This new API may be used to check whether all features used in a CPU
definition are valid (e.g., libvirt knows their name, their policy is
supported, etc.). Leaving this API unimplemented in an arch subdriver
means libvirt does not restrict CPU features usable on the associated
architectures.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The host CPU definitions reported in the capabilities XML may contain
CPU features unknown to QEMU, but the result of virConnectBaselineCPU is
supposed to be directly usable as a guest CPU definition and thus it
should only contain features QEMU knows about.
https://bugzilla.redhat.com/show_bug.cgi?id=1450317
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The filter only needs to know the CPU architecture. Passing
virQEMUCapsPtr as opaque is a bit overkill.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The implementation of virConnectBaselineCPU may be different for each
hypervisor. Thus it shouldn't really be implmented in the cpu code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit id 'e02ff020cac' neglected to use the attrBuf and childBuf
in the virDomainDiskSourceFormatNetwork call.
So make the necessary alterations to allow usage.
Rather than checking during XML processing, move the check for
valid <encryption> into virDomainDiskDefParseValidate and alter
the text of the message slightly to be a bit more correct.
Rather than checking during XML processing, move the checks for correct
and valid auth into virDomainDiskDefParseValidate. This will introduce
virDomainDiskSourceDefParseAuthValidate to validate that the authdef
stored for the virStorageSource is valid. This can then be expanded
to service backingStore sources as well.
Alter the message text slightly as well to distinguish between an
unknown name and an incorrectly used name. Since type is not a
mandatory field, add the NULLSTR() around the output of the unknown
error. NB, a config using unknown formatting would fail virschematest
since it only accepts 'iscsi' and 'ceph' as "valid" types.
Some operations done to rollback disk image labelling and locking might
overwrite (or clear) the actual error. Remember the original error when
tearing down disk access so that it's not obscured.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1461301
Some cleanup paths overwrite a usefull error message with a less useful
one and we then try to preserve the original message. The handlers added
in this patch will simplify the operations since they are designed right
for the purpose.
Block job QMP commands with underscores rather than dashes were never
released in upstream qemu, (they were added, but modified in the same
release [1]), but a certain distro managed to backport the version in the
middle.
The change also slightly modified semantics for the abort command, which
made us have a lot of code which was only ever present in certain
downstream distros.
Clean the upstream code from the legacy cruft and support only the
upstream implementations.
[1] See qemu commit v1.0-2176-gdb58f9c060
Reviewed-by: Eric Blake <eblake@redhat.com>
No need to pass a @driver parameter since all that's done is deref
the @cfg especially since the only caller can just pass an already
referenced @cfg.
Also, looks like commit id '0298531b' at one time had a different
name for the API, so I took the liberty of fixing the comments too
since I would already be updating them for the @cfg variable.
For a logged in user this a path like /dev/dri/renderD128 will have
default ownership root:video which won't work for the qemu:qemu user,
so we need to chown it.
We only do this when mount namespaces are enabled in the qemu driver,
so the chown'ing doesn't interfere with other users of the shared
render node path
https://bugzilla.redhat.com/show_bug.cgi?id=1460804
The VIR_SECURITY_MANAGER_MOUNT_NAMESPACE flag informs the DAC driver
if mount namespaces are in use for the VM. Will be used for future
changes.
Wire it up in the qemu driver
https://bugzilla.redhat.com/show_bug.cgi?id=1464313
If a Disk pool was defined/created using XML that either didn't
specify a specific format or specified format type='unknown', then
restarting a pool after an initial disk backend build with overwrite
would fail after a libvirtd restart for a non-autostarted pool.
This is because the persistent pool data is not updated during pool
build w/ overwrite processing to have the VIR_STORAGE_POOL_DISK_DOS
default format.
So in addition to the alteration done during disk build processing,
alter the default expectation for disk startup to be DOS if nothing
has been defined yet. That will either succeed if the pool had been
successfully built previously using the default DOS format or fail
with a message indicating the format is something else that does not
match the expect format 'dos'.
https://bugzilla.redhat.com/show_bug.cgi?id=1477880
If the "/#" is missing from the provided iSCSI path, then we need
to provide the default LUN of /0; otherwise, QEMU will fail to parse
the URL causing a failure to either create the guest or hotplug
attach the storage.
During post parse, for any iSCSI disk or hostdev, scan the source
path looking for the presence of '/', if found, then we can assume
the LUN is provided. If not found, alter the input XML to add the
"/0". This will cause the generated XML to have the generated
value when the domain config is saved after post parse.
Commit 703abf1d7 changed the logic so that we don't attempt to re-create
the image if it's a block device. This was done by modifying the
'reuse' variable. Unfortunately after modifying it one of the uses was
to infer whether we should probe the disk format. After changes in the
commit mentioned above we would attempt the probe if the target of the
copy is a block device and the format was not provided explicitly rather
than using the format of the disk.
Fix it by explicitly checking whether the user requested a reuse of the
disk rather than the modified boolean flag.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1490826
https://bugzilla.redhat.com/show_bug.cgi?id=1447169
With this patch users can cold plug a watchdog. Things are pretty
simple because a domain can have at most one watchdog device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If there was an error when constructing the buffer, NULL is
returned. The buffer is never freed though.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When destroying a domain libvirt marks it internally with a
beingDestroyed flag to make sure the qemuDomainDestroyFlags API itself
cleans up after the domain rather than letting an uninformed EOF handler
do it. However, when the domain is being started at the moment libvirt
was asked to destroy it, only the starting thread can properly clean up
after the domain and thus it ignores the beingDestroyed flag. Once
qemuDomainDestroyFlags finally gets a job, the domain may not be running
anymore, which should not be reported as an error if the domain has been
starting up.
https://bugzilla.redhat.com/show_bug.cgi?id=1445600
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
This option requires:
<ioapic driver='qemu'/>
Report an error in case someone tries to combine
it with different ioapic setting.
Setting 'eim' on without enabling 'intremap' does not make sense.
https://bugzilla.redhat.com/show_bug.cgi?id=1457610
Add a 'cleanup' label and improve the readability of one of the
checks by making it conform to our formatting standard and moving
the corresponding comment.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
TPM 2 does not implement sysfs files for cancellation of commands.
We therefore use /dev/null for the cancel path passed to QEMU.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Tested-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Add a new CPU model called 'EPYC' to model processors from AMD EPYC
family (which includes EPYC 76xx,75xx,74xx, 73xx and 72xx).
The following features bits have been added/removed compare to Opteron_G5
Added: monitor, movbe, rdrand, mmxext, ffxsr, rdtscp, cr8legacy, osvw,
fsgsbase, bmi1, avx2, smep, bmi2, rdseed, adx, smap, clfshopt, sha
xsaveopt, xsavec, xgetbv1, arat
Removed: xop, fma4, tbm
The patch is depend on EPYC CPU model supported introduced in qemu [1]
[1] https://patchwork.kernel.org/patch/9902205/
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
In case of real migration (not migrating to file on save, dump etc)
migration info is not complete at time qemu finishes migration
in normal (non postcopy) mode. We need to update disks stats,
downtime info etc. Thus let's not expose this job status as
completed.
To archive this let's set status to 'qemu completed' after
qemu reports migration is finished. It is not visible as complete
job to clients. Cookie code on confirm phase will finally turn
job into completed. As we don't need more things to do when
migrating to file status is set to 'completed' as before
in this case.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When getting job info in case mirror does not reach ready phase
fetch mirror stats from qemu. Otherwise mirror stats are already
saved in current job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Looks like it is more simple to drop this optimization as we are
going to add getting disks stats during migration via quering qemu
process and checking if we have to acquire job condition becomes
more complicate.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Instead of checking stat.status let's set status to migrating
as soon as migrate command is send (waiting for completion
is a good place too).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Setting status to none has little value - getting job status
will not return even elapsed time.
After this patch getting job stats stays correct in a sence
it will not fetch migration stats because it consults
stats.status before doing the fetch.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Querying destination migration statistics may result in getting
a failure or getting a elapsed time value depending on stats.status
value which is odd. Instead let's always fail. Clients should
be ready to handle this as currently getting failure period
can be considerable.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuMigrationFetchJobStatus is rather inconvinient. Some of its
callers don't need status to be updated, some don't need to update
elapsed time right away. So let's update status or elapsed time
in callers instead.
This patch drops updating job status on getting job stats by
client. This way we will not provide status 'completed' while
it is not yet updated by migration routine.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This way we get stats only in one place. The former code waits for
complete/postcopy status basically and don't need to mess with stats.
The patch drops raising an error on stats updates failure. This
does not make much sense anyway.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Let's introduce QEMU_DOMAIN_JOB_STATUS_POSTCOPY state for job.current->status
instead of checking job.current->stats.status. The latter can be changed
when fetching migration statistics. Moving state function from the variable
and leave only store function seems more managable.
This patch removes all state checking usage of stats except for
qemuDomainGetJobStatsInternal. This place will be handled separately.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch simply switches code from using VIR_DOMAIN_JOB_* to
introduced QEMU_DOMAIN_JOB_STATUS_*. Later this gives us freedom
to introduce states for postcopy and mirroring phases.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1439991
Whenever a device is being updated via
virDomainUpdateDeviceFlags() API, we parse the device XML and
ideally run some generic checks to validate the configuration
(e.g. if device defines per-device boot order but the domain has
os/boot element already). Well, that's the theory - due to a
missing check we've jumped early from that check function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Neither @cfg nor (now) @driver is used in the API, so remove them
and mark @opaque as UNUSED.
NB: Commit id 'fa3c558596' dropped the unused @qemuCaps which was the
last consumer of @driver other than @cfg, but even @cfg was never used
even in the original implementation from commit id 'd987f63a'.
arm/aarch64 -M virt on KVM doesn't and will never work with standard
VGA card emulation. The recommended method is to use type=virtio, so
let's make it the default for video devices without an explicit type
set by the user.
https://bugzilla.redhat.com/show_bug.cgi?id=1404112
Signed-off-by: Cole Robinson <crobinso@redhat.com>
This allows drivers to set their own default. But if a driver neglects
to fill one in, we still error like we previously would at parse time.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Will be needed for future patches to pull the default video type
setting out of XML parsing routines.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
There were a few places in our code where the following pattern in 'if'
condition occurred:
if ((foo = bar() < 0))
do something;
This patch adjusts the conditions to the expected format:
if ((foo = bar()) < 0)
do something;
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1488192
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Suggested-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Although not previously explicitly documented, the expectation for
the libvirt event loop is that an implementation is registered early
in application startup, before calling any libvirt APIs and then
run forever after. Replacing a previously registered event loop is
not safe & subject to races even if virConnectClose has been called
on open handles, due to delayed deregistration of callbacks during
conenction close.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Funny thing. So when initializing LXC driver's capabilities,
firstly the virLXCDriverGetCapabilities() is called. This creates
new capabilities, stores them under driver->caps, ref() them and
return them. However, the return value is ignored. Secondly, the
function is called yet again and since we have driver->caps set,
they are ref()-ed again an returned. So in the end, driver's
capabilities have refcount of three when in fact they should have
refcount of one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If you use the VDDK library to access virtual machines remotely, you
really need to know the Managed Object Reference ("moref") of the VM.
This must be passed each time you connect to the API.
For example nbdkit's VDDK plugin requires a moref to be passed to
mount up a VM's disk remotely:
nbdkit vddk user=root password=+/tmp/rootpw \
server=esxi.example.com thumbprint=xx:xx:xx:... \
vm=moref=2 \
file="[datastore1] Fedora/Fedora.vmdk"
Getting the moref is a huge pain. To get some idea of what it is, why
it is needed, and how much trouble it is to get it, see:
https://blogs.vmware.com/vsphere/2012/02/uniquely-identifying-virtual-machines-in-vsphere-and-vcloud-part-1-overview.htmlhttps://blogs.vmware.com/vsphere/2012/02/uniquely-identifying-virtual-machines-in-vsphere-and-vcloud-part-2-technical.html
However the moref is available conveniently in the internals of the
libvirt VMX driver. This patch exposes it as a custom XML element
using the same "vmware:" namespace which was previously used for the
datacenterpath (see libvirt commit 636a990587).
It appears in the XML like this:
<domain type='vmware' xmlns:vmware='http://libvirt.org/schemas/domain/vmware/1.0'>
<name>Fedora</name>
...
<vmware:datacenterpath>ha-datacenter</vmware:datacenterpath>
<vmware:moref>2</vmware:moref>
</domain>
Note that the moref can appear as either a simple ID (for esx://
connections) or as a "vm-<ID>" (for vpx:// connections). It should be
treated by users as an opaque string.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1487322
In ace45e67ab I tried to fix a problem that we get the reply to
a D-Bus call while we were sleeping. In that case the callback
was never set. So I changed the code that the callback is called
directly in this case. However, I hadn't realized that since the
callback is called out of order it locks the virNetDaemon.
Exactly the very same virNetDaemon object that we are dealing
with right now and that we have locked already (in
virNetDaemonAddShutdownInhibition())
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We call qemuDomainGetMachineName on domain start. On first
start (after daemon start) pid is 0 and virSystemdGetMachineNameByPID
don't get called. But after domain shutting down pid became -1 so
on next start virSystemdGetMachineNameByPID is called and returned an error.
Error is ignored so it is not critical. But at least on my system
(systemd-219 with extra patches) systemd-machined is crashed on
this request.
This behaviour is triggered by eaf2c9f89.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1484230
When updating a virtio enabled vNIC and trying to change either
of rx_queue_size or tx_queue_size success is reported although no
operation is actually performed. Moreover, there's no way how to
change these on the fly. This is due to way we check for changes:
explicitly for each struct member. Therefore it's easy to miss
one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>