The current libvirt CPU driver for s390 does not return a host CPU model.
This patch returns 'host', in line with the other platforms that do
not decode any CPU model.
This is an intermediate bugfix prompted by a discussion on the OpenStack
mailing list. The final patch introducing CPU model support for s390x will
replace the hard-coded decode method.
Signed-off-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com>
Found this one by inspection... The API claims to "own" the input
value even in the case of error. However, on initial entry to the
API, if the value existed and was a STRING but had no str value,
the function just returned without freeing the 'value' it claims to
own. So I added the virConfFreeValue() call to resolve this.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Commit id 'a4e86390' modified the command line to allow --ipadd multiple
times; however, it did not account for the condition where a NULL is
returned, which could lead to some interesting errors with multiple
--ipadd's without parameters.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Commit id 'a4e86390' modified the command line to allow --ipadd multiple
times, which caused Coverity to notice a latent memory leak with the
'ipAddr' string not being VIR_FREE()'d.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The default value should be 16 MiB instead of 8 MiB. Only really old
versions of upstream QEMU used 8 MiB as the default for the vga framebuffer.
Without this change, if you update to a libvirt that introduced the
"vgamem" attribute for the QXL video device, the value will be set to 8 MiB,
whereas previously your guest had 16 MiB: we didn't pass any value on the
QEMU command line, so QEMU used its own default of 16 MiB.
This affects all users whose guest display resolution is higher than
1920x1080.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1179684
The way that we currently generate the <driver/> for <controller/> is
just madness:
<controller type='scsi' index='0' model='virtio-scsi'>
<driver queues='12'/>
<driver cmd_per_lun='123'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</controller>
It's obvious that we should be aiming at the following:
<controller type='scsi' index='0' model='virtio-scsi'>
<driver queues='12' cmd_per_lun='123'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</controller>
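A minimal sketch of the intended formatting rule, written in Python rather than libvirt's C. The function name and the dict-based input are hypothetical; the point is only that all set attributes end up on one <driver/> element and nothing is emitted when none are set:

```python
def format_driver(attrs):
    """Return a single <driver .../> element for all set attributes.

    `attrs` maps attribute names to integer values, or None when unset.
    Returns an empty string when no attribute is set.
    """
    set_attrs = [(name, value) for name, value in attrs.items()
                 if value is not None]
    if not set_attrs:
        return ""
    body = " ".join(f"{name}='{int(value)}'" for name, value in set_attrs)
    return f"<driver {body}/>"
```

With {"queues": 12, "cmd_per_lun": 123} this yields the merged element shown above instead of two separate <driver/> lines.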
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Remove the resize flag and use the same code path for all callers.
This flag was added by commit 18f0316 to allow virStorageFileResize
to use 'safezero' while preserving the behavior.
Explicitly return -2 when a fallback to a different method should
be done, to make the code path more obvious.
Fail immediately when ftruncate fails in the mmap method,
as we did before commit 18f0316.
I noticed this while working on a previous commit. Why should
we be calling out '../src/' when it is sufficient to refer to just
'./'? Blind copy-and-paste runs rampant in this file :)
* src/Makefile.am (INCLUDES, *_CFLAGS): Shorten to $(srcdir).
Signed-off-by: Eric Blake <eblake@redhat.com>
Well, the parallel build doesn't work because the dependencies are not
set correctly. When running 'make -j' I see this error:
make[2]: Entering directory '/home/zippy/work/libvirt/libvirt.git/src'
GEN util/virkeymaps.h
GEN locking/lock_protocol.h
make[2]: *** No rule to make target 'xenconfig/xen_xl_disk.h', needed by 'all'. Stop.
make[2]: *** Waiting for unfinished jobs....
GEN lxc/lxc_controller_dispatch.h
The fix is to correctly set dependencies by letting make know that .c
and .h are to be generated from .l. Moreover, the section is moved
closer to the other section which uses it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
VMware ESX does not always set the "serialX.fileType" tag in VMX files. The
default value for this tag is "device", and when adding a new serial port
of this type VMware will omit the fileType tag. This caused libvirt to
fail to parse the VMX file. Fixed by making this tag optional and using
"device" as a default value. Also updated vmx2xmltest to test for this
case.
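The defaulting rule can be sketched as follows, in Python for brevity. The dict-based VMX representation and the function name are assumptions made for illustration; they are not the data structures used by the VMX parser itself:

```python
def serial_file_type(vmx, port):
    """Return the fileType of serial port `port`.

    `vmx` is a dict of already-parsed VMX key/value pairs (hypothetical
    representation). A missing "serialX.fileType" entry is treated as
    "device" instead of being a parse error.
    """
    return vmx.get(f"serial{port}.fileType", "device")
```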
Signed-off-by: Eric Blake <eblake@redhat.com>
Ever since commit 2c78051 split out a helper library for the sake of
changing CFLAGS, a VPATH build with xenconfig enabled has failed:
CC xenconfig/libvirt_xenxldiskparser_la-xen_xl_disk.lo
../../src/xenconfig/xen_xl_disk.l:37:21: fatal error: xen_xl.h: No such file or directory
# include "xen_xl.h"
^
compilation terminated.
Makefile:9462: recipe for target 'xenconfig/libvirt_xenxldiskparser_la-xen_xl_disk.lo' failed
The solution is to tell the build to look for xen_xl.h relative
to $(srcdir), since we keep that file under version control.
[Not fixed here - the raw use of -Wno-unused-parameter in CFLAGS
is NOT portable; ideally, we should be doing a configure test
and only supplying that argument when we know the compiler supports
-Wunused-parameter; but that's a patch for another day]
[Not fixed here - there are still issues with parallel builds hitting
a race between generating the files and trying to compile/distribute
them]
* src/Makefile.am (libvirt_xenxldiskparser_la_CFLAGS): Add another
include directory.
Signed-off-by: Eric Blake <eblake@redhat.com>
In one of my previous commits (311b4a67) I tried to allow passing
regular system pages to <hugepages>. However, there was a
little bug that wasn't caught. If a domain has a guest NUMA topology
defined, the qemuBuildNumaArgStr() function takes care of generating
the corresponding command line. The hugepages backing for guest NUMA
nodes is handled there too. And here comes the bug: the hugepages
setting from the XML is stored in KiB internally, whereas the system
page size was queried and stored in bytes. So the check whether
these two are equal failed even when it shouldn't have.
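To illustrate the unit mismatch, here is a small Python sketch. The function name is hypothetical; the only thing it demonstrates is that the byte value has to be converted to KiB before the two sizes are compared:

```python
import mmap


def is_system_page_size(hugepage_kib, system_page_bytes=mmap.PAGESIZE):
    """Compare a page size given in KiB with the system page size in bytes.

    Comparing the raw numbers (e.g. 4 == 4096) never matches, so the
    byte value is converted to KiB first.
    """
    return hugepage_kib == system_page_bytes // 1024
```

For a 4 KiB system page, is_system_page_size(4, 4096) is true, while the unconverted comparison 4 == 4096 is not.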
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Make use of the ebtables functionality to be able to filter certain
parameters of icmpv6 packets. Extend the XML parser for icmpv6 types,
type ranges, codes, and code ranges. Extend the nwfilter documentation,
schema, and test cases.
Being able to filter icmpv6 types and codes helps extend the DHCP
snooper for IPv6 and filtering at least some parameters of IPv6's NDP
(Neighbor Discovery Protocol) packets. However, the filtering will not
be as good as the filtering of ARP packets since we cannot
check on IP addresses in the payload of the NDP packets.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
In commit 540c339a25 the whole domain
reference counting was refactored in the qemu driver. Domain jobs now
don't need to reference the domain object as they now expect the
reference from the calling function.
However, the patch forgot to remove the unref call in case we exit the
monitor when we were acquiring a nested job. This caused the daemon to
crash on a subsequent access to the domain object once we've done an
operation requiring a nested job for a monitor access.
An easy reproducer case:
1) Start a vm with qcow disks
2) virsh snapshot-create-as DOMNAME
3) virsh dumpxml DOMNAME
4) daemon crashes in a semi-random spot while accessing a now-removed VM
object.
Fortunately, the commit wasn't released yet, so there are no security
implications.
Reported-by: Shanzi Yu <shyu@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1177194
When migrating a VM, we generate XML via qemuDomainDefFormatLive and
pass it to the target libvirtd. Libvirt uses the current network
state in def->data.network.actual to generate the XML. This makes
migration fail when a guest interface of type network uses a macvtap
network as its source and the VM is migrated to another host that has
different macvtap network settings (different interface name, bridge name...).
Add a flag check in virDomainNetDefFormat: if the VIR_DOMAIN_XML_MIGRATABLE
flag is set when calling virDomainNetDefFormat, the current VM interface
state is not included.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Add missing VNC setup via Parallels SDK.
Parallels Cloud Server starts one VNC server per domain,
so we can process only one VNC server definition.
Network-based listening is currently unimplemented.
Signed-off-by: Alexander Burluka <aburluka@parallels.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1177723
When setting new bandwidth limits via
virDomainSetInterfaceParameters, the old ones are cleared first.
However, if setting the new ones fails, the old ones are already gone
and the interface is left in an inconsistent state. Therefore, right
before failing we ought to try to restore the old bandwidth.
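The restore-on-failure idea can be sketched like this, in Python. The `clear` and `apply` callbacks and the dict-based interface are hypothetical stand-ins, not libvirt functions; the sketch only shows the ordering of clear, apply, and best-effort rollback:

```python
def replace_bandwidth(iface, new_bw, clear, apply):
    """Replace the bandwidth of `iface`, restoring the old one on failure.

    `clear(iface)` removes the current limits and `apply(iface, bw)` sets
    new ones, raising an exception on failure (both are hypothetical
    callbacks supplied by the caller).
    """
    old_bw = iface.get("bandwidth")
    clear(iface)
    try:
        apply(iface, new_bw)
    except Exception:
        # Best effort: try to put the old limits back before reporting
        # the original failure to the caller.
        if old_bw is not None:
            try:
                apply(iface, old_bw)
            except Exception:
                pass
        raise
    iface["bandwidth"] = new_bw
```

The original error is re-raised either way, so the caller still sees why the new limits could not be set.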
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Lack of a lease (whether mac is given or not) is a normal expected
scenario, since we are already filling in rv with nleases (which is
okay as 0 if there is no lease). There is no need to raise an error.
This fixes:
> virsh # net-dhcp-leases --mac 00:50:56:c0:00:01 default
> error: Failed to get leases info for default
> error: internal error: no lease with matching MAC address: 00:50:56:c0:00:01
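A minimal Python sketch of the intended behavior, assuming leases are plain dicts with a "mac" key (a hypothetical representation): an empty match is a normal result with a count of 0, not an error.

```python
def count_leases(leases, mac=None):
    """Return the number of leases, optionally filtered by MAC address.

    No matching lease is an expected outcome, so 0 is returned rather
    than raising an error.
    """
    matching = [lease for lease in leases
                if mac is None or lease["mac"].lower() == mac.lower()]
    return len(matching)
```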
Signed-off-by: Nehal J Wani <nehaljw.kkd1@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1178652
We get a warning when we have a guest in paused
state (caused by a kernel panic) and restart libvirtd.
The warning message looks like this:
Qemu reported unknown VM status: 'guest-panicked'
This appears to be because we set a wrong status name in
qemu_monitor.c; from the qemu qapi-schema.json file
we know this status should be named 'guest-panicked'.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 4dc04d3a added virNetlinkGetErrorCode, but forgot to provide
a fallback, which kills the build on mingw (among others):
CCLD libvirt.la
Cannot export virNetlinkGetErrorCode: symbol not defined
collect2: error: ld returned 1 exit status
* src/util/virnetlink.c (virNetlinkGetErrorCode): Provide fallback.
Signed-off-by: Eric Blake <eblake@redhat.com>
The vzctl man page says that --ipadd can be provided multiple times to add
several IP addresses. Loop over the configured IP addresses and add
one --ipadd for each. This also handles the multiple IPs handled
by openvz_conf.c.
Add the possibility to have more than one IP address configured for a
domain network interface. IP addresses can also have a prefix to define
the corresponding netmask.
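As an illustration of how a prefix defines the netmask, here is a short Python sketch using the standard ipaddress module. The function name and the sample address are assumptions for the example only:

```python
import ipaddress


def netmask_for(address_with_prefix):
    """Return the netmask implied by an 'address/prefix' string."""
    return str(ipaddress.ip_interface(address_with_prefix).netmask)
```

For example, "192.168.122.10/24" corresponds to the netmask 255.255.255.0.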
Add a default implementation of virNetDevSetIPv4Address using netlink
and libnl. This avoids requiring /usr/sbin/ip or /usr/sbin/ifconfig
external binaries.
The typical case for the problem is starting a domain needing a network
that isn't started. Even after starting the network, we get an unknown error
when starting the container.
This is due to dynamic security label not being removed.
According to the xm.config manual, the HVM pae|apic|acpi features default
to 1 (enabled). But when converting from xm config to libvirt xml,
if the xm config doesn't contain pae|apic|acpi, the default value is set
to 0, which causes some problems in HVM guests.
Update the parser code to set the HVM pae|apic|acpi default value to 1
to match the xm config convention.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Now that xenconfig supports parsing and formatting Xen's
XL config format, integrate it into the libxl driver's
connectDomainXML{From,To}Native functions.
Signed-off-by: Kiarie Kahurani <davidkiarie4@gmail.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Introduce a Xen xl parser
This parser allows users to convert the new xl disk format and
spice graphics config to libvirt xml format and vice versa. Regarding
the spice graphics config, the code is pretty much straightforward.
For disk {formatting, parsing}, this parser takes care of the new
xl format, which includes positional parameters and key/value parameters.
In xl format disk config a <diskspec> consists of parameters separated by
commas. If the parameters do not contain an '=' they are automatically
assigned to certain options following the order below
target, format, vdev, access
The above are the only mandatory parameters in the <diskspec> but there
are many more disk config options. These options can be specified as
key=value pairs. This takes care of the rest of the options such as
devtype, backend, backendtype, script, direct-io-safe,
The positional parameters can also be specified in key/value form;
for example
/dev/vg/guest-volume,,hda
/dev/vg/guest-volume,raw,hda,rw
format=raw, vdev=hda, access=rw, target=/dev/vg/guest-volume
are interpreted as one config.
In xm format, the above diskspec would be written as
phy:/dev/vg/guest-volume,hda,w
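The positional and key/value rules described above can be sketched in Python as follows. This is a simplified illustration, not the flex scanner used by the commit: it assumes that an omitted format means raw and an omitted access means rw, and it ignores details such as bare flags like direct-io-safe.

```python
# Order in which parameters without '=' are assigned.
POSITIONAL = ("target", "format", "vdev", "access")
# Assumed defaults for omitted parameters (illustration only).
DEFAULTS = {"format": "raw", "access": "rw"}


def parse_diskspec(spec):
    """Parse a simplified xl <diskspec> into a dict of options."""
    disk = {}
    position = 0
    for raw in spec.split(","):
        param = raw.strip()
        if "=" in param:
            key, value = param.split("=", 1)
            disk[key.strip()] = value.strip()
        else:
            # An empty positional slot is skipped but still counted.
            if param and position < len(POSITIONAL):
                disk[POSITIONAL[position]] = param
            position += 1
    for key, value in DEFAULTS.items():
        disk.setdefault(key, value)
    return disk
```

Under these assumptions, the three example specs listed above all parse to the same dictionary.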
The disk parser is based on the same parser used successfully by
the Xen project for several years now. Ian Jackson authored the
scanner, which is used by this commit with minimal changes. Only
the PREFIX option is changed, to produce function and file names
more consistent with libvirt's convention.
Signed-off-by: Kiarie Kahurani <davidkiarie4@gmail.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Export helper functions for reuse in getting values
from a virConfPtr object
Signed-off-by: Kiarie Kahurani <davidkiarie4@gmail.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
The <domain/> element under /capabilities/guest/arch/ can have no
child elements. If that's the case we format:
<domain type='xen'>
</domain>
instead of the simpler:
<domain type='xen'/>
This commit fixes that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
VIR_STORAGE_FILE_AUTO should be used only in xml provided to
libvirt by user, if I understood correctly. Driver should
set storage source format to specific disk format in
*DomainGetXMLDesc.
CDROMs in PCS use raw image format.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Some of the nwfilter tests are now failing since --concurrent shows
up in the ebtables command. To avoid this, implement a function
preventing the probing for lock support in the eb/iptables tools
and use it in the tests.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
There is one problem that causes various errors in the daemon. When
a domain is waiting for a job, it is unlocked while waiting on the
condition. However, if that domain is for example transient and being
removed in another API (e.g. cancelling incoming migration), it gets
unref'd. If the first call, that was waiting, fails to get the job, it
unref's the domain object, and because it was the last reference, it
causes clearing of the whole domain object. However, when finishing the
call, the domain must be unlocked, but there is no way for the API to
know whether it was cleaned or not (unless there is some ugly temporary
variable, but let's scratch that).
The root cause is that our APIs don't ref the objects they are using and
all use the implicit reference that the object has when it is in the
domain list. That reference can be removed when the API is waiting for
a job. And because each domain doesn't do its ref'ing, it results in
the ugly checking of the return value of virObjectUnref() that we have
everywhere.
This patch changes qemuDomObjFromDomain() to ref the domain (using
virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
should be the only function in which the return value of
virObjectUnref() is checked. This makes all reference counting
deterministic and makes the code a bit clearer.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit 1a80b97d, which added the virCgroupHasEmptyTasks() function,
forgot that the parameter @cgroup may be NULL and did not check that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Although QMP returns info about vCPU threads in TCG mode, the
data it returns is mostly lies. Only the first vCPU has a valid
thread_id returned. The thread_id given for the other vCPUs is
in fact the main emulator thread. All vCPUs actually run under
the same thread in TCG mode.
Our vCPU pinning code is not at all able to cope with this,
so if you try to set CPU affinity per-vCPU you end up with
weird errors
error: Failed to start domain instance-00000007
error: cannot set CPU affinity on process 24365: Invalid argument
Since few people will care about the performance of TCG with
strict CPU pinning, let's just disable that for now, so we get
a clear error message
error: Failed to start domain instance-00000007
error: Requested operation is not valid: cpu affinity is not supported
The code assumes that def->vcpus == nvcpupids, so when we setup
fake CPU pids for old QEMU with nvcpupids == 1, we cause the
later code to read off the end of the array. This has fun results
like sched_setaffinity(0, ...) which changes libvirtd's own CPU
affinity, or even better sched_setaffinity($RANDOM, ...) which
changes the affinity of a random OS process.
Libvirt BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1175397
QEMU BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1170093
In qemu there are two interesting arguments:
1) -numa to create a guest NUMA node
2) -object memory-backend-{ram,file} to tell qemu which memory
region on which host's NUMA node it should allocate the guest
memory from.
Combining these two together we can instruct qemu to create a
guest NUMA node that is tied to a host NUMA node. And it works
just fine. However, depending on the machine type used, there might
be some issues during migration when OVMF is enabled (see QEMU
BZ). While this truly is a QEMU bug, we can help avoid it. The
problem lies somewhere within the memory backend objects. Having
said that, the fix on our side consists of putting those objects on
the command line if and only if needed. For instance, while
previously we would construct this (in all ways correct) command
line:
-object memory-backend-ram,size=256M,id=ram-node0 \
-numa node,nodeid=0,cpus=0,memdev=ram-node0
now we create just:
-numa node,nodeid=0,cpus=0,mem=256
because the backend object is obviously not tied to any specific
host NUMA node.
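A rough Python sketch of the "only if needed" rule. The function and its parameters are hypothetical, and the sketch assumes the only reason to need a backend object is a binding to host NUMA nodes, expressed here with the memory-backend host-nodes and policy properties:

```python
def build_numa_args(node_id, cpus, mem_mib, host_nodes=None):
    """Build QEMU arguments for one guest NUMA node.

    A memory backend object is emitted only when the node is tied to
    host NUMA nodes; otherwise the plain mem= form is used.
    """
    if host_nodes is None:
        return ["-numa", f"node,nodeid={node_id},cpus={cpus},mem={mem_mib}"]
    backend_id = f"ram-node{node_id}"
    return [
        "-object",
        f"memory-backend-ram,size={mem_mib}M,id={backend_id},"
        f"host-nodes={host_nodes},policy=bind",
        "-numa",
        f"node,nodeid={node_id},cpus={cpus},memdev={backend_id}",
    ]
```

Calling build_numa_args(0, "0", 256) produces the short form shown above, with no -object argument at all.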
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On a system with 160 CPUs the /proc/cpuinfo size grows beyond
the currently set limit of 10KB causing an internal error.
This patch increases the buffer size to 1MB.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Coverity flagged commit 0282ca45 as introducing a memory leak;
in all my refactoring to make capacity probing conditional on
whether the image is non-raw, I missed deleting the unconditional
probe.
* src/qemu/qemu_driver.c (qemuStorageLimitsRefresh): Drop
redundant assignment.
Signed-off-by: Eric Blake <eblake@redhat.com>
A recent lvm change has resulted in a change to the "default" type of
logical volume created when "--virtualsize" or "-V" is supplied on
the command line (e.g. when the allocation and capacity values of a
to-be-created volume differ). It seems that at the very least the following
change adjusts the default type:
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=e0164f21
and the following may also have some impact.
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=87fc3b71
When using the virsh vol-create-as or vol-create xmlfile commands, the
result is that libvirt will now create a "thin logical volume" and a
"thin logical volume pool" rather than just a "thin snapshot logical
volume". For example the following sequence:
# lvcreate --name test -L 2M -V 5M lvm_test
Rounding up size to full physical extent 4.00 MiB
Rounding up size to full physical extent 8.00 MiB
Logical volume "test" created.
# lvs lvm_test
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
lvol1 lvm_test twi-a-tz-- 4.00m 0.00 0.98
test lvm_test Vwi-a-tz-- 8.00m lvol1 0.00
compared to the former code which had the following:
LV VG Attr LSize Pool Origin Data% Move Log Cpy%Sync Convert
test LVM_Test swi-a-s--- 4.00m [test_vorigin] 0.00
Since libvirt doesn't know how to parse the thin logical volume
and pool, it will fail to find the newly created volume and pool
even though it exists in the volume group.
It cannot find the volume because the command used to find/parse returns a
thin volume 'test' with no associated device; for example, the output is:
lvol1##UgUwkp-fTFP-C0rc-ufue-xrYh-dkPr-FGPFPx#lvol1_tdata(0)#thin-pool#1#4194304#4194304#4194304#twi-a-tz--
test##NcaIoH-4YWJ-QKu3-sJc3-EOcS-goff-cThLIL##thin#0#8388608#4194304#8388608#Vwi-a-tz--
as compared to the former which had the following:
test#[test_vorigin]#Dt5Of3-4WE6-buvw-CWJ4-XOiz-ywOU-YULYw6#/dev/sda3(1300)#linear#1#4194304#4194304#4194304#swi-a-s---
While it's possible to write code to handle the new thin lv and pool, this
patch adds "--type snapshot" to the lvcreate command libvirt uses
so that, for now, it can continue to use thin snapshots.
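A sketch of how such a command line could be assembled, in Python. The function name and its KiB-based parameters are assumptions for illustration; the exact argument order used by libvirt may differ:

```python
def lvcreate_argv(vg, name, allocation_kib, capacity_kib):
    """Build an lvcreate argument list for a new logical volume.

    When allocation and capacity differ, a virtual size is requested and
    the type is pinned to "snapshot" so that newer lvm versions do not
    pick a thin volume and thin pool instead.
    """
    argv = ["lvcreate", "--name", name, "-L", f"{allocation_kib}K"]
    if capacity_kib != allocation_kib:
        argv += ["--type", "snapshot", "--virtualsize", f"{capacity_kib}K"]
    argv.append(vg)
    return argv
```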
https://bugzilla.redhat.com/show_bug.cgi?id=1174569
There's nothing we need to do for shared iSCSI devices in
qemuAddSharedHostdev and qemuRemoveSharedHostdev. The iSCSI layer
takes care of that for us.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Wire up backing chain recursion. For the first time, it is now
possible to get libvirt to expose that qemu tracks read statistics
on backing files, as well as report maximum extent written on a
backing file during a block-commit operation.
For a running domain, where one of the two images has a backing
file, I see the traditional output:
$ virsh domstats --block testvm2
Domain: 'testvm2'
block.count=2
block.0.name=vda
block.0.path=/tmp/wrapper.qcow2
block.0.rd.reqs=1
block.0.rd.bytes=512
block.0.rd.times=28858
block.0.wr.reqs=0
block.0.wr.bytes=0
block.0.wr.times=0
block.0.fl.reqs=0
block.0.fl.times=0
block.0.allocation=0
block.0.capacity=1310720000
block.0.physical=200704
block.1.name=vdb
block.1.path=/dev/sda7
block.1.rd.reqs=0
block.1.rd.bytes=0
block.1.rd.times=0
block.1.wr.reqs=0
block.1.wr.bytes=0
block.1.wr.times=0
block.1.fl.reqs=0
block.1.fl.times=0
block.1.allocation=0
block.1.capacity=1310720000
vs. the new output:
$ virsh domstats --block --backing testvm2
Domain: 'testvm2'
block.count=3
block.0.name=vda
block.0.path=/tmp/wrapper.qcow2
block.0.rd.reqs=1
block.0.rd.bytes=512
block.0.rd.times=28858
block.0.wr.reqs=0
block.0.wr.bytes=0
block.0.wr.times=0
block.0.fl.reqs=0
block.0.fl.times=0
block.0.allocation=0
block.0.capacity=1310720000
block.0.physical=200704
block.1.name=vda
block.1.path=/dev/sda6
block.1.backingIndex=1
block.1.rd.reqs=0
block.1.rd.bytes=0
block.1.rd.times=0
block.1.wr.reqs=0
block.1.wr.bytes=0
block.1.wr.times=0
block.1.fl.reqs=0
block.1.fl.times=0
block.1.allocation=327680
block.1.capacity=786432000
block.2.name=vdb
block.2.path=/dev/sda7
block.2.rd.reqs=0
block.2.rd.bytes=0
block.2.rd.times=0
block.2.wr.reqs=0
block.2.wr.bytes=0
block.2.wr.times=0
block.2.fl.reqs=0
block.2.fl.times=0
block.2.allocation=0
block.2.capacity=1310720000
I may later do a patch that trims the output to avoid 0 stats,
particularly for backing files (which are more likely to have
0 stats, at least for write statistics when no block-commit
is performed). Also, I still plan to expose physical size
information (qemu doesn't expose it yet, so it requires a stat,
and for block devices, a further open/seek operation). But
this patch is good enough without worrying about that yet.
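For clients consuming this output, a minimal Python sketch of grouping the block.N.* lines per array entry might look like this. The function is hypothetical and only handles the textual format shown above; entries that belong to a backing file can then be recognized by the presence of backingIndex:

```python
from collections import defaultdict


def parse_block_stats(output):
    """Group 'block.<N>.<field>=<value>' lines into one dict per entry."""
    blocks = defaultdict(dict)
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("block.") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        parts = key.split(".", 2)
        # "block.count" has no index and is skipped here.
        if len(parts) == 3 and parts[1].isdigit():
            blocks[int(parts[1])][parts[2]] = value
    return [blocks[index] for index in sorted(blocks)]
```

Entries sharing the same name (vda above) form one chain; the one without backingIndex is the active image.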
* src/qemu/qemu_driver.c (QEMU_DOMAIN_STATS_BACKING): New internal
enum bit.
(qemuConnectGetAllDomainStats): Recognize new user flag, and pass
details to...
(qemuDomainGetStatsBlock): ...here, where we can do longer recursion.
(qemuDomainGetStatsOneBlock): Output new field.
Signed-off-by: Eric Blake <eblake@redhat.com>
In order to report stats on backing chains, we need to separate
the output of stats for one block from how we traverse blocks.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Split...
(qemuDomainGetStatsOneBlock): ...into new helper.
Signed-off-by: Eric Blake <eblake@redhat.com>
This patch introduces access to allocation information about
a backing chain of a live domain. While querying storage
volumes for read-only disks could provide some of the details,
we do NOT want to read() a file while qemu is writing it.
Also, there is one case where we have to rely on qemu: when
doing a block commit into a backing file, where that file is
stored in qcow2 format on a host block device, we want to know
the current highest write offset into that image, in order to
know if the disk must be resized larger. qemu-img does not
(currently) show this information, and none of the earlier
block APIs were extensible enough to expose it. But
virDomainListGetStats is perfect for the job!
We don't need a new group of statistics, as the existing block
group is sufficient. On the other hand, as existing libvirt
releases already report 1:1 mapping of block.count to <disk>
devices, changing the array size could confuse older clients;
and even with newer clients, the time and memory taken to
report additional statistics is not always necessary (backing
files are generally read-only except for block-commit, so while
read statistics may change, sizing statistics will not). So
the choice here is to add a new flag that only newer callers
will pass, when they are prepared for the additional information.
This patch introduces the new API, but it will take more
patches to get it implemented for qemu.
* include/libvirt/libvirt-domain.h
(VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING): New flag.
* src/libvirt-domain.c (virConnectGetAllDomainStats): Document it,
and add a new field when it is in use.
* tools/virsh-domain-monitor.c (cmdDomstats): Use new flag.
* tools/virsh.pod (domstats): Document it.
Signed-off-by: Eric Blake <eblake@redhat.com>
A coming patch will make it optionally possible to list backing
chain block stats; in this mode of operation, block.count is no
longer the number of <disk> devices in the domain, but the number of
blocks in the array being reported. We still want block.count
listed first, but rather than iterate the tree twice (once to
count, and once to list stats), it's easier to just touch things
up after the fact.
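The "touch up after the fact" approach can be sketched in Python as follows. The list-of-tuples output and the disk dicts with a "chain" list are hypothetical stand-ins for the typed parameter array: a placeholder is written first, entries are appended in a single pass, and the count is patched in at the end.

```python
def collect_block_stats(disks):
    """Emit block stats with block.count first, computed in one pass."""
    params = [("block.count", 0)]  # placeholder, fixed up below
    count = 0
    for disk in disks:
        for image in disk["chain"]:
            params.append((f"block.{count}.name", disk["name"]))
            params.append((f"block.{count}.path", image))
            count += 1
    params[0] = ("block.count", count)
    return params
```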
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Compute count
after the fact.
Signed-off-by: Eric Blake <eblake@redhat.com>
The prior refactoring can now be put to use. With the same domain
as the earlier commit 7b49926 (one qcow2 disk and an empty
cdrom drive):
$ virsh domstats --block foo
Domain: 'foo'
block.count=2
block.0.name=hda
block.0.path=/var/lib/libvirt/images/foo.qcow2
block.0.allocation=1309614080
block.0.capacity=42949672960
block.0.physical=1309671424
block.1.name=hdc
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Use
qemuStorageLimitsRefresh to report offline statistics.
Signed-off-by: Eric Blake <eblake@redhat.com>
Create a helper function that can be reused for gathering block
info from virDomainListGetStats.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Split guts...
(qemuStorageLimitsRefresh): ...into new helper function.
Signed-off-by: Eric Blake <eblake@redhat.com>
The documentation for virDomainBlockInfo was confusing: it stated
that 'physical' was the size of the container, then gave an example
of it being the amount of storage used by a sparse file (that is,
for a sparse raw image on a regular file, the wording implied
capacity==physical, while allocation was smaller; but the example
instead claimed physical==allocation). Since we use 'physical' for
the last offset of a block device, we should do likewise for
regular files.
Furthermore, the example claimed that for a qcow2 regular file,
allocation==physical. At the time the code was first written,
this was true (qcow2 files were allocated sequentially, and were
never sparse, so the last sector written happened to also match
the disk space occupied); but modern qemu does much better and
can punch holes for a qcow2 with allocation < physical.
Basically, after this patch, the three fields are now reliably
mapped as:
'capacity' - how much storage the guest can see (equal to
physical for raw images, determined by image metadata otherwise)
'allocation' - how much storage the image occupies (similar to
what 'du' would report)
'physical' - the last offset of the image (similar to what 'ls'
would report)
'capacity' can be larger than 'physical' (such as for a qcow2
image that does not vary much from a backing file) or smaller
(such as for a qcow2 file with lots of internal snapshots).
Likewise, 'allocation' can be (slightly) larger than 'physical'
(such as counting the tail of cluster allocations required to
round a file size up to filesystem granularity) or smaller
(for a sparse file). A block-resize operation changes capacity
(which, for raw images, also changes physical); many non-raw
images automatically grow physical and allocation as necessary
when starting with an allocation smaller than capacity; and even
when capacity and physical stay unchanged, allocation can change
when converting sectors from holes to data or back.
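For a raw image stored in a regular file, the mapping of the three fields can be approximated with a single stat call. This Python sketch is an illustration of the 'du' versus 'ls' distinction only, not the driver code; it assumes a Unix-like system where st_blocks is counted in 512-byte units:

```python
import os


def raw_image_block_info(path):
    """Approximate capacity/allocation/physical for a raw regular file."""
    st = os.stat(path)
    return {
        "capacity": st.st_size,            # raw image: same as physical
        "allocation": st.st_blocks * 512,  # disk usage, like 'du'
        "physical": st.st_size,            # last offset, like 'ls'
    }
```

For a sparse file, allocation is typically smaller than physical; non-raw formats would need the image metadata to determine capacity.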
Note that this does not change semantics for qcow2 images stored
on block devices; there, we still rely on qemu to report the
highest written extent for allocation. So using this API to
track when to extend a block device because a qcow2 image is
about to exceed a threshold will not see any changes.
Also, note that virStorageVolInfo is unfortunately limited to
just 'capacity' and 'allocation' (we can't expand it to add
'physical', although we can expand the XML to add it there);
historically, that struct's 'allocation' value has reported
file size for qcow2 files (what this patch terms 'physical'
for a domain block device), but disk usage for raw files (what
this patch terms 'allocation'). So follow-up patches will be
needed to make storage volumes report the same allocation
values and get at physical values, where those differ.
* include/libvirt/libvirt-domain.h (_virDomainBlockInfo): Tweak
documentation to match saner definition.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): For regular
files, physical size is capacity, not allocation.
Signed-off-by: Eric Blake <eblake@redhat.com>
Ultimately, we want to avoid read()ing a file while qemu is running.
We still have to open() block devices to determine their physical
size, but that is safer. This patch rearranges code to group
together all code that reads the image, to make it easier for later
patches to skip the metadata collection when possible.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Check for empty
disk up front. Place metadata reading next to use.
Signed-off-by: Eric Blake <eblake@redhat.com>
When requested in a later patch, the QMP command results are now
examined recursively. As qemu_driver will eventually have to
read items out of the hash table as stored by this patch, the
computation of backing alias string is done in a shared location.
* src/qemu/qemu_domain.h (qemuDomainStorageAlias): New prototype.
* src/qemu/qemu_domain.c (qemuDomainStorageAlias): Implement it.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONGetOneBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacityOne): Perform recursion.
(qemuMonitorJSONGetAllBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacity): Update callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
A future patch will allow recursion into backing chains when
collecting block stats. This patch should not change behavior,
but merely moves out the common code that will be reused once
recursion is enabled, and adds the parameter that will turn on
recursion.
* src/qemu/qemu_monitor.h (qemuMonitorGetAllBlockStatsInfo)
(qemuMonitorBlockStatsUpdateCapacity): Add recursion parameter,
although it is ignored for now.
* src/qemu/qemu_monitor.c (qemuMonitorGetAllBlockStatsInfo)
(qemuMonitorBlockStatsUpdateCapacity): Likewise.
* src/qemu/qemu_monitor_json.h
(qemuMonitorJSONGetAllBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacity): Likewise.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONGetAllBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacity): Add parameter, and
split...
(qemuMonitorJSONGetOneBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacityOne): ...into helpers.
(qemuMonitorJSONGetBlockStatsInfo): Update caller.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Update caller.
* src/qemu/qemu_migration.c (qemuMigrationCookieAddNBD): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Right now, grabbing blockinfo always calls stat on the disk, then
opens the image to determine the capacity, using a throw-away
virStorageSourcePtr. This has a couple of drawbacks:
1. We are calling stat and opening a file on every invocation of
the API. However, there are cases where the stats should NOT be
changing between successive calls (if a domain is running, no
one should be changing the physical size of a block device or raw
image behind our backs; capacity of read-only files should not
be changing; and we are the gateway to the block-resize command
to know when the capacity of read-write files should be changing).
True, we still have to use stat in some cases (a sparse raw file
changes allocation if it is read-write and the amount of holes is
changing, and a read-write qcow2 image stored in a file changes
physical size if it was not fully pre-allocated). But for
read-only images, even this should be something we can remember
from the previous time, rather than repeating every call.
2. We want to enhance the power of virDomainListGetStats, by
sharing code. But we already have a virStorageSourcePtr for
each disk, and it would be easier to reuse the common structure
than to have to worry about the one-off virDomainBlockInfoPtr.
While this patch does not optimize reuse of information in point
1, it does get us closer to being able to do so; by updating a
structure that survives between consecutive calls.
* src/util/virstoragefile.h (_virStorageSource): Add physical, to
mirror virDomainBlockInfo; rearrange fields to match public struct.
(virStorageSourceCopy): Copy the new field.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Store into
storage source, then copy to block info.
Signed-off-by: Eric Blake <eblake@redhat.com>
In order for a future patch to virDomainListGetStats to reuse
some code for determining disk usage of offline domains, we
need to make it easier to pull out part of the guts of grabbing
blockinfo. The current implementation grabs a job fairly late
in the game, while getstats will already own a job; reordering
things so that the job is always grabbed up front in both
functions will make it easier to pull out the common code.
This patch results in grabbing a job in cases where one was not
previously needed, but as it is a query job, it should not be
noticeably slower.
This patch touches the same code as the fix for CVE-2014-6458
(commit b799259); in that patch, we avoided hotplug changing
a disk reference during the time of obtaining a monitor lock
by copying all data we needed and no longer referencing disk;
this patch goes the other way and ensures that by holding the
job, the disk cannot be changed so we no longer need to worry
about the disk being invalidated across the monitor lock.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Rearrange job
control to be outside of disk information.
Signed-off-by: Eric Blake <eblake@redhat.com>
When any of the functions modified in commit 214c687b took the false
branch, the function itself used none of its parameters, resulting in an
"unused parameter" error. Rewriting these functions to the stubs we use
elsewhere should fix the problem.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit e3435caf added cleanup code to qemuDomainSetVcpusFlags() that was
not supposed to reset the error. The usual procedure was followed, saving
the error to a temporary variable, but it was never freed and so leaked.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit af2a1f05 tried to clearly separate each condition in
qemuRestoreCgroupState() for the sake of readability; however, somehow
one condition body was missing. That means that the body of the next
condition got executed only if both of them were true, which is
impossible, thus resulting in dead code and a logic error.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In commit d2632d60 we agreed that we want the parsed uid to properly
overflow but only to -1; however, the value was read into long and then
wrapped into uid_t. That meant it failed on 32-bit systems.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
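The 32-bit failure described above can be illustrated with a small
sketch. This is not libvirt's parser: sketch_parse_id is a hypothetical
helper that reads the value into a type that is at least 64 bits wide
on every platform, accepts -1 as "unspecified", and rejects anything
else outside the unsigned range.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not a libvirt function.
 * Returns 0 and stores the id on success, -1 on malformed or
 * out-of-range input.  "-1" is accepted and stored as (unsigned)-1. */
int sketch_parse_id(const char *str, unsigned int *id)
{
    char *end = NULL;
    long long val;

    errno = 0;
    val = strtoll(str, &end, 10);   /* long long is >= 64 bits everywhere */
    if (errno != 0 || end == str || *end != '\0')
        return -1;

    if (val == -1) {                /* the only negative value allowed */
        *id = (unsigned int)-1;
        return 0;
    }
    if (val < 0 || val > (long long)UINT_MAX)
        return -1;

    *id = (unsigned int)val;
    return 0;
}
```

Reading into a plain long would make the range check depend on whether
long is 32 or 64 bits wide, which is the platform difference the commit
message refers to.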
Currently virStorageFileResize() function uses build conditionals to
choose either the posix_fallocate() or syscall(SYS_fallocate) with no
fallback in order to preallocate the space in the newly resized file.
Since the safezero code has a similar set of conditionals, modify the
resize and safezero code in order to allow the resize logic to make use
of safezero to unify the look/feel of the code paths.
Add a new boolean (resize) to safezero() to make the optional decision
whether to try syscall(SYS_fallocate) if the posix_fallocate fails because
HAVE_POSIX_FALLOCATE is not defined (e.g., return -1 and errno == 0).
Create a local safezero_sys_fallocate in order to handle the resize
code paths that support that. If not present, set errno = ENOSYS
in order to allow the caller to handle the failure scenarios.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Currently build conditionals decide which of two safezero() functions
should be built - either the posix_fallocate() or mmap() with a fallback
to a slower safewrite() algorithm in order to preallocate space in a raw file.
This patch will refactor safezero to utilize static functions for either
posix_fallocate or mmap/safewrite. The build conditionals still exist, but
are only for shorter sections of code.
The posix_fallocate path will make use of the ret/errno setting to contain
the logic for safezero to decide whether it needs to fallback to other
algorithms. A return of -1 with errno not changed will indicate the conditional
is not present; otherwise, a return of -1 with errno change indicates the
call was made and it failed (no functional difference to current algorithm).
The mmap/safewrite option changes only slightly to handle the ftruncate
failure for mmap. That is, previously if the ftruncate failed, there was
no fallback to the slow safewrite option.
Signed-off-by: John Ferlan <jferlan@redhat.com>
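The ret/errno convention described above can be sketched as follows.
This is only an illustration under stated assumptions: the sketch_*
names are hypothetical, the mmap path is omitted, and the slow path is
a plain write() loop rather than libvirt's safewrite().

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the posix_fallocate path.
 * Returns -1 with errno left untouched when support is not built in. */
static int sketch_fallocate(int fd, off_t offset, off_t len)
{
#ifdef HAVE_POSIX_FALLOCATE
    int rc = posix_fallocate(fd, offset, len);
    if (rc == 0)
        return 0;
    errno = rc;     /* posix_fallocate returns the error number */
    return -1;
#else
    (void)fd;
    (void)offset;
    (void)len;
    return -1;      /* errno unchanged: "conditional not present" */
#endif
}

/* Slow fallback: write zeros in fixed-size chunks. */
static int sketch_write_zeros(int fd, off_t offset, off_t len)
{
    char buf[4096] = { 0 };

    if (lseek(fd, offset, SEEK_SET) < 0)
        return -1;
    while (len > 0) {
        size_t chunk = len < (off_t)sizeof(buf) ? (size_t)len : sizeof(buf);
        ssize_t done = write(fd, buf, chunk);
        if (done < 0 && errno == EINTR)
            continue;           /* interrupted, retry the same chunk */
        if (done <= 0)
            return -1;
        len -= done;
    }
    return 0;
}

int sketch_safezero(int fd, off_t offset, off_t len)
{
    int ret;

    errno = 0;
    ret = sketch_fallocate(fd, offset, len);
    if (ret == 0 || errno != 0)
        return ret;             /* succeeded, or was tried and failed */
    return sketch_write_zeros(fd, offset, len);  /* not compiled in */
}
```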
Currently, when there is an API that's blocking with locked domain and
second API that's waiting in virDomainObjListFindByUUID() for the domain
lock (with the domain list locked) no other API can be executed on any
domain on the whole hypervisor because all would wait for the domain
list to be locked. This patch adds a new optional approach to this in
which the domain is only ref'd (reference counter is incremented)
instead of being locked and is locked *after* the list itself is
unlocked. We might consider only ref'ing the domain in the future and
leaving locking on particular APIs, but that's not tonight's fairy tale.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Volume and pool formatting functions took different approaches to
unspecified uids/gids. When unknown, it is always parsed as -1, but one
of the functions formatted it as unsigned int (wrong) and one as
int (better). Due to that, two of our XML files from tests cannot
be parsed on 32-bit machines.
RNG schema needs to be modified as well, but because both
storagepool.rng and storagevol.rng need same schema for permission
element, save some space by moving it to storagecommon.rng.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When hot-plugging a VCPU into the guest, kvm needs to allocate some data
from the DMA zone, which might be in a memory node that's not allowed in
cpuset.mems. Basically the same problem as there was with starting the
domain and due to which commit 7e72ac7878
exists. This patch just extends it to hotplugging as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1161540
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Instead of setting the value of cpuset.mems once when the domain starts
and then re-calculating the value every time we need to change the child
cgroup values, leave the cgroup alone and rather set the child data
every time there is new cgroup created. We don't leave any task in the
parent group anyway. This will ease both current and future code.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In systemd >= 218, the udev_set_log_fn method has been marked
deprecated and turned into a no-op. Nothing in the udev client
library will print to stderr by default anymore, so we can
just stop installing a logging hook for new enough udev.
For SCSI and SATA devices, controller and unit are used
to specify the drive address. For IDE devices, bus specifies
the IDE bus, because usually there are 2 IDE buses on an IDE
controller.
The Parallels SDK allows setting the drive position by calling
PrlVmDev_SetStackIndex. Since PCS VMs have only one
controller of each type, for SATA and SCSI devices it
simply means the position on the bus; for IDE devices it is
2 * bus_number + position_on_bus.
This patch fixes mapping from libvirt's disk->info.addr.drive
to parallels's 'StackIndex'.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
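The address mapping described above can be written down as a short
sketch. The enum and function names are hypothetical (they are not the
libvirt or Parallels SDK identifiers), and rejecting a non-zero
controller is an assumption drawn from the "one controller of each
type" remark.

```c
/* Hypothetical bus types for this sketch only. */
enum sketch_bus {
    SKETCH_BUS_IDE,
    SKETCH_BUS_SCSI,
    SKETCH_BUS_SATA
};

/* Map a drive address (controller, bus, unit) to a stack index.
 * Returns -1 when the address names a second controller, which the
 * one-controller-per-type layout cannot represent (assumption). */
int sketch_stack_index(enum sketch_bus type, unsigned int controller,
                       unsigned int bus, unsigned int unit)
{
    if (controller != 0)
        return -1;

    switch (type) {
    case SKETCH_BUS_IDE:
        /* two positions per IDE bus */
        return (int)(2 * bus + unit);
    case SKETCH_BUS_SCSI:
    case SKETCH_BUS_SATA:
        /* position on the single bus */
        return (int)unit;
    }
    return -1;
}
```

For example, the second device on the second IDE bus (bus 1, unit 1)
maps to index 3, while a SATA device at unit 3 maps to index 3 directly.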
It seems the file format is usually specified even for
real block devices. So report that the file format is
raw in virDomainGetXMLDesc and add checks for proper
file format to prlsdkAddDisk.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
NULL value of virDomainVideoAccelDefPtr means default
values for video acceleration, so don't report error in
this case.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1174154
When we use attach-device to add a hostdev or chr device which has an
iscsi address or others (just like guest agent, subsys iscsi disk...),
we will find there is no basic controller for our newly attached device.
Sometimes this prevents the guest from starting after we add them
(although it can start the second time).
Signed-off-by: Luyao Huang <lhuang@redhat.com>
When libvirt is managing a bridge's forwarding database (FDB)
(macTableManager='libvirt'), if we add FDB entries for a new guest
interface even before the qemu process is created, then in the case of
a migration any other guest attached to the "destination" bridge will
have its traffic immediately sent to the destination of the migration
even while the source domain is still running (and the destination, of
course, isn't). To make sure that traffic from other guests on the new
host continues flowing to the old guest until the new one is ready, we
have to wait until the new guest CPUs are started to add the FDB
entries.
Conversely, we need to remove the FDB entries from the bridge any time
the guest CPUs are stopped; among other things, this will assure
proper operation during a post-copy migration (which is just the
opposite of the problem described in the previous paragraph).
We can change vnc password by using virDomainUpdateDeviceFlags API with
live flag. But it can't be changed with config flag. Error is reported as
below.
error: Operation not supported: persistent update of device 'graphics' is not supported
This patch supports the graphics arguments changed with config flag.
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
It's not supported to change some graphics arguments with '--live'.
Replace some error code VIR_ERR_INTERNAL_ERROR and VIR_ERR_INVALID_ARG
with VIR_ERR_OPERATION_UNSUPPORTED.
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1174096
When both parameters have lockspaces present, virDomainLeaseIndex
always returns -1 even if there is a lease identical to the one we
check. This is due to broken logic in the 'if-else' statement.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1173507
It occurred to me that OpenStack uses the following XML when not using
regular huge pages:
<memoryBacking>
<hugepages>
<page size='4' unit='KiB'/>
</hugepages>
</memoryBacking>
However, since we are expecting to see huge pages only, we fail to
startup the domain with following error:
libvirtError: internal error: Unable to find any usable hugetlbfs
mount for 4 KiB
While regular system pages are not huge pages technically, our code is
prepared for that and if it helps OpenStack (or other management
applications) we should cope with that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
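The decision described above amounts to comparing the requested page
size with the system page size. A minimal sketch, with hypothetical
sketch_* names that are not taken from the libvirt sources:

```c
#include <unistd.h>

/* System page size in KiB, or 0 if it cannot be determined. */
unsigned long long sketch_system_page_kib(void)
{
    long size = sysconf(_SC_PAGESIZE);

    return size > 0 ? (unsigned long long)size / 1024 : 0;
}

/* Returns 1 when the requested page size needs a hugetlbfs mount,
 * 0 when it equals the regular system page size and ordinary memory
 * can back the guest. */
int sketch_needs_hugetlbfs(unsigned long long page_kib,
                           unsigned long long system_page_kib)
{
    return page_kib != system_page_kib;
}
```

With a 4 KiB system page size, the <page size='4' unit='KiB'/> request
shown above would not look for a hugetlbfs mount, while a 2048 KiB
request still would.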
https://bugzilla.redhat.com/show_bug.cgi?id=1174053
Introduced by commit id '17bddc46f' - fix a libvirtd crash when
matching a network iscsi hostdev with a host iscsi hostdev.
When we use attach-device to coldplug a network iscsi hostdev,
libvirt will check if there is already a device in XML. But if
the 'b' is a host iscsi hostdev and 'a' is a network iscsi hostdev,
then libvirtd will crash in virDomainHostdevMatchSubsysSCSIiSCSI
because 'b' doesn't have a hostname.
Add a check in virDomainHostdevMatchSubsys, if the a's protocol
and b's protocol is not the same.
Following is the backtrace:
0 0x00007f850d6bc307 in virDomainHostdevMatchSubsysSCSIiSCSI at conf/domain_conf.c:10889
1 virDomainHostdevMatchSubsys at conf/domain_conf.c:10911
2 virDomainHostdevMatch at conf/domain_conf.c:10973
3 virDomainHostdevFind at conf/domain_conf.c:10998
4 0x00007f84f6a10560 in qemuDomainAttachDeviceConfig at qemu/qemu_driver.c:7223
5 qemuDomainAttachDeviceFlags at qemu/qemu_driver.c:7554
Signed-off-by: Luyao Huang <lhuang@redhat.com>
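The crash and its fix can be illustrated with a reduced model. The
struct below is hypothetical and far smaller than the real hostdev
definition; it only keeps the fields needed to show why the protocol
has to be compared before any protocol-specific field is touched.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of a SCSI hostdev source. */
enum sketch_protocol {
    SKETCH_PROTO_NONE,   /* host SCSI device */
    SKETCH_PROTO_ISCSI   /* network iSCSI device */
};

struct sketch_scsi_hostdev {
    enum sketch_protocol protocol;
    const char *hostname;  /* set only for SKETCH_PROTO_ISCSI */
    const char *adapter;   /* set only for SKETCH_PROTO_NONE */
};

/* Returns 1 when both devices describe the same source, 0 otherwise. */
int sketch_hostdev_match(const struct sketch_scsi_hostdev *a,
                         const struct sketch_scsi_hostdev *b)
{
    /* Without this check, comparing a network device against a host
     * device would dereference a NULL hostname. */
    if (a->protocol != b->protocol)
        return 0;

    if (a->protocol == SKETCH_PROTO_ISCSI)
        return strcmp(a->hostname, b->hostname) == 0;
    return strcmp(a->adapter, b->adapter) == 0;
}
```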
https://bugzilla.redhat.com/show_bug.cgi?id=1160995
In our config files users are expected to pass several integer values
for different configuration knobs. However, the majority of them expect
a nonnegative number and only a few of them accept a negative number too
(notably keepalive_interval in libvirtd.conf).
Therefore, a new type to config value is introduced: VIR_CONF_ULONG
that is set whenever an integer is positive or zero. With this
approach knobs accepting VIR_CONF_LONG should accept VIR_CONF_ULONG
too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
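The typing rule described above can be sketched as follows. The
SKETCH_* names are stand-ins for illustration only; they are not the
virConf definitions, and the real parser handles more value types.

```c
/* Hypothetical value types mirroring the LONG / ULONG distinction. */
enum sketch_conf_type {
    SKETCH_CONF_LONG,   /* negative integer */
    SKETCH_CONF_ULONG   /* zero or positive integer */
};

/* The parser tags a value as ULONG whenever it is non-negative. */
enum sketch_conf_type sketch_conf_classify(long long value)
{
    return value >= 0 ? SKETCH_CONF_ULONG : SKETCH_CONF_LONG;
}

/* A knob that allows negative numbers accepts both types; every
 * other knob accepts only ULONG. */
int sketch_conf_type_ok(enum sketch_conf_type got, int knob_allows_negative)
{
    if (got == SKETCH_CONF_ULONG)
        return 1;
    return knob_allows_negative != 0;
}
```

Under this rule a value of -1 is accepted for a knob such as
keepalive_interval but rejected for a knob that only makes sense with a
nonnegative value.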
There's no need for condition of the following form:
if (str && STREQ(str, dst))
since we have STREQ_NULLABLE macro that handles NULL cases.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
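For readers unfamiliar with the macro, its NULL handling can be
sketched as a plain function. This is an illustration of the semantics
as understood here, not the macro's actual definition; note that two
NULL pointers compare equal, so the replacement matches the open-coded
condition only when the second operand is known to be non-NULL.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of a NULL-tolerant string equality check.
 * Returns 1 when both are NULL or both hold the same text, else 0. */
int sketch_streq_nullable(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}
```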
For historical reasons, only the first <console> element might be of targetType
serial, but we checked for other consoles of targetType serial in our post-parse
callback if and only if we knew the first console was serial, otherwise
the check was skipped.
This patch moves the check one level up, so first
the check for secondary console of type serial is performed and then the
rest of operations continue unchanged.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1170092
We now have a qemuInterfaceStartDevices() which does the final
activation needed for the host-side tap/macvtap devices that are used
for qemu network connections. It will soon make sense to have the
converse qemuInterfaceStopDevices() which will undo whatever was done
during qemuInterfaceStartDevices().
A function to "stop" a single device has also been added, and is
called from the appropriate place in qemuDomainDetachNetDevice(),
although this is currently unnecessary - the device is going to
immediately be deleted anyway, so any extra "deactivation" will be for
naught. The call is included for completeness, though, in anticipation
that in the future there may be some required action that *isn't*
nullified by deleting the device.
This patch is a part of a more complete fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=1081461
The patch that added qemuInterfaceStartDevices() (upstream commit
82977058f5) had an extra conditional to
prevent calling it if the reason for starting the CPUs was
VIR_DOMAIN_RUNNING_UNPAUSED or VIR_DOMAIN_RUNNING_SAVE_CANCELED. This
was put in by the author as the result of a reviewer asking if it was
necessary to ifup the interfaces in *all* occasions (because these
were the two cases where the CPU would have already been started (and
stopped) once, so the interface would already be ifup'ed).
It turns out that, as long as there is no corresponding
qemuInterfaceStopDevices() to ifdown the interfaces anytime the CPUs
are stopped, neglecting to ifup when reason is RUNNING_UNPAUSED or
RUNNING_SAVE_CANCELED doesn't cause any problems (because it just
happens that the interface will have already been ifup'ed by a prior
call when the CPU was previously started for some other reason).
However, it also doesn't *help*, and there will soon be a
qemuInterfaceStopDevices() function which *will* ifdown these
interfaces when the guest CPUs are stopped, and once that is done, the
interfaces will be left down in some cases when they should be up (for
example, if a domain is paused and then unpaused).
So, this patch is removing the condition in favor of always calling
qemuInterfaceStartDevices() when the guest CPUs are started.
This patch (and the aforementioned patch) resolve:
https://bugzilla.redhat.com/show_bug.cgi?id=1081461
When one domain is being undefined and at the same time started, for
example, there is a possibility of a rare problem occurring.
- Thread 1 does virDomainUndefine(), has the lock, checks that the
domain is active and because it's not, calls
virDomainObjListRemove().
- Thread 2 does virDomainCreate() and tries to lock the domain.
- Thread 1 needs to lock domain list in order to remove the domain from
it, but must unlock domain first (proper order is to lock domain list
first and the domain itself second).
- Thread 2 grabs the lock, starts the domain and releases the lock.
- Thread 1 grabs the lock and removes the domain from list.
With this patch:
- The undefining domain gets marked as "to undefine" before it is
unlocked.
- If domain is found in any of the search APIs, it's returned only if
it is not marked as "to undefine". The check is done while the
domain is locked.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1150505
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When calling virCgroupAllowAllDevices we get these invalid entries
in the device cgroup config.
b -1:-1 rw
c -1:-1 rw
Check for positive values before outputting the major and minor to
avoid that.
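A minimal sketch of the check follows. It assumes that a negative
major or minor is meant as "any device" and is written with the '*'
wildcard that the devices cgroup controller accepts; the function name
is hypothetical and this is not the libvirt implementation.

```c
#include <stdio.h>
#include <string.h>

/* Format one devices cgroup entry such as "c 1:3 rw".
 * Negative major/minor values are emitted as the '*' wildcard
 * instead of a literal -1.  Returns 0 on success, -1 if the
 * buffer is too small. */
int sketch_format_device(char *buf, size_t buflen, char type,
                         int major, int minor, const char *perms)
{
    char majorstr[16];
    char minorstr[16];
    int n;

    if (major < 0)
        strcpy(majorstr, "*");
    else
        snprintf(majorstr, sizeof(majorstr), "%d", major);

    if (minor < 0)
        strcpy(minorstr, "*");
    else
        snprintf(minorstr, sizeof(minorstr), "%d", minor);

    n = snprintf(buf, buflen, "%c %s:%s %s", type, majorstr, minorstr, perms);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```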
For host-passthrough CPU we don't honor the CPU
features specified in the XML, but we allow
outputting them via the UPDATE_CPU flag for dumpxml,
this gives user a rough idea of what features the CPU
might have.
After restoring a managedsave'd domain, the features
might end up in the live status XML (in /var/run) without
the model. This XML cannot be parsed by the daemon after
restart and the domain might disappear.
This fix skips formatting the features for HOST_PASSTHROUGH
when UPDATE_CPU is not specified, so the newly restored domains
and newly created snapshots won't be affected.
Note: this doesn't fix existing snapshots or already restored
running domains.
https://bugzilla.redhat.com/show_bug.cgi?id=1030793
https://bugzilla.redhat.com/show_bug.cgi?id=1151885
A logic bug in qemuConnectGetAllDomainStats makes the code mark the
monitor as available when qemuDomainObjBeginJob fails, instead of when
it succeeds, as the correct flow requires.
This patch fixes the check and updates the code documentation
accordingly.
Broken by commit 57023c0a3a.
Signed-off-by: Francesco Romani <fromani@redhat.com>
When using qemuProcessAttach to attach a qemu process,
the DAC label is not filled correctly.
Introduce a new function to get the uid:gid from the system
and fill the label.
This fixes the daemon crash when 'virsh screenshot' is called:
https://bugzilla.redhat.com/show_bug.cgi?id=1161831
It also fixes qemu-attach after the prerequisite of this patch
(commit f8c1fb3) was pushed out of order.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Currently, MAC registration occurs during device creation, which is
early enough that, during live migration, you end up with duplicate
MAC addresses on still-running source and target devices, even though
the target device isn't actually being used yet.
This patch proposes to defer MAC registration until right before
the guest can actually use the device -- In other words, right
before starting guest CPUs.
Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Laine Stump <laine@laine.org>
Some programs want to change some values for the network interfaces
configuration in /proc/sys/net/ipv[46] folders. Giving RW access on them
allows wicked to work on openSUSE 13.2+.
Reusing the lxcNeedNetworkNamespace function to tell
lxcContainerMountBasicFS if the netns is disabled. When no netns is
set up, then we don't mount the /proc/sys/net/ipv[46] folder RW as
these would provide full access to the host NICs config.
https://bugzilla.redhat.com/show_bug.cgi?id=1172015
The refactoring done as part of commit id '59446096' caused a regression
for the multi initiator IQN commit '6aabcb5b' because the sendtargets was
not done on/for the initiator IQN prior to login (or trying to disable
autologin)
Prior to that commit, the paths were essentially
virStorageBackendISCSIStartPool
virStorageBackendISCSILogin
virStorageBackendISCSIConnection
if initiatoriqn
virStorageBackendCreateIfaceIQN
Issue sendtargets
Perform --login
else
Issue sendtargets
Perform --login
After that commit:
virStorageBackendISCSIStartPool
Issue sendtargets
Call virStorageBackendISCSIConnection
If initiatoriqn
virStorageBackendCreateIfaceIQN
Perform --login
else
Perform --login
So for non initiator IQN paths, nothing changed. For the initiator path,
the --login fails as does any attempts to change autologin via "--op update
--name node.startup --value manual".
In old versions of parted such as parted-2.1-25, the error message is
printed to stdout when printing info for a disk without a disk label.
Error: /dev/sda: unrecognised disk label
This line has been moved to stderr in newer versions of parted. So we
should check both stdout and stderr when locating this message.
This should fix bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1172468
Signed-off-by: Hao Liu <hliu@redhat.com>
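The check described above reduces to searching both captured streams
for the message. A minimal sketch, assuming the caller has already
collected parted's stdout and stderr into strings (the function name is
hypothetical):

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if either captured stream contains the "unrecognised
 * disk label" message, 0 otherwise.  NULL streams are treated as
 * empty. */
int sketch_has_unrecognised_label(const char *out, const char *err)
{
    static const char msg[] = "unrecognised disk label";

    return (out != NULL && strstr(out, msg) != NULL) ||
           (err != NULL && strstr(err, msg) != NULL);
}
```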
When the user doesn't have read access on one of the domains requested,
the for loop could exit abruptly or continue and overwrite the pointer
which pointed to a locked object.
This patch fixes two issues at once. One is that domflags might have
had QEMU_DOMAIN_STATS_HAVE_JOB even when there was no job started (this
is fixed by doing domflags |= QEMU_DOMAIN_STATS_HAVE_JOB only when the
job was acquired and clearing domflags on every start of the loop).
The second one is that the domain is kept locked when
virConnectGetAllDomainStatsCheckACL() fails and the loop continues
instead of ending. Adding a simple virObjectUnlock() and clearing the
pointer ought to do.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
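The two fixes can be seen together in a reduced model of the loop. All
names below are hypothetical and the lock is only an int flag standing
in for virObjectLock()/virObjectUnlock(); the point is the ordering:
flags are reset per iteration, the job flag is set only after a job was
acquired, and a failed ACL check unlocks before moving on.

```c
#include <stddef.h>

#define SKETCH_STATS_HAVE_JOB (1u << 0)

/* Hypothetical stand-in for a domain object. */
struct sketch_dom {
    int locked;               /* 1 while "locked" */
    int acl_ok;               /* result of the ACL check */
    int job_ok;               /* whether a job could be acquired */
    unsigned int seen_flags;  /* flags in effect when stats were taken */
    int collected;            /* 1 once stats were gathered */
};

/* Returns the number of domains for which stats were collected. */
size_t sketch_collect_stats(struct sketch_dom *doms, size_t ndoms)
{
    size_t reported = 0;
    size_t i;

    for (i = 0; i < ndoms; i++) {
        struct sketch_dom *dom = &doms[i];
        unsigned int domflags = 0;      /* reset on every iteration */

        dom->locked = 1;
        if (!dom->acl_ok) {
            dom->locked = 0;            /* do not leave it locked */
            continue;
        }

        if (dom->job_ok)                /* only when a job was acquired */
            domflags |= SKETCH_STATS_HAVE_JOB;

        dom->seen_flags = domflags;
        dom->collected = 1;
        reported++;
        dom->locked = 0;
    }
    return reported;
}
```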
If we want to perform some operation and domain state is not suitable
for that operation, we should report error VIR_ERR_OPERATION_INVALID.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
When PrlJob_GetRetCode sets second argument to
error value it means sdk function failed and we
must return error from getJobResultHelper.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Return error code, returned by parallels SDK from
waitJob and getJobResult, so that caller can handle
different errors.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Get cdrom devices list from parallels server in
prlsdkLoadDomains and add ability to define a domain
with cdroms.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
First, we don't need to call prlsdkApplyConfig after
creating new VM or containers, because it's done in
functions prlsdkCreateVm and prlsdkCreateCt.
No need to check, if domain exists in the list after
prlsdkAddDomain.
Also organize code, so that we can call virObjectUnlock
in one place.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
This patch replaces code, which creates domains by
running prlctl command.
prlsdkCreateVm/Ct will do prlsdkApplyConfig, because
we send request to the server only once in this case.
But prlsdkApplyConfig will be called also from
parallelsDomainDefineXML function. There is no problem with
it, parallelsDomainDefineXML will be refactored later.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Rewrite the code which applies the domain configuration given
to the virDomainDefineXML function to the VM or container
registered in PCS.
This code first checks if there are unsupported parameters
in the domain XML and, if so, reports an error. Some of these
parameters are not supported by PCS; for others it's not
obvious how to convert them into PCS's corresponding params,
so let's put that off and implement only basic params in
this patch.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Change domain state using parallels SDK functions instead of
prlctl command.
We don't need to send events from these functions now, because
the events handler will send them. But we still need to update
virDomainObj in privconn->domains.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Subscribe to events from parallels server. It's
needed for 2 things: to update cached domains list
and to send corresponding libvirt events.
Parallels server sends a lot of different events, in
this patch we handle only some of them. In the future
we can handle for example, changes in a host network
configuration or devices states.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Move macro parallelsDomNotFoundError to file parallels_utils.h, because
it will be used in parallels_sdk.c.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Obtain information about domains using parallels sdk instead of prlctl.
The prlsdkLoadDomains function behaves as the former parallelsLoadDomains
with NULL as second parameter (name) - it fills parallelsConn.domains list.
prlsdkLoadDomain is now able to update specified domain by given
virDomainObjPtr.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1082521
Support for shared hostdev's was added in a number of commits, initially
starting with 'f2c1d9a80' and most recently commit id 'fd243fc4' to fix
issues with the initial implementation. Missed in all those changes was
the need to mimic the virSELinux{Set|Restore}SecurityDiskLabel code to
handle the "shared" (or shareable) and readonly options when Setting
or Restoring the SELinux labels.
This patch will adjust the virSecuritySELinuxSetSecuritySCSILabel to not
use the virSecuritySELinuxSetSecurityHostdevLabelHelper in order to set
the label. Rather follow what the Disk code does by setting the label
differently based on whether shareable/readonly is set. This patch will
also modify the virSecuritySELinuxRestoreSecuritySCSILabel to follow
the same logic as virSecuritySELinuxRestoreSecurityImageLabelInt and not
restore the label if shared/readonly
https://bugzilla.redhat.com/show_bug.cgi?id=1171582
When we edit a negative controller address number to a device,
some of them will auto generate a controller with invalid index
number. This will make guest disappear after restart libvirtd.
Instead of allowing a negative number for the controller index, we
should forbid negative numbers in these places (we did this before,
but after f18c02ec, virStrToLong_ui changed to allow negative
numbers). Therefore switch to virStrToLong_uip in these places.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
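The difference between a lenient and a strict unsigned parse can be
sketched with the C library alone. strtoul() accepts a leading minus
sign and returns the negated value as an unsigned number, so a strict
parser has to reject the sign itself. The helper below is a
hypothetical illustration, not the libvirt function.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a decimal unsigned int, refusing any negative input.
 * Returns 0 and stores the value on success, -1 otherwise. */
int sketch_parse_uint_strict(const char *str, unsigned int *result)
{
    const char *p = str;
    char *end = NULL;
    unsigned long val;

    while (isspace((unsigned char)*p))
        p++;
    if (*p == '-')          /* strtoul would silently wrap this */
        return -1;

    errno = 0;
    val = strtoul(p, &end, 10);
    if (errno != 0 || end == p || *end != '\0' || val > UINT_MAX)
        return -1;

    *result = (unsigned int)val;
    return 0;
}
```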
Avoid leaving the domain locked on a failed ACL check in
qemuDomainMigratePerform() and qemuDomainMigrateFinish2().
Introduced in commit abf75aea24 (Add ACL checks into the QEMU driver).
Commit c75425734 introduced a compilation failure:
../../src/access/viraccessdriverpolkit.c: In function 'virAccessDriverPolkitCheck':
../../src/access/viraccessdriverpolkit.c:137:5: error: format '%d' expects argument of type 'int', but argument 9 has type 'pid_t' [-Werror=format=]
VIR_DEBUG("Check action '%s' for process '%d' time %lld uid %d",
^
Since mingw pid_t is 64 bits, it's easier to just follow what we've
done elsewhere and cast to a large enough type when printing pids.
* src/access/viraccessdriverpolkit.c (virAccessDriverPolkitCheck):
Add cast.
Signed-off-by: Eric Blake <eblake@redhat.com>
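The cast can be shown in isolation. This sketch is not the
VIR_DEBUG call from the patch; it only demonstrates printing a pid_t
portably by widening it to long long and using a matching format
specifier. The function name is hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>

/* Format a pid into buf.  pid_t may be 32 or 64 bits wide depending
 * on the platform, so it is cast to long long to match "%lld".
 * Returns the value of snprintf(). */
int sketch_format_pid(char *buf, size_t buflen, pid_t pid)
{
    return snprintf(buf, buflen, "process '%lld'", (long long)pid);
}
```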
lxcProcessSetupInterfaces() used to have a special case for
actualType='network' (a network with forward mode of route, nat, or
isolated) to call the libvirt public API to retrieve the bridge being
used by a network. That is no longer necessary - since all network
types that use a bridge and tap device now get the bridge name stored
in the ActualNetDef, we can just always use
virDomainNetGetActualBridgeName() instead.
qemuNetworkIfaceConnect() used to have a special case for
actualType='network' (a network with forward mode of route, nat, or
isolated) to call the libvirt public API to retrieve the bridge being
used by a network. That is no longer necessary - since all network
types that use a bridge and tap device now get the bridge name stored
in the ActualNetDef, we can just always use
virDomainNetGetActualBridgeName() instead.
(an audit of the two callers to qemuNetworkIfaceConnect() confirms
that it is never called for any other type of network, so the dead
code in the else statement (logging an internal error if it is called
for any other type of network) is eliminated in the process.)
When libvirt is managing the MAC table of a Linux host bridge, it must
turn off learning and unicast_flood for each tap device attached to
that bridge, then add a Forwarding Database (fdb) entry for the tap
device using the MAC address from the domain interface config.
Once we have disabled learning and flooding, any packet that has a
destination MAC address not present in the fdb will be dropped by the
bridge. This, along with the opportunistic disabling of promiscuous
mode[*], can result in enhanced network performance and a potential
slight security improvement.
[*] If there is only one device on the bridge with learning/unicast_flood
enabled, then that device will automatically have promiscuous mode
disabled. If there are *no* devices with learning/unicast_flood
enabled (e.g. for a libvirt "route", "nat", or isolated network that
has no physical device attached), then all non-tap devices will have
promiscuous mode disabled (tap devices always have promiscuous mode
enabled, which may be a bug in the kernel, but in practice has 0
effect).
None of this has any effect for kernels prior to 3.15 (upstream kernel
commit 2796d0c648c940b4796f84384fbcfb0a2399db84 "bridge: Automatically
manage port promiscuous mode"). Even after that, until kernel 3.17
(upstream commit 5be5a2df40f005ea7fb7e280e87bbbcfcf1c2fc0 "bridge: Add
filtering support for default_pvid") traffic will not be properly
forwarded without manually adding vlan table entries. Unfortunately,
although the presence of the first patch is signalled by existence of
the "learning" and "unicast_flood" options in sysfs, there is no
reliable way to query whether or not the system's kernel has the
second of those patches installed; the only thing that can be done is
to try the setting and see if traffic continues to pass.
When the bridge device for a network has macTableManager='libvirt' the
intent is that all kernel management of the bridge's MAC table
(Forwarding Database, or fdb, in the case of a Linux Host Bridge) be
disabled, with libvirt handling updates to the table instead. The
setup required for the bridge itself is:
1) set the "vlan_filtering" property of the bridge device to 1.
2) If the bridge has a "Dummy" tap device used to set a fixed MAC
address on the bridge (which is always the case for a bridge created
by libvirt, and never the case for a bridge created by the host system
network config), turn off learning and unicast_flood on this tap (this
is needed even though this tap is never IFF_UP, because the kernel
ignores the IFF_UP flag of devices when using their settings to
automatically decide whether or not to turn off promiscuous mode for
any attached device).
(1) is done both for libvirt-created/managed bridges, and for bridges
that are created by the host system config, while (2) is done only for
bridges created by libvirt (i.e. for forward modes of nat, routed, and
isolated bridges).
There is no attempt to turn vlan_filtering off when destroying the
network because in the case of a libvirt-created bridge, the bridge is
about to be destroyed anyway, and in the case of a system bridge, if
the other devices attached to the bridge could operate properly before
destroying libvirt's network object, they will continue to operate
properly (this is similar to the way that libvirt will enable
ip_forwarding whenever a routed/natted network is started, but will
never attempt to disable it if they are stopped).
At the time that the network driver allocates a connection to a
network, the tap device that will be used hasn't yet been created -
that will be done later by qemu (or lxc or whoever) - but if the
network has macTableManager='libvirt', then when we do get around to
creating the tap device, we will need to add an entry for it to the
network bridge's fdb (forwarding database) *and* turn off learning and
unicast_flood for that tap device in the bridge's sysfs settings. This
means that qemu needs to know both the bridge name as well as the
setting of macTableManager, so we either need to create a new API to
retrieve that info, or just pass it back in the ActualNetDef that is
created during networkAllocateActualDevice. We choose the latter
method, since it's already done for the bridge device, and it has the
side effect of making the information available in domain status.
(NB: in the future, I think that the tap device should actually be
created by networkAllocateActualDevice(), as that will solve several
other problems, but that is a battle for another day, and this
information will still be useful outside the network driver)
When the actualType of a virDomainNetDef is "network", it means that
we are connecting to a libvirt-managed network (routed, natted, or
isolated) which does use a bridge device (created by libvirt). In the
past we have required drivers such as qemu to call the public API to
retrieve the bridge name in this case (even though it is available in
the NetDef's ActualNetDef if the actualType is "bridge" (i.e., an
externally-created bridge that isn't managed by libvirt)). There is no
real reason for this difference, and as a matter of fact it
complicates things for qemu. Also, there is another bridge-related
attribute (macTableManager) that will need to be available in both
cases, so this makes things consistent.
In order to avoid problems when restarting libvirtd after an update
from an older version that *doesn't* store the network's bridgename in
the ActualNetDef, we also need to put it in place during
networkNotifyActualDevice() (this function is run for each interface
of each domain whenever libvirtd is restarted).
Along with making the bridge name available in the internal object, it
is also now reported in the <source> element of the <interface> state
XML (or the <actual> subelement in the internally-stored format).
The one oddity about this change is that usually there is a separate
union for every different "type" in a higher level object (e.g. in the
case of a virDomainNetDef there are separate "network" and "bridge"
members of the union that pivots on the type), but in this case
network and bridge types both have exactly the same attributes, so the
"bridge" member is used for both type==network and type==bridge.
The macTableManager attribute of a network's bridge subelement tells
libvirt how the bridge's MAC address table (used to determine the
egress port for packets) is managed. In the default mode, "kernel",
management is left to the kernel, which usually determines entries in
part by turning on promiscuous mode on all ports of the bridge,
flooding packets to all ports when the correct destination is unknown,
and adding/removing entries to the fdb as it sees incoming traffic
from particular MAC addresses. In "libvirt" mode, libvirt turns off
learning and flooding on all the bridge ports connected to guest
domain interfaces, and adds/removes entries according to the MAC
addresses in the domain interface configurations. A side effect of
turning off learning and unicast_flood on the ports of a bridge is
that (with Linux kernel 3.17 and newer), the kernel can automatically
turn off promiscuous mode on one or more of the bridge's ports
(usually only the one interface that is used to connect the bridge to
the physical network). The result is better performance (because
packets aren't being flooded to all ports, and can be dropped earlier
when they are of no interest) and slightly better security (a guest
can still send out packets with a spoofed source MAC address, but will
only receive traffic intended for the guest interface's configured MAC
address).
The attribute looks like this in the configuration:
<network>
<name>test</name>
<bridge name='br0' macTableManager='libvirt'/>
...
This patch only adds the config knob, documentation, and test
cases. The functionality behind this knob is added in later patches.
These two functions use netlink RTM_NEWNEIGH and RTM_DELNEIGH messages
to add and delete entries from a bridge's fdb. The bridge itself is
not referenced in the arguments to the functions, only the name of the
device that is attached to the bridge (since a device can only be
attached to one bridge at a time, and must be attached for this
function to make sense, the kernel easily infers which bridge's fdb is
being modified by looking at the device name/index).
I'm about to make block stats optionally more complex to cover
backing chains, where block.count will no longer equal the number
of <disks> for a domain. For these reasons, it is nicer if the
statistics output includes the source path (for local files).
This patch doesn't add anything for network disks, although we
may decide to add that later.
With this patch, I now see the following for the same domain as
in the previous patch (one qcow2 file, and an empty cdrom drive):
$ virsh domstats --block foo
Domain: 'foo'
block.count=2
block.0.name=hda
block.0.path=/var/lib/libvirt/images/foo.qcow2
block.1.name=hdc
* src/libvirt-domain.c (virConnectGetAllDomainStats): Document
new field.
* tools/virsh.pod (domstats): Document new field.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Return the new
stat for local files/block devices.
(QEMU_ADD_NAME_PARAM): Add parameter.
(qemuDomainGetStatsInterface): Update caller.
Signed-off-by: Eric Blake <eblake@redhat.com>
I noticed that for an offline domain, 'virsh domstats --block $dom'
was producing just the domain name, with no stats. But the older
'virsh domblkinfo' works just fine on offline domains. This patch
starts to get us closer, by at least reporting the disk names for
an offline domain.
With this patch, I now see the following for an offline domain
with one qcow2 disk and an empty cdrom drive:
$ virsh domstats --block foo
Domain: 'foo'
block.count=2
block.0.name=hda
block.1.name=hdc
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Don't short-circuit
output of block name.
Signed-off-by: Eric Blake <eblake@redhat.com>
At least with 'virsh domstats --block' on an offline domain, we
currently output no stats even though we recognize the stat
category. Although a later patch will improve this situation,
it is better to document that this is expected behavior.
Also, while the current implementation rejects filtering flags
for virDomainListGetStats, this limitation may be lifted in the
future and we do not enforce it at the API level.
* src/libvirt-domain.c (virConnectGetAllDomainStats): Document
that recognized stats might not be reported.
(virDomainListGetStats): Likewise, and tweak filtering documentation.
Signed-off-by: Eric Blake <eblake@redhat.com>
qemuDomainGetStatsBlock() could leak a stats hash table if it
encountered OOM while populating the virTypedParameters.
Oddly, the fix doesn't even touch qemuDomainGetStatsBlock :)
* src/qemu/qemu_driver.c (QEMU_ADD_COUNT_PARAM)
(QEMU_ADD_NAME_PARAM): Don't return early.
(qemuDomainGetStatsInterface): Adjust caller.
Signed-off-by: Eric Blake <eblake@redhat.com>
Whenever a client socket was marked as closed for some reason, that
reason could be overwritten when the connection was actually closed.
With this patch the reason recorded the first time the socket is marked
as closed is kept.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When a user tries to insert element metadata that also provides a
namespace declaration, we currently insert the element without
validating the XML prefix (if provided). The next VM start would then
fail with a parse error. This patch fixes the issue by calling
xmlValidateNCName() to check for illegal characters in the prefix.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1143921
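To illustrate the kind of check involved, here is a minimal sketch of
an NCName test. It is an ASCII-only approximation written for this
note (the function name is hypothetical); the patch itself relies on
libxml2's xmlValidateNCName(), which also accepts the non-ASCII
characters that XML allows.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* ASCII-only approximation of the XML NCName rule: the first character
 * must be a letter or '_'; later characters may also be digits, '-' or
 * '.'. A colon is never allowed. Non-ASCII names are rejected here even
 * though real NCNames may contain them. */
bool
example_prefix_is_ncname_ascii(const char *prefix)
{
    size_t i;

    if (!prefix || !prefix[0])
        return false;

    if (!isalpha((unsigned char)prefix[0]) && prefix[0] != '_')
        return false;

    for (i = 1; prefix[i]; i++) {
        unsigned char c = (unsigned char)prefix[i];

        if (!isalnum(c) && c != '_' && c != '-' && c != '.')
            return false;
    }
    return true;
}
```

Rejecting the prefix before the element is stored means the error is
reported at insertion time rather than at the next VM start.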
If probing capabilities via QMP fails, we now have a check
that prevents us falling back to -help parsing. Unfortunately
the error message
"Failed to probe capabilities for /usr/bin/qemu-kvm:
unsupported configuration: QEMU 2.1.2 is too new for help parsing"
is proving rather unhelpful to the user. We need to be telling
them why QMP failed (the root cause), rather than that they can't
use -help (the side effect).
To do this we should capture stderr during QMP probing, and
if -help parsing then sees a new QEMU version, we know that
QMP should have worked, and so we can show the messages from
stderr. The message thus becomes
"Failed to probe capabilities for /usr/bin/qemu-kvm:
internal error: QEMU / QMP failed: Could not access
KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory"
When attempting to create internal system checkpoint with a passthrough
device qemu will report the following error:
error: operation failed: Error -22 while writing VM
This patch calls the function to check if migration is possible with
given VM and thus improves the error to:
error: Requested operation is not valid: domain has assigned non-USB host devices
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=874418#c19
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1115292
In one of the previous commits (eafb53fe) we disallowed
network-wide bandwidth for some network types. However, we
forgot about <portgroups/> which can have <bandwidth/> too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Move entering the job into the thread to simplify the program flow. Also
as the code holds a separate reference to the domain object some
conditions can be simplified.
After this patch qemuDomainObjTransferJob is no longer needed so this
patch removes it.
Reboot requires more sophistication and is left as a future work item --
but at least part of the plumbing is in place.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If someone removes the blockcopy storage file while still in the
mirroring phase and then requests a blockjob abort using pivot, the
virsh command freezes. This is not an issue with older qemu versions
which did not support asynchronous jobs (which we prefer by default).
As we have reached the mirroring phase successfully, polling the
monitor for blockjob info always returns 1 and the loop never ends.
This fix introduces a check for qemuDomainBlockPivot return code, possibly
skipping the asynchronous waiting completely, if an error occurred and
asynchronous waiting was the preferred method.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1139567
Commit 86a15a25 introduced a new cpu driver API 'getModels'. The public
API allows you to pass NULL for models to get only the number of
existing models. However, the new code crashes with a segfault in that
case, so we have to account for the possibility that the user wants
only the number.
There is also a difference in the order of the models gathered by this
new API: the old approach inserted elements at the end of the array,
so we should use 'VIR_APPEND_ELEMENT'.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
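The calling convention described above can be sketched as follows. The
model table and the function name are made up for illustration and the
real driver code differs; the point is only that a NULL 'models'
argument yields the count without touching memory, and that entries
are appended in table order.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a driver's list of CPU models. */
static const char *const example_known_models[] = {
    "486", "pentium", "core2duo"
};
#define EXAMPLE_N_MODELS \
    (sizeof(example_known_models) / sizeof(example_known_models[0]))

/* Returns the number of models, or -1 on allocation failure.
 * If 'models' is NULL only the count is returned. Otherwise *models
 * receives a NULL-terminated array owned by the caller, filled by
 * appending entries in the same order as the table. */
int
example_get_models(char ***models)
{
    char **list;
    size_t i;

    if (!models)
        return (int)EXAMPLE_N_MODELS;

    list = calloc(EXAMPLE_N_MODELS + 1, sizeof(*list));
    if (!list)
        return -1;

    for (i = 0; i < EXAMPLE_N_MODELS; i++) {
        size_t len = strlen(example_known_models[i]) + 1;

        list[i] = malloc(len);
        if (!list[i])
            goto error;
        memcpy(list[i], example_known_models[i], len);
    }

    *models = list;
    return (int)EXAMPLE_N_MODELS;

 error:
    while (i > 0)
        free(list[--i]);
    free(list);
    return -1;
}
```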
Reconnect to the VM is a possibly long-running job spawned in a separate
thread. We should reload the snapshot defs and managedsave state prior
to spawning the thread to avoid blocking of the daemon startup which
would serialize on the VM lock.
Also the reloading code would violate the domain job held while
reconnecting as the loader functions don't create jobs.
Coverity pointed out that in other places we always check the return
value from virJSONValueObjectGetNumberLong() but not in the new addition
in leaseshelper. To solve the issue and also be more robust in case
somebody would corrupt the file, skip outputting of the lease entry in
case the expiry time is missing.
https://bugzilla.redhat.com/show_bug.cgi?id=1087104#c5
When an invalid offset is passed to virStorageVolUpload(), libvirt
fails in virFDStreamOpenFileInternal(), but it seems storageVolUpload()
does not check the return value and calls
virFDStreamSetInternalCloseCb() right after. At that point the stream's
privateData is still NULL, so the daemon crashes:
0 0x00007f09429a9c10 in pthread_mutex_lock () from /lib64/libpthread.so.0
1 0x00007f094514dbf5 in virMutexLock (m=<optimized out>) at util/virthread.c:88
2 0x00007f09451cb211 in virFDStreamSetInternalCloseCb at fdstream.c:795
3 0x00007f092ff2c9eb in storageVolUpload at storage/storage_driver.c:2098
4 0x00007f09451f46e0 in virStorageVolUpload at libvirt.c:14000
5 0x00007f0945c78fa1 in remoteDispatchStorageVolUpload at remote_dispatch.h:14339
6 remoteDispatchStorageVolUploadHelper at remote_dispatch.h:14309
7 0x00007f094524a192 in virNetServerProgramDispatchCall at rpc/virnetserverprogram.c:437
Signed-off-by: Luyao Huang <lhuang@redhat.com>
This patch enables the helper program to detect event(s) triggered when
there is a change in lease length or expiry and client-id. This
transfers complete control of leases database to libvirt and obsoletes
use of the lease database file (<network-name>.leases). That file will
not be created, read, or written. This is achieved by adding the option
--leasefile-ro to dnsmasq and passing a custom env var to leaseshelper,
which helps us map events related to leases with their corresponding
network bridges, whatever the event is.
Also, this requires the addition of a new non-lease entry in our custom
lease database: "server-duid". It is required to identify a DHCPv6
server.
Now that dnsmasq doesn't maintain its own leases database, it relies on
our helper program to tell it about previous leases and server duid.
Thus, this patch makes our leases program honor an extra action: "init",
in which it sends the known info in a particular format to dnsmasq
by printing it to stdout.
The drawback of this change is that upgrade to this new approach does
not transfer the existing leases for the network if the leaseshelper
wasn't already used.
For Intel and PowerPC the implementation is calling a cpu driver
function across driver layers (i.e. from qemu driver directly to cpu
driver).
The correct behavior is to use libvirt API functionality to perform such
an inter-driver call.
This patch introduces a new cpu driver API function getModels() to
retrieve the cpu models. The current implementation to process the
cpu_map XML content is transferred to the INTEL and PowerPC cpu driver
specific API functions.
Additionally processing the cpu_map XML file is not safe due to the fact
that the cpu map does not exist for all architectures. Therefore it is
better to encapsulate the processing in the architecture specific cpu
drivers.
Signed-off-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Based on the previous commit, we can now precreate missing volumes. While
digging out the functionality from storage driver would be nicer, if
you've seen the code it's nearly impossible. So I'm going from the
other end:
1) For given disk target, disk path is looked up.
2) For the disk path, storage pool is looked up, a volume XML is
constructed and then passed to virStorageVolCreateXML() which has all
the knowledge how to create raw images, (encrypted) qcow(2) images,
etc.
One of the advantages of this approach is, we don't have to care about
image conversion - qemu does that for us. So for instance, users can
transform qcow2 into raw on migration (if the correct XML is passed to
the migration API).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Until now, users have needed to precreate non-shared storage on
migration themselves. This is not a very friendly requirement and we
should do something about it. In this patch, the migration cookie is
extended,
so that <nbd/> section does not only contain NBD port, but info on
disks being migrated. This patch sends a list of pairs of:
<disk target; disk size>
to the destination. The actual storage allocation is left for next
commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The function queries the block devices visible to qemu
('query-block') and parses qemu's output. The info is
returned in a hash table which is expected to be pre-filled by
qemuMonitorJSONGetAllBlockStatsInfo(). However, in the next patch
we are not going to call the latter function at all, so we should
make the former function add devices into the hash table if not
found there.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While this could be exposed as a public API, it's not done yet as
there's no demand for that yet. Anyway, this is just preparing
the environment for easier volume creation on the destination.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since virDomainSnapshotFree will call virObjectUnref anyway, let's just use
that directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virInterfaceFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virNWFilterFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virSecretFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virStreamFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virStoragePoolFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virStorageVolFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virNodeDeviceFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virNetworkFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virDomainFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
virConnect.privateData is void *, so we can't access
fields of parallelsConn, pointer to which is stored in
virConnect.privateData. So replace all occurrences of
conn->privateData->storageState with privconn->storageState.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Partially reverts commit 5754dbd.
The code in the specfile adds a MAC address to every <bridge>,
even for <forward mode='bridge'> for which we don't support
changing MAC addresses.
Remove it completely. For new networks, we have been adding
MAC addresses on definition/creation since the commit mentioned above.
For existing networks (pre-0.9.0), the MAC is added by this commit.
https://bugzilla.redhat.com/show_bug.cgi?id=1156367
Since our big split of libvirt.c there are only a few functions
living there. The majority was moved to corresponding subfile,
e.g. domain functions were moved to libvirt-domain.c. However,
the patches for virDomainGetFSInfo() and virDomainFSInfoFree()
introduction were posted prior the big split and merged after.
This resulted in two domain functions landing in wrong file.
Move them to the correct one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Adding a non-existent nwfilter to a network interface device without any
nwfilter specified will crash the libvirt daemon with a segfault. The
reason is that the nwfilter is not found and libvirt will try to restore
the old nwfilter configuration, but there is no nwfilter specified.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The function virNetworkObjListExport() in network_conf.c had a call to
the public API virNetworkFree() which was causing a link error:
CCLD libvirt_driver_vbox_network_impl.la
./.libs/libvirt_conf.a(libvirt_conf_la-network_conf.o): In function `virNetworkObjListExport':
/home/laine/devel/libvirt/src/conf/network_conf.c:4496: undefined reference to `virNetworkFree'
This would happen when I added
#include "network_conf.h"
into domain_conf.h, then attempted to call a new function from that
file (an enum converter, similar to virNetworkForwardTypeToString()).
In the end, virNetworkFree() ends up just calling virObjectUnref(obj)
anyway (after clearing all pending errors, which we probably *don't*
want to do in the cleanup of a utility function), so this is likely
more correct than the original code as well.
There is a race condition between the fopen and fscanf calls
in qemuGetProcessInfo. If fopen succeeds, there is a small
possibility that the file no longer exists before reading from it.
Now, if either the fopen or the fscanf call fails, the function behaves
just as if only fopen had failed.
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1169055
Signed-off-by: Eric Blake <eblake@redhat.com>
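A minimal sketch of that pattern, using a hypothetical helper rather
than the real qemuGetProcessInfo: a file that cannot be opened and a
file that opens but can no longer be parsed are folded into the same
"report zero" path.

```c
#include <stdio.h>

/* Read the first unsigned number from 'path'. If the file cannot be
 * opened, or it opens but parsing fails (for example because the
 * process went away in between), *value is set to 0. Both cases
 * return 0, so the caller cannot tell them apart. */
int
example_read_counter(const char *path, unsigned long long *value)
{
    FILE *fp = fopen(path, "r");

    if (!fp || fscanf(fp, "%llu", value) != 1)
        *value = 0;

    if (fp)
        fclose(fp);
    return 0;
}
```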
Commit id 'cb88d433' refactored the calling sequence to use a thread;
however, in doing so it "lost" the check for whether virNetSocketAccept
returns failure. Since other code makes that check, Coverity complains.
Although a false positive, adding back the failure check pacifies
Coverity.
Commit id '0d36a5d05' modified the code slightly, but removed the
return value check thus causing Coverity to complain that this call
was the only one where the return value wasn't checked. Since nothing
was done previously if there was a failure, just use ignore_value here
to pacify Coverity.
Coverity complains that many other callers of virGetLastError() check
that the returned err is not NULL before dereferencing it. Just do the
same here for safety.
Coverity complained that because the cfg->macFilter call checked
net->ifname != NULL before calling ebtablesRemoveForwardAllowIn, then
the virNetDevOpenvswitchRemovePort call should have the same check.
However, if I move the ebtables call prior to the check for TYPE_DIRECT
(where there is a VIR_FREE(net->ifname)), then it seems Coverity is
happy. Since firewall info is tacked on last during setup, removing
it in the opposite order of initialization seems to be natural anyway.
https://bugzilla.redhat.com/show_bug.cgi?id=1159180
The virStoragePoolSourceFindDuplicate only checks the incoming definition
against the same type of pool as the def; however, for "scsi_host" and
"fc_host" adapter pools, it's possible that either some pool "scsi_host"
adapter definition is already using the scsi_hostN that the "fc_host"
adapter definition wants to use or some "fc_host" pool adapter definition
is using a vHBA scsi_hostN or parent scsi_hostN that an incoming "scsi_host"
definition is trying to use.
This patch adds the mismatched type checks and adds extraneous comments
to describe what each check is determining.
This patch also modifies the documentation to describe what scsi_hostN
devices a "scsi_host" source adapter should use and which to avoid. It also
updates the parent definition to specifically call out that for mixed
environments it's better to define which parent to use so that the duplicate
pool checks can be done properly.
https://bugzilla.redhat.com/show_bug.cgi?id=1159180
Move the API from the backend to storage_conf and rename it to
virStoragePoolGetVhbaSCSIHostParent. A future patch will need to
use this functionality from storage_conf.
There are some small issues in qemuProcessAttach:
1. Fix virSecurityManagerGetProcessLabel always getting pid = 0 by
moving 'vm->pid = pid' before the call to
virSecurityManagerGetProcessLabel.
2. Use virSecurityManagerGenLabel to get the image label.
3. Fix the selinux label always being set for other security drivers'
labels.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
When a block{commit,copy} job was aborted on a domain, the block job
handler did not process it correctly, leaving a phantom job in the
background. Any further call to any blockjob caused a
"block <jobtype> still active" error. This patch fixes the blockjob
handler so that it checks not only
for VIR_DOMAIN_BLOCK_JOB_FAILED status, but VIR_DOMAIN_BLOCK_JOB_CANCELED
status as well, followed by our existing cleanup routine.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1135169
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
If the job fails in qemuMigrationRun, we expect the jobinfo type to be
FAILED. But the jobinfo type won't be updated until entering
qemuMigrationWaitForCompletion. We should make sure it is updated in
all cases. Moreover, we can't use qemuMigrationUpdateJobStatus
here because the job may fail in libvirt, so we can't query the job
status from QEMU.
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
The migration job status is traced in qemuMigrationUpdateJobStatus
which is called in qemuMigrationRun. But if migration is cancelled
before the trace such as in qemuMigrationDriveMirror, the jobinfo
type won't be updated to CANCELLED. After this patch, we can get
jobinfo type CANCELLED if migration is cancelled during drive
mirror. Moreover, we can't use qemuMigrationUpdateJobStatus
because from qemu's point of view it's just the drive mirror being
cancelled and the migration hasn't even started yet.
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1160084
As of b6d4dad11b (1.2.5) we are trying to keep the status of FSFreeze
in the guest. Even though I've tried to fix a couple of corner cases
(6ea54769ba), it occurred to me just recently that the approach is
broken by design. Firstly, there are many other ways to talk to
qemu-ga (even through libvirt) by which filesystems can be thawed
(e.g. qemu-agent-command) without libvirt noticing. Moreover, there
are plenty of ways to thaw filesystems without even qemu-ga noticing
(yes, qemu-ga keeps internal track of FSFreeze status). So, instead of
keeping track ourselves, or asking qemu-ga for stale state, it's
best to let qemu-ga deal with that (and possibly let the guest kernel
propagate an error).
Moreover, there's one bug with that approach: if the fsfreeze
command failed, we executed fsthaw subsequently. So issuing
domfsfreeze in virsh gave the following result:
virsh # domfsfreeze gentoo
Froze 1 filesystem(s)
virsh # domfsfreeze gentoo
error: Unable to freeze filesystems
error: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance
virsh # domfsfreeze gentoo
Froze 1 filesystem(s)
virsh # domfsfreeze gentoo
error: Unable to freeze filesystems
error: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
virReportSystemError is reserved for reporting system errors, calling it
with VIR_ERR_* error codes produces error messages that do not make any
sense, such as
internal error: guest failed to start: Kernel doesn't support user
namespace: Link has been severed
We should prohibit wrong usage with a syntax-check rule.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
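The effect is easy to reproduce outside libvirt. The helper below is a
hypothetical stand-in for a system-error reporter: it appends
strerror(code) to the message, so any integer that is not an errno
value still gets looked up in the errno table and produces unrelated
text.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Format "<msg>: <strerror(code)>" into 'buf'. 'code' is expected to
 * be an errno value; passing some other project's error number still
 * "works" but yields a misleading suffix. Returns what snprintf()
 * returns. */
int
example_format_system_error(char *buf, size_t buflen,
                            const char *msg, int code)
{
    return snprintf(buf, buflen, "%s: %s", msg, strerror(code));
}
```

On Linux with glibc, strerror(ENOLINK) is "Link has been severed", the
stray suffix in the message quoted above. That is consistent with a
libvirt error code sharing ENOLINK's numeric value being passed where
an errno was expected.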
This reverts commit 433b427ff8.
The patch was added in order to overcome a bug in iproute2 and since it
was properly identified as a bug, particularly in openSUSE 13.2, and it
is being worked on [1], the best solution for libvirt seems to be to
keep the old behaviour.
[1] https://bugzilla.novell.com/show_bug.cgi?id=907093
Starting from libvirt-1.2.4, network state XML files moved to another
directory (see commit b9e95491) and libvirt automatically migrates the
network state files to a new location. However, the code used
dirent.d_type which is not supported by all filesystems. Thus, when
libvirt was upgraded on a host which used such filesystem, network state
XMLs were not properly moved and running networks disappeared from
libvirt.
This patch falls back to lstat() whenever dirent.d_type is DT_UNKNOWN to
fix this issue.
https://bugzilla.redhat.com/show_bug.cgi?id=1167145
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
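The fallback can be sketched like this. The helper name is
hypothetical and the real patch is structured differently; it only
shows the technique of trusting d_type when the filesystem provides it
and calling lstat() when it reports DT_UNKNOWN.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE   /* expose DT_* and lstat() on glibc */
#endif
#include <dirent.h>
#include <stdbool.h>
#include <sys/stat.h>

/* Decide whether a directory entry is a regular file. 'path' is the
 * full path of the entry; 'd_type' is the value readdir() reported. */
bool
example_entry_is_regular(const char *path, unsigned char d_type)
{
    struct stat sb;

    /* The filesystem filled in d_type: no extra syscall needed. */
    if (d_type != DT_UNKNOWN)
        return d_type == DT_REG;

    /* d_type is not supported here: ask lstat() instead. */
    if (lstat(path, &sb) < 0)
        return false;

    return S_ISREG(sb.st_mode);
}
```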
Commit 2aa167ca tried to fix the DBus interaction code to allow
callers to use native types instead of 4-byte bools. But in
fixing the issue, I missed the case of an arrayref; Conrad Meyer
shows the following valid complaint issued by clang:
CC util/libvirt_util_la-virdbus.lo
util/virdbus.c:956:13: error: cast from 'bool *' to 'dbus_bool_t *' (aka 'unsigned int *') increases required alignment from 1 to 4 [-Werror,-Wcast-align]
GET_NEXT_VAL(dbus_bool_t, bool_val, bool, "%d");
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
util/virdbus.c:858:17: note: expanded from macro 'GET_NEXT_VAL'
x = (dbustype *)(*xptrptr + (*narrayptr - 1)); \
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated.
But fixing that points out that we have NEVER supported arrayrefs
of sub-int types (byte, i16, u16, and now bool). Again, while raw
types promote, arrays do not; so the macros HAVE to deal with both
size possibilities rather than assuming that an arrayref uses the
same sizing as the promoted raw type.
Obviously, our testsuite wasn't covering as much as it should have.
* src/util/virdbus.c (GET_NEXT_VAL): Also fix array cases.
(SET_NEXT_VAL): Fix uses of sub-int arrays.
* tests/virdbustest.c (testMessageArray, testMessageArrayRef):
Test it.
Signed-off-by: Eric Blake <eblake@redhat.com>
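The underlying problem is that an array of 1-byte bools cannot be
reinterpreted as an array of 4-byte D-Bus booleans. A minimal sketch of
the safe alternative, converting element by element, is shown below.
The typedef and function name are illustrative only and this is not
the macro change made in virdbus.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for dbus_bool_t, which the clang output above shows to be
 * an unsigned int; uint32_t avoids needing the D-Bus headers. */
typedef uint32_t example_dbus_bool;

/* Widen each 1-byte bool into its own 4-byte slot. Casting 'in' to
 * (example_dbus_bool *) instead would read four bools per element and
 * may violate the alignment the wider type requires. */
void
example_widen_bools(const bool *in, example_dbus_bool *out, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = in[i] ? 1 : 0;
}
```

The same reasoning applies to the other sub-int types mentioned above
(byte, i16, u16): a single value promotes when passed by value, but an
array keeps its element size.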
Commit 6fcddfcd refactored job statistics but missed updating the
jobinfo type in qemuDomainGetJobInfo. After this patch, we can use
virDomainGetJobInfo to get the jobinfo type again.
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit 'c264eeaa' didn't do the prerequisite 'make syntax-check' before
pushing. There was a <tab> in the whitespace for the comment. Replaced
with spaces and aligned.
Pushed as a build breaker since Jenkins complained loudly.
The typical case where we had a problem is with such a filesystem
definition as created by virt-sandbox-service:
<filesystem type='bind' accessmode='passthrough'>
<source dir='/var/lib/libvirt/filesystems/mysshd/var'/>
<target dir='/var'/>
</filesystem>
In this case, we don't want to unmount the /var subtree or we may
lose access to the source folder.
Resolving symlinks can fail before mounting any file system if one file
system depends on another being mounted. Symlinks are now resolved in
two passes:
* Before any file system is mounted, but then we are more gentle if
the source path can't be accessed
* Right before mounting a file system, so that we are sure that we
have the resolved path... but then if it can't be accessed we raise
an error.
Due to a change (or bug?) in the ip link implementation, the command
'ip link add vnet0...'
must now be written as
'ip link add name vnet0...'
The changed command also works on older versions of iproute2; the only
difference is that the 'name' parameter has been made mandatory.
Add attribute to set vgamem_mb parameter of QXL device for QEMU. This
value sets the size of VGA framebuffer for QXL device. Default value in
QEMU is 8MB so reuse it also in libvirt to not break things.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
So far we didn't have any option to set video memory size for qemu video
devices. There was only the vram (ram for QXL) attribute but it was valid
only for the QXL video device.
To provide this feature to users QEMU has a dedicated device attribute
called 'vgamem_mb' to set the video memory size. We will use the 'vram'
attribute for setting video memory size for other QEMU video devices.
For the cirrus device we will ignore the vram value because it has
hardcoded video size in QEMU.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
QEMU has two different types of QXL display device. The first, "qxl-vga",
is for the primary video device and the second, "qxl", is for secondary
video devices.
There are also two different ways to specify those devices on the qemu
command line: the obsolete "-vga" option and the current "-device"
option. The "-vga" option can only be used to set up the primary video
device, so "-vga qxl" is equivalent to "-device qxl-vga". Unfortunately
"-vga qxl" doesn't support setting additional parameters for the device,
so the "-global" option must be used for this purpose. It's mandatory to
use "-global qxl-vga...." to set the parameters of a primary video device
previously defined with "-vga qxl".
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The vram attribute was introduced to set the video memory but it is
usable only for a few hypervisors, excluding QEMU/KVM and the old XEN
driver. In the case of QEMU, vram was used only for QXL.
This patch updates the documentation to reflect the current code in libvirt
and also changes the cases in which we set the default vram attribute.
It also fixes the existing strange default value for VGA devices, changing
9MB to 16MB, because the video ram should be rounded to a power of two.
The change of default value could affect migrations but I found out that
QEMU always rounds the video ram to a power of two internally so it's safe
to change the default value to the next closest power of two and also
silently correct every domain XML definition. And it's also safe because
we don't pass the value to QEMU.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
There are two special cases, if the input number is 0 or the number is
larger than 2^31 (for 32bit unsigned int). For the special cases the
return value is 0 because they cannot be rounded.
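The rounding rule described above can be sketched as follows. This is only an illustration, not the helper that the patch adds: the function name round_up_power_of_two is hypothetical, and the loop-based approach is an assumption chosen for readability.

```c
#include <limits.h>

/* Hypothetical helper (name is illustrative): round n up to the next
 * power of two. Returns 0 when n is 0 or when n is larger than the
 * highest power of two an unsigned int can hold (2^31 for 32 bits),
 * because such values cannot be rounded. */
static unsigned int
round_up_power_of_two(unsigned int n)
{
    const unsigned int top_bit = 1U << (sizeof(unsigned int) * CHAR_BIT - 1);
    unsigned int p = 1;

    if (n == 0 || n > top_bit)
        return 0;

    /* p never exceeds top_bit here, so the shift cannot overflow. */
    while (p < n)
        p <<= 1;

    return p;
}
```

With this sketch, an input of 9 yields 16, and an input that is already a power of two is returned unchanged.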
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Get the list of mounted filesystems, which contains hardware info of the
disks and their controllers, from QEMU guest agent 2.2+. Then, convert the
hardware info to the corresponding device aliases for the disks.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
virDomainGetFSInfo returns a list of information about the filesystems
mounted in the guest, containing mountpoints, device names, filesystem
types, and device aliases named by libvirt. This will be useful, for example,
to specify mountpoints to fsfreeze when taking a snapshot of a subset of disks.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
As qemu is now able to notify us about change of the channel state used
for communication with the guest agent we now can more precisely track
the state of the guest agent.
To allow notifying management apps this patch implements a new event
that will be triggered on changes of the guest agent state.
Improve the monitor function to also retrieve the guest state of
character device (if provided) so that we can refresh the state of
virtio-serial channels and perhaps react to changes in the state in
future patches.
This patch changes the returned data from qemuMonitorGetChardevInfo to
return a structure containing the pty path and the state for all the
character devices.
The change to the testsuite makes sure that the data is parsed
correctly.
This patch contains three domain cleanup improvements in the migration
finish phase, ensuring a domain is properly disposed when a failure is
detected or the migration is cancelled.
The check for virDomainObjIsActive is moved to libxlDomainMigrationFinish,
where cleanup can occur if migration failed and the domain is inactive.
The 'cleanup' label was misplaced in libxlDomainMigrationFinish, causing
a migrated domain to remain in the event of an error or cancelled migration.
In cleanup, the domain was not removed from the driver's list of domains.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
During the perform phase of migration, the domain is started on
the dst host in a running state if VIR_MIGRATE_PAUSED flag is not
specified. In the finish phase, the domain is also unpaused if
VIR_MIGRATE_PAUSED flag is unset. I've noticed this second unpause
fails if the domain was already unpaused following the perform phase.
This patch changes the perform phase to always start the domain
paused, and defers unpausing, if requested, to the finish phase.
Unpausing should occur in the finish phase anyhow, where the domain
can be properly destroyed if the perform phase fails and migration
is cancelled.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Moving data reception of the perform phase of migration to a
thread introduces a race with the finish phase, where checking
if the domain is active races with the thread finishing the
perform phase. The race is easily solved by acquiring a job in
the finish phase, which must wait for the perform phase job to
complete.
While wrapping the finish phase in a job, I noticed the virDomainObj
was being unlocked in a callee - libxlDomainMigrationFinish. Move
the unlocking to libxlDomainMigrateFinish3Params, where the lock
is acquired.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
The libxl driver receives migration data within an IO callback invoked
by the event loop, effectively disabling the event loop while migration
occurs.
This patch moves receiving of the migration data to a thread. The
incoming connection is still accepted in the IO callback, but control
is immediately returned to the event loop after spawning the thread.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Specifying an explicit path to pygrub (e.g. BINDIR "/pygrub") only works if
Xen and libvirt happen to be installed to the same prefix. A more flexible
approach is to simply specify "pygrub" which will cause libxl to use the
correct path which it knows (since it is built with the same prefix as pygrub).
This is particularly problematic in the Debian packaging, since the Debian Xen
package relocates pygrub into a libexec dir, however I think this change makes
sense upstream.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
To be able to express some use cases of the RBD backing with libvirt, we
need to be able to specify a config file for the RBD client to qemu as
that is one of the commonly used options.
Some storage systems have internal support for snapshots. Libvirt should
be able to select a correct snapshot when starting a VM.
This patch adds a XML element to select a storage source snapshot for
the RBD protocol which supports this feature.
As we now have a common function to parse backing store string for RBD
backing store we can reuse it in the backing store walker so that we
don't fail on files backed by RBD storage.
This patch also adds a few tests to verify that the parsing works as
expected.
To allow reusing this non-trivial parser code in the backing store parser,
this part of the command line parser needs to be split out into a
separate function.
Instead of splitting out various fields, pass the complete structure and
let the function pick the various things it needs from it.
As one of the callers isn't using virStorageSourcePtr to store the data,
this patch adds glue code that fills the data into a dummy
virStorageSourcePtr before calling the func.
This change will help when adding new fields that need output processing
in the future.
If there are no hosts for a storage source virStorageSourceCopy and
virStorageSourceNewFromBackingRelative would try to copy them anyways.
As virStorageNetHostDefCopy signals success by returning a pointer, and
malloc of 0 elements might return NULL depending on the implementation,
the result of the copy function may vary.
Fix this by copying the hosts array only if there are hosts defined.
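A minimal sketch of the guard, assuming a simplified stand-in structure: struct host and copy_hosts are hypothetical names for illustration and do not match the real virStorageNetHostDef code.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for a storage host entry; the real libvirt structure differs. */
struct host {
    char name[64];
    int port;
};

/* Copy n hosts from src into a freshly allocated array stored in *dst.
 * Returns 0 on success and -1 on allocation failure. With n == 0 nothing
 * is allocated, so a NULL result is never mistaken for out-of-memory. */
static int
copy_hosts(struct host **dst, const struct host *src, size_t n)
{
    *dst = NULL;

    if (n == 0)
        return 0;

    *dst = calloc(n, sizeof(**dst));
    if (!*dst)
        return -1;

    memcpy(*dst, src, n * sizeof(**dst));
    return 0;
}
```

Reporting success through a separate return code, rather than through the returned pointer, keeps the empty case unambiguous.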
As we now have a deep copy function for struct virStorageSource add a
notice that extensions of the structure require also appropriate changes
to the virStorageSourceCopy func.
New qemu added a new event that is emitted when a virtio serial channel
is opened in the guest OS. This allows us to update the state of the
port in the output-only XML element.
This patch implements the monitor callbacks and necessary handlers to
update the state in the definition.
To track state of virtio channels this patch adds a new output-only
attribute called 'state' to the <target> element of virtio channels.
This will be later populated with the guest state of the channel.
To unify future additions that require information from "query-chardev"
rename qemuMonitorGetPtyPaths and friends to qemuMonitorGetChardevInfo
and move the allocation of the returned hash into the top level
function.
When creating a disk image snapshot the libvirt code would blindly copy
the parent's label to the newly created image. This runs into problems
when you start a VM from an image hosted on NFS (or other storage system
that doesn't support selinux labels) and the snapshot destination is on
a storage system that does support selinux labels. Libvirt's code in
that case generates a different security label for the image hosted on
NFS. This label is valid only for NFS images and doesn't allow access in
case of a locally stored image.
To fix this issue libvirt needs to refrain from copying security
information in cases where the default domain seclabel is a better
choice.
This patch repurposes the now unused @force argument of
virStorageSourceInitChainElement to denote whether a copy of the
security labelling information should be attempted or not. This allows
fine-grained control of the copy operation for cases where we need to keep
the label of the old disk vs. the cases where we need to keep the label
unset to use the default domain imagelabel.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1151718
Commit c0e7022 breaks on a machine that lacks dbus headers:
In file included from util/virdbus.c:24:0:
util/virdbuspriv.h:31:3: error: unknown type name 'dbus_int16_t'
* src/util/virdbuspriv.h (DBusBasicValue): Only provide fallback
when dbus is compiled.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1152382
When libvirt creates the vport (VPORT_CREATE) for the vHBA, there isn't
enough "time" between the creation and the running of the following
backend->refreshPool after a backend->startPool in order to find the LU's.
Population of LU's happens asynchronously when udevEventHandleCallback
discovers the "new" vHBA port. Creation of the infrastructure by udev
is an iterative process creating and discovering actual storage devices and
adjusting the environment.
Because of the time it takes to discover and set things up, the backend
refreshPool call done after the startPool call will generally fail to
find any devices. This makes the newly started pool appear empty when
querying via 'vol-list' after startup. The "workaround" has always been
to run pool-refresh after startup (or any time thereafter) in order to
find the LU's. Depending on how quickly it is run after startup, this too
may not find any LUs in the pool. Eventually though, given enough time and
retries, it will find something if LU's exist for the vHBA.
This patch adds a thread to be executed after the VPORT_CREATE which will
attempt to find the LU's without requiring an external run of pool-refresh.
It does this by waiting for 5 seconds and searching for the LU's. If any
are found, then the thread completes; otherwise, it will retry once more
in another 5 seconds. If none are found in that second pass, the thread
gives up.
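The wait-and-retry behaviour of that thread could look roughly like the sketch below. The names find_lus_fn, find_lus_with_retry and found_on_second_call are hypothetical, the thread creation itself is omitted, and the delay is a parameter (the patch describes 5 seconds) so the logic can be exercised without waiting.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical callback type: scans for LUs, returns true if any exist. */
typedef bool (*find_lus_fn)(void *opaque);

/* Wait delay_secs, then search. If nothing is found, wait and search
 * once more before giving up. Returns true if either pass found LUs. */
static bool
find_lus_with_retry(find_lus_fn find, void *opaque, unsigned int delay_secs)
{
    int attempt;

    for (attempt = 0; attempt < 2; attempt++) {
        sleep(delay_secs);
        if (find(opaque))
            return true;
    }

    return false;
}

/* Example callback for illustration only: succeeds on the second call,
 * counting calls through the opaque pointer. */
static bool
found_on_second_call(void *opaque)
{
    int *calls = opaque;

    return ++(*calls) >= 2;
}
```

Capping the loop at two passes mirrors the message: the thread does not keep polling, since udev may still add, change and delete the same device.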
Things learned while investigating this... No need to try and fill the
pool too quickly or too many times. Over the course of creation, the udev
code may 'add', 'change', and 'delete' the same device. So if the refresh
code runs and finds something, it may display it only to have a subsequent
refresh appear to "lose" the device. The udev processing doesn't seem to
have a way to indicate that it's all done with the creation processing of a
newly found vHBA. Only the Lone Ranger has silver bullets to fix everything.
Fix a problem in getBlockDevice and processLU where retval was initialized
to zero, causing some failures to erroneously continue through to
virStorageBackendSCSINewLun with an attempt to find a path for "/dev/(null)".
This would fail approximately 10 seconds later with debug message:
virStorageBackendSCSINewLun:203 :
No stable path found for '/dev/(null)' in '/dev/disk/by-path'
The root cause of the issue is that for many /sys/bus/scsi/devices/<lun path>
there is no "block*" device found for the vHBA's, where "<lun path>" are
the various paths created for the vHBA, such as "17:0:0:0", "17:0:1:0",
"17:0:2:0", "17:0:3:0", etc. If the block device isn't found, then the
directory should just be ignored rather than attempting to process it.
The bug was that in getBlockDevice the assumption was "block" would exist
and either getNewStyleBlockDevice or getOldStyleBlockDevice would fill in
@block_device. However, if 'block*' doesn't exist, then the code returned
NULL for block_device *and* a good (zero) retval value. This caused the
processLU code to attempt the virStorageBackendSCSINewLun which failed
"at some point in time" in the future.
After this change, on a test system pool-refresh no longer had a noticeable
pause of about 20 seconds; it completed within a second since there was no
longer an attempt/need to find "/dev/(null)".
Additionally, virStorageBackendSCSIFindLUs shouldn't be declaring a LU
found unless processLU actually returns success. This will be
important in the followup patch which relies on whether a LU was found.
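The getBlockDevice fix described above boils down to starting from a failure value and only reporting success once a "block*" entry has really been found. The sketch below illustrates that pattern; get_block_device and its array-of-names input are hypothetical simplifications and do not reflect the real directory scanning code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup: returns 0 and sets *block_device when an entry
 * starting with "block" exists, and -1 otherwise, so that the caller can
 * skip the directory instead of processing a NULL device name. */
static int
get_block_device(const char *const *entries, size_t n,
                 const char **block_device)
{
    int retval = -1;   /* pessimistic default; only a match clears it */
    size_t i;

    *block_device = NULL;

    for (i = 0; i < n; i++) {
        if (strncmp(entries[i], "block", 5) == 0) {
            *block_device = entries[i];
            retval = 0;
            break;
        }
    }

    return retval;
}
```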
Compilation on a RHEL 5 host failed, due to the older dbus headers
present on that machine, and triggered by commit 2aa167ca:
util/virdbus.c: In function 'virDBusMessageIterDecode':
util/virdbus.c:952: error: 'DBusBasicValue' undeclared (first use in this function)
* m4/virt-dbus.m4 (LIBVIRT_CHECK_DBUS): Check for DBusBasicValue.
* src/util/virdbuspriv.h (DBusBasicValue): Provide fallback.
Signed-off-by: Eric Blake <eblake@redhat.com>
getsockopt(sock->fd, SOL_SOCKET, SO_PEERCRED, ...) sets the pid to 0
when the process that opens the connection is in another container.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Commit dc33e6e4 caused older platforms like Fedora 20 to emit
scary log messages at startup:
2014-11-19 23:12:58.800+0000: 28906: error : virCommandWait:2532 : internal error: Child process (/usr/sbin/iptables -w -L -n) unexpected exit status 2: iptables v1.4.19.1: unknown option "-w"
Try `iptables -h' or 'iptables --help' for more information.
Since we are probing and expect to handle the case where -w is not
supported, we should not let virCommand log it as an error.
* src/util/virfirewall.c (virFirewallCheckUpdateLock): Handle
non-zero status ourselves.
Signed-off-by: Eric Blake <eblake@redhat.com>
Oops, I forgot to squash one more instance of the same check in the
previous commit (v1.2.10-144-g52691f9).
https://bugzilla.redhat.com/show_bug.cgi?id=1147331
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Any attempt to start a tunnelled migration with libvirtd that supports
RDMA migration (specifically commit v1.2.8-226-ged22a47) crashes
libvirtd on the destination host.
The crash is inevitable because qemuMigrationPrepareAny is always called
with NULL protocol in case of tunnelled migration.
https://bugzilla.redhat.com/show_bug.cgi?id=1147331
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
As discussed on the upstream list, it's better not to make this
kind of prediction in libvirt. It may happen that qemu learns
how to enable OVMF on other architectures too and we shouldn't
try to chase that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently, we are whitelisting architectures that we know how to run
OVMF on. So far, only x86_64 was enabled. However, looking at qemu
code, the same commandline can be used to enable OVMF for armv7l and
aarch64.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I noticed this while working on qemuDomainGetBlockInfo. Assigning
a bool value to an int variable compiles fine, but raises red flags
on the maintenance front as it becomes too easy to assign -1 or 2
or any other non-bool value to the same variable.
* cfg.mk (sc_prohibit_int_assign_bool): New rule.
* src/conf/snapshot_conf.c (virDomainSnapshotRedefinePrep): Fix
offenders.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo)
(qemuDomainSnapshotCreateXML): Likewise.
* src/test/test_driver.c (testDomainSnapshotAlignDisks):
Likewise.
* src/util/vircgroup.c (virCgroupSupportsCpuBW): Likewise.
* src/util/virpci.c (virPCIDeviceBindToStub): Likewise.
* src/util/virutil.c (virIsCapableVport): Likewise.
* tools/virsh-domain-monitor.c (cmdDomMemStat): Likewise.
* tools/virsh-domain.c (cmdBlockResize, cmdScreenshot)
(cmdInjectNMI, cmdSendKey, cmdSendProcessSignal)
(cmdDetachInterface): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Use of an 'int' to represent a 'bool' value is confusing. Just
because dbus made the mistake of cementing their 4-byte wire
format of dbus_bool_t into their API doesn't mean we have to
repeat the mistake. With a little bit of finesse, we can
guarantee that we provide a large-enough value to the DBus
code, while still copying only the relevant one-byte bool
to the client code, and isolate the rest of our code base from
the DBus stupidity.
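The idea can be sketched without the DBus headers by using a 4-byte stand-in for dbus_bool_t. Everything named here (wire_bool_t, wire_get_bool, get_bool) is hypothetical and only illustrates the "receive wide, copy narrow" pattern, not the actual GET_NEXT_VAL macro.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for dbus_bool_t, which occupies 4 bytes. */
typedef uint32_t wire_bool_t;

/* Stand-in for a DBus getter: always writes a full 4-byte value to out,
 * so out must point at storage that is at least that large. */
static void
wire_get_bool(void *out, wire_bool_t value)
{
    memcpy(out, &value, sizeof(value));
}

/* Receive into a large-enough temporary, then hand only a plain bool to
 * the client code, so a smaller bool variable is never overrun. */
static bool
get_bool(wire_bool_t value)
{
    wire_bool_t tmp = 0;

    wire_get_bool(&tmp, value);
    return tmp != 0;
}
```

Passing the address of a bool directly to wire_get_bool would write 4 bytes into what is typically a 1-byte object, which is exactly what the temporary avoids.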
* src/util/virdbus.c (GET_NEXT_VAL): Add parameter.
(virDBusMessageIterDecode): Adjust all clients.
* src/util/virpolkit.c (virPolkitCheckAuth): Use nicer type.
* tests/virdbustest.c (testMessageSimple, testMessageStruct):
Test new behavior.
Signed-off-by: Eric Blake <eblake@redhat.com>
This function returned non-inactive domains instead of active
domains. This broke virConnectNumOfDefinedDomains() and
virConnectListDefinedDomains() functions.
Ethernet interfaces in libvirt currently do not support bandwidth setting.
For example, with the following XML for an interface, the bandwidth
settings are not applied to the corresponding qdiscs.
<interface type="ethernet">
<mac address="02:36:1d:18:2a:e4"/>
<model type="virtio"/>
<script path=""/>
<target dev="tap361d182a-e4"/>
<bandwidth>
<inbound average="984" peak="1024" burst="64"/>
<outbound average="2000" peak="2048" burst="128"/>
</bandwidth>
</interface>
Signed-off-by: Anirban Chakraborty <abchak@juniper.net>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For some reason, commit id '72b4151f' triggered a Coverity uninitialized
'reply' variable check when referenced within the for loop.
It seems Coverity doesn't know that flags will have to be either AFFECT_LIVE
or AFFECT_CONFIG after the virDomainLiveConfigHelperMethod call.
By adding a "sa_assert()" to confirm that fact, Coverity is happy again.
https://bugzilla.redhat.com/show_bug.cgi?id=1164080
After a disk is hotunplugged a subsequent call to qemuDomainGetBlockIoTune
to get the --config settings of that disk will fail because the disk is no
longer found by qemuDiskPathToAlias causing an unexpected failure.
Since only the --live flag needs to have the disk device pointer, move the
fetch inside the (flags & VIR_DOMAIN_AFFECT_LIVE) condition. This will also
affect the results if no flags are provided or the --current flag is provided.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
It seems 'size_iops_sec' was a late addition: the checks for whether
the field was defined but unsupported, and for the maximum size of the
field, were not being made.
Also, adjust blkdeviotune support error message for grammar, spelling
(paramater), and remove the "(need QEMU 1.7 or superior)". None of
our other similar error messages list which QEMU version is required.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rule sc_prohibit_newline_at_end_of_diagnostic for syntax-check does
check for passing strings ending with '\n' two lines after known
functions. This is, of course subject to false positives, so for the
sake of future changes, trick that syntax-check by adding one more line
with a comment.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
A previous commit introduced use of locking with invocation
of iptables in the viriptables.c module
commit ba95426d6f
Author: Serge Hallyn <serge.hallyn@ubuntu.com>
Date: Fri Nov 1 12:36:59 2013 -0500
util: use -w flag when calling iptables
This only ever had effect with the virtual network driver,
as it was not wired up into the nwfilter driver. Unfortunately
in the firewall refactoring the use of the -w flag was
accidentally lost.
This patch introduces it to the virfirewall.c module so that
both the virtual network and nwfilter drivers will be using
it. It also ensures that the equivalent --concurrent flag
to ebtables is used.
Since QEMU 1.2.0, we switched to QMP probing instead of parsing -help
(and other commands, such as -cpu ?) output. However, if QMP probing
failed, we still tried starting QEMU with various options and parsing
the output, which was guaranteed to fail because the output changed.
Let's just refuse parsing -help for QEMU >= 1.2.0.
https://bugzilla.redhat.com/show_bug.cgi?id=1160318
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We used to set migration capabilities only when a user asked for them in
flags. This is fine when migration succeeds since the QEMU process is
killed in the end but in case migration fails or if it's cancelled, some
capabilities may remain turned on with no way to turn them off. To fix
that, migration capabilities have to be turned on if requested but
explicitly turned off in case they were not requested but QEMU supports
them.
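A minimal sketch of that rule, assuming a simplified model: struct capability and sync_capabilities are hypothetical names, and the real code talks to the QEMU monitor rather than flipping a field. Error handling for a requested but unsupported capability is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of one migration capability (illustrative only). */
struct capability {
    bool supported;   /* QEMU knows this capability */
    bool requested;   /* the user asked for it via flags */
    bool enabled;     /* current state inside the QEMU process */
};

/* For every capability QEMU supports, force the state to match the
 * request: on if requested, explicitly off otherwise. Unsupported
 * capabilities are left untouched. */
static void
sync_capabilities(struct capability *caps, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (!caps[i].supported)
            continue;
        caps[i].enabled = caps[i].requested;
    }
}
```

The first case in the loop is the one the fix targets: a capability left enabled by an earlier failed or cancelled migration is turned off again when the next migration does not request it.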
https://bugzilla.redhat.com/show_bug.cgi?id=1163953
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Rather than just picking the first CD (or failing that, HDD) we come
across, if the user has picked a boot device ordering with <boot
order=''>, respect that (and just try to boot the lowest-index device).
Adds two sets of tests to bhyve2xmlargv; 'grub-bootorder' shows that we
pick a user-specified device over the first device in the domain;
'grub-bootorder2' shows that we pick the first (lowest index) device.
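The selection order can be sketched as below. The types and the function pick_boot_disk are hypothetical simplifications of the domain definition, and treating a boot index of 0 as "no <boot order=''> given" is an assumption of this sketch.

```c
#include <stddef.h>

enum disk_kind { DISK_HDD, DISK_CDROM };

struct disk {
    enum disk_kind kind;
    int boot_index;   /* assumption: 0 means no boot order was given */
};

/* Return the array index of the disk to boot, or -1 if there is none.
 * Preference: lowest non-zero boot_index, else first CD, else first HDD. */
static int
pick_boot_disk(const struct disk *disks, size_t n)
{
    int best = -1, first_cd = -1, first_hdd = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (disks[i].boot_index > 0 &&
            (best < 0 || disks[i].boot_index < disks[best].boot_index))
            best = (int)i;
        if (first_cd < 0 && disks[i].kind == DISK_CDROM)
            first_cd = (int)i;
        if (first_hdd < 0 && disks[i].kind == DISK_HDD)
            first_hdd = (int)i;
    }

    if (best >= 0)
        return best;

    return first_cd >= 0 ? first_cd : first_hdd;
}
```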
When a user calls setmem on a running LXC machine, we update its cgroup
entry, but we update neither the domain's runtime XML nor our internal
structures. This patch fixes that.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1131919
Commit 6e5c79a1 tried to fix deadlock between nwfilter{Define,Undefine}
and starting of guest, but this same deadlock exists for
updating/attaching network device to domain.
The deadlock was introduced by removing the global QEMU driver lock, because
nwfilter was counting on this lock and ensured that all driver locks were
locked inside of nwfilter{Define,Undefine}.
This patch extends usage of virNWFilterReadLockFilterUpdates to prevent
the deadlock for all possible paths in QEMU driver. LXC and UML drivers
still have global lock.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1143780
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
In one of my previous patches (3a3c3780b) I tried to fix the
problem of the nvram path disappearing on a domain that's been
started and shut down again. I fixed this by explicitly saving the
domain's config file. However, I was a bit clumsy and didn't
realize we have transient domains for which we don't save the
config file. Hence, any domain using UEFI became persistent.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1160926
Introduce a 'managed' attribute to allow libvirt to decide whether to
delete a vHBA vport created via external means such as nodedev-create.
The code currently decides whether to delete the vHBA based solely on
whether the parent was provided at creation time. However, that may not
be the desired action, so rather than delete and force someone to create
another vHBA via an additional nodedev-create allow the configuration of
the storage pool to decide the desired action.
During createVport when libvirt does the VPORT_CREATE, set the managed
value to YES if not already set to indicate to the deleteVport code that
it should delete the vHBA when the pool is destroyed.
If libvirtd is restarted, all the memory-only state is lost, so for a
persistent storage pool, use virStoragePoolSaveConfig in order to
write out the managed value.
Because we're now saving the current configuration, we need to be sure
to not save the parent in the output XML if it was undefined at start.
Saving the name would cause future starts to always use the same parent
which is not the expected result when not providing a parent. By not
providing a parent, libvirt is expected to find the best available
vHBA port for each subsequent (re)start.
At deleteVport, use the new managed value to decide whether to execute
the VPORT_DELETE. Since we no longer save the parent in memory or in
XML when provided, if it was not provided, then we have to look it up.
https://bugzilla.redhat.com/show_bug.cgi?id=1160926
Passing a copy of the storage pool adapter to a function just changes the
copy of the fields in the particular function and then when returning to
the caller those changes are discarded. While not yet biting us in the
storage clean-up case, it did cause an issue for the fchost storage pool
startup case, createVport. The issue was at startup, if no parent is found
in the XML, the code will search for the 'best available' parent and then
store that in the in memory copy of the adapter. Of course, in this case
it was a copy, so when returning to the virStorageBackendSCSIStartPool that
change was discarded (or lost) from the pool->def->source.adapter which
meant at shutdown (deleteVport), the code assumed no adapter was passed
and skipped the deletion, leaving the vHBA created by libvirt still defined
requiring an additional step of a nodedev-destroy to remove.
Adjusted the createVport to take virStoragePoolDefPtr instead of the
adapter copy. Then use the virStoragePoolSourceAdapterPtr when processing.
A future patch will need the 'def' anyway, so this just sets up for that.
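The difference between the old and the new calling convention can be shown with a small sketch. The structures and function names are hypothetical stand-ins; the real code uses virStoragePoolDefPtr and virStoragePoolSourceAdapterPtr.

```c
#include <string.h>

/* Simplified stand-ins for the pool definition and its adapter. */
struct adapter {
    char parent[32];
};

struct pool_def {
    struct adapter adapter;
};

/* Old style: the adapter is received as a copy, so the parent stored
 * here is discarded as soon as the function returns. */
static void
set_parent_by_value(struct adapter adapter, const char *parent)
{
    strncpy(adapter.parent, parent, sizeof(adapter.parent) - 1);
    adapter.parent[sizeof(adapter.parent) - 1] = '\0';
}

/* New style: the definition is received by pointer, so the parent is
 * stored in the caller's structure and is still there at shutdown. */
static void
set_parent_by_pointer(struct pool_def *def, const char *parent)
{
    strncpy(def->adapter.parent, parent, sizeof(def->adapter.parent) - 1);
    def->adapter.parent[sizeof(def->adapter.parent) - 1] = '\0';
}
```

After set_parent_by_value the caller's parent field is still empty, which corresponds to the skipped deletion described above; after set_parent_by_pointer it holds the chosen parent.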
https://bugzilla.redhat.com/show_bug.cgi?id=1160565
The existing code assumed that the configuration of a 'parent' attribute
was correct for the createVport path. As it turns out, that may not be
the case, which leads to errors during the deleteVport path because the
wwnn/wwpn isn't associated with the parent.
With this change the following is reported:
error: Failed to start pool fc_pool_host3
error: XML error: Parent attribute 'scsi_host4' does not match parent 'scsi_host3' determined for the 'scsi_host16' wwnn/wwpn lookup.
for XML as follows:
<pool type='scsi'>
<name>fc_pool</name>
<source>
<adapter type='fc_host' parent='scsi_host4' wwnn='5001a4aaf3ca174b' wwpn='5001a4a77192b864'/>
</source>
Where 'nodedev-dumpxml scsi_host16' provides:
<device>
<name>scsi_host16</name>
<path>/sys/devices/pci0000:00/0000:00:04.0/0000:10:00.0/host3/vport-3:0-11/host16</path>
<parent>scsi_host3</parent>
<capability type='scsi_host'>
<host>16</host>
<unique_id>13</unique_id>
<capability type='fc_host'>
<wwnn>5001a4aaf3ca174b</wwnn>
<wwpn>5001a4a77192b864</wwpn>
...
The patch also adjusts the description of the storage pool to describe the
restrictions.
Signed-off-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1160565
If a 'parent' attribute is provided for the fchost, then at startup
time check to ensure it is a vport capable scsi_host. If the parent
is not vport capable, then disallow the startup. The following is the
expected result:
error: Failed to start pool fc_pool
error: XML error: parent 'scsi_host2' specified for vHBA is not vport capable
where the XML for the fc_pool is:
<pool type='scsi'>
<name>fc_pool</name>
<source>
<adapter type='fc_host' parent='scsi_host2' wwnn='5001a4aaf3ca174b' wwpn='5001a4a77192b864'/>
</source>
...
and 'scsi_host2' is not vport capable.
Providing an incorrect parent and a correct wwnn/wwpn could lead to
failures at shutdown (deleteVport) where the assumption is the parent
is for the fchost.
NOTE: If the provided wwnn/wwpn doesn't resolve to an existing scsi_host,
then we will be creating one with code (virManageVport) which
assumes the parent is vport capable.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This enables booting interactive GRUB menus (e.g. install CDs) with
libvirt-bhyve.
Caveat: A terminal other than the '--console' option to 'virsh start'
(e.g. 'cu -l /dev/nmdm0B -s 115200') must be used to connect to
grub-bhyve because the bhyve loader path is synchronous and must occur
before the VM actually starts.
Changing the bhyveProcessStart logic around to accommodate '--console'
for interactive loader use seems like a significant project and probably
not worth it, if UEFI/BIOS support for bhyve is "coming soon."
We still default to bhyveloader(1) if no explicit bootloader
configuration is supplied in the domain.
If the /domain/bootloader looks like grub-bhyve and the user doesn't
supply /domain/bootloader_args, we make an intelligent guess and try
chainloading the first partition on the disk (or a CD if one exists,
under the assumption that for a VM a CD is likely an install source).
Caveat: Assumes the HDD boots from the msdos1 partition. I think this is
a pretty reasonable assumption for a VM. (DrvBhyve with Bhyveload
already assumes that the first disk should be booted.)
I've tested both HDD and CD boot and they seem to work.
Use the device type name if we know it instead of its number,
even if we can't hotplug it:
qemuMonitorJSONAttachCharDevCommand:6094 : operation failed: Unsupported
char device type '10'
virDomainChrSourceDefIsEqual should return 'true' for
identical SPICEVMC chardevs, and those that have no source
specification.
After this change, a failed hotplug no longer leaves a stale
pointer in the domain definition.
https://bugzilla.redhat.com/show_bug.cgi?id=1162097
If the memory mode is specified as 'strict' and with one node, we
get the following error when starting domain.
error: Unable to write to '$cgroup_path/cpuset.mems': Device or resource busy
XML is configured with numatune as follows:
<numatune>
<memory mode='strict' nodeset='0'/>
</numatune>
It's broken by Commit 411cea638f
which moved qemuSetupCgroupForEmulator() before setting cpuset.mems
in qemuSetupCgroupPostInit.
Directory '$cgroup_path/emulator/' is created in qemuSetupCgroupForEmulator.
But '$cgroup_path/emulator/cpuset.mems' is not set and has a default value
(all nodes, such as 0-1). Then we set up '$cgroup_path/cpuset.mems' to the
nodemask (in this case it's '0') in qemuSetupCgroupPostInit. It must fail.
This patch makes sure '$cgroup_path/emulator/cpuset.mems' is set before
'$cgroup_path/cpuset.mems'. The action is similar to that in
qemuDomainSetNumaParamsLive.
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
If the memory mode in numatune is not 'strict', we should not set up
cpuset.mems. Before commit 1a7be8c600
we checked the memory mode in virDomainNumatuneGetNodeset. This
patch restores that check.
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
If the memory mode in numatune is specified as 'preferred' with one node
(such as nodeset='0'), the domain's memory is not guaranteed to be entirely
on node 0. If node 0 doesn't have enough memory, memory can be allocated
on node 1 when the qemu process starts up. If we then set cpuset.mems to '0',
it may trigger the OOM killer.
Commit 1a7be8c600 changed the former logic of
checking memory mode in virDomainNumatuneGetNodeset. This patch restores
that check.
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
This patch fixes the following issues.
1) When an invalid wwn is introduced, libvirt reports
"Malformed wwn: %s". The template won't be replaced.
2) "target" option for dompmsuspend and "xml" option for
save-image-define are required options and should use
VSH_OT_DATA instead of VSH_OT_STRING as an option type.
3) A typo.
Signed-off-by: Hao Liu <hliu@redhat.com>
Coverity found that commit cd490086 introduced a possible NULL pointer
dereference. phyp_driver can be NULL at the time the socket is closed,
whereas connection_data, which held the socket before the mentioned
commit, could not be NULL.
However, internal_socket is still the local socket, and it can be
closed even unconditionally if we initialize it to -1.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Check the availability of the options with the current qemu binary;
add them to the variable opt if available, print a message if not.
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Detect whether the qemu binary currently in use supports the bps_max option.
If yes, add it to the command; if not, just ignore the option.
We don't print an error here, because the check for invalid arguments
has already been made in qemu_driver.c
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Add support for bps_max and friends in the driver part.
In the part checking whether qemu is running, check whether the running binary
supports bps_max; if not, print an error message, if yes, add it to the
"info" variable.
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add the capability to detect whether the qemu binary can
use bps_max and friends.
Add a value to the virQEMUCapsFlags enum for the qemu capability.
Set it with virQEMUCapsSet if the binary supports bps_max and friends.
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Modify the structure _virDomainBlockIoTuneInfo to support the new
options.
Change the initialization of the variable expectedInfo in qemumonitorjsontest.c
to avoid a compilation problem.
Add documentation about the new XML options.
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
nodeSetMemoryParameters() calls nodeSetMemoryParameterValue()
to set parameters, but it only treats the return code '-2' as a
failure. We should report an error whenever rc is negative.
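A minimal sketch of the corrected check, with a hypothetical set_one callback
standing in for the per-parameter setter (the names are illustrative, not the
libvirt functions themselves):

```python
def set_parameters(params, set_one):
    """Apply each (name, value) pair; return -1 on the first negative rc."""
    for name, value in params:
        rc = set_one(name, value)
        # Treat every negative code as failure, not just -2.
        if rc < 0:
            return -1
    return 0
```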
https://bugzilla.redhat.com/show_bug.cgi?id=1161541
Signed-off-by: Jincheng Miao <jmiao@redhat.com>
The CPU NUMA topology implicitly expects memory to be specified in 'KiB'.
Allow a 'unit' attribute that states the unit of the memory value.
Users can now specify memory in the unit of their choice, and libvirt
lists it in 'KiB', just like other 'memory' elements in XML.
<numa>
<cell cpus='0-3' memory='1024' unit='MiB' />
<cell cpus='4-7' memory='1024' unit='MiB' />
</numa>
Also augment test cases to correctly model NUMA memory specification.
This adds the tag 'unit="KiB"' for memory attribute in NUMA cells.
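For illustration only, the conversion to KiB could look like the sketch
below. The unit table is a small assumed subset and rounding up to a whole
KiB is an assumption of this sketch, not something stated above.

```python
# Assumed subset of binary units, expressed in bytes.
UNIT_BYTES = {
    "b": 1,
    "KiB": 1024,
    "MiB": 1024 ** 2,
    "GiB": 1024 ** 3,
}


def to_kib(value, unit="KiB"):
    """Scale value in the given unit to KiB, rounding up (assumed)."""
    nbytes = value * UNIT_BYTES[unit]
    return -(-nbytes // 1024)  # ceiling division
```

With this, a cell declared as memory='1024' unit='MiB' is reported as
1048576 KiB.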
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit 01b4de2b9f abstracted virDomainParseMemory()
for use by other functions in domain_conf.c.
Export it so that functions outside of this file can use it too.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Store version numbers in this format
version = 1000000 * major + 1000 * minor + micro
produced by virParseVersionString instead of dedicated enums.
Split the complex esxVI_ProductVersion enum into a simpler
esxVI_ProductLine enum and a product version number.
Relax API and product version number checks to accept everything that
is equal or greater than the supported minimum version. VMware ESX
went through 3 major versions and the vSphere API always stayed
backward compatible. This commit assumes that this will also be true
for future VMware ESX versions.
Also reword error messages in esxConnectTo* to say what was expected
and what was found instead (suggested by Richard W.M. Jones).
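A small sketch of the version encoding and the relaxed "equal or greater"
check. parse_version mimics the formula above; MIN_SUPPORTED is a
hypothetical minimum chosen only for the example.

```python
def parse_version(text):
    """Encode 'major[.minor[.micro]]' as 1000000*major + 1000*minor + micro."""
    parts = [int(p) for p in text.split(".")]
    major, minor, micro = (parts + [0, 0])[:3]
    return 1000000 * major + 1000 * minor + micro


MIN_SUPPORTED = parse_version("3.5")  # hypothetical minimum for this sketch


def is_supported(text):
    """Accept anything equal to or newer than the minimum."""
    return parse_version(text) >= MIN_SUPPORTED
```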
While reviewing patches upstream it occurred to me that we have two
functions doing nearly the same thing: virDomainParseMemory, which
expects XML in the following format:
<memory unit='MiB'>1337</memory>
The other function is virDomainHugepagesParseXML, which expects the
following format:
<someElement size='1337' unit='MiB'/>
Having two functions handle two different scenarios like this
would not matter if it did not mean copying around the code that
handles 32-bit arches. So this patch merges the common parts into
one by adding a new @units_xpath argument to
virDomainParseMemory which allows overriding the default location
of the @unit attribute in XML. With this change both scenarios above
can be parsed with virDomainParseMemory.
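The idea can be sketched as follows. This is an illustration only: the tiny
path syntax ('child', '@attr', 'child/@attr') and the helper names are made
up for the example and are far simpler than real XPath.

```python
import xml.etree.ElementTree as ET

SCALE = {"KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}  # bytes per unit


def _lookup(node, path):
    """Resolve 'child', '@attr' or 'child/@attr' relative to node."""
    for step in path.split("/"):
        if node is None:
            return None
        if step.startswith("@"):
            return node.get(step[1:])
        node = node.find(step)
    return node.text if node is not None else None


def parse_memory(node, xpath, units_xpath=None):
    """Return the size in KiB; units_xpath overrides where the unit lives."""
    if units_xpath is None:
        units_xpath = xpath + "/@unit"  # default: unit attribute of the value element
    value = int(_lookup(node, xpath).strip())
    unit = _lookup(node, units_xpath) or "KiB"
    return value * SCALE[unit] // 1024
```

Both formats then go through the same function: parse_memory(domain, "memory")
for the first, and parse_memory(element, "@size", "@unit") for the second.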
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If detect_scsi_host_caps reports errors but keeps libvirtd going on
startup, the user is misled by the error messages. Transforming them
into warnings still shows the problems, but indicates they are not fatal.
Since commit c63ef0452b, validation in virNumaSetupMemoryPolicy passes
when nodeset is NULL, but virBitmapNextSetBit requires a non-NULL
bitmap; otherwise it may cause a segmentation fault.
This patch fixes it.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
The shared netcf driver is stateful and inside the daemon so
there is no need to use the networkPrivateData field to get the
driver handle. Just access the global driver handle directly.
The shared network driver is stateful and inside the daemon so
there is no need to use the networkPrivateData field to get the
driver handle. Just access the global driver handle directly.
Many places already directly accessed the global driver handle
in any case, so the code could never work without relying on
this.
The shared storage driver is stateful and inside the daemon so
there is no need to use the storagePrivateData field to get the
driver handle. Just access the global driver handle directly.
Since the secondary drivers are only active when the primary
driver is also the Test driver, there is no need to use the
different type specific privateData fields.
Since the secondary drivers are only active when the primary
driver is also the Parallels driver, there is no need to use the
different type specific privateData fields. The object that was
being stored in the storagePrivateData can easily be kept in the
parallelsConn struct instead.
For inexplicable reasons the phyp driver defined two separate
structs for holding its private data. One it keeps in privateData
and the other it keeps in networkPrivateData. It uses them both
from all API driver methods. Merge the two separate structs
into one to remove this horrible abuse.
Since the secondary drivers are only active when the primary
driver is also the Hyper-V driver, there is no need to use the
different type specific privateData fields.
Since the secondary drivers are only active when the primary
driver is also the ESX driver, there is no need to use the
different type specific privateData fields.
Since the secondary drivers are only active when the primary
driver is also the remote driver, there is no need to use the
different type specific privateData fields.
The remote driver has had a long term hack to deal with the fact
that the old Xen driver worked outside libvirtd, but the rest
of the drivers worked inside. So you could have a local hypervisor
driver but everything else go via the remote driver. The Xen driver
long ago moved inside libvirtd, so this hack is no longer needed.
Thus we should only use the remote driver for secondary drivers
if the primary driver is already the remote driver.
IBM Power processors differ uniquely across generations (such as power6,
power7, power8). Each generation signifies a new PowerISA version
that exhibits features unique to that generation.
The higher 16 bits of PVR for IBM Power processors encode the CPU
generation, while the CPU chip (sub)version is encoded in lower 16 bits.
For all practical purposes of launching a VM, we care about the
generation which the vCPU will belong to, and not specifically the chip
version. This patch updates the libvirt PVR check to reflect this
relationship. It allows libvirt to select the right CPU generation
in case the exact match for a specific CPU is not found.
Hence, there will no longer be a need to add each PowerPC CPU model to
cpu_map.xml; just adding an entry for the matching ISA generation will
suffice.
It also contains changes to cpu_map.xml since processor generations
as understood by QEMU compat mode go as "power6", "power7" or "power8"
[Reference : QEMU commit 8dfa3a5e85 ]
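A rough sketch of the fallback lookup described above. The tables passed in
are illustrative inputs for the example, not the contents of cpu_map.xml, and
model_for_pvr is a hypothetical helper.

```python
PVR_GENERATION_MASK = 0xFFFF0000  # upper 16 bits: CPU generation


def model_for_pvr(pvr, exact_models, generations):
    """Prefer an exact PVR match, else fall back to the generation entry."""
    if pvr in exact_models:
        return exact_models[pvr]
    return generations.get(pvr & PVR_GENERATION_MASK)
```

A chip whose exact PVR is unknown still resolves as long as its upper
16 bits match a generation entry.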
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
PowerISA allows processors to run VMs in binary compatibility ("compat")
mode supporting an older version of ISA. QEMU has recently added support to
explicitly denote a VM running in compatibility mode through commits 6d9412ea
& 8dfa3a5e85. Now, a "compat" mode VM can be run by invoking this qemu
commandline on a POWER8 host: -cpu host,compat=power7.
This patch allows libvirt to exploit cpu mode 'host-model' to describe this
new mode for PowerKVM guests. For example, when a user wants to request a
power7 vm to run in compatibility mode on a Power8 host, this can be
described in XML as follows :
<cpu mode='host-model'>
<model>power7</model>
</cpu>
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
This adds support for PowerPC Little Endian architecture,
and allows libvirt to spawn VMs based on 'ppc64le' architecture.
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Since libvirt.h was split into several files, it is impossible to
compile anything against a VPATH-built libvirt. In VPATH, only libvirt.h
is in build/include/libvirt while all other libvirt-*.h files are in
source/include/libvirt.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
One of the latest patches (9a8fc3efc2) introduced a call to
geteuid(). However, not all systems implement the function,
e.g. mingw. Therefore, we fail to build on those
systems. The fix consists of including virutil.h, which defines
geteuid if needed. Sigh.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1160084
As of b6d4dad1 (1.2.5) libvirt keeps track if domain disks have been
frozen. However, this falls into the set of information which doesn't
survive domain restart. Therefore, we need to clear the flag upon some
state transitions. Moreover, once we clear the flag we must update the
status file too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Extending the iothread disk support from pci to pci and ccw.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Finding the right type of disk should check for virtio as bus and
pci as device address type.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
When compiled without full numa support, the stub function for
virNumaNodeIsAvailable() just checks whether specified node is in range
<0, max); where max is maximum NUMA node available on the host. But
because the maximum node number is the highest usable number (and not the
count of nodes), the check is incorrect as it should check whether the
specified node is in range <0, max> instead.
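The off-by-one is easiest to see on a host whose only node is node 0, where
max is 0. A tiny sketch with hypothetical function names:

```python
def node_is_available_old(node, max_node):
    """Half-open range <0, max): wrongly rejects the highest node."""
    return 0 <= node < max_node


def node_is_available(node, max_node):
    """Closed range <0, max>: max_node is itself a usable node number."""
    return 0 <= node <= max_node
```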
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This is a reaction to Michal's fix [1] for non-NUMA systems that also
splits out conf/ out of util/ because libvirt_util shouldn't require
libvirt_conf if it is the other way around. This particular use case
worked, but we're trying to avoid it as mentioned [2], many times.
The only functions from virnuma.c that needed numatune_conf were
virDomainNumatuneNodesetIsAvailable() and virNumaSetupMemoryPolicy().
The first one should be in numatune_conf as it works with
virDomainNumatune, the second one just needs nodeset and mode, both of
which can be passed without the need of numatune_conf.
Apart from fixing that, this patch also fixes recently added
code (between commits d2460f85^..5c8515620) that doesn't support
non-contiguous nodesets. It uses new function
virNumaNodesetIsAvailable(), which doesn't need a stub as it doesn't use
any libnuma functions, to check if every specified nodeset is available.
[1] https://www.redhat.com/archives/libvir-list/2014-November/msg00118.html
[2] http://www.redhat.com/archives/libvir-list/2011-June/msg01040.html
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Patch 43b67f2e disallowed network tuning only with qemu driver, however
this patch moved the check for root privileges into
virNetDevBandwidthSet function, so the call should now
fail in all possible cases. A mock function was created so that the test
suite doesn't fail because of insufficient privileges.
Since there was a valid note to patch 43b67f2e about the best spot to
check for bandwidth set call while having libvirt daemon run in session
mode, this patch reverts previous changes dealing with bandwidth
(also reverts adding variable @cfg in qemuDomainGetNumaParameters which
does not have any use at the moment, but getting and unreferencing
driver's config) in qemu_driver.c and qemu_command.c. There will be
another patch in the series which introduces the fix itself.
==404== 232 bytes in 1 blocks are definitely lost in loss record 669 of 758
==404== at 0x4C2B934: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==404== by 0x52A2BF3: virAlloc (viralloc.c:144)
==404== by 0x1D49AD70: qemuMigrationCookieAddStatistics (qemu_migration.c:554)
==404== by 0x1D49AD70: qemuMigrationBakeCookie (qemu_migration.c:1228)
==404== by 0x1D4A43B8: qemuMigrationFinish (qemu_migration.c:5002)
==404== by 0x1D4C9339: qemuDomainMigrateFinish3Params (qemu_driver.c:11526)
Introduced by commit 5d6fb96
As of 90286418 the function is introduced. However, it's missing
an entry in the libvirt_private.syms so it can't be mocked.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit 28f8dfd (v1.0.0) introduced a security hole: in at least
the qemu implementation of virDomainGetXMLDesc, the use of the
flag VIR_DOMAIN_XML_MIGRATABLE (which is usable from a read-only
connection) triggers the implicit use of VIR_DOMAIN_XML_SECURE
prior to calling qemuDomainFormatXML. However, the use of
VIR_DOMAIN_XML_SECURE is supposed to be restricted to read-write
clients only. This patch treats the migratable flag as requiring
the same permissions, rather than analyzing what might break if
migratable xml no longer includes secret information.
Fortunately, the information leak is low-risk: all that is gated
by the VIR_DOMAIN_XML_SECURE flag is the VNC connection password;
but VNC passwords are already weak (FIPS forbids their use, and
on a non-FIPS machine, anyone stupid enough to trust a max-8-byte
password sent in plaintext over the network deserves what they
get). SPICE offers better security than VNC, and all other
secrets are properly protected by use of virSecret associations
rather than direct output in domain XML.
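The tightened rule can be sketched like this. The flag values follow the
public virDomainXMLFlags enum as far as I know; check_get_xml_desc is a
hypothetical helper, not the actual entry point.

```python
VIR_DOMAIN_XML_SECURE = 1 << 0
VIR_DOMAIN_XML_MIGRATABLE = 1 << 3


def check_get_xml_desc(flags, read_only):
    """Reject secure or migratable XML requests on read-only connections."""
    restricted = VIR_DOMAIN_XML_SECURE | VIR_DOMAIN_XML_MIGRATABLE
    if read_only and (flags & restricted):
        raise PermissionError("read-only connection may not request this XML")
```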
* src/remote/remote_protocol.x (REMOTE_PROC_DOMAIN_GET_XML_DESC):
Tighten rules on use of migratable flag.
* src/libvirt-domain.c (virDomainGetXMLDesc): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1159219
Users might want to update startupPolicy via the
virDomainUpdateDeviceFlags API too. This patch
implements the feature on config layer.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The remote call actually doesn't free the arguments array so we leak
memory in case a domain list is specified. As the remote domain list
array consists only of stolen pointers from the actual domain objects
it's sufficient just to free the array.
Valgrind message:
==1081452== 64 bytes in 1 blocks are definitely lost in loss record 632 of 726
==1081452== at 0x4C296D0: calloc (vg_replace_malloc.c:618)
==1081452== by 0x4EA5CB4: virAllocN (viralloc.c:191)
==1081452== by 0x505D21E: remoteConnectGetAllDomainStats (remote_driver.c:7785)
==1081452== by 0x50081AA: virDomainListGetStats (libvirt-domain.c:11080)
==1081452== by 0x155249: cmdDomstats (virsh-domain-monitor.c:2147)
==1081452== by 0x12FB73: vshCommandRun (virsh.c:1935)
==1081452== by 0x133FEB: main (virsh.c:3719)
Domain memory elements such as max_balloon and cur_balloon are
implemented as 'unsigned long long', whereas the 'memory' element
in NUMA cells is implemented as 'unsigned int'.
Use the same data type (unsigned long long) for 'memory' element
in NUMA cells.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
A domain without a console quietly dies soon after start,
because we try to set /dev/null as a controlling TTY
2014-10-30 15:10:59.705+0000: 1: error : lxcContainerSetupFDs:283 :
ioctl(TIOCSCTTY) failed: Inappropriate ioctl for device
Report an error early instead of trying to start it.
https://bugzilla.redhat.com/show_bug.cgi?id=1155410
It fails after 30 seconds with this error:
error : virDBusCall:1429 : error from service: CanSuspend:
Did not receive a reply. Possible causes include: the remote
application did not send a reply, the message bus security
policy blocked the reply, the reply timeout expired, or the
network connection was broken.
Only probe for the power mgmt capabilities when driver is non-NULL.
This speeds up domain startup by 30 seconds.
https://bugzilla.redhat.com/show_bug.cgi?id=1159227
Coverity found out the very obvious problem in the code. That is that
virPidFileReleasePath() was called only if
virPidFileAcquirePath() returned 0. But virPidFileAcquirePath() doesn't
return only 0 on success, but the FD that needs to be closed.
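A sketch of the pattern, using hypothetical acquire/release helpers that only
mimic the "returns an FD on success" contract described above:

```python
import os
import tempfile


def acquire(path):
    """Return an open fd (>= 0) on success, -1 on failure."""
    try:
        return os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    except OSError:
        return -1


def release(path, fd):
    """Close the fd and remove the pidfile."""
    os.close(fd)
    os.unlink(path)


def with_pidfile(path):
    fd = acquire(path)
    # Testing 'fd == 0' here would skip the release for nearly every
    # successful call, since success yields whatever fd was opened.
    if fd < 0:
        return False
    release(path, fd)
    return True
```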
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In qemuMigrationFinish, mig->nbd cannot be initialized by
qemuMigrationEatCookie without the QEMU_MIGRATION_COOKIE_NBD flag.
That causes qemuMigrationStopNBDServer to return early without
stopping the NBD server properly.
Signed-off-by: Weiwei Li <nuonuoli@tencent.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
There was no check for 'nodeset' attribute in numatune-related
elements. This patch adds validation that any nodeset specified does
not exceed maximum host node.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
When one domain is being undefined and at the same time started, for
example, there is a possibility of a rare problem occurring.
- Thread 1 does virDomainUndefine(), has the lock, checks that the
domain is active and because it's not, calls
virDomainObjListRemove().
- Thread 2 does virDomainCreate() and tries to lock the domain.
- Thread 1 needs to lock domain list in order to remove the domain from
it, but must unlock domain first (proper order is to lock domain list
first and the domain itself second).
- Thread 2 grabs the lock, starts the domain and releases the lock.
- Thread 1 grabs the lock and removes the domain from list.
With this patch:
- qemuDomainRemoveInactive() creates a QEMU_JOB_MODIFY if that's
possible, but since it must remove the domain from list either way,
it continues even when starting the job failed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1150505
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When daemon is killed right in the middle of probing a qemu binary for
its capabilities, the qemu process is left running. Next time the
daemon is starting, it cannot start the probing qemu process because the
one that's already running does have the pidfile flock()'d.
Reported-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This function is used to clean up a pidfile doing whatever it takes, even
killing the owning process.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We were missing a check that the storage driver was found, and
when no vbox storage driver is available, the daemon raised the
following error on each start:
error : virRegisterStorageDriver:592 : driver in
virRegisterStorageDriver must not be NULL
Fixing this makes the condition unified with networkDriver registration
in vbox as well.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Particularly in qemuBuildNumaArgStr(), there was a need for the advice
due to memory backing, which needs to know the nodeset it will be pinned
to. With newer qemu this caused the following error when starting
domain:
error: internal error: Advice from numad is needed in case of
automatic numa placement
even when starting perfectly valid domain, e.g.:
...
<vcpu placement='auto'>4</vcpu>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
<cpu>
<numa>
<cell id='0' cpus='0' memory='524288'/>
<cell id='1' cpus='1' memory='524288'/>
</numa>
</cpu>
...
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138545
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Hotplugging and hotunplugging char devices is only supported through
'-device', and the check for the device capability should be independent.
Coverity also complains that 'tmpChr->info.alias' could be NULL when we
dereference it; only in this case it somehow doesn't recognize
that the value is set by 'qemuAssignDeviceChrAlias', so it's clearly a
false positive. Add sa_assert to make Coverity happy.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Coverity is complaining about overwriting the value in the 'rc' variable
without using the old value because it somehow doesn't recognize that
the value is used by a macro. The 'rc' variable is there only for checking
the return code, so it's safe to remove it and make Coverity happy.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Since commit 3f99d64 no new scsi_host pools can be defined
if one of the already defined scsi_host pools does not refer
to an accessible scsi_host adapter.
Relax the check by skipping over these inaccessible pools
when checking for duplicates.
If both source adapters are specified by a parent address,
just comparing the address is faster and catches even addresses
that do not refer to valid adapters.
This macro seems to be defined only on linux/unix and it fails during
mingw build. Its value is '16' (taken from net/if.h) so define it if
it's not defined.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Commit 6c9a8a4 (Oct 2014) exposed a long-standing issue on 32-bit
machines: code related to virDomainSetMemoryParameters has always
been documented as using a 64-bit limit, but it was implemented by
calling virDomainParseMemory which enforced an 'unsigned long'
limit. Since VIR_DOMAIN_MEMORY_PARAM_UNLIMITED capped to a
long is -1, but virDomainParseScaledValue no longer accepts
negative values, an attempt to use 2^53-1 as a hard memory limit
started failing the testsuite. However, the problem with capping
things artificially low has existed for much longer - ever since
commits 4888f0fb and 2e22f23 (Mar 2012) switched internal tracking
from 'unsigned long' to 'unsigned long long' (prior to that time,
the cap was a side-effect of the choice of types). We _have_ to
cap the balloon memory values, (no thanks to baked in 'unsigned long'
of API such as virDomainSetMaxMemory or virDomainGetInfo with no
counterpart API that guarantees 64-bit access to those numbers)
but memory parameters have never needed the artificial limit.
At any rate, the solution is to make the parser function gain a
parameter, and only do the reduced 32-bit cap for the values that
are constrained due to API.
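To illustrate the numbers involved: truncating 2^53-1 to a 32-bit long gives
-1, which is why the capped path broke. The parse_memory sketch below is a
hypothetical simplification of "the parser gains a parameter"; modelling the
cap as a rejection is an assumption of this sketch.

```python
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED = 2 ** 53 - 1


def as_int32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    low = value & 0xFFFFFFFF
    return low - 2 ** 32 if low >= 2 ** 31 else low


def parse_memory(kib, capped, ulong_bits=32):
    """Accept kib up to ULONG_MAX when capped, else up to 2**63 - 1."""
    limit = 2 ** ulong_bits - 1 if capped else 2 ** 63 - 1
    if kib < 0:
        raise ValueError("negative values are not accepted")
    if kib > limit:
        raise OverflowError("value too large")
    return kib
```

Only the balloon-style values would pass capped=True; memory parameters use
capped=False and keep the full 64-bit range.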
* src/conf/domain_conf.h (_virDomainMemtune): Add comments.
* src/conf/domain_conf.c (virDomainParseMemory): Add parameter.
(virDomainDefParseXML): Adjust callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 3e1e16aa8d (Use a port from the
migration range for NBD as well) changed NBD port allocation from
remotePorts to migrationPorts, but did not change the port releasing
process, which causes an error when migrating many times (more than 64):
error: internal error: Unable to find an unused port in range
'migration' (49152-49215)
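The leak can be reproduced with a toy allocator. PortAllocator is a
hypothetical stand-in, and the 'remote' range below is illustrative only;
the 'migration' range is the one from the error message.

```python
class PortAllocator:
    """Hands out the lowest free port in [start, end]."""

    def __init__(self, name, start, end):
        self.name, self.start, self.end = name, start, end
        self.used = set()

    def acquire(self):
        for port in range(self.start, self.end + 1):
            if port not in self.used:
                self.used.add(port)
                return port
        raise RuntimeError(
            "Unable to find an unused port in range '%s' (%d-%d)"
            % (self.name, self.start, self.end))

    def release(self, port):
        # Releasing a port this allocator never handed out is a no-op.
        self.used.discard(port)
```

Acquiring from the migration allocator but releasing into the remote one
frees nothing, so the 64 ports of the range run out.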
https://bugzilla.redhat.com/show_bug.cgi?id=1159245
Signed-off-by: Weiwei Li <nuonuoli@tencent.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
On error, libxlMakeDomBuildInfo() frees the caller-provided
libxl_domain_build_info struct embedded in libxl_domain_config,
causing a segfault:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f9c13020700 (LWP 40988)]
(gdb) bt
0 0x00007f9c162f95b4 in free () from /lib64/libc.so.6
1 0x00007f9c0d0965ad in libxl_bitmap_dispose () from
/usr/lib64/libxenlight.so.4.4
2 0x00007f9c0d0a73bf in libxl_domain_build_info_dispose ()
from /usr/lib64/libxenlight.so.4.4
3 0x00007f9c0d0a7974 in libxl_domain_config_dispose () from
/usr/lib64/libxenlight.so.4.4
4 0x00007f9c0d2e00c5 in libxlDomainStart (driver=0x7f9c0400e4e0,
vm=0x7f9c0412b0d0, start_paused=false, restore_fd=-1) at
libxl/libxl_domain.c:1323
5 0x00007f9c0d2e1d4b in libxlDomainCreateXML (conn=0x7f9c000009a0,...)
at libxl/libxl_driver.c:660
Remove the call to libxl_domain_build_info_dispose() from
libxlMakeDomBuildInfo(). On error, callers will dispose the
libxl_domain_config object, which in turn disposes the build info.
With the introduction of the libxlDomainGetEmulatorType function,
it is trivial to support a user-specified <emulator> in the libxl
driver. This patch is based loosely on David Scott's old patch
to do the same:
https://www.redhat.com/archives/libvir-list/2013-April/msg02119.html
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
It does not make sense for any of the callers to get a negative value as
output and, fortunately, if anyone tried defining a domain with negative
memory or any other value parsed by virDomainParseScaledValue(), the
resulting value was 0. That means we can error out during parsing as
it won't break anything.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1155843
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The virGetSCSIHostNumber function return type is int, however
its stubbed version returns NULL. That results in a build fail
on systems that use the stubbed version. Fix by using a proper
return type.
Currently, build fails on FreeBSD because its struct ifreq does not
have ifr_hwaddr member. In order to fix that, check if this member
is present, otherwise fall back to the stub version of the
virNetDev{Add,Del}Multi functions.
The complaint is that if cleanup is called when virFileReadAll fails,
then mcast->entries is NULL and could be dereferenced in the clear
function. After following the code some - I saw that the caller to
the function (virNetDevGetMulticastTable) will also call
virNetDevMcastListClear if this function returns -1, so this
isn't necessary, so I removed the call.
Coverity complains that because the for loop runs from 0 to 5 (max tokens)
and the switch/case statements that follow use each of the #define values,
the 'default' case isn't reachable. This patch converts the #defines
into an enum and adds the obligatory dead_error_begin marker for this type
of situation.
Signed-off-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1140981 reports that
the qemu-kvm shipped as part of RHEL 7.0 intentionally[1] cripples
block jobs by removing the 'block-stream' QMP command, while still
leaving 'block-job-cancel' as an unusable no-op. Meanwhile, we
already had existing code that checked whether block jobs were
completely missing (such as qemu 0.15), old style (cancel is
synchronous, and all commands spelled with '_'), or new style
(cancel is asynchronous, and all commands spelled with '-'), and
used that three-way probe to give decent error messages. At the
time that code was added, all existing qemu versions fell in one
of three buckets, and the code was using the presence of
'block-job-cancel' as the witness of which of the three buckets.
But now that RHEL qemu has shipped with intentionally crippled
'block-stream', we have a fourth bucket, which results in ugly
error messages when trying 'virsh blockpull':
error: Requested operation is not valid: Command 'block-stream' is not found
In reality, the fourth bucket should be treated the same as the
first bucket (no block job support); we can do that by realizing
that no existing build of qemu has working block-stream while
lacking block-job-cancel, so it is easiest to change our witness
to the command that starts a job rather than ends one. We still
act correctly regarding command spelling and whether cancel is
asynchronous. And on crippled RHEL builds, we now get the desired:
error: unsupported configuration: block jobs not supported with this qemu binary
[1] The intentional cripple is limited to qemu-kvm of RHEL; when using
qemu-kvm-rhev of RHEV, block job functionality is supported. Don't ask
me to explain the "why" behind it all - I'm just dealing with fallout
from someone else's decision.
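The change of witness can be sketched as two small probes over the set of
QMP command names. The function names and return labels are made up for the
illustration; only the command spellings come from the text above.

```python
def flavor_by_cancel(commands):
    """Old probe: uses the cancel command as the witness."""
    if "block-job-cancel" in commands:
        return "async"
    if "block_job_cancel" in commands:
        return "sync"
    return "none"


def flavor_by_stream(commands):
    """New probe: uses the command that starts a job as the witness."""
    if "block-stream" in commands:
        return "async"
    if "block_stream" in commands:
        return "sync"
    return "none"
```

A build that keeps 'block-job-cancel' but drops 'block-stream' is classified
as "async" by the old probe and as "none" by the new one.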
* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCKJOB_SYNC): Tweak comment.
* src/qemu/qemu_capabilities.c (virQEMUCapsCommands): Look for stream
rather than cancel when determining the flavor of block jobs supported.
Signed-off-by: Eric Blake <eblake@redhat.com>
The code that parses the schema from the URI touches the "hosts[0]"
member of the storage file source structure in case the URI contains a
schema. The hosts array was not yet allocated at the point in the code
where the transport protocol was parsed and set. This led to a crash of
libvirtd.
Fix the code by allocating the "hosts" array upfront and add a test case
to verify this scenario. (Unfortunately this requires shuffling the test
case numbers too).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1156288
Now that all offenders have been cleaned, turn on a syntax-check
rule to prevent future offenders.
* cfg.mk (sc_prohibit_static_zero_init): New rule.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Avoid false
positive.
Signed-off-by: Eric Blake <eblake@redhat.com>
We weren't ever using the value for anything other than being non-zero.
* src/util/viraudit.h (virAuditLog): Change signature.
* src/util/viraudit.c (virAuditLog): Update user.
* daemon/libvirtd.c (main): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
When I pushed 5892944f I didn't notice one small nit.
A backslash was missing on line 1231 in the
enumeration of libraries to be added to the vbox storage driver. This
resulted in a nondeterministic build which sometimes succeeded and
sometimes failed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1146837
Resolve a crash in libvirtd resulting from commit id 'a4bd62ad' (1.0.6)
which added parentaddr and unique_id to allow unique identification of
a scsi_host, but assumed that all the pool entries and the incoming
definition would be similarly defined. If the existing pool uses the
'name' attribute and an incoming pool is using the parentaddr/unique_id,
then the code will attempt to compare the existing name string against
the incoming name string which doesn't exist (is NULL) and results in
a core (STREQ).
Conversely, if the existing pool used the parentaddr/unique_id and the
to be defined pool used the name, then the comparison would be against
the parentaddr, but since the incoming pool doesn't have one - that would
leave the comparison against a parentaddr of all 0's and a unique_id of 0,
which will always cause the comparison to fail. This means someone could
define the same source adapter for two pools.
In order to resolve this, adjust the code to get the 'host#' to be used
by the storage scsi backend in order to check/start the pool and make sure
the incoming definition doesn't match any of the existing pool defs.
https://bugzilla.redhat.com/show_bug.cgi?id=1141621
As part of attach processing, assign the device aliases by calling
qemuAssignDeviceAliases during qemuDomainQemuAttach once all the devices
are found after the qemuParseCommandLinePid processing.
This will alleviate a symptom that caused a libvirtd crash during an
attempted device detach.
In qemuDomainDetachControllerDevice, if the info.alias already exists,
a call to qemuAssignDeviceControllerAlias would overwrite the existing
alias, so avoid this possibility.
Currently the remote driver only initializes some of the fields of
remote_connect_get_all_domain_stats_args, but xdr_array()
will check the uninitialised field 'doms_val'.
For safety, it is better to memset all fields of args.
This fixes the following error from valgrind:
==30515== 1 errors in context 1 of 3:
==30515== Conditional jump or move depends on uninitialised value(s)
==30515== at 0x85E9402: xdr_array (xdr_array.c:88)
==30515== by 0x4FD8FC9: xdr_remote_connect_get_all_domain_stats_args (remote_protocol.c:6473)
==30515== by 0x4FE72F2: virNetMessageEncodePayload (virnetmessage.c:350)
==30515== by 0x4FDD21C: virNetClientProgramCall (virnetclientprogram.c:326)
==30515== by 0x4FB4D01: callFull.isra.2 (remote_driver.c:6667)
==30515== by 0x4FCBD45: call (remote_driver.c:6689)
==30515== by 0x4FCBD45: remoteConnectGetAllDomainStats (remote_driver.c:7793)
==30515== by 0x4FA0E75: virConnectGetAllDomainStats (libvirt.c:21678)
==30515== by 0x147FD1: cmdDomstats (virsh-domain-monitor.c:2148)
==30515== by 0x13006B: vshCommandRun (virsh.c:1915)
==30515== by 0x12A9E1: main (virsh.c:3699)
Signed-off-by: Jincheng Miao <jmiao@redhat.com>
After rewriting the whole driver, only version-specific code
remains in vbox_tmpl.c. So this patch removes the unused macros
and header files from vbox_tmpl.c.
GetMedium will then always return an IHardDisk object.
In 2.2 and 3.0, that is exactly what GetHardDisk does. In 3.1 and later,
IMedium is the same as IHardDisk.
CreateHardDiskMedium only supports creating a HardDisk medium
type, and it only works when the vbox version is >= 3.1. This patch
makes the function work with all vbox versions and renames it to
CreateHardDisk.
In vbox 2.2 and 3.0 this function will create an IHardDisk object.
In vbox later than 3.0, this function will create an IMedium object.
In the old version, the function FindMedium in UIVirtualBox doesn't
work for vbox 2.2 and 3.0. We assumed it would not be used with
these vbox versions.
But when rewriting vboxStorageVolLookupByPath, we found it was
compatible to use FindMedium to get an IHardDisk object, even in
old vbox versions. To achieve this, first make FindMedium call
FindHardDisk when VBOX_API_VERSION < 4000000.
Then change the argument type **IMedium to **IHardDisk. (By the
rules of the hierarchy, we can't convert an IHardDisk to match
IMedium in output.)
In vbox 2.2 and 3.0, the caller must be aware that they will get
an IHardDisk object in return.
We use typedef IMedium IHardDisk to make IHardDisk inherit from
IMedium (as it actually did in vbox 2.2 and 3.0's C++ API).
So when calling
VBOX_MEDIUM_FUNC_ARG*(IHardDisk, func, args)
we can directly replace it with
gVBoxAPI.UIMedium.func(IHardDisk, args)
When dealing with these two types, we get some rules from their
hierarchy relationship.
When using IHardDisk and IMedium as input, we can't pass an
IMedium as an IHardDisk. Like:
gVBoxAPI.UIHardDisk.func(IHardDisk *hardDisk, args)
Here, we can't pass a *IMedium as an argument.
When using IHardDisk and IMedium as output, we can't pass an
IHardDisk as an IMedium. Like:
gVBoxAPI.UIMachine.GetMedium(IMedium **out)
Here, we can't pass a **IHardDisk as an argument. If this case
does happen, we either change the API to GetHardDisk or write a
new one.
This patch rewrites the following functions
*vboxStorageOpen
*vboxStorageClose
*vboxConnectNumOfStoragePools
*vboxConnectListStoragePools
*vboxStoragePoolLookupByName
These functions do not call any vbox API, so I moved them directly
from vbox_tmpl.c to vbox_storage.c.
A small improvement is made to vboxConnectListStoragePools.
The if condition nnames == 1 is changed to nnames > 0. So if the
caller provides more than one slot to get active storage pools, the
new function will return exactly one, while the old one would only
return 0.
There are lots of macro declarations in vbox_common.c,
vbox_network.c, and the coming vbox_storage.c which simplify calling
the API. Since they are identical, we shouldn't keep three
copies of them, so they are moved to vbox_common.h.
Note: The macros are quite different from those in vbox_tmpl.c,
because they use a different API.
We should follow the rule that a CHECK macro only does checking.
But VBOX_OBJECT_CHECK and VBOX_OBJECT_HOST_CHECK also declared some
variables, which broke the rule. So the patch
removes these macros and expands them in the source code.
The storage driver is still not rewritten at this point, so I
keep the VBOX_OBJECT_CHECK macro in vbox_tmpl.c. It will
finally be removed in patch 'vbox: Remove unused things in vbox_tmpl.c'.
I made a mistake regarding copyright in patch 7f0f415b87.
If I copy code from one file to another, I should copy the
copyright notice as well. So this patch adds the
copyright notice which I should have added in the previous patch.
Not every error message from qemu-ga has to have the 'class' field
filled out. For instance, I've seen this error message lately:
qemuAgentCheckError:1047 : unable to execute QEMU agent command \
{"execute":"guest-set-time"}: \
{"error":{"desc":"Invalid parameter type, expected: integer"}}
However, this got translated into a rather generic error message:
internal error: unable to execute QEMU agent command
'guest-set-time': unknown QEMU command error
So we've dropped a better error message in favor of a generic one.
This is due to our code which expects 'class' which is not
present here.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch adds functionality to processNicRxFilterChangedEvent().
The old and new multicast lists are compared and the filters in
the macvtap are programmed to match the guest's filters.
Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
This patch provides the utility functions needed to synchronize the
changes made to a guest domain network device's multicast filter
with the corresponding macvtap device's filter on the host:
* Get/add/remove multicast MAC addresses
* Get the macvtap device's RX filter list
Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
Signed-off-by: Laine Stump <laine@laine.org>
https://bugzilla.redhat.com/show_bug.cgi?id=956506 documents that
given a domain where an internal snapshot parent has an external
snapshot child, we lacked a safety check when trying to use the
--children-only option to snapshot-delete:
$ virsh start dom
$ virsh snapshot-create-as dom internal
$ virsh snapshot-create-as dom external --disk-only
$ virsh snapshot-delete dom external
error: Failed to delete snapshot external
error: unsupported configuration: deletion of 1 external disk snapshots not supported yet
$ virsh snapshot-delete dom internal --children
error: Failed to delete snapshot internal
error: unsupported configuration: deletion of 1 external disk snapshots not supported yet
$ virsh snapshot-delete dom internal --children-only
Domain snapshot internal children deleted
While I'd still like to see patches that actually do proper external
snapshot deletion, we should at least fix the inconsistency in the
meantime. With this patch:
$ virsh snapshot-delete dom internal --children-only
error: Failed to delete snapshot internal
error: unsupported configuration: deletion of 1 external disk snapshots not supported yet
* src/qemu/qemu_driver.c (qemuDomainSnapshotDelete): Fix condition.
Signed-off-by: Eric Blake <eblake@redhat.com>
virNetDevLinkDump() gets a message from netlink into "resp", then
calls nlmsg_parse() to fill the table "tb" with pointers into resp. It
then returns tb to its caller, but not before freeing the buffer at
resp. That means that all the callers of virNetDevLinkDump() are
examining memory that has already been freed. This can be verified by
filling the buffer at resp with garbage prior to freeing it (or, I
suppose, just running libvirtd under valgrind) then performing some
operation that calls virNetDevLinkDump().
The code has been like this ever since virNetDevLinkDump() was written
- the original author didn't notice it, and neither did later
additional users of the function. It has only been pure luck (or maybe
a lack of heavy load, and/or maybe an allocation algorithm in malloc()
that delays re-use of just-freed memory) that has kept this from
causing errors, for example when configuring a PCI passthrough or
macvtap passthrough network interface.
The solution taken in this patch is the simplest - just return resp to
the caller along with tb, then have the caller free it after they are
finished using the data (pointers) in tb. I alternately could have
made a cleaner interface by creating a new struct that put tb and resp
together along with a vir*Free() function for it, but this function is
only used in a couple places, and I'm not sure there will be
additional new uses of virNetDevLinkDump(), so the value of adding a
new type, extra APIs, etc. is dubious.
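The ownership change described above can be illustrated with a minimal sketch. It is not the libvirt code: link_dump_sketch() is a hypothetical stand-in that uses a comma-separated string in place of a netlink reply, and tb[] holds pointers into the heap buffer just as tb points into resp.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_FIELDS 4

/* Hypothetical stand-in for virNetDevLinkDump(): copies a comma-separated
 * "reply" into a heap buffer, fills tb[] with pointers INTO that buffer,
 * and hands the buffer back through resp so the caller can free it once
 * it has finished reading tb[]. Returns the field count, or -1. */
static int
link_dump_sketch(const char *reply, char **resp, char *tb[MAX_FIELDS])
{
    size_t len = strlen(reply) + 1;
    char *buf = malloc(len);
    char *tok;
    int n = 0;

    if (!buf)
        return -1;
    memcpy(buf, reply, len);

    for (tok = strtok(buf, ","); tok && n < MAX_FIELDS; tok = strtok(NULL, ","))
        tb[n++] = tok;

    *resp = buf;   /* ownership moves to the caller; do NOT free here */
    return n;
}

/* Caller pattern: read tb[] first, free resp last. */
static int
second_field_is(const char *reply, const char *expected)
{
    char *resp = NULL;
    char *tb[MAX_FIELDS] = { NULL };
    int n = link_dump_sketch(reply, &resp, tb);
    int match = (n >= 2) && strcmp(tb[1], expected) == 0;

    free(resp);    /* safe: tb[] is not dereferenced after this point */
    return match;
}
```

Freeing the buffer inside link_dump_sketch() before returning would leave every tb[] entry dangling, which is the bug described above.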
libvirtd reports the error below unless we make sure the driver passed
to virRegisterNetworkDriver is not NULL:
$ libvirtd
2014-10-24 09:24:36.443+0000: 28876: info : libvirt version: 1.2.10
2014-10-24 09:24:36.443+0000: 28876: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
2014-10-24 09:24:36.443+0000: 28876: error : virDriverLoadModule:99 : Failed module registration vboxNetworkRegister
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The recently added driver-*.h files were not listed in the
Makefile.am causing them to be missed when creating dists.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virTypedParameterValidateSet method will need to be used
from several libvirt-*.c files so must be non-static
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The function hypervEnumAndPull consumes query on success, but leaked
it on failure. Rather than having to change all callers (many of
them indirect callers through the generated
hypervGetMsvmComputerSystemList), it was easier to just guarantee
that the buffer is cleaned on return from the function.
* src/hyperv/hyperv_wmi.c (hypervEnumAndPull): Don't leak query on
failure.
Signed-off-by: Eric Blake <eblake@redhat.com>
With the large number of APIs in libvirt's driver.h file,
it is easy to get lost looking for things. Split each driver
into a separate header file based on the functional driver
groups.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To prepare for introducing a single global driver, rename the
virDriver struct to virHypervisorDriver and the registration
API to virRegisterHypervisorDriver()
Tuning NUMA or network interface parameters requires root
privileges to manage cgroups. Thus an attempt to set some of these
parameters in session mode on a running domain should be rejected
with an error. Memory tuning, for example, already raises
an error in such a case.
The following behavior in session mode will be present after applying
this patch:
Tuning | SET | GET |
----------|---------------|--------|
NUMA | shut off only | always |
Memory | never | never |
Interface | never | always |
Resolves https://bugzilla.redhat.com/show_bug.cgi?id=1126762
The documentation for the restore hook states that returning an empty
XML is equivalent to copying the input. The code that checks the
returned string had a bug: it tested the string itself rather than its
contents. Use the new helper to check if the string is empty.
The helper checks whether a string contains only whitespace or is NULL.
This will be helpful to skip cases where a user string is optional, but
may be provided empty with the same meaning.
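A minimal sketch of such a check, assuming the semantics stated above (NULL or whitespace-only counts as empty). The name string_is_blank() is hypothetical and is not the helper added by this patch.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if str is NULL, empty, or contains only whitespace. */
static bool
string_is_blank(const char *str)
{
    if (!str)
        return true;

    for (; *str; str++) {
        /* cast avoids undefined behaviour for negative char values */
        if (!isspace((unsigned char)*str))
            return false;
    }
    return true;
}
```

A hook that prints only a trailing newline is then treated the same as one that prints nothing at all.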
Newer versions of Debian use '/run/initctl' instead of '/dev/initctl'.
This patch updates the code to search for the FIFO from a list of
well-known locations.
Build with clang fails with:
CC util/libvirt_util_la-virsocketaddr.lo
util/virsocketaddr.c:904:17: error: cast from 'struct sockaddr *' to
'struct sockaddr_in *' increases required alignment from 1 to 4
[-Werror,-Wcast-align]
inet4 = (struct sockaddr_in*) res->ai_addr;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
util/virsocketaddr.c:909:17: error: cast from 'struct sockaddr *' to
'struct sockaddr_in6 *' increases required alignment from 1 to 4
[-Werror,-Wcast-align]
inet6 = (struct sockaddr_in6*) res->ai_addr;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 errors generated.
Fix that by replacing virSocketAddrParseInternal() call with
virSocketAddrParse() in the virSocketAddrIsNumericLocalhost() function.
virSocketAddrParse stores an address in virSocketAddr.
virSocketAddr uses a union to store an address, so it doesn't
need casting.
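A minimal sketch of why the union avoids the cast, assuming a POSIX system. sock_addr_sketch and is_numeric_localhost() are hypothetical names that only mimic the idea; they are not the libvirt types, and the sketch treats only 127.0.0.1 and ::1 as localhost.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* All address flavours share one storage area, so the compiler aligns it
 * for the strictest member and no pointer cast is ever needed. */
typedef struct {
    union {
        struct sockaddr sa;
        struct sockaddr_in inet4;
        struct sockaddr_in6 inet6;
    } data;
} sock_addr_sketch;

/* Returns true only for the numeric addresses 127.0.0.1 and ::1. */
static bool
is_numeric_localhost(const char *text)
{
    sock_addr_sketch addr;

    memset(&addr, 0, sizeof(addr));

    if (inet_pton(AF_INET, text, &addr.data.inet4.sin_addr) == 1) {
        addr.data.inet4.sin_family = AF_INET;
        return ntohl(addr.data.inet4.sin_addr.s_addr) == INADDR_LOOPBACK;
    }
    if (inet_pton(AF_INET6, text, &addr.data.inet6.sin6_addr) == 1) {
        addr.data.inet6.sin6_family = AF_INET6;
        return IN6_IS_ADDR_LOOPBACK(&addr.data.inet6.sin6_addr) != 0;
    }
    return false;   /* not a numeric address at all */
}
```

The members are accessed by name through the union, so -Wcast-align has nothing to complain about.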
virt-manager on Fedora sets up i686 hosts with the "/usr/bin/qemu-kvm" emulator,
which in turn unconditionally execs qemu-system-x86_64. Querying capabilities
then fails:
Error launching details: invalid argument: architecture from emulator 'x86_64' doesn't match given architecture 'i686'
Traceback (most recent call last):
File "/usr/share/virt-manager/virtManager/engine.py", line 748, in _show_vm_helper
details = self._get_details_dialog(uri, vm.get_connkey())
File "/usr/share/virt-manager/virtManager/engine.py", line 726, in _get_details_dialog
obj = vmmDetails(conn.get_vm(connkey))
File "/usr/share/virt-manager/virtManager/details.py", line 399, in __init__
self.init_details()
File "/usr/share/virt-manager/virtManager/details.py", line 784, in init_details
domcaps = self.vm.get_domain_capabilities()
File "/usr/share/virt-manager/virtManager/domain.py", line 518, in get_domain_capabilities
self.get_xmlobj().os.machine, self.get_xmlobj().type)
File "/usr/lib/python2.7/site-packages/libvirt.py", line 3492, in getDomainCapabilities
if ret is None: raise libvirtError ('virConnectGetDomainCapabilities() failed', conn=self)
libvirtError: invalid argument: architecture from emulator 'x86_64' doesn't match given architecture 'i686'
Journal:
Oct 16 21:08:26 goatlord.localdomain libvirtd[1530]: invalid argument: architecture from emulator 'x86_64' doesn't match given architecture 'i686'
If a VM is configured with many devices (including passthrough devices)
and large memory, libvirtd can take seconds (in the worst case) to
wait for the monitor. During this period the qemu process may run on any
PCPU, even though I intend to pin the emulator to the specified PCPU in
the XML configuration.
The qemu process actually has high CPU usage during VM startup,
so CPU isolation is not strict in this case.
Signed-off-by: Zhou yimin <zhouyimin@huawei.com>
The mode attribute is required for the source element of vhost-user.
Thus virDomainNetDefFormat should always generate XML with it, not
only when the mode is server.
This commit fixes the issue and adds a vhostuser interface in
'client' mode to qemuxml2argv-net-vhostuser.(args|xml) to test this
use case.
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
To allow live modification of device backends in qemu libvirt needs to
be able to hot-add/remove "objects". Add monitor backend functions to
allow this.
This function will be used for hot-add/remove of RNG backends,
IOThreads, memory backing objects, etc.
The JSON structure constructor has an option to add JSON arrays to the
constructed object. The description is inaccurate as it can add any JSON
object, even a dict. Change the docs to cover this option and reject
adding NULL objects.
Our qemu monitor code has a converter from key-value pairs to a json
value object. I want to re-use the code later and having it part of the
monitor command generator is inflexible. Split it out into a separate
helper.
When enabling the migration_address option, it is set to
"127.0.0.1" by default, but that is not a valid address for migration.
So we should add verification and set the default migration_address
to "0.0.0.0".
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
If migration_host was set to an IPv6 address without brackets,
it was resolved to an incorrect address, such as:
tcp:2001:0DB8::1428:4444,
but the correct address should be:
tcp:[2001:0DB8::1428]:4444
So we should add brackets when parsing it.
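A minimal sketch of the bracketing rule, assuming that any host containing ':' and not already starting with '[' is a bare IPv6 literal. format_tcp_uri() is a hypothetical helper for illustration, not the function changed by this patch.

```c
#include <stdio.h>
#include <string.h>

/* Writes "tcp:HOST:PORT" into out, wrapping HOST in brackets when it
 * looks like a bare IPv6 literal. Returns 0 on success, or -1 if the
 * result did not fit into outlen bytes. */
static int
format_tcp_uri(char *out, size_t outlen, const char *host, int port)
{
    int n;

    if (strchr(host, ':') != NULL && host[0] != '[')
        n = snprintf(out, outlen, "tcp:[%s]:%d", host, port);
    else
        n = snprintf(out, outlen, "tcp:%s:%d", host, port);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Without the brackets there is no way to tell where the address ends and the port begins, which is how the wrong address came about.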
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
The actual origin of this so called typo are two commits. The first one
was commit 72f8a7f that came up with the following condition:
if ((i == 8) & (flags & VIR_QEMU_PROCESS_KILL_FORCE))
Fortunately this succeeded thanks to bool being (int)1 and
VIR_QEMU_PROCESS_KILL_FORCE having the value of 1 << 0. The check was
then moved and altered in 8fd3823117 to
current state:
if ((i == 50) & force)
that will work again (both sides of '&' being booleans), but since this
was missed so many times, it may pose a problem in the future in case it
gets copy-pasted again.
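A minimal sketch of how the bitwise form breaks once it is copy-pasted for a flag that is not 1 << 0. SKETCH_FLAG_A mirrors the value the text gives for VIR_QEMU_PROCESS_KILL_FORCE; SKETCH_FLAG_B and both function names are hypothetical.

```c
#include <stdbool.h>

enum {
    SKETCH_FLAG_A = 1 << 0,   /* same value as the flag discussed above */
    SKETCH_FLAG_B = 1 << 1,   /* hypothetical flag for illustration */
};

/* Bitwise '&': only correct when (flags & flag) is 0 or 1, because the
 * left-hand side (i == 8) is always 0 or 1. */
static bool
fragile_check(int i, unsigned int flags, unsigned int flag)
{
    return (i == 8) & (flags & flag);
}

/* Logical '&&': correct for any non-zero flag value. */
static bool
robust_check(int i, unsigned int flags, unsigned int flag)
{
    return (i == 8) && (flags & flag);
}
```

With SKETCH_FLAG_B set, fragile_check() computes 1 & 2, which is 0, so the condition silently never fires, while robust_check() behaves as intended.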
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This started as an investigation into an issue where libvirt (using the
libxl driver) and the Xen host, like an old couple, could not agree on
who is responsible for selecting the VNC port to use.
Things usually (and a bit surprisingly) did work because, just like that
old couple, they had the same idea on what to do by default. However it
was possible that this ended up in a big argument.
The problem is that display information exists in two different places:
in the vfbs list and in the build info. And for launching the device model,
only the latter is used. But that never gets initialized from libvirt. So
Xen allows the device model to select a default port while libvirt thinks
it has told Xen that this is done by libvirt (through the vfbs config).
While fixing that, I made a stab at actually evaluating the configuration
of the video device. So that it is now possible to at least decide between
a Cirrus or standard VGA emulation and to modify the VRAM within certain
limits using libvirt.
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
This patch introduces a function to detect whether the specified
emulator is QEMU_XEN or QEMU_XEN_TRADITIONAL. Detection is based on the
string "Options specific to the Xen version:" in '$qemu -help' output.
AFAIK, the only qemu containing that string in help output is the
old Xen fork (aka qemu-dm).
Note:
QEMU_XEN means a qemu that contains support for Xen.
QEMU_XEN_TRADITIONAL means Xen's old forked qemu 0.10.2
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Allow the Xen drivers to determine default vram values. Sane
default values depend on the device model being used, so the
drivers are in the best position to determine the defaults.
For the legacy xen driver, it is best to maintain the existing
logic for setting default vram values to ensure there are no
regressions. The libxl driver currently does not support
configuring a video device. Support will be added in a
subsequent patch, where the benefit of this change will be
reaped.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Commit 4dfc34c3 missed copying the user-specified keymap to
libxl_domain_build_info struct when creating a VFB device.
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
After setting a running domain's numa parameters, the change needs to be
saved into the live xml to survive a libvirtd restart,
same story as bug 1146511. Meanwhile, add calls to
qemuDomainObjBeginJob/qemuDomainObjEndJob in qemuDomainSetNumaParameters.
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
After setting the blkio parameters for a running domain, the change needs
to be saved into the live xml to survive a libvirtd restart, same story as
bug 1146511. Meanwhile, add calls to qemuDomainObjBeginJob/qemuDomainObjEndJob
in qemuDomainSetBlkioParameters.
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
The pkg-config files in src/ make it pretty easy to build language
bindings against an uninstalled libvirt, however, they don't work with
VPATH builds. The reason is that all *-api.xml files are generated in
source rather than build directory.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1147057
The code for relabelling the TAP FD is there due to a race. When
libvirt creates a /dev/tapN device it's labeled as
'system_u:object_r:device_t:s0' by default. Later, when
udev/systemd reacts to this device, it's relabelled to the
expected label 'system_u:object_r:tun_tap_device_t:s0'. Hence, we
have code that relabels the device, to cut the race down. For
more info see ae368ebfcc.
But the problem is, the relabel function is called on all TUN/TAP
devices. Yes, on /dev/net/tun too. This is however a special kind
of device - other processes use it too. We shouldn't touch its
label then.
Ideally, there would be an API in SELinux that would label just the
passed FD and not the underlying path. That way, we wouldn't need
to care as we would not be labeling /dev/net/tun but the FD
passed to the domain. Unfortunately, there's no such API so we
have to work around it until then.
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This implementation uses the https://esx-server/screen?id=<id> way to get
a screenshot of a running domain. Compared to the CreateScreenshot_Task
way this works since ESX 2.5 while CreateScreenshot_Task was added in
version 4.0.
The newly added libcurl stream driver is used to directly provide the
downloaded data without saving it to a temporary file first.
This allows to implement libvirt functions that use streams, such as
virDomainScreenshot, without the need to store the downloaded data in
a temporary file first. The stream driver directly interacts with
libcurl to send and receive data.
The driver uses the libcurl multi interface that allows to do a transfer
in multiple curl_multi_perform() calls. The easy interface would do the
whole transfer in a single curl_easy_perform() call. This doesn't work
with the libvirt stream API that is driven by multiple calls to the
virStreamSend() and virStreamRecv() functions.
The curl_multi_wait() function is used to do blocking operations. But it
was added in libcurl 7.28.0. For older versions it is emulated using the
socket callback of the multi interface.
The current driver only supports blocking operations. There is already
some code in place for non-blocking mode but it is not complete.
This patch fills in the functionality of
processNicRxFilterChangedEvent(). It now checks if it is appropriate
to respond to the NIC_RX_FILTER_CHANGED event (based on device type
and configuration) and takes appropriate action. Currently it checks
if the guest interface has been configured with
trustGuestRxFilters='yes', and if the host side device is macvtap. If
so, and the MAC address on the guest has changed, the MAC address of
the macvtap device is changed to match.
The result of this is that networking from the guest will continue to
work if the mac address of a macvtap-connected network device is
changed from within the guest, as long as trustGuestRxFilters='yes'
(previously changing the MAC address in the guest would break
networking).
NIC_RX_FILTER_CHANGED is sent by qemu any time a NIC driver in the
guest modified the NIC's RX Filter (for example, if the MAC address of
the NIC is changed by the guest).
This patch doesn't do anything useful with that event; it just sets up
all the plumbing to get news of the event into a worker thread with
all proper locking/reference counting, and provide an easy place to
add in desired functionality.
See src/qemu/EVENTHANDLERS.txt for information/instructions on adding
a libvirt-internal handler for a qemu event (using
NIC_RX_FILTER_CHANGED as an example).
This text was in the commit log for the patch that added the event
handler for NIC_RX_FILTER_CHANGED, and John Ferlan expressed a desire
that the information not be "lost", so I've put it into a file in the
qemu directory, hoping that it might catch the attention of future
writers of handlers for qemu events.
This function can be called at any time to get the current status of a
guest's network device rx-filter. In particular it is useful to call
after libvirt receives a NIC_RX_FILTER_CHANGED event - this event only
tells you that something has changed in the rx-filter, the details are
retrieved with the query-rx-filter monitor command (only available in
the json monitor). The command sent to the qemu monitor looks like this:
{"execute":"query-rx-filter", "arguments": {"name":"net2"} }
and the results will look something like this:
{
"return": [
{
"promiscuous": false,
"name": "net2",
"main-mac": "52:54:00:98:2d:e3",
"unicast": "normal",
"vlan": "normal",
"vlan-table": [
42,
0
],
"unicast-table": [
],
"multicast": "normal",
"multicast-overflow": false,
"unicast-overflow": false,
"multicast-table": [
"33:33:ff:98:2d:e3",
"01:80:c2:00:00:21",
"01:00:5e:00:00:fb",
"33:33:ff:98:2d:e2",
"01:00:5e:00:00:01",
"33:33:00:00:00:01"
],
"broadcast-allowed": false
}
],
"id": "libvirt-14"
}
This is all parsed from JSON into a virNetDevRxFilter object for
easier consumption. (unicast-table is usually empty, but is also an
array of mac addresses similar to multicast-table).
(NB: LIBNL_CFLAGS was added to tests/Makefile.am because virnetdev.h
now includes util/virnetlink.h, which includes netlink/msg.h when
appropriate. Without LIBNL_CFLAGS, gcc can't find that file (if
libnl/netlink isn't available, LIBNL_CFLAGS will be empty and
virnetlink.h won't try to include netlink/msg.h anyway).)
This same structure will be used to retrieve RX filter info for
interfaces on the host via netlink messages, and RX filter info for
interfaces on the guest via the qemu "query-rx-filter" command.
As is done with other items such as vlan, virtualport, and bandwidth,
set the actual trustGuestRxFilters value to be used by a domain
interface according to a merge of the same attribute in the interface,
portgroup, and network in use. The interface setting always takes
precedence (if specified), followed by portgroup, and finally the
setting in the network is used if it's not specified in the interface
or portgroup.
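The precedence described above can be sketched as a three-way merge. The enum and function names here are hypothetical and only illustrate the order interface, then portgroup, then network; they are not the libvirt implementation.

```c
/* TRUST_ABSENT means "attribute not specified at this level". */
typedef enum {
    TRUST_ABSENT = 0,
    TRUST_YES,
    TRUST_NO,
} trust_setting;

/* The first level that specifies a value wins:
 * interface, then portgroup, then network. */
static trust_setting
merge_trust(trust_setting iface, trust_setting portgroup, trust_setting network)
{
    if (iface != TRUST_ABSENT)
        return iface;
    if (portgroup != TRUST_ABSENT)
        return portgroup;
    return network;
}
```

If no level specifies the attribute the result stays TRUST_ABSENT, and the caller would apply whatever default it uses.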
This new attribute will control whether or not libvirt will pay
attention to guest notifications about changes to network device mac
addresses and receive filters. The default for this is 'no' (for
security reasons). If it is set to 'yes' *and* the specified device
model and connection support it (currently only macvtap+virtio) then
libvirt will watch for NIC_RX_FILTER_CHANGED events, and when it
receives one, it will issue a query-rx-filter command, retrieve the
result, and modify the host-side macvtap interface's mac address and
unicast/multicast filters accordingly.
The functionality behind this attribute will be in a later patch. This
patch merely adds the attribute to the top-level of a domain's
<interface> as well as to <network> and <portgroup>, and adds
documentation and schema/xml2xml tests. Rather than adding even more
test files, I've just added the net attribute in various applicable
places of existing test files.
A prior patch removed the need for the virConnectPtr in the unplug
detach host path, which caused a ripple effect of removals in multiple
callers. The previous patch just left things as ATTRIBUTE_UNUSED -
this patch removes the variable.
https://bugzilla.redhat.com/show_bug.cgi?id=1141732
Introduced by commit id '8f76ad99' the logic to detach a scsi_host
device (SCSI or iSCSI) fails when attempting to remove the 'drive'
because as I found in my investigation - the DelDevice takes care of
that for us.
The investigation turned up commits to adjust the logic for the
qemuMonitorDelDevice and qemuMonitorDriveDel processing for interfaces
(commit id '81f76598'), disk bus=VIRTIO,SCSI,USB (commit id '0635785b'),
and chr devices (commit id '55b21f9b'), but nothing with the host devices.
This commit uses the model for the previous set of changes and applies
it to the hostdev path. The call to qemuDomainDetachHostSCSIDevice will
return to qemuDomainDetachThisHostDevice handling either the audit of
the failure or the wait for the removal and then call into
qemuDomainRemoveHostDevice for the event, removal from the domain hostdev
list, and audit of the removal similar to other paths.
NOTE: For now the 'conn' param to qemuDomainDetachHostSCSIDevice is left
as ATTRIBUTE_UNUSED. Removing requires a cascade of other changes to be
left for a future patch.
Since commit 8eb55d782a2b9afacc7938694891cc6fad7b42a5 libxml2 removes
two slashes from the URI when there is no server part. This is fixed
with beb7281055dbf0ed4d041022a67c6c5cfd126f25, but only if the calling
application calls xmlSaveUri() on URI that xmlURIParse() parsed. And
that is not the case in virURIFormat(). virURIFormat() accepts
virURIPtr that can be created without parsing it and we do that when we
format network storage paths for gluster for example. Even though
virStorageSourceParseBackingURI() uses virURIParse(), it throws that data
structure right away.
Since we want to format URIs as URIs and not absolute URIs or opaque
URIs (see RFC 3986), we can specify that with a special hack thanks to
commit beb7281055dbf0ed4d041022a67c6c5cfd126f25, by setting port to -1.
This fixes qemuxml2argvtest test where the disk-drive-network-gluster
case was failing.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since 87dea4fcff vboxGetDrivers() is not
used for getting the vbox network driver. The only call the code does
is using NULL as the @networkDriver_ret param, but the code still used
vbox[0-9][0-9]NetworkDriver that didn't exist anymore.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This patch implements support for the ivshmem device in QEMU.
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Ivshmem is supported by QEMU since 0.13 release.
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This patch adds parsing/formatting code as well as documentation for
shared memory devices. This will currently be only accessible in QEMU
using its ivshmem device, but is designed as generic as possible to
allow future expansion for other hypervisors.
In the devices section in the domain XML users may specify:
- For shmem device using a server:
<shmem name='shmem0'>
<server path='/tmp/socket-ivshmem0'/>
<size unit='M'>32</size>
<msi vectors='32' ioeventfd='on'/>
</shmem>
- For ivshmem device not using an ivshmem server:
<shmem name='shmem1'>
<size unit='M'>32</size>
</shmem>
Most of the configuration is made optional so it also allows
specifications like:
<shmem name='shmem1'/>
<shmem name='shmem2'>
<server/>
</shmem>
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Aeons ago (commit 34dcbbb4, v0.8.2), we added a new libvirt event
(VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON) in order to tell the user WHY
the guest halted. This is because at least VDSM wants to react
differently to ENOSPC events (resize the lvm partition to be larger,
and resume the guest as if nothing had happened) from all other events
(I/O is hosed, throw up our hands and flag things as broken). At the
time this was done, downstream RHEL qemu added a vendor extension
'__com.redhat_reason', which would be exactly one of these strings:
"enospc", "eperm", "eio", and "eother". In our stupidity, we exposed
those exact strings to clients, rather than an enum, and we also
return "" if we did not have access to a reason (which was the case
for upstream qemu).
Fast forward to now: upstream qemu commit c7c2ff0c (will be qemu 2.2)
FINALLY adds a 'nospace' boolean, after discussion with multiple
projects determined that VDSM really doesn't care about distinction
between any other error types. So this patch converts 'nospace' into
the string "enospc" for compatibility with RHEL clients that were
already used to the downstream extension, while leaving the reason
blank for all other cases (no change from the status quo).
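A minimal sketch of the conversion described above, written in Python
rather than libvirt's C. The function name and the dict-shaped event
payload are assumptions for illustration; only the 'nospace' boolean and
the "enospc" string come from this message.

```python
def io_error_reason(event_data):
    """Map an I/O error event payload to the legacy reason string."""
    # Modern qemu only says whether the error was "out of space";
    # every other case keeps the blank reason, as before.
    if event_data.get("nospace") is True:
        return "enospc"
    return ""
```

Clients that already matched on "enospc" keep working, and everything
else still sees an empty reason.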
See also https://bugzilla.redhat.com/show_bug.cgi?id=1119784
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONHandleIOError):
Parse reason field from modern qemu.
* include/libvirt/libvirt.h.in
(virConnectDomainEventIOErrorReasonCallback): Document it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Right now when building the qemu command line, we try to do various
unconditional validations of the guest CPU against the host CPU. However
these checks are applied too broadly. The only time we should use the checks
are:
- The user requests host-model/host-passthrough, or
- When KVM is requested. CPU features requested in TCG mode are always
emulated by qemu and are independent of the host CPU, so no host CPU
checks should be performed.
Right now if trying to specify a CPU for arm on an x86 host, it attempts
to do nonsensical validation and falls over.
Switch all the test cases that were intending to test CPU validation to
use KVM, so they continue to test the intended code.
Amend some aarch64 XML tests with a CPU model, to ensure things work
correctly.
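The rule for when host CPU checks apply can be sketched as follows. This
is an illustrative Python model, not the actual qemu driver code; the
function name and the string values for the CPU mode and virt type are
assumptions.

```python
def needs_host_cpu_check(cpu_mode, virt_type):
    """Return True when the guest CPU must be validated against the host."""
    # host-model and host-passthrough are derived from the host CPU,
    # so they always need the host checks.
    if cpu_mode in ("host-model", "host-passthrough"):
        return True
    # With KVM the guest runs on the host CPU; with TCG qemu emulates
    # the requested features, so the host CPU is irrelevant.
    return virt_type == "kvm"
```

Under this rule an arm guest with a custom CPU model in TCG mode on an
x86 host skips the host comparison entirely.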
Check the domain's status before calling virQEMUCapsGet to report an
accurate error when the domain is shut off.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1147847
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
After 87dea4fcf one can observe a build failure:
./autogen.sh --system --without-driver-modules && make
CCLD libvirtd
../src/.libs/libvirt_driver_vbox.a(libvirt_driver_vbox_impl_la-vbox_driver.o):
In function `vboxNetworkRegister':
/home/jtomko/work/libvirt/libvirt.git/src/vbox/vbox_driver.c:168: undefined
reference to `vboxGetNetworkDriver'
collect2: error: ld returned 1 exit status
make[3]: *** [libvirtd] Error 1
The problem is that when building without driver modules the VBOX
network driver is not linked into the VBOX driver.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch separates the domain driver from the network driver.
libvirt_driver_vbox_impl.la is linked into the network driver so
that the version-specific code in vbox_V*.c is compiled only
once.
The vboxGetNetworkDriver provides a simple interface to get vbox
network driver.
This patch rewrites two public APIs. They are vboxNetworkUndefine
and vboxNetworkDestroy. They use the same core function
vboxNetworkUndefineDestroy. I merged it in one patch.
This patch actually contains two public API, virNetworkDefineXML
and virNetworkCreateXML. They use the same core function
vboxNetworkDefineCreateXML. So I merged it together.
The patch dbb4cbf532 by Michal has split the vbox driver into
three parts. This modification brings a more suitable interface
to the previous patch.
The new function vboxGetDriver is introduced to get the
corresponding vbox domain driver directly through the vbox version.
Functions like vboxGetNetworkDriver and vboxGetStorageDriver
will be introduced after rewriting their drivers.
This patch, by the way, fixes the alignment problem for vbox in
Makefile.am
Up until now, we set the memballoon period in the monitor successfully;
however, we did not update the domain definition structure, so dumpxml
was omitting the period attribute in the memballoon element.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140960
When trying to update bandwidth limits on a running domain, limits get
updated in our internal structures, however XML parser reads
bandwidth limits from network 'actual' definition. With this patch,
the bandwidth in the 'actual' definition is updated as well, so the
domain runtime XML reflects the new limits.
A cygwin build of 1.2.9 fails with:
util/virprocess.c:87:27: fatal error: sys/syscall.h: No such file or directory
# include <sys/syscall.h>
But in reality, the ONLY user of setns() is lxc, which is Linux-only.
It's easiest to just limit the setns workarounds to Linux.
* src/util/virprocess.c (setns): Limit definition to Linux.
Signed-off-by: Eric Blake <eblake@redhat.com>
If we don't properly clean up all processes in the
machine-<vmname>.scope systemd won't remove the cgroup and subsequent vm
starts fail with
'CreateMachine: File exists'
Additional processes can e.g. be added via
echo $PID > /sys/fs/cgroup/systemd/machine.slice/machine-${VMNAME}.scope/tasks
but there are other cases like
http://bugs.debian.org/761521
Invoke TerminateMachine to be on the safe side since systemd tracks the
cgroup anyway. This is a noop if all processes have terminated already.
Management software wants to be able to allocate disk space on demand.
To support this they need to keep track of the space occupation of the
block device. This information is reported by qemu as part of block
stats.
This patch extends the block information in the bulk stats with the
allocation information.
To keep the same behaviour a helper is extracted from
qemuMonitorJSONGetBlockExtent in order to get per-device allocation
information.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
While our code gathers block stats via "query-blockstats" some
information needs to be gathered via "query-block". Add a helper function
that will update the blockstats structure if requested.
If you use public api virConnectListAllDomains() with second parameter
set to NULL to get only the number of domains you will lock out all
other operations with domains.
Introduced by commit 2c680804.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This removes the artificial and unnecessary restriction that
virDomainSetMaxDowntime() only be called while a migration is in
progress.
https://bugzilla.redhat.com/show_bug.cgi?id=1146618
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The current block stats code matched up the disk name with the actual
stats by the order in the data returned from qemu. This unfortunately
isn't right as qemu may return the disks in any order. Fix this by
returning a hash of stats and index them by the disk alias.
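The idea of matching by alias instead of by position can be sketched in
Python as follows. The field names ("alias", "rd_bytes") and the helper
names are assumptions for illustration and do not mirror the real
monitor structures.

```python
def index_stats_by_alias(raw_stats):
    """Build a lookup table of per-disk stats keyed by the disk alias."""
    return {entry["alias"]: entry for entry in raw_stats}


def stats_for_disks(disk_aliases, raw_stats):
    """Pair each configured disk with its stats, whatever order qemu used."""
    by_alias = index_stats_by_alias(raw_stats)
    # A disk that qemu did not report maps to None instead of silently
    # picking up another disk's numbers.
    return {alias: by_alias.get(alias) for alias in disk_aliases}
```

Because the lookup is keyed, a reply that lists the disks in a different
order than the domain definition still yields the right numbers.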
Commit de0aeaf filtered them out from the host-model features,
to allow host-model to be migratable by default.
Even though they are not passed to QEMU for host-passthrough,
(and not enabled by default) filter them out too
so the user does not think the domain has them.
https://bugzilla.redhat.com/show_bug.cgi?id=1147584
Commit fba6bc4 introduced the non-migratable invtsc feature,
breaking save/migration with host-model and host-passthrough.
On hosts with this feature present it was automatically included
in the CPU definition, regardless of QEMU support.
Commit de0aeaf stopped including it by default for host-model,
but failed to fix host-passthrough.
This commit ignores checking of CPU features with host-passthrough,
since we don't pass them to QEMU (only -cpu host is passed),
allowing domains using host-passthrough that were saved with
the broken version of libvirtd to be restored.
https://bugzilla.redhat.com/show_bug.cgi?id=1147584
When virConnectDomainQemuMonitorEventRegister is called with the
VIR_CONNECT_DOMAIN_QEMU_MONITOR_EVENT_REGISTER_REGEX flag,
ignore the flag instead of crashing.
https://bugzilla.redhat.com/show_bug.cgi?id=1144920
For the new VIR_DOMAIN_EVENT_ID_TUNABLE event we have a bunch of
constants added
VIR_DOMAIN_EVENT_CPUTUNE_<blah>
VIR_DOMAIN_EVENT_BLKDEVIOTUNE_<blah>
This naming convention is bad for two reasons
- There is no common prefix unique for the events to both
relate them, and distinguish them from other event
constants
- The values associated with the constants were chosen
to match the names used with virConnectGetAllDomainStats
so having EVENT in the constant name is not applicable in
that respect
This patch proposes renaming the constants to
VIR_DOMAIN_TUNABLE_CPU_<blah>
VIR_DOMAIN_TUNABLE_BLKDEV_<blah>
ie, given them a common VIR_DOMAIN_TUNABLE prefix.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=993411
On some systems (using libtirpc instead of glibc's
implementation), xdr_uint64_t exists under a different name:
xdr_u_int64_t. This makes compilation fail then:
libvirt_lxc-lxc_monitor_protocol.o: In function `xdr_virLXCMonitorInitEventMsg':
/usr/local/src/libvirt/libvirt-1.1.1/src/./lxc/lxc_monitor_protocol.c:31: undefined reference to `xdr_uint64_t'
Therefore we rather mirror the d707c866 commit and redefine
xdr_uint64_t if needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On a domain startup, the variable store path is generated if needed.
The path is intended to be generated only once. However, the updated
domain definition is saved only into the state XML, not into the
config dir. So later, whenever the domain is destroyed and the daemon is
restarted, the generated path is forgotten and the file may be left
behind on virDomainUndefine() call.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There's no one to free() it anyway. Instead, we can just pass the
provided array pointer directly.
==20039== 48 bytes in 4 blocks are definitely lost in loss record 658 of 787
==20039== at 0x4C2A700: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20039== by 0x4EA661F: virAllocN (viralloc.c:191)
==20039== by 0x50386EF: remoteNodeGetFreePages (remote_driver.c:7625)
==20039== by 0x5003504: virNodeGetFreePages (libvirt.c:21379)
==20039== by 0x154625: cmdFreepages (virsh-host.c:374)
==20039== by 0x12F718: vshCommandRun (virsh.c:1935)
==20039== by 0x1339FB: main (virsh.c:3747)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since 363e9a68 we track backing chain metadata when creating snapshots
the right way even for the inactive configuration. As we did not yet
update other code paths that modify the backing chain (blockpull) the
newDef backing chain gets out of sync.
After stopping of a VM the new definition gets copied to the next start
one. The new VM then has incorrect backing chain info. This patch
switches the backing chain detector to always purge the existing backing
chain and forces re-detection to avoid this issue until we'll have full
backing chain tracking support.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1144922
Due to a missing check the API can be successfully called even if
the connection is ReadOnly. Fortunately, the API hasn't been
released yet, so there's no need for a CVE.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add files parallels_sdk.c and parallels_sdk.h for code
which works with SDK, so libvirt's code will not mix with
dealing with parallels SDK.
To use Parallels SDK you must first call PrlApi_InitEx function,
and then you will be able to connect to a server with
PrlSrv_LoginLocalEx function. When you've done you must call
PrlApi_Deinit. So let's call PrlApi_InitEx on first .connectOpen,
count number of connections and deinitialize, when this counter
becomes zero.
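The connection counting described above can be sketched like this. It
is a Python model only: the class name is hypothetical, and init_fn and
deinit_fn stand in for PrlApi_InitEx and PrlApi_Deinit, whose real
signatures are not shown here.

```python
import threading


class SdkLifetime:
    """Initialize the SDK on the first connection, tear down on the last."""

    def __init__(self, init_fn, deinit_fn):
        self._init_fn = init_fn
        self._deinit_fn = deinit_fn
        self._count = 0
        self._lock = threading.Lock()

    def connect_open(self):
        with self._lock:
            # Only the first open connection initializes the SDK.
            if self._count == 0:
                self._init_fn()
            self._count += 1

    def connect_close(self):
        with self._lock:
            if self._count == 0:
                raise RuntimeError("close without matching open")
            self._count -= 1
            # The last connection to go away deinitializes the SDK.
            if self._count == 0:
                self._deinit_fn()
```

The lock is there because connections may be opened and closed from
different threads; whether the real driver needs it is an assumption.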
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Executing prlctl command is not an optimal way to interact with
Parallels Cloud Server (PCS), it's better to use parallels SDK,
which is a remote API to parallels dispatcher service.
We prepared opensource version of this SDK and published it on
github, it's distributed under LGPL license. Here is a git repo:
https://github.com/Parallels/parallels-sdk.
To build with parallels SDK user should get compiler and linker
options from pkg-config 'parallels-sdk' file. So fix checks in
configure script and build with parallels SDK, if that pkg-config
file exists and add gcc options to makefile.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
We have these configuration knobs, like max_clients and
max_anonymous_clients. They limit the number of clients
connected. Whenever the limit is reached, the daemon stops
accepting new ones and resumes if one of the connected clients
disconnects. If that's the case, a debug message is printed into
the logs, and likewise when the daemon starts accepting new clients
again. However, the problem is the messages have debug priority.
This may be unfortunate, because if the daemon stops accepting
new clients all of a sudden, and users don't have debug logs
enabled they have no idea what's going on. Raise the messages
level to INFO at least.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The changes in commit c7542573 introduced possible segfault. Looking
deeper into the code and the original code before the patch series were
applied, I think that we should report an error for each function failure
and also we shouldn't call some of the functions twice.
Found by coverity.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Use the universal tunable event to report changes to user. All
blkdeviotune values are prefixed with "blkdeviotune".
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
When you updated some blkdeviotune values for running domain the values
were stored only internally, but not saved into the live XML so they
won't survive restarting the libvirtd.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The commit 1b854c76 introduced a new function 'virPolkitCheckAuth' and
in the #else section when you don't have polkit all attributes should be
followed by ATTRIBUTE_UNUSED.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
It would be nice to also print a params pointer and number of params in
the debug message and the previous limit for number of params in the rpc
message was too large. The 2048 params will be enough for future events.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
A long time ago in a galaxy far, far away it has been decided
that libvirt will manage not only domains but host as well. And
with my latest work on qemu driver supporting huge pages, we miss
the cherry on top: an API to allocate huge pages on the run.
Currently users are forced to log into the host and adjust the
huge pages pool themselves. However, with this API the problem
is gone - they can both size up and size down the pool.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In the previous patch I've changed the for loop bounds but forgot
to 'git add' changes that adapt the rest of the code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The check for ISCSI devices was missing a check of subsys type, which
meant we could skip labelling of other host devices as well. This fixes
USB hotplug on F21
https://bugzilla.redhat.com/show_bug.cgi?id=1145968
Spawning the pkcheck program every time a permission check is
required is hugely expensive on CPU. The pkcheck program is just
a dumb wrapper for the DBus API, so rewrite the code to use the
DBus API directly. This also simplifies error handling a bit.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the remote daemon auth check and the access control
code to use the common polkit API for checking auth.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert virAccessDriverPolkitFormatProcess to use typesafe API
for getting process ID attribute.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update virNetServerClientCreateIdentity and virIdentityGetSystem
to use the new typesafe APIs for setting identity attributes
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are now two places in libvirt which use polkit. Currently
they use pkexec, which is set to be replaced by direct DBus API
calls. Add a common API which they will both be able to use for
this purpose.
No tests are added at this time, since the impl will be gutted
in favour of a DBus API call shortly.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add options for tuning segment offloading:
<driver>
<host csum='off' gso='off' tso4='off' tso6='off'
ecn='off' ufo='off'/>
<guest csum='off' tso4='off' tso6='off' ecn='off' ufo='off'/>
</driver>
which control the respective host_ and guest_ properties
of the virtio-net device.
In nodeGetFreePages, if startCell is '0' and the max node number
is '0' too, the for-loop is never executed.
So convert it to while-loop.
Before:
> virsh freepages --cellno 0 --pagesize 4
error: internal error: no suitable info found
After:
> virsh freepages --cellno 0 --pagesize 4
4KiB: 472637
Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If you have a bridge network in running domain and libvirtd is restarted
the information about host bridge interface is lost from live xml.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140085
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Request erroring out from the backing chain traveller and drop qemu's
internal backing chain integrity tester.
The backing chain traveller reports errors by itself with possibly more
detail than qemuDiskChainCheckBroken ever could.
We also need to make sure that we reconnect to existing qemu instances
even at the cost of losing the backing chain info (this really should be
stored in the XML rather than reloaded from disk, but that needs some
work).
Add a new parameter to virStorageFileGetMetadata that will break the
backing chain detection process and report useful error message rather
than having to use virStorageFileChainGetBroken.
This patch just introduces the option, usage will be provided
separately.
Now we have universal tunable event so we can use it for reporting
changes to user. The cputune values will be prefixed with "cputune" to
distinguish it from other tunable events.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This new event will use typedParameters to expose what has been actually
updated and the reason is that we can in the future extend any tunable
values or add new tunable values. With typedParameters we don't have to
worry about creating some other events, we will just use this universal
event to inform user about updates.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
In the function at one place we check if def->cpu is NULL prior
to accessing def->cpu->ncells. Then, later in the code,
def->cpu->ncells is accessed directly, without the check. This
makes coverity unhappy, because the first check makes it think
def->cpu can be NULL. However, the function is not called if
def->cpu is NULL. Therefore, remove the first check and hopefully
make coverity cheer again.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Cleanup virDomainDef structure from other nested structure and create
separate type definition for them.
Fix a typo in virDomainHugePage.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
RDMA Live migration requires registering memory with the hardware, and
thus QEMU offers a new 'capability' to pre-register / mlock() the guest
memory in advance for higher RDMA performance before the migration
begins. This capability is disabled by default, which means QEMU will
register the memory with the hardware on an on-demand basis.
This patch exposes this capability with the following example usage:
virsh migrate --live --rdma-pin-all --migrateuri rdma://hostname domain qemu+ssh://hostname/system
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch adds support for RDMA protocol in migration URIs.
USAGE: $ virsh migrate --live --migrateuri rdma://hostname domain qemu+ssh://hostname/system
Since libvirt runs QEMU in a pretty restricted environment, several
files need to be added to cgroup_device_acl (in qemu.conf) for QEMU to
be able to access the host's infiniband hardware. Full documentation of
the feature can be found on QEMU wiki:
http://wiki.qemu.org/Features/RDMALiveMigration
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Currently we only support TCP protocol for native QEMU migration but
this is going to be changed. Let's make the code more general and remove
hardcoded TCP protocol from several places.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
For compatibility with old libvirt we need to support both tcp:host and
tcp://host migration URIs. Let's make the code that parses them a bit
cleaner.
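A small sketch of accepting both URI spellings, in Python for
illustration. The helper name is hypothetical and the real parser does
more (for example, extracting the host and port).

```python
def normalize_migration_uri(uri):
    """Rewrite the legacy 'scheme:host' form as 'scheme://host'."""
    scheme, sep, rest = uri.partition(":")
    if not sep or not scheme:
        raise ValueError("migration URI has no scheme: %r" % uri)
    # Already in the modern form, nothing to do.
    if rest.startswith("//"):
        return uri
    return scheme + "://" + rest
```

After normalization, one code path can parse the URI regardless of
which form the old or new libvirt on the other side produced.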
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
RDMA migration uses the 'setup' state in QEMU to optionally lock
all memory before the migration starts. The total time spent in
this state is exposed as VIR_DOMAIN_JOB_SETUP_TIME.
Additionally, QEMU also exports migration throughput (mbps) for both
memory and disk, so let's add them too: VIR_DOMAIN_JOB_MEMORY_BPS,
VIR_DOMAIN_JOB_DISK_BPS.
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit 72f919f558 introduced a user-
friendly error message when trying to use IDE disks as readonly.
Do the same thing for the SATA bus.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1112939
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Add a new parameter that will allow to return the XML stored in the save
image for further manipulation and adjust the callers. This option will
be used in later patches.
There is no need to acquire the driver-wide lock in
libxlDomainDefineXML. When switching to jobs in the libxl
driver, most driver-wide locks were removed. The locking here
was preserved since I mistakenly thought virDomainObjListAdd
needed protection. This is not the case, so remove the
unnecessary locking.
Commit id '9a2f36ec' added a build conditional of CAP_SYS_RAWIO
in order to determine whether or not a disk definition using rawio
should be allowed on platforms without CAP_SYS_RAWIO. If one was
found, virReportError was used but the code didn't goto cleanup.
This patch adds the goto.
We are not detecting the presence of FIPS from QEMU, but from procfs and
that means it's not QEMU capability. It was decided that we will pass
this flag to QEMU even if it's not supported by old QEMU binaries.
This patch also reverts changes done by commit a21cfb0f to
qemucapabilitiestest and implements a new test case in qemuxml2argvtest.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1135431
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1141879
A long time ago I've implemented support for so called multiqueue
net. The idea was to let guest network traffic be processed by
multiple host CPUs and thus increasing performance. However, this
behavior is enabled by QEMU via special ioctl() iterated over the
all tap FDs passed in by libvirt. Unfortunately, SELinux comes in
and disallows the ioctl() call because the /dev/net/tun has label
system_u:object_r:tun_tap_device_t:s0 and 'attach_queue' ioctl()
is not allowed on tun_tap_device_t type. So after discussion with
a SELinux developer we've decided that the FDs passed to the QEMU
should be labelled with svirt_t type and SELinux policy will
allow the ioctl(). Therefore I've made a patch
(cf976d9dcf) that does exactly this. The patch
was fixed then by a443193139 and
b635b7a1af. However, things are not
that easy - even though the API to label FD is called
(fsetfilecon_raw) the underlying file is labelled too! So
effectively we are mangling /dev/net/tun label. Yes, that broke
dozens of other applications from openvpn, or boxes, to qemu
running other domains.
The best solution would be if SELinux provides a way to label an
FD only, which could be then labeled when passed to the qemu.
However that's a long path to go and we should fix this
regression AQAP. So I went to talk to the SELinux developer again
and we agreed on temporary solution that:
1) All the three patches are reverted
2) SELinux temporarily allows 'attach_queue' on the
tun_tap_device_t
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
- Provide an implementation for buildPool and deletePool operations
for the ZFS storage backend.
- Add VIR_STORAGE_POOL_SOURCE_DEVICE flag to ZFS pool poolOptions
as now we can specify devices to build pool from
- storagepool.rng: add an optional 'sourceinfodev' to 'sourcezfs' and
add an optional 'target' to 'poolzfs' entity
- Add a couple of tests to storagepoolxml2xmltest
If the qemu being used doesn't support JSON, then querying for IOThread
data would fail. In that case, ensure the *iothreads is NULL and return 0
as the count of iothreads available.
Currently, build with clang fails with:
CC qemu/libvirt_driver_qemu_impl_la-qemu_command.lo
qemu/qemu_command.c:6580:58: error: implicit conversion from enumeration type
'virMemAccess' to different enumeration type 'virTristateSwitch'
[-Werror,-Wenum-conversion]
virTristateSwitch memAccess = def->cpu->cells[i].memAccess;
~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~^~~~~~~~~
1 error generated.
Fix that by using virMemAccess instead of virTristateSwitch.
Commit f05b6a91 added virQEMUDriverConfigPtr argument to the
virQEMUCapsFillDomainCaps function and it uses forward declaration
of virQEMUDriverConfig and virQEMUDriverConfigPtr that causes clang
build to fail:
gmake[3]: Entering directory `/usr/home/novel/code/libvirt/src'
CC qemu/libvirt_driver_qemu_impl_la-qemu_capabilities.lo
In file included from qemu/qemu_capabilities.c:43:
In file included from qemu/qemu_hostdev.h:27:
qemu/qemu_conf.h:63:37: error: redefinition of typedef 'virQEMUDriverConfig'
is a C11 feature [-Werror,-Wtypedef-redefinition]
typedef struct _virQEMUDriverConfig virQEMUDriverConfig;
^
qemu/qemu_capabilities.h:328:37: note: previous definition is here
typedef struct _virQEMUDriverConfig virQEMUDriverConfig;
^
Fix that by passing loader and nloader config attributes directly
instead of passing complete config.
Commit f36a94f introduced a double free on all success paths
in qemuSharedDeviceEntryInsert.
Only call qemuSharedDeviceEntryFree on the error path and
set entry to NULL before jumping there if the entry already
is in the hash table.
https://bugzilla.redhat.com/show_bug.cgi?id=1142722
Clean up all _virDomainMemoryStat.
Signed-off-by: James <james.wangyufei@huawei.com>
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Clean up all _virDomainBlockStats.
Signed-off-by: James <james.wangyufei@huawei.com>
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Clean up all _virDomainInterfaceStats.
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Live definition was used to look up the disk index while persistent one
was indexed leading to a crash in qemuDomainGetBlockIoTune. Use the
correct def and report a nice error.
Unfortunately it's accessible via read-only connection, though it can
only crash libvirtd in the cases where the guest is hot-plugging disks
without reflecting those changes to the persistent definition. So
avoiding hotplug, or doing hotplug where persistent is always modified
alongside live definition, will avoid the out-of-bounds access.
Introduced in: eca96694a7f992be633d48d5ca03cedc9bbc3c9aa (v0.9.8)
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140724
Reported-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1135396
There are two ways how to tell qemu to use huge pages. The first one
is suitable for domains with NUMA nodes: the path to hugetlbfs mount
is appended to NUMA node definition on the command line. The second
one is suitable for UMA domains: here there's this global '-mem-path'
argument that accepts path to the hugetlbfs mount point. However, the
latter case was not used for all the cases that it should be. For
instance:
<memoryBacking>
<hugepages>
<page size='2048' unit='KiB' nodeset='0'/>
</hugepages>
</memoryBacking>
didn't trigger the '-mem-path' so the huge pages - despite being
configured - were not used at all.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As of 136ad4974 it is possible to specify different huge pages per
guest NUMA node. However, there's no check if nodeset specified in
./hugepages/page contains only those guest NUMA nodes that exist.
In other words with current code it is possible to define meaningless
combination:
<memoryBacking>
<hugepages>
<page size='1048576' unit='KiB' nodeset='0,2-3'/>
<page size='2048' unit='KiB' nodeset='1,4'/>
</hugepages>
</memoryBacking>
<vcpu placement='static'>4</vcpu>
<cpu>
<numa>
<cell id='0' cpus='0' memory='1048576'/>
<cell id='1' cpus='1' memory='1048576'/>
<cell id='2' cpus='2' memory='1048576'/>
<cell id='3' cpus='3' memory='1048576'/>
</numa>
</cpu>
Notice the node 4 in <hugepages/>?
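The missing check can be sketched as follows, in Python for
illustration. The helper names are hypothetical, and the nodeset parser
only handles comma-separated numbers and ranges, which is all the
example above uses.

```python
def parse_nodeset(nodeset):
    """Expand a string such as '0,2-3' into the set {0, 2, 3}."""
    nodes = set()
    for part in nodeset.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            nodes.update(range(int(first), int(last) + 1))
        else:
            nodes.add(int(part))
    return nodes


def unknown_nodes(page_nodesets, cell_ids):
    """Return the nodes named in <page/> elements that no <cell/> defines."""
    requested = set()
    for nodeset in page_nodesets:
        requested |= parse_nodeset(nodeset)
    return requested - set(cell_ids)
```

For the XML above, unknown_nodes(["0,2-3", "1,4"], [0, 1, 2, 3]) yields
{4}, which is the node that should cause the definition to be rejected.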
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch implements the VIR_DOMAIN_STATS_BLOCK group of statistics.
To do so, a helper function to get the block stats of all the disks of
a domain is added.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This patch implements the VIR_DOMAIN_STATS_INTERFACE group of
statistics.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This patch implements the VIR_DOMAIN_STATS_VCPU group of statistics. To
do so, this patch also extracts a helper to gather the vCPU information.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This patch implements the VIR_DOMAIN_STATS_CPU_TOTAL group of
statistics.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Future patches which will implement more bulk stats groups for QEMU will
need to access the connection object.
To accommodate that, a few changes are needed:
* enrich internal prototype to pass qemu driver object
* add per-group flag to mark if one collector needs monitor access or not
* If at least one collector of the requested stats needs monitor access
we must start a query job for each domain. The specific collectors
will run nested monitor jobs inside that.
* If the job can't be acquired we pass flags to the collector so
specific collectors that need monitor access can be skipped in order
to gather as much data as is possible.
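The collector scheme listed above can be modelled roughly like this.
It is a Python sketch under assumptions: collectors are plain
(group, needs_monitor, function) tuples and the stat names are made up;
the real code works on driver and domain objects.

```python
def needs_job(collectors, requested):
    """True if any requested stats group needs monitor access."""
    return any(needs_monitor
               for group, needs_monitor, _ in collectors
               if group in requested)


def collect_stats(collectors, requested, have_job):
    """Run the requested collectors, skipping those we cannot serve."""
    result = {}
    for group, needs_monitor, fn in collectors:
        if group not in requested:
            continue
        # Without a job the monitor is off limits, so skip this
        # collector but still gather everything else.
        if needs_monitor and not have_job:
            continue
        result.update(fn())
    return result
```

A caller would first ask needs_job() whether to try acquiring a query
job at all, then pass the outcome as have_job.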
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Check to see if the UEFI binary mentioned in qemu.conf actually
exists, and if so expose it in domcapabilities like
<loader ...>
<value>/path/to/ovmf</value>
</loader>
We introduce some generic domcaps infrastructure for handling
a dynamic list of string values, it may be of use for future bits.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Up till now the virQEMUCapsFillDomainCaps() was type of void as
there was no way for it to fail. This is, however, going to
change in the next commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
qemu for IBM Power processor architecture is adding functionality for
supporting multiple 'pseries' machine type versions, each with different
capabilities. This patch is for supporting the same
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
As of 542899168c we taught libvirt to use UEFI for domains.
However, management applications may first want to query whether
libvirt supports it. And this is where the
virConnectGetDomainCapabilities() API comes in handy.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
virStorageSourceInitChainElement initializes a new storage chain element
for use as a new disk source. If the new element doesn't contain the
driver name, copy it from the old source.
This fixes an issue where a disk would forget the driver after a snapshot.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140984
Commit b606bbb4 broke reporting of errors when setting of guest time
fails via the guest agent as the return value is not checked and later
overwritten by the return value of qemuMonitorRTCResetReinjection().
Fix this by checking the return value before resetting the RTC
reinjection.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1142294
For tuning the network, alternative devices
for creating tap and vhost devices can be specified via:
<backend tap='/dev/net/tun' vhost='/dev/net-vhost'/>
Instead of checking upfront if the <driver> element will be needed
in a big condition, just format all the attributes into a string
and output the <driver> element if the string is not empty.
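A minimal sketch of that approach, assuming a made-up options struct with two made-up attributes (the names are placeholders, not the fields libvirt actually formats):

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the device settings. */
struct drv_opts {
    int queues;     /* 0 means "not set" */
    int ioeventfd;  /* 0 means "not set", 1 = on, 2 = off */
};

/* Formats all attributes into a temporary string first and writes
 * "<driver .../>\n" into out only if that string is not empty.
 * Returns bytes written, 0 if nothing was needed, -1 if a buffer
 * is too small. */
int
format_driver(const struct drv_opts *o, char *out, size_t outlen)
{
    char attrs[128] = "";
    size_t used = 0;
    int n;

    if (outlen == 0)
        return -1;
    out[0] = '\0';

    if (o->queues) {
        n = snprintf(attrs + used, sizeof(attrs) - used,
                     " queues='%d'", o->queues);
        if (n < 0 || (size_t)n >= sizeof(attrs) - used)
            return -1;
        used += (size_t)n;
    }
    if (o->ioeventfd) {
        n = snprintf(attrs + used, sizeof(attrs) - used,
                     " ioeventfd='%s'", o->ioeventfd == 1 ? "on" : "off");
        if (n < 0 || (size_t)n >= sizeof(attrs) - used)
            return -1;
        used += (size_t)n;
    }

    if (used == 0)
        return 0;  /* no attributes, so no <driver> element at all */

    n = snprintf(out, outlen, "<driver%s/>\n", attrs);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return n;
}
```

Adding a new attribute then only means adding one more block that appends to the string; no upfront condition has to be kept in sync.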
We already check for a negative value and report an error, but the
wrong function is used, so the check only triggers when the value
cannot be converted to a number at all. This patch is a minor change
that calls the right variant of virStrToLong.
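The general pitfall can be shown with plain libc calls (this is not the libvirt helper itself; parse_non_negative is a hypothetical name). strtoul() accepts "-1" and silently wraps it, so a negative check has to be done on a signed parse:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parses a decimal integer that must not be negative.
 * Returns 0 on success, -1 if the string is malformed or out of
 * range, -2 if it is a well-formed negative number. */
int
parse_non_negative(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);  /* signed parse keeps the sign visible */
    if (end == s || *end != '\0' || errno == ERANGE ||
        v > INT_MAX || v < INT_MIN)
        return -1;
    if (v < 0)
        return -2;
    *out = (int)v;
    return 0;
}
```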
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138539
The first one occurs in openvzDomainMigratePrepare3Params() where, in
case no remote uri is given, the distant hostname is used. The name is
obtained via virGetHostname(), which requires callers to free the
returned value.
The second leak lies in openvzDomainMigratePerform3Params(). There's a
virCommand used later. However, at the beginning of the function
virCheckFlags() is called, which may return early. In that case the
command already created was leaked.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently, the setns() wrapper is supported only for x86_64 and i686
which leaves us failing to build on other platforms like arm, aarch64
and so on. This means that the wrapper needs to be extended to those
platforms and made to fail at run time rather than at compile time.
The syscall numbers for other platforms were fetched using this
command:
kernel.git $ git grep "define.*__NR_setns" | grep -e arm -e powerpc -e s390
arch/arm/include/uapi/asm/unistd.h:#define __NR_setns (__NR_SYSCALL_BASE+375)
arch/arm64/include/asm/unistd32.h:#define __NR_setns 375
arch/powerpc/include/uapi/asm/unistd.h:#define __NR_setns 350
arch/s390/include/uapi/asm/unistd.h:#define __NR_setns 339
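A sketch of such a wrapper, not the actual libvirt code. The arm, powerpc and s390 numbers are taken from the grep output above (375 for arm assumes EABI, where __NR_SYSCALL_BASE is 0); the x86 numbers are added from memory of the kernel syscall tables, and my_setns is a hypothetical name.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <errno.h>
#include <unistd.h>
#if defined(__linux__)
# include <sys/syscall.h>
#endif

/* Fallback numbers, used only when the libc headers lack __NR_setns. */
#if defined(__linux__) && !defined(__NR_setns)
# if defined(__x86_64__)
#  define __NR_setns 308
# elif defined(__i386__)
#  define __NR_setns 346
# elif defined(__arm__)
#  define __NR_setns 375   /* assumes EABI */
# elif defined(__powerpc__)
#  define __NR_setns 350
# elif defined(__s390__)
#  define __NR_setns 339
# endif
#endif

/* Calls the setns syscall directly. On platforms where the number is
 * unknown it still compiles and fails at run time with ENOSYS. */
int
my_setns(int fd, int nstype)
{
#if defined(__NR_setns)
    return (int)syscall(__NR_setns, fd, nstype);
#else
    (void)fd;
    (void)nstype;
    errno = ENOSYS;
    return -1;
#endif
}
```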
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
A backing store string offset of 0 means that no backing file is
present. In that case the string size shouldn't be checked:
from qemu.git/docs/specs/qcow2.txt
== Header ==
The first cluster of a qcow2 image contains the file header:
Byte 0 - 3: magic
QCOW magic string ("QFI\xfb")
4 - 7: version
Version number (valid values are 2 and 3)
8 - 15: backing_file_offset
Offset into the image file at which the backing file name
is stored (NB: The string is not null terminated). 0 if the
image doesn't have a backing file.
16 - 19: backing_file_size
Length of the backing file name in bytes. Must not be
longer than 1023 bytes. Undefined if the image doesn't have
a backing file. ^^^^^^^^^
This patch intentionally leaves the backing file string size check in
place in case a malformed file is presented to libvirt. Also
according to the docs the string size is maximum 1023 bytes, thus this
patch adds a check to verify that.
I was also able to verify that the check was done the same way in the
legacy qcow format (in qemu's code).
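The header rule quoted above can be sketched like this. The function and macro names are hypothetical; only the byte offsets and the 1023 byte limit come from the quoted spec.

```c
#include <stddef.h>
#include <stdint.h>

#define QCOW2_MAX_BACKING_NAME 1023  /* limit from the qcow2 spec */

/* Reads a big-endian 64-bit value. */
static uint64_t
be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Reads a big-endian 32-bit value. */
static uint32_t
be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 if the image has no backing file, 1 if it has one with a
 * plausible name length, -1 if the header is too short or malformed. */
int
qcow2_backing_info(const unsigned char *hdr, size_t len,
                   uint64_t *offset, uint32_t *size)
{
    if (len < 20)
        return -1;

    *offset = be64(hdr + 8);      /* bytes 8 - 15: backing_file_offset */
    if (*offset == 0) {
        *size = 0;                /* size is undefined here, do not check it */
        return 0;
    }

    *size = be32(hdr + 16);       /* bytes 16 - 19: backing_file_size */
    if (*size > QCOW2_MAX_BACKING_NAME)
        return -1;
    return 1;
}
```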
If there are no iothreads, then return from qemuProcessDetectIOThreadPIDs
without error; otherwise, the following occurs:
error: Failed to start domain $dom
error: An error occurred, but the cause is unknown
https://bugzilla.redhat.com/show_bug.cgi?id=1101574
Add an option 'iothreadpin' to the <cpuset> to allow for setting the
CPU affinity for each IOThread.
The iothreadpin will mimic the vcpupin with respect to being able to
assign each iothread to a specific CPU, although iothread ids start
at 1 while vcpu ids start at 0. This matches the iothread naming scheme.
Modify qemuProcessStart() in order to allow setting affinity to
specific CPU's for IOThreads. The process followed is similar to
that for the vCPU's.
This involves adding a function to fetch the IOThread id's via
qemuMonitorGetIOThreads() and adding them to iothreadpids[] list.
Then making sure all the cgroup data has been properly set up and
finally assigning affinity.
In order to support cpuset setting, introduce qemuSetupCgroupIOThreadsPin
and qemuSetupCgroupForIOThreads to mimic the existing Vcpu API's.
These will support having an 'iothreadpin' element in the 'cpuset' in
order to pin named IOThreads to specific CPU's. The IOThread pin names
will follow the IOThread naming scheme starting at 1 (eg "iothread1")
up through and including the def->iothreads value.
When spanning tree protocol is allowed in bridge settings, forward delay
value is set as well (default is 0 if omitted). Until now, there was no
check for delay value validity. Delay makes sense only as a positive
numerical value.
Note: However, even if you provide a positive numerical value, the brctl
utility only uses values from the range <2,30>, so the number provided can
be modified (most likely by the kernel) to fall within this range.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1125764
Coverity complains about the calculation of the buf & len within
the PROBE macro. So to quiet things down, do the calculation prior
to usage in either write() or qemuMonitorIOWriteWithFD() calls and
then have the PROBE use the calculated values - which works.
Seems when commit id 'ea130e3b' added the checks to ensure each of
the hard_limit, soft_limit, and swap_hard_limit wasn't set at
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED - a copy/paste error of using
the 'hard_limit' for each comparison was done. Adjust the code.
Coverity complains that because of how 'offset' is initialized to
0 (zero), the resulting math and comparison on rem is pointless.
According to the origin commit id '3ec128989', the code is a
replacement for gmtime(), but without the localtime() or GMT
calculations - so just remove this code and add a comment
indicating the removal
Since 98b9acf5aa
This was a false positive where Coverity was complaining that the
remoteDeserializeTypedParameters() could allocate 'params', but
none of the callers could return the allocated memory back to their
caller since on input the param was passed by value. Additionally,
the flow of the code was that if params was NULL on entry, then each
function would return 'nparams' as the number of params entries the
caller would need to allocate in order to call the function again
with 'nparams' and 'params' being set. By the time the deserialize
routine was called params would have something. For other callers
where the 'params' was passed by reference as NULL since it's expected
that the deserialize allocates the memory and then have that passed
back to the original caller to dispose there was no Coverity issue.
As it turns out Coverity didn't quite seem to understand the
relationship between 'nparams' and 'params'; however, if the
!userAllocated path of the deserialize code compared against
limit in any manner, then the Coverity error went away which
was quite strange, but useful.
As it turns out one code path remoteDomainGetJobStats had a
comparison against 'limit' while another remoteConnectGetAllDomainStats
did not assuming that limit would be checked. So I refactored the
code a bit to cause the limit check to occur in deserialize for
both conditions and then only made the check of current returned
size against the incoming *nparams fail the non allocation case.
This means the job code doesn't need to check the limit any more,
while the stats code now does check the limit.
Additionally, to help perhaps decipher which of the various
callers to the deserialize code caused the failure - I used
a #define to pass the __FUNCNAME__ of the caller along so that
error messages could have something like:
error: remoteConnectGetAllDomainStats: too many parameters '2' for nparams '0'
error: Reconnected to the hypervisor
(it's a contrived error just to show the funcname in the error)
The manufacturer and product from the USB device itself are usually not particularly
useful -- they tend to be missing, or ugly (all-uppercase, padded with spaces,
etc.). Prefer what's in the usb id database and fall back to descriptors only
if the device is too new to be in the database.
https://bugzilla.redhat.com/show_bug.cgi?id=1138887
This patch adds initial migration support to the OpenVZ driver,
using the VIR_DRV_FEATURE_MIGRATION_PARAMS family of migration
functions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We stupidly modeled block job bandwidth after migration
bandwidth, which in turn was an 'unsigned long' and therefore
subject to 32-bit vs. 64-bit interpretations. To work around
the fact that 10-gigabit interfaces are possible but don't fit
within 32 bits, the original interface took the number scaled
as MiB/sec. But this scaling is rather coarse, and it might
be nice to tune bandwidth finer than in megabyte chunks.
Several of the block job calls that can set speed are fed
through a common interface, so it was easier to adjust them all
at once. Note that there is intentionally no flag for the new
virDomainBlockCopy; there, since the API already uses a 64-bit
type always, instead of a possible 32-bit type, and is brand
new, it was easier to just avoid scaling issues. As with the
previous patch that adjusted the query side (commit db33cc24),
omitting the new flag preserves old behavior, and the
documentation now mentions limits of what happens when a 32-bit
machine is on either client or server side.
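The scaling described above amounts to something like the following sketch. BW_IN_BYTES and bandwidth_to_bytes are hypothetical names standing in for the new per-API flags and the common code path.

```c
#include <limits.h>

/* Hypothetical flag: the caller passes bytes/s instead of MiB/s. */
#define BW_IN_BYTES (1u << 0)

/* Converts the 'unsigned long' bandwidth argument into bytes/s.
 * Without the flag the value is interpreted as MiB/s (old behavior).
 * Returns 0 on success, -1 if the scaled value would overflow. */
int
bandwidth_to_bytes(unsigned long bandwidth, unsigned int flags,
                   unsigned long long *bytes)
{
    unsigned long long speed = bandwidth;

    if (!(flags & BW_IN_BYTES)) {
        if (speed > (ULLONG_MAX >> 20))
            return -1;
        speed <<= 20;  /* MiB/s to bytes/s */
    }
    *bytes = speed;
    return 0;
}
```

On a 32-bit client or server the input itself is capped at what fits in 'unsigned long', which is the limit the documentation now mentions.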
* include/libvirt/libvirt.h.in (virDomainBlockJobSetSpeedFlags)
(virDomainBlockPullFlags)
(VIR_DOMAIN_BLOCK_REBASE_BANDWIDTH_BYTES)
(VIR_DOMAIN_BLOCK_COMMIT_BANDWIDTH_BYTES): New enums.
* src/libvirt.c (virDomainBlockJobSetSpeed, virDomainBlockPull)
(virDomainBlockRebase, virDomainBlockCommit): Document them.
* src/qemu/qemu_driver.c (qemuDomainBlockJobSetSpeed)
(qemuDomainBlockPull, qemuDomainBlockRebase)
(qemuDomainBlockCommit, qemuDomainBlockJobImpl): Support new flag.
Signed-off-by: Eric Blake <eblake@redhat.com>
Upstream qemu 1.4 added some drive-mirror tunables not present
when it was first introduced in 1.3. Management apps may want
to set these in some cases (for example, without tuning
granularity down to sector size, a copy may end up occupying
more bytes than the original because an entire cluster is
copied even when only a sector within the cluster is dirty,
although tuning it down results in more CPU time to do the
copy). I haven't personally needed to use the parameters, but
since they exist, and since the new API supports virTypedParams,
we might as well expose them.
Since the tuning parameters aren't often used, and omitted from
the QMP command when unspecified, I think it is safe to rely on
qemu 1.3 to issue an error about them being unsupported, rather
than trying to create a new capability bit in libvirt.
Meanwhile, all versions of qemu from 1.4 to 2.1 have a bug where
a bad granularity (such as non-power-of-2) gives a poor message:
error: internal error: unable to execute QEMU command 'drive-mirror': Invalid parameter 'drive-virtio-disk0'
because of abuse of QERR_INVALID_PARAMETER (which is supposed to
name the parameter that was given a bad value, rather than the
value passed to some other parameter). I don't see that a
capability check will help, so we'll just live with it (and it
has since been improved in upstream qemu).
* src/qemu/qemu_monitor.h (qemuMonitorDriveMirror): Add
parameters.
* src/qemu/qemu_monitor.c (qemuMonitorDriveMirror): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDriveMirror):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockCopyCommon): Likewise.
(qemuDomainBlockRebase, qemuDomainBlockCopy): Adjust callers.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Likewise.
* tests/qemumonitorjsontest.c (qemuMonitorJSONDriveMirror): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The hard part of managing the disk copy is already coded; all
this had to do was convert the XML and virTypedParameters into
the internal representation.
With this patch, all blockcopy operations that used the old
API should also work via the new API. Additional extensions,
such as supporting the granularity tunable or a network rather
than file destination, will be added as later patches.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function.
Signed-off-by: Eric Blake <eblake@redhat.com>
In order to implement the new virDomainBlockCopy, the existing
block copy internal implementation needs to be adjusted. The
new function will parse XML into a storage source, and parse
typed parameters into integers, then call into the same common
backend. For now, it's easier to keep the same implementation
limits that only local file destinations are supported, but now
the check needs to be explicit. Similar to qemuDomainBlockJobImpl
consuming 'vm', this code also consumes the caller's 'mirror'
description of the destination.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Rename...
(qemuDomainBlockCopyCommon): ...and adjust parameters.
(qemuDomainBlockRebase): Adjust caller.
Signed-off-by: Eric Blake <eblake@redhat.com>
When a domain is undefined, there are options to remove its
managed save state or snapshots. However, there's another file
that libvirt creates per domain: the NVRAM variable store file.
Make sure that the file is not left behind if the domain is
undefined.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If we end up at the cleanup label before we've VIR_EXPAND_N the list,
then calling virQEMUCapsFreeStringList() with a NULL proplist could
theoretically deref proplist if nproplist was set. Coverity doesn't
seem to acknowledge the relationship between proplist and nproplist
assuming in virQEMUCapsFreeStringList that nproplist could be at
least 1 and thus have a null deref. It only seems to follow the
NULL proplist.
Signed-off-by: John Ferlan <jferlan@redhat.com>
With the virGetGroupList() change in place - Coverity further complains
that if we fail to virFork(), the groups will be leaked - which aha seems
to be the case. Adjust the logic to save off the -errno, free the groups,
and then return the value we saved
Signed-off-by: John Ferlan <jferlan@redhat.com>
This ends up being a very bizarre false positive. With an assist from
eblake, the claim is that mgetgroups() could return a -1 value, but yet
still have a groups buffer allocated, yet the example shown doesn't
seem to prove that.
Rather than fret about it, by adding a well placed sa_assert() on the
returned *list value we can "assure" ourselves that the mgetgroups()
failure path won't signal this condition.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If a (floppy) drive isn't selected for snapshot explicitly and is empty
don't try to snapshot it. For external snapshots this would fail as we
can't generate a name for the snapshot from an empty drive.
Reported-by: Pavel Hrdina <phrdina@redhat.com>
To express empty drive we historically use storage source with empty
path. Unfortunately NBD disks may be declared without a path.
Add a helper to wrap this logic.
The libxl driver was blindly assigning libvirt's
virDomainLifecycleAction to libxl's libxl_action_on_shutdown, when
in fact the various actions take on different values in these enums.
Introduce helpers to properly map the enum values.
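The idea of such a helper, shown with two made-up enums whose numeric values deliberately differ (these are not the real libvirt or libxl definitions):

```c
/* Hypothetical "frontend" enum, values start at 0. */
enum guest_action {
    GUEST_ACTION_DESTROY = 0,
    GUEST_ACTION_RESTART,
    GUEST_ACTION_RESTART_RENAME,
    GUEST_ACTION_PRESERVE
};

/* Hypothetical "backend" enum, values start at 1. */
enum backend_action {
    BACKEND_ACTION_DESTROY = 1,
    BACKEND_ACTION_RESTART,
    BACKEND_ACTION_RESTART_RENAME,
    BACKEND_ACTION_PRESERVE
};

/* Maps by name instead of assigning the raw integer, so a difference
 * in numbering between the two enums cannot change the meaning. */
enum backend_action
map_action(enum guest_action action)
{
    switch (action) {
    case GUEST_ACTION_DESTROY:
        return BACKEND_ACTION_DESTROY;
    case GUEST_ACTION_RESTART:
        return BACKEND_ACTION_RESTART;
    case GUEST_ACTION_RESTART_RENAME:
        return BACKEND_ACTION_RESTART_RENAME;
    case GUEST_ACTION_PRESERVE:
        return BACKEND_ACTION_PRESERVE;
    }
    return BACKEND_ACTION_DESTROY;  /* fallback for out-of-range input */
}
```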
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Test suites using the port allocator don't want to have different
behaviour depending on whether a port is in use on the host. Add
a VIR_PORT_ALLOCATOR_SKIP_BIND_CHECK which test suites can use
to skip the bind() test. The port allocator will thus only track
ports in use by the test suite process itself. This is fine when
using the port allocator to generate guest configs which won't
actually be launched
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
I've noticed two problems with the automatically created NVRAM varstore
file. The first: even though I run qemu as root:root, for some reason I
get Permission denied when trying to open the _VARS.fd file. The
problem is that the upper directory misses execute permissions, which in
combination with us dropping some capabilities results in EPERM.
The next thing is, that if I switch SELinux to enforcing mode, I get
another EPERM because the vars file is not labeled correctly. It is
passed to qemu as disk and hence should be labelled as disk. QEMU may
write to it eventually, so this is different to kernel or initrd.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
With all the changes in my previous foray into this code, I forgot to
remove the libxlDomainEventQueue(driver, event); call inside the
dom == NULL condition.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Coverity notes that if the virConnectListAllDomains returns a negative
value then the loop at the cleanup label that ends on numDomains will
have issues.
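The class of problem is a signed count that went negative being used as the bound of a cleanup loop with an unsigned index. A generic sketch (names are hypothetical, this is not the patched function):

```c
#include <stdlib.h>

/* Converts a possibly negative element count into a safe loop bound. */
size_t
safe_count(int n)
{
    return n > 0 ? (size_t)n : 0;
}

/* Frees list[0..n-1] and the list itself. Safe when the producer
 * failed and left n negative and/or list NULL. */
void
free_string_list(char **list, int n)
{
    size_t i;
    size_t count = safe_count(n);

    if (!list)
        return;
    /* Comparing 'i < n' directly would convert a negative n to a huge
     * unsigned value and run far past the end of the array. */
    for (i = 0; i < count; i++)
        free(list[i]);
    free(list);
}
```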
Signed-off-by: John Ferlan <jferlan@redhat.com>
Coverity notes that if qemuMonitorGetMachines() returns a negative
nmachines value, then the code at the cleanup label will have issues.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Coverity notes that if the call to virBitmapParse() returns a negative
value, then when we jump to the error label, the call to
virCapabilitiesClearHostNUMACellCPUTopology() will have issues
with the negative nb_cpus
Signed-off-by: John Ferlan <jferlan@redhat.com>
If the virNumaGetNodeCPUs() call fails with -1, then jumping to cleanup
with 'cpus == NULL' and calling virCapabilitiesClearHostNUMACellCPUTopology
will cause issues.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In qemuProcessInitPCIAddresses() if qemuMonitorGetAllPCIAddresses()
returns a negative (or zero) value, then no need to call the
qemuProcessDetectPCIAddresses().
Signed-off-by: John Ferlan <jferlan@redhat.com>
The code compares def->forwarders when deciding to return 0 at a
couple of points, then uses "def->nfwds" as a way to index into
the def->forwarders array. That reference results in Coverity
complaining that def->forwarders being NULL was checked as part
of an arithmetic OR operation where failure could be any one of 5
conditions, but that is not checked when entering the loop to
dereference the array. Changing the comparisons to use nfwds
will clear the warnings
Signed-off-by: John Ferlan <jferlan@redhat.com>
If the qemuMigrationEatCookie() fails to set mig, we jump to cleanup:
which will call qemuMigrationCancelDriveMirror() without first checking
if mig == NULL
Signed-off-by: John Ferlan <jferlan@redhat.com>
Perhaps a false positive, but since Coverity doesn't understand the
relationship between the 'count' and the 'strings', rather than leave
the chance that on input 'strings' is NULL and causes a deref - just
check for it and return
Signed-off-by: John Ferlan <jferlan@redhat.com>
If the VIR_STRDUP(exptime,...) fails, then we will jump to cleanup,
no need to check if exptime is set which causes Coverity to issue
a complaint in the virStrToLong_ll call because there wasn't a check
for a NULL value while there was one for the reference right after
the VIR_STRDUP().
Signed-off-by: John Ferlan <jferlan@redhat.com>
If we jump to cleanup before allocating the 'result', then the call
to virBlkioDeviceArrayClear will deref result causing a problem.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If we jump to cleanup before allocating 'result', then the call to
virBlkioDeviceArrayClear() could dereference result
Signed-off-by: John Ferlan <jferlan@redhat.com>
If the virJSONValueNewObject() fails, then rather than going to error
and getting a Coverity false positive since it doesn't seem to understand
the relationship between nkeywords, keywords, and values and seems to
believe calling qemuFreeKeywords will cause a NULL deref - just return NULL
Signed-off-by: John Ferlan <jferlan@redhat.com>
Adjust the parentheses in/for the waitpid loops; otherwise, Coverity
points out:
(1) Event assignment: Assigning: "waitret" = "waitpid(pid, &status, 0) == -1"
(2) Event between: At condition "waitret == -1", the value of "waitret"
must be between 0 and 1.
(3) Event dead_error_condition: The condition "waitret == -1" cannot
be true.
(4) Event dead_error_begin: Execution cannot reach this statement:
"ret = -*__errno_location();".
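For illustration, a loop with the parentheses placed so that waitret really holds the waitpid() return value. wait_retry is a hypothetical helper, not the function touched by the patch.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Waits for pid, retrying when interrupted by a signal.
 * Returns 0 and fills *status on success, -errno on failure. */
int
wait_retry(pid_t pid, int *status)
{
    pid_t waitret;

    /* The inner parentheses matter. Without them the expression parses
     * as waitret = (waitpid(...) == -1 && ...), so waitret can only be
     * 0 or 1 and the "waitret == -1" check below becomes dead code. */
    while ((waitret = waitpid(pid, status, 0)) == -1 && errno == EINTR)
        ;

    if (waitret == -1)
        return -errno;
    return 0;
}
```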
Signed-off-by: John Ferlan <jferlan@redhat.com>
Coverity complains that when multiplying two 32 bit values that eventually
will be stored in a 64 bit value that it's possible the math could
overflow unless one of the values being multiplied is type cast to
the proper size.
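A small sketch of the difference, using hypothetical function names:

```c
#include <stdint.h>

/* The multiplication is done in 32 bits and only the (already wrapped)
 * result is widened, which is what Coverity warns about. */
uint64_t
total_bytes_narrow(uint32_t blocks, uint32_t block_size)
{
    return (uint32_t)(blocks * block_size);
}

/* Casting one operand first makes the multiplication itself 64-bit. */
uint64_t
total_bytes_wide(uint32_t blocks, uint32_t block_size)
{
    return (uint64_t)blocks * block_size;
}
```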
Signed-off-by: John Ferlan <jferlan@redhat.com>
Coverity complains about checking for !domlist after setting doms = domlist
and dereferencing doms just above.
It seems the call in question was intended to be made in the case that
'doms' was passed in and not when the virDomainObjListExport() call
allocated domlist and already called virConnectGetAllDomainStatsCheckACL().
Thus rather than check for !domlist - check that "doms != domlist" in
order to avoid the Coverity message.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Handle a few places where Coverity complains about the value being
unused. For two of them (Close cases) - the comments above the close
indicate there is no harm to ignore the error - so added an ignore_value.
For the other condition, added an rc check like other callers.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since cd4d547576
Coverity notes that setting 'ret = -3' prior to the unconditional
setting of 'ret = 0' will cause the value to be UNUSED.
Since the comment indicates that it is expected to allow the code
to continue, just remove the ret = -3 setting.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In qemuDomainSetBlkioParameters(), Coverity points out that the calls
to qemuDomainParseBlkioDeviceStr() are slightly different and points
out there may be a cut-n-paste error.
In the first call (AFFECT_LIVE), the second parameter is "param->field";
however, for the second call (AFFECT_CONFIG), the second parameter is
"params->field". It seems the "param->field" is correct especially since
each path has a setting of "param" to "&params[i]". Furthermore, there
were a few more instances of using "params[i]" instead of "param->"
which I cleaned up.
Signed-off-by: John Ferlan <jferlan@redhat.com>
After a4431931 the TAP FDs are labeled with the image label instead
of the process label. On the other hand, the commit was
incomplete: a few lines above, there's still the old check for the
presence of the process label, while it should check for the image
label instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
From time to time weird bug reports occur on the list, e.g. [1].
Even though the kernel supports setns syscall, there's an older
glibc in the system that misses a wrapper over the syscall.
Hence, after the configure phase we think there's no setns
support in the system, which is obviously wrong. On the other
hand, we can't rely on linux distributions to provide newer glibc
soon. Therefore we need to introduce the wrapper on our own.
1: https://www.redhat.com/archives/libvir-list/2014-September/msg00492.html
Signed-off-by: Stephan Sachse <ste.sachse@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
virProcessTranslateStatus is used on error paths that should not spoil
the returned error. As the errors are ignored, use the quiet versions of
virAsprintf to create the message.
When using a split UEFI image, it may come in handy if libvirt manages the per
domain _VARS file automatically. While the _CODE file is RO and can be
shared among multiple domains, you certainly don't want to do that on
the _VARS file. This latter one needs to be per domain. So at the
domain startup process, if it's determined that domain needs _VARS
file it's copied from this master _VARS file. The location of the
master file is configurable in qemu.conf.
Temporarily, on a per domain basis, the location of the master NVRAM file
can be overridden by the @template attribute I'm adding to the
<nvram/> element. All it does is hold the path to the master NVRAM file
from which the local copy is created. If that's the case, the map in
qemu.conf is not consulted.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
QEMU now supports UEFI with the following command line:
-drive file=/usr/share/OVMF/OVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on \
-drive file=/usr/share/OVMF/OVMF_VARS.fd,if=pflash,format=raw,unit=1 \
where the first line reflects <loader> and the second one <nvram>.
Moreover, these two lines obsolete the -bios argument.
Note that UEFI is unusable without ACPI. This is handled properly now.
Along with this extension, the variable file is expected to be
writable and hence we need security drivers to label it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Up to now, users can configure BIOS via the <loader/> element. With
the upcoming implementation of UEFI this is not enough as BIOS and
UEFI are conceptually different. For instance, while BIOS is ROM, UEFI
is programmable flash (although all writes to code section are
denied). Therefore we need new attribute @type which will
differentiate the two. Then, new attribute @readonly is introduced to
reflect the fact that some images are RO.
Moreover, the OVMF (which is going to be used mostly), works in two
modes:
1) Code and UEFI variable store is mixed in one file.
2) Code and UEFI variable store is separated in two files
The latter has the advantage of updating the UEFI code without losing the
configuration. However, in order to represent the latter case we need
yet another XML element: <nvram/>. Currently, it has no additional
attributes, it's just a bare element containing path to the variable
store file.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
After the previous commit, migration statistics on the source and
destination hosts are not equal because the destination updated time
statistics. Let's send the result back so that the same data can be
queried on both sides of the migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The total time of a migration and the total downtime transferred from a
source to a destination host do not account for the transfer time to the
destination host or for the time elapsed before guest CPUs are
resumed. Thus, source libvirtd remembers when migration started and when
guest CPUs were paused. Both timestamps are transferred to destination
libvirtd which uses them to compute total migration time and total
downtime. Obviously, this requires the time to be synchronized between
the two hosts. The reported times are useless otherwise, but they would
be equally useless if we didn't do this recomputation, so we don't lose
anything by doing it.
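The recomputation on the destination is essentially two subtractions. A sketch with hypothetical field and function names, with all times as milliseconds since the epoch:

```c
/* Timestamps recorded by the source libvirtd (hypothetical layout). */
struct mig_times {
    unsigned long long started;  /* when the migration started */
    unsigned long long stopped;  /* when the guest CPUs were paused */
};

/* Computes total time and downtime on the destination once guest CPUs
 * are resumed at 'now'. Returns -1 if the timestamps lie in the
 * future, which indicates the host clocks are not synchronized. */
int
recompute_times(const struct mig_times *src, unsigned long long now,
                unsigned long long *total, unsigned long long *downtime)
{
    if (now < src->started || now < src->stopped)
        return -1;
    *total = now - src->started;
    *downtime = now - src->stopped;
    return 0;
}
```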
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When migrating a transient domain or with VIR_MIGRATE_UNDEFINE_SOURCE
flag, the domain may disappear from source host. And so will migration
statistics associated with the domain. We need to transfer the
statistics at the end of a migration so that they can be queried at the
destination host.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
virDomainGetJobStats gains new VIR_DOMAIN_JOB_STATS_COMPLETED flag that
can be used to fetch statistics of a completed job rather than a
currently running job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Job statistics data were tracked in several structures and variables.
Let's make a new qemuDomainJobInfo structure which can be used as a
single source of statistics data as a preparation for storing data about
a completed job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The new blockcopy API wants to reuse only a subset of the disk
hotplug parser - namely, we only care about the embedded
virStorageSourcePtr inside a <disk> XML. Strange as it may
seem, it was easier to just parse an entire disk definition,
then throw away everything but the embedded source, than it
was to disentangle the source parsing code from the rest of
the overall disk parsing function. All that I needed was a
couple of tweaks and a new internal flag that determines
whether the normally-mandatory target element can be
gracefully skipped, since everything else was already optional.
* src/conf/domain_conf.h (virDomainDiskSourceParse): New
prototype.
* src/conf/domain_conf.c (VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE):
New flag.
(virDomainDiskDefParseXML): Honor flag to make target optional.
(virDomainDiskSourceParse): New function.
Signed-off-by: Eric Blake <eblake@redhat.com>
qemu now checks for an invalid address type for a panic device. The
device is currently implemented only for the ISA address type, so any
other type is rejected. Leaving the XML attributes blank is still
allowed; in that case defaults are used (this behaviour is unchanged
from earlier versions).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138125
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When QEMU fails during incoming migration after we successfully started
it (i.e., during Perform or Finish phase), we report a rather unhelpful
message
Unable to read from monitor: Connection reset by peer
We already have code that takes error messages from QEMU's error
output but we disable it once QEMU successfully starts. This patch
postpones this until the end of Finish phase during incoming migration
so that we can report a much better error message:
internal error: early end of file from monitor: possible problem:
Unknown savevm section or instance '0000:00:05.0/virtio-balloon' 0
load of migration failed
https://bugzilla.redhat.com/show_bug.cgi?id=1090093
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Return failure right away when the domain object can't be looked up
instead of jumping to cleanup. This allows removing the condition
before unlocking the domain object.
The code would look up the snapshot object before acquiring the job. This
could lead to a crash as one thread could delete the snapshot object,
while a second thread already had the reference.
Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Creating snapshots modifies the domain state. Currently we wouldn't
enter the job for certain operations although they would modify the
state. Refactor job handling so that everything is covered by an async
job.
According to the docs, for security type='none' libvirt should not
generate a seclabel, be it for selinux or any other model. So, skip the
reservation of labels when type is none.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Fairly straightforward - I got lucky that the generated functions
worked out of the box :)
* src/remote/remote_protocol.x (remote_domain_block_copy_args):
New struct.
(REMOTE_PROC_DOMAIN_BLOCK_COPY): New RPC.
* src/remote/remote_driver.c (remote_driver): Wire it up.
* src/remote_protocol-structs: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
The usual portability fixes; and this includes a fix that adds
a new syntax check for double semicolons (commit 28de556 fixed
some, but gnulib found a better check).
* .gnulib: Update to latest.
* src/xenconfig/xen_common.c (xenFormatConfigCommon): Fix offender.
Signed-off-by: Eric Blake <eblake@redhat.com>
Since 9f781da69d
Resolve a libvirtd crash in virStoragePoolSourceFindDuplicate()
when there is an existing SCSI pool defined with adapter type as
'scsi_host' and defining a new SCSI pool with adapter type as
'fc_host' and parent attribute missing or vice versa.
For example, if there is an existing SCSI pool with adapter type
as 'scsi_host' defined using the following XML
<pool type='scsi'>
<name>TEST_SCSI_POOL</name>
<source>
<adapter type='scsi_host' name='scsi_host1'/>
</source>
<target>
<path>/dev/disk/by-path</path>
</target>
</pool>
Defining another SCSI pool with adapter type as 'fc_host' using the
following XML will then crash libvirtd
<pool type='scsi'>
<name>TEST_SCSI_FC_POOL</name>
<source>
<adapter type='fc_host' wwnn='1234567890abcdef' wwpn='abcdef1234567890'/>
</source>
<target>
<path>/dev/disk/by-path</path>
</target>
</pool>
Same is true for the reverse case as well where there exists a SCSI pool
with adapter type as 'fc_host' and another SCSI pool is defined with
adapter type as 'scsi_host'.
This happens because for fc_host 'name' is an optional attribute, whereas for
scsi_host it's mandatory. However, the check in libvirt for finding duplicate
storage pools didn't take that into account while comparing.
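The commit message does not show the fix itself. As a minimal sketch, assuming the crash came from comparing a 'name' that may be NULL, a NULL-tolerant comparison could look like this. The struct and function names are hypothetical and are not libvirt code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an adapter definition; 'name' may be NULL
 * for fc_host, where the attribute is optional. */
struct example_adapter {
    const char *name;
};

/* NULL-tolerant comparison: a missing name never matches, and strcmp()
 * is only reached when both strings exist. */
static bool example_adapter_name_matches(const struct example_adapter *a,
                                         const struct example_adapter *b)
{
    if (a->name == NULL || b->name == NULL)
        return false;
    return strcmp(a->name, b->name) == 0;
}
```

Treating a missing name as "no match" is a design choice of this sketch; a real duplicate check would go on to compare the remaining adapter attributes.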
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
To date, anyone performing a block copy and pivot ends up with
the destination being treated as <disk type='file'>. While this
works for data access for a block device, it has at least one
noticeable shortcoming: virDomainGetBlockInfo() reports allocation
differently for block devices visited as files (the size of the
device) than for block devices visited as <disk type='block'>
(the maximum sector used, as reported by qemu); and this difference
is significant when trying to manage qcow2 format on block devices
that can be grown as needed.
Of course, the more powerful virDomainBlockCopy() API can already
express the ability to set the <disk> type. But a new API can't
be backported, while a new flag to an existing API can; and it is
also rather inconvenient to have to resort to the full power of
generating XML when just adding a flag to the older call will do
the trick. So this patch enhances blockcopy to let the user flag
when the resulting XML after the copy must list the device as
type='block'.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_REBASE_COPY_DEV):
New flag.
* src/libvirt.c (virDomainBlockRebase): Document it.
* tools/virsh-domain.c (opts_block_copy, blockJobImpl): Add
--blockdev option.
* tools/virsh.pod (blockcopy): Document it.
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Allow new flag.
(qemuDomainBlockCopy): Remember the flag, and make sure it is only
used on actual block devices.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Test it.
Signed-off-by: Eric Blake <eblake@redhat.com>
While reviewing the new virDomainBlockCopy API, Peter Krempa
pointed out that our existing design of using MiB/s for block
job bandwidth is rather coarse, especially since qemu tracks
it in bytes/s; so virDomainBlockCopy only accepts bytes/s.
But once the new API is implemented for qemu, we will be in
the situation where it is possible to set a value that cannot
be accurately reflected back to the user, because the existing
virDomainGetBlockJobInfo defaults to the coarser units.
Fortunately, we have an escape hatch; and one that has already
served us well in the past: we can use the flags argument to
specify which scale to use (see virDomainBlockResize for prior
art). This patch fixes the query side of the API; made easier
by previous patches that split the query side out from the
modification code. Later patches will address the virsh
interface, as well as retrofitting all other blockjob APIs to
also accept a flag for toggling bandwidth units.
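The idea of letting a flag pick the reporting scale can be sketched as follows. The flag name and the round-up behavior are assumptions made for illustration; they are not taken from this commit or from the libvirt headers.

```c
/* Hypothetical flag, for illustration only. */
enum {
    EXAMPLE_BANDWIDTH_BYTES = 1 << 0
};

/* Reports a bandwidth that is tracked internally in bytes/s.
 * With the flag set, the exact value is returned; otherwise it is
 * scaled to MiB/s. Assumption: round up, so that a small non-zero
 * limit is not reported as 0 (which would read as "unlimited"). */
static unsigned long long example_report_bandwidth(unsigned long long bytes_per_sec,
                                                   unsigned int flags)
{
    const unsigned long long mib = 1024ULL * 1024ULL;

    if (flags & EXAMPLE_BANDWIDTH_BYTES)
        return bytes_per_sec;

    return bytes_per_sec / mib + (bytes_per_sec % mib != 0);
}
```

Existing callers that pass 0 for flags keep seeing MiB/s, while new callers can opt in to the finer scale.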
* include/libvirt/libvirt.h.in (_virDomainBlockJobInfo)
(VIR_DOMAIN_BLOCK_COPY_BANDWIDTH): Document sizing issues.
(virDomainBlockJobInfoFlags): New enum.
* src/libvirt.c (virDomainGetBlockJobInfo): Document new flag.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJobInfo): Add parameter.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJobInfo): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJobInfo):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJobInfo)
(qemuMonitorJSONGetBlockJobInfoOne): Likewise. Don't scale here.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Update
callers.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl): Likewise.
(qemuDomainGetBlockJobInfo): Likewise, and support new flag.
Signed-off-by: Eric Blake <eblake@redhat.com>
The previous patch hoisted some bounds checks to the callers;
but someone that is not aware of the hoisted check could now
try passing an integer between LLONG_MAX and ULLONG_MAX. As a
safety measure, add new json conversion modes that let libvirt
error out early instead of passing bad numbers to qemu, if the
caller ever makes a mistake due to later refactoring.
Convert the various blockjob QMP calls to use the new modes,
and switch some of them to be optional (QMP has always supported
an omitted "speed" the same as "speed":0, for everything except
block-job-set-speed).
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMakeCommandRaw):
Add 'j'/'y' and 'J'/'Y' to error out on negative input.
(qemuMonitorJSONDriveMirror, qemuMonitorJSONBlockCommit)
(qemuMonitorJSONBlockJob): Use it.
Signed-off-by: Eric Blake <eblake@redhat.com>
qemu treats blockjob bandwidth as a 64-bit number, in the units
of bytes/second. But we stupidly modeled block job bandwidth
after migration bandwidth, which in turn was an 'unsigned long'
and therefore subject to 32-bit vs. 64-bit interpretations, and
with a scale of MiB/s. Our code already has to convert between
the two scales, and report overflow as appropriate; although
this conversion currently lives in the monitor code. In fact,
our conversion code limited things to 63 bits, because we
checked against LLONG_MAX and rejected what would be negative
bandwidth if treated as signed.
On the bright side, our use of MiB/s means that even with a
32-bit unsigned long, we still have no problem representing a
bandwidth of 2GiB/s, which is starting to be more feasible as
10-gigabit or even faster interfaces are used. And once you
get past the physical speeds of existing interfaces, any larger
bandwidth number behaves the same - effectively unlimited.
But on the low side, the granularity of 1MiB/s tuning is rather
coarse. So the new virDomainBlockJob API decided to go with
a direct 64-bit bytes/sec number instead of the scaled number
that prior blockjob APIs had used. But there is no point in
rounding this number to MiB/s just to scale it back to bytes/s
for handing to qemu.
In order to make future code sharing possible between the old
virDomainBlockRebase and the new virDomainBlockCopy, this patch
moves the scaling and overflow detection into the driver code.
Several of the block job calls that can set speed are fed
through a common interface, so it was easier to adjust all block
jobs at once, for consistency. This patch is just code motion;
there should be no user-visible change in behavior.
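The scaling and overflow detection described above can be sketched like this. It is a hypothetical helper, not the code moved by this patch; it only assumes the 63-bit limit (LLONG_MAX) and the MiB/s to bytes/s scale mentioned in the text.

```c
#include <limits.h>
#include <stdbool.h>

/* Converts a bandwidth in MiB/s to bytes/s. Returns false, leaving
 * *bytes untouched, if the result would exceed LLONG_MAX. */
static bool example_mib_to_bytes(unsigned long long mib,
                                 unsigned long long *bytes)
{
    const unsigned long long scale = 1024ULL * 1024ULL;

    /* Compare before multiplying so the check itself cannot overflow. */
    if (mib > (unsigned long long)LLONG_MAX / scale)
        return false;

    *bytes = mib * scale;
    return true;
}
```

A driver-level caller would report an overflow error when the helper returns false and otherwise hand the bytes/s value on unchanged.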
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob)
(qemuMonitorBlockCommit, qemuMonitorDriveMirror): Change
parameter type and scale.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob)
(qemuMonitorBlockCommit, qemuMonitorDriveMirror): Move scaling
and overflow detection...
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl)
(qemuDomainBlockRebase, qemuDomainBlockCommit): ...here.
(qemuDomainBlockCopy): Use bytes/sec.
Signed-off-by: Eric Blake <eblake@redhat.com>
Another layer of overly-multiplexed code that deserves to be
split into obviously separate paths for query vs. modify.
This continues the cleanup started in commit cefe0ba.
In the process, make some tweaks to simplify the logic when
parsing the JSON reply. There should be no user-visible
semantic changes.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Drop parameter.
(qemuMonitorBlockJobInfo): New prototype.
(BLOCK_JOB_INFO): Drop enum.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob)
(qemuMonitorJSONBlockJobInfo): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Split...
(qemuMonitorBlockJobInfo): ...into second function.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Move
block info portions...
(qemuMonitorJSONGetBlockJobInfo): ...here, and rename...
(qemuMonitorJSONBlockJobInfo): ...and export.
(qemuMonitorJSONGetBlockJobInfoOne): Alter return semantics.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl, qemuDomainGetBlockJobInfo): Adjust
callers.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror)
(qemuMigrationCancelDriveMirror): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit fba6bc4 introduced support for the 'invtsc' feature,
which blocks migration. We should not include it in the
host-model CPU by default, because host-model is intended to be used
with migration.
https://bugzilla.redhat.com/show_bug.cgi?id=1138221
https://bugzilla.redhat.com/show_bug.cgi?id=1027096#c8
There are two ways in which a security model can make its way into
<seclabel/>. One is as the @model attribute, the second one is
via the security_driver knob in qemu.conf. Then, while parsing
<seclabel/>, several checks and fixups of old, stale combinations
are performed, but only if @model is specified. They are not
done in the latter case. So it's still possible to feed libvirt
with senseless combinations (if qemu.conf is adjusted correctly).
One example of a seclabel that needs some adjustment (in case
security_driver=none in qemu.conf) is:
<seclabel type='dynamic' relabel='yes'/>
The fixup code is copied from virSecurityLabelDefParseXML
(covering the former case) into virSecurityLabelDefsParseXML
(which handles the latter case).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The qemu implementation for virDomainGetBlockJobInfo() has a
minor bug: it grabs the qemu job with intent to QEMU_JOB_MODIFY,
which means it cannot be run in parallel with any other
domain-modifying command. Among others, virDomainBlockJobAbort()
is such a modifying command, and it defaults to being
synchronous, and can wait as long as several seconds to ensure
that the job has actually finished. Due to the job rules, this
means a user cannot obtain status about the job during that
timeframe, even though we know that some client management code
exists which is using a polling loop on status to see when a job
finishes.
This bug has been present ever since blockpull support was first
introduced (commit b976165, v0.9.4 in Jul 2011), all because we
stupidly tried to cram too much multiplexing through a single
helper routine, but was made worse in 97c59b9 (v1.2.7) when
BlockJobAbort was fixed to wait longer. It's time to disentangle
some of the mess in qemuDomainBlockJobImpl, and in the process
relax block job query to use QEMU_JOB_QUERY, since it can safely
be used in parallel with any long running modify command.
Technically, there is one case where getting block job info can
modify domain XML - we do snooping to see if a 2-phase job has
transitioned into the second phase, for an optimization in the
case of old qemu that lacked an event for the transition. I
claim this optimization is safe (the jobs are all about modifying
qemu state, not necessarily xml state); but if it proves to be
a problem, we could use the difference between the capabilities
QEMU_CAPS_BLOCKJOB_{ASYNC,SYNC} to determine whether we even
need snooping, and only request a modifying job in the case of
older qemu.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Move info
handling...
(qemuDomainGetBlockJobInfo): ...here, and relax job type.
(qemuDomainBlockJobAbort, qemuDomainBlockJobSetSpeed)
(qemuDomainBlockRebase, qemuDomainBlockPull): Adjust callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
The existing virDomainBlockRebase code rejected the combination of
_RELATIVE and _COPY flags, but only by accident. It makes sense
to add support for the combination someday, at least for the case
of _SHALLOW and not _REUSE_EXT; but to implement it, libvirt would
have to pre-create the file with a relative backing name, and I'm
not ready to code that in yet.
Meanwhile, the code to forward on to the block copy code is getting
longer, and reorganizing the function to have the block pull done
early makes it easier to add even more block copy prep code.
This patch should have no semantic difference other than the quality
of the error message on the unsupported flag combination. Pre-patch:
error: unsupported flags (0x10) in function qemuDomainBlockCopy
Post-patch:
error: argument unsupported: Relative backing during copy not supported yet
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Reorder code,
and improve error message of relative copy.
Signed-off-by: Eric Blake <eblake@redhat.com>
Our style overwhelmingly uses hanging braces (the open brace
hangs at the end of the compound condition, rather than on
its own line), with the primary exception of the top level function
body. Fix the few remaining outliers, before adding a syntax
check in a later patch.
* src/interface/interface_backend_netcf.c (netcfStateReload)
(netcfInterfaceClose, netcf_to_vir_err): Correct use of { in
compound statement.
* src/conf/domain_conf.c (virDomainHostdevDefFormatSubsys)
(virDomainHostdevDefFormatCaps): Likewise.
* src/network/bridge_driver.c (networkAllocateActualDevice):
Likewise.
* src/util/virfile.c (virBuildPathInternal): Likewise.
* src/util/virnetdev.c (virNetDevGetVirtualFunctions): Likewise.
* src/util/virnetdevmacvlan.c
(virNetDevMacVLanVPortProfileCallback): Likewise.
* src/util/virtypedparam.c (virTypedParameterAssign): Likewise.
* src/util/virutil.c (virGetWin32DirectoryRoot)
(virFileWaitForDevices): Likewise.
* src/vbox/vbox_common.c (vboxDumpNetwork): Likewise.
* tests/seclabeltest.c (main): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
I'm about to add a syntax check that enforces our documented
HACKING style of always using matching {} on if-else statements.
This patch focuses on drivers that had several issues.
* src/lxc/lxc_fuse.c (lxcProcGetattr, lxcProcReadMeminfo): Correct
use of {}.
* src/lxc/lxc_driver.c (lxcDomainMergeBlkioDevice): Likewise.
* src/phyp/phyp_driver.c (phypConnectNumOfDomainsGeneric)
(phypUUIDTable_Init, openSSHSession, phypStoragePoolListVolumes)
(phypConnectListStoragePools, phypDomainSetVcpusFlags)
(phypStorageVolGetXMLDesc, phypStoragePoolGetXMLDesc)
(phypConnectListDefinedDomains): Likewise.
* src/vbox/vbox_common.c (vboxAttachSound, vboxDumpDisplay)
(vboxDomainRevertToSnapshot, vboxDomainSnapshotDelete): Likewise.
* src/vbox/vbox_tmpl.c (vboxStorageVolGetXMLDesc): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The HOME environment variable was missing;
set 'HOME=/' as default.
The kernel sets up $HOME for the init process.
Therefore any init can assume that $HOME is set.
libvirt currently violates that implicit rule.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
When FIPS mode is on, gnutls_dh_params_generate2 will fail if 1024 is
specified as the prime's number of bits. A bigger value works in both
cases.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Memory is allocated for 'mnt_src' by VIR_STRDUP in the loop. In the next
iteration it will be allocated again. So we need to free 'mnt_src'
before continuing the loop.
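The leak pattern can be shown in a standalone sketch. The helper below stands in for VIR_STRDUP and the loop body is hypothetical; only the variable name 'mnt_src' comes from the commit.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for VIR_STRDUP: returns a heap copy or NULL. */
static char *example_strdup(const char *src)
{
    size_t len = strlen(src) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, src, len);
    return copy;
}

/* Counts entries starting with 'prefix'. Every path that leaves an
 * iteration, including 'continue', frees the per-iteration copy. */
static size_t example_count_prefix(const char *const *entries, size_t n,
                                   const char *prefix)
{
    size_t matches = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        char *mnt_src = example_strdup(entries[i]);

        if (!mnt_src)
            break;

        if (strncmp(mnt_src, prefix, strlen(prefix)) != 0) {
            free(mnt_src);   /* without this, 'continue' would leak */
            continue;
        }

        matches++;
        free(mnt_src);
    }
    return matches;
}
```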
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Commit 0e1a1a8c introduced umask for virCommand, but the variables
used emit a warning on older compilers about shadowed global
declaration.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Add umask to _virCommand, allowing the user to set a umask for the command.
Set umask(002) for the qemu process to override the default umask
of 022 set by many distros, so that unix sockets created for
virtio-serial have the expected permissions.
Fix problem reported here:
https://sourceware.org/bugzilla/show_bug.cgi?id=13078#c11
https://bugzilla.novell.com/show_bug.cgi?id=888166
When using a virtio-serial device, the unix socket created for the chardev
with the default umask(022) has insufficient permissions.
e.g.:
-device virtio-serial \
-chardev socket,path=/tmp/foo,server,nowait,id=foo \
-device virtserialport,chardev=foo,name=org.fedoraproject.port.0
srwxr-xr-x 1 qemu qemu 0 21. Jul 14:19 /tmp/somefile.sock
Other users in the same group (like real user, test engines, etc)
cannot write to this socket.
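The effect of the two umask values on the socket's permission bits can be worked out with a small sketch. It assumes the socket is requested with mode 0777, which is consistent with the srwxr-xr-x listing above; the function name is hypothetical.

```c
/* Permission bits actually applied to a newly created file or socket:
 * the requested mode with the umask bits cleared. */
static unsigned int example_effective_mode(unsigned int requested,
                                           unsigned int mask)
{
    return requested & ~mask & 0777u;
}
```

With a umask of 022 the result is 0755 (group cannot write); with 002 it is 0775, so other members of the group can write to the socket.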
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, there is one flag passed in during macvtap creation
(withTap) -- Let's convert this field to an unsigned int flag
field for future expansion.
Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The cleanup in commit cf976d9d used secdef->label to label the tap
FDs, but that is not possible since it's a process-only label (svirt_t)
and not an object label (e.g. svirt_image_t). Starting a domain failed
with EPERM, but simply using secdef->imagelabel instead of
secdef->label fixes it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since 1b807f92, connecting with virsh to an already running session
libvirtd fails with:
$ virsh list --all
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to
'/run/user/1000/libvirt/libvirt-sock': Transport endpoint is already
connected
This is caused by a logic error in virNetSocketNewConnectUnix: even if
the connection to the daemon socket succeeded, we still try to spawn the
daemon and then connect to it.
This commit changes the logic to not try to spawn libvirtd if we
successfully connected to its socket.
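The corrected control flow can be sketched with a toy model of the daemon. None of the names below come from virNetSocketNewConnectUnix; they are hypothetical and only illustrate "connect first, spawn only on failure".

```c
#include <stdbool.h>

/* Toy model of the daemon, for illustration only. */
struct example_daemon {
    bool running;
    int spawn_count;
};

static bool example_try_connect(const struct example_daemon *d)
{
    return d->running;
}

static void example_spawn(struct example_daemon *d)
{
    d->running = true;
    d->spawn_count++;
}

/* Connect first; spawn the daemon and retry only if that failed. */
static bool example_connect_autostart(struct example_daemon *d)
{
    if (example_try_connect(d))
        return true;   /* already running: must not spawn again */

    example_spawn(d);
    return example_try_connect(d);
}
```

The logic error described above corresponds to falling through to the spawn-and-retry path even after the first connect attempt had succeeded.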
Most of this commit is whitespace changes, use of -w is recommended to
look at it.
Currently, after calling commands to create a new volume,
virStorageBackendZFSCreateVol calls virStorageBackendZFSFindVols that
calls virStorageBackendZFSParseVol.
virStorageBackendZFSParseVol checks if a volume already exists by
trying to get it using virStorageVolDefFindByName.
For a just created volume it returns NULL, so volume is reported as
new and appended to pool->volumes. This causes a volume to be listed
twice as storageVolCreateXML appends this new volume to the list as
well.
Fix that by passing a new volume definition to
virStorageBackendZFSParseVol so it could determine if it needs to add
this volume to the list.
In qemuDomainSnapshotCreateDiskActive() if we jumped to cleanup from a
failed actions = virJSONValueNewArray(), then 'cfg' would be NULL.
So just return -1, which in turn removes the need for cleanup:
Coverity complained about the following:
(3) Event ptr_arith:
Performing pointer arithmetic on "cur_fd" in expression "cur_fd++".
130 return virNetServerServiceNewFD(*cur_fd++,
The complaint is that pointer arithmetic is taking place instead of the
expected auto increment of the variable. Adding some well placed
parentheses ensures the intended order of operations.
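The precedence difference can be seen in a standalone sketch. The function names are hypothetical; only the expression shape mirrors the line Coverity flagged.

```c
/* Parsed as *(cur_fd++): reads the value, then advances the local
 * pointer. The integer that cur_fd points at is never changed. */
static int example_next_fd_buggy(int *cur_fd)
{
    return *cur_fd++;
}

/* Reads the value, then increments the integer that cur_fd points at. */
static int example_next_fd_fixed(int *cur_fd)
{
    return (*cur_fd)++;
}
```

Both variants return the current value, but only the parenthesized one leaves the caller's counter advanced for the next call.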
For virtio-blk-pci disks with the disk iothread attribute that are
running the correct emulator, add "iothread=iothread#" to the
-device command line in order to enable iothreads for the disk, as
long as the command is available, the disk iothread value provided is
valid, and it is supported for the disk device being added.
Add a new disk "driver" attribute "iothread" to be parsed as the thread
number for the disk to use. In order to more easily facilitate the usage
and configuration of the iothread, a "zero" for the attribute indicates
iothreads are not supported for the device and a positive value indicates
the specific thread to try and use.
Add a new capability to ensure the iothreads feature exists for the qemu
emulator being run - requires the "query-iothreads" QMP command. Using the
domain XML, add the corresponding command argument in order to generate the
threads. The iothreads will use the name space "iothread#", which allows a
future patch adding iothread support to a disk definition to
merely define which of the available threads to use.
Add tests to ensure the xml/argv processing is correct. Note that no
change was made to qemuargv2xmltest.c as processing the -object element
would require knowing more than just iothreads.