As pointed out by jtomko in his review of the IOThreads pinning code:
http://www.redhat.com/archives/libvir-list/2015-March/msg00495.html
there are some comments sprinkled in indicating IOThreads were using
the same structure as the VcpuPin code...
This is the first patch of a few that will change the virDomainVcpuPin*
structures and code to just virDomainPin* - starting with the data
structure naming...
As there are two possible approaches to define a domain's memory size -
one used with legacy, non-NUMA VMs configured in the <memory> element
and per-node based approach on NUMA machines - the user needs to make
sure that both are specified correctly in the NUMA case.
To avoid this burden on the user I'd like to replace the NUMA case with
automatic totaling of the memory size. To achieve this I need to replace
direct access to the virDomainMemtune's 'max_balloon' field with
two separate getters depending on the desired size.
The two sizes are needed as:
1) Startup memory size doesn't include memory modules in some
hypervisors.
2) After startup these count as the usable memory size.
Note that the comments for the functions are future aware and document
state that will be present after a few later patches.
virDomainNetFindIdx no longer returns info whether device was not found,
or there was multiple matches. Additionally it already handle error
reporting. Introduce virDomainHasNet which does a simple task, without
implicit error reporting.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1135491
More or less a virtual copy of the existing virDomainVcpuPin{Add|Del} API's.
NB: The IOThreads implementation "reused" the virDomainVcpuPinDefPtr
since it provided everything necessary - an "id" and a "map" for each
thread id configured.
Since adding the support for scheduler policy settings in commit
8680ea97, there are two enums with the same information. That was
caused by rewriting the patch since first draft.
Find out thanks to clang, but there was no impact whatsoever.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1142631
This patch resolves a situation where the same "<target dev='$name'...>"
can be used for multiple disks in the domain.
While the $name is "mostly" advisory regarding the expected order that
the disk is added to the domain and not guaranteed to map to the device
name in the guest OS, it still should be unique enough such that other
domblk* type operations can be performed.
Without the patch, the domblklist will list the same Target twice:
$ virsh domblklist $dom
Target Source
------------------------------------------------
sda /var/lib/libvirt/images/file.qcow2
sda /var/lib/libvirt/images/file.img
Additionally, getting domblkstat, domblkerror, domblkinfo, and other block*
type calls will not be able to reference the second target.
Fortunately, hotplug disallows adding a "third" sda value:
$ qemu-img create -f raw /var/lib/libvirt/images/file2.img 10M
$ virsh attach-disk $dom /var/lib/libvirt/images/file2.img sda
error: Failed to attach disk
error: operation failed: target sda already exists
$
BUT, it since 'sdb' doesn't exist one would get the following on the same
hotplug attempt, but changing to use 'sdb' instead of 'sda'
$ virsh attach-disk $dom /var/lib/libvirt/images/file2.img sdb
error: Failed to attach disk
error: internal error: unable to execute QEMU command 'device_add': Duplicate ID 'scsi0-0-1' for device
$
Since we cannot fix this issue at parsing time, the best that can be done so
as to not "lose" a domain is to make the check prior to starting the guest
with the results as follows:
$ virsh start $dom
error: Failed to start domain $dom
error: XML error: target 'sda' duplicated for disk sources '/var/lib/libvirt/images/file.qcow2' and '/var/lib/libvirt/images/file.img'
$
Running 'make check' found a few more instances in the tests where this
duplicated target dev value was being used. These also exhibited some
duplicated 'id=' values (negating the uniqueness argument of aliases) in
the corresponding .args file and of course the *xmlout version of a few
input XML files.
At least Xen supports backend drivers in another domain (aka "driver
domain"). This patch introduces an XML config option for specifying the
backend domain name for <disk> and <interface> devices. E.g.
<disk>
<backenddomain name='diskvm'/>
...
</disk>
<interface type='bridge'>
<backenddomain name='netvm'/>
...
</interface>
In the future, same option will be needed for USB devices (hostdev
objects), but for now libxl doesn't have support for PVUSB.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Move the existing virDomainDefNew to virDomainDefNewFull as it's setting
a few things in the conf and re-introduce virDomainDefNew as a function
without parameters for common use.
For a while now there are two places that gather information about NUMA
related guest configuration. While the XML can't be changed we can at
least store the data in one place in the definition.
Rename the numatune_conf.[ch] files to numa_conf as later patches will
move the rest of the definitions from the cpu definition to this one.
Add an XML attribute to allow disabling merge of rx buffers
on the host:
<interface ...>
...
<model type='virtio'/>
<driver ...>
<host mrg_rxbuf='off'/>
</driver>
</interface>
https://bugzilla.redhat.com/show_bug.cgi?id=1186886
In order for QEMU vCPU (and other) threads to run with RT scheduler,
libvirt needs to take care of that so QEMU doesn't have to run privileged.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1178986
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The helpers will be useful when implementing hotplug and coldplug of
random number generator devices.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
When adding devices to the definition it's useful to check whether the
devices don't reside on a conflicting address. This patch adds a helper
that iterates all device info and compares the addresses with the given
info.
Do the allocation first, then add the actual device.
The second part should never fail. This is good
for live hotplug where we don't want to remove the device
on OOM after the monitor command succeeded.
The only change in behavior is that on failure, the
vmdef->consoles array is freed, not just the first console.
Currently when launching the LXC controller we first write out
the plain, inactive XML configuration, then launch the controller,
then replace the file with the live status XML configuration.
By good fortune this hasn't caused any problems other than some
misleading error messages during failure scenarios.
This simplifies the code so it only writes out the XML once and
always writes the live status XML. To do this we need to handshake
with the child process, to make execution pause just before exec()
so we can write the XML status with the child PID present.
Ploop is a pseudo device which makeit possible to access
to an image in a file as a block device. Like loop devices,
but with additional features, like snapshots, write tracker
and without double-caching.
It used in PCS for containers and in OpenVZ. You can manage
ploop devices and images with ploop utility
(http://git.openvz.org/?p=ploop).
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
The virDomainDefineXMLFlags and virDomainCreateXML APIs both
gain new flags allowing them to be told to validate XML.
This updates all the drivers to turn on validation in the
XML parser when the flags are set
The virDomainDefParse* and virDomainDefFormat* methods both
accept the VIR_DOMAIN_XML_* flags defined in the public API,
along with a set of other VIR_DOMAIN_XML_INTERNAL_* flags
defined in domain_conf.c.
This is seriously confusing & error prone for a number of
reasons:
- VIR_DOMAIN_XML_SECURE, VIR_DOMAIN_XML_MIGRATABLE and
VIR_DOMAIN_XML_UPDATE_CPU are only relevant for the
formatting operation
- Some of the VIR_DOMAIN_XML_INTERNAL_* flags only apply
to parse or to format, but not both.
This patch cleanly separates out the flags. There are two
distint VIR_DOMAIN_DEF_PARSE_* and VIR_DOMAIN_DEF_FORMAT_*
flags that are used by the corresponding methods. The
VIR_DOMAIN_XML_* flags received via public API calls must
be converted to the VIR_DOMAIN_DEF_FORMAT_* flags where
needed.
The various calls to virDomainDefParse which hardcoded the
use of the VIR_DOMAIN_XML_INACTIVE flag change to use the
VIR_DOMAIN_DEF_PARSE_INACTIVE flag.
Add the possibility to have more than one IP address configured for a
domain network interface. IP addresses can also have a prefix to define
the corresponding netmask.
Currently, when there is an API that's blocking with locked domain and
second API that's waiting in virDomainObjListFindByUUID() for the domain
lock (with the domain list locked) no other API can be executed on any
domain on the whole hypervisor because all would wait for the domain
list to be locked. This patch adds new optional approach to this in
which the domain is only ref'd (reference counter is incremented)
instead of being locked and is locked *after* the list itself is
unlocked. We might consider only ref'ing the domain in the future and
leaving locking on particular APIs, but that's no tonight's fairy tale.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When one domain is being undefined and at the same time started, for
example, there is a possibility of a rare problem occuring.
- Thread 1 does virDomainUndefine(), has the lock, checks that the
domain is active and because it's not, calls
virDomainObjListRemove().
- Thread 2 does virDomainCreate() and tries to lock the domain.
- Thread 1 needs to lock domain list in order to remove the domain from
it, but must unlock domain first (proper order is to lock domain list
first and the domain itself second).
- Thread 2 grabs the lock, starts the domain and releases the lock.
- Thread 1 grabs the lock and removes the domain from list.
With this patch:
- The undefining domain gets marked as "to undefine" before it is
unlocked.
- If domain is found in any of the search APIs, it's returned only if
it is not marked as "to undefine". The check is done while the
domain is locked.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1150505
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
At the time that the network driver allocates a connection to a
network, the tap device that will be used hasn't yet been created -
that will be done later by qemu (or lxc or whoever) - but if the
network has macTableManager='libvirt', then when we do get around to
creating the tap device, we will need to add an entry for it to the
network bridge's fdb (forwarding database) *and* turn off learning and
unicast_flood for that tap device in the bridge's sysfs settings. This
means that qemu needs to know both the bridge name as well as the
setting of macTableManager, so we either need to create a new API to
retrieve that info, or just pass it back in the ActualNetDef that is
created during networkAllocateActualDevice. We choose the latter
method, since it's already done for the bridge device, and it has the
side effect of making the information available in domain status.
(NB: in the future, I think that the tap device should actually be
created by networkAllocateActualDevice(), as that will solve several
other problems, but that is a battle for another day, and this
information will still be useful outside the network driver)
Resolving symlinks can fail before mounting any file system if one file
system depends on another being mounted. Symlinks are now resolved in
two passes:
* Before any file system is mounted, but then we are more gentle if
the source path can't be accessed
* Right before mounting a file system, so that we are sure that we
have the resolved path... but then if it can't be accessed we raise
an error.
Add attribute to set vgamem_mb parameter of QXL device for QEMU. This
value sets the size of VGA framebuffer for QXL device. Default value in
QEMU is 8MB so reuse it also in libvirt to not break things.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The vram attribute was introduced to set the video memory but it is
usable only for few hypervisors excluding QEMU/KVM and the old XEN
driver. Only in case of QEMU the vram was used for QXL.
This patch updates the documentation to reflect current code in libvirt
and also changes the cases when we will set the default vram attribute.
It also fixes existing strange default value for VGA devices 9MB to 16MB
because the video ram should be rounded to power of two.
The change of default value could affect migrations but I found out that
QEMU always round the video ram to power of two internally so it's safe
to change the default value to the next closest power of two and also
silently correct every domain XML definition. And it's also safe because
we don't pass the value to QEMU.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Get mounted filesystems list, which contains hardware info of disks and its
controllers, from QEMU guest agent 2.2+. Then, convert the hardware info
to corresponding device aliases for the disks.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Some storage systems have internal support for snapshots. Libvirt should
be able to select a correct snapshot when starting a VM.
This patch adds a XML element to select a storage source snapshot for
the RBD protocol which supports this feature.
To track state of virtio channels this patch adds a new output-only
attribute called 'state' to the <target> element of virtio channels.
This will be later populated with the guest state of the channel.
Modify the structure _virDomainBlockIoTuneInfo to support these the new
options.
Change the initialization of the variable expectedInfo in qemumonitorjsontest.c
to avoid compiling problem.
Add documentation about the new xml options
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Commit 01b4de2b9f abstracts virDomainParseMemory()
for use by other functions in domain_conf.c
Extend the same for use, for functions outside of this file.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit 6c9a8a4 (Oct 2014) exposed a long-standing issue on 32-bit
machines: code related to virDomainSetMemoryParameters has always
been documented as using a 64-bit limit, but it was implemented by
calling virDomainParseMemory which enforced an 'unsigned long'
limit. Since VIR_DOMAIN_MEMORY_PARAM_UNLIMITED capped to a
long is -1, but virDomainParseScaledValue no longer accepts
negative values, an attempt to use 2^53-1 as a hard memory limit
started failing the testsuite. However, the problem with capping
things artificially low has existed for much longer - ever since
commits 4888f0fb and 2e22f23 (Mar 2012) switched internal tracking
from 'unsigned long' to 'unsigned long long' (prior to that time,
the cap was a side-effect of the choice of types). We _have_ to
cap the balloon memory values, (no thanks to baked in 'unsigned long'
of API such as virDomainSetMaxMemory or virDomainGetInfo with no
counterpart API that guarantees 64-bit access to those numbers)
but memory parameters have never needed the artificial limit.
At any rate, the solution is to make the parser function gain a
parameter, and only do the reduced 32-bit cap for the values that
are constrained due to API.
* src/conf/domain_conf.h (_virDomainMemtune): Add comments.
* src/conf/domain_conf.c (virDomainParseMemory): Add parameter.
(virDomainDefParseXML): Adjust callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
This new attribute will control whether or not libvirt will pay
attention to guest notifications about changes to network device mac
addresses and receive filters. The default for this is 'no' (for
security reasons). If it is set to 'yes' *and* the specified device
model and connection support it (currently only macvtap+virtio) then
libvirt will watch for NIC_RX_FILTER_CHANGED events, and when it
receives one, it will issue a query-rx-filter command, retrieve the
result, and modify the host-side macvtap interface's mac address and
unicast/multicast filters accordingly.
The functionality behind this attribute will be in a later patch. This
patch merely adds the attribute to the top-level of a domain's
<interface> as well as to <network> and <portgroup>, and adds
documentation and schema/xml2xml tests. Rather than adding even more
test files, I've just added the net attribute in various applicable
places of existing test files.
This patch adds parsing/formatting code as well as documentation for
shared memory devices. This will currently be only accessible in QEMU
using it's ivshmem device, but is designed as generic as possible to
allow future expansion for other hypervisors.
In the devices section in the domain XML users may specify:
- For shmem device using a server:
<shmem name='shmem0'>
<server path='/tmp/socket-ivshmem0'/>
<size unit='M'>32</size>
<msi vectors='32' ioeventfd='on'/>
</shmem>
- For ivshmem device not using an ivshmem server:
<shmem name='shmem1'>
<size unit='M'>32</size>
</shmem>
Most of the configuration is made optional so it also allows
specifications like:
<shmem name='shmem1/>
<shmem name='shmem2'>
<server/>
</shmem>
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Add options for tuning segment offloading:
<driver>
<host csum='off' gso='off' tso4='off' tso6='off'
ecn='off' ufo='off'/>
<guest csum='off' tso4='off' tso6='off' ecn='off' ufo='off'/>
</driver>
which control the respective host_ and guest_ properties
of the virtio-net device.
Cleanup virDomanDef structure from other nested structure and create
separate type definition for them.
Fix a typo in virDomainHugePage.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>