As pointed out by jtomko in his review of the IOThreads pinning code:
http://www.redhat.com/archives/libvir-list/2015-March/msg00495.html
there are some comments sprinkled in indicating IOThreads were using
the same structure as the VcpuPin code...
This is the first patch of a few that will change the virDomainVcpuPin*
structures and code to just virDomainPin* - starting with the data
structure naming...
Now that the size of guest's memory can be inferred from the NUMA
configuration (if present) make it optional to specify <memory>
explicitly.
To make sure that memory is specified add a check that some form of
memory size was specified. One side effect of this change is that it is
no longer possible to specify 0KiB as memory size for the VM, but I
don't think it would be any useful to do so. (I can imagine embedded
systems without memory, just registers, but that's far from what libvirt
is usually doing).
Forbidding 0 memory for guests also fixes a few corner cases where 0 was
not interpreted correctly and caused failures. (Arguments for numad when
using automatic placement, size of the balloon). This fixes problems
described in https://bugzilla.redhat.com/show_bug.cgi?id=1161461
Test case changes are added to verify that the schema change and code
behave correctly.
Use the NUMA total instead of the configured size both in XML and for
uses in the code once NUMA is enabled for a domain.
One test case change is necessary as the rounding of the individual cell
sizes was not matching the rounding of the total size.
As there are two possible approaches to define a domain's memory size -
one used with legacy, non-NUMA VMs configured in the <memory> element
and per-node based approach on NUMA machines - the user needs to make
sure that both are specified correctly in the NUMA case.
To avoid this burden on the user I'd like to replace the NUMA case with
automatic totaling of the memory size. To achieve this I need to replace
direct access to the virDomainMemtune's 'max_balloon' field with
two separate getters depending on the desired size.
The two sizes are needed as:
1) Startup memory size doesn't include memory modules in some
hypervisors.
2) After startup these count as the usable memory size.
Note that the comments for the functions are future aware and document
state that will be present after a few later patches.
virDomainNetFindIdx no longer returns info whether device was not found,
or there was multiple matches. Additionally it already handle error
reporting. Introduce virDomainHasNet which does a simple task, without
implicit error reporting.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
A helper that never returns an error and treats bits out of bitmap range
as false.
Use it everywhere we use ignore_value on virBitmapGetBit, or loop over
the bitmap size.
https://bugzilla.redhat.com/show_bug.cgi?id=1135491
More or less a virtual copy of the existing virDomainVcpuPin{Add|Del} API's.
NB: The IOThreads implementation "reused" the virDomainVcpuPinDefPtr
since it provided everything necessary - an "id" and a "map" for each
thread id configured.
This patch turns both virNetworkObjFindByUUID() and
virNetworkObjFindByName() to return an referenced object so that
even if caller unlocks it, it's for sure that object won't
disappear meanwhile. Especially if the object (in general) is
locked and unlocked during the caller run.
Moreover, this commit is nicely small, since the object unrefing
can be done in virNetworkObjEndAPI().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Every API that touches internal structure of the object must lock
the object first. Not every API that has the object as an
argument needs to do that though. Some APIs just pass the object
to lower layers which, however, must lock the object then. Look
at the code, you'll get my meaning soon.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is going to be needed later, when some functions already
have the virNetworkObjList object already locked and need to
lookup a object to work on. As an example of such function is
virNetworkAssignDef(). The other use case might be in
virNetworkObjListForEach() callback.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Later we can turn APIs to lock the object if needed instead of
relying on caller to mutually exclude itself (probably done by
locking a big lock anyway).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is practically copy of qemuDomObjEndAPI. The reason why is
it so widely available is to avoid code duplication, since the
function is going to be called from our bridge driver, test
driver and parallels driver too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far it's just a structure which happens to have 'Obj' in its
name, but otherwise it not related to virObject at all. No
reference counting, not virObjectLock(), nothing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If a domain object is being removed and looked up concurrently we must
ensure we unlock the object before unreferencing it, since the latter
might free the object.
The flaw was introduced in commit feb1a4d792.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Undefining a running, autostarted domain removes the autostart link, but
dom->autostart is not cleared. If the domain is subsequently redefined,
libvirt thinks it is already autostarted and will not create the link
even if requested:
# virsh dominfo example | grep Autostart
Autostart: enable
# ls /etc/libvirt/qemu/autostart/example.xml
/etc/libvirt/qemu/autostart/example.xml
# virsh undefine example
Domain example has been undefined
# virsh define example.xml
Domain example defined from example.xml
# virsh dominfo example | grep Autostart
Autostart: enable
# virsh autostart example
Domain example marked as autostarted
# ls /etc/libvirt/qemu/autostart/example.xml
ls: cannot access /etc/libvirt/qemu/autostart/example.xml: No such file or directory
This commit ensures dom->autostart is cleared whenever the config and
autostart link (if present) are removed.
The bridge network driver cleared this flag itself in networkUndefine.
This commit moves this into virNetworkDeleteConfig for symmetry with
virDomainDeleteConfig, and to ensure it is not missed in future network
drivers.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Well, one day this will be self-locking object, but not today.
But lets prepare the code for that! Moreover,
virNetworkObjListFree() is no longer needed, so turn it into
virNetworkObjListDispose().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This API will be used in the future to call passed callback over
each network object in the list. It's slightly different to its
virDomainObjListForEach counterpart, because virDomainObjList
uses a hash table to store domain object, while virNetworkObjList
uses an array.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We have something like pvpanic device. However, in some cases it does
not have any address assigned, in which case we produce this ugly XML
(still valid though):
<devices>
<emulator>/usr/bin/qemu</emulator>
...
<panic>
</panic>
</devices>
Lets format "<panic/>" instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There was a mess in the way how we store unlimited value for memory
limits and how we handled values provided by user. Internally there
were two possible ways how to store unlimited value: as 0 value or as
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED. Because we chose to store memory
limits as unsigned long long, we cannot use -1 to represent unlimited.
It's much easier for us to say that everything greater than
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED means unlimited and leave 0 as valid
value despite that it makes no sense to set limit to 0.
Remove unnecessary function virCompareLimitUlong. The update of test
is to prevent the 0 to be miss-used as unlimited in future.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1146539
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Since the APIs support just one element per namespace and while
modifying an element all duplicates would be removed, let's do this
right away in the post parse callback.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1190590
Adding functionality to libvirt that will allow it
query the ethtool interface for the availability
of certain NIC HW offload features
Here is an example of the feature XML definition:
<device>
<name>net_eth4_90_e2_ba_5e_a5_45</name>
<path>/sys/devices/pci0000:00/0000:00:03.0/0000:08:00.1/net/eth4</path>
<parent>pci_0000_08_00_1</parent>
<capability type='net'>
<interface>eth4</interface>
<address>90:e2:ba:5e:a5:45</address>
<link speed='10000' state='up'/>
<feature name='rx'/>
<feature name='tx'/>
<feature name='sg'/>
<feature name='tso'/>
<feature name='gso'/>
<feature name='gro'/>
<feature name='rxvlan'/>
<feature name='txvlan'/>
<feature name='rxhash'/>
<capability type='80203'/>
</capability>
</device>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Instead of copying the whole object onto stack when calling the
function, just pass the pointer to the object and save up some
space on the stack. Moreover, this prepares the code to hide the
virNetworkObjList structure into network_conf.c and use accessors
only.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
All of our vir*Free() functions should accept NULL, even though
that there's no way of actually passing NULL with current code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is probably a copy-paste error from virDomainObj*
counterpart. But when speaking of virNetworkObj we should use
variable @nets for an array of networks, rather than @doms. It's
just confusing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since adding the support for scheduler policy settings in commit
8680ea97, there are two enums with the same information. That was
caused by rewriting the patch since first draft.
Find out thanks to clang, but there was no impact whatsoever.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1142631
This patch resolves a situation where the same "<target dev='$name'...>"
can be used for multiple disks in the domain.
While the $name is "mostly" advisory regarding the expected order that
the disk is added to the domain and not guaranteed to map to the device
name in the guest OS, it still should be unique enough such that other
domblk* type operations can be performed.
Without the patch, the domblklist will list the same Target twice:
$ virsh domblklist $dom
Target Source
------------------------------------------------
sda /var/lib/libvirt/images/file.qcow2
sda /var/lib/libvirt/images/file.img
Additionally, getting domblkstat, domblkerror, domblkinfo, and other block*
type calls will not be able to reference the second target.
Fortunately, hotplug disallows adding a "third" sda value:
$ qemu-img create -f raw /var/lib/libvirt/images/file2.img 10M
$ virsh attach-disk $dom /var/lib/libvirt/images/file2.img sda
error: Failed to attach disk
error: operation failed: target sda already exists
$
BUT, it since 'sdb' doesn't exist one would get the following on the same
hotplug attempt, but changing to use 'sdb' instead of 'sda'
$ virsh attach-disk $dom /var/lib/libvirt/images/file2.img sdb
error: Failed to attach disk
error: internal error: unable to execute QEMU command 'device_add': Duplicate ID 'scsi0-0-1' for device
$
Since we cannot fix this issue at parsing time, the best that can be done so
as to not "lose" a domain is to make the check prior to starting the guest
with the results as follows:
$ virsh start $dom
error: Failed to start domain $dom
error: XML error: target 'sda' duplicated for disk sources '/var/lib/libvirt/images/file.qcow2' and '/var/lib/libvirt/images/file.img'
$
Running 'make check' found a few more instances in the tests where this
duplicated target dev value was being used. These also exhibited some
duplicated 'id=' values (negating the uniqueness argument of aliases) in
the corresponding .args file and of course the *xmlout version of a few
input XML files.
Add VIR_VOL_XML_PARSE_OPT_CAPACITY flag to virStorageVolDefParseXML.
With this flag, no error is reported when the capacity is missing
if there is a backing store.
Commit 6992994 started filling the listen attribute
of the parent <graphics> elements from type='network' listens.
When this XML is passed to UpdateDevice, parsing fails:
XML error: graphics listen attribute 10.20.30.40 must match
address attribute of first listen element (found none)
Ignore the address in the parent <graphics> attribute
when no type='address' listens are found,
the same we ignore the address for the <listen> subelements
when parsing inactive XML.
In virNetworkDHCPHostDefParseXML an error is reported
when partialOkay == true, and none of ip, mac, name
were supplied.
Add the missing goto and error out in this case.
https://bugzilla.redhat.com/show_bug.cgi?id=1196503
We already check whether the host id is valid or not, add a jump
to forbid invalid host id.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
libvirt was unconditionally calling virNetDevBandwidthClear() for
every interface (and network bridge) of a type that supported
bandwidth, whether it actually had anything set or not. This doesn't
hurt anything (unless ifname == NULL!), but is wasteful.
This patch makes sure that all calls to virNetDevBandwidthClear() are
qualified by checking that the interface really had some bandwidth
setup done, and checks for a null ifname inside
virNetDevBandwidthClear(), silently returning success if it is null
(as well as removing the ATTRIBUTE_NONNULL from that function's
prototype, since we can't guarantee that it is never null,
e.g. sometimes a type='ethernet' interface has no ifname as it is
provided on the fly by qemu).
Well, not that we are not formatting invalid XML, rather than not as
beautiful as we can:
<cpu mode='host-passthrough'>
</cpu>
If there are no children, let's use the singleton element.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Well, so far there are no variables to free, no cleanup work needed on
an error, so bare 'return -1;' after each error is just okay. But this
will change in a while.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1151942
While the restriction doesn't have origin in any RFC, it matters
to us while constructing the dnsmasq config file (or command line
previously). For better picture, this is how the corresponding
part of network XML look like:
<dns>
<forwarder addr='8.8.4.4'/>
<txt name='example' value='example value'/>
</dns>
And this is how the config file looks like then:
server=8.8.4.4
txt-record=example,example value
Now we can see why there can't be any commas in the TXT name.
They are used by dnsmasq to separate @name and @value.
Funny, we have it in the documentation, but the code (which was
pushed back in 2011) didn't reflect that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
At least Xen supports backend drivers in another domain (aka "driver
domain"). This patch introduces an XML config option for specifying the
backend domain name for <disk> and <interface> devices. E.g.
<disk>
<backenddomain name='diskvm'/>
...
</disk>
<interface type='bridge'>
<backenddomain name='netvm'/>
...
</interface>
In the future, same option will be needed for USB devices (hostdev
objects), but for now libxl doesn't have support for PVUSB.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
The function that parses the <forward> subelement of a network used to
fail/log an error if the network definition contained both a <pf>
element as well as at least one <interface> or <address> element. That
check was present because the configuration of a network should have
either one <pf>, one or more <interface>, or one or more <address>,
but never combinations of multiple kinds.
This caused a problem when libvirtd was restarted with a network
already active - when a network with a <pf> element is started, the
referenced PF (Physical Function of an SRIOV-capable network card) is
checked for VFs (Virtual Functions), and the <forward> is filled in
with a list of all VFs for that PF either in the form of their PCI
addresses (a list of <address>) or their netdev names (a list of
<interface>); the <pf> element is not removed though. When libvirtd is
restarted, it parses the network status and finds both the original
<pf> from the config, as well as the list of either <address> or
<interface>, fails the parse, and the network is not added to the
active list. This failure is often obscured because the network is
marked as autostart so libvirt immediately restarts it.
It seems odd to me that <interface> and <address> are stored in the
same array rather than keeping two separate arrays, and having
separate arrays would have made the check much simpler. However,
changing to use two separate arrays would have required changes in
more places, potentially creating more conflicts and (more
importantly) more possible regressions in the event of a backport, so
I chose to keep the existing data structure in order to localize the
change.
It appears that this problem has been in the code ever since support
for <pf> was added (0.9.10), but until commit
34cc3b2f10 (first in libvirt 1.2.4)
networks with interface pools were not properly marked as active on
restart anyway, so there is no point in backporting this patch any
further than that.
Later patches will need to access the full definition to do check the
memory size and thus the checking needs to be done after the whole
definition including devices is known.
For historical reasons data regarding NUMA configuration were split
between the CPU definition and numatune. We cannot do anything about the
XML still being split, but we certainly can at least store the relevant
data in one place.
This patch moves the NUMA stuff to the right place.
As virDomainNumatuneSet now doesn't allocate the virDomainNuma object
any longer it's not necessary to pass the pointer to a pointer to store
the object as it will not change any longer.
While touching the parameter definitions I've also changed the name of
the parameter to "numa".
Since our formatter now handles well if the config is allocated and not
filled we can safely always-allocate the NUMA config and remove the
ad-hoc allocation code.
This will help in later patches as the parser will be refactored to just
fill the data.
Move the existing virDomainDefNew to virDomainDefNewFull as it's setting
a few things in the conf and re-introduce virDomainDefNew as a function
without parameters for common use.
Do a content-aware check if formatting of the <numatune> element is
necessary. Later on the def->numa structure will be always present so we
cannot decide only on the basis whether it's allocated.
Shuffling around the logic will allow to simplify the code quite a bit.
As an additional bonus the change in the logic now reports an error if
automatic placement is selected and individual placement is configured.
Currently the code would exit without reporting an error as
virBitmapParse reports one only if it fails to parse the bitmap, whereas
the code was jumping to the error label even in case 0 cpus were
correctly parsed in the map.
It's easier to recalculate the number in the one place it's used as
having a separate variable to track it. It will also help with moving
the NUMA code to the separate module.
Name it virNumaMemAccess and add it to conf/numa_conf.[ch]
Note that to avoid a circular dependency the type of the NUMA cell
memAccess variable was changed to int. It will be turned back later
after the circular dependency will not exist.
The mask was stored both as a bitmap and as a string. The string is used
for XML output only. Remove the string, as it can be reconstructed from
the bitmap.
The test change is necessary as the bitmap formatter doesn't "optimize"
using the '^' operator.
Rewrite the function to save a few local variables and reorder the code
to make more sense.
Additionally the ncells_max member of the virCPUDef structure is used
only for tracking allocation when parsing the numa definition, which can
be avoided by switching to VIR_ALLOC_N as the array is not resized
after initial allocation.
For weird historical reasons NUMA cells are added as a subelement of
<cpu> while the actual configuration is done in <numatune>.
This patch splits out the cell parser code from cpu config to NUMA
config. Note that the changes to the code are minimal just to make it
work and the function will be refactored in the next patch.
For a while now there are two places that gather information about NUMA
related guest configuration. While the XML can't be changed we can at
least store the data in one place in the definition.
Rename the numatune_conf.[ch] files to numa_conf as later patches will
move the rest of the definitions from the cpu definition to this one.
Not all files we want to find using virFileFindResource{,Full} are
generated when libvirt is built, some of them (such as RNG schemas) are
distributed with sources. The current API was not able to find source
files if libvirt was built in VPATH.
Both RNG schemas and cpu_map.xml are distributed in source tarball.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Add an XML attribute to allow disabling merge of rx buffers
on the host:
<interface ...>
...
<model type='virtio'/>
<driver ...>
<host mrg_rxbuf='off'/>
</driver>
</interface>
https://bugzilla.redhat.com/show_bug.cgi?id=1186886
In order for QEMU vCPU (and other) threads to run with RT scheduler,
libvirt needs to take care of that so QEMU doesn't have to run privileged.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1178986
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Prior to commit 7d5bf48474 (first appearing in libvirt 1.2.2), the
status XML of a domain's interface was missing a lot of important
information; mainly it just output the config of the interface, plus
the name of the tap device and qemu device alias. Commit 7d5bf48474
changed the status XML to include many important bits of information
that were required to make network "hook" scripts useful - bandwidth
information, vlan tag, the name of the bridge (or physical device in
the case of macvtap) that the tap/macvtap device was attached to - the
commit log for 7d5bf48474 has a very detailed explanation of the
change. For quick reference - in the example given there, prior to the
change, status XML looked like figure [C]:
<interface type='network'>
<source network='testnet' portgroup='admin'/>
<target dev='macvtap0'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
and after the change, it looked like figure [E]:
<interface type='direct'>
<source dev='p4p1_0' mode='bridge'/>
<bandwidth>
<inbound average='1000' peak='5000' burst='1024'/>
<outbound average='128' peak='256' burst='256'/>
</bandwidth>
<target dev='macvtap0'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
You'll notice that bandwidth info, physdev, and macvtap mode have been
added, but the network and portgroup names are now missing - I didn't
think that this information was of any use once the needed
bandwidth/vlan/etc config had been pulled from the network/portgroup.
I was wrong.
A few months after that change a user on IRC asked what happened to
portgroup in the status XML and described how he used it (more or less
as a tag to decide what external information to use in a hook script
that was run at startup/migration time - see
http://wiki.libvirt.org/page/OVS_and_PVLANS ). At that time I planned
to make a patch to re-add portgroup, but life intervened as that was
just prior to a transatlantic move involving several weeks of
"vacation". During this time I somehow forgot to make the patch, and
also mistakenly remembered that I *had* made it.
Subsequent to this, as a part of mprivozn's work to add support for
network-specific hooks, I did re-add the output of the network name in
status XML, but once again completely forgot about portgroup. This was
in commit a3609121 (first appearing in libvirt 1.2.11). This made the
status XML from the above example look like this:
<interface type='direct'>
<source network='testnet' dev='p4p1_0' mode='bridge'/>
<bandwidth>
<inbound average='1000' peak='5000' burst='1024'/>
<outbound average='128' peak='256' burst='256'/>
</bandwidth>
<target dev='macvtap0'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
*This* patch just adds the portgroup back to the status XML, so the
same example interface will look like this:
<interface type='direct'>
<source network='testnet' portgroup='admin'
dev='p4p1_0' mode='bridge'/>
<bandwidth>
<inbound average='1000' peak='5000' burst='1024'/>
<outbound average='128' peak='256' burst='256'/>
</bandwidth>
<target dev='macvtap0'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
The result is that the status XML now contains all information about
how the interface is setup (bandwidth, physical device, tap device,
etc), in addition to pointers to its origin (the network and
portgroup).
virDomainGraphicsListenSetAddress() and
virDomainGraphicsListenSetNetwork() both set their respective char* to
NULL directly when asked to set it to NULL, which is okay as long as
it's already set to NULL. If these functions are ever called to clear
a listen object that has a valid string in address or network, it will
end up leaking the old value. Currently that doesn't happen, so this
is just a preemptive strike.
Prior to 0.9.4, libvirt only supported a single listen, and it had to
be an IP address:
<graphics listen='1.2.3.4' ..../>
Starting with 0.9.4, a graphics element could have a <listen>
subelement (actually the grammar supports multiples, but all of the
drivers only support a single <listen> per <graphics>), and that
listen element can be of type='address' or type='network'. For
type='address', <listen> also has an attribute called 'address' which
contains the IP address for listening:
<graphics ....>
<listen type='address' address='1.2.3.4' .../>
</graphics>
type can also be "network", and in that case listen will have a
"network" attribute which will contain the name of a libvirt
network:
<graphics ....>
<listen type='network' network='testnet' .../>
</graphics>
At domain start (or migrate) time, libvirt will attempt to
find an IP address associated with that network (e.g. the IP address
of the bridge device used by the network, or the physical device
listed in <forward dev='physdev'/>) and fill in that address in the
status XML:
<graphics ....>
<listen type='network' network='testnet' address='1.2.3.4' .../>
</graphics>
In the case that a <graphics> element has a <listen> subelement of
type='address', that listen subelement's "address" attribute is
backfilled into the parent graphics element's "listen" *attribute* for
backward compatibility (so that a management application unaware of
the separate <listen> element can still learn the listen
address). This backfill should be done with the IP learned from
type='network' as well, and that's what this patch does:
<graphics listen='1.2.3.4' ....>
<listen type='network' network='testnet' address='1.2.3.4' .../>
</graphics>
This is a continuation of the fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=1191016
The function virDomainVcpuPinDel() used vcpupin_list to stand for
def->cputune.vcpupin, which made the codes more readable.
However, in this function, it will realloc vcpupin_list later.
As the definition of realloc(), it may free vcpupin_list and then
points it to a new-realloced address, but def->cputune.vcpupin doesn't
point to the new address(it's freed however).
Thus,
1) When we refer to the def->cputune.vcpupin afterwards, which was freed
by realloc(), an INVALID READ occurs, and libvirtd may crash.
2) As no one will use vcpupin_list any more, and no one frees it(it's just
alloced by realloc()), memory leak occurs.
Part of the valgrind logs are shown as below:
==1837== Thread 15:
==1837== Invalid read of size 8
==1837== at 0x5367337: virDomainDefFormatInternal (domain_conf.c:18392)
which is : virBufferAsprintf(buf, "<vcpupin vcpu='%u' ",
def->cputune.vcpupin[i]->vcpuid);
==1837== by 0x536966C: virDomainObjFormat (domain_conf.c:18970)
==1837== by 0x5369743: virDomainSaveStatus (domain_conf.c:19166)
==1837== by 0x117B26DC: qemuDomainPinVcpuFlags (qemu_driver.c:4586)
==1837== by 0x53EA313: virDomainPinVcpuFlags (libvirt.c:9803)
==1837== by 0x14CB7D: remoteDispatchDomainPinVcpuFlags (remote_dispatch.h:6762)
==1837== by 0x14CC81: remoteDispatchDomainPinVcpuFlagsHelper (remote_dispatch.h:6740)
==1837== by 0x5464C30: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==1837== by 0x546507A: virNetServerProgramDispatch (virnetserverprogram.c:307)
==1837== by 0x171B83: virNetServerProcessMsg (virnetserver.c:172)
==1837== by 0x171E6E: virNetServerHandleJob (virnetserver.c:193)
==1837== by 0x5318E78: virThreadPoolWorker (virthreadpool.c:145)
==1837== Address 0x12ea2870 is 0 bytes inside a block of size 16 free'd
==1837== at 0x4C291AC: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==1837== by 0x52A3D14: virReallocN (viralloc.c:245)
==1837== by 0x52A3DFB: virShrinkN (viralloc.c:372)
==1837== by 0x52A3F57: virDeleteElementsN (viralloc.c:503)
==1837== by 0x533939E: virDomainVcpuPinDel (domain_conf.c:15405) //doReset为true时才会进到。
==1837== by 0x117B2642: qemuDomainPinVcpuFlags (qemu_driver.c:4573)
==1837== by 0x53EA313: virDomainPinVcpuFlags (libvirt.c:9803)
==1837== by 0x14CB7D: remoteDispatchDomainPinVcpuFlags (remote_dispatch.h:6762)
==1837== by 0x14CC81: remoteDispatchDomainPinVcpuFlagsHelper (remote_dispatch.h:6740)
==1837== by 0x5464C30: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==1837== by 0x546507A: virNetServerProgramDispatch (virnetserverprogram.c:307)
==1837== by 0x171B83: virNetServerProcessMsg (virnetserver.c:172)
Steps to reproduce the problem:
1) use virDomainPinVcpuFlags() to pin a guest's vcpu to all the pcpus
of the host.
This patch uses def->cputune.vcpupin instead of vcpupin_list to do the
realloc() job, to avoid invalid read or memory leaking.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Yue Wenyuan <yuewenyuan@huawei.com@huawei.com>
The helpers will be useful when implementing hotplug and coldplug of
random number generator devices.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
When adding devices to the definition it's useful to check whether the
devices don't reside on a conflicting address. This patch adds a helper
that iterates all device info and compares the addresses with the given
info.
It is only supported for virtio adapters.
Silently drop it if it was specified for other models,
as is done for other virtio attributes.
Also mention this in the documentation.
https://bugzilla.redhat.com/show_bug.cgi?id=1147195
Commit id '652a2ec6' introduced two new node device capability flags
and the ability to use those flags as a way to search for a specific
subset of a 'scsi_host' device - namely a 'fc_host' and/or 'vports'.
The code modified the virNodeDeviceCapMatch whichs allows for searching
using the 'virsh nodedev-list [cap]' via virConnectListAllNodeDevices.
However, the original patches did not account for other searches for
the same capability key from virNodeListDevices using virNodeDeviceHasCap.
Since 'fc_host' and 'vports' are self defined bits of a 'scsi_host'
device mere string comparison against the basic/root type is not
sufficient.
This patch adds the check for the 'fc_host' and 'vports' bits within
a 'scsi_host' device and allows the following python code to find the
capabilities for the device:
import libvirt
conn = libvirt.openReadOnly('qemu:///system')
fc = conn.listDevices('fc_host', 0)
print(fc)
fc = conn.listDevices('vports', 0)
print(fc)
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Add the missing jump to thje error label. The error message shouldn't
ever be triggered though as it's called only on pre-selected nodes.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Extract the logic to determine which nodeset has to be used for a domain
from the formatting step so that it can be reused separately when the
nodeset is used in a different way.
https://bugzilla.redhat.com/show_bug.cgi?id=1170492
In one of our previous commits (dc8b7ce7) we've done a functional
change even though it was intended as pure refactor. The problem is,
that the following XML:
<vcpu placement='static' current='2'>6</vcpu>
<cputune>
<emulatorpin cpuset='1-3'/>
</cputune>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
gets translated into this one:
<vcpu placement='auto' current='2'>6</vcpu>
<cputune>
<emulatorpin cpuset='1-3'/>
</cputune>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
We should not change the vcpu placement mode. Moreover, we're doing
something similar in case of emulatorpin and iothreadpin. If they were
set, but vcpu placement was auto, we've mistakenly removed them from
the domain XML even though we are able to set them independently on
vcpus.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Do the allocation first, then add the actual device.
The second part should never fail. This is good
for live hotplug where we don't want to remove the device
on OOM after the monitor command succeeded.
The only change in behavior is that on failure, the
vmdef->consoles array is freed, not just the first console.
Currently when launching the LXC controller we first write out
the plain, inactive XML configuration, then launch the controller,
then replace the file with the live status XML configuration.
By good fortune this hasn't caused any problems other than some
misleading error messages during failure scenarios.
This simplifies the code so it only writes out the XML once and
always writes the live status XML. To do this we need to handshake
with the child process, to make execution pause just before exec()
so we can write the XML status with the child PID present.
Previously the function returned either -1 in case of an error or 0 on
success. However, we should also distinguish between a case we
successfully added a controller and a case there wasn't a need to add any
controller
This adds a new "localOnly" attribute on the domain element of the
network xml. With this set to "yes", DNS requests under that domain
will only be resolved by libvirt's dnsmasq, never forwarded upstream.
This was how it worked before commit f69a6b987d, and I found that
functionality useful. For example, I have my host's NetworkManager
dnsmasq configured to forward that domain to libvirt's dnsmasq, so I can
easily resolve guest names from outside. But if libvirt's dnsmasq
doesn't know a name and forwards it to the host, I'd get an endless
forwarding loop. Now I can set localOnly="yes" to prevent the loop.
Signed-off-by: Josh Stone <jistone@redhat.com>
Ploop is a pseudo device which makeit possible to access
to an image in a file as a block device. Like loop devices,
but with additional features, like snapshots, write tracker
and without double-caching.
It used in PCS for containers and in OpenVZ. You can manage
ploop devices and images with ploop utility
(http://git.openvz.org/?p=ploop).
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Commit id 'ca481a6f' added virNetworkRouteDefFree which may be called
in an error path from lxcAddNetworkRouteDefinition with 'route = NULL'.
So just add the (!def) at the top to resolve.
Commit id 'aa2cc721' added calls to virSocketAddrFormat but did not
check for a NULL (error) return which could lead to bad output
in the XML file. Need to check for NULL return and cause failure.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Moving code for parsing and formatting network routes to
networkcommon_conf helps reusing those routes for domains. The route
definition has been hidden to help reducing the number of unnecessary
checks in the format function.
https://bugzilla.redhat.com/show_bug.cgi?id=1182486
When updating a network and adding new ip-dhcp-host entry, the deamon
may crash. The problem is, we iterate over existing <host/> entries
trying to compare MAC addresses to see if there's already an existing
rule. However, not all entries are required to have MAC address. For
instance, the following is perfectly valid entry:
<host id='00:04:58:fd:e4:15:1b:09:4c:0e:09:af:e4:d3:8c:b8:ca:1e'
name='redhatipv6.redhat.com' ip='2001:db8:ca2:2::119'/>
When the checking loop iterates over this, the entry's MAC address is
accessed directly. Well, the fix is obvious - check if the address is
defined before trying to compare it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The virDomainDefineXMLFlags and virDomainCreateXML APIs both
gain new flags allowing them to be told to validate XML.
This updates all the drivers to turn on validation in the
XML parser when the flags are set
There's this function virNetDevBandwidthParse which parses the
bandwidth XML snippet. But it's not clever much. For the
following XML it allocates the virNetDevBandwidth structure even
though it's completely empty:
<bandwidth>
</bandwidth>
Later in the code there are some places where we check if
bandwidth was set or not. And since we obtained pointer from the
parsing function we think that it is when in fact it isn't.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The virDomainDefParse* and virDomainDefFormat* methods both
accept the VIR_DOMAIN_XML_* flags defined in the public API,
along with a set of other VIR_DOMAIN_XML_INTERNAL_* flags
defined in domain_conf.c.
This is seriously confusing & error prone for a number of
reasons:
- VIR_DOMAIN_XML_SECURE, VIR_DOMAIN_XML_MIGRATABLE and
VIR_DOMAIN_XML_UPDATE_CPU are only relevant for the
formatting operation
- Some of the VIR_DOMAIN_XML_INTERNAL_* flags only apply
to parse or to format, but not both.
This patch cleanly separates out the flags. There are two
distint VIR_DOMAIN_DEF_PARSE_* and VIR_DOMAIN_DEF_FORMAT_*
flags that are used by the corresponding methods. The
VIR_DOMAIN_XML_* flags received via public API calls must
be converted to the VIR_DOMAIN_DEF_FORMAT_* flags where
needed.
The various calls to virDomainDefParse which hardcoded the
use of the VIR_DOMAIN_XML_INACTIVE flag change to use the
VIR_DOMAIN_DEF_PARSE_INACTIVE flag.
The virCPUDefFormat* methods were relying on the VIR_DOMAIN_XML_*
flag definitions. It is not desirable for low level internal
functions to be coupled to flags for the public API, since they
may need to be called from several different contexts where the
flags would not be appropriate.
https://bugzilla.redhat.com/show_bug.cgi?id=1181408
When we try to hotplug a channel chr device with no target, we
will get success (which should fail) in virDomainChrDefParseXML,
because we use goto cleanup this place and return an incomplete
definition (with no target). In qemuDomainAttachChrDevice,
we add it to the domain definition, but fail to remove it from
there when chardev-add fails, because virDomainChrRemove
matches chardevices according to the target name.
The device definition is then freed in qemuDomainAttachDeviceFlags,
leaving a stale pointer in the domain definition.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
QEMU supports feature specification with -cpu host and we just skip
using that. Since QEMU developers themselves would like to use this
feature, this patch modifies the code to work.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1178850
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1179684
The way that we currently generate the <driver/> for <controller/> is
just madness:
<controller type='scsi' index='0' model='virtio-scsi'>
<driver queues='12'/>
<driver cmd_per_lun='123'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</controller>
It's obvious that we should be aiming at the following:
<controller type='scsi' index='0' model='virtio-scsi'>
<driver queues='12' cmd_per_lun='123'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</controller>
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Make use of the ebtables functionality to be able to filter certain
parameters of icmpv6 packets. Extend the XML parser for icmpv6 types,
type ranges, codes, and code ranges. Extend the nwfilter documentation,
schema, and test cases.
Being able to filter icmpv6 types and codes helps extending the DHCP
snooper for IPv6 and filtering at least some parameters of IPv6's NDP
(Neighbor Discovery Protocol) packets. However, the filtering will not
be as good as the filtering of ARP packets since we cannot
check on IP addresses in the payload of the NDP packets.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1177194
When migrate a vm, we will generate a xml via qemuDomainDefFormatLive and
pass this xml to target libvirtd. Libvirt will use the current network
state in def->data.network.actual to generate the xml, this will make
migrate failed when we set a network type guest interface use a macvtap
network as a source in a vm then migrate vm to another host(which has the
different macvtap network settings: different interface name, bridge name...)
Add a flag check in virDomainNetDefFormat, if we set a VIR_DOMAIN_XML_MIGRATABLE
flag when call virDomainNetDefFormat, we won't get the current vm interface
state.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Add the possibility to have more than one IP address configured for a
domain network interface. IP addresses can also have a prefix to define
the corresponding netmask.
The <domain/> element under /capabilities/guest/arch/ can have no
child elements. If that's the case we format:
<domain type='xen'>
</domain>
instead of simpler:
<domain type='xen'/>
This commit fixes that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In commit d2632d60 we agreed taht we want the parsed uid to properly
overflow but only to -1, however the value was read into long and then
wrapped into uid_t. That meaned it failed on 32-bit systems.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Currently, when there is an API that's blocking with locked domain and
second API that's waiting in virDomainObjListFindByUUID() for the domain
lock (with the domain list locked) no other API can be executed on any
domain on the whole hypervisor because all would wait for the domain
list to be locked. This patch adds new optional approach to this in
which the domain is only ref'd (reference counter is incremented)
instead of being locked and is locked *after* the list itself is
unlocked. We might consider only ref'ing the domain in the future and
leaving locking on particular APIs, but that's no tonight's fairy tale.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Volume and pool formatting functions took different approaches to
unspecified uids/gids. When unknown, it is always parsed as -1, but one
of the functions formatted it as unsigned int (wrong) and one as
int (better). Due to that, our two of our XML files from tests cannot
be parsed on 32-bit machines.
RNG schema needs to be modified as well, but because both
storagepool.rng and storagevol.rng need same schema for permission
element, save some space by moving it to storagecommon.rng.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We can change vnc password by using virDomainUpdateDeviceFlags API with
live flag. But it can't be changed with config flag. Error is reported as
below.
error: Operation not supported: persistent update of device 'graphics' is not supported
This patch supports the graphics arguments changed with config flag.
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1174096
When both parameter have lockspaces present, virDomainLeaseIndex
always returns -1 even there is a lease the same with the one we
check. This is due to broken logic in 'if-else' statement.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1174053
Introduced by commit id '17bddc46f' - fix a libvirtd crash when
matching a network iscsi hostdev with a host iscsi hostdev.
When we use attach-device to coldplug a network iscsi hostdev,
libvirt will check if there is already a device in XML. But if
the 'b' is a host iscsi hostdev and 'a' is a network iscsi hostdev,
then libvirtd will crash in virDomainHostdevMatchSubsysSCSIiSCSI
because 'b' doesn't have a hostname.
Add a check in virDomainHostdevMatchSubsys, if the a's protocol
and b's protocol is not the same.
Following is the backtrace:
0 0x00007f850d6bc307 in virDomainHostdevMatchSubsysSCSIiSCSI at conf/domain_conf.c:10889
1 virDomainHostdevMatchSubsys at conf/domain_conf.c:10911
2 virDomainHostdevMatch at conf/domain_conf.c:10973
3 virDomainHostdevFind at conf/domain_conf.c:10998
4 0x00007f84f6a10560 in qemuDomainAttachDeviceConfig at qemu/qemu_driver.c:7223
5 qemuDomainAttachDeviceFlags at qemu/qemu_driver.c:7554
Signed-off-by: Luyao Huang <lhuang@redhat.com>
For historical reasons, only the first <console> element might be of targetType
serial, but we checked for other consoles of targetType serial in our post-parse
callback if and only if we knew the first console was serial, otherwise
the check was skipped.
This patch moves the check one level up, so first
the check for secondary console of type serial is performed and then the
rest of operations continue unchanged.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1170092
When one domain is being undefined and at the same time started, for
example, there is a possibility of a rare problem occuring.
- Thread 1 does virDomainUndefine(), has the lock, checks that the
domain is active and because it's not, calls
virDomainObjListRemove().
- Thread 2 does virDomainCreate() and tries to lock the domain.
- Thread 1 needs to lock domain list in order to remove the domain from
it, but must unlock domain first (proper order is to lock domain list
first and the domain itself second).
- Thread 2 grabs the lock, starts the domain and releases the lock.
- Thread 1 grabs the lock and removes the domain from list.
With this patch:
- The undefining domain gets marked as "to undefine" before it is
unlocked.
- If domain is found in any of the search APIs, it's returned only if
it is not marked as "to undefine". The check is done while the
domain is locked.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1150505
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
For host-passthrough CPU we don't honor the CPU
features specified in the XML, but we allow
outputting them via the UPDATE_CPU flag for dumpxml,
this gives user a rough idea of what features the CPU
might have.
After restoring a managedsave'd domain, the features
might end up in the live status XML (in /var/run) without
the model. This XML cannot be parsed by the daemon after
restart and the domain might disappear.
This fix skips formatting the features for HOST_PASSTHROUGH
when UPDATE_CPU is not specified, so the newly restored domains
and newly created snapshots won't be affected.
Note: this doesn't fix existing snapshots or already restored
running domains.
https://bugzilla.redhat.com/show_bug.cgi?id=1030793https://bugzilla.redhat.com/show_bug.cgi?id=1151885
https://bugzilla.redhat.com/show_bug.cgi?id=1171582
When we edit a negative controller address number to a device,
some of them will auto generate a controller with invalid index
number. This will make guest disappear after restart libvirtd.
Instead of allowing negative number for controller index, we
should forbid negative number in these place (we did this before,
but after f18c02ec, virStrToLong_ui changed to allow negative
number). Therefore switch to virStrToLong_uip in these places.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
At the time that the network driver allocates a connection to a
network, the tap device that will be used hasn't yet been created -
that will be done later by qemu (or lxc or whoever) - but if the
network has macTableManager='libvirt', then when we do get around to
creating the tap device, we will need to add an entry for it to the
network bridge's fdb (forwarding database) *and* turn off learning and
unicast_flood for that tap device in the bridge's sysfs settings. This
means that qemu needs to know both the bridge name as well as the
setting of macTableManager, so we either need to create a new API to
retrieve that info, or just pass it back in the ActualNetDef that is
created during networkAllocateActualDevice. We choose the latter
method, since it's already done for the bridge device, and it has the
side effect of making the information available in domain status.
(NB: in the future, I think that the tap device should actually be
created by networkAllocateActualDevice(), as that will solve several
other problems, but that is a battle for another day, and this
information will still be useful outside the network driver)
When the actualType of a virDomainNetDef is "network", it means that
we are connecting to a libvirt-managed network (routed, natted, or
isolated) which does use a bridge device (created by libvirt). In the
past we have required drivers such as qemu to call the public API to
retrieve the bridge name in this case (even though it is available in
the NetDef's ActualNetDef if the actualType is "bridge" (i.e., an
externally-created bridge that isn't managed by libvirt). There is no
real reason for this difference, and as a matter of fact it
complicates things for qemu. Also, there is another bridge-related
attribute (macTableManager) that will need to be available in both
cases, so this makes things consistent.
In order to avoid problems when restarting libvirtd after an update
from an older version that *doesn't* store the network's bridgename in
the ActualNetDef, we also need to put it in place during
networkNotifyActualDevice() (this function is run for each interface
of each domain whenever libvirtd is restarted).
Along with making the bridge name available in the internal object, it
is also now reported in the <source> element of the <interface> state
XML (or the <actual> subelement in the internally-stored format).
The one oddity about this change is that usually there is a separate
union for every different "type" in a higher level object (e.g. in the
case of a virDomainNetDef there are separate "network" and "bridge"
members of the union that pivots on the type), but in this case
network and bridge types both have exactly the same attributes, so the
"bridge" member is used for both type==network and type==bridge.
The macTableManager attribute of a network's bridge subelement tells
libvirt how the bridge's MAC address table (used to determine the
egress port for packets) is managed. In the default mode, "kernel",
management is left to the kernel, which usually determines entries in
part by turning on promiscuous mode on all ports of the bridge,
flooding packets to all ports when the correct destination is unknown,
and adding/removing entries to the fdb as it sees incoming traffic
from particular MAC addresses. In "libvirt" mode, libvirt turns off
learning and flooding on all the bridge ports connected to guest
domain interfaces, and adds/removes entries according to the MAC
addresses in the domain interface configurations. A side effect of
turning off learning and unicast_flood on the ports of a bridge is
that (with Linux kernel 3.17 and newer), the kernel can automatically
turn off promiscuous mode on one or more of the bridge's ports
(usually only the one interface that is used to connect the bridge to
the physical network). The result is better performance (because
packets aren't being flooded to all ports, and can be dropped earlier
when they are of no interest) and slightly better security (a guest
can still send out packets with a spoofed source MAC address, but will
only receive traffic intended for the guest interface's configured MAC
address).
The attribute looks like this in the configuration:
<network>
<name>test</name>
<bridge name='br0' macTableManager='libvirt'/>
...
This patch only adds the config knob, documentation, and test
cases. The functionality behind this knob is added in later patches.
Since virStreamFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virStoragePoolFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virNodeDeviceFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virNetworkFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virDomainFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Partially reverts commit 5754dbd.
The code in the specfile adds a MAC address to every <bridge>,
even for <forward mode='bridge'> for which we don't support
changing MAC addresses.
Remove it completely. For new networks, we have been adding
MAC addresses on definition/creation since the commit mentioned above.
For existing networks (pre-0.9.0), the MAC is added by this commit.
https://bugzilla.redhat.com/show_bug.cgi?id=1156367
Adding non-existing nwfilter to a network interface device without any
nwfilter specified will crash libvirt daemon with segfault. The reason is
that the nwfilter is not found an libvirt will try to restore old
nwfilter configuration but there is no nwfilter specified.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The function virNetworkObjListExport() in network_conf.c had a call to
the public API virNetworkFree() which was causing a link error:
CCLD libvirt_driver_vbox_network_impl.la
./.libs/libvirt_conf.a(libvirt_conf_la-network_conf.o): In function `virNetworkObjListExport':
/home/laine/devel/libvirt/src/conf/network_conf.c:4496: undefined reference to `virNetworkFree'
This would happen when I added
#include "network_conf.h"
into domain_conf.h, then attempted to call a new function from that
file (and enum converter, similar to virNetworkForwardTypeToString())
In the end, virNetworkFree() ends up just calling virObjectUnref(obj)
anyway (after clearing all pending errors, which we probably *don't*
want to do in the cleanup of a utility function), so this is likely
more correct than the original code as well.
Commit id '0d36a5d05' modified the code slightly, but removed the
return value check thus causing Coverity to complain that this call
was the only one where the return value wasn't checked. Since nothing
was done previously if there was a failure, just use ignore_value here
to pacify Coverity
https://bugzilla.redhat.com/show_bug.cgi?id=1159180
The virStoragePoolSourceFindDuplicate only checks the incoming definition
against the same type of pool as the def; however, for "scsi_host" and
"fc_host" adapter pools, it's possible that either some pool "scsi_host"
adapter definition is already using the scsi_hostN that the "fc_host"
adapter definition wants to use or some "fc_host" pool adapter definition
is using a vHBA scsi_hostN or parent scsi_hostN that an incoming "scsi_host"
definition is trying to use.
This patch adds the mismatched type checks and adds extraneous comments
to describe what each check is determining.
This patch also modifies the documentation to be describe what scsi_hostN
devices a "scsi_host" source adapter should use and which to avoid. It also
updates the parent definition to specifically call out that for mixed
environments it's better to define which parent to use so that the duplicate
pool checks can be done properly.
https://bugzilla.redhat.com/show_bug.cgi?id=1159180
Move the API from the backend to storage_conf and rename it to
virStoragePoolGetVhbaSCSIHostParent. A future patch will need to
use this functionality from storage_conf
Resolving symlinks can fail before mounting any file system if one file
system depends on another being mounted. Symlinks are now resolved in
two passes:
* Before any file system is mounted, but then we are more gentle if
the source path can't be accessed
* Right before mounting a file system, so that we are sure that we
have the resolved path... but then if it can't be accessed we raise
an error.
Add attribute to set vgamem_mb parameter of QXL device for QEMU. This
value sets the size of VGA framebuffer for QXL device. Default value in
QEMU is 8MB so reuse it also in libvirt to not break things.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The vram attribute was introduced to set the video memory but it is
usable only for few hypervisors excluding QEMU/KVM and the old XEN
driver. Only in case of QEMU the vram was used for QXL.
This patch updates the documentation to reflect current code in libvirt
and also changes the cases when we will set the default vram attribute.
It also fixes existing strange default value for VGA devices 9MB to 16MB
because the video ram should be rounded to power of two.
The change of default value could affect migrations but I found out that
QEMU always round the video ram to power of two internally so it's safe
to change the default value to the next closest power of two and also
silently correct every domain XML definition. And it's also safe because
we don't pass the value to QEMU.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Get mounted filesystems list, which contains hardware info of disks and its
controllers, from QEMU guest agent 2.2+. Then, convert the hardware info
to corresponding device aliases for the disks.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
As qemu is now able to notify us about change of the channel state used
for communication with the guest agent we now can more precisely track
the state of the guest agent.
To allow notifying management apps this patch implements a new event
that will be triggered on changes of the guest agent state.
To be able to express some use cases of the RBD backing with libvirt, we
need to be able to specify a config file for the RBD client to qemu as
that is one of the commonly used options.
Some storage systems have internal support for snapshots. Libvirt should
be able to select a correct snapshot when starting a VM.
This patch adds a XML element to select a storage source snapshot for
the RBD protocol which supports this feature.
To track state of virtio channels this patch adds a new output-only
attribute called 'state' to the <target> element of virtio channels.
This will be later populated with the guest state of the channel.
I noticed this while working on qemuDomainGetBlockInfo. Assigning
a bool value to an int variable compiles fine, but raises red flags
on the maintenance front as it becomes too easy to assign -1 or 2
or any other non-bool value to the same variable.
* cfg.mk (sc_prohibit_int_assign_bool): New rule.
* src/conf/snapshot_conf.c (virDomainSnapshotRedefinePrep): Fix
offenders.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo)
(qemuDomainSnapshotCreateXML): Likewise.
* src/test/test_driver.c (testDomainSnapshotAlignDisks):
Likewise.
* src/util/vircgroup.c (virCgroupSupportsCpuBW): Likewise.
* src/util/virpci.c (virPCIDeviceBindToStub): Likewise.
* src/util/virutil.c (virIsCapableVport): Likewise.
* tools/virsh-domain-monitor.c (cmdDomMemStat): Likewise.
* tools/virsh-domain.c (cmdBlockResize, cmdScreenshot)
(cmdInjectNMI, cmdSendKey, cmdSendProcessSignal)
(cmdDetachInterface): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Ethernet interfaces in libvirt currently do not support bandwidth setting.
For example, following xml file for an interface will not apply these
settings to corresponding qdiscs.
<interface type="ethernet">
<mac address="02:36:1d:18:2a:e4"/>
<model type="virtio"/>
<script path=""/>
<target dev="tap361d182a-e4"/>
<bandwidth>
<inbound average="984" peak="1024" burst="64"/>
<outbound average="2000" peak="2048" burst="128"/>
</bandwidth>
</interface>
Signed-off-by: Anirban Chakraborty <abchak@juniper.net>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1160926
Introduce a 'managed' attribute to allow libvirt to decide whether to
delete a vHBA vport created via external means such as nodedev-create.
The code currently decides whether to delete the vHBA based solely on
whether the parent was provided at creation time. However, that may not
be the desired action, so rather than delete and force someone to create
another vHBA via an additional nodedev-create allow the configuration of
the storage pool to decide the desired action.
During createVport when libvirt does the VPORT_CREATE, set the managed
value to YES if not already set to indicate to the deleteVport code that
it should delete the vHBA when the pool is destroyed.
If libvirtd is restarted all the memory only state was lost, so for a
persistent storage pool, use the virStoragePoolSaveConfig in order to
write out the managed value.
Because we're now saving the current configuration, we need to be sure
to not save the parent in the output XML if it was undefined at start.
Saving the name would cause future starts to always use the same parent
which is not the expected result when not providing a parent. By not
providing a parent, libvirt is expected to find the best available
vHBA port for each subsequent (re)start.
At deleteVport, use the new managed value to decide whether to execute
the VPORT_DELETE. Since we no longer save the parent in memory or in
XML when provided, if it was not provided, then we have to look it up.
https://bugzilla.redhat.com/show_bug.cgi?id=1160926
Passing a copy of the storage pool adapter to a function just changes the
copy of the fields in the particular function and then when returning to
the caller those changes are discarded. While not yet biting us in the
storage clean-up case, it did cause an issue for the fchost storage pool
startup case, createVport. The issue was at startup, if no parent is found
in the XML, the code will search for the 'best available' parent and then
store that in the in memory copy of the adapter. Of course, in this case
it was a copy, so when returning to the virStorageBackendSCSIStartPool that
change was discarded (or lost) from the pool->def->source.adapter which
meant at shutdown (deleteVport), the code assumed no adapter was passed
and skipped the deletion, leaving the vHBA created by libvirt still defined
requiring an additional stop of a nodedev-destroy to remove.
Adjusted the createVport to take virStoragePoolDefPtr instead of the
adapter copy. Then use the virStoragePoolSourceAdapterPtr when processing.
A future patch will need the 'def' anyway, so this just sets up for that.
virDomainChrSourceDefIsEqual should return 'true' for
identical SPICEVMC chardevs, and those that have no source
specification.
After this change, a failed hotplug no longer leaves a stale
pointer in the domain definition.
https://bugzilla.redhat.com/show_bug.cgi?id=1162097
Modify the structure _virDomainBlockIoTuneInfo to support these the new
options.
Change the initialization of the variable expectedInfo in qemumonitorjsontest.c
to avoid compiling problem.
Add documentation about the new xml options
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
CPU numa topology implicitly allows memory specification in 'KiB'.
Enabling this to accept the 'unit' in which memory needs to be specified.
This now allows users to specify memory in units of choice, and
lists the same in 'KiB' -- just like other 'memory' elements in XML.
<numa>
<cell cpus='0-3' memory='1024' unit='MiB' />
<cell cpus='4-7' memory='1024' unit='MiB' />
</numa>
Also augment test cases to correctly model NUMA memory specification.
This adds the tag 'unit="KiB"' for memory attribute in NUMA cells.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit 01b4de2b9f abstracts virDomainParseMemory()
for use by other functions in domain_conf.c
Extend the same for use, for functions outside of this file.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As reviewing patches upstream it occurred to me, that we have two
functions doing nearly the same: virDomainParseMemory which
expects XML in the following format:
<memory unit='MiB'>1337</memory>
The other function being virDomainHugepagesParseXML expecting the
following format:
<someElement size='1337' unit='MiB'/>
It wouldn't matter to have two functions handle two different
scenarios like this if we could only not copy code that handles
32bit arches around. So this code merges the common parts into
one by inventing new @units_xpath argument to
virDomainParseMemory which allows overriding the default location
of @unit attribute in XML. With this change both scenarios above
can be parsed with virDomainParseMemory.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
PowerISA allows processors to run VMs in binary compatibility ("compat")
mode supporting an older version of ISA. QEMU has recently added support to
explicitly denote a VM running in compatibility mode through commit 6d9412ea
& 8dfa3a5e85. Now, a "compat" mode VM can be run by invoking this qemu
commandline on a POWER8 host: -cpu host,compat=power7.
This patch allows libvirt to exploit cpu mode 'host-model' to describe this
new mode for PowerKVM guests. For example, when a user wants to request a
power7 vm to run in compatibility mode on a Power8 host, this can be
described in XML as follows :
<cpu mode='host-model'>
<model>power7</model>
</cpu>
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
This adds support for PowerPC Little Endian architecture.,
and allows libvirt to spawn VMs based on 'ppc64le' architecture.
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This is a reaction to Michal's fix [1] for non-NUMA systems that also
splits out conf/ out of util/ because libvirt_util shouldn't require
libvirt_conf if it is the other way around. This particular use case
worked, but we're trying to avoid it as mentioned [2], many times.
The only functions from virnuma.c that needed numatune_conf were
virDomainNumatuneNodesetIsAvailable() and virNumaSetupMemoryPolicy().
The first one should be in numatune_conf as it works with
virDomainNumatune, the second one just needs nodeset and mode, both of
which can be passed without the need of numatune_conf.
Apart from fixing that, this patch also fixes recently added
code (between commits d2460f85^..5c8515620) that doesn't support
non-contiguous nodesets. It uses new function
virNumaNodesetIsAvailable(), which doesn't need a stub as it doesn't use
any libnuma functions, to check if every specified nodeset is available.
[1] https://www.redhat.com/archives/libvir-list/2014-November/msg00118.html
[2] http://www.redhat.com/archives/libvir-list/2011-June/msg01040.html
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Domain memory elements such as max_balloon and cur_balloon are
implemented as 'unsigned long long', whereas the 'memory' element
in NUMA cells is implemented as 'unsigned int'.
Use the same data type (unsigned long long) for 'memory' element
in NUMA cells.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
There was no check for 'nodeset' attribute in numatune-related
elements. This patch adds validation that any nodeset specified does
not exceed maximum host node.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Since commit 3f99d64 no new scsi_host pools can be defined
if one of the already defined scsi_host pools does not refer
to an accessible scsi_host adapter.
Relax the check by skipping over these inaccessible pools
when checking for duplicates.
If both source adapters are specified by a parent address,
just comparing the address is faster and catches even addresses
that do not refer to valid adapters.
Commit 6c9a8a4 (Oct 2014) exposed a long-standing issue on 32-bit
machines: code related to virDomainSetMemoryParameters has always
been documented as using a 64-bit limit, but it was implemented by
calling virDomainParseMemory which enforced an 'unsigned long'
limit. Since VIR_DOMAIN_MEMORY_PARAM_UNLIMITED capped to a
long is -1, but virDomainParseScaledValue no longer accepts
negative values, an attempt to use 2^53-1 as a hard memory limit
started failing the testsuite. However, the problem with capping
things artificially low has existed for much longer - ever since
commits 4888f0fb and 2e22f23 (Mar 2012) switched internal tracking
from 'unsigned long' to 'unsigned long long' (prior to that time,
the cap was a side-effect of the choice of types). We _have_ to
cap the balloon memory values, (no thanks to baked in 'unsigned long'
of API such as virDomainSetMaxMemory or virDomainGetInfo with no
counterpart API that guarantees 64-bit access to those numbers)
but memory parameters have never needed the artificial limit.
At any rate, the solution is to make the parser function gain a
parameter, and only do the reduced 32-bit cap for the values that
are constrained due to API.
* src/conf/domain_conf.h (_virDomainMemtune): Add comments.
* src/conf/domain_conf.c (virDomainParseMemory): Add parameter.
(virDomainDefParseXML): Adjust callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
It makes sense for none of the callers to have negative value as an
output and, fortunately, if anyone tried defining domain with negative
memory or any other value parsed by virDomainParseScaledValue(), the
resulting value was 0. That means we can error out during parsing as
it won't break anything.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1155843
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1146837
Resolve a crash in libvirtd resulting from commit id 'a4bd62ad' (1.0.6)
which added parentaddr and unique_id to allow unique identification of
a scsi_host, but assumed that all the pool entries and the incoming
definition would be similarly defined. If the existing pool uses the
'name' attribute and an incoming pool is using the parentaddr/unique_id,
then the code will attempt to compare the existing name string against
the incoming name string which doesn't exist (is NULL) and results in
a core (STREQ).
Conversely, if the existing pool used the parentaddr/unique_id and the
to be defined pool used the name, then the comparison would be against
the parentaddr, but since the incoming pool doesn't have one - that would
leave the comparison against a parentaddr of all 0's and a unique_id of 0,
which will always comparison to fail. This means someone could define the
same source adapter for two pools
In order to resolve this, adjust the code to get the 'host#' to be used
by the storage scsi backend in order to check/start the pool and make sure
the incoming definition doesn't match any of the existing pool defs.
The mode attribute is required for the source element of vhost-user.
Thus virDomainNetDefFormat should always generate a xml with it and not
only when the mode is server.
The commit fixes the issue. And it adds a vhostuser interface in
'client' mode to qemuxml2argv-net-vhostuser.(args|xml) to test this
usecase.
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Allow the Xen drivers to determine default vram values. Sane
default vaules depend on the device model being used, so the
drivers are in the best position to determine the defaults.
For the legacy xen driver, it is best to maintain the existing
logic for setting default vram values to ensure there are no
regressions. The libxl driver currently does not support
configuring a video device. Support will be added in a
subsequent patch, where the benefit of this change will be
reaped.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
This new attribute will control whether or not libvirt will pay
attention to guest notifications about changes to network device mac
addresses and receive filters. The default for this is 'no' (for
security reasons). If it is set to 'yes' *and* the specified device
model and connection support it (currently only macvtap+virtio) then
libvirt will watch for NIC_RX_FILTER_CHANGED events, and when it
receives one, it will issue a query-rx-filter command, retrieve the
result, and modify the host-side macvtap interface's mac address and
unicast/multicast filters accordingly.
The functionality behind this attribute will be in a later patch. This
patch merely adds the attribute to the top-level of a domain's
<interface> as well as to <network> and <portgroup>, and adds
documentation and schema/xml2xml tests. Rather than adding even more
test files, I've just added the net attribute in various applicable
places of existing test files.