QEMU has the ability to mark machine types as deprecated. This should be
exposed to management applications in the capabilities.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Update the host CPU code to report the die_id in the NUMA topology
capabilities. On systems with multiple dies, this fixes the bug
where CPU cores can't be distinguished:
<cpus num='12'>
<cpu id='0' socket_id='0' core_id='0' siblings='0'/>
<cpu id='1' socket_id='0' core_id='1' siblings='1'/>
<cpu id='2' socket_id='0' core_id='0' siblings='2'/>
<cpu id='3' socket_id='0' core_id='1' siblings='3'/>
</cpus>
Notice how core_id is repeated within the scope of the same socket_id.
It now reports
<cpus num='12'>
<cpu id='0' socket_id='0' die_id='0' core_id='0' siblings='0'/>
<cpu id='1' socket_id='0' die_id='0' core_id='1' siblings='1'/>
<cpu id='2' socket_id='0' die_id='1' core_id='0' siblings='2'/>
<cpu id='3' socket_id='0' die_id='1' core_id='1' siblings='3'/>
</cpus>
So core_id is now unique within a (socket_id, die_id) pair.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The NUMA cells are stored directly in the virCapsHostPtr
struct. This moves them into their own struct allowing
them to be stored independantly of the rest of the host
capabilities. The change is used as an excuse to switch
the representation to use a GPtrArray too.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The XML parser currently calls virCapabilitiesDomainDataLookup during
parsing to find the domain capabilities matching the triple
(virt type, os type, arch)
This is, however, bogus with the QEMU driver as it assumes that there
is an emulator known to the default driver capabilities that matches
this triple. It is entirely possible for the driver to be parsing an
XML file with a custom emulator path specified pointing to a binary
that doesn't exist in the default driver capabilities. This will,
for example be the case on a RHEL host which only installs the host
native emulator to /usr/bin. The user can have built a custom QEMU
for non-native arches into $HOME and wish to use that.
Aside from validation, this call is also used to fill in a machine type
for the guest if not otherwise specified. Again, this data may be
incorrect for the QEMU driver because it is not taking account of
the emulator binary that is referenced.
To start fixing this, move the validation to the post-parse callbacks
where more intelligent driver specific logic can be applied.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Remove the need to pass around strings and switch to the enum values
instead.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The capabilities are declared in the XML schema so passing feature names
as strings from hypervisor drivers makes no sense.
Additionally some of the features expose so called 'toggles' while
others not. This knowledge was encoded by a bunch of 'STREQ's in the
formatter.
Change all of this by declaring the features as an enum and use it
instead of a dynamically allocated array.
Presence of 'toggles' is encoded together with the conversion strings
rather than in the formatter directly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Libvirt currently uses the VIR_AUTOUNREF macro for auto cleanup of
virObject instances. GLib approaches things differently with GObject,
reusing their g_autoptr() concept.
This introduces support for g_autoptr() with virObject, to facilitate
the conversion to GObject.
Only virObject classes which are currently used with VIR_AUTOREF are
updated. Any others should be converted to GObject before introducing
use of autocleanup.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Now that virDomainXMLNamespace matches virXMLNamespace,
we no longer need to keep both around.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Neither the xmlDocPtr nor the root xmlNode (also passed
in the XPathContext) are interesting to the callees.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
As explained in the previous patch, collecting pointer typedefs into a
common header makes it easier to avoid circular inclusions. Continue
the efforts by pulling the appropriate typedefs from capabilities.h
into the new header.
This patch is just straight code motion (all typedefs are listed in
the same order before and after the patch); a later patch will sort
things for legibility.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduce the bare bones functions to processing capability
data for the storage driver.
Since there will be no need for the <host> output, we need
to filter that data.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Require that all headers are guarded by a symbol named
LIBVIRT_$FILENAME
where $FILENAME is the uppercased filename, with all characters
outside a-z changed into '_'.
Note we do not use a leading __ because that is technically a
namespace reserved for the toolchain.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
In many files there are header comments that contain an Author:
statement, supposedly reflecting who originally wrote the code.
In a large collaborative project like libvirt, any non-trivial
file will have been modified by a large number of different
contributors. IOW, the Author: comments are quickly out of date,
omitting people who have made significant contribitions.
In some places Author: lines have been added despite the person
merely being responsible for creating the file by moving existing
code out of another file. IOW, the Author: lines give an incorrect
record of authorship.
With this all in mind, the comments are useless as a means to identify
who to talk to about code in a particular file. Contributors will always
be better off using 'git log' and 'git blame' if they need to find the
author of a particular bit of code.
This commit thus deletes all Author: comments from the source and adds
a rule to prevent them reappearing.
The Copyright headers are similarly misleading and inaccurate, however,
we cannot delete these as they have legal meaning, despite being largely
inaccurate. In addition only the copyright holder is permitted to change
their respective copyright statement.
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This patch is introducing cache monitor(CMT) to cache and
memory bandwidth monitor(MBM) for monitoring CPU memory
bandwidth.
The host capability of the two monitors is also introduced
in this patch.
For CMT, the host capability is shown like:
<host>
...
<cache>
<bank id='0' level='3' type='both' size='15' unit='MiB' cpus='0-5'>
<control granularity='768' min='1536' unit='KiB' type='both' maxAllocs='4'/>
</bank>
<monitor level='3' 'reuseThreshold'='270336' maxMonitors='176'>
<feature name='llc_occupancy'/>
</monitor>
</cache>
...
</host>
For MBM, the capability is shown like this:
<host>
...
<memory_bandwidth>
<node id='1' cpus='6-11'>
<control granularity='10' min ='10' maxAllocs='4'/>
</node>
<monitor maxMonitors='176'>
<feature name='mbm_total_bytes'/>
<feature name='mbm_local_bytes'/>
</monitor>
</memory_bandwidth>
...
</host>
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Move memory bandwidth capability nodes into one data structure,
this allows us to add a monitor for memory bandwidth.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Move all cache banks into one data structure, this allows
us to add other cache component, such as cache monitor.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Add new XML section to report host's memory bandwidth allocation
capability. The format as below example:
<host>
.....
<memory_bandwidth>
<node id='0' cpus='0-19'>
<control granularity='10' min ='10' maxAllocs='8'/>
</node>
</memory_bandwidth>
</host>
granularity ---- granularity of memory bandwidth, unit percentage.
min ---- minimum memory bandwidth allowed, unit percentage.
maxAllocs ---- maximum memory bandwidth allocation group supported.
Signed-off-by: Bing Niu <bing.niu@intel.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1569678
On some large systems (with ~400GB of RAM) it is possible for
unsigned int to overflow in which case we report invalid number
of 4K pages pool size. Switch to unsigned long long.
We hit overflow in virNumaGetPages when doing:
huge_page_sum += 1024 * page_size * page_avail;
because although 'huge_page_sum' is an unsigned long long, the
page_size and page_avail are both unsigned int, so the promotion
to unsigned long long doesn't happen until the sum has been
calculated, by which time we've already overflowed.
Turning page_avail into a unsigned long long is not strictly
needed until we need ability to represent more than 2^32
4k pages, which equates to 16 TB of RAM. That's not
outside the realm of possibility, so makes sense that we
change it to unsigned long long to avoid future problems.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This way later patches can add another structures with virResctrl
prefix without the meaning being even more confusing than it needs to
be.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
It doesn't access anything from conf/ and ti will be needed to use
from other util/ places. This split makes the separation clearer.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
On some platforms the number of bits in the cbm_mask might not be
divisible by 4 (and not even by 2), so we need to properly count the
bits. Similar file, min_cbm_bits, is properly parsed and used, but if
the number is greater than one, we lose the information about
granularity when reporting the data in capabilities. For that matter
always report granularity, but if it is not the same as the minimum,
add that information in there as well.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We're only adding only info about L3 caches, we can add more
later (just by changing one line), but for now that's more than enough
without overwhelming anyone.
XML snippet of how this should look like (also seen as part of the commit):
<cache>
<bank id='0' level='3' type='both' size='8192' unit='KiB' cpus='0-7'/>
</cache>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This way more drivers can utilize the functionality without copying
the code. And we can therefore test it in one place for all of them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is no "node driver" as there was before, drivers have to do
their own ACL checking anyway, so they all specify their functions and
nodeinfo is basically just extending conf/capablities. Hence moving
the code to src/conf/ is the right way to go.
Also that way we can de-duplicate some code that is in virsysfs and/or
virhostcpu that got duplicated during the virhostcpu.c split. And
Some cleanup is done throughout the changes, like adding the vir*
prefix etc.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Guests are handled in callers, but if something goes wrong (when it
cannot be added to virCapabilities, for example), there's no way for
them to free it properly.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Usage of this keyword in front of function declaration that is exported via a
header file is unnecessary, since internally, this has been the default for most
compilers for quite some time.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In the reverted commit d2e5538b1, the libxl driver was changed to copy
interface names autogenerated by libxl to the corresponding network def
in the domain's virDomainDef object. The copied name is freed when the
domain transitions to the shutoff state. But when migrating a domain,
the autogenerated name is included in the XML sent to the destination
host. It is possible an interface with the same name already exists on
the destination host, causing migration to fail.
This patch defines a new capability for setting the network device
prefix that will be used in the driver. Valid prefixes are
VIR_NET_GENERATED_PREFIX or the one announced by the driver.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Introduce VIR_DOMAIN_VIRT_NONE to give domaintype the default value of zero.
This is specially helpful in constructing better error messages
when we don't want to look up the default emulator by virtType.
The test data in vircapstest.c is also modified to reflect this change.
This revealed that GuestDefaultEmulator was a bit buggy, capable
of returning an emulator that didn't match the passed domain type. Fix
up the test suite input to continue to pass.
This is a helper function to look up all capabilities data for all
the OS bits that are relevant to <domain>. This is
- os type
- arch
- domain type
- emulator
- machine type
This will be used to replace several functions in later commits.
But the internal API stays the same, and we just convert the value as
needed. Not useful yet, but this is the beginning step of using an enum
for ostype throughout the code.
There are two places where you'll find info on page sizes. The first
one is under <cpu/> element, where all supported pages sizes are
listed. Then the second one is under each <cell/> element which refers
to concrete NUMA node. At this place, the size of page's pool is
reported. So the capabilities XML looks something like this:
<capabilities>
<host>
<uuid>01281cda-f352-cb11-a9db-e905fe22010c</uuid>
<cpu>
<arch>x86_64</arch>
<model>Westmere</model>
<vendor>Intel</vendor>
<topology sockets='1' cores='1' threads='1'/>
...
<pages unit='KiB' size='4'/>
<pages unit='KiB' size='2048'/>
<pages unit='KiB' size='1048576'/>
</cpu>
...
<topology>
<cells num='4'>
<cell id='0'>
<memory unit='KiB'>4054408</memory>
<pages unit='KiB' size='4'>1013602</pages>
<pages unit='KiB' size='2048'>3</pages>
<pages unit='KiB' size='1048576'>1</pages>
<distances/>
<cpus num='1'>
<cpu id='0' socket_id='0' core_id='0' siblings='0'/>
</cpus>
</cell>
<cell id='1'>
<memory unit='KiB'>4071072</memory>
<pages unit='KiB' size='4'>1017768</pages>
<pages unit='KiB' size='2048'>3</pages>
<pages unit='KiB' size='1048576'>1</pages>
<distances/>
<cpus num='1'>
<cpu id='1' socket_id='0' core_id='0' siblings='1'/>
</cpus>
</cell>
...
</cells>
</topology>
...
</host>
<guest/>
</capabilities>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If user or management application wants to create a guest,
it may be useful to know the cost of internode latencies
before the guest resources are pinned. For example:
<capabilities>
<host>
...
<topology>
<cells num='2'>
<cell id='0'>
<memory unit='KiB'>4004132</memory>
<distances>
<sibling id='0' value='10'/>
<sibling id='1' value='20'/>
</distances>
<cpus num='2'>
<cpu id='0' socket_id='0' core_id='0' siblings='0'/>
<cpu id='2' socket_id='0' core_id='2' siblings='2'/>
</cpus>
</cell>
<cell id='1'>
<memory unit='KiB'>4030064</memory>
<distances>
<sibling id='0' value='20'/>
<sibling id='1' value='10'/>
</distances>
<cpus num='2'>
<cpu id='1' socket_id='0' core_id='0' siblings='1'/>
<cpu id='3' socket_id='0' core_id='2' siblings='3'/>
</cpus>
</cell>
</cells>
</topology>
...
</host>
...
</capabilities>
We can see the distance from node1 to node0 is 20 and within nodes 10.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
To make it easier to forbid future attempts at a confusing typedef
name ending in Ptr that isn't actually a pointer, insist that we
follow our preferred style of 'typedef foo *fooPtr'.
* cfg.mk (sc_forbid_const_pointer_typedef): Enforce consistent
style, to prevent issue fixed in previous storage patch.
* src/conf/capabilities.h (virCapsPtr): Fix offender.
* src/security/security_stack.c (virSecurityStackItemPtr):
Likewise.
* tests/qemucapabilitiestest.c (testQemuDataPtr): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Expand the "secmodel" XML fragment of "host" with a sequence of
baselabel's which describe the default security context used by
libvirt with a specific security model and virtualization type:
<secmodel>
<model>selinux</model>
<doi>0</doi>
<baselabel type='kvm'>system_u:system_r:svirt_t:s0</baselabel>
<baselabel type='qemu'>system_u:system_r:svirt_tcg_t:s0</baselabel>
</secmodel>
<secmodel>
<model>dac</model>
<doi>0</doi>
<baselabel type='kvm'>107:107</baselabel>
<baselabel type='qemu'>107:107</baselabel>
</secmodel>
"baselabel" is driver-specific information, e.g. in the DAC security
model, it indicates USER_ID:GROUP_ID.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
These helpers use the remembered host capabilities to retrieve the cpu
map rather than query the host again. The intended usage for this
helpers is to fix automatic NUMA placement with strict memory alloc. The
code doing the prepare needs to pin the emulator process only to cpus
belonging to a subset of NUMA nodes of the host.
Implement check whether (maximum) vCPUs doesn't exceed machine
type's cpu-max settings.
On older versions of QEMU the check is disabled.
Signed-off-by: Michal Novotny <minovotn@redhat.com>
This patch refactors various places to allow removing of the
defaultConsoleTargetType callback from the virCaps structure.
A new console character device target type is introduced -
VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_NONE - to mark that no type was
specified in the XML. This type is at the end converted to the standard
VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL. Other types that are
different from this default have to be processed separately in the
device post parse callback.