https://bugzilla.redhat.com/show_bug.cgi?id=1503284
The way we currently start qemu from CPU affinity POV is as
follows:
1) the child process is set affinity to all online CPUs (unless
some vcpu pinning was given in the domain XML)
2) Once qemu is running, cpuset cgroup is configured taking
memory pinning into account
Problem is that we let qemu allocate its memory just anywhere in
1) and then rely in 2) to be able to move the memory to
configured NUMA nodes. This might not be always possible (e.g.
qemu might lock some parts of its memory) and is very suboptimal
(copying large memory between NUMA nodes takes significant amount
of time).
The solution is to set affinity to one of (in priority order):
- The CPUs associated with NUMA memory affinity mask
- The CPUs associated with emulator pinning
- All online host CPUs
Later (once QEMU has allocated its memory) we then change this
again to (again in priority order):
- The CPUs associated with emulator pinning
- The CPUs returned by numad
- The CPUs associated with vCPU pinning
- All online host CPUs
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This is mainly about /dev/sev and its default permissions 0600. Of
course, rule of 'tinfoil' would be that we can't trust anything, but the
probing code in QEMU is considered safe from security's perspective + we
can't create an udev rule for this at the moment, because ioctls and
file system permissions aren't cross-checked in kernel and therefore a
user with read permissions could issue a 'privileged' operation on SEV
which is currently only limited to root.
https://bugzilla.redhat.com/show_bug.cgi?id=1665400
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The default permissions (0600 root:root) are of no use to the qemu
process so we need to change the owner to qemu iff running with
namespaces.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Instead of exposing /dev/sev to every domain, do it selectively.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
SEV has a limit on number of concurrent guests. From security POV we
should only expose resources (any resources for that matter) to domains
that truly need them.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
We should not give domains access to something they don't necessarily
need by default. Remove it from the qemu driver docs too.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Commit fb0d0d6c54 added capabilities data and updated
qemucapabilitiestest but forgot to update qemucaps2xmltest
at the same time.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This shows users can now use PCI for RISC-V guests, as long
as they opt into it by manually assigning addresses.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virtio-mmio is still used by default, so if PCI is desired
it's necessary to explicitly opt-in by adding an appropriate
<address type='pci' domain='0x0000' ... />
element to the corresponding device.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This adds an additional directive to the dnsmasq configuration file that
notifies clients via dhcp about the link's MTU. Guests can then choose
adjust their link accordingly.
Signed-off-by: Casey Callendrello <cdc@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
This test relies on namespace support, which is only compiled in
if we have the 'fs' and 'netfs' backends.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Instead of repeating the same platform for every test,
set it once, since we do the same tests with the same
input for all platforms, it's just the output that differs.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Only run the pool-netfs-ns-mountopts if built WITH_STORAGE_FS and only
run pool-rbd-ns-configopts if built with WITH_STORAGE_RBD since the
namespace support is only enabled if the pool is enabled.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'qemu' binary used to provide the i386 emulator until it was renamed
to qemu-system-i386 in QEMU 1.0. Since we don't support such old
versions we don't need to check for 'qemu' when probing capabilities.
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The custom namespaces were originally registered against the storage
pool source struct, but during review this was changed to the top level
storage pool struct. The namespace URIs were not updated to match, so
had a redundant '/source' component.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Some clients poll virDomainGetBlockJobInfo rather than wait for the
VIR_DOMAIN_BLOCK_JOB_READY event. In some cases qemu can get to 100% and
still not reach the synchronised phase. Initiating a pivot in that case
will fail.
Given that computers are interacting here, the error that the job
can't be finalized yet is not handled very well by those specific
implementations.
Our docs now correctly state to use the event. We already do a similar
output adjustment in case when the progress is not available from qemu
as in that case we'd report 0 out of 0, which some apps also incorrectly
considered as 100% complete.
In this case we subtract 1 from the progress if the ready state is not
signalled by qemu if the progress was at 100% otherwise.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Big number itself does not make much sense in some cases. Format the
bitshift format as well.
Changes our web page docs from:
VIR_MIGRATE_POSTCOPY = 32768 : Setting the VIR_MIGRATE_POSTCOPY...
VIR_MIGRATE_TLS = 65536 : Setting the VIR_MIGRATE_TLS flag...
to:
VIR_MIGRATE_POSTCOPY = 32768 (0x8000; 1 << 15) : Setting the VIR_MIGRATE_POSTCOPY...
VIR_MIGRATE_TLS = 65536 (0x10000; 1 << 16) : Setting the VIR_MIGRATE_TLS flag...
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
A copy+paste mistaken meant the wrong enum -> string convertor
function was used for the error when an incorrect feature capability was
used.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
PEP 8 says:
"Comparisons to singletons like None should always be done
with 'is' or 'is not', never the equality operators."
There are potentially semantics differences, though in the case of this
libvirt code its merely a style change:
http://jaredgrubb.blogspot.com/2009/04/python-is-none-vs-none.html
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The virDomainDeviceInfo parameter is a large struct so it is preferrable
to pass it by reference instead of by value.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The struct _virStorageBackendQemuImgInfo is quite large so it is
preferrable to pass it by reference instead of by value. This requires
us to stop modifying the "compat" field.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The 'rv' variable is never changed after being declared, so can be
removed.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
'val' is initialized from virDomainCapsFeatureTypeFromString and a
few lines earlier there was already a check for 'val < 0'.
The 'val >= 0' is thus always true. The enum conversion similarly
ensures that the val will be less than VIR_DOMAIN_CAPS_FEATURE_LAST,
so "val < VIR_DOMAIN_CAPS_FEATURE_LAST' is thus always true too.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Be more sensible when setting labels of the target of a
virDomainBlockCopy operation. Previously we'd relabel everything in case
it's a copy job even if there's no unlabelled backing chain. Since we
are also not sure whether the backing chain is shared we don't relabel
the chain on completion of the blockjob. This certainly won't play nice
with the image permission relabelling feature.
While this does not fix the case where the image is reused and has
backing chain it certainly sanitizes all the other cases. Later on it
will also allow to do the correct thing in cases where only one layer
was introduced.
The change is necessary as in case when -blockdev will be used we will
need to hotplug the backing chain and thus labeling needs to be setup in
advance and not only at the time of pivot. To avoid multiple code paths
move the labeling now.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Rather than passing in a virStorageSource which would override the
originally passed disk->src we can now drop passing in a disk completely
as all functions called inside here require a virStorageSource.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Use the functions designed to deal with single images as the *Disk
functions were just wrappers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Previously there weren't any suitable functions which would allow
setting up host side of a full disk chain so we've opted to replace the
'src' in a virDomainDiskDef by the new image source.
That is now no longer necessary so remove the munging.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Now that we have replacement in the form of the image labeling function
we can drop the unnecessary functions by replacing all callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The same can be achieved by using qemuSecurity[Set|Restore]ImageLabel.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The flag will control the VIR_SECURITY_DOMAIN_IMAGE_LABEL_BACKING_CHAIN
flag of the security driver image labeling APIs.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Security labeling of disks consists of labeling of the disk image
itself and it's backing chain. Modify
virSecurityManager[Set|Restore]ImageLabel to take a boolean flag that
will label the full chain rather than the top image itself.
This allows to delete/unify some parts of the code and will also
simplify callers in some cases.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Since the disk is necessary only to get the source modify the functions
to take the source directly and rename them to
qemu[Setup|Teardown]ImageChainCgroup.
Additionally drop a pointless comment containing the old function name.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When we need to detect a chain for a image which will become the new
source for a disk (e.g. after a disk media change or a blockjob) we'd
need to replace disk->src temporarily to do so.
Move the 'disksrc' temporary variable to an argument and adjust callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The function at first validates the top image of the chain, then
traverses the chain as declared in the XML (if any) and then procedes to
detect the rest of the chain from images. All of the steps have their
own temporary iterator.
Clarify the use scope of the steps by introducing a new temp variable
holding the top level source and adding comments.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Commit f2f84b4d4 added storagepoolxml2argvtest processing; however,
it didn't follow alter the else to !WITH_STORAGE and add the source
itself to the EXTRA_DIST like the other WITH_STORAGE options for
virstorageutiltest and storagevolxml2argvtest.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Commit 7a227688a caused a build failure on mingw. Following
other uses of including ../src/libvirt_driver_storage_impl.la
I moved to under the WITH_STORAGE conditional.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Allow for adjustment of RBD configuration options via Storage
Pool XML Namespace adjustments. When namespace arguments are
used to start the pool, add a VIR_WARN to indicate that the
startup was tainted by custom config_opts.
Based off original patch/concept:
https://www.redhat.com/archives/libvir-list/2014-May/msg00940.html
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
If the Storage Pool Namespace XML data exists, format the mount
options on the MOUNT command line and issue a VIR_WARN to indicate
that the storage pool was tainted by custom mount_opts.
When the pool is started, the options will be generated on the
command line along with the options already defined.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Introduce the virStoragePoolFSMountOptionsDef to be used to
manage the Storage Pool XML Namespace for mount options.
Using a new virStorageBackendNamespaceInit function, set the
virStoragePoolXMLNamespace into the _virStoragePoolOptions when
the storage backend is loaded.
Modify the storagepool.rng to allow for the usage of a different
XML namespace to parse the fs_mount_opts to be included with
the fs and netfs storage pool definitions.
Modify the storagepoolxml2xmltest to utilize a properly modified
XML file to parse and format the namespace for a netfs storage pool.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Introduce the infrastructure necessary to manage a Storage Pool XML
Namespace. The general concept is similar to virDomainXMLNamespace,
except that for Storage Pools the storage backend specific details
can be stored within the _virStoragePoolOptions unlike the domain
processing code which manages its xmlopt's via the virDomainXMLOption
which is allocated/passed around for each domain.
This patch defines the add the parse, format, free, and href methods
required to process the XML and callout from the Storage Pool Def
parse, format, and free API's to perform the action on the XML data
for/from the backend.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Allow the addition of the <protocol ver='n'/> to the provided XML.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
If protocolVer present, add the -o nfsvers=# to the command
line for the NFS Storage Pool
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add an optional way to define which NFS Server version will be
used to content the target NFS server.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>