The default permissions (0600 root:root) are of no use to the qemu
process so we need to change the owner to qemu iff running with
namespaces.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Instead of exposing /dev/sev to every domain, do it selectively.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
SEV has a limit on number of concurrent guests. From security POV we
should only expose resources (any resources for that matter) to domains
that truly need them.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
We should not give domains access to something they don't necessarily
need by default. Remove it from the qemu driver docs too.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
virtio-mmio is still used by default, so if PCI is desired
it's necessary to explicitly opt-in by adding an appropriate
<address type='pci' domain='0x0000' ... />
element to the corresponding device.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This adds an additional directive to the dnsmasq configuration file that
notifies clients via dhcp about the link's MTU. Guests can then choose
adjust their link accordingly.
Signed-off-by: Casey Callendrello <cdc@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
The 'qemu' binary used to provide the i386 emulator until it was renamed
to qemu-system-i386 in QEMU 1.0. Since we don't support such old
versions we don't need to check for 'qemu' when probing capabilities.
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The custom namespaces were originally registered against the storage
pool source struct, but during review this was changed to the top level
storage pool struct. The namespace URIs were not updated to match, so
had a redundant '/source' component.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Some clients poll virDomainGetBlockJobInfo rather than wait for the
VIR_DOMAIN_BLOCK_JOB_READY event. In some cases qemu can get to 100% and
still not reach the synchronised phase. Initiating a pivot in that case
will fail.
Given that computers are interacting here, the error that the job
can't be finalized yet is not handled very well by those specific
implementations.
Our docs now correctly state to use the event. We already do a similar
output adjustment in case when the progress is not available from qemu
as in that case we'd report 0 out of 0, which some apps also incorrectly
considered as 100% complete.
In this case we subtract 1 from the progress if the ready state is not
signalled by qemu if the progress was at 100% otherwise.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
A copy+paste mistaken meant the wrong enum -> string convertor
function was used for the error when an incorrect feature capability was
used.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
PEP 8 says:
"Comparisons to singletons like None should always be done
with 'is' or 'is not', never the equality operators."
There are potentially semantics differences, though in the case of this
libvirt code its merely a style change:
http://jaredgrubb.blogspot.com/2009/04/python-is-none-vs-none.html
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The virDomainDeviceInfo parameter is a large struct so it is preferrable
to pass it by reference instead of by value.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The struct _virStorageBackendQemuImgInfo is quite large so it is
preferrable to pass it by reference instead of by value. This requires
us to stop modifying the "compat" field.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The 'rv' variable is never changed after being declared, so can be
removed.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
'val' is initialized from virDomainCapsFeatureTypeFromString and a
few lines earlier there was already a check for 'val < 0'.
The 'val >= 0' is thus always true. The enum conversion similarly
ensures that the val will be less than VIR_DOMAIN_CAPS_FEATURE_LAST,
so "val < VIR_DOMAIN_CAPS_FEATURE_LAST' is thus always true too.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Be more sensible when setting labels of the target of a
virDomainBlockCopy operation. Previously we'd relabel everything in case
it's a copy job even if there's no unlabelled backing chain. Since we
are also not sure whether the backing chain is shared we don't relabel
the chain on completion of the blockjob. This certainly won't play nice
with the image permission relabelling feature.
While this does not fix the case where the image is reused and has
backing chain it certainly sanitizes all the other cases. Later on it
will also allow to do the correct thing in cases where only one layer
was introduced.
The change is necessary as in case when -blockdev will be used we will
need to hotplug the backing chain and thus labeling needs to be setup in
advance and not only at the time of pivot. To avoid multiple code paths
move the labeling now.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Rather than passing in a virStorageSource which would override the
originally passed disk->src we can now drop passing in a disk completely
as all functions called inside here require a virStorageSource.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Use the functions designed to deal with single images as the *Disk
functions were just wrappers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Previously there weren't any suitable functions which would allow
setting up host side of a full disk chain so we've opted to replace the
'src' in a virDomainDiskDef by the new image source.
That is now no longer necessary so remove the munging.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Now that we have replacement in the form of the image labeling function
we can drop the unnecessary functions by replacing all callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The same can be achieved by using qemuSecurity[Set|Restore]ImageLabel.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The flag will control the VIR_SECURITY_DOMAIN_IMAGE_LABEL_BACKING_CHAIN
flag of the security driver image labeling APIs.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Security labeling of disks consists of labeling of the disk image
itself and it's backing chain. Modify
virSecurityManager[Set|Restore]ImageLabel to take a boolean flag that
will label the full chain rather than the top image itself.
This allows to delete/unify some parts of the code and will also
simplify callers in some cases.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Since the disk is necessary only to get the source modify the functions
to take the source directly and rename them to
qemu[Setup|Teardown]ImageChainCgroup.
Additionally drop a pointless comment containing the old function name.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When we need to detect a chain for a image which will become the new
source for a disk (e.g. after a disk media change or a blockjob) we'd
need to replace disk->src temporarily to do so.
Move the 'disksrc' temporary variable to an argument and adjust callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The function at first validates the top image of the chain, then
traverses the chain as declared in the XML (if any) and then procedes to
detect the rest of the chain from images. All of the steps have their
own temporary iterator.
Clarify the use scope of the steps by introducing a new temp variable
holding the top level source and adding comments.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Allow for adjustment of RBD configuration options via Storage
Pool XML Namespace adjustments. When namespace arguments are
used to start the pool, add a VIR_WARN to indicate that the
startup was tainted by custom config_opts.
Based off original patch/concept:
https://www.redhat.com/archives/libvir-list/2014-May/msg00940.html
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
If the Storage Pool Namespace XML data exists, format the mount
options on the MOUNT command line and issue a VIR_WARN to indicate
that the storage pool was tainted by custom mount_opts.
When the pool is started, the options will be generated on the
command line along with the options already defined.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Introduce the virStoragePoolFSMountOptionsDef to be used to
manage the Storage Pool XML Namespace for mount options.
Using a new virStorageBackendNamespaceInit function, set the
virStoragePoolXMLNamespace into the _virStoragePoolOptions when
the storage backend is loaded.
Modify the storagepool.rng to allow for the usage of a different
XML namespace to parse the fs_mount_opts to be included with
the fs and netfs storage pool definitions.
Modify the storagepoolxml2xmltest to utilize a properly modified
XML file to parse and format the namespace for a netfs storage pool.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Introduce the infrastructure necessary to manage a Storage Pool XML
Namespace. The general concept is similar to virDomainXMLNamespace,
except that for Storage Pools the storage backend specific details
can be stored within the _virStoragePoolOptions unlike the domain
processing code which manages its xmlopt's via the virDomainXMLOption
which is allocated/passed around for each domain.
This patch defines the add the parse, format, free, and href methods
required to process the XML and callout from the Storage Pool Def
parse, format, and free API's to perform the action on the XML data
for/from the backend.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
If protocolVer present, add the -o nfsvers=# to the command
line for the NFS Storage Pool
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add an optional way to define which NFS Server version will be
used to content the target NFS server.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1584663
Modify the command generation to add some default options to the
fs/netfs storage pools based on the OS type. For Linux, it'll be
the "nodev, nosuid, noexec". For FreeBSD, it'll be "nosuid, noexec".
For others, just leave the options alone.
Modify the storagepoolxml2argvtest to handle the fact that the
same input XML could generate different output XML based on whether
Linux, FreeBSD, or other was being built.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Rather than deref off of "caps->guests", let's pass "caps->guests" and
caps->nguests to have the helper use "guests[i]->" instead.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Let's extract out the <guest> code into it's own method/helper.
NB: One minor change between the two is usage of "buf" instead
of "&buf" in the new code since we pass the address of &buf to
the helper.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Rather than deref off of "caps->host.", let's pass "&caps->host"
and make the helper use "host->" instead.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Let's extract out the <host> code into it's own method/helper.
NB: One minor change between the two is usage of "buf" instead
of "&buf" in the new code since we pass the address of &buf to
the helper.
Signed-off-by: John Ferlan <jferlan@redhat.com>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
This reverts commit 8b035c84d8.
The MTTCG impl in QEMU does allow pinning vCPUs.
When the guest is running we already check if pinning is
possible in the qemuDomainPinVcpuLive method, so this
check was adding no benefit.
When the guest is not running, we cannot know whether the
subsequent launch will use MTTCG or TCG, so we must allow
the pinning request. If the guest does use TCG on the next
launch it will fail, but this is no worse than if the user
had done a virDomainDefineXML with an XML doc specifying
vCPU pinning.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
MTTCG is the new multi-threaded impl of TCG which follows
KVM in having one host OS thread per vCPU. Historically
we have discarded all PIDs reported for TCG guests, but
we must now selectively honour this data.
We don't have anything in the domain XML that indicates
whether a guest is using TCG or MTTCG. While QEMU does
have an option (-accel tcg,thread=single|multi), it is
not desirable to expose this in libvirt. QEMU will
automatically use MTTCG when the host/guest architecture
pairing is known to be safe. Only developers of QEMU TCG
have a strong reason to override this logic.
Thus we use two sanity checks to decide if the vCPU
PID information is usable. First we see if the PID
duplicates the main emulator PID, and second we see
if the PID duplicates any other vCPUs.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The transition to the ready state is best observed by events as it's
ansynchronous and does not hint users to do polling. As currently only
the qemu driver supports block copy and block commit and the ready state
event was introduced by qemu 1.3 we can fully switch to the new
approach.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The documentation was only referring to a copy job, but in fact any
running blockjob will have the same results.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add documentation that the 'VIR_DOMAIN_BLOCK_COPY_TRANSIENT_JOB' flag
is auto-assumed if the block copy job is started while the VM is
transient and remove the restriction to define the domain when copy
is running.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Historically firewall rules for virtual networks were added straight
into the base chains. This works but has a number of bugs and design
limitations:
- It is inflexible for admins wanting to add extra rules ahead
of libvirt's rules, via hook scripts.
- It is not clear to the admin that the rules were created by
libvirt
- Each rule must be deleted by libvirt individually since they
are all directly in the builtin chains
- The ordering of rules in the forward chain is incorrect
when multiple networks are created, allowing traffic to
mistakenly flow between networks in one direction.
To address all of these problems, libvirt needs to move to creating
rules in its own private chains. In the top level builtin chains,
libvirt will add links to its own private top level chains.
Addressing the traffic ordering bug requires some extra steps. With
everything going into the FORWARD chain there was interleaving of rules
for outbound traffic and inbound traffic for each network:
-A FORWARD -d 192.168.3.0/24 -o virbr1 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.3.0/24 -i virbr1 -j ACCEPT
-A FORWARD -i virbr1 -o virbr1 -j ACCEPT
-A FORWARD -o virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -d 192.168.2.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.2.0/24 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
-A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
The rule allowing outbound traffic from virbr1 would mistakenly
allow packets from virbr1 to virbr0, before the rule denying input
to virbr0 gets a chance to run.
What we really need todo is group the forwarding rules into three
distinct sets:
* Cross rules - LIBVIRT_FWX
-A FORWARD -i virbr1 -o virbr1 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
* Incoming rules - LIBVIRT_FWI
-A FORWARD -d 192.168.3.0/24 -o virbr1 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -d 192.168.2.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable
* Outgoing rules - LIBVIRT_FWO
-A FORWARD -s 192.168.3.0/24 -i virbr1 -j ACCEPT
-A FORWARD -i virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -s 192.168.2.0/24 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
There is thus no risk of outgoing rules for one network mistakenly
allowing incoming traffic for another network, as all incoming rules
are evalated first.
With this in mind, we'll thus need three distinct chains linked from
the FORWARD chain, so we end up with:
INPUT --> LIBVIRT_INP (filter)
OUTPUT --> LIBVIRT_OUT (filter)
FORWARD +-> LIBVIRT_FWX (filter)
+-> LIBVIRT_FWO
\-> LIBVIRT_FWI
POSTROUTING --> LIBVIRT_PRT (nat & mangle)
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Some of the query callbacks want to know the firewall layer that was
being used for triggering the query to avoid duplicating that data.
Reviewed-by: Laine Stump <laine@laine.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Allow the platform driver impls to run logic before and after the
firewall reload process.
Reviewed-by: Laine Stump <laine@laine.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
disk->mirror would not be cleared while the local pointer was freed in
qemuDomainBlockCommit if qemuDomainObjExitMonitor or qemuBlockJobDiskNew
would return a failure.
Since block job handling is executed in the separate handler which needs
a qemu job, we don't need to pre-set the mirror state prior to starting
the job. Similarly the block copy job does not do that.
Move the setting of the data after starting the job so that we avoid
this problem.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
While this should not be necessary as we clear it in the event handler,
let's be sure and clear it prior to starting the job.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Switching a block job to some states (e.g. QEMU_BLOCKJOB_STATE_READY)
might not require a job, thus if it will become ready asynchronously we
should not overwrite the state any more.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>