Commit Graph

36 Commits

Author SHA1 Message Date
Michal Privoznik
8abd1ffed1 qemu_namespace: Be tolerant to non-existent files when populating /dev
In 6.7.0 release I've changed how domain namespace is built and
populated. Previously it used to be done from a pre-exec hook
(ran in the forked off child, just before dropping all privileges
and exec()-ing QEMU), which not only meant we had to have two
different code paths for creating a node in domain's namespace
(one for this pre-exec hook, the other for hotplug ran from the
daemon), it also proved problematic because it was leaking FDs
into QEMU process.

To mitigate this problem, we've not only ditched libdevmapper
from the NS population process, I've also dropped the pre-exec
code and let the NS be populated from the daemon (using the
hotplug code). But, I was not careful when doing so, because the
pre-exec code was tolerant to files that doesn't exist, while
this new code isn't. For instance, the very first thing that is
done when the new NS is created is it's populated with
@defaultDeviceACL which contain files like /dev/null, /dev/zero,
/dev/random and /dev/kvm (and others).  While the rest will
probably exist every time, /dev/kvm might not and thus the new
code I wrote has to be tolerant to that.

Of course, users can override the @defaultDeviceACL (by setting
cgroup_device_acl in qemu.conf) and remove /dev/kvm (which is
acceptable workaround), but we definitely want libvirt to work
out of the box even on hosts without KVM.

Fixes: 9048dc4e62
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-04 08:18:21 +02:00
Michal Privoznik
95b9db4ee2 lib: Prefer WITH_* prefix for #if conditionals
Currently, we are mixing: #if HAVE_BLAH with #if WITH_BLAH.
Things got way better with Pavel's work on meson, but apparently,
mixing these two lead to confusing and easy to miss bugs (see
31fb929eca for instance). While we were forced to use HAVE_
prefix with autotools, we are free to chose our own prefix with
meson and since WITH_ prefix appears to be more popular let's use
it everywhere.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-02 10:28:10 +02:00
Michal Privoznik
db37396e41 qemu_namespace: Don't build namespace if domain doesn't have it enabled
Even if namespaces are disabled, then due to a missing check at the
beginning of qemuDomainBuildNamespace(), the domain startup code
still tries to populate (nonexistent) domain's namespace.

Fixes: 8da362fe62
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2020-08-24 19:19:47 +02:00
Michal Privoznik
f4f3e6de4a qemuDomainNamespaceTeardownInput: Deduplicate code
We can use qemuDomainSetupInput() to obtain the path that we
need to unlink() from within domain's namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 20:01:01 +02:00
Michal Privoznik
b9338334d5 qemuDomainNamespaceTeardownRNG: Deduplicate code
We can use qemuDomainSetupRNG() to obtain the path that we
need to unlink() from within domain's namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 20:00:34 +02:00
Michal Privoznik
3d74d6e283 qemuDomainNamespaceTeardownChardev: Deduplicate code
We can use qemuDomainSetupChardev() to obtain the path that we
need to unlink() from within domain's namespace.  Note, while
previously we unlinked only VIR_DOMAIN_CHR_TYPE_DEV chardevs,
with this change we unlink some other types too - exactly those
types we created when plugging the device in.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 20:00:08 +02:00
Michal Privoznik
4e4dc63ca8 qemuDomainNamespaceTeardownMemory: Deduplicate code
We can use qemuDomainSetupMemory() to obtain the path that we
need to unlink() from within domain's namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:59:42 +02:00
Michal Privoznik
0983833ed9 qemuDomainNamespaceTeardownHostdev: Unlink paths in one go
In my attempt to deduplicate the code, we can use
qemuDomainSetupHostdev() to obtain the list of paths to unlink
and then pass it to qemuDomainNamespaceUnlinkPaths() to unlink
them in a single fork.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:59:17 +02:00
Michal Privoznik
f7feac4ba8 qemuDomainNamespaceUnlinkPaths: Turn @paths into string list
So far, the only caller qemuDomainNamespaceUnlinkPath() will
always pass a single path to unlink, but similarly to
qemuDomainNamespaceMknodPaths() - there are a few callers that
would like to pass two or more files to unlink at once (held in a
string list). Make the @paths argument a string list then.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:58:55 +02:00
Michal Privoznik
52fa81ac52 qemu_namespace: Rename qemuDomainNamespaceUnlinkPath() to qemuNamespaceUnlinkPath()
To match how Mknod counterpart was renamed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-03 19:58:34 +02:00
Michal Privoznik
5c86fbb72d qemuDomainDetachDeviceUnlink: Unlink paths in one go
Simirarly to qemuDomainAttachDeviceMknodHelper() which was
modified just a couple of commits ago, modify the unlink helper
which is called on device detach so that it can unlink multiple
files in one go instead of forking off for every single one of
them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:58:29 +02:00
Michal Privoznik
a83a2041eb qemu_domain_namespace: Drop unused functions
After previous cleanup, creating /dev nodes from pre-exec hook is
no longer needed and thus can be removed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:38 +02:00
Michal Privoznik
40592f168f qemuDomainBuildNamespace: Populate SEV from daemon's namespace
As mentioned in one of previous commits, populating domain's
namespace from pre-exec() hook is dangerous. This commit moves
population of the namespace with domain SEV into daemon's
namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:38 +02:00
Michal Privoznik
6483b1e32b qemuDomainBuildNamespace: Populate loader from daemon's namespace
As mentioned in one of previous commits, populating domain's
namespace from pre-exec() hook is dangerous. This commit moves
population of the namespace with domain loader into daemon's
namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:38 +02:00
Michal Privoznik
408f64df9f qemuDomainBuildNamespace: Populate RNGs from daemon's namespace
As mentioned in one of previous commits, populating domain's
namespace from pre-exec() hook is dangerous. This commit moves
population of the namespace with domain RNGs into daemon's
namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:38 +02:00
Michal Privoznik
c872905242 qemuDomainBuildNamespace: Populate inputs from daemon's namespace
As mentioned in one of previous commits, populating domain's
namespace from pre-exec() hook is dangerous. This commit moves
population of the namespace with domain inputs into daemon's
namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:38 +02:00
Michal Privoznik
5f4f7c2094 qemuDomainBuildNamespace: Populate graphics from daemon's namespace
As mentioned in one of previous commits, populating domain's
namespace from pre-exec() hook is dangerous. This commit moves
population of the namespace with domain graphics (render node)
into daemon's namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:37 +02:00
Michal Privoznik
87ae5262a0 qemuDomainBuildNamespace: Populate TPM from daemon's namespace
As mentioned in one of previous commits, populating domain's
namespace from pre-exec() hook is dangerous. This commit moves
population of the namespace with domain TPM into daemon's
namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:37 +02:00
Michal Privoznik
a10a229269 qemuDomainBuildNamespace: Populate chardevs from daemon's namespace
As mentioned in one of previous commits, populating domain's
namespace from pre-exec() hook is dangerous. This commit moves
population of the namespace with domain chardevs into daemon's
namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:37 +02:00
Michal Privoznik
7e80f98dbe qemuDomainBuildNamespace: Populate memory from daemon's namespace
As mentioned in one of previous commits, populating domain's
namespace from pre-exec() hook is dangerous. This commit moves
population of the namespace with domain memory (nvdimms) into
daemon's namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:37 +02:00
Michal Privoznik
48b6eabf56 qemuDomainBuildNamespace: Populate hostdevs from daemon's namespace
As mentioned in one of previous commits, populating domain's
namespace from pre-exec() hook is dangerous. This commit moves
population of the namespace with domain hostdevs into daemon's
namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:37 +02:00
Michal Privoznik
afc6304ef8 qemuDomainBuildNamespace: Populate disks from daemon's namespace
As mentioned in one of previous commits, populating domain's
namespace from pre-exec() hook is dangerous. This commit moves
population of the namespace with domain disks into daemon's
namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:36 +02:00
Michal Privoznik
9048dc4e62 qemuDomainBuildNamespace: Populate basic /dev from daemon's namespace
As mentioned in previous commit, populating domain's namespace
from pre-exec() hook is dangerous. This commit moves population
of the namespace with basic /dev nodes (e.g. /dev/null, /dev/kvm,
etc.) into daemon's namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:36 +02:00
Michal Privoznik
8da362fe62 qemu_domain_namespace: Repurpose qemuDomainBuildNamespace()
Okay, here is the deal. Currently, the way we build namespace is
very fragile. It is done from pre-exec hook when starting a
domain, after we mass closed all FDs and before we drop
privileges and exec() QEMU. This fact poses some limitations onto
the namespace build code, e.g. it has to make sure not to keep
any FD opened (not even through a library call), because it would
be leaked to QEMU. Also, it has to call only async signal safe
functions. These requirements are hard to meet - in fact as of my
commit v6.2.0-rc1~235 we are leaking a FD into QEMU by calling
libdevmapper functions.

To solve this issue and avoid similar problems in the future, we
should change our paradigm. We already have functions which can
populate domain's namespace with nodes from the daemon context.
If we use them to populate the namespace and keep only the bare
minimum in the pre-exec hook, we've mitigated the risk.

Therefore, the old qemuDomainBuildNamespace() is renamed to
qemuDomainUnshareNamespace() and new qemuDomainBuildNamespace()
function is introduced. So far, the new function is basically a
NOP and domain's namespace is still populated from the pre-exec
hook - next patches will fix it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:36 +02:00
Michal Privoznik
f1ac53772d qemuDomainSetupDisk: Accept @src
The aim to make it look as close to
qemuDomainNamespaceSetupDisk() as possible. The latter will call
the former and this change makes that diff easier to read.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:36 +02:00
Michal Privoznik
277412df51 qemuNamespaceMknodPaths: Turn @paths into string list
Every caller does the same - counts the number of items in a
string list they have, only to pass the number to
qemuDomainNamespaceMknodPaths(). This is needless - the function
can accept the string list and count the items itself.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:36 +02:00
Michal Privoznik
f17088975d qemuDomainNamespaceMknodPaths: Create more files in one go
While the previous commit prepared the helper function run in a
forked off helper (with corresponding struct), this commit
modifies the caller, which now create all files requested in a
single process and does not fork off for every single path.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:35 +02:00
Michal Privoznik
86d2e323f4 qemuDomainAttachDeviceMknodHelper: Create more files in a single go
So far, when attaching a device needs two or more /dev nodes
created into a domain, we fork off and run the helper for every
node separately. For majority of devices this is okay, because
they need no or one node created anyway. But the idea is to use
this attach code to build the namespace when starting a domain,
in which case there will be way more nodes than one.

To achieve this, the recursive approach for handling symlinks has
to be turned into an iterative one.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:35 +02:00
Michal Privoznik
bf9aeab4f0 qemuDomainAttachDeviceMknodRecursive: Isolate bind mounted devices condition
When attaching a device into a domain, the corresponding /dev
node might need to be created in the domain's namespace. For some
types of files we call mknod(), for symlinks we call symlink(),
but for others - which exist in the host namespace - we need to
so called 'bind mount' them (which is a way of passing a
file/directory between mount namespaces). There is this condition
in qemuDomainAttachDeviceMknodRecursive() which decides whether a
bind mount will be used, move it into a separate function so that
it can be reused later.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:35 +02:00
Michal Privoznik
08277c2bc6 qemu_domain_namespace.c: Rename qemuDomainAttachDeviceMknodData
This structure is going to be used from not only device attach
code, but also when building the namespace. Moreover, the code
lives in a separate file so the chances of clashing with another
name are minimal.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:35 +02:00
Michal Privoznik
759921d47c qemuDomainAttachDeviceMknodHelper: Don't leak data->target
It's not really a problem since this is a helper process that
dies as soon as the helper function returns, but the cleanup code
will be replaced with a function soon and this change prepares
the code for that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:35 +02:00
Michal Privoznik
9d8d42137a qemuDomainNamespaceSetupHostdev: Create paths in one go
While qemuDomainNamespaceMknodPaths() doesn't actually create
files in the namespace in one go (it forks for each path), it a
few commits time it will.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:34 +02:00
Michal Privoznik
c467b07e27 qemu_domain_namespace: Check for namespace enablement earlier
Functions that create a device node after domain startup (used
from hotplug) will get a list of paths they want to create and
eventually call qemuDomainNamespaceMknodPaths() which then checks
whether domain mount namespace is enabled in the first place.
Alternatively, on device hotunplug, we might want to delete a
path inside domain namespace in which case
qemuDomainNamespaceUnlinkPaths() checks whether the namespace is
enabled. While this is not dangerous, it certainly burns a couple
of CPU cycles needlessly.

Check whether mount namespace is enabled upfront.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:34 +02:00
Michal Privoznik
68a4320b95 qemu_domain_namespace: Drop unused @cfg argument
There is a lot of functions called from
qemuDomainBuildNamespace() that accept @cfg
(virQEMUDriverConfigPtr) as an argument and don't use it.
Historically, it was done so that all qemuDomainSetupAll*()
functions look the same.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:34 +02:00
Michal Privoznik
764eaf1aa4 qemu_domain_namespace: Rename qemuDomainCreateNamespace()
The name of this function is not very helpful, because it doesn't
create anything, it just flips a bit in a bitmask when domain is
starting up. Move the function internals into qemu_process.c and
forget the function ever existed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:40:33 +02:00
Michal Privoznik
90eee87569 qemu: Separate out namespace handling code
The qemu_domain.c file is big as is and we should split it into
separate semantic blocks. Start with code that handles domain
namespaces.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-03 19:32:27 +02:00