https://bugzilla.redhat.com/show_bug.cgi?id=924153
Commit 904e05a2 (v0.9.9) added a per-<disk> seclabel element with
an attribute relabel='no' in order to try and minimize the
impact of shutdown delays when an NFS server disappears. The idea
was that if a disk is on NFS and can't be labeled in the first
place, there is no need to attempt the (no-op) relabel on domain
shutdown. Unfortunately, the way this was implemented was by
modifying the domain XML so that the optimization would survive
libvirtd restart, but in a way that is indistinguishable from an
explicit user setting. Furthermore, once the setting is turned
on, libvirt avoids attempts at labeling, even for operations like
snapshot or blockcopy where the chain is being extended or pivoted
onto non-NFS, where SELinux labeling is once again possible. As
a result, it was impossible to do a blockcopy to pivot from an
NFS image file onto a local file.
The solution is to separate the semantics of a chain that must
not be labeled (which the user can set even on persistent domains)
vs. the optimization of not attempting a relabel on cleanup (a
live-only annotation), and using only the user's explicit notation
rather than the optimization as the decision on whether to skip
a label attempt in the first place. When upgrading an older
libvirtd to a newer, an NFS volume will still attempt the relabel;
but as the avoidance of a relabel was only an optimization, this
shouldn't cause any problems.
In the ideal future, libvirt will eventually have XML describing
EVERY file in the backing chain, with each file having a separate
<seclabel> element. At that point, libvirt will be able to track
more closely which files need a relabel attempt at shutdown. But
until we reach that point, the single <seclabel> for the entire
<disk> chain is treated as a hint - when a chain has only one
file, then we know it is accurate; but if the chain has more than
one file, we have to attempt relabel in spite of the attribute,
in case part of the chain is local and SELinux mattered for that
portion of the chain.
* src/conf/domain_conf.h (_virSecurityDeviceLabelDef): Add new
member.
* src/conf/domain_conf.c (virSecurityDeviceLabelDefParseXML):
Parse it, for live images only.
(virSecurityDeviceLabelDefFormat): Output it.
(virDomainDiskDefParseXML, virDomainChrSourceDefParseXML)
(virDomainDiskSourceDefFormat, virDomainChrDefFormat)
(virDomainDiskDefFormat): Pass flags on through.
* src/security/security_selinux.c
(virSecuritySELinuxRestoreSecurityImageLabelInt): Honor labelskip
when possible.
(virSecuritySELinuxSetSecurityFileLabel): Set labelskip, not
norelabel, if labeling fails.
(virSecuritySELinuxSetFileconHelper): Fix indentation.
* docs/formatdomain.html.in (seclabel): Document new xml.
* docs/schemas/domaincommon.rng (devSeclabel): Allow it in RNG.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-*-labelskip.xml:
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-*-labelskip.args:
* tests/qemuxml2xmloutdata/qemuxml2xmlout-seclabel-*-labelskip.xml:
New test files.
* tests/qemuxml2argvtest.c (mymain): Run the new tests.
* tests/qemuxml2xmltest.c (mymain): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The imagelabel SELinux label was only generated when relabeling was
enabled. This prohibited labeling of files created by libvirt that need
to be labeled even if relabeling is turned off.
The only codepath this change has direct impact on is labeling of FDs
passed to qemu which is always safe in current state.
I realized after the fact that it's probably better in the long run to
give this function a name that matches the name of the link used in
sysfs to hold the group (iommu_group).
I'm changing it now because I'm about to add several more functions
that deal with iommu groups.
To not introduce more redundant code, helpers are added for
both "selinux", "dac", and "apparmor" backends.
Signed-off-by: Han Cheng <hanc.fnst@cn.fujitsu.com>
Signed-off-by: Osier Yang <jyang@redhat>
v2.5 - v3:
* Splitted from 8/10 of v2.5
* Don't forget the other backends (DAC, and apparmor)
The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
If virPCIDeviceGetVFIOGroupDev() failed,
virSecurity*(Set|Restore)HostdevLabel() would fail to free a
virPCIDevice that had been allocated.
These leaks were all introduced (by me) very recently, in commit
f0bd70a.
This isn't strictly speaking a bugfix, but I realized I'd gotten a bit
too verbose when I chose the names for
VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_*. This shortens them all a bit.
Legacy kvm style pci device assignment requires changes to the
labelling of several sysfs files for each device, but for vfio device
assignment, the only thing that needs to be relabelled/chowned is the
"group" device for the group that contains the device to be assigned.
There will soon be other items related to pci hostdevs that need to be
in the same part of the hostdevsubsys union as the pci address (which
is currently a single member called "pci". This patch replaces the
single member named pci with a struct named pci that contains a single
member named "addr".
Since cbe67ff9b0a5a94911afd4d12388bf182bcca86c
virSecuritySELinuxGenNewContext reports good error messages. Fix callers
that mask the errors by generic error message.
Detected by a simple Shell script:
for i in $(git ls-files -- '*.[ch]'); do
awk 'BEGIN {
fail=0
}
/# *include.*\.h/{
match($0, /["<][^">]*[">]/)
arr[substr($0, RSTART+1, RLENGTH-2)]++
}
END {
for (key in arr) {
if (arr[key] > 1) {
fail=1
printf("%d %s\n", arr[key], key)
}
}
if (fail == 1)
exit 1
}' $i
if test $? != 0; then
echo "Duplicate header(s) in $i"
fi
done;
A later patch will add the syntax-check to avoid duplicate
headers.
The virDomainDefGetSecurityLabelDef was modifying the domain XML.
It tried to find a seclabel corresponding to given sec driver. If the
label wasn't found, the function created one which is wrong. In fact
it's security manager which should modify this part of domain XML.
Normally libvirtd should run with a SELinux label
system_u:system_r:virtd_t:s0-s0:c0.c1023
If a user manually runs libvirtd though, it is sometimes
possible to get into a situation where it is running
system_u:system_r:init_t:s0
The SELinux security driver isn't expecting this and can't
parse the security label since it lacks the ':c0.c1023' part
causing it to complain
internal error Cannot parse sensitivity level in s0
This updates the parser to cope with this, so if no category
is present, libvirtd will hardcode the equivalent of c0.c1023.
Now this won't work if SELinux is in Enforcing mode, but that's
not an issue, because the user can only get into this problem
if in Permissive mode. This means they can now start VMs in
Permissive mode without hitting that parsing error
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Pull the code which parses the current process MCS range
out of virSecuritySELinuxMCSFind and into a new method
virSecuritySELinuxMCSGetProcessRange.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The body of the loop in virSecuritySELinuxMCSFind would
directly 'return NULL' on OOM, instead of jumping to the
cleanup label. This caused a leak of several local vars.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Coverity found the DACGenLabel was checking for mgr == NULL after a
possible dereference; however, in order to get into the function the
virSecurityManagerGenLabel would have already dereferenced sec_managers[i]
so the check was unnecessary. Same check is made in SELinuxGenSecurityLabel.
The existing virSecurityManagerSetProcessLabel() API is designed so
that it must be called after forking the child process, but before
exec'ing the child. Due to the way the virCommand API works, that
means it needs to be put in a "hook" function that virCommand is told
to call out to at that time.
Setting the child process label is a basic enough need when executing
any process that virCommand should have a method of doing that. But
virCommand must be told what label to set, and only the security
driver knows the answer to that question.
The new virSecurityManagerSet*Child*ProcessLabel() API is the way to
transfer the knowledge about what label to set from the security
driver to the virCommand object. It is given a virCommandPtr, and each
security driver calls the appropriate virCommand* API to tell
virCommand what to do between fork and exec.
1) in the case of the DAC security driver, it calls
virCommandSetUID/GID() to set a uid and gid that must be set for the
child process.
2) for the SELinux security driver, it calls
virCommandSetSELinuxLabel() to save a copy of the char* that will be
sent to setexeccon_raw() *after forking the child process*.
3) for the AppArmor security drivers, it calls
virCommandSetAppArmorProfile() to save a copy of the char* that will
be sent to aa_change_profile() *after forking the child process*.
With this new API in place, we will be able to remove
virSecurityManagerSetProcessLabel() from any virCommand pre-exec
hooks.
(Unfortunately, the LXC driver uses clone() rather than virCommand, so
it can't take advantage of this new security driver API, meaning that
we need to keep around the older virSecurityManagerSetProcessLabel(),
at least for now.)
On RHEL 5, I got:
security/security_selinux.c: In function 'getContext':
security/security_selinux.c:971: warning: unused parameter 'mgr' [-Wunused-parameter]
* src/security/security_selinux.c (getContext): Mark potentially
unused parameter.
The security manager drivers are not allowed to call back
out to top level security manager APIs, since that results
in recursive mutex acquisition and thus deadlock. Remove
calls to virSecurityManagerGetModel from SELinux / AppArmor
drivers
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit id a994ef2d1 changed the mechanism to store/update the default
security label from using disk->seclabels[0] to allocating one on the
fly. That change allocated the label, but never saved it. This patch
will save the label. The new virDomainDiskDefAddSecurityLabelDef() is
a copy of the virDomainDefAddSecurityLabelDef().
Prepare to support different types of hostdevs by refactoring
the current SELinux security driver code
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When LXC labels USB devices during hotplug, it is running in
host context, so it needs to pass in a vroot path to the
container root.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virSecurityManager{Set,Restore}AllLabel methods are invoked
at domain startup/shutdown to relabel resources associated with
a domain. This works fine with QEMU, but with LXC they are in
fact both currently no-ops since LXC does not support disks,
hostdevs, or kernel/initrd files. Worse, when LXC gains support
for disks/hostdevs, they will do the wrong thing, since they
run in host context, not container context. Thus this patch
turns then into a formal no-op when used with LXC. The LXC
controller will call out to specific security manager labelling
APIs as required during startup.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current SELinux policy only works for KVM guests, since
TCG requires the 'execmem' privilege. There is a 'virt_use_execmem'
boolean to turn this on globally, but that is unpleasant for users.
This changes libvirt to automatically use a new 'svirt_tcg_t'
context for TCG based guests. This obsoletes the previous
boolean tunable and makes things 'just work(tm)'
Since we can't assume we run with new enough policy, I also
make us log a warning message (once only) if we find the policy
lacks support. In this case we fallback to the normal label and
expect users to set the boolean tunable
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The impls of virSecurityManagerGetMountOptions had no way to
return errors, since the code was treating 'NULL' as a success
value. This is somewhat pointless, since the calling code did
not want NULL in the first place and has to translate it into
the empty string "". So change the code so that the impls can
return "" directly, allowing use of NULL for error reporting
once again
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>