Don't fail when there is nothing to do, as a tweak to the previous
patch regarding output of libvirt-UUID.files for LXC apparmor profiles
Signed-off-by: Eric Blake <eblake@redhat.com>
This negation in names of boolean variables is driving me insane. The
code is much more readable if we drop the 'no-' prefix. Well, at least
for me.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add security driver functions to label separate storage images using the
virStorageSource definition. This will help to avoid the need to do ugly
changes to the disk struct and use the source directly.
In the future we might need to track state of individual images. Move
the readonly and shared flags to the virStorageSource struct so that we
can keep them in a per-image basis.
The function headers contain type on the same line as the name. When
combined with usage of ATTRIBUTE_UNUSED, the function headers were very
long. Shorten them by breaking the line after the type.
virSecurityManagerSetDiskLabel and virSecurityManagerRestoreDiskLabel
don't have complementary semantics. Document the semantics to avoid
possible problems.
I'm going to add functions that will deal with individual image files
rather than whole disks. Rename the security function to make room for
the new one.
I'm going to add functions that will deal with individual image files
rather than whole disks. Rename the security function to make room for
the new one.
The image labels are stored in the virStorageSource struct. Convert the
virDomainDiskDefGetSecurityLabelDef helper not to use the full disk def
and move it appropriately.
A network disk might actually be backed by local storage. Also the path
iterator actually handles networked disks well now so remove the code
that skips the labelling in dac and selinux security driver.
As part of the work on backing chains, I'm finding that it would
be easier to directly manipulate chains of pointers (adding a
snapshot merely adjusts pointers to form the correct list) rather
than copy data from one struct to another. This patch converts
domain disk source to be a pointer.
In this patch, the pointer is ALWAYS allocated (thanks in part to
the previous patch forwarding all disk def allocation through a
common point), and all other changse are just mechanical fallout of
the new type; there should be no functional change. It is possible
that we may want to leave the pointer NULL for a cdrom with no
medium in a later patch, but as that requires a closer audit of the
source to ensure we don't fault on a null dereference, I didn't do
it here.
* src/conf/domain_conf.h (_virDomainDiskDef): Change type of src.
* src/conf/domain_conf.c: Adjust all clients.
* src/security/security_selinux.c: Likewise.
* src/qemu/qemu_domain.c: Likewise.
* src/qemu/qemu_command.c: Likewise.
* src/qemu/qemu_conf.c: Likewise.
* src/qemu/qemu_process.c: Likewise.
* src/qemu/qemu_migration.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/lxc/lxc_driver.c: Likewise.
* src/lxc/lxc_controller.c: Likewise.
* tests/securityselinuxlabeltest.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
My future work will modify the metadata crawler function to use the
storage driver file APIs to access the files instead of accessing them
directly so that we will be able to request the metadata for remote
files too. To avoid linking the storage driver to every helper file
using the utils code, the backing chain traversal function needs to be
moved to the storage driver source.
Additionally the virt-aa-helper and virstoragetest programs need to be
linked with the storage driver as a result of this change.
In "src/conf/domain_conf.h" there are many enum declarations. The
cleanup in this header filer was started, but it wasn't enough and
there are many other files that has enum variables declared. So, the
commit was starting to be big. This commit finish the cleanup in this
header file and in other files that has enum variables, parameters,
or functions declared.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
In "src/conf/domain_conf.h" there are many enumerations (enum)
declarations to be converted as a typedef too. As mentioned before,
it's better to use a typedef for variable types, function types and
other usages. I think this file has most of those enum declarations
at "src/conf/". So, me and Eric Blake plan to keep the cleanups all
over the source code. This time, most of the files changed in this
commit are related to part of one file: "src/conf/domain_conf.h".
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
With dynamic_ownership = 1 but no seclabels, RestoreChardevLabel
dereferences the NULL seclabel when checking if norelabel is set.
Remove this check, since it is already done in RestoreSecurityAllLabel
and if norelabel is set, RestoreChardevLabel is never called.
The domain definition is clearly used a few lines
below so there's no need to mark @def as unused.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The DAC driver ignores the relabel='no' attribute in chardev config
<serial type='file'>
<source path='/tmp/jim/test.file'>
<seclabel model='dac' relabel='no'/>
</source>
<target port='0'/>
</serial>
This patch avoids labeling chardevs when relabel='no' is specified.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
When relabel='no' at the domain level, there is no need to call
the hostdev relabeling functions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
https://bugzilla.redhat.com/show_bug.cgi?id=999301
The DAC driver ignores the relabel='no' attribute in disk config
<disk type='file' device='floppy'>
<driver name='qemu' type='raw'/>
<source file='/some/path/floppy.img'>
<seclabel model='dac' relabel='no'/>
</source>
<target dev='fda' bus='fdc'/>
<readonly/>
</disk>
This patch avoid labeling disks when relabel='no' is specified.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
If relabel='no' at the domain level, no need to attempt relabeling
in virSecurityDAC{Set,Restore}SecurityAllLabel().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Currently, the DAC security driver passes callback data as
void params[2];
params[0] = mgr;
params[1] = def;
Clean this up by defining a structure for passing the callback
data. Moreover, there's no need to pass the whole virDomainDef
in the callback as the only thing needed in the callbacks is
virSecurityLabelDefPtr.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
In switch statements, use enum types since it is safer when
adding new items to the enum.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Annotate some static function parameters with ATTRIBUTE_NONNULL
and remove checks for NULL inputs.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
To avoid having the root of a backing chain present twice in the list we
need to invert the working of virStorageFileGetMetadataRecurse.
Until now the recursive worker created a new backing chain element from
the name and other information passed as arguments. This required us to
pass the data of the parent in a deconstructed way and the worker
created a new entry for the parent.
This patch converts this function so that it just fills in metadata
about the parent and creates a backing chain element from those. This
removes the duplication of the first element.
To avoid breaking the test suite, virstoragetest now calls a wrapper
that creates the parent structure explicitly and pre-fills it with the
test data with same function signature as previously used.
Switch over to storing of the backing chain as a recursive
virStorageSource structure.
This is a string based move. Currently the first element will be present
twice in the backing chain as currently the retrieval function stores
the parent in the newly detected chain. This will be fixed later.
Since it is an abbreviation, PCI should always be fully
capitalized or full lower case, never Pci.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
I noticed that the apparmor code could request metadata even
for a cdrom with no media, which would cause a memory leak of
the hash table used to look for loops in the backing chain.
But even before that, we blindly dereferenced the path for
printing a debug statement, so it is just better to enforce
that this is only used on non-NULL names.
* src/util/virstoragefile.c (virStorageFileGetMetadata): Assume
non-NULL path.
* src/util/virstoragefile.h: Annotate this.
* src/security/virt-aa-helper.c (get_files): Fix caller.
Signed-off-by: Eric Blake <eblake@redhat.com>
Coverity complains about a possible leak of seclabel if
!sec_managers[i]->drv->domainGenSecurityLabel is true
and the seclabel might be overwritten by the next iteration
of the loop.
This leak should never happen, because every security driver
has domainGenSecurityLabel defined.
The code in virstoragefile.c is getting more complex as I
consolidate backing chain handling code. But for the setuid
virt-login-shell, we don't need to crawl backing chains. It's
easier to audit things for setuid security if there are fewer
files involved, so this patch moves the one function that
virFileOpen() was actually relying on to also live in virfile.c.
* src/util/virstoragefile.c (virStorageFileIsSharedFS)
(virStorageFileIsSharedFSType): Move...
* src/util/virfile.c (virFileIsSharedFS, virFileIsSharedFSType):
...to here, and rename.
(virFileOpenAs): Update caller.
* src/security/security_selinux.c
(virSecuritySELinuxSetFileconHelper)
(virSecuritySELinuxSetSecurityAllLabel)
(virSecuritySELinuxRestoreSecurityImageLabelInt): Likewise.
* src/security/security_dac.c
(virSecurityDACRestoreSecurityImageLabelInt): Likewise.
* src/qemu/qemu_driver.c (qemuOpenFileAs): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationIsSafe): Likewise.
* src/util/virstoragefile.h: Adjust declarations.
* src/util/virfile.h: Likewise.
* src/libvirt_private.syms (virfile.h, virstoragefile.h): Move
symbols as appropriate.
Signed-off-by: Eric Blake <eblake@redhat.com>
In order to reuse the newly-created host-side disk struct in
the virstoragefile backing chain code, I first have to move
it to util/. This starts the process, by first moving the
security label structures.
* src/conf/domain_conf.h (virDomainDefGenSecurityLabelDef)
(virDomainDiskDefGenSecurityLabelDef, virSecurityLabelDefFree)
(virSecurityDeviceLabelDefFree, virSecurityLabelDef)
(virSecurityDeviceLabelDef): Move...
* src/util/virseclabel.h: ...to new file.
(virSecurityLabelDefNew, virSecurityDeviceLabelDefNew): Rename the
GenSecurity functions.
* src/qemu/qemu_process.c (qemuProcessAttach): Adjust callers.
* src/security/security_manager.c (virSecurityManagerGenLabel):
Likewise.
* src/security/security_selinux.c
(virSecuritySELinuxSetSecurityFileLabel): Likewise.
* src/util/virseclabel.c: New file.
* src/conf/domain_conf.c: Move security code, and fix fallout.
* src/Makefile.am (UTIL_SOURCES): Build new file.
* src/libvirt_private.syms (domain_conf.h): Move symbols...
(virseclabel.h): ...to new section.
Signed-off-by: Eric Blake <eblake@redhat.com>
See lp#1276719 for the bug description. As virt-aa-helper doesn't know
the VFIO groups to use for the guest, allow access to all
/dev/vfio/[0-9]* and /dev/vfio/vfio files if there is a potential need
for vfio
Signed-off-by: Eric Blake <eblake@redhat.com>
It's finally time to start tracking disk backing chains in
<domain> XML. The first step is to start refactoring code
so that we have an object more convenient for representing
each host source resource in the context of a single guest
<disk>. Ultimately, I plan to move the new type into src/util
where it can be reused by virStorageFile, but to make the
transition easier to review, this patch just creates the
new type then fixes everything until it compiles again.
* src/conf/domain_conf.h (_virDomainDiskDef): Split...
(_virDomainDiskSourceDef): ...to new struct.
(virDomainDiskAuthClear): Use new type.
* src/conf/domain_conf.c (virDomainDiskDefFree): Split...
(virDomainDiskSourceDefClear): ...to new function.
(virDomainDiskGetType, virDomainDiskSetType)
(virDomainDiskGetSource, virDomainDiskSetSource)
(virDomainDiskGetDriver, virDomainDiskSetDriver)
(virDomainDiskGetFormat, virDomainDiskSetFormat)
(virDomainDiskAuthClear, virDomainDiskGetActualType)
(virDomainDiskDefParseXML, virDomainDiskSourceDefFormat)
(virDomainDiskDefFormat, virDomainDiskDefForeachPath)
(virDomainDiskDefGetSecurityLabelDef)
(virDomainDiskSourceIsBlockType): Adjust all users.
* src/lxc/lxc_controller.c (virLXCControllerSetupDisk):
Likewise.
* src/lxc/lxc_driver.c (lxcDomainAttachDeviceMknodHelper):
Likewise.
* src/qemu/qemu_command.c (qemuAddRBDHost, qemuParseRBDString)
(qemuParseDriveURIString, qemuParseGlusterString)
(qemuParseISCSIString, qemuParseNBDString)
(qemuDomainDiskGetSourceString, qemuBuildDriveStr)
(qemuBuildCommandLine, qemuParseCommandLineDisk)
(qemuParseCommandLine): Likewise.
* src/qemu/qemu_conf.c (qemuCheckSharedDevice)
(qemuAddISCSIPoolSourceHost, qemuTranslateDiskSourcePool):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainUpdateDeviceConfig)
(qemuDomainPrepareDiskChainElement)
(qemuDomainSnapshotCreateInactiveExternal)
(qemuDomainSnapshotPrepareDiskExternalBackingInactive)
(qemuDomainSnapshotPrepareDiskInternal)
(qemuDomainSnapshotPrepare)
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive)
(qemuDomainBlockPivot, qemuDomainBlockJobImpl)
(qemuDomainBlockCopy, qemuDomainBlockCommit): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationIsSafe): Likewise.
* src/qemu/qemu_process.c (qemuProcessGetVolumeQcowPassphrase)
(qemuProcessInitPasswords): Likewise.
* src/security/security_selinux.c
(virSecuritySELinuxSetSecurityFileLabel): Likewise.
* src/storage/storage_driver.c (virStorageFileInitFromDiskDef):
Likewise.
* tests/securityselinuxlabeltest.c (testSELinuxLoadDef):
Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Part of a series of cleanups to use new accessor methods.
* src/security/security_dac.c (virSecurityDACSetSecurityImageLabel)
(virSecurityDACRestoreSecurityImageLabelInt)
(virSecurityDACSetSecurityAllLabel): Use accessors.
* src/security/security_selinux.c
(virSecuritySELinuxRestoreSecurityImageLabelInt)
(virSecuritySELinuxSetSecurityImageLabel)
(virSecuritySELinuxSetSecurityAllLabel): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Per the documentation, is_selinux_enabled() returns -1 on error.
Account for this. Previously when -1 was being returned the condition
would still be true. I was noticing this because on my system that has
selinux disabled I was getting this in the libvirt.log every 5
seconds:
error : virIdentityGetSystem:173 : Unable to lookup SELinux process context: Invalid argument
With this patch applied, I no longer get these messages every 5
seconds. I am submitting this in case its deemed useful for inclusion.
Anyone have any comments on this change? This is a patch off current
master.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Any source file which calls the logging APIs now needs
to have a VIR_LOG_INIT("source.name") declaration at
the start of the file. This provides a static variable
of the virLogSource type.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If there should be some sort of separator it is better to use comment
with the filename, copyright, description, license information and
authors.
Found by:
git grep -nH '^$' | grep '\.[ch]:1:'
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Make virt-aa-helper create rules to allow VMs access to filesystem
mounts from the host.
Signed-off-by: Felix Geyer <debfx@fobos.de>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Guido Günther <agx@sigxcpu.org>
use_apparmor() was first designed to be called from withing libvirtd,
but libvirt_lxc also uses it. in libvirt_lxc, there is no need to check
whether to use apparmor or not: just use it if possible.
Commit 2ce63c1 added imagelabel generation when relabeling is turned
off. But we weren't filling out the sensitivity for type 'none' labels,
resulting in an invalid label:
$ virsh managedsave domain
error: unable to set security context 'system_u:object_r:svirt_image_t'
on fd 28: Invalid argument
To support passing the path of the test data to the utils, one
more argument is added to virSCSIDeviceGetSgName,
virSCSIDeviceGetDevName, and virSCSIDeviceNew, and the related
code is changed accordingly.
Later tests for the scsi utils will be based on this patch.
Signed-off-by: Osier Yang <jyang@redhat.com>
Unlike the host devices of other types, SCSI host device XML supports
"shareable" tag. This patch introduces it for the virSCSIDevice struct
for a later patch use (to detect if the SCSI device is shareable when
preparing the SCSI host device in QEMU driver).
https://bugzilla.redhat.com/show_bug.cgi?id=996543
When starting up a domain, the SELinux labeling is done depending on
current configuration. If the labeling fails we check for possible
causes, as not all labeling failures are fatal. For example, if the
labeled file is on NFS which lacks SELinux support, the file can still
be readable to qemu process. These cases are distinguished by the errno
code: NFS without SELinux support returns EOPNOTSUPP. However, we were
missing one scenario. In case there's a read-only disk on a read-only
NFS (and possibly any FS) and the labeling is just optional (not
explicitly requested in the XML) there's no need to make the labeling
error fatal. In other words, read-only file on read-only NFS can fail to
be labeled, but be readable at the same time.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We weren't very consistent in our use of VIR_ERR_NO_SUPPORT; many
users just passed __FUNCTION__ on, while others passed "%s" to
silence over-eager compilers that warn about __FUNCTION__ not
containing any %. It's nicer to route all these uses through
a single macro, so that if we ever need to change the reporting,
we can do it in one place.
I verified that 'virsh -c test:///default qemu-monitor-command test foo'
gives the same error message before and after this patch:
error: this function is not supported by the connection driver: virDomainQemuMonitorCommand
Note that in libvirt.c, we were inconsistent on whether virDomain*
API used virLibConnError() (with VIR_FROM_NONE) or virLibDomainError()
(with VIR_FROM_DOMAIN); this patch unifies these errors to all use
VIR_FROM_NONE, on the grounds that it is unlikely that a caller
learning that a call is unimplemented can do anything in particular
with extra knowledge of which error domain it belongs to.
One particular change to note is virDomainOpenGraphics which was
trying to fail with VIR_ERR_NO_SUPPORT after a failed
VIR_DRV_SUPPORTS_FEATURE check; all other places that fail a
feature check report VIR_ERR_ARGUMENT_UNSUPPORTED.
* src/util/virerror.h (virReportUnsupportedError): New macro.
* src/libvirt-qemu.c: Use new macro.
* src/libvirt-lxc.c: Likewise.
* src/lxc/lxc_driver.c: Likewise.
* src/security/security_manager.c: Likewise.
* src/util/virinitctl.c: Likewise.
* src/libvirt.c: Likewise.
(virDomainOpenGraphics): Use correct error for unsupported feature.
Signed-off-by: Eric Blake <eblake@redhat.com>
For a while we're have random failures of 'securityselinuxtest'
which were not at all reproducible. Fortunately we finally
caught a failure with VIR_TEST_DEBUG=1 enabled. This revealed
TEST: securityselinuxtest
1) GenLabel "dynamic unconfined, s0, c0.c1023" ... OK
2) GenLabel "dynamic unconfined, s0, c0.c1023" ... OK
3) GenLabel "dynamic unconfined, s0, c0.c1023" ... OK
4) GenLabel "dynamic virtd, s0, c0.c1023" ... OK
5) GenLabel "dynamic virtd, s0, c0.c10" ... OK
6) GenLabel "dynamic virtd, s2-s3, c0.c1023" ... OK
7) GenLabel "dynamic virtd, missing range" ... Category two 1024 is out of range 0-1023
FAILED
FAIL: securityselinuxtest
And sure enough we had an off-by-1 in the MCS range code when
the current process has no range set. The test suite randomly
allocates 2 categories from 0->1024 so the chances of hitting
this in the test suite were slim indeed :-)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To make it easier to forbid future attempts at a confusing typedef
name ending in Ptr that isn't actually a pointer, insist that we
follow our preferred style of 'typedef foo *fooPtr'.
* cfg.mk (sc_forbid_const_pointer_typedef): Enforce consistent
style, to prevent issue fixed in previous storage patch.
* src/conf/capabilities.h (virCapsPtr): Fix offender.
* src/security/security_stack.c (virSecurityStackItemPtr):
Likewise.
* tests/qemucapabilitiestest.c (testQemuDataPtr): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
To ensure proper processing by virGetUserID() and virGetGroupID()
of a uid/gid add a "+" prior to the uid/gid to denote it's really
a uid/gid for the label.
Merge the functions 'virSecurityDACSetUser' and
'virSecurityDACSetGroup' into 'virSecurityDACSetUserAndGroup'.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
'const fooPtr' is the same as 'foo * const' (the pointer won't
change, but it's contents can). But in general, if an interface
is trying to be const-correct, it should be using 'const foo *'
(the pointer is to data that can't be changed).
Fix up offenders in src/security.
* src/security/security_apparmor.c (reload_profile)
(AppArmorSetSecurityHostdevLabelHelper)
(AppArmorReleaseSecurityLabel, AppArmorRestoreSecurityAllLabel)
(AppArmorSetSecurityProcessLabel)
(AppArmorSetSecurityChildProcessLabel)
(AppArmorSetSecurityImageLabel, AppArmorSecurityVerify)
(AppArmorSetSecurityHostdevLabel)
(AppArmorRestoreSecurityHostdevLabel, AppArmorSetFDLabel): Drop
needless const.
* src/security/security_selinux.c
(virSecuritySELinuxSetSecurityFileLabel): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 29fe5d7 (released in 1.1.1) introduced a latent problem
for any caller of virSecurityManagerSetProcessLabel and where
the domain already had a uid:gid label to be parsed. Such a
setup would collect the list of supplementary groups during
virSecurityManagerPreFork, but then ignores that information,
and thus fails to call setgroups() to adjust the supplementary
groups of the process.
Upstream does not use virSecurityManagerSetProcessLabel for
qemu (it uses virSecurityManagerSetChildProcessLabel instead),
so this problem remained latent until backporting the initial
commit into v0.10.2-maint (commit c061ff5, released in 0.10.2.7),
where virSecurityManagerSetChildProcessLabel has not been
backported. As a result of using a different code path in the
backport, attempts to start a qemu domain that runs as qemu:qemu
will end up with supplementary groups unchanged from the libvirtd
parent process, rather than the desired supplementary groups of
the qemu user. This can lead to failure to start a domain
(typical Fedora setup assigns user 107 'qemu' to both group 107
'qemu' and group 36 'kvm', so a disk image that is only readable
under kvm group rights is locked out). Worse, it is a security
hole (the qemu process will inherit supplemental group rights
from the parent libvirtd process, which means it has access
rights to files owned by group 0 even when such files should
not normally be visible to user qemu).
LXC does not use the DAC security driver, so it is not vulnerable
at this time. Still, it is better to plug the latent hole on
the master branch first, before cherry-picking it to the only
vulnerable branch v0.10.2-maint.
* src/security/security_dac.c (virSecurityDACGetIds): Always populate
groups and ngroups, rather than only when no label is parsed.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=924153
Commit 904e05a2 (v0.9.9) added a per-<disk> seclabel element with
an attribute relabel='no' in order to try and minimize the
impact of shutdown delays when an NFS server disappears. The idea
was that if a disk is on NFS and can't be labeled in the first
place, there is no need to attempt the (no-op) relabel on domain
shutdown. Unfortunately, the way this was implemented was by
modifying the domain XML so that the optimization would survive
libvirtd restart, but in a way that is indistinguishable from an
explicit user setting. Furthermore, once the setting is turned
on, libvirt avoids attempts at labeling, even for operations like
snapshot or blockcopy where the chain is being extended or pivoted
onto non-NFS, where SELinux labeling is once again possible. As
a result, it was impossible to do a blockcopy to pivot from an
NFS image file onto a local file.
The solution is to separate the semantics of a chain that must
not be labeled (which the user can set even on persistent domains)
vs. the optimization of not attempting a relabel on cleanup (a
live-only annotation), and using only the user's explicit notation
rather than the optimization as the decision on whether to skip
a label attempt in the first place. When upgrading an older
libvirtd to a newer, an NFS volume will still attempt the relabel;
but as the avoidance of a relabel was only an optimization, this
shouldn't cause any problems.
In the ideal future, libvirt will eventually have XML describing
EVERY file in the backing chain, with each file having a separate
<seclabel> element. At that point, libvirt will be able to track
more closely which files need a relabel attempt at shutdown. But
until we reach that point, the single <seclabel> for the entire
<disk> chain is treated as a hint - when a chain has only one
file, then we know it is accurate; but if the chain has more than
one file, we have to attempt relabel in spite of the attribute,
in case part of the chain is local and SELinux mattered for that
portion of the chain.
* src/conf/domain_conf.h (_virSecurityDeviceLabelDef): Add new
member.
* src/conf/domain_conf.c (virSecurityDeviceLabelDefParseXML):
Parse it, for live images only.
(virSecurityDeviceLabelDefFormat): Output it.
(virDomainDiskDefParseXML, virDomainChrSourceDefParseXML)
(virDomainDiskSourceDefFormat, virDomainChrDefFormat)
(virDomainDiskDefFormat): Pass flags on through.
* src/security/security_selinux.c
(virSecuritySELinuxRestoreSecurityImageLabelInt): Honor labelskip
when possible.
(virSecuritySELinuxSetSecurityFileLabel): Set labelskip, not
norelabel, if labeling fails.
(virSecuritySELinuxSetFileconHelper): Fix indentation.
* docs/formatdomain.html.in (seclabel): Document new xml.
* docs/schemas/domaincommon.rng (devSeclabel): Allow it in RNG.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-*-labelskip.xml:
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-*-labelskip.args:
* tests/qemuxml2xmloutdata/qemuxml2xmlout-seclabel-*-labelskip.xml:
New test files.
* tests/qemuxml2argvtest.c (mymain): Run the new tests.
* tests/qemuxml2xmltest.c (mymain): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Parsing 'user:group' is useful even outside the DAC security driver,
so expose the most abstract function which has no DAC security driver
bits in itself.
Attempts to start a domain with both SELinux and DAC security
modules loaded will deadlock; latent problem introduced in commit
fdb3bde and exposed in commit 29fe5d7. Basically, when recursing
into the security manager for other driver's prefork, we have to
undo the asymmetric lock taken at the manager level.
Reported by Jiri Denemark, with diagnosis help from Dan Berrange.
* src/security/security_stack.c (virSecurityStackPreFork): Undo
extra lock grabbed during recursion.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 75c1256 states that virGetGroupList must not be called
between fork and exec, then commit ee777e99 promptly violated
that for lxc's use of virSecurityManagerSetProcessLabel. Hoist
the supplemental group detection to the time that the security
manager needs to fork. Qemu is safe, as it uses
virSecurityManagerSetChildProcessLabel which in turn uses
virCommand to determine supplemental groups.
This does not fix the fact that virSecurityManagerSetProcessLabel
calls virSecurityDACParseIds calls parseIds which eventually
calls getpwnam_r, which also violates fork/exec async-signal-safe
safety rules, but so far no one has complained of hitting
deadlock in that case.
* src/security/security_dac.c (_virSecurityDACData): Track groups
in private data.
(virSecurityDACPreFork): New function, to set them.
(virSecurityDACClose): Clean up new fields.
(virSecurityDACGetIds): Alter signature.
(virSecurityDACSetSecurityHostdevLabelHelper)
(virSecurityDACSetChardevLabel, virSecurityDACSetProcessLabel)
(virSecurityDACSetChildProcessLabel): Update callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
A future patch wants the DAC security manager to be able to safely
get the supplemental group list for a given uid, but at the time
of a fork rather than during initialization so as to pick up on
live changes to the system's group database. This patch adds the
framework, including the possibility of a pre-fork callback
failing.
For now, any driver that implements a prefork callback must be
robust against the possibility of being part of a security stack
where a later element in the chain fails prefork. This means
that drivers cannot do any action that requires a call to postfork
for proper cleanup (no grabbing a mutex, for example). If this
is too prohibitive in the future, we would have to switch to a
transactioning sequence, where each driver has (up to) 3 callbacks:
PreForkPrepare, PreForkCommit, and PreForkAbort, to either clean
up or commit changes made during prepare.
* src/security/security_driver.h (virSecurityDriverPreFork): New
callback.
* src/security/security_manager.h (virSecurityManagerPreFork):
Change signature.
* src/security/security_manager.c (virSecurityManagerPreFork):
Optionally call into driver, and allow returning failure.
* src/security/security_stack.c (virSecurityDriverStack):
Wrap the handler for the stack driver.
* src/qemu/qemu_process.c (qemuProcessStart): Adjust caller.
Signed-off-by: Eric Blake <eblake@redhat.com>
While generating seclabels, we check the seclabel stack if required
driver is in the stack. If not, an error is returned. However, it is
possible for a seclabel to not have any model set (happens with LXC
domains that have just <seclabel type='none'>). If that's the case,
we should just skip the iteration instead of calling STREQ(NULL, ...)
and SIGSEGV-ing subsequently.
https://bugzilla.redhat.com/show_bug.cgi?id=964358
POSIX states that multi-threaded apps should not use functions
that are not async-signal-safe between fork and exec, yet we
were using getpwuid_r and initgroups. Although rare, it is
possible to hit deadlock in the child, when it tries to grab
a mutex that was already held by another thread in the parent.
I actually hit this deadlock when testing multiple domains
being started in parallel with a command hook, with the following
backtrace in the child:
Thread 1 (Thread 0x7fd56bbf2700 (LWP 3212)):
#0 __lll_lock_wait ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1 0x00007fd5761e7388 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00007fd5761e7257 in __pthread_mutex_lock (mutex=0x7fd56be00360)
at pthread_mutex_lock.c:61
#3 0x00007fd56bbf9fc5 in _nss_files_getpwuid_r (uid=0, result=0x7fd56bbf0c70,
buffer=0x7fd55c2a65f0 "", buflen=1024, errnop=0x7fd56bbf25b8)
at nss_files/files-pwd.c:40
#4 0x00007fd575aeff1d in __getpwuid_r (uid=0, resbuf=0x7fd56bbf0c70,
buffer=0x7fd55c2a65f0 "", buflen=1024, result=0x7fd56bbf0cb0)
at ../nss/getXXbyYY_r.c:253
#5 0x00007fd578aebafc in virSetUIDGID (uid=0, gid=0) at util/virutil.c:1031
#6 0x00007fd578aebf43 in virSetUIDGIDWithCaps (uid=0, gid=0, capBits=0,
clearExistingCaps=true) at util/virutil.c:1388
#7 0x00007fd578a9a20b in virExec (cmd=0x7fd55c231f10) at util/vircommand.c:654
#8 0x00007fd578a9dfa2 in virCommandRunAsync (cmd=0x7fd55c231f10, pid=0x0)
at util/vircommand.c:2247
#9 0x00007fd578a9d74e in virCommandRun (cmd=0x7fd55c231f10, exitstatus=0x0)
at util/vircommand.c:2100
#10 0x00007fd56326fde5 in qemuProcessStart (conn=0x7fd53c000df0,
driver=0x7fd55c0dc4f0, vm=0x7fd54800b100, migrateFrom=0x0, stdin_fd=-1,
stdin_path=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE,
flags=1) at qemu/qemu_process.c:3694
...
The solution is to split the work of getpwuid_r/initgroups into the
unsafe portions (getgrouplist, called pre-fork) and safe portions
(setgroups, called post-fork).
* src/util/virutil.h (virSetUIDGID, virSetUIDGIDWithCaps): Adjust
signature.
* src/util/virutil.c (virSetUIDGID): Add parameters.
(virSetUIDGIDWithCaps): Adjust clients.
* src/util/vircommand.c (virExec): Likewise.
* src/util/virfile.c (virFileAccessibleAs, virFileOpenForked)
(virDirCreate): Likewise.
* src/security/security_dac.c (virSecurityDACSetProcessLabel):
Likewise.
* src/lxc/lxc_container.c (lxcContainerSetID): Likewise.
* configure.ac (AC_CHECK_FUNCS_ONCE): Check for setgroups, not
initgroups.
Signed-off-by: Eric Blake <eblake@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The imagelabel SELinux label was only generated when relabeling was
enabled. This prohibited labeling of files created by libvirt that need
to be labeled even if relabeling is turned off.
The only codepath this change has direct impact on is labeling of FDs
passed to qemu which is always safe in current state.
I realized after the fact that it's probably better in the long run to
give this function a name that matches the name of the link used in
sysfs to hold the group (iommu_group).
I'm changing it now because I'm about to add several more functions
that deal with iommu groups.
I noticed several unusual spacings in for loops, and decided to
fix them up. See the next commit for the syntax check that found
all of these.
* examples/domsuspend/suspend.c (main): Fix spacing.
* python/libvirt-override.c: Likewise.
* src/conf/interface_conf.c: Likewise.
* src/security/virt-aa-helper.c: Likewise.
* src/util/virconf.c: Likewise.
* src/util/virhook.c: Likewise.
* src/util/virlog.c: Likewise.
* src/util/virsocketaddr.c: Likewise.
* src/util/virsysinfo.c: Likewise.
* src/util/viruuid.c: Likewise.
* src/vbox/vbox_tmpl.c: Likewise.
* src/xen/xen_hypervisor.c: Likewise.
* tools/virsh-domain-monitor.c (vshDomainStateToString): Drop
default case, to let compiler check us.
* tools/virsh-domain.c (vshDomainVcpuStateToString): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
To not introduce more redundant code, helpers are added for
both "selinux", "dac", and "apparmor" backends.
Signed-off-by: Han Cheng <hanc.fnst@cn.fujitsu.com>
Signed-off-by: Osier Yang <jyang@redhat>
v2.5 - v3:
* Splitted from 8/10 of v2.5
* Don't forget the other backends (DAC, and apparmor)
These all existed before virfile.c was created, and for some reason
weren't moved.
This is mostly straightfoward, although the syntax rule prohibiting
write() had to be changed to have an exception for virfile.c instead
of virutil.c.
This movement pointed out that there is a function called
virBuildPath(), and another almost identical function called
virFileBuildPath(). They really should be a single function, which
I'll take care of as soon as I figure out what the arglist should look
like.
The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
If virPCIDeviceGetVFIOGroupDev() failed,
virSecurity*(Set|Restore)HostdevLabel() would fail to free a
virPCIDevice that had been allocated.
These leaks were all introduced (by me) very recently, in commit
f0bd70a.
This isn't strictly speaking a bugfix, but I realized I'd gotten a bit
too verbose when I chose the names for
VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_*. This shortens them all a bit.
Legacy kvm style pci device assignment requires changes to the
labelling of several sysfs files for each device, but for vfio device
assignment, the only thing that needs to be relabelled/chowned is the
"group" device for the group that contains the device to be assigned.
There will soon be other items related to pci hostdevs that need to be
in the same part of the hostdevsubsys union as the pci address (which
is currently a single member called "pci". This patch replaces the
single member named pci with a struct named pci that contains a single
member named "addr".
Detected by a simple Shell script:
for i in $(git ls-files -- '*.[ch]'); do
awk 'BEGIN {
fail=0
}
/# *include.*\.h/{
match($0, /["<][^">]*[">]/)
arr[substr($0, RSTART+1, RLENGTH-2)]++
}
END {
for (key in arr) {
if (arr[key] > 1) {
fail=1
printf("%d %s\n", arr[key], key)
}
}
if (fail == 1)
exit 1
}' $i
if test $? != 0; then
echo "Duplicate header(s) in $i"
fi
done;
A later patch will add the syntax-check to avoid duplicate
headers.
This patch refactors various places to allow removing of the
defaultConsoleTargetType callback from the virCaps structure.
A new console character device target type is introduced -
VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_NONE - to mark that no type was
specified in the XML. This type is at the end converted to the standard
VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL. Other types that are
different from this default have to be processed separately in the
device post parse callback.
This patch adds instrumentation that will allow hypervisor drivers to
fill and validate domain and device definitions after parsed by the XML
parser.
With this patch, after the XML is parsed, a callback to the driver is
issued requesting to fill and validate driver specific details of the
configuration. This allows to use sensible defaults and checks on a per
driver basis at the time the XML is parsed.
Two callback pointers are stored in the new virDomainXMLConf object:
* virDomainDeviceDefPostParseCallback (devicesPostParseCallback)
- called for a single device parsed and for every single device in a
domain config. A virDomainDeviceDefPtr is passed along with the
domain definition and virCaps.
* virDomainDefPostParseCallback, (domainPostParseCallback)
- A callback that is meant to process the domain config after it's
parsed. A virDomainDefPtr is passed along with virCaps.
Both types of callbacks support arbitrary opaque data passed for the
callback functions.
Errors may be reported in those callbacks resulting in a XML parsing
failure.
This patch is the result of running:
for i in $(git ls-files | grep -v html | grep -v \.po$ ); do
sed -i -e "s/virDomainXMLConf/virDomainXMLOption/g" -e "s/xmlconf/xmlopt/g" $i
done
and a few manual tweaks.
otherwise we crash later on if we don't find a match like:
#0 0xb72c2b4f in virSecurityManagerGenLabel (mgr=0xb8e42d20, vm=0xb8ef40c0) at security/security_manager.c:424
#1 0xb18811f3 in qemuProcessStart (conn=conn@entry=0xb8eed880, driver=driver@entry=0xb8e3b1e0, vm=vm@entry=0xb8ef58f0,
migrateFrom=migrateFrom@entry=0xb18f6088 "stdio", stdin_fd=18,
stdin_path=stdin_path@entry=0xb8ea7798 "/var/lib/jenkins/jobs/libvirt-tck-build/workspace/tck.img", snapshot=snapshot@entry=0x0,
vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_RESTORE, flags=flags@entry=2) at qemu/qemu_process.c:3364
#2 0xb18d6cb2 in qemuDomainSaveImageStartVM (conn=conn@entry=0xb8eed880, driver=driver@entry=0xb8e3b1e0, vm=0xb8ef58f0, fd=fd@entry=0xb6bf3f98,
header=header@entry=0xb6bf3fa0, path=path@entry=0xb8ea7798 "/var/lib/jenkins/jobs/libvirt-tck-build/workspace/tck.img",
start_paused=start_paused@entry=false) at qemu/qemu_driver.c:4843
#3 0xb18d7eeb in qemuDomainRestoreFlags (conn=conn@entry=0xb8eed880,
path=path@entry=0xb8ea7798 "/var/lib/jenkins/jobs/libvirt-tck-build/workspace/tck.img", dxml=dxml@entry=0x0, flags=flags@entry=0)
at qemu/qemu_driver.c:4962
#4 0xb18d8123 in qemuDomainRestore (conn=0xb8eed880, path=0xb8ea7798 "/var/lib/jenkins/jobs/libvirt-tck-build/workspace/tck.img")
at qemu/qemu_driver.c:4987
#5 0xb718d186 in virDomainRestore (conn=0xb8eed880, from=0xb8ea87d8 "/var/lib/jenkins/jobs/libvirt-tck-build/workspace/tck.img") at libvirt.c:2768
#6 0xb7736363 in remoteDispatchDomainRestore (args=<optimized out>, rerr=0xb6bf41f0, client=0xb8eedaf0, server=<optimized out>, msg=<optimized out>)
at remote_dispatch.h:4679
#7 remoteDispatchDomainRestoreHelper (server=0xb8e1a3e0, client=0xb8eedaf0, msg=0xb8ee72c8, rerr=0xb6bf41f0, args=0xb8ea8968, ret=0xb8ef5330)
at remote_dispatch.h:4661
#8 0xb720db01 in virNetServerProgramDispatchCall (msg=0xb8ee72c8, client=0xb8eedaf0, server=0xb8e1a3e0, prog=0xb8e216b0)
at rpc/virnetserverprogram.c:439
#9 virNetServerProgramDispatch (prog=0xb8e216b0, server=server@entry=0xb8e1a3e0, client=0xb8eedaf0, msg=0xb8ee72c8) at rpc/virnetserverprogram.c:305
#10 0xb7206e97 in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0xb8e1a3e0) at rpc/virnetserver.c:162
#11 virNetServerHandleJob (jobOpaque=0xb8ea7720, opaque=0xb8e1a3e0) at rpc/virnetserver.c:183
#12 0xb70f9f78 in virThreadPoolWorker (opaque=opaque@entry=0xb8e1a540) at util/virthreadpool.c:144
#13 0xb70f94a5 in virThreadHelper (data=0xb8e0e558) at util/virthreadpthread.c:161
#14 0xb705d954 in start_thread (arg=0xb6bf4b70) at pthread_create.c:304
#15 0xb6fd595e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
This unbreaks libvirt-tck's domain/100-transient-save-restore.t with
qemu:///session and selinux compiled in but disabled.
Introduced by 8d68cbeaa8
https://bugzilla.redhat.com/show_bug.cgi?id=947387
If a user configures a domain to use a seclabel of a specific type,
but the appropriate driver is not accessible, we should refuse to
start the domain. For instance, if user requires selinux, but it is
either non present in the system, or is just disabled, we should not
start the domain. Moreover, since we are touching only those labels we
have a security driver for, the other labels may confuse libvirt when
reconnecting to a domain on libvirtd restart. In our selinux example,
when starting up a domain, missing security label is okay, as we
auto-generate one. But later, when libvirt is re-connecting to a live
qemu instance, we parse a state XML, where security label is required
and it is an error if missing:
error : virSecurityLabelDefParseXML:3228 : XML error: security label
is missing
This results in a qemu process left behind without any libvirt control.
With my previous patches, we unconditionally appended a seclabel,
even if it wasn't generated but found in array of defined seclabels.
This resulted in double free later when doing virDomainDefFree
and iterating over the array of defined seclabels.
Moreover, there was another possibility of double free, if the
seclabel was generated in the last iteration of the process of
walking trough security managers array.
https://bugzilla.redhat.com/show_bug.cgi?id=923946
The <seclabel type='none'/> should be added iff there is no other
seclabel defined within a domain. This bug can be easily reproduced:
1) configure selinux seclabel for a domain
2) disable system's selinux and restart libvirtd
3) observe <seclabel type='none'/> being appended to a domain on its
startup
The virDomainDefGetSecurityLabelDef was modifying the domain XML.
It tried to find a seclabel corresponding to given sec driver. If the
label wasn't found, the function created one which is wrong. In fact
it's security manager which should modify this part of domain XML.
Normally libvirtd should run with a SELinux label
system_u:system_r:virtd_t:s0-s0:c0.c1023
If a user manually runs libvirtd though, it is sometimes
possible to get into a situation where it is running
system_u:system_r:init_t:s0
The SELinux security driver isn't expecting this and can't
parse the security label since it lacks the ':c0.c1023' part
causing it to complain
internal error Cannot parse sensitivity level in s0
This updates the parser to cope with this, so if no category
is present, libvirtd will hardcode the equivalent of c0.c1023.
Now this won't work if SELinux is in Enforcing mode, but that's
not an issue, because the user can only get into this problem
if in Permissive mode. This means they can now start VMs in
Permissive mode without hitting that parsing error
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Pull the code which parses the current process MCS range
out of virSecuritySELinuxMCSFind and into a new method
virSecuritySELinuxMCSGetProcessRange.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The body of the loop in virSecuritySELinuxMCSFind would
directly 'return NULL' on OOM, instead of jumping to the
cleanup label. This caused a leak of several local vars.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCaps structure gathered a ton of irrelevant data over time that.
The original reason is that it was propagated to the XML parser
functions.
This patch aims to create a new data structure virDomainXMLConf that
will contain immutable data that are used by the XML parser. This will
allow two things we need:
1) Get rid of the stuff from virCaps
2) Allow us to add callbacks to check and add driver specific stuff
after domain XML is parsed.
This first attempt removes pointers to private data allocation functions
to this new structure and update all callers and function that require
them.
Rename AppArmorSetImageFDLabel to AppArmorSetFDLabel which could
be used as a common function for *ALL* fd relabelling in Linux.
In apparmor profile for specific vm with uuid cdbebdfa-1d6d-65c3-be0f-fd74b978a773
Path: /etc/apparmor.d/libvirt/libvirt-cdbebdfa-1d6d-65c3-be0f-fd74b978a773.files
The last line is for the tapfd relabelling.
# DO NOT EDIT THIS FILE DIRECTLY. IT IS MANAGED BY LIBVIRT.
"/var/log/libvirt/**/rhel6qcow2.log" w,
"/var/lib/libvirt/**/rhel6qcow2.monitor" rw,
"/var/run/libvirt/**/rhel6qcow2.pid" rwk,
"/run/libvirt/**/rhel6qcow2.pid" rwk,
"/var/run/libvirt/**/*.tunnelmigrate.dest.rhel6qcow2" rw,
"/run/libvirt/**/*.tunnelmigrate.dest.rhel6qcow2" rw,
"/var/lib/libvirt/images/rhel6u3qcow2.img" rw,
"/dev/tap45" rw,
With the apparmor security driver enabled, qemu instances fail
to start
# grep ^security_driver /etc/libvirt/qemu.conf
security_driver = "apparmor"
# virsh start test-kvm
error: Failed to start domain test-kvm
error: internal error security label already defined for VM
The model field of virSecurityLabelDef object is always populated
by virDomainDefGetSecurityLabelDef(), so remove the check for a
NULL model when verifying if a label is already defined for the
instance.
Checking for a NULL model and populating it later in
AppArmorGenSecurityLabel() has been left in the code to be
consistent with virSecuritySELinuxGenSecurityLabel().
Coverity found the DACGenLabel was checking for mgr == NULL after a
possible dereference; however, in order to get into the function the
virSecurityManagerGenLabel would have already dereferenced sec_managers[i]
so the check was unnecessary. Same check is made in SELinuxGenSecurityLabel.
The existing virSecurityManagerSetProcessLabel() API is designed so
that it must be called after forking the child process, but before
exec'ing the child. Due to the way the virCommand API works, that
means it needs to be put in a "hook" function that virCommand is told
to call out to at that time.
Setting the child process label is a basic enough need when executing
any process that virCommand should have a method of doing that. But
virCommand must be told what label to set, and only the security
driver knows the answer to that question.
The new virSecurityManagerSet*Child*ProcessLabel() API is the way to
transfer the knowledge about what label to set from the security
driver to the virCommand object. It is given a virCommandPtr, and each
security driver calls the appropriate virCommand* API to tell
virCommand what to do between fork and exec.
1) in the case of the DAC security driver, it calls
virCommandSetUID/GID() to set a uid and gid that must be set for the
child process.
2) for the SELinux security driver, it calls
virCommandSetSELinuxLabel() to save a copy of the char* that will be
sent to setexeccon_raw() *after forking the child process*.
3) for the AppArmor security drivers, it calls
virCommandSetAppArmorProfile() to save a copy of the char* that will
be sent to aa_change_profile() *after forking the child process*.
With this new API in place, we will be able to remove
virSecurityManagerSetProcessLabel() from any virCommand pre-exec
hooks.
(Unfortunately, the LXC driver uses clone() rather than virCommand, so
it can't take advantage of this new security driver API, meaning that
we need to keep around the older virSecurityManagerSetProcessLabel(),
at least for now.)
The hook scripts used by virCommand must be careful wrt
accessing any mutexes that may have been held by other
threads in the parent process. With the recent refactoring
there are 2 potential flaws lurking, which will become real
deadlock bugs once the global QEMU driver lock is removed.
Remove use of the QEMU driver lock from the hook function
by passing in the 'virQEMUDriverConfigPtr' instance directly.
Add functions to the virSecurityManager to be invoked before
and after fork, to ensure the mutex is held by the current
thread. This allows it to be safely used in the hook script
in the child process.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
On RHEL 5, I got:
security/security_selinux.c: In function 'getContext':
security/security_selinux.c:971: warning: unused parameter 'mgr' [-Wunused-parameter]
* src/security/security_selinux.c (getContext): Mark potentially
unused parameter.
The security manager drivers are not allowed to call back
out to top level security manager APIs, since that results
in recursive mutex acquisition and thus deadlock. Remove
calls to virSecurityManagerGetModel from SELinux / AppArmor
drivers
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add locking to virSecurityManagerXXX APIs, so that use of the
security drivers is internally serialized. This avoids the need
to rely on the global driver locks to achieve serialization
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To enable locking to be introduced to the security manager
objects later, turn virSecurityManager into a virObjectLockable
class
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To enable virCapabilities instances to be reference counted,
turn it into a virObject. All cases of virCapabilitiesFree
turn into virObjectUnref
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit id a994ef2d1 changed the mechanism to store/update the default
security label from using disk->seclabels[0] to allocating one on the
fly. That change allocated the label, but never saved it. This patch
will save the label. The new virDomainDiskDefAddSecurityLabelDef() is
a copy of the virDomainDefAddSecurityLabelDef().
When changing to virArch, the virt-aa-helper.c file was not
completely changed. The vahControl struct was left with a
char *arch field, instead of virArch arch field.
Convert the host capabilities and domain config structs to
use the virArch datatype. Update the parsers and all drivers
to take account of datatype change
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Prepare to support different types of hostdevs by refactoring
the current SELinux security driver code
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When LXC labels USB devices during hotplug, it is running in
host context, so it needs to pass in a vroot path to the
container root.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virSecurityManager{Set,Restore}AllLabel methods are invoked
at domain startup/shutdown to relabel resources associated with
a domain. This works fine with QEMU, but with LXC they are in
fact both currently no-ops since LXC does not support disks,
hostdevs, or kernel/initrd files. Worse, when LXC gains support
for disks/hostdevs, they will do the wrong thing, since they
run in host context, not container context. Thus this patch
turns then into a formal no-op when used with LXC. The LXC
controller will call out to specific security manager labelling
APIs as required during startup.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current SELinux policy only works for KVM guests, since
TCG requires the 'execmem' privilege. There is a 'virt_use_execmem'
boolean to turn this on globally, but that is unpleasant for users.
This changes libvirt to automatically use a new 'svirt_tcg_t'
context for TCG based guests. This obsoletes the previous
boolean tunable and makes things 'just work(tm)'
Since we can't assume we run with new enough policy, I also
make us log a warning message (once only) if we find the policy
lacks support. In this case we fallback to the normal label and
expect users to set the boolean tunable
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When using vnc gaphics over a unix socket, virt-aa-helper needs to provide
access for the qemu domain to access the sockfile.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
When a qemu domain is backed by huge pages, apparmor needs to grant the domain
rw access to files under the hugetlbfs mount point. Add a hook, called in
qemu_process.c, which ends up adding the read-write access through
virt-aa-helper. Qemu will be creating a randomly named file under the
mountpoint and unlinking it as soon as it has mmap()d it, therefore we
cannot predict the full pathname, but for the same reason it is generally
safe to provide access to $path/**.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
The impls of virSecurityManagerGetMountOptions had no way to
return errors, since the code was treating 'NULL' as a success
value. This is somewhat pointless, since the calling code did
not want NULL in the first place and has to translate it into
the empty string "". So change the code so that the impls can
return "" directly, allowing use of NULL for error reporting
once again
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
For S390, the default console target type cannot be of type 'serial'.
It is necessary to at least interpret the 'arch' attribute
value of the os/type element to produce the correct default type.
Therefore we need to extend the signature of defaultConsoleTargetType
to account for architecture. As a consequence all the drivers
supporting this capability function must be updated.
Despite the amount of changed files, the only change in behavior is
that for S390 the default console target type will be 'virtio'.
N.B.: A more future-proof approach could be to to use hypervisor
specific capabilities to determine the best possible console type.
For instance one could add an opaque private data pointer to the
virCaps structure (in case of QEMU to hold capsCache) which could
then be passed to the defaultConsoleTargetType callback to determine
the console target type.
Seems to be however a bit overengineered for the use case...
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When restoring selinux labels after a VM is stopped, any non-standard
path that doesn't have a default selinux label causes the process
to stop and exit early. This isn't really an error condition IMO.
Of course the selinux API could be erroring for some other reason
but hopefully that's rare enough to not need explicit handling.
Common example here is storing disk images in a non-standard location
like under /mnt.
Fixes a build failure on cygwin:
cc1: warnings being treated as errors
security/security_dac.c: In function 'virSecurityDACSetProcessLabel':
security/security_dac.c:862:5: error: format '%u' expects type 'unsigned int', but argument 7 has type 'uid_t' [-Wformat]
security/security_dac.c:862:5: error: format '%u' expects type 'unsigned int', but argument 8 has type 'gid_t' [-Wformat]
* src/security/security_dac.c (virSecurityDACSetProcessLabel)
(virSecurityDACGenLabel): Use proper casts.
We used to walk the backing file chain at least twice per disk,
once to set up cgroup device whitelisting, and once to set up
security labeling. Rather than walk the chain every iteration,
which possibly includes calls to fork() in order to open root-squashed
NFS files, we can exploit the cache of the previous patch.
* src/conf/domain_conf.h (virDomainDiskDefForeachPath): Alter
signature.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Require caller
to supply backing chain via disk, if recursion is desired.
* src/security/security_dac.c
(virSecurityDACSetSecurityImageLabel): Adjust caller.
* src/security/security_selinux.c
(virSecuritySELinuxSetSecurityImageLabel): Likewise.
* src/security/virt-aa-helper.c (get_files): Likewise.
* src/qemu/qemu_cgroup.c (qemuSetupDiskCgroup)
(qemuTeardownDiskCgroup): Likewise.
(qemuSetupCgroup): Pre-populate chain.
BZ:https://bugzilla.redhat.com/show_bug.cgi?id=851981
When using macvtap, a character device gets first created by
kernel with name /dev/tapN, its selinux context is:
system_u:object_r:device_t:s0
Shortly, when udev gets notification when new file is created
in /dev, it will then jump in and relabel this file back to the
expected default context:
system_u:object_r:tun_tap_device_t:s0
There is a time gap happened.
Sometimes, it will have migration failed, AVC error message:
type=AVC msg=audit(1349858424.233:42507): avc: denied { read write } for
pid=19926 comm="qemu-kvm" path="/dev/tap33" dev=devtmpfs ino=131524
scontext=unconfined_u:system_r:svirt_t:s0:c598,c908
tcontext=system_u:object_r:device_t:s0 tclass=chr_file
This patch will label the tapfd device before qemu process starts:
system_u:object_r:tun_tap_device_t:MCS(MCS from seclabel->label)
We are currently able to work only with non-translated SELinux
contexts, but we are using functions that work with translated
contexts throughout the code. This patch swaps all SELinux context
translation relative calls with their raw sisters to avoid parsing
problems.
The problems can be experienced with mcstrans for example. The
difference is that if you have translations enabled (yum install
mcstrans; service mcstrans start), fgetfilecon_raw() will get you
something like 'system_u:object_r:virt_image_t:s0', whereas
fgetfilecon() will return 'system_u:object_r:virt_image_t:SystemLow'
that we cannot parse.
I was trying to confirm that the _raw variants were here since the dawn of
time, but the only thing I see now is that it was imported together in
the upstream repo [1] from svn, so before 2008.
Thanks Laurent Bigonville for finding this out.
[1] http://oss.tresys.com/git/selinux.git
All USB device lookup functions emit an error when they cannot find the
requested device. With this patch, their caller can choose if a missing
device is an error or normal condition.
The functions virGetUserID and virGetGroupID are now able to parse
user/group names and IDs in a similar way to coreutils' chown. So, user
and group parsing in security_dac can be simplified.
The DAC driver is missing parsing of group and user names for DAC labels
and currently just parses uid and gid. This patch extends it to support
names, so the following security label definition is now valid:
<seclabel type='static' model='dac' relabel='yes'>
<label>qemu:qemu</label>
<imagelabel>qemu:qemu</imagelabel>
</seclabel>
When it tries to parse an owner or a group, it first tries to resolve it as
a name, if it fails or it's an invalid user/group name then it tries to
parse it as an UID or GID. A leading '+' can also be used for both owner and
group to force it to be parsed as IDs, so the following example is also
valid:
<seclabel type='static' model='dac' relabel='yes'>
<label>+101:+101</label>
<imagelabel>+101:+101</imagelabel>
</seclabel>
This ensures that UID 101 and GUI 101 will be used instead of an user or
group named "101".
This allows the user to control labelling of each character device
separately (the default is to inherit from the VM).
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.
* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/; If/. If/
The DAC security driver silently ignored errors when parsing the DAC
label and used default values instead.
With a domain containing the following label definition:
<seclabel type='static' model='dac' relabel='yes'>
<label>sdfklsdjlfjklsdjkl</label>
</seclabel>
the domain would start normaly but the disk images would be still owned
by root and no error was displayed.
This patch changes the behavior if the parsing of the label fails (note
that a not present label is not a failure and in this case the default
label should be used) the error isn't masked but is raised that causes
the domain start to fail with a descriptive error message:
virsh # start tr
error: Failed to start domain tr
error: internal error invalid argument: failed to parse DAC seclabel
'sdfklsdjlfjklsdjkl' for domain 'tr'
I also changed the error code to "invalid argument" from "internal
error" and tweaked the various error messages to contain correct and
useful information.
If no 'security_driver' config option was set, then the code
just loaded the 'dac' security driver. This is a regression
on previous behaviour, where we would probe for a possible
security driver. ie default to SELinux if available.
This changes things so that it 'security_driver' is not set,
we once again do probing. For simplicity we also always
create the stack driver, even if there is only one driver
active.
The desired semantics are:
- security_driver not set
-> probe for selinux/apparmour/nop
-> auto-add DAC driver
- security_driver set to a string
-> add that one driver
-> auto-add DAC driver
- security_driver set to a list
-> add all drivers in list
-> auto-add DAC driver
It is not allowed, or possible to specify 'dac' in the
security_driver config param, since that is always
enabled.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This reverts commit 9f9b7b85c9.
The DAC security driver needs special handling and extra parameters and
can't just be added to regular security drivers.
When starting a machine the DAC security driver tries to set the UID and
GID of the newly spawned process. This worked as desired if the desired
label was set. When the label was missing a logical bug in
virSecurityDACGenLabel() caused that uninitialised values were used as
uid and gid for the new process.
With this patch, default values (from qemu driver configuration)
are used if the label is not found.
Currently, if users set 'security_driver="dac"' in qemu.conf libvirtd
fails to initialize as DAC driver is not found because it is missing
in our security drivers array.
The DAC security driver uses the virStrToLong_ui function to
parse the uid/gid out of the seclabel string. This works on
Linux where 'uid_t' is an unsigned int, but on Mingw32 it is
just an 'int'. This causes compiler warnings about signed/
unsigned int pointer mis-match.
To avoid this, use explicit 'unsigned int ouruid' local
vars to pass into virStrToLong_ui, and then simply assign
to the 'uid_t' type after parsing
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the dynamic label generation code will create labels
with a sensitivity of s0, and a category pair in the range
0-1023. This is fine when running a standard MCS policy because
libvirtd will run with a label
system_u:system_r:virtd_t:s0-s0:c0.c1023
With custom policies though, it is possible for libvirtd to have
a different sensitivity, or category range. For example
system_u:system_r:virtd_t:s2-s3:c512.c1023
In this case we must assign the VM a sensitivity matching the
current lower sensitivity value, and categories in the range
512-1023
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The code to refactor sec label handling accidentally changed the
SELinux driver to use the 'domain_context' when generating the
image label instead of the 'file_context'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
These changes make the security drivers able to find and handle the
correct security label information when more than one label is
available. They also update the DAC driver to be used as an usual
security driver.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
This patch updates the structures that store information about each
domain and each hypervisor to support multiple security labels and
drivers. It also updates all the remaining code to use the new fields.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
This is a fix for the object label generation. It uses a new flag for
virSecuritySELinuxGenNewContext that specifies whether the context is
for an object. If so the context role remains unchanged.
Without this fix it is not possible to start domains with image file or
block device backed storage when selinux is enabled.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>