From e7facdca25ddcc0fdabc8d86fdc1f1da39285fdf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Daniel=20P=2E=20Berrang=C3=A9?=
Date: Fri, 4 Mar 2022 11:59:23 +0000
Subject: [PATCH] logging: lockdown the systemd service configuration
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The 'systemd-analyze security' command looks at the unit file
configuration and reports on any settings which increase the
attack surface for the daemon. Since most systemd units are
fairly minimalist, this is generally informing us about settings
that we never put any thought into using before.

In its current configuration it reports

  # systemd-analyze security virtlogd.service
  ...snip...
  → Overall exposure level for virtlogd.service: 9.6 UNSAFE 😨

which is pretty terrible as a score. If we apply all of the
recommendations that appear possible without (knowingly) breaking
functionality it reports:

  # systemd-analyze security virtlogd.service
  ...snip...
  → Overall exposure level for virtlogd.service: 2.2 OK 🙂

which is a pretty decent improvement.

Some of the settings we would like to enable require a systemd
version that is newer than that available in our oldest distro
target - RHEL-8 at v239. NB, RestrictSUIDSGID is technically
newer than 239, but RHEL-8 backported it, and other distros we
target have it by default.

Remaining recommendations are

 ✗ CapabilityBoundingSet=~CAP_(DAC_*|FOWNER|IPC_OWNER)

   We block FOWNER/IPC_OWNER, but can't block the two DAC
   capabilities. Historically apps/users might point QEMU
   to log files in $HOME, pre-created with their own user
   ID.

 ✗ IPAddressDeny=

   Not required since RestrictAddressFamilies blocks IP
   usage. Ignoring this avoids the overhead of creating
   a traffic filter that will never be used.

 ✗ NoNewPrivileges=

   Highly desirable, but cannot enable it yet, because it
   will block the ability to transition to the virtlogd_t
   SELinux domain during execve.
   The SELinux policy needs fixing to permit this
   transition under NNP first.

 ✗ PrivateTmp=

   There is a decent chance people have VMs configured
   with a serial port logfile pointing at /tmp. Using a
   private /tmp for logging would cause a regression.

 ✗ PrivateUsers=

   This would put virtlogd inside a user namespace where
   its root is in fact unprivileged. Same problem as the
   User= setting below.

 ✗ ProcSubset=

   Libraries we link to might read certain non-PID related
   files from /proc.

 ✗ ProtectClock=

   Requires v245

 ✗ ProtectHome=

   Same problem as PrivateTmp=. There's a decent chance
   that someone has a VM configured to write a logfile
   to /home.

 ✗ ProtectHostname=

   Requires v241

 ✗ ProtectKernelLogs=

   Requires v244

 ✗ ProtectProc=

   Requires v247

 ✗ ProtectSystem=

   We only set it to 'full', as 'strict' is not viable for
   our required usage.

 ✗ RootDirectory=/RootImage=

   We are not capable of running inside a custom chroot
   given the need to write log files to arbitrary places.

 ✗ RestrictAddressFamilies=~AF_UNIX

   We need AF_UNIX to communicate with other libvirt daemons.

 ✗ SystemCallFilter=~@resources

   We link to libvirt.so which links to libnuma.so which has
   a constructor that calls set_mempolicy. This is highly
   undesirable to do during a constructor.

 ✗ User=/DynamicUser=

   This is highly desirable, but we currently read/write
   logs as root, and directories we're told to write into
   could be anywhere. So using a non-root user would have
   a major risk of regressions for applications and also
   have upgrade implications.

Reviewed-by: Michal Privoznik
Signed-off-by: Daniel P. Berrangé
---
 src/logging/virtlogd.service.in | 94 +++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)

diff --git a/src/logging/virtlogd.service.in b/src/logging/virtlogd.service.in
index 569c9f88ad..bcc356f9d1 100644
--- a/src/logging/virtlogd.service.in
+++ b/src/logging/virtlogd.service.in
@@ -14,6 +14,100 @@
 EnvironmentFile=-@initconfdir@/virtlogd
 ExecStart=@sbindir@/virtlogd $VIRTLOGD_ARGS
 ExecReload=/bin/kill -USR1 $MAINPID
+CapabilityBoundingSet=~CAP_AUDIT_CONTROL
+CapabilityBoundingSet=~CAP_AUDIT_READ
+CapabilityBoundingSet=~CAP_AUDIT_WRITE
+CapabilityBoundingSet=~CAP_BLOCK_SUSPEND
+CapabilityBoundingSet=~CAP_CHOWN
+# Mgmt app/user might have pre-created log files that we're
+# told to open and write to, or be storing them in otherwise
+# inaccessible locations like $HOME. So we need to ignore
+# DAC permission checks.
+#CapabilityBoundingSet=~CAP_DAC_OVERRIDE
+#CapabilityBoundingSet=~CAP_DAC_READ_SEARCH
+CapabilityBoundingSet=~CAP_FOWNER
+CapabilityBoundingSet=~CAP_FSETID
+CapabilityBoundingSet=~CAP_IPC_LOCK
+CapabilityBoundingSet=~CAP_IPC_OWNER
+CapabilityBoundingSet=~CAP_KILL
+CapabilityBoundingSet=~CAP_LEASE
+CapabilityBoundingSet=~CAP_LINUX_IMMUTABLE
+CapabilityBoundingSet=~CAP_MAC_ADMIN
+CapabilityBoundingSet=~CAP_MAC_OVERRIDE
+CapabilityBoundingSet=~CAP_MKNOD
+CapabilityBoundingSet=~CAP_NET_ADMIN
+CapabilityBoundingSet=~CAP_NET_BIND_SERVICE
+CapabilityBoundingSet=~CAP_NET_BROADCAST
+CapabilityBoundingSet=~CAP_NET_RAW
+CapabilityBoundingSet=~CAP_SETFCAP
+CapabilityBoundingSet=~CAP_SETPCAP
+CapabilityBoundingSet=~CAP_SETGID
+CapabilityBoundingSet=~CAP_SETUID
+CapabilityBoundingSet=~CAP_SYSLOG
+CapabilityBoundingSet=~CAP_SYS_ADMIN
+CapabilityBoundingSet=~CAP_SYS_BOOT
+CapabilityBoundingSet=~CAP_SYS_CHROOT
+CapabilityBoundingSet=~CAP_SYS_MODULE
+CapabilityBoundingSet=~CAP_SYS_NICE
+CapabilityBoundingSet=~CAP_SYS_PACCT
+CapabilityBoundingSet=~CAP_SYS_PTRACE
+CapabilityBoundingSet=~CAP_SYS_RAWIO
+CapabilityBoundingSet=~CAP_SYS_RESOURCE
+CapabilityBoundingSet=~CAP_SYS_TIME
+CapabilityBoundingSet=~CAP_SYS_TTY_CONFIG
+CapabilityBoundingSet=~CAP_WAKE_ALARM
+
+LockPersonality=true
+MemoryDenyWriteExecute=true
+# Cannot enable this as it prevents transitioning to
+# the confined SELinux virtlogd_t domain on execve
+# unless we modify the policy to allow this.
+#NoNewPrivileges=true
+PrivateDevices=true
+PrivateMounts=true
+PrivateNetwork=true
+# XXX someone could configure QEMU to log a serial port to an
+# arbitrary directory, including /tmp, even if this is ill-advised
+#PrivateTmp=true
+# Not until oldest build target has systemd >= v245
+#ProtectClock=true
+ProtectControlGroups=true
+# Not until oldest build target has systemd >= v241
+#ProtectHostname=true
+# Not until oldest build target has systemd >= v244
+#ProtectKernelLogs=true
+ProtectKernelModules=true
+ProtectKernelTunables=true
+# Not until oldest build target has systemd >= v247
+#ProtectProc=invisible
+ProtectSystem=full
+RestrictAddressFamilies=AF_UNIX
+RestrictNamespaces=~cgroup
+RestrictNamespaces=~ipc
+RestrictNamespaces=~mnt
+RestrictNamespaces=~net
+RestrictNamespaces=~pid
+RestrictNamespaces=~user
+RestrictNamespaces=~uts
+RestrictRealtime=true
+RestrictSUIDSGID=true
+SystemCallArchitectures=native
+SystemCallFilter=~@clock
+SystemCallFilter=~@debug
+SystemCallFilter=~@module
+SystemCallFilter=~@mount
+SystemCallFilter=~@raw-io
+SystemCallFilter=~@reboot
+SystemCallFilter=~@swap
+SystemCallFilter=~@privileged
+# Unfortunately we link to libnuma via libvirt.so which
+# has a constructor that runs unconditionally that invokes
+# set_mempolicy()
+#SystemCallFilter=~@resources
+SystemCallFilter=~@cpu-emulation
+SystemCallFilter=~@obsolete
+UMask=077
+
 [Install]
 WantedBy=multi-user.target
 Also=virtlogd.socket
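
Note for testers: the before/after scores quoted in the commit message
can be compared mechanically. The snippet below is a minimal sketch and
is not part of the patch. The helper name exposure_score is hypothetical,
and it assumes the summary line keeps the "Overall exposure level for
<unit>: <score>" wording quoted in the commit message.

```shell
#!/bin/sh
# Hypothetical helper (not part of this patch): read the output of
# 'systemd-analyze security <unit>' on stdin and print only the numeric
# score from the "Overall exposure level" summary line.
# Assumption: the summary line wording matches the commit message.
exposure_score() {
    sed -n 's/.*Overall exposure level for [^:]*: \([0-9.][0-9.]*\).*/\1/p'
}

# Example use on a host with the unit installed (left commented out so
# that sourcing this snippet has no side effects):
#   systemd-analyze --no-pager security virtlogd.service | exposure_score
```

Running the pipeline once before and once after installing the updated
unit gives two numbers to compare. If the summary line is absent, the
helper prints nothing.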