Message-Id: <20231213143813.6818-1-michael.weiss@aisec.fraunhofer.de>
Date:   Wed, 13 Dec 2023 15:38:10 +0100
From:   Michael Weiß <michael.weiss@...ec.fraunhofer.de>
To:     Christian Brauner <brauner@...nel.org>,
        Alexander Mikhalitsyn <alexander@...alicyn.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Paul Moore <paul@...l-moore.com>
CC:     Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <martin.lau@...ux.dev>,
        Song Liu <song@...nel.org>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>,
        Stanislav Fomichev <sdf@...gle.com>,
        Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
        Quentin Monnet <quentin@...valent.com>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Miklos Szeredi <miklos@...redi.hu>,
        Amir Goldstein <amir73il@...il.com>,
        "Serge E. Hallyn" <serge@...lyn.com>, <bpf@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <linux-fsdevel@...r.kernel.org>,
        <linux-security-module@...r.kernel.org>,
        <gyroidos@...ec.fraunhofer.de>,
        Michael Weiß <michael.weiss@...ec.fraunhofer.de>
Subject: [RFC PATCH v3 0/3] devguard: guard mknod for non-initial user namespace

If a container manager restricts its unprivileged (user namespaced)
children by a device cgroup, it is no longer necessary to deny
mknod(). User space applications may then map devices at different
locations in the file system by using mknod() inside the container.

A use case for this, which we also rely on in GyroidOS, is running
virsh for VMs inside an unprivileged container. virsh creates device
nodes, e.g., "/var/run/libvirt/qemu/11-fgfg.dev/null", which currently
fails in a non-initial userns, even if a cgroup device whitelist with
the corresponding major, minor of /dev/null exists. Thus, in this case
the usual bind mounts or pre-populated device nodes under /dev are
not sufficient.

Following the discussion with Christian on v2, I agree that the
previous approach was too complex. Actually, we just want working
device nodes in a user namespace if we have a device cgroup in place
which handles access decisions.

Patch 1 provides a helper function to check if the current task
is guarded by a bpf-device cgroup program.
Thanks to Alexander Mikhalitsyn for reviewing.

Patch 2 implements the ns_capable check including sysctl as proposed
by Christian. I provide a short overview about device node creation
and access decisions in the commit message there.

Patch 3 provides devguard, a small LSM which actually strips out
SB_I_NODEV.
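
How the three patches are meant to interact can be sketched as a
small user space model. This is only one reading of the description
above, not the kernel code: all model_* names, the boolean fields and
the flag value are assumptions made for illustration, and the model
assumes the calling task lives in a non-initial userns (so the
classic init-namespace capability check would fail).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the superblock flag; the value is an assumption. */
#define MODEL_SB_I_NODEV 0x4u

struct model_sb {
	uint32_t s_iflags;
	/* Stands in for ns_capable(sb->s_user_ns, CAP_MKNOD) (Patch 2). */
	bool userns_has_cap_mknod;
};

struct model_task {
	/* Stands in for the "guarded by a device cgroup program" helper (Patch 1). */
	bool guarded_by_device_cgroup;
};

/* Patch 3 (model): the mknod hook strips NODEV only for guarded tasks. */
static void model_devguard_inode_mknod(struct model_sb *sb,
				       const struct model_task *task)
{
	if (task->guarded_by_device_cgroup)
		sb->s_iflags &= ~MODEL_SB_I_NODEV;
}

/* Patch 2 (model): capability check in the userns of the superblock,
 * enabled by a sysctl, followed by the LSM hook. */
static bool model_mknod_allowed(struct model_sb *sb,
				const struct model_task *task,
				bool sysctl_enabled)
{
	if (!sysctl_enabled || !sb->userns_has_cap_mknod)
		return false;

	model_devguard_inode_mknod(sb, task);
	return true;
}

/* A created node can only be opened as a device if NODEV is not set. */
static bool model_device_usable(const struct model_sb *sb)
{
	return !(sb->s_iflags & MODEL_SB_I_NODEV);
}
```

In this model, a guarded task ends up with a usable device node,
while an unguarded task may still create the node but cannot open it
as a device because the NODEV flag stays set. Actual access to the
device remains subject to the device cgroup, which the series leaves
unchanged.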

---
Changes in v3:
- Small LSM to just implement security_inode_mknod() hook
- Leave devcgroup as is
- Strip SB_I_NODEV in security_inode_mknod hook as suggested by
  Christian
- Do not change bpf or cgroup access decision at all
- ns_capable(sb->s_user_ns, CAP_MKNOD) in vfs_mknod()
- Link to v2: https://lore.kernel.org/lkml/1d481e11-6601-4b82-a317-f8506f3ccf9b@aisec.fraunhofer.de/

Changes in v2:
- Integrate this as LSM (Christian, Paul)
- Switched to a device cgroup specific flag instead of a generic
  bpf program flag (Christian)
- Do not ignore SB_I_NODEV in fs/namei.c but use LSM hook in
  sb_alloc_super in fs/super.c
- Link to v1: https://lore.kernel.org/lkml/20230814-devcg_guard-v1-0-654971ab88b1@aisec.fraunhofer.de

Michael Weiß (3):
  bpf: cgroup: Introduce helper cgroup_bpf_current_enabled()
  fs: Make vfs_mknod() to check CAP_MKNOD in user namespace of sb
  devguard: added device guard for mknod in non-initial userns

 fs/namei.c                   | 30 +++++++++++++++++++++++-
 include/linux/bpf-cgroup.h   |  2 ++
 kernel/bpf/cgroup.c          | 14 ++++++++++++
 security/Kconfig             | 11 +++++----
 security/Makefile            |  1 +
 security/devguard/Kconfig    | 12 ++++++++++
 security/devguard/Makefile   |  2 ++
 security/devguard/devguard.c | 44 ++++++++++++++++++++++++++++++++++++
 8 files changed, 110 insertions(+), 6 deletions(-)
 create mode 100644 security/devguard/Kconfig
 create mode 100644 security/devguard/Makefile
 create mode 100644 security/devguard/devguard.c


base-commit: a39b6ac3781d46ba18193c9dbb2110f31e9bffe9
-- 
2.30.2
