linux-hardening - Re: [PATCH] capabilities: new kernel.ns_modules

Message-ID: <202208121146.9E4A98B@keescook>
Date:   Fri, 12 Aug 2022 11:48:36 -0700
From:   Kees Cook <keescook@...omium.org>
To:     Vegard Nossum <vegard.nossum@...cle.com>
Cc:     linux-kernel@...r.kernel.org,
        Thadeu Lima de Souza Cascardo <cascardo@...onical.com>,
        Serge Hallyn <serge@...lyn.com>,
        Eric Biederman <ebiederm@...ssion.com>,
        linux-hardening@...r.kernel.org, John Haxby <john.haxby@...cle.com>
Subject: Re: [PATCH] capabilities: new kernel.ns_modules_allowed sysctl

On Wed, Aug 10, 2022 at 10:25:17AM +0200, Vegard Nossum wrote:
> 
> On 8/10/22 00:56, Kees Cook wrote:
> > On Tue, Aug 09, 2022 at 08:52:29PM +0200, Vegard Nossum wrote:
> >> Creating a new user namespace grants you the ability to reach a lot of code
> >> (including loading certain kernel modules) that would otherwise be out of
> >> reach of an attacker. We can reduce the attack surface and block exploits
> >> by ensuring that user namespaces cannot trigger module (auto-)loading.
> >>
> >> A cursory search of exploits found online yields the following extremely
> >> non-exhaustive list of vulnerabilities, and shows that the technique is
> >> both old and still in use:
> >>
> >> - CVE-2016-8655
> >> - CVE-2017-1000112
> >> - CVE-2021-32606
> >> - CVE-2022-2588
> >> - CVE-2022-27666
> >> - CVE-2022-34918
> >>
> >> This patch adds a new sysctl, kernel.ns_modules_allowed, which when set to
> >> 0 will block requests to load modules when the request originates in a
> >> process running in a user namespace.
> >>
> >> For backwards compatibility, the default value of the sysctl is set to
> >> CONFIG_NS_MODULES_ALLOWED_DEFAULT_ON, which in turn defaults to 1, meaning
> >> there should be absolutely no change in behaviour unless you opt in either
> >> at compile time or at runtime.
> >>
> >> This mitigation obviously offers no protection if the vulnerable module is
> >> already loaded, but for many of these exploits the vast majority of users
> >> will never actually load or use these modules on purpose; in other words,
> >> for the vast majority of users, this would block exploits for the above
> >> list of vulnerabilities.
> > 
> > We've needed better module autoloading protections for a long time[1].
> > This patch is a big hammer ("all user namespaces"), so I worry it
> > wouldn't actually get used much.
> > 
> > Here's a pointer into a prior thread, where Linus chimed in[2].
> > I replied back then, but I'm not sure I agree with my 2017 self any
> > more. :P
> > 
> > It really does feel like the loading decisions need to be made by the
> > userspace helper, which currently doesn't have enough information to
> > make those choices.
> > 
> > -Kees
> > 
> > [1] https://github.com/KSPP/linux/issues/24
> > [2] https://lore.kernel.org/kernel-hardening/CA+55aFxiDKfe6VCM+aV2OgnkzMpP+iz+rn2k25_Qa_QLex=pPQ@mail.gmail.com/
> 
> Thanks for the pointers, I didn't have any of this context.
> 
> I would still argue for my patch with the following points:
> 
> 1) As you said, it's been almost 7 years since the discussion you linked
> and apparently it's still a problem (including those 5 privilege
> escalation CVEs from my changelog); this relatively simple patch
> provides a mitigation _today_
> 
> 2) it can be layered with any other future mitigations if they do show up
> 
> 3) it's not as big a hammer as completely disabling unprivileged user
> namespaces, which seems to be the next best thing currently in terms of
> protecting your users (as a distro)
> 
> 4) both the implementation and the user interface are fairly simple in
> my patch, which means it's not a huge long term maintenance burden like
> block-/allowlists or capabilities based on whether modules are
> maintained or not (I would also argue that "maintained or not" is not a
> great proxy for whether there are security issues in the code)
> 
> 5) it resembles other sysctls like unprivileged_bpf_disabled or
> perf_event_paranoid, or even modules_disabled
> 
> 6) it's opt-in by default, and even then, if you run into problems with
> containers that don't work or whatever, the solution is extremely
> simple: just load the modules you need before starting your container
> (the module names are printed in the kernel log so it shouldn't be
> difficult to track down issues)
> 
> What's the downside..?

I agree, it'd be nice to have. I'm just trying to predict what kind of
push-back there may be.

Can you address the build failures noted on the thread, and send a v2? I
note that after this patch it looks like all module loading from a userns
gets logged, regardless of the setting. Is that intended?

-Kees

-- 
Kees Cook
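
The behaviour described in the quoted changelog (a value of 0 blocks
module load requests that originate in a user namespace, while the
default of 1 changes nothing) can be modelled in user space roughly as
follows. This is only an illustrative sketch, not the patch's code: the
names ModuleRequest, in_init_userns and request_allowed are hypothetical,
and treating "not in the initial user namespace" as the trigger is an
assumption based on the changelog wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleRequest:
    """Hypothetical stand-in for a module (auto-)load request."""
    module: str
    in_init_userns: bool  # assumption: True if the requester is in the initial user namespace


def request_allowed(req: ModuleRequest, ns_modules_allowed: int = 1) -> bool:
    """Model of the policy in the changelog, not the kernel implementation.

    ns_modules_allowed mirrors the proposed sysctl: 1 (the default) keeps
    the existing behaviour, 0 blocks requests coming from a user namespace.
    """
    if req.in_init_userns:
        # Requests from outside any user namespace are unaffected by the sysctl.
        return True
    return ns_modules_allowed != 0


if __name__ == "__main__":
    # "example_mod" is a placeholder module name.
    inside = ModuleRequest("example_mod", in_init_userns=False)
    outside = ModuleRequest("example_mod", in_init_userns=True)
    for value in (1, 0):
        print(value, request_allowed(inside, value), request_allowed(outside, value))
```

Under this model a module that is already loaded never reaches the check,
which matches the point in the thread that the setting offers no protection
once the vulnerable module is present, and that loading the needed modules
before starting a container is the workaround.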