Message-ID: <aUAiIkNPgied0Tyf@example.org>
Date: Mon, 15 Dec 2025 15:58:42 +0100
From: Alexey Gladkov <legion@...nel.org>
To: Dan Klishch <danilklishch@...il.com>
Cc: brauner@...nel.org, containers@...ts.linux-foundation.org,
ebiederm@...ssion.com, keescook@...omium.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
viro@...iv.linux.org.uk
Subject: Re: [RESEND PATCH v6 0/5] proc: subset=pid: Relax check of mount visibility

On Mon, Dec 15, 2025 at 09:46:00AM -0500, Dan Klishch wrote:
> On 12/15/25 5:10 AM, Alexey Gladkov wrote:
> > On Sun, Dec 14, 2025 at 01:02:54PM -0500, Dan Klishch wrote:
> >> On 12/14/25 11:40 AM, Alexey Gladkov wrote:
> >>> But then, if I understand you correctly, this patch will not be enough
> >>> for you. procfs with subset=pid will not allow you to have /proc/meminfo,
> >>> /proc/cpuinfo, etc.
> >>
> >> Hmm, I didn't think of this. sunwalker-box only exposes cpuinfo and PID
> >> tree to the sandboxed programs (empirically, this is enough for most
> >> programs you want to sandbox). With that in mind, this patch and a
> >> FUSE providing an overlay with cpuinfo / seccomp intercepting opens of
> >> /proc/cpuinfo / a small kernel patch with a new mount option for procfs
> >> to expose more static files still look like a clean solution to me.
> >
> > I don't think you'll be able to do that. procfs doesn't allow itself to
> > be overlaid [1], which should block mounting overlayfs and fuse on top
> > of procfs.
> >
> > [1] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/proc/root.c#n274
>
> This is why I have been careful not to say overlayfs. With [2] (warning:
> zero-shot ChatGPT output), I can do:
>
> $ ./fuse-overlay target --source=/proc
> $ ls target
> 1 88 194 1374 889840 908552
> 2 90 195 1375 889987 908619
> 3 91 196 1379 890031 908658
> 4 92 203 1412 890063 908756
> 5 93 205 1590 890085 908804
> 6 94 233 1644 890139 908951
> 7 96 237 1802 890246 909848
> 8 97 239 1850 890271 909914
> 10 98 240 1852 894665 909924
> 13 99 243 1865 895854 909926
> 15 100 244 1888 895864 910005
> 16 102 246 1889 896030 acpi
> 17 103 262 1891 896205 asound
> 18 104 263 1895 896508 bus
> 19 105 264 1896 896544 driver
> 20 106 265 1899 896706 dynamic_debug
> <...>
>
> [2] https://gist.github.com/DanShaders/547eeb74a90315356b98472feae47474
>
> This requires much more careful thought wrt magic symlinks
> and permission checks. The fact that I am highly unlikely to 100%
> correctly reimplement the checks and special behavior of procfs makes me
> not want to proceed with the FUSE route.
>
> On 12/15/25 6:30 AM, Christian Brauner wrote:
> > The standard way of making it possible to mount procfs inside of a
> > container with a separate mount namespace that has a procfs inside it
> > with overmounted entries is to ensure that a fully-visible procfs
> > instance is present.
>
> Yes, this is a solution. However, this is only marginally better than
> passing --privileged to the outer container (in the sense that we require
> the outer sandbox to remove some protections for the inner sandbox to work).
>
> > The container needs to inherit a fully-visible instance somehow if you
> > want nesting. Using an unprivileged LSM such as landlock to prevent any
> > access to the fully visible procfs instance is usually the better way.
> >
> > My hope is that once signed bpf is more widely adopted that distros will
> > just start enabling blessed bpf programs that will just take on the
> > access protection instead of the clumsy bind-mount protection mechanism.
>
> These are big changes to container runtimes that are unlikely to happen
> soon. In contrast, the patch we are discussing will be available 2
> months after the merge for me to use on Arch Linux, and in a couple more
> months on Ubuntu.
>
> So, is there any way forward with the patch or should I continue trying
> to find a userspace solution?

I still consider these patches useful. I made them precisely to remove
some of the restrictions that procfs has because of the global files in
the root of this filesystem.

I can update the patchset and prepare a new version if Christian thinks
it's useful too.
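
To make the "global files in the root" distinction concrete, here is a
minimal sketch that splits a top-level /proc listing into per-process
entries and everything else. It is an illustration only: the helper name
split_proc_entries and the KEPT_SYMLINKS set are hypothetical, and the
assumption that "self" and "thread-self" stay visible under subset=pid
should be checked against the kernel documentation rather than taken
from this snippet.

```python
# Illustrative sketch, not kernel code: classify top-level /proc entry
# names as per-process or global. Names and sets here are assumptions.

# Assumed to remain visible when procfs is mounted with subset=pid.
KEPT_SYMLINKS = {"self", "thread-self"}


def split_proc_entries(names):
    """Return (per_process, global_entries) for a list of /proc entry names."""
    per_process, global_entries = [], []
    for name in names:
        # PID directories are named by a decimal number, e.g. "1" or "889840".
        is_pid = name.isascii() and name.isdigit()
        if is_pid or name in KEPT_SYMLINKS:
            per_process.append(name)
        else:
            global_entries.append(name)
    return per_process, global_entries


# Sample names taken from the `ls target` listing quoted above.
sample = ["1", "2", "889840", "acpi", "asound", "bus", "driver", "dynamic_debug"]
pids, others = split_proc_entries(sample)
print("per-process:", pids)
print("global:", others)
```

Under that assumption, the entries in the second list are the ones a
subset=pid mount would not expose, which is why files such as
/proc/cpuinfo and /proc/meminfo came up earlier in the thread.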
--
Rgrds, legion