Message-ID: <20250213-fehlgriff-filmt-1dfdd558ab78@brauner>
Date: Thu, 13 Feb 2025 15:56:35 +0100
From: Christian Brauner <brauner@...nel.org>
To: Jeff Layton <jlayton@...nel.org>
Cc: Zicheng Qu <quzicheng@...wei.com>,
Linus Torvalds <torvalds@...ux-foundation.org>, axboe@...nel.dk, joel.granados@...nel.org, tglx@...utronix.de,
viro@...iv.linux.org.uk, hch@....de, len.brown@...el.com, pavel@....cz,
pengfei.xu@...el.com, rafael@...nel.org, tanghui20@...wei.com, zhangqiao22@...wei.com,
judy.chenhui@...wei.com, linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
syzkaller-bugs@...glegroups.com, linux-pm@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH 0/2] acct: don't allow access to internal filesystems
On Wed, Feb 12, 2025 at 12:16:44PM +0100, Christian Brauner wrote:
> On Tue, Feb 11, 2025 at 01:56:41PM -0500, Jeff Layton wrote:
> > On Tue, 2025-02-11 at 18:15 +0100, Christian Brauner wrote:
> > > In [1] it was reported that the acct(2) system call can be used to
> > > trigger a NULL deref in cases where it is set to write to a file that
> > > triggers an internal lookup.
> > >
> > > This can happen, e.g., when pointing acct(2) to /sys/power/resume. At the
> > > point where the write to this file happens the calling task has
> > > already exited and called exit_fs() but an internal lookup might be
> > > triggered through lookup_bdev(). This may trigger a NULL-deref
> > > when accessing current->fs.
> > >
> > > This series does two things:
> > >
> > > - Reorganize the code so that the final write happens from the
> > > workqueue but with the caller's credentials. This preserves the
> > > (strange) permission model and has almost no regression risk.
> > >
> > > - Block access to kernel internal filesystems as well as procfs and
> > > sysfs in the first place.
> > >
> > > This API should cease to exist imho.
> > >
> >
> > I wonder who uses it these days, and what would we suggest they replace
> > it with? Maybe syscall auditing?
>
> Someone pointed me to atop but that also works without it. Since this is
> a privileged api I think the natural candidate to replace all of this is
> bpf. I'm pretty sure that it's relatively straightforward to get a lot
> more information out of it than with acct(2) and it will probably be
> more performant too.
>
> Without any limitations, as it is right now, acct(2) can quite easily lock
> up the system by pointing it to various things in sysfs and
> I'm sure it can be abused in other ways. So I wouldn't enable it.
And I totally forgot about taskstats via Netlink:
https://www.kernel.org/doc/Documentation/accounting/taskstats.txt
include/uapi/linux/taskstats.h
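
For illustration only, the "block procfs and sysfs" idea from the second
bullet of the cover letter can be approximated from userspace. This is a
minimal sketch, not the kernel patch: it assumes a Linux system with the
uapi headers installed, and the helper names magic_is_blocked() and
path_is_blocked() are hypothetical. It classifies a path by the filesystem
magic number that statfs(2) reports; kernel-internal filesystems that are
never mounted in userspace cannot be detected this way.

```c
#include <sys/vfs.h>      /* statfs(2), struct statfs */
#include <linux/magic.h>  /* PROC_SUPER_MAGIC, SYSFS_MAGIC */

/*
 * Hypothetical helper: returns 1 if the filesystem magic number belongs
 * to procfs or sysfs, 0 otherwise.
 */
int magic_is_blocked(long magic)
{
	return magic == PROC_SUPER_MAGIC || magic == SYSFS_MAGIC;
}

/*
 * Hypothetical helper: returns 1 if @path lives on procfs or sysfs,
 * 0 if it lives elsewhere, and -1 if statfs(2) fails (errno is set).
 */
int path_is_blocked(const char *path)
{
	struct statfs sfs;

	if (statfs(path, &sfs) != 0)
		return -1;

	return magic_is_blocked((long)sfs.f_type);
}
```

On a system with sysfs mounted at /sys, path_is_blocked("/sys/power/resume")
would be expected to return 1, while a regular file on a disk filesystem
would return 0. A privileged tool could run such a check before handing a
path to acct(2), though only a check inside the kernel is race-free.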