[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20240806-moosbedeckt-laufbahn-b11f1488a0d6@brauner>
Date: Tue, 6 Aug 2024 13:36:37 +0200
From: Christian Brauner <brauner@...nel.org>
To: Jeff Layton <jlayton@...nel.org>
Cc: Mateusz Guzik <mjguzik@...il.com>,
Alexander Viro <viro@...iv.linux.org.uk>, Jan Kara <jack@...e.cz>,
Andrew Morton <akpm@...ux-foundation.org>, Josef Bacik <josef@...icpanda.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 3/4] lockref: rework CMPXCHG_LOOP to handle
contention better
On Mon, Aug 05, 2024 at 08:52:28AM GMT, Jeff Layton wrote:
> On Mon, 2024-08-05 at 13:44 +0200, Christian Brauner wrote:
> > > Audit not my favorite area of the kernel to work in either. I don't see
> > > a good way to make it rcu-friendly, but I haven't looked too hard yet
> > > either. It would be nice to be able to do some of the auditing under
> > > rcu or spinlock.
> >
> > For audit your main option is to dodge the problem and check whether
> > audit is active and only drop out of rcu if it is. That sidesteps the
> > problem. I'm somewhat certain that a lot of systems don't really have
> > audit active.
> >
>
> I did have an earlier version of 4/4 that checked audit_context() and
> stayed in RCU mode if it comes back NULL. I can resurrect that if you
> think it's worthwhile.
Let's at least see what it looks like. Maybe just use a helper local to
fs/namei.c that returns ECHILD if audit is available and 0 otherwise?
> > From a brief look at audit it would be quite involved to make it work
> > just under rcu. Not just because it does various allocation but it also
> > reads fscaps from disk and so on. That's not going to work unless we add
> > a vfs based fscaps cache similar to what we do for acls. I find that
> > very unlikely.
>
> Yeah. It wants to record a lot of (variable-length) information at very
> inconvenient times. I think we're sort of stuck with it though until
> someone has a vision on how to do this in a non-blocking way.
>
> Handwavy thought: there is some similarity to tracepoints in what
> audit_inode does, and tracepoints are able to be called in all sorts of
> contexts. I wonder if we could leverage the same infrastructure
> somehow? The catch here is that we can't just drop audit records if
> things go wrong.
I can't say much about the tracepoint idea as I lack the necessary
details around their implementation.
I think the better way forward is a model with a fastpath and a
slowpath. Under RCU audit_inode() returns -ECHILD if it sees that it
neeeds to end up doing anything it couldn't do in a non-blocking way and
then path lookup can drop out of RCU and call audit_inode() again.
I think this wouldn't be extremly terrible. It would amount to adding a
flag to audit_inode() AUDIT_MAY_NOT_BLOCK and then on ECHILD
audit_inode() gets called again without that flag.
Over time if people are interested they could then make more and more
stuff available under rcu for audit.
Powered by blists - more mailing lists