[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YsM5XHy4RZUDF8cR@ZenIV>
Date: Mon, 4 Jul 2022 20:02:52 +0100
From: Al Viro <viro@...iv.linux.org.uk>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Alexander Potapenko <glider@...gle.com>,
Alexei Starovoitov <ast@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Andrey Konovalov <andreyknvl@...gle.com>,
Andy Lutomirski <luto@...nel.org>,
Arnd Bergmann <arnd@...db.de>, Borislav Petkov <bp@...en8.de>,
Christoph Hellwig <hch@....de>,
Christoph Lameter <cl@...ux.com>,
David Rientjes <rientjes@...gle.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Eric Dumazet <edumazet@...gle.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Herbert Xu <herbert@...dor.apana.org.au>,
Ilya Leoshkevich <iii@...ux.ibm.com>,
Ingo Molnar <mingo@...hat.com>, Jens Axboe <axboe@...nel.dk>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Kees Cook <keescook@...omium.org>,
Marco Elver <elver@...gle.com>,
Mark Rutland <mark.rutland@....com>,
Matthew Wilcox <willy@...radead.org>,
"Michael S. Tsirkin" <mst@...hat.com>,
Pekka Enberg <penberg@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Petr Mladek <pmladek@...e.com>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
Vasily Gorbik <gor@...ux.ibm.com>,
Vegard Nossum <vegard.nossum@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>,
kasan-dev <kasan-dev@...glegroups.com>,
Linux-MM <linux-mm@...ck.org>,
linux-arch <linux-arch@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Evgenii Stepanov <eugenis@...gle.com>,
Nathan Chancellor <nathan@...nel.org>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Segher Boessenkool <segher@...nel.crashing.org>,
Vitaly Buka <vitalybuka@...gle.com>,
linux-toolchains <linux-toolchains@...r.kernel.org>
Subject: Re: [PATCH v4 43/45] namei: initialize parameters passed to
step_into()
On Mon, Jul 04, 2022 at 10:36:05AM -0700, Linus Torvalds wrote:
> For example, in __follow_mount_rcu(), when we jump to a new mount
> point, and that sequence has
>
> *seqp = read_seqcount_begin(&dentry->d_seq);
>
> to reset the sequence number to the new path we jumped into.
>
> But I don't actually see what checks the previous sequence number in
> that path. We just reset it to the new one.
Theoretically it could be a problem. We have /mnt/foo/bar and
/mnt/baz/bar. Something's mounted on /mnt/foo, hiding /mnt/foo/bar.
We start a pathwalk for /mnt/baz/bar,
someone umounts /mnt/foo and swaps /mnt/foo to /mnt/baz before
we get there. We are doomed to get -ECHILD from an attempt to
legitimize in the end, no matter what. However, we might get
a hard error (-ENOENT, for example) before that, if we pick up
the old mount that used to be on top of /mnt/foo (now /mnt/baz)
and had been detached before the damn thing had become /mnt/baz
and notice that there's no "bar" in its root.
It used to be impossible (rename would've failed if the target had
been non-empty and had we managed to empty it first, well, there's
your point when -ENOENT would've been accurate). With exchange...
Yes, it's a possible race.
Might need to add
if (read_seqretry(&mount_lock, nd->m_seq))
return false;
in there. And yes, it's a nice demonstration of how subtle and
brittle RCU pathwalk is - nobody noticed this bit of fun back when
RENAME_EXCHANGE had been added... It got a lot more readable these
days, but...
> For __follow_mount_rcu it looks like validating the previous sequence
> number is left to the caller, which then does try_to_unlazy_next().
Not really - the caller goes there only if we have __follow_mount_rcu()
say "it's too tricky for me, get out of RCU mode and deal with it
there".
Anyway, I've thrown a mount_lock check in there, running xfstests to
see how it goes...
Powered by blists - more mailing lists