[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAMn1gO53G6-sZE8RiLAD2uERbW1XtvyZbRonNGbHonzD058yAA@mail.gmail.com>
Date: Fri, 21 Jan 2022 14:35:36 -0800
From: Peter Collingbourne <pcc@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Andrey Konovalov <andreyknvl@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Memory Management List <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Mel Gorman <mgorman@...e.de>,
"# 3.4.x" <stable@...r.kernel.org>
Subject: Re: [PATCH] mm/mmzone.c: fix page_cpupid_xchg_last() to READ_ONCE()
the page flags
On Thu, Jan 20, 2022 at 2:43 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Wed, Jan 19, 2022 at 03:28:32PM -0800, Peter Collingbourne wrote:
> > On Wed, Jan 19, 2022 at 2:03 AM Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > On Tue, Jan 18, 2022 at 03:05:39PM -0800, Peter Collingbourne wrote:
> > > > After submitting a patch with a compare-exchange loop similar to this
> > > > one to set the KASAN tag in the page flags, Andrey Konovalov pointed
> > > > out that we should be using READ_ONCE() to read the page flags. Fix
> > > > it here.
> > >
> > > What does it actually fix? If it manages to split the read and read
> > > garbage the cmpxchg will fail and we go another round, no harm done.
> >
> > What I wasn't sure about was whether the compiler would be allowed to
> > break this code by hoisting the read of page->flags out of the loop
> > (because nothing in the loop actually writes to page->flags aside from
> > the compare-exchange, and if that succeeds we're *leaving* the loop).
>
> The cmpxchg is a barrier() and as such I don't think it's allowed to
> hoist anything out of the loop.
Yes it looks like it's at least as powerful as a barrier() because the
implementations I looked at (arm64 and x86) use inline asm with memory
operand (i.e. same as barrier()).
> The bigger problem is I think that page_cpuid_last() usage which does a
> second load of page->flags, and given sufficient races that could
> actually load a different value and then things would be screwy. But
> that's not actually fixed.
Right, the patch you provided (which I posted as v2) inlines the
page_cpuid_last() and should resolve this.
> > > > Signed-off-by: Peter Collingbourne <pcc@...gle.com>
> > > > Link: https://linux-review.googlesource.com/id/I2e1f5b5b080ac9c4e0eb7f98768dba6fd7821693
> > >
> > > That's that doing here?
> >
> > I upload my changes to Gerrit and link to them here so that I (and
> > others) can see the progression of the patch via the web UI.
>
> What's the life-time guarantee for that URL existing? Because if it
> becomes part of the git commit, it had better stay around 'forever'
> etc..
I'd be surprised if it went away any time soon, the same hosting is
used for projects like Android and Chromium which have been using it
for years and have a long track record of stability.
Also note that the link is useful independent of the host continuing
to be up, it means that you can also do a search of the mailing list
archive like so:
https://lore.kernel.org/all/?q=I2e1f5b5b080ac9c4e0eb7f98768dba6fd7821693
(or the equivalent link on a different mailing list archive if
lore.kernel.org ever gets shut down) and find the progression of the
patch that way. This is particularly useful if (as in this case) the
subject line of the patch changes.
Peter
Powered by blists - more mailing lists