Date:   Mon, 24 Oct 2022 21:19:57 -0700 (PDT)
From:   Hugh Dickins <hughd@...gle.com>
To:     Vlastimil Babka <vbabka@...e.cz>
cc:     Matthew Wilcox <willy@...radead.org>,
        Hyeonggon Yoo <42.hyeyoo@...il.com>,
        Hugh Dickins <hughd@...gle.com>,
        David Laight <David.Laight@...lab.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        rcu@...r.kernel.org
Subject: Re: amusing SLUB compaction bug when CC_OPTIMIZE_FOR_SIZE

On Mon, 24 Oct 2022, Vlastimil Babka wrote:
> On 10/3/22 19:00, Matthew Wilcox wrote:
> > On Sun, Oct 02, 2022 at 02:48:02PM +0900, Hyeonggon Yoo wrote:
> >> Just one more thing, rcu_leak_callback too. RCU seems to use it
> >> internally to catch double call_rcu().
> >> 
> >> And some suggestions:
> >> - what about adding runtime WARN() on slab init code to catch
> >>   unexpected arch/toolchain issues?
> >> - instead of 4, we may use macro definition? like (PAGE_MAPPING_FLAGS + 1)?
> > 
> > I think the real problem here is that isolate_movable_page() is
> > insufficiently paranoid.  Looking at the gyrations that GUP and the
> > page cache do to convince themselves that the page they got really is
> > the page they wanted, there are a few missing pieces (eg checking that
> > you actually got a refcount on _this_ page and not some random other
> > page you were temporarily part of a compound page with).
> > 
> > This patch does four things:
> > 
> >  - Turns one of the comments into English.  There are some others
> >    which I'm still scratching my head over.
> >  - Uses a folio to help distinguish which operations are being done
> >    to the head vs the specific page (this is somewhat an abuse of the
> >    folio concept, but it's acceptable)
> >  - Add the aforementioned check that we're actually operating on the
> >    page that we think we want to be.
> >  - Add a check that the folio isn't secretly a slab.
> > 
> > We could put the slab check in PageMapping and call it after taking
> > the folio lock, but that seems pointless.  It's the acquisition of
> > the refcount which stabilises the slab flag, not holding the lock.
> > 
> 
> I would like to have a working safe version in -next, even if we are able
> to simplify it later thanks to frozen refcounts. I've made a formal patch of
> yours, but I'm still convinced the slab check needs to be more paranoid so
> it can't observe a false positive __folio_test_movable() while missing the
> folio_test_slab(), hence I added the barriers as in my previous attempt [1].
> Does that work for you and can I add your S-o-b?
> 
> [1] https://lore.kernel.org/all/aec59f53-0e53-1736-5932-25407125d4d4@suse.cz/

Ignore me, don't let me distract if you're happy with Matthew's patch
(I know little of PageMovable, and I haven't tried to understand it);
but it did look to me more like 6.2 material, and I was surprised that
you dropped the simple align(4) approach for 6.1.

Because of Hyeonggon's rcu_leak_callback() observation?  That was a
good catch, but turned out to be irrelevant, because it was only for
an RCU debugging option, which would never be set up on a struct page
(well, maybe it would in a dynamically-allocated-struct-page future).

Hugh
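
For readers following the thread, the low-bit test that the "4" and
"(PAGE_MAPPING_FLAGS + 1)" remarks refer to can be sketched in userspace C.
This is only an illustration, not code from any of the patches discussed:
the flag values mirror the kernel's page-flags.h, while looks_movable(),
flag_bits_clear() and check_examples() are hypothetical helpers, and it
assumes the concern is a pointer-sized word overlaying page->mapping whose
low bits happen to equal PAGE_MAPPING_MOVABLE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values mirror the kernel's include/linux/page-flags.h. */
#define PAGE_MAPPING_ANON    0x1UL
#define PAGE_MAPPING_MOVABLE 0x2UL
#define PAGE_MAPPING_FLAGS   (PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)

/*
 * Hypothetical helper: the same low-bit comparison the movable test
 * applies to page->mapping, here applied to an arbitrary word.
 */
static bool looks_movable(uintptr_t word)
{
	return (word & PAGE_MAPPING_FLAGS) == PAGE_MAPPING_MOVABLE;
}

/*
 * Hypothetical helper: true when addr is a multiple of
 * PAGE_MAPPING_FLAGS + 1 (i.e. 4), so neither flag bit is set.
 */
static bool flag_bits_clear(uintptr_t addr)
{
	return (addr & PAGE_MAPPING_FLAGS) == 0;
}

/* Illustrative addresses only; they are not taken from a real kernel. */
static void check_examples(void)
{
	/* A 2-byte-aligned address can alias the movable pattern. */
	assert(looks_movable((uintptr_t)0x1002));

	/* A 4-byte-aligned address never can. */
	assert(flag_bits_clear((uintptr_t)0x1004));
	assert(!looks_movable((uintptr_t)0x1004));
}
```

Any value aligned to PAGE_MAPPING_FLAGS + 1 fails the movable comparison,
which is why an alignment of 4 is enough to rule out this particular false
positive; it does nothing for the refcount and slab checks discussed above.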
