Date:   Wed, 28 Sep 2022 18:20:10 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Joel Fernandes <joel@...lfernandes.org>,
        Hyeonggon Yoo <42.hyeyoo@...il.com>
Cc:     Hugh Dickins <hughd@...gle.com>,
        Matthew Wilcox <willy@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: amusing SLUB compaction bug when CC_OPTIMIZE_FOR_SIZE

On 9/28/22 15:48, Joel Fernandes wrote:
> On Wed, Sep 28, 2022 at 02:49:02PM +0900, Hyeonggon Yoo wrote:
>> On Tue, Sep 27, 2022 at 10:16:35PM -0700, Hugh Dickins wrote:
>>> It's a bug in linux-next, but taking me too long to identify which
>>> commit is "to blame", so let me throw it over to you without more
>>> delay: I think __PageMovable() now needs to check !PageSlab().

When I tried that, the result wasn't really nice:

https://lore.kernel.org/all/aec59f53-0e53-1736-5932-25407125d4d4@suse.cz/

And what if another conflicting page "type" shows up later? Or the
debugging variant of rcu_head in struct page itself? __PageMovable()
is just too fragile.

>>> I had made a small experimental change somewhere, rebuilt and rebooted,
>>> was not surprised to crash once swapping and compaction came in,
>>> but was surprised to find the crash in isolate_movable_page(),
>>> called by compaction's isolate_migratepages_block().
>>>
>>> page->mapping was ffffffff811303aa, which qualifies as __PageMovable(),
>>> which expects struct movable_operations at page->mapping minus low bits.
>>> But ffffffff811303aa was the address of SLUB's rcu_free_slab(): I have
>>> CONFIG_CC_OPTIMIZE_FOR_SIZE=y, so function addresses may have low bits set.
>>>
>>> Over to you! Thanks,
>>> Hugh
>>
>> Wow, didn't expect this.
>> Thank you for report!
>>
>> That should be due to commit 65505d1f2338e7
>> ("mm/sl[au]b: rearrange struct slab fields to allow larger rcu_head")
>> as now rcu_head can use some bits that shares with mapping.
>>
>> Hmm IMO we have two choices...
>>
>> 1. simply drop the commit as it's only for debugging (RCU folks may not like [1])
> 
> Yeah definitely don't like this option as patches are out that depend on
> this (not yet merged though). :-)

But we'll have to do that for now and postpone to 6.2, I'm afraid, as
the merge window for 6.1 is too close to have confidence in any solution
that we come up with at this moment.

>> 2. make __PageMovable() to use true page flag, with approach [2])
> 
> What are the drawbacks of making it a true flag?

Even if we free up the PageSlab flag, I'm sure there will be better uses
of a free page flag than __PageMovable.

3. With frozen page allocation
https://lore.kernel.org/all/20220809171854.3725722-1-willy@infradead.org/

slab pages will have refcount 0 and compaction will skip them for that 
reason. But it had other unresolved issues with page isolation code IIRC.

> thanks,
> 
>   - Joel
> 
> 
> 
> 
>> [1] https://lore.kernel.org/all/85afd876-d8bb-0804-b2c5-48ed3055e702@joelfernandes.org/
>> [2] https://lore.kernel.org/linux-mm/20220919125708.276864-1-42.hyeyoo@gmail.com/
>>
>> Thanks!
>>
>> -- 
>> Thanks,
>> Hyeonggon
