Message-ID: <Y8f0miUc//BQXN3A@feng-clx>
Date:   Wed, 18 Jan 2023 21:31:06 +0800
From:   Feng Tang <feng.tang@...el.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
CC:     Vlastimil Babka <vbabka@...e.cz>,
        "Sang, Oliver" <oliver.sang@...el.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        "oe-lkp@...ts.linux.dev" <oe-lkp@...ts.linux.dev>,
        lkp <lkp@...el.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Jann Horn <jannh@...gle.com>,
        "Song, Youquan" <youquan.song@...el.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Jan Kara <jack@...e.cz>, John Hubbard <jhubbard@...dia.com>,
        "Kirill A . Shutemov" <kirill@...temov.name>,
        Matthew Wilcox <willy@...radead.org>,
        Michal Hocko <mhocko@...nel.org>,
        Muchun Song <songmuchun@...edance.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Hyeonggon Yoo <42.hyeyoo@...il.com>,
        "Yin, Fengwei" <fengwei.yin@...el.com>, <hongjiu.lu@...el.com>
Subject: Re: [linus:master] [hugetlb] 7118fc2906:
 kernel_BUG_at_lib/list_debug.c

On Tue, Jan 17, 2023 at 10:25:17AM -0800, Linus Torvalds wrote:
> On Tue, Jan 17, 2023 at 4:22 AM Feng Tang <feng.tang@...el.com> wrote:
> >
> > With the following patch to use 'O1' instead of 'O2' as the gcc option for
> > page_alloc.c, the list corruption issue can't be reproduced for
> > commit 7118fc2906 in 1000 runs.
> 
> Ugh.
> 
> It would be lovely if you could just narrow it down with
> 
>   #pragma GCC optimize ("O1")
>  ...
>   #pragma GCC optimize ("O2")
> 
> around just that prep_compound_page(), but when I tried it myself I
> get some function attribute mismatch errors.

Yes, this works with both my local build and 0Day's build system,
and it also makes the issue go away.
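
For readers who want to try the same kind of narrowing, here is a
minimal user-space sketch of limiting an optimization level to one
function. It is not the kernel change itself: 'struct fake_page' and
prep_fake_tail_pages() are hypothetical stand-ins, and the sketch uses
GCC's push_options/pop_options pragmas to restore the previous
settings instead of a second optimize ("O2") pragma.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for illustration only, not the kernel's struct page. */
struct fake_page {
	unsigned long mapping;
	int refcount;
};

/* Save the current options, then build only the next function at -O1. */
#pragma GCC push_options
#pragma GCC optimize ("O1")
void prep_fake_tail_pages(struct fake_page *pages, size_t nr)
{
	assert(pages != NULL);

	/* Entry 0 plays the role of the head page and is left untouched. */
	for (size_t i = 1; i < nr; i++) {
		pages[i].mapping = 0x400;	/* assumed poison-style marker value */
		pages[i].refcount = 0;
	}
}
/* Restore whatever level the rest of the file was being built with. */
#pragma GCC pop_options
```

Compilers that do not know these pragmas should just warn and build the
function at the file's normal level, so the sketch only narrows
anything when built with gcc.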
 
> > As it can't be reproduced with an X86_64 build, it could be related to
> > i386 compilation.
> 
> Your particular config causes a huge amount of nasty 64-bit arithmetic
> according to the objdump code, with sequences like
> 
>   c13b3cbb:       83 05 d0 28 6c c5 01    addl   $0x1,0xc56c28d0
>   c13b3cc2:       83 15 d4 28 6c c5 00    adcl   $0x0,0xc56c28d4
> 
> which seems to be just from some coverage profiling being on
> (CONFIG_GCOV?), or something. It makes it very hard to read the code.
> 
> You also have UBSAN enabled, which - again - makes for some really
> grotty asm that hides any actual logic.
> 
> Finally, your objdump version also does some horrendous decoding, like
> 
>   c13b3e29:       8d b4 26 00 00 00 00    lea    0x0(%esi,%eiz,1),%esi

I know little about these tools. I tried the objdump from CentOS 9
(objdump version 2.35.2) and from Ubuntu 22.04 (objdump version 2.38),
and both produced similar assembly. Please let me know if you want us
to try another version of objdump.

> which is just a 7-byte 'nop' instruction, but again, it makes it
> rather hard to actually read the code.
> 
> With the i386 defconfig, gcc generates a function that is just ~30
> instructions for me, so this makes a huge difference in the legibility
> of the code.
> 
> I wonder if you can recreate the issue with a much more
> straightforward config. By all means, leave DEBUG_PAGEALLOC and SLUB
> debugging on, but without the things like UBSAN and GCOV.

We modified the kconfig to disable GCOV and UBSAN, and the issue can't
be reproduced in 1000 runs.

The objdump output of prep_compound_page() for the 2 commits is also
attached for checking.

Another data point: since you pointed out GCOV and UBSAN, we tried to
disable GCOV/UBSAN for page_alloc.c separately in mm/Makefile
(thanks to Fengwei Yin's tip), by adding the lines:

a) UBSAN_SANITIZE_page_alloc.o := n
b) GCOV_PROFILE_page_alloc.o := n 

The issue also cannot be reproduced with either one of them applied
(with the original kconfig).
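
As background on what the GCOV line removes from page_alloc.o: the
addl/adcl pairs quoted above are what a 64-bit counter increment looks
like on a 32-bit target. The sketch below only illustrates that
add-with-carry pattern in plain C; 'struct split_counter' and its
helpers are hypothetical names, not gcov's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit counter stored as two 32-bit words. */
struct split_counter {
	uint32_t lo;	/* low word, the target of "addl $0x1" */
	uint32_t hi;	/* high word, the target of "adcl $0x0" */
};

void split_counter_inc(struct split_counter *c)
{
	assert(c != NULL);

	uint32_t old = c->lo;

	c->lo = old + 1;		/* add 1 to the low word, may wrap to 0 */
	c->hi += (c->lo < old);		/* propagate the carry into the high word */
}

uint64_t split_counter_value(const struct split_counter *c)
{
	assert(c != NULL);
	return ((uint64_t)c->hi << 32) | c->lo;
}
```

Every instrumented arc gets a pair like this, which is why the
disassembly becomes so much longer with profiling enabled.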

> > I also objdumped 'prep_compound_page' for vmlinux of 7118fc2906 and
> > its parent commit 48b8d744ea84. They differ by much more than the
> > simple 'set_page_count()' change, but I can't tell which part is
> > abnormal, so I attached them for further check.
> 
> Yeah, I can't make heads or tails of them either, see above on how
> illegible the objdump files are. And that's despite not even having
> all of prep_compound_page() in them (it's missing
> prep_compound_page.cold, which is probably just UBSAN fixup code, but
> who knows..)

Yes, from what I can tell, the old 'prep_compound_page.cold' is mostly
calling UBSAN functions. After disabling GCOV+UBSAN, there is no
more prep_compound_page.cold.

Thanks,
Feng

> 
> That said, with the i386 defconfig, the only change from adding
> set_page_count() to the loop seems to be exactly that:
> 
>  .L589:
> -       movl    $1024, 12(%eax)
> +       movl    $0, 28(%eax)
>         addl    $32, %eax
> +       movl    $1024, -20(%eax)
>         movl    %esi, -28(%eax)
>         movl    $0, -12(%eax)
>         cmpl    %edx, %eax
> 
> (don't ask me why gcc does *one* access using the pre-incremented
> pointer, and then the rest to the post-incremented ones, but whatever
> - it means that it's not just "add a mov $0", it's also changing how
> the
> 
>         p->mapping = TAIL_MAPPING;
> 
> instruction is done, which is that
> 
> -       movl    $1024, 12(%eax)
> +       movl    $1024, -20(%eax)
> 
> part of the change)
> 
>              Linus
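
To make the offsets in the quoted defconfig loop easier to follow, here
is a rough C model of it. The field layout of 'struct fake_page32' is
an assumption reconstructed only from the offsets in that assembly (4,
12, 20, 28 and a 32-byte stride); it is not copied from the kernel's
struct page, and the statement order follows the assembly rather than
the C source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_TAIL_MAPPING 0x400u	/* 1024, the constant in "movl $1024" */

/* Hypothetical 32-byte layout inferred from the quoted i386 offsets. */
struct fake_page32 {
	uint32_t flags;		/* offset  0 */
	uint32_t compound_head;	/* offset  4 */
	uint32_t pad8;		/* offset  8 */
	uint32_t mapping;	/* offset 12 */
	uint32_t index;		/* offset 16 */
	uint32_t private_data;	/* offset 20 */
	uint32_t mapcount;	/* offset 24 */
	uint32_t refcount;	/* offset 28 */
};

static_assert(sizeof(struct fake_page32) == 32, "stride of addl $32, %eax");
static_assert(offsetof(struct fake_page32, mapping) == 12, "12(%eax)");
static_assert(offsetof(struct fake_page32, refcount) == 28, "28(%eax)");

void prep_fake_tails(struct fake_page32 *page, size_t nr, uint32_t head_tag)
{
	assert(page != NULL);

	for (size_t i = 1; i < nr; i++) {
		struct fake_page32 *p = &page[i];

		p->refcount = 0;		/* movl $0, 28(%eax), before the addl */
		p->mapping = FAKE_TAIL_MAPPING;	/* movl $1024, -20(%eax): 32 - 20 = 12 */
		p->compound_head = head_tag;	/* movl %esi, -28(%eax):  32 - 28 = 4  */
		p->private_data = 0;		/* movl $0, -12(%eax):    32 - 12 = 20 */
	}
}
```

The negative displacements are the same fields addressed after the
pointer has already been advanced by 32, so 12(%eax) before the addl
and -20(%eax) after it name the same word.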

View attachment "7118fc29.log" of type "text/plain" (3713 bytes)

View attachment "48b8d744.log" of type "text/plain" (4707 bytes)
