[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y8f0miUc//BQXN3A@feng-clx>
Date: Wed, 18 Jan 2023 21:31:06 +0800
From: Feng Tang <feng.tang@...el.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: Vlastimil Babka <vbabka@...e.cz>,
"Sang, Oliver" <oliver.sang@...el.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
"oe-lkp@...ts.linux.dev" <oe-lkp@...ts.linux.dev>,
lkp <lkp@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Jann Horn <jannh@...gle.com>,
"Song, Youquan" <youquan.song@...el.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Jan Kara <jack@...e.cz>, John Hubbard <jhubbard@...dia.com>,
"Kirill A . Shutemov" <kirill@...temov.name>,
Matthew Wilcox <willy@...radead.org>,
Michal Hocko <mhocko@...nel.org>,
Muchun Song <songmuchun@...edance.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Hyeonggon Yoo <42.hyeyoo@...il.com>,
"Yin, Fengwei" <fengwei.yin@...el.com>, <hongjiu.lu@...el.com>
Subject: Re: [linus:master] [hugetlb] 7118fc2906:
kernel_BUG_at_lib/list_debug.c
On Tue, Jan 17, 2023 at 10:25:17AM -0800, Linus Torvalds wrote:
> On Tue, Jan 17, 2023 at 4:22 AM Feng Tang <feng.tang@...el.com> wrote:
> >
> > With the following patch to use 'O1' instead 'O2' gcc optoin for
> > page_alloc.c, the list corruption issue can't be reproduced for
> > commit 7118fc2906 in 1000 runs.
>
> Ugh.
>
> It would be lovely if you could just narrow it down with
>
> #pragma GCC optimize ("O1")
> ...
> #pragma GCC optimize ("O2")
>
> around just that prep_compound_page(), but when I tried it myself I
> get some function attribute mismatch errors.
Yes, this works with both my local build and 0Day's build system,
and it also makes the issue gone.
> > As is can't be reproduced with X86_64 build, it could be i386
> > compiling related.
>
> Your particular config causes a huge amount of nasty 64-bit arithmetic
> according to the objdump code, with sequences like
>
> c13b3cbb: 83 05 d0 28 6c c5 01 addl $0x1,0xc56c28d0
> c13b3cc2: 83 15 d4 28 6c c5 00 adcl $0x0,0xc56c28d4
>
> which seems to be just from some coverage profiling being on
> (CONFIG_GCOV?), or something. It makes it very hard to read the code.
>
> You also have UBSAN enabled, which - again - makes for some really
> grotty asm that hides any actual logic.
>
> Finally, your objdump version also does some horrendous decoding, like
>
> c13b3e29: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi
I know little about these tools, and I tried objdump tool from
Cent OS 9 (objdump version 2.35.2) and Ubuntu 22.04 (objdump version
2.38), they both dumped similar assembly. Please let me know if you
want us to try other version of objdump.
> which is just a 7-byte 'nop' instruction, but again, it makes it
> rather hard to actually read the code.
>
> With the i386 defconfig, gcc generates a function that is just ~30
> instructions for me, so this makes a huge difference in the legibility
> of the code.
>
> I wonder if you can recreate the issue with a much more
> straightforward config. By all means, leave DEBUG_PAGEALLOC and SLUB
> debugging on, but without the things like UBSAN and GCOV.
We modify the kconfig to disable GCOV and UBSAN, and the issue can't
be reproudced in 1000 runs.
The objdump of prep_compound_page() for the 2 commits are also
attached for check.
Anothe info is, as you pointed out GCOV and UBSAN, we tried to
disable GCOV/UBSAN for page_alloc.c in mm/Makefile separtely
(thanks to Fengwei Yin's tip), by adding lines:
a) UBSAN_SANITIZE_page_alloc.o := n
b) GCOV_PROFILE_page_alloc.o := n
The issue also cannot be reproduced with either one (with
original kconfig)
> > I also objdumped 'prep_compound_page' for vmlinux of 7118fc2906 and
> > its parent commit 48b8d744ea84, which have big difference than the
> > simple 'set_page_count()' change, but I can't tell which part is
> > abnormal, so attach them for further check.
>
> Yeah, I can't make heads or tails of them either, see above on how
> illegible the objdump files are. And that's despite not even having
> all of prep_compound_page() in them (it's missing
> prep_compound_page.cold, which is probably just UBSAN fixup code, but
> who knows..)
Yes, from what I can tell that old 'prep_compound_page.cold' is mostly
calling UBSAN functions. And after disbling GCOV+UBSAN, there is no
more prep_compound_page.cold.
Thanks,
Feng
>
> That said, with the i386 defconfig, the only change from adding
> set_page_count() to the loop seems to be exactly that:
>
> .L589:
> - movl $1024, 12(%eax)
> + movl $0, 28(%eax)
> addl $32, %eax
> + movl $1024, -20(%eax)
> movl %esi, -28(%eax)
> movl $0, -12(%eax)
> cmpl %edx, %eax
>
> (don't ask me why gcc does *one* access using the pre-incremented
> pointer, and then the rest to the post-incremented ones, but whatever
> - it means that it's not just "add a mov $0", it's also changing how
> the
>
> p->mapping = TAIL_MAPPING;
>
> instruction is done, which is that
>
> - movl $1024, 12(%eax)
> + movl $1024, -20(%eax)
>
> part of the change)
>
> Linus
View attachment "7118fc29.log" of type "text/plain" (3713 bytes)
View attachment "48b8d744.log" of type "text/plain" (4707 bytes)
Powered by blists - more mailing lists