Message-ID: <CAHk-=wiss_E41A1uH0-1MXF-GjxzW_Rbz+Xbs+fbr-vyQFpo4g@mail.gmail.com>
Date: Tue, 6 Aug 2024 10:49:58 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Guenter Roeck <linux@...ck-us.net>, Heiko Carstens <hca@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>, Alexander Gordeev <agordeev@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>
Cc: Vlastimil Babka <vbabka@...e.cz>, linux-kernel@...r.kernel.org,
Linux-MM <linux-mm@...ck.org>, Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 6.10 000/809] 6.10.3-rc3 review
[ Adding s390 people, this is strange ]
New people, see
https://lore.kernel.org/all/CAHk-=wjmumbT73xLkSAnnxDwaFE__Ny=QCp6B_LE2aG1SUqiTg@mail.gmail.com/
for context. There's a heisenbug that depends on random code layout
issues on s390.
On Tue, 6 Aug 2024 at 10:34, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> Hmm. Do we have some alignment confusion?
>
> The alignment rules for 192 are to align it to 64-byte boundaries
> (because that's the largest power of two that divides it), and that
> means it stays at 192, and that would give 21 objects per 4kB page.
>
> But if we use the "align up to next power of two", you get 256 bytes,
> and 16 objects per page.
>
> And that 21-vs-16 confusion would seem to match this pretty well:
>
> [ 0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16
>
> which makes me wonder...
I'd suspect commit ad59baa31695 ("slab, rust: extend kmalloc()
alignment guarantees to remove Rust padding"), perhaps with some odd
s390 code generation issue for 'ffs()'.
IOW, this new code in mm/slab_common.c
        if (flags & SLAB_KMALLOC)
                align = max(align, 1U << (ffs(size) - 1));
might not match some other alignment code.
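To make the 21-vs-16 arithmetic concrete, here is a small userspace sketch of the two alignment rules. The helper names and the PAGE_BYTES constant are made up for illustration and are not the kernel's; it also ignores any per-slab metadata, so it only reproduces the raw division.

```c
#include <strings.h>	/* userspace ffs() */

#define PAGE_BYTES 4096u	/* assumption: 4kB page, as in the quoted text */

/* Rule 1: largest power of two that divides size (size must be non-zero). */
static inline unsigned int pow2_divisor_align(unsigned int size)
{
	return 1U << (ffs((int)size) - 1);
}

/* Rule 2: round size up to the next power of two. */
static inline unsigned int roundup_pow2(unsigned int size)
{
	unsigned int p = 1;

	while (p < size)
		p <<= 1;
	return p;
}

/* Round size up to a multiple of align (align must be a power of two). */
static inline unsigned int align_up(unsigned int size, unsigned int align)
{
	return (size + align - 1) & ~(align - 1);
}

/* How many objects of obj_size fit in one page, ignoring metadata. */
static inline unsigned int objs_per_page(unsigned int obj_size)
{
	return PAGE_BYTES / obj_size;
}
```

With size 192, the first rule gives an alignment of 64, align_up(192, 64) stays at 192, and objs_per_page(192) is 21. The second rule gives 256 and objs_per_page(256) is 16. So two places disagreeing on which rule applies would produce exactly that pair of numbers.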
Or maybe it's the s390 ffs().
It looks like
        static inline int ffs(int word)
        {
                unsigned long mask = 2 * BITS_PER_LONG - 1;
                unsigned int val = (unsigned int)word;

                return (1 + (__flogr(-val & val) ^ (BITS_PER_LONG - 1))) & mask;
        }
where s390 has this very odd "flogr" instruction ("find leftmost one",
operating on a 64-bit G register) for the non-constant case.
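For checking what that expression should return, here is a portable model of it. The model_* names are hypothetical, and model_flogr() rests on the assumption that flogr yields the number of leading zero bits in the 64-bit word, and 64 when the word is zero. It says nothing about what the real instruction does with the odd register.

```c
#include <stdint.h>

#define MODEL_BITS_PER_LONG 64	/* assumption: 64-bit long, as on s390x */

/* Assumed flogr result: count of leading zero bits, 64 for a zero word. */
static inline unsigned char model_flogr(uint64_t word)
{
	unsigned char zeros = 0;

	if (word == 0)
		return 64;
	while (!(word & (UINT64_C(1) << 63))) {
		word <<= 1;
		zeros++;
	}
	return zeros;
}

/* Same arithmetic as the ffs() above, with the model in place of __flogr(). */
static inline int model_ffs(int word)
{
	uint64_t mask = 2 * MODEL_BITS_PER_LONG - 1;
	unsigned int val = (unsigned int)word;

	/* -val & val isolates the lowest set bit; the xor turns the
	 * leading-zero count into a bit index counted from the right. */
	return (int)((1 + (model_flogr(-val & val) ^ (MODEL_BITS_PER_LONG - 1))) & mask);
}
```

Under that assumption model_ffs(192) is 7, so 1U << (ffs(size) - 1) is 64, and model_ffs(0) wraps around to 0 through the mask. If the real s390 ffs() ever returns something else for 192, that would point at the code generation rather than the formula.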
That uses a "union register_pair" but only ever uses the "even"
register without ever using the full 128-bit part or the odd register.
So the other register in the register pair is uninitialized.
Does that cause random compiler issues based on register allocation?
Just for fun, does something like this make any difference?
  --- a/arch/s390/include/asm/bitops.h
  +++ b/arch/s390/include/asm/bitops.h
  @@ -305,6 +305,7 @@ static inline unsigned char __flogr(unsigned long word)
                  union register_pair rp;

                  rp.even = word;
  +               rp.odd = 0;
                  asm volatile(
                          "       flogr   %[rp],%[rp]\n"
                          : [rp] "+d" (rp.pair) : : "cc");
Thomas noticed that the special "div by constant" routines moved
around, and I'm not seeing how *that* would matter, but it's all
obviously very strange.
Linus