Message-ID: <3a706169-9fce-48a0-b808-37f347a65a25@roeck-us.net>
Date: Tue, 6 Aug 2024 11:13:16 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>
Cc: Vlastimil Babka <vbabka@...e.cz>, linux-kernel@...r.kernel.org,
Linux-MM <linux-mm@...ck.org>, Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 6.10 000/809] 6.10.3-rc3 review
On 8/6/24 10:49, Linus Torvalds wrote:
> [ Adding s390 people, this is strange ]
>
Did I get lost somewhere? I am seeing this with parisc (64-bit), not s390.
Thanks,
Guenter
> New people, see
>
> https://lore.kernel.org/all/CAHk-=wjmumbT73xLkSAnnxDwaFE__Ny=QCp6B_LE2aG1SUqiTg@mail.gmail.com/
>
> for context. There's a heisenbug that depends on random code layout
> issues on s390.
>
> On Tue, 6 Aug 2024 at 10:34, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
>>
>> Hmm. Do we have some alignment confusion?
>>
>> The alignment rules for 192 are to align it to 64-byte boundaries
>> (because that's the largest power of two that divides it), and that
>> means it stays at 192, and that would give 21 objects per 4kB page.
>>
>> But if we use the "align up to next power of two", you get 256 bytes,
>> and 16 objects per page.
>>
>> And that 21-vs-16 confusion would seem to match this pretty well:
>>
>> [ 0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16
>>
>> which makes me wonder...
>
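To make the 21-vs-16 arithmetic in the quoted text concrete, here is a minimal C sketch. It is illustrative only and is not kernel code: the helper names (largest_pow2_divisor, roundup_pow2, align_up, objects_per_page, demo) are hypothetical, and it assumes a 4096-byte page with no per-slab metadata overhead.

```c
#include <assert.h>
#include <stdio.h>

/* Largest power of two that divides size (size > 0): isolate the lowest set bit. */
static unsigned int largest_pow2_divisor(unsigned int size)
{
	return size & -size;
}

/* Smallest power of two that is >= size (size > 0). */
static unsigned int roundup_pow2(unsigned int size)
{
	unsigned int p = 1;

	while (p < size)
		p <<= 1;
	return p;
}

/* Round size up to a multiple of align; align must be a power of two. */
static unsigned int align_up(unsigned int size, unsigned int align)
{
	return (size + align - 1) & ~(align - 1);
}

/* How many slots of slot_size bytes fit in one page (metadata ignored). */
static unsigned int objects_per_page(unsigned int page_size, unsigned int slot_size)
{
	return page_size / slot_size;
}

/* Print both interpretations for a 192-byte object in a 4 kB page. */
void demo(void)
{
	const unsigned int page = 4096, size = 192;
	unsigned int align = largest_pow2_divisor(size);	/* 64 */
	unsigned int slot_a = align_up(size, align);		/* stays 192 */
	unsigned int slot_b = roundup_pow2(size);		/* 256 */

	assert(slot_a == 192 && slot_b == 256);
	printf("align to %u: %u objects\n", align, objects_per_page(page, slot_a));	/* 21 */
	printf("round to pow2: %u objects\n", objects_per_page(page, slot_b));		/* 16 */
}
```

Calling demo() from a main() reproduces the two counts: if one part of the code sizes slots with the first rule and another computes the maximum with the second, "objects 21 > max 16" is the mismatch you would expect to see.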
> I'd suspect commit ad59baa31695 ("slab, rust: extend kmalloc()
> alignment guarantees to remove Rust padding"), perhaps with some odd
> s390 code generation issue for 'ffs()'.
>
> IOW, this new code in mm/slab_common.c
>
>         if (flags & SLAB_KMALLOC)
>                 align = max(align, 1U << (ffs(size) - 1));
>
> might not match some other alignment code.
>
> Or maybe it's the s390 ffs().
>
> It looks like
>
>     static inline int ffs(int word)
>     {
>             unsigned long mask = 2 * BITS_PER_LONG - 1;
>             unsigned int val = (unsigned int)word;
>
>             return (1 + (__flogr(-val & val) ^ (BITS_PER_LONG - 1))) & mask;
>     }
>
> where s390 has this very odd "flogr" instruction ("find last one G
> register"?) for the non-constant case.
>
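For readers who want to check what the quoted ffs() expression computes, here is a portable C sketch. It rests on an assumption: flogr is modeled as "count of leading zero bits in a 64-bit word, returning 64 for a zero input", which is consistent with the quoted expression but is not taken from the s390 source. The names model_flogr, model_ffs and MODEL_BITS_PER_LONG are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_BITS_PER_LONG 64	/* assumption: 64-bit long, as on s390x */

/* Software stand-in for flogr (assumed semantics): leading zero count, 64 if word == 0. */
static unsigned char model_flogr(uint64_t word)
{
	unsigned char n = 0;

	if (word == 0)
		return 64;
	while (!(word & (UINT64_C(1) << 63))) {
		word <<= 1;
		n++;
	}
	return n;
}

/* Same shape as the quoted ffs(): 1-based index of the lowest set bit, 0 for no bits. */
static int model_ffs(int word)
{
	uint64_t mask = 2 * MODEL_BITS_PER_LONG - 1;
	uint32_t val = (uint32_t)word;

	/*
	 * (-val & val) isolates the lowest set bit. XOR with 63 converts a
	 * leading-zero count into a bit index counted from the right. For
	 * val == 0 the intermediate value is 128, which the mask folds to 0.
	 */
	return (int)((1 + (uint64_t)(model_flogr(-val & val) ^ (MODEL_BITS_PER_LONG - 1))) & mask);
}
```

Under that model, model_ffs(192) is 7, so 1U << (ffs(size) - 1) gives 64 for a 192-byte object, and model_ffs(0) is 0. If the real instruction path ever returned something else for 192, the alignment computed in mm/slab_common.c would change with it.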
> That uses a "union register_pair" but only ever uses the "even"
> register without ever using the full 128-bit part or the odd register.
> So the other register in the register pair is uninitialized.
>
> Does that cause random compiler issues based on register allocation?
>
> Just for fun, does something like this make any difference?
>
> --- a/arch/s390/include/asm/bitops.h
> +++ b/arch/s390/include/asm/bitops.h
> @@ -305,6 +305,7 @@ static inline unsigned char __flogr(unsigned long word)
>          union register_pair rp;
>
>          rp.even = word;
> +        rp.odd = 0;
>          asm volatile(
>                  "       flogr   %[rp],%[rp]\n"
>                  : [rp] "+d" (rp.pair) : : "cc");
>
>
> Thomas notices that the special "div by constant" routines moved
> around, and I'm not seeing how *that* would matter, but it's all
> obviously very strange.
>
> Linus