Message-ID: <25208400-a203-fb5c-d0c3-2934a4d227e3@linux-m68k.org>
Date: Tue, 25 Nov 2025 14:52:43 +1100 (AEDT)
From: Finn Thain <fthain@...ux-m68k.org>
To: Daniel Palmer <daniel@...f.com>
cc: Peter Zijlstra <peterz@...radead.org>, Will Deacon <will@...nel.org>, 
    Andrew Morton <akpm@...ux-foundation.org>, Arnd Bergmann <arnd@...db.de>, 
    Boqun Feng <boqun.feng@...il.com>, 
    Geert Uytterhoeven <geert@...ux-m68k.org>, linux-arch@...r.kernel.org, 
    linux-kernel@...r.kernel.org, linux-m68k@...r.kernel.org, 
    Mark Rutland <mark.rutland@....com>
Subject: Re: [RFC v4 3/5] atomic: Specify alignment for atomic_t and
 atomic64_t


On Mon, 24 Nov 2025, Daniel Palmer wrote:

> On Tue, 21 Oct 2025 at 07:39, Finn Thain <fthain@...ux-m68k.org> wrote:
> >
> > Some recent commits incorrectly assumed 4-byte alignment of locks.
> > That assumption fails on Linux/m68k (and, interestingly, would have
> > failed on Linux/cris also). Specify the minimum alignment of atomic
> > variables for fewer surprises and (hopefully) better performance.
> 
> FWIW I implemented jump labels for m68k and I think there is a problem
> with this in there too.
> jump_label_init() calls static_key_set_entries() and setting
> key->entries in there is corrupting 'atomic_t enabled' at the start of
> key.
> 
> With this patch the problem goes away.
> 

That's interesting. I wonder whether the alignment requirements of machine 
instructions permitted the "appropriation" of the low bits from those 
pointers...

In any case, a modified jump label algorithm that did not use/abuse pointer 
bits would need to execute as fast as the existing implementation. And 
that might be quite difficult (especially a portable algorithm).
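
To illustrate the kind of low-bit "appropriation" I mean, here is a minimal 
sketch. The names (TAG_MASK, tag_pack and so on) are hypothetical and this is 
not the kernel's jump label code; it only assumes that two flag bits are 
stored in the low bits of an address, which requires the address to be 
4-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not taken from the kernel. */
#define TAG_MASK ((uintptr_t)3)		/* two low bits carry flags */

/* Pack a 2-bit tag into the low bits of an address. */
static uintptr_t tag_pack(uintptr_t addr, unsigned int tag)
{
	return addr | ((uintptr_t)tag & TAG_MASK);
}

/* Recover the address by clearing the two low bits. */
static uintptr_t tag_addr(uintptr_t packed)
{
	return packed & ~TAG_MASK;
}

/* Recover the tag from the two low bits. */
static unsigned int tag_bits(uintptr_t packed)
{
	return (unsigned int)(packed & TAG_MASK);
}

/* The round trip is lossless only if addr had both low bits clear. */
static int tag_round_trips(uintptr_t addr, unsigned int tag)
{
	uintptr_t packed = tag_pack(addr, tag);

	return tag_addr(packed) == addr &&
	       tag_bits(packed) == (tag & TAG_MASK);
}
```

With an object at 0x1000 the round trip is lossless. With a 2-byte aligned 
object at 0x1002, bit 1 of the address is mistaken for a flag and the 
recovered address is 0x1000, two bytes too low, so a store through it would 
land on whatever precedes the intended field.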

Recently I had an opportunity to do some performance measurements on m68k 
for this atomic_t alignment patch. I tested some kernel stressors on an 
AWS 95 (33 MHz 68040, 128 MB RAM, 512 KiB L2$) and also on a Mac IIfx (40 
MHz 68030, 80 MB RAM, 32 KiB L2$).

The patch makes the kernel faster or slower, depending on the workload. For 
example, the fifo, futex and shm stressors were consistently faster 
whereas the splice, signal and msg stressors were consistently slower.

These machines have no hardware counters for cache misses, which might 
account for part of the slowdown. OTOH, alignment also reduces instances 
of locks split across page boundaries, which might account for the 
speed-up. (I didn't look at VM performance counters.)
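
The page boundary point is simple arithmetic, sketched below. The 4096 byte 
page size and the helper names are assumptions made for the illustration, 
not measurements from the test machines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for illustration; the real value depends on the MMU. */
#define PAGE_SIZE_ASSUMED 4096u

/* True if an object of 'size' bytes at 'addr' spans two pages. */
static int straddles_page(uintptr_t addr, size_t size)
{
	return (addr / PAGE_SIZE_ASSUMED) !=
	       ((addr + size - 1) / PAGE_SIZE_ASSUMED);
}

/*
 * Count the start addresses within the first two pages, at multiples of
 * 'align', where an object of 'size' bytes would straddle a page boundary.
 */
static unsigned int count_straddlers(size_t size, size_t align)
{
	unsigned int n = 0;
	uintptr_t addr;

	for (addr = 0; addr < 2 * PAGE_SIZE_ASSUMED; addr += align)
		if (straddles_page(addr, size))
			n++;
	return n;
}
```

For a 4-byte object, 2-byte alignment leaves one straddling position per 
page (offset 4094), so count_straddlers(4, 2) returns 2 over two pages, 
whereas count_straddlers(4, 4) returns 0.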

Finally, I should note that the stress-ng man page says "do NOT use" it as a 
benchmark. OK, well, if anyone wishes to reproduce my results, I can send 
you the statically linked binary I used. The job file is attached.

I wonder whether others have done any throughput measurement for this 
patch, using their favourite workloads?
View attachment "aligned-atomics.job" of type "text/plain" (1242 bytes)
