[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ed1e0896-fd85-5101-e136-e4a5a37ca5ff@linux-m68k.org>
Date: Mon, 8 Sep 2025 11:30:29 +1000 (AEST)
From: Finn Thain <fthain@...ux-m68k.org>
To: Peter Zijlstra <peterz@...radead.org>
cc: Andrew Morton <akpm@...ux-foundation.org>,
Lance Yang <lance.yang@...ux.dev>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Masami Hiramatsu <mhiramat@...nel.org>, Eero Tamminen <oak@...sinkinet.fi>,
Will Deacon <will@...nel.org>, stable@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] atomic: Specify natural alignment for atomic_t
On Mon, 1 Sep 2025, Peter Zijlstra wrote:
> On Mon, Sep 01, 2025 at 11:36:00AM +0200, Peter Zijlstra wrote:
>
> > Something like the completely untested below should do I suppose.
> >
> > ---
> > diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h
> > index 711a1f0d1a73..e39cdfe5a59e 100644
> > --- a/include/linux/instrumented.h
> > +++ b/include/linux/instrumented.h
> > @@ -67,6 +67,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_
> > {
> > kasan_check_read(v, size);
> > kcsan_check_atomic_read(v, size);
> > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
> > }
> >
> > /**
> > @@ -81,6 +82,7 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size
> > {
> > kasan_check_write(v, size);
> > kcsan_check_atomic_write(v, size);
> > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
> > }
> >
> > /**
> > @@ -95,6 +97,7 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v,
> > {
> > kasan_check_write(v, size);
> > kcsan_check_atomic_read_write(v, size);
> > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
> > }
> >
> > /**
>
> Arguably, that should've been something like:
>
> ((unsigned long)v & (size-1))
>
> I suppose.
>
I've been testing this patch and these new WARN_ON splats have revealed
some issues.
To start with, I overlooked atomic64_t in my patch. But aligning that
isn't sufficient in some cases: we have kmem caches of structs containing
atomic64_t members, where kmem_cache_create() is used with a default
alignment of 4, for example, the struct inode cache in fs/proc/inode.c. So
I changed the above CONFIG_DEBUG_ATOMIC test to
(unsigned long)v & (sizeof(long) - 1).
Another issue I encountered was atomic operations used on non-atomic
types. The try_cmpxchg() in task_work_add() triggers the
CONFIG_DEBUG_ATOMIC misalignment WARN because of the 850 byte offset of
task_works into struct task_struct. I got around this by adding
__aligned(sizeof(long)) to task_works, but maybe
__aligned(sizeof(atomic_t)) would be better (?)
Another example of this problem (i.e. atomic operation used with
non-atomic type) appears in wait_task_zombie() in kernel/exit.c, where
cmpxchg() is used on exit_state, found at offset 418 in struct
task_struct. I prevented this by adding __aligned(sizeof(long)) to
exit_state. I'm not sure what the right patch is.
A different problem shows up in spi_setup_transport_attrs() where
spi_dv_mutex() is used to coerce starget_data into struct
spi_transport_attrs, which happens to contain some atomic storage.
starget_data is found at the end of the dynamically allocated struct
scsi_target (not an uncommon pattern). So starget_data ends up at offset
290, and struct spi_transport_attrs is also misaligned, as is dv_mutex. To
get around this, I changed the definition of struct scsi_target:
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -370,7 +370,7 @@ struct scsi_target {
char scsi_level;
enum scsi_target_state state;
void *hostdata; /* available to low-level driver */
- unsigned long starget_data[]; /* for the transport */
+ atomic64_t starget_data[]; /* for the transport */
/* starget_data must be the last element!!!! */
} __attribute__((aligned(sizeof(unsigned long))));
This patch works because starget_data is never accessed except where it is
being cast as some other type. But there's probably a reason for the
'long' here that I don't know about... maybe I should use atomic_long_t
or atomic_t.
I found a few structures in the VFS layer with a misaligned lock, which I
patched my way around. There's still a WARN splat from down_write() for
semaphores somewhere in the VFS and TTY layers, but I was unable to track
down the relevant memory allocations:
[ 59.500000] ------------[ cut here ]------------
[ 59.500000] WARNING: CPU: 0 PID: 329 at ./include/linux/instrumented.h:101 rwsem_down_write_slowpath+0x26a/0x4ca
[ 59.500000] Modules linked in:
[ 59.500000] CPU: 0 UID: 0 PID: 329 Comm: (udev-worker) Not tainted 6.16.0-multi-00009-g2b4218de16c4 #1 NONE
[ 59.500000] Stack from 01e4be64:
[ 59.500000] 01e4be64 005b83e3 005b83e3 005aa650 00000065 00000009 01e4be88 00007398
[ 59.500000] 005b83e3 01e4be98 000353a0 005aa41f 01e3024c 01e4bed0 00002594 005aa650
[ 59.500000] 00000065 00507f2a 00000009 00000000 00000000 00000000 00000002 00000002
[ 59.500000] 00000000 00000000 01e3024c 01e4bf40 00507f2a 005aa650 00000065 00000009
[ 59.500000] 00000000 00000000 00000000 00141ef8 0013e360 fffffffe 0aba9500 01e3024c
[ 59.500000] 01e4bfa0 00f981b0 00000001 01e4bf2a 01e30254 01e4bf2a 00070000 00020000
[ 59.500000] Call Trace: [<00007398>] dump_stack+0x10/0x16
[ 59.500000] [<000353a0>] __warn+0xd0/0xf8
[ 59.500000] [<00002594>] warn_slowpath_fmt+0x54/0x62
[ 59.500000] [<00507f2a>] rwsem_down_write_slowpath+0x26a/0x4ca
[ 59.500000] [<00507f2a>] rwsem_down_write_slowpath+0x26a/0x4ca
[ 59.500000] [<00141ef8>] iput+0x0/0x1fe
[ 59.500000] [<0013e360>] dput+0x0/0x160
[ 59.500000] [<00070000>] defer_console_output+0x28/0x34
[ 59.500000] [<00020000>] _FP_CALL_TOP+0x25c2/0xd512
[ 59.500000] [<000101e4>] ikbd_mouse_y0_bot+0x4/0x1c
[ 59.500000] [<0000ffff>] atari_keyb_init+0xf7/0x17a
[ 59.500000] [<005081c2>] down_write+0x38/0xf6
[ 59.500000] [<0013733a>] do_unlinkat+0xc8/0x264
[ 59.500000] [<0013755e>] sys_unlink+0x1c/0x20
[ 59.500000] [<0000876a>] syscall+0x8/0xc
[ 59.500000] [<0000c048>] die_if_kernel.part.0+0x2c/0x40
[ 59.500000]
[ 59.500000] ---[ end trace 0000000000000000 ]---
All of my testing was done on m68k. It would be interesting to try this on
some other 32-bit architecture that tolerates misaligned accesses. More
misaligned 64-bit atomics would be unsurprising.
The patches for these experiments may be found at,
https://github.com/fthain/linux/tree/atomic_t
Powered by blists - more mailing lists