Message-ID: <f79735e1-1625-4746-98ce-a3c40123c5af@linux.dev>
Date: Sat, 23 Aug 2025 12:47:49 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: Finn Thain <fthain@...ux-m68k.org>,
 Geert Uytterhoeven <geert@...ux-m68k.org>, mhiramat@...nel.org
Cc: akpm@...ux-foundation.org, will@...nel.org, peterz@...radead.org,
 mingo@...hat.com, longman@...hat.com, anna.schumaker@...cle.com,
 boqun.feng@...il.com, joel.granados@...nel.org, kent.overstreet@...ux.dev,
 leonylgao@...cent.com, linux-kernel@...r.kernel.org, rostedt@...dmis.org,
 tfiga@...omium.org, amaindex@...look.com, jstultz@...gle.com,
 Mingzhe Yang <mingzhe.yang@...com>, Eero Tamminen <oak@...sinkinet.fi>,
 linux-m68k <linux-m68k@...ts.linux-m68k.org>,
 Lance Yang <ioworker0@...il.com>, senozhatsky@...omium.org
Subject: Re: [PATCH v5 2/3] hung_task: show the blocker task if the task is
 hung on semaphore

Hi Finn,

On 2025/8/23 08:27, Finn Thain wrote:
> 
> On Sat, 23 Aug 2025, Lance Yang wrote:
> 
>>>
>>> include/linux/hung_task.h-/*
>>> include/linux/hung_task.h- * @blocker: Combines lock address and blocking type.
>>> include/linux/hung_task.h- *
>>> include/linux/hung_task.h- * Since lock pointers are at least 4-byte aligned(32-bit) or 8-byte
>>> include/linux/hung_task.h- * aligned(64-bit). This leaves the 2 least bits (LSBs) of the pointer
>>> include/linux/hung_task.h- * always zero. So we can use these bits to encode the specific blocking
>>> include/linux/hung_task.h- * type.
>>> include/linux/hung_task.h- *
> 
> That comment was introduced in commit e711faaafbe5 ("hung_task: replace
> blocker_mutex with encoded blocker"). It's wrong and should be fixed.

Right, the problematic assumption was introduced in that commit ;)
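
To make the failure concrete, here is a small userspace sketch. It is not
kernel code: struct demo_lock, struct demo_container and demo_low_bits()
are hypothetical names, and the GCC/Clang packed+aligned(2) attribute is
only an assumption used to mimic an ABI where int needs just 2-byte
alignment.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/*
 * Hypothetical lock type. The attribute forces 2-byte alignment to mimic
 * an ABI where int is only 2-byte aligned (assumption for illustration).
 */
struct demo_lock {
	int count;
} __attribute__((packed, aligned(2)));

/* The container is 4-byte aligned, so the member offset decides the low bits. */
struct demo_container {
	short tag;              /* occupies offsets 0..1 */
	struct demo_lock lock;  /* lands at offset 2 */
} __attribute__((aligned(4)));

/* Return the two least significant bits of an address. */
static unsigned demo_low_bits(const void *p)
{
	return (unsigned)((uintptr_t)p & 0x3);
}
```

With this layout the low bits of &container.lock are 0b10, which the
encoding would misread as BLOCKER_TYPE_RWSEM_READER.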

> 
>>> include/linux/hung_task.h- * Type encoding:
>>> include/linux/hung_task.h- * 00 - Blocked on mutex
>>>    (BLOCKER_TYPE_MUTEX)
>>> include/linux/hung_task.h- * 01 - Blocked on semaphore
>>>    (BLOCKER_TYPE_SEM)
>>> include/linux/hung_task.h- * 10 - Blocked on rw-semaphore as READER
>>>    (BLOCKER_TYPE_RWSEM_READER)
>>> include/linux/hung_task.h- * 11 - Blocked on rw-semaphore as WRITER
>>>    (BLOCKER_TYPE_RWSEM_WRITER)
>>> include/linux/hung_task.h- */
>>> include/linux/hung_task.h-#define BLOCKER_TYPE_MUTEX            0x00UL
>>> include/linux/hung_task.h-#define BLOCKER_TYPE_SEM              0x01UL
>>> include/linux/hung_task.h-#define BLOCKER_TYPE_RWSEM_READER     0x02UL
>>> include/linux/hung_task.h-#define BLOCKER_TYPE_RWSEM_WRITER     0x03UL
>>> include/linux/hung_task.h-
>>> include/linux/hung_task.h:#define BLOCKER_TYPE_MASK             0x03UL
>>>
>>> On m68k, the minimum alignment of int and larger is 2 bytes.
>>
>> Ah, thanks, that's good to know! It clearly explains why the
>> WARN_ON_ONCE() is triggering.
>>
>>> If you want to use the lowest 2 bits of a pointer for your own use,
>>> you must make sure data is sufficiently aligned.
>>
>> You're right. Apparently I missed that :(
>>
>> I'm wondering if there's a way to check an architecture's minimum
>> alignment at compile-time. If so, we could disable this feature on
>> architectures that don't guarantee 4-byte alignment.
>>
> 
> As Geert says, the compiler can give you all the bits you need, so you
> won't have to contort your algorithm to fit whatever free bits happen to
> be available. Please see for example, commit 258a980d1ec2 ("net: dst:
> Force 4-byte alignment of dst_metrics").

Yes, thanks, it's a helpful example!

I see your point that explicitly enforcing alignment is a very clean
solution for the lock structures supported by the blocker tracking
mechanism.
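
For reference, a minimal userspace sketch of that approach, assuming the
GCC/Clang aligned attribute (the kernel spells it __aligned(4)).
struct demo_sem and low_bits_free() are hypothetical stand-ins, not the
real struct semaphore or any existing helper.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a lock type. Its members alone would only
 * require 2-byte alignment; the attribute raises that to 4 bytes.
 */
struct demo_sem {
	short count;
	short flags;
} __attribute__((aligned(4)));

/* Compile-time guarantee that the two low pointer bits are always zero. */
static_assert(alignof(struct demo_sem) >= 4,
	      "demo_sem must leave the 2 low pointer bits free");

/* Nonzero when the two low bits of the address are available for a tag. */
static int low_bits_free(const void *p)
{
	return ((uintptr_t)p & 0x3) == 0;
}
```

The static_assert turns the alignment assumption into a build failure
instead of a runtime warning.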

However, I'm thinking about the "principle of minimal impact" here.
Forcing alignment on the core lock types themselves — like struct
semaphore — feels like a broad change to fix an issue that's local to the
hung task detector :)

> 
>> If not, the fallback is to adjust the runtime checks.
>>
> 
> That would be a solution to a different problem.

For that reason, I would prefer to simply adjust the runtime checks within
the hung task detector. It feels like a more generic and self-contained
solution: it works out of the box on the majority of architectures and
provides a safe fallback on those that don't guarantee 4-byte alignment.
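
A rough userspace sketch of what I mean, not the actual patch. The
BLOCKER_TYPE_* values are the ones quoted above; demo_encode_blocker(),
demo_blocker_lock() and demo_blocker_type() are hypothetical names, and
returning 0 for "not tracked" is an assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCKER_TYPE_MUTEX		0x00UL
#define BLOCKER_TYPE_SEM		0x01UL
#define BLOCKER_TYPE_RWSEM_READER	0x02UL
#define BLOCKER_TYPE_RWSEM_WRITER	0x03UL
#define BLOCKER_TYPE_MASK		0x03UL

/*
 * Encode lock address and blocking type into one word.
 * If the address is NULL or its two low bits are not free, quietly
 * return 0 so the caller skips blocker tracking for this lock.
 */
static uintptr_t demo_encode_blocker(const void *lock, uintptr_t type)
{
	uintptr_t ptr = (uintptr_t)lock;

	if (!ptr || (ptr & BLOCKER_TYPE_MASK))
		return 0;
	return ptr | (type & BLOCKER_TYPE_MASK);
}

/* Recover the lock address from an encoded blocker word. */
static void *demo_blocker_lock(uintptr_t blocker)
{
	return (void *)(blocker & ~(uintptr_t)BLOCKER_TYPE_MASK);
}

/* Recover the blocking type from an encoded blocker word. */
static uintptr_t demo_blocker_type(uintptr_t blocker)
{
	return blocker & BLOCKER_TYPE_MASK;
}
```

The cost is that insufficiently aligned locks are simply not reported as
blockers; the hung task report itself is unaffected.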

Happy to hear what you and others think about this trade-off. Perhaps
there's a perspective I'm missing ;)

Thanks,
Lance
