Message-ID: <tencent_347CD8D5F7FC23EFE58BBACBA98894DB6E05@qq.com>
Date: Wed, 28 Jan 2026 11:29:51 +0800
From: Yuwen Chen <ywen.chen@...mail.com>
To: tglx@...nel.org
Cc: akpm@...ux-foundation.org,
andrealmeid@...lia.com,
bigeasy@...utronix.de,
colin.i.king@...il.com,
dave@...olabs.net,
dvhart@...radead.org,
edliaw@...gle.com,
justinstitt@...gle.com,
kernel-team@...roid.com,
licayy@...mail.com,
linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org,
luto@....edu,
mingo@...hat.com,
morbo@...gle.com,
nathan@...nel.org,
ndesaulniers@...gle.com,
peterz@...radead.org,
shuah@...nel.org,
usama.anjum@...labora.com,
wakel@...gle.com,
ywen.chen@...mail.com
Subject: Re: [PATCH v2] selftests/futex: fix the failed futex_requeue test issue
On Tue, 27 Jan 2026 19:30:31 +0100, Thomas Gleixner wrote:
> Extremely high?
>
> The main thread waits for 10000us aka. 10 seconds to allow the waiter
> thread to reach futex_wait().
>
> If anything is extreme then it's the 10 seconds wait, not the
> requirements. Please write factual changelogs and not fairy tales.
10,000 us is 10 ms, not 10 seconds. On a specific ARM64 platform, this test
case fails quite often with the 10 ms wait.
> That's a known issue for all futex selftests when the test system is
> under extreme load. That's why there is a gratious 10 seconds timeout,
> which is annoyingly long already.
>
> Also why is this special for the requeue_single test case?
>
> It's exactly the same issue for all futex selftests including the multi
> waiter one in the very same file, no?
Yes, the problem is common to all of them. Only the requeue_single case is
listed here because it is convenient for illustration.
> Why do you need an atomic store here?
>
> pthread_barrier_wait() is a full memory barrier already, no?
Yes, there is no need for an atomic here. In the kernel, WRITE_ONCE and
READ_ONCE would be the usual choice, but they are particularly difficult to
use here, so an atomic was adopted instead.
> What's wrong with reading /proc/$PID/wchan ?
>
> It's equally unreliable as /proc/$PID/stat because both can return the
> desired state _before_ the thread reaches the inner workings of the test
> related sys_futex(... WAIT).
Is it possible for waiterfn to enter a sleep state between
pthread_barrier_wait() and futex_wait()? If so, would checking the call
stack be a solution?
Maybe /proc/$PID/wchan is a better approach. So far I haven't found any
problems when using /proc/$PID/stat on our platform.
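
For reference, a minimal sketch of a /proc/$PID/stat check, assuming the usual
Linux procfs layout in which the state character follows the closing
parenthesis of the comm field. thread_state() is a hypothetical helper, not
the code from the patch:

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the state character ('R', 'S', 'D', ...)
 * of thread @tid in process @pid, or 0 on error.
 */
static char thread_state(pid_t pid, pid_t tid)
{
	char path[64], buf[512];
	size_t n;
	char *p;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%d/task/%d/stat",
		 (int)pid, (int)tid);
	f = fopen(path, "r");
	if (!f)
		return 0;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	/* comm may contain spaces or ')', so search for the last ')' */
	p = strrchr(buf, ')');
	if (!p || p[1] != ' ' || p[2] == '\0')
		return 0;
	return p[2];	/* third field: task state */
}
```

A caller could poll thread_state(getpid(), waiter_tid) until it returns 'S'.
That result only shows that the thread is sleeping somewhere, so, as pointed
out in the quoted text, it can be observed before the thread reaches
futex_wait().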
Thanks
Yuwen