Message-ID: <a1808501-559e-4762-b0ea-f1fffd2e7f19@kernel.dk>
Date: Wed, 3 Sep 2025 12:51:09 -0600
From: Jens Axboe <axboe@...nel.dk>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>
Cc: syzbot <syzbot+034246a838a10d181e78@...kaller.appspotmail.com>,
andrealmeid@...lia.com, dave@...olabs.net, dvhart@...radead.org,
linux-kernel@...r.kernel.org, mingo@...hat.com,
syzkaller-bugs@...glegroups.com, tglx@...utronix.de
Subject: Re: [syzbot] [kernel?] general protection fault in try_to_wake_up (3)
On 9/3/25 7:07 AM, Sebastian Andrzej Siewior wrote:
> +Jens
>
> On 2025-09-02 23:46:28 [+0200], Peter Zijlstra wrote:
>> When I build the provided .config with clang-20, that a58 offset is
>> exactly task_struct::pi_lock::lockdep_map, which nicely corresponds with
>> the below stacktrace, and seems to suggest someone did:
>> try_to_wake_up(NULL).
>
> correct.
>
>>> try_to_wake_up+0x67/0x12b0 kernel/sched/core.c:4216
>>> requeue_pi_wake_futex+0x24b/0x2f0 kernel/futex/requeue.c:249
>>
>> Trouble is, we've not changed the requeue bits in a fair while... So I'm
>> somewhat confused on how this happens now ?!
>
> This means syzkaller managed to invoke futex_wait_setup(?, NULL) in
> order to get futex_q::task assigned to NULL. All users use current
> except for io_futex_wait().
>
> The syz-reproducer lists only:
> | timer_create(0x0, &(0x7f0000000080)={0x0, 0x11, 0x0, @thr={0x0, 0x0}}, &(0x7f0000000000))
> | timer_settime(0x0, 0x0, &(0x7f0000000240)={{0x0, 0x8}, {0x0, 0x9}}, 0x0)
> | futex(&(0x7f000000cffc), 0x80000000000b, 0x0, 0x0, &(0x7f0000048000), 0x0)
> | futex(&(0x7f000000cffc), 0xc, 0x1, 0x0, &(0x7f0000048000), 0x0)
>
> and that is probably why it can't come up with a C reproducer.
> The whole log has (filtered) the following lines:
>
> | io_uring_setup(0x85a, &(0x7f0000000180)={0x0, 0x58b9, 0x1, 0x2, 0x383})
> | syz_io_uring_setup(0x88f, &(0x7f0000000300)={0x0, 0xaedf, 0x0, 0x0, 0x25d}, &(0x7f0000000140)=<r0=>0x0, &(0x7f0000000280)=<r1=>0x0)
> | syz_memcpy_off$IO_URING_METADATA_GENERIC(r0, 0x4, &(0x7f0000000080)=0xfffffffc, 0x0, 0x4)
> | syz_io_uring_submit(r0, r1, &(0x7f00000001c0)=@...ING_OP_RECVMSG={0xa, 0x8, 0x1, r2, 0x0, &(0x7f0000000440)={0x0, 0x0, 0x0}, 0x0, 0x40000020, 0x1, {0x2}})
>
> This should explain how the waiter got NULL. There is no private
> flag so that is how they interact with each other.
> Do we want this:
>
> diff --git a/kernel/futex/requeue.c b/kernel/futex/requeue.c
> index c716a66f86929..0c98256ebdcb7 100644
> --- a/kernel/futex/requeue.c
> +++ b/kernel/futex/requeue.c
> @@ -312,6 +312,8 @@ futex_proxy_trylock_atomic(u32 __user *pifutex, struct futex_hash_bucket *hb1,
>  	if (!top_waiter->rt_waiter || top_waiter->pi_state)
>  		return -EINVAL;
>
> +	if (!top_waiter->task)
> +		return -EINVAL;
>  	/* Ensure we requeue to the expected futex. */
>  	if (!futex_match(top_waiter->requeue_pi_key, key2))
>  		return -EINVAL;
>
> ?
>
> Sebastian
Yep, that looks reasonable to me. And agreed that this futex must've been
set up on the io_uring side, which is why you end up with ->task == NULL.
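
For illustration only, the validation order in the proposed hunk can be
modelled in user space roughly as below. This is a minimal sketch, not
kernel code: the struct layouts are hypothetical stand-ins that carry only
the fields named in the diff, and check_top_waiter() is an invented helper
name. It makes no claim that the check is sufficient for every path into
requeue_pi_wake_futex().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
struct task_struct {
	int pid;
};

struct futex_q {
	struct task_struct *task;	/* assumed NULL for a waiter queued from io_uring */
	void *rt_waiter;
	void *pi_state;
};

/*
 * Hypothetical helper mirroring only the order of checks in the hunk:
 * reject a top waiter that is not a requeue-PI waiter first, then reject
 * one that has no task to wake.
 */
int check_top_waiter(const struct futex_q *top_waiter)
{
	if (!top_waiter->rt_waiter || top_waiter->pi_state)
		return -EINVAL;

	if (!top_waiter->task)
		return -EINVAL;

	return 0;
}
```

With these stand-ins, a waiter whose task is NULL is rejected with -EINVAL
before anything could try to wake it, while a waiter that has both
rt_waiter and task set passes.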
--
Jens Axboe