Message-ID: <CANp29Y4wyREoKO60XjOfh618Udf5h21boF3R_=qYY8tJc0otfg@mail.gmail.com>
Date: Tue, 8 Feb 2022 11:32:07 +0100
From: Aleksandr Nogikh <nogikh@...gle.com>
To: Hillf Danton <hdanton@...a.com>
Cc: "Theodore Ts'o" <tytso@....edu>, Waiman Long <longman@...hat.com>,
syzbot <syzbot+03464269af631f4a4bdf@...kaller.appspotmail.com>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (4)
Closing the bug. Syzkaller is now much more careful with sched_setattr
and perf_event_open, so hopefully we'll see fewer such false-positive
reports in the future.

#syz invalid
On Thu, Dec 30, 2021 at 1:50 PM Hillf Danton <hdanton@...a.com> wrote:
>
> On Wed, 29 Dec 2021 16:29:33 -0500 Theodore Ts'o wrote:
> > On Mon, Dec 27, 2021 at 10:14:23PM -0500, Waiman Long wrote:
> > >
> > > The test was running on a CONFIG_PREEMPT kernel. So if the syzkaller kthread
> > > is running at a higher priority than the rcu_preempt kthread, it is possible
> > > for the rcu_preempt kthread to be starved of cpu time. The rwsem optimistic
> > > spinning code will relinquish the cpu if there is a higher priority thread
> > > running. Since rcu_preempt kthread did not seem to be able to get the cpu, I
> > > suspect that it is probably caused by the syzkaller thread having a higher
> > > priority.
> >
> > It's even worse than that. The Syzkaller reproducer is calling
> > sched_setattr():
> >
> > *(uint32_t*)0x20000080 = 0x38; // sched_attr.sched_size
> > *(uint32_t*)0x20000084 = 1; // sched_attr.sched_policy == SCHED_FIFO
> > *(uint64_t*)0x20000088 = 0; // sched_attr.sched_flags
> > *(uint32_t*)0x20000090 = 0; // sched_attr.sched_nice
> > *(uint32_t*)0x20000094 = 1; // sched_attr.sched_priority
> > *(uint64_t*)0x20000098 = 0; // ...
> > *(uint64_t*)0x200000a0 = 0;
> > *(uint64_t*)0x200000a8 = 0;
> > *(uint32_t*)0x200000b0 = 0;
> > *(uint32_t*)0x200000b4 = 0;
> > syscall(__NR_sched_setattr, 0, 0x20000080ul, 0ul);
> >
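For readers decoding the raw stores above, here is a minimal sketch,
assuming the offsets map onto the kernel's struct sched_attr layout as
the reproducer's own comments suggest. The names demo_sched_attr,
DEMO_SCHED_FIFO and demo_fill_fifo_attr() are hypothetical and exist
only for this illustration; the field names after sched_priority are
taken from the kernel's uapi definition as I understand it, not from
the reproducer, which only writes zeros there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of SCHED_FIFO on Linux, redefined so the sketch needs no <sched.h>. */
#define DEMO_SCHED_FIFO 1u

/* Hypothetical mirror of the kernel's struct sched_attr (0x38 bytes). */
struct demo_sched_attr {
	uint32_t size;           /* written at 0x20000080 */
	uint32_t sched_policy;   /* written at 0x20000084 */
	uint64_t sched_flags;    /* written at 0x20000088 */
	int32_t  sched_nice;     /* written at 0x20000090 */
	uint32_t sched_priority; /* written at 0x20000094 */
	uint64_t sched_runtime;  /* written at 0x20000098 (assumed field) */
	uint64_t sched_deadline; /* written at 0x200000a0 (assumed field) */
	uint64_t sched_period;   /* written at 0x200000a8 (assumed field) */
	uint32_t sched_util_min; /* written at 0x200000b0 (assumed field) */
	uint32_t sched_util_max; /* written at 0x200000b4 (assumed field) */
};

/* The layout must match the sched_size value the reproducer stores. */
_Static_assert(sizeof(struct demo_sched_attr) == 0x38,
	       "layout does not match sched_size 0x38");
_Static_assert(offsetof(struct demo_sched_attr, sched_priority) == 0x14,
	       "sched_priority is not at offset 0x14");

/* Fill the structure the same way the reproducer's stores do. */
static void demo_fill_fifo_attr(struct demo_sched_attr *attr)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->size = (uint32_t)sizeof(*attr);
	attr->sched_policy = DEMO_SCHED_FIFO;
	attr->sched_priority = 1;
}
```

The reproducer then passes the address of this block as the second
argument of syscall(__NR_sched_setattr, 0, ..., 0); a pid of 0 applies
the attributes to the calling thread. The sketch deliberately stops
short of making that call.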
> > So one or more of the syzkaller threads is SCHED_FIFO, and SCHED_FIFO
> > threads will *never* relinquish the CPU in favor of SCHED_OTHER
> > threads (which in practice will include all kernel threads unless
> > special measures are taken by someone who knows what they are doing)
> > so long as they are runnable.
> >
> > See the discussion at:
> >
> > https://lore.kernel.org/all/Yb5RMWRsJl5TMk8H@casper.infradead.org/
> >
> > I recommend that kernel developers simply ignore any syzkaller report
> > that relates to hung tasks or RCU-detected stalls and where the
> > reproducer is trying to set a real-time priority (e.g., sched_policy
> > of SCHED_FIFO or SCHED_RR), since the number of ways that
> > sched_setattr can be used as a foot-gun is near infinite....
> >
> > Syzkaller reports that include sched_setattr are great for inflating
> > the "OMG! There are tons of unhandled syzkaller reports, companies
> > need to fund more engineering headcount to fix syzkaller bugs" slide
> > decks. But IMHO, they are not good for much else.
> >
> > - Ted
> >
>
> On the other hand, IMHO this report suggests that spinners need a
> deadline, a couple of ticks by default, to reduce the chance of FIFO
> tasks causing trouble in scenarios like the one in this report.
>
> The mutex code would need the same mechanism, if this makes sense.
>
> Thanks
> Hillf
>
>
> +++ x/kernel/locking/rwsem.c
> @@ -716,6 +716,7 @@ rwsem_spin_on_owner(struct rw_semaphore
>  	struct task_struct *new, *owner;
>  	unsigned long flags, new_flags;
>  	enum owner_state state;
> +	unsigned long deadline;
>
>  	lockdep_assert_preemption_disabled();
>
> @@ -724,6 +725,10 @@ rwsem_spin_on_owner(struct rw_semaphore
>  	if (state != OWNER_WRITER)
>  		return state;
>
> +	/* avoid spinning long enough to make rcu stall
> +	 * particularly in case of FIFO spinner
> +	 */
> +	deadline = jiffies + 2;
>  	for (;;) {
>  		/*
>  		 * When a waiting writer set the handoff flag, it may spin
> @@ -747,7 +752,8 @@ rwsem_spin_on_owner(struct rw_semaphore
>  		 */
>  		barrier();
>
> -		if (need_resched() || !owner_on_cpu(owner)) {
> +		if (need_resched() || !owner_on_cpu(owner) ||
> +		    time_after(jiffies, deadline)) {
>  			state = OWNER_NONSPINNABLE;
>  			break;
>  		}
>
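To illustrate the deadline check in the patch, here is a minimal
userspace sketch, assuming a simulated tick counter in place of
jiffies. demo_time_after() mirrors the wraparound-safe comparison that
the kernel's time_after() macro performs; demo_bounded_spin(), its
parameters and the result enum are hypothetical names made up for this
illustration and are not part of rwsem.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * True if tick a is strictly after tick b, even when the counter has
 * wrapped around. Only valid while the two values are less than
 * LONG_MAX ticks apart.
 */
static int demo_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

enum demo_spin_result { DEMO_OWNER_RELEASED, DEMO_DEADLINE_HIT };

/*
 * Simulated spinner: "now" stands in for jiffies and advances by one
 * per loop iteration. The owner is assumed to release the lock at tick
 * release_at. budget plays the role of the "+ 2" in the patch and must
 * be far smaller than LONG_MAX.
 */
static enum demo_spin_result
demo_bounded_spin(unsigned long now, unsigned long budget,
		  unsigned long release_at, unsigned long *stopped_at)
{
	unsigned long deadline = now + budget; /* may wrap, which is fine */

	assert(stopped_at != NULL);
	for (;;) {
		if (now == release_at) {
			*stopped_at = now;
			return DEMO_OWNER_RELEASED;
		}
		if (demo_time_after(now, deadline)) {
			*stopped_at = now;
			return DEMO_DEADLINE_HIT; /* stop spinning */
		}
		now++; /* one simulated tick passes */
	}
}
```

With a budget of 2, as in the patch, the simulated spinner gives up
once the counter reaches start + 3, because the comparison is strict.
The real jiffies counter advances independently of the loop, so the
kernel code would spin for many iterations per tick.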