Message-ID: <YczTPYx0L7y8TgIE@mit.edu>
Date:   Wed, 29 Dec 2021 16:29:33 -0500
From:   "Theodore Ts'o" <tytso@....edu>
To:     Waiman Long <longman@...hat.com>
Cc:     Hillf Danton <hdanton@...a.com>,
        syzbot <syzbot+03464269af631f4a4bdf@...kaller.appspotmail.com>,
        linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] INFO: rcu detected stall in ext4_file_write_iter (4)

On Mon, Dec 27, 2021 at 10:14:23PM -0500, Waiman Long wrote:
> 
> The test was running on a CONFIG_PREEMPT kernel. So if the syzkaller kthread
> is running at a higher priority than the rcu_preempt kthread, it is possible
> for the rcu_preempt kthread to be starved of cpu time. The rwsem optimistic
> spinning code will relinquish the cpu if there is a higher priority thread
> running. Since rcu_preempt kthread did not seem to be able to get the cpu, I
> suspect that it is probably caused by the syzkaller thread having a higher
> priority.

It's even worse than that.  The Syzkaller reproducer is calling
sched_setattr():

  *(uint32_t*)0x20000080 = 0x38;    // sched_attr.sched_size
  *(uint32_t*)0x20000084 = 1;       // sched_attr.sched_policy == SCHED_FIFO
  *(uint64_t*)0x20000088 = 0;       // sched_attr.sched_flags
  *(uint32_t*)0x20000090 = 0;       // sched_attr.sched_nice
  *(uint32_t*)0x20000094 = 1;       // sched_attr.sched_priority
  *(uint64_t*)0x20000098 = 0;       // ...
  *(uint64_t*)0x200000a0 = 0;
  *(uint64_t*)0x200000a8 = 0;
  *(uint32_t*)0x200000b0 = 0;
  *(uint32_t*)0x200000b4 = 0;
  syscall(__NR_sched_setattr, 0, 0x20000080ul, 0ul);

So one or more of the syzkaller threads are SCHED_FIFO, and SCHED_FIFO
threads will *never* relinquish the CPU in favor of SCHED_OTHER
threads (which in practice will include all kernel threads unless
special measures are taken by someone who knows what they are doing)
so long as they are runnable.
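
For readers who find the raw memory pokes above hard to follow, here is a
minimal sketch of the same request written against a named structure.  The
names demo_sched_attr, DEMO_SCHED_FIFO, demo_fifo_attr and
demo_apply_to_self are hypothetical and exist only for this illustration;
the struct is a local mirror of the 56-byte (0x38) sched_attr layout that
the reproducer fills in by hand, not the kernel's own header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical local mirror of the 56-byte sched_attr layout used above. */
struct demo_sched_attr {
    uint32_t size;            /* offset 0x00: size of this structure   */
    uint32_t sched_policy;    /* offset 0x04: 1 == SCHED_FIFO          */
    uint64_t sched_flags;     /* offset 0x08                           */
    int32_t  sched_nice;      /* offset 0x10: only for SCHED_OTHER     */
    uint32_t sched_priority;  /* offset 0x14: real-time priority       */
    uint64_t sched_runtime;   /* offset 0x18: SCHED_DEADLINE only      */
    uint64_t sched_deadline;  /* offset 0x20: SCHED_DEADLINE only      */
    uint64_t sched_period;    /* offset 0x28: SCHED_DEADLINE only      */
    uint32_t sched_util_min;  /* offset 0x30: utilization clamp hints  */
    uint32_t sched_util_max;  /* offset 0x34                           */
};

/* The layout must match the offsets the reproducer writes to. */
static_assert(sizeof(struct demo_sched_attr) == 0x38, "unexpected size");
static_assert(offsetof(struct demo_sched_attr, sched_priority) == 0x14,
              "unexpected sched_priority offset");

#define DEMO_SCHED_FIFO 1u    /* same value as SCHED_FIFO */

/* Build the same attributes the reproducer sets: SCHED_FIFO, priority 1. */
struct demo_sched_attr demo_fifo_attr(void)
{
    struct demo_sched_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.sched_policy = DEMO_SCHED_FIFO;
    attr.sched_priority = 1;
    return attr;
}

/*
 * Apply the attributes to the calling thread (pid 0), as the reproducer
 * does.  Returns 0 on success, or -1 with errno set; without the needed
 * privilege (typically CAP_SYS_NICE) this is expected to fail with EPERM.
 */
long demo_apply_to_self(const struct demo_sched_attr *attr)
{
    return syscall(SYS_sched_setattr, 0L, attr, 0UL);
}
```

Calling demo_apply_to_self() on the result of demo_fifo_attr() from a
sufficiently privileged thread would put that thread into SCHED_FIFO at
priority 1, which is all it takes to outrank every SCHED_OTHER thread on
that CPU.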

See the discussion at:

    https://lore.kernel.org/all/Yb5RMWRsJl5TMk8H@casper.infradead.org/

I recommend that kernel developers simply ignore any syzkaller report
that relates to hung tasks or detected RCU stalls where the
reproducer is trying to set a real-time priority (e.g., sched_policy
of SCHED_FIFO or SCHED_RR), since the number of ways that
sched_setattr can be used as a foot-gun is near infinite....

Syzkaller reports that include sched_setattr are great for inflating
the "OMG!  There are tons of unhandled syzkaller reports, companies
need to fund more engineering headcount to fix syzkaller bugs" slide
decks.  But IMHO, they are not good for much else.

	      	     	       		    - Ted
