lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200420114719.GA2659@kadam>
Date:   Mon, 20 Apr 2020 14:47:19 +0300
From:   Dan Carpenter <dan.carpenter@...cle.com>
To:     Hillf Danton <hdanton@...a.com>
Cc:     syzbot <syzbot+66243bb7126c410cefe6@...kaller.appspotmail.com>,
        axboe@...nel.dk, io-uring@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        syzkaller-bugs@...glegroups.com, viro@...iv.linux.org.uk
Subject: Re: INFO: rcu detected stall in io_uring_release

On Sun, Apr 19, 2020 at 12:06:26PM +0800, Hillf Danton wrote:
> 
> Sat, 18 Apr 2020 11:59:13 -0700
> > 
> > syzbot found the following crash on:
> > 
> > HEAD commit:    8f3d9f35 Linux 5.7-rc1
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=115720c3e00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=5d351a1019ed81a2
> > dashboard link: https://syzkaller.appspot.com/bug?extid=66243bb7126c410cefe6
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > 
> > Unfortunately, I don't have any reproducer for this crash yet.
> > 
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+66243bb7126c410cefe6@...kaller.appspotmail.com
> > 
> > rcu: INFO: rcu_preempt self-detected stall on CPU
> > rcu: 	0-....: (10500 ticks this GP) idle=57e/1/0x4000000000000002 softirq=44329/44329 fqs=5245 
> > 	(t=10502 jiffies g=79401 q=2096)
> > NMI backtrace for cpu 0
> > CPU: 0 PID: 23184 Comm: syz-executor.5 Not tainted 5.7.0-rc1-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > Call Trace:
> >  <IRQ>
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x188/0x20d lib/dump_stack.c:118
> >  nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
> >  nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62
> >  trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
> >  rcu_dump_cpu_stacks+0x19b/0x1e5 kernel/rcu/tree_stall.h:254
> >  print_cpu_stall kernel/rcu/tree_stall.h:475 [inline]
> >  check_cpu_stall kernel/rcu/tree_stall.h:549 [inline]
> >  rcu_pending kernel/rcu/tree.c:3225 [inline]
> >  rcu_sched_clock_irq.cold+0x55d/0xcfa kernel/rcu/tree.c:2296
> >  update_process_times+0x25/0x60 kernel/time/timer.c:1727
> >  tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:176
> >  tick_sched_timer+0x4e/0x140 kernel/time/tick-sched.c:1320
> >  __run_hrtimer kernel/time/hrtimer.c:1520 [inline]
> >  __hrtimer_run_queues+0x5ca/0xed0 kernel/time/hrtimer.c:1584
> >  hrtimer_interrupt+0x312/0x770 kernel/time/hrtimer.c:1646
> >  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1113 [inline]
> >  smp_apic_timer_interrupt+0x15b/0x600 arch/x86/kernel/apic/apic.c:1138
> >  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
> >  </IRQ>
> > RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5a0 fs/io_uring.c:7301
> > Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 3a ea 9d ff f3 90 <41> 80 3c 24 00 0f 85 53 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8
> > RSP: 0018:ffffc9000897fdf0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13
> > RAX: ffff888024082080 RBX: ffff88808df8e000 RCX: 1ffff9200112ffab
> > RDX: 0000000000000000 RSI: ffffffff81d549c6 RDI: ffff88808df8e300
> > RBP: ffffed1011bf1c2c R08: 0000000000000001 R09: ffffed1011bf1c61
> > R10: ffff88808df8e307 R11: ffffed1011bf1c60 R12: ffffed1011bf1c22
> > R13: ffff88808df8e160 R14: ffff88808df8e110 R15: ffffffff81d54ed0
> >  io_uring_release+0x3e/0x50 fs/io_uring.c:7324
> >  __fput+0x33e/0x880 fs/file_table.c:280
> >  task_work_run+0xf4/0x1b0 kernel/task_work.c:123
> >  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
> >  exit_to_usermode_loop+0x2fa/0x360 arch/x86/entry/common.c:165
> >  prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
> >  syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
> >  do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
> >  entry_SYSCALL_64_after_hwframe+0x49/0xb3
> 
> Make io ring ctx's percpu_ref balanced.
> 
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -5904,6 +5904,7 @@ static int io_submit_sqes(struct io_ring
>  fail_req:
>  			io_cqring_add_event(req, err);
>  			io_double_put_req(req);
> +			--submitted;
>  			break;
>  		}


fs/io_uring.c
  5880          for (i = 0; i < nr; i++) {
  5881                  const struct io_uring_sqe *sqe;
  5882                  struct io_kiocb *req;
  5883                  int err;
  5884  
  5885                  sqe = io_get_sqe(ctx);
  5886                  if (unlikely(!sqe)) {
  5887                          io_consume_sqe(ctx);
  5888                          break;
  5889                  }
  5890                  req = io_alloc_req(ctx, statep);
  5891                  if (unlikely(!req)) {
  5892                          if (!submitted)
  5893                                  submitted = -EAGAIN;
  5894                          break;
  5895                  }
  5896  
  5897                  err = io_init_req(ctx, req, sqe, statep, async);
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
On the success path io_init_req() takes some references like:

	get_cred(req->work.creds);

That one is probably buggy and should be put if the call to:

	return io_req_set_file(state, req, fd, sqe_flags);

fails...  But io_req_set_file() takes some other references if it
succeeds like percpu_ref_get(req->fixed_file_refs); and it's not clear
that those are released if io_submit_sqe() fails.

  5898                  io_consume_sqe(ctx);
  5899                  /* will complete beyond this point, count as submitted */
  5900                  submitted++;
  5901  
  5902                  if (unlikely(err)) {
  5903  fail_req:
  5904                          io_cqring_add_event(req, err);
  5905                          io_double_put_req(req);
  5906                          break;
  5907                  }
  5908  
  5909                  trace_io_uring_submit_sqe(ctx, req->opcode, req->user_data,
  5910                                                  true, async);
  5911                  err = io_submit_sqe(req, sqe, statep, &link);
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
here

  5912                  if (err)
  5913                          goto fail_req;
  5914          }

regards,
dan carpenter

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ