lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <00772002-8df8-3a41-6e6c-20e3854ad3f0@kernel.dk>
Date:   Sat, 21 May 2022 20:42:35 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Guo Xuenan <guoxuenan@...wei.com>, asml.silence@...il.com,
        io-uring@...r.kernel.org
Cc:     houtao1@...wei.com, yi.zhang@...wei.com, chengzhihao1@...wei.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] io_uring: add a schedule condition in io_submit_sqes

On 5/21/22 8:33 AM, Guo Xuenan wrote:
> when set up sq ring size with IORING_MAX_ENTRIES, io_submit_sqes may
> looping ~32768 times which may trigger soft lockups. add need_resched
> condition to avoid this bad situation.
> 
> set sq ring size 32768 and using io_sq_thread to perform stress test
> as follows:
> watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [iou-sqp-600:601]
> Kernel panic - not syncing: softlockup: hung tasks
> CPU: 2 PID: 601 Comm: iou-sqp-600 Tainted: G L 5.18.0-rc7+ #3
> Hardware name: linux,dummy-virt (DT)
> Call trace:
>  dump_backtrace+0x218/0x228
>  show_stack+0x20/0x68
>  dump_stack_lvl+0x68/0x84
>  dump_stack+0x1c/0x38
>  panic+0x1ec/0x3ec
>  watchdog_timer_fn+0x28c/0x300
>  __hrtimer_run_queues+0x1d8/0x498
>  hrtimer_interrupt+0x238/0x558
>  arch_timer_handler_virt+0x48/0x60
>  handle_percpu_devid_irq+0xdc/0x270
>  generic_handle_domain_irq+0x50/0x70
>  gic_handle_irq+0x8c/0x4bc
>  call_on_irq_stack+0x2c/0x38
>  do_interrupt_handler+0xc4/0xc8
>  el1_interrupt+0x48/0xb0
>  el1h_64_irq_handler+0x18/0x28
>  el1h_64_irq+0x74/0x78
>  console_unlock+0x5d0/0x908
>  vprintk_emit+0x21c/0x470
>  vprintk_default+0x40/0x50
>  vprintk+0xd0/0x128
>  _printk+0xb4/0xe8
>  io_issue_sqe+0x1784/0x2908
>  io_submit_sqes+0x538/0x2880
>  io_sq_thread+0x328/0x7b0
>  ret_from_fork+0x10/0x20
> SMP: stopping secondary CPUs
> Kernel Offset: 0x40f1e8600000 from 0xffff800008000000
> PHYS_OFFSET: 0xfffffa8c80000000
> CPU features: 0x110,0000cf09,00001006
> Memory Limit: none
> ---[ end Kernel panic - not syncing: softlockup: hung tasks ]---
> 
> Signed-off-by: Guo Xuenan <guoxuenan@...wei.com>
> ---
>  fs/io_uring.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 92ac50f139cd..d897c6798f00 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -7864,7 +7864,7 @@ static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr)
>  			if (!(ctx->flags & IORING_SETUP_SUBMIT_ALL))
>  				break;
>  		}
> -	} while (submitted < nr);
> +	} while (submitted < nr && !need_resched());
>  
>  	if (unlikely(submitted != nr)) {
>  		int ref_used = (submitted == -EAGAIN) ? 0 : submitted;

This is wrong, you'll potentially end up doing random short submits for
non-sqpoll as well.

sqpoll already supports capping how many it submits in one go, it just
doesn't do it if it's only running one ring. As simple as the below,
with 1024 pulled out of thin air. Would be great if you could experiment
and submit a v2 based on this principle instead. Might still need a
cond_resched() carefully placed in io_sq_thread().

diff --git a/fs/io_uring.c b/fs/io_uring.c
index e0823f58f795..3830d7b493b9 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7916,7 +7916,8 @@ static int __io_sq_thread(struct io_ring_ctx *ctx, bool cap_entries)
 	unsigned int to_submit;
 	int ret = 0;
 
-	to_submit = io_sqring_entries(ctx);
+	/* cap at 1024 to avoid doing too much in one submit round */
+	to_submit = min(io_sqring_entries(ctx), 1024U);
 	/* if we're handling multiple rings, cap submit size for fairness */
 	if (cap_entries && to_submit > IORING_SQPOLL_CAP_ENTRIES_VALUE)
 		to_submit = IORING_SQPOLL_CAP_ENTRIES_VALUE;

-- 
Jens Axboe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ