[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <50d4bea8-4fbc-5c78-7b14-8bad32e25a05@sapience.com>
Date: Mon, 24 Jul 2023 14:04:41 -0400
From: Genes Lists <lists@...ience.com>
To: Jens Axboe <axboe@...nel.dk>,
Oleksandr Natalenko <oleksandr@...alenko.name>,
stable@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Pavel Begunkov <asml.silence@...il.com>,
io-uring@...r.kernel.org, linux-kernel@...r.kernel.org,
Andres Freund <andres@...razel.de>
Subject: Re: [PATCH 6.4 800/800] io_uring: Use io_schedule* in cqring wait
On 7/24/23 11:47, Jens Axboe wrote:
> It's not a behavioural change, it's a reporting change. There's no
> functionality changing here. That said, I do think we should narrow it a
> bit so we're only marked as in iowait if the task waiting has pending
> IO. That should still satisfy the initial problem, and it won't flag
> iowait on mariadb like cases where they have someone else just
> perpetually waiting on requests.
>
> As a side effect, this also removes the flush that wasn't at all
> necessary on io_uring.
>
> If folks are able to test the below, that would be appreciated.
>
>
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 89a611541bc4..f4591b912ea8 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -2493,11 +2493,20 @@ int io_run_task_work_sig(struct io_ring_ctx *ctx)
> return 0;
> }
>
> +static bool current_pending_io(void)
> +{
> + struct io_uring_task *tctx = current->io_uring;
> +
> + if (!tctx)
> + return false;
> + return percpu_counter_read_positive(&tctx->inflight);
> +}
> +
> /* when returns >0, the caller should retry */
> static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
> struct io_wait_queue *iowq)
> {
> - int token, ret;
> + int io_wait, ret;
>
> if (unlikely(READ_ONCE(ctx->check_cq)))
> return 1;
> @@ -2511,17 +2520,19 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
> return 0;
>
> /*
> - * Use io_schedule_prepare/finish, so cpufreq can take into account
> - * that the task is waiting for IO - turns out to be important for low
> - * QD IO.
> + * Mark us as being in io_wait if we have pending requests, so cpufreq
> + * can take into account that the task is waiting for IO - turns out
> + * to be important for low QD IO.
> */
> - token = io_schedule_prepare();
> + io_wait = current->in_iowait;
> + if (current_pending_io())
> + current->in_iowait = 1;
> ret = 0;
> if (iowq->timeout == KTIME_MAX)
> schedule();
> else if (!schedule_hrtimeout(&iowq->timeout, HRTIMER_MODE_ABS))
> ret = -ETIME;
> - io_schedule_finish(token);
> + current->in_iowait = io_wait;
> return ret;
> }
>
>
Tested on top of 6.4.6 stable - and working great - iowait stats now
look like they always did.
thank you
gene
Powered by blists - more mailing lists