Message-ID: <87367tvh6g.fsf@nanos.tec.linutronix.de>
Date: Thu, 21 May 2020 10:13:59 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Ming Lei <ming.lei@...hat.com>
Cc: Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
John Garry <john.garry@...wei.com>,
Bart Van Assche <bvanassche@....org>,
Hannes Reinecke <hare@...e.com>, io-uring@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: io_uring vs CPU hotplug, was Re: [PATCH 5/9] blk-mq: don't set data->ctx and data->hctx in blk_mq_alloc_request_hctx

Ming Lei <ming.lei@...hat.com> writes:
> On Thu, May 21, 2020 at 12:14:18AM +0200, Thomas Gleixner wrote:
>> When the CPU is finally offlined, i.e. the CPU cleared the online bit in
>> the online mask is definitely too late simply because it still runs on
>> that outgoing CPU _after_ the hardware queue is shut down and drained.
>
> IMO, the patch in Christoph's blk-mq-hotplug.2 still works for percpu
> kthread.
>
> It is just not optimal in the retrying, but it should be fine. When the
> percpu kthread is scheduled on the CPU to be offlined:
>
> - if the kthread doesn't observe the INACTIVE flag, the allocated request
> will be drained.
>
> - otherwise, the kthread just retries and retries to allocate & release,
> and sooner or later, its time slice is consumed, and migrated out, and the
> cpu hotplug handler will get chance to run and move on, then the cpu is
> shutdown.

1) This is based on the assumption that the kthread is in the SCHED_OTHER
   scheduling class. Is that really a valid assumption?

2) What happens in the following scenario:

      unplug

        mq_offline
          set_ctx_inactive()
          drain_io()

      io_kthread()
          try_queue()
          wait_on_ctx()

   Can this happen and if so what will wake up that thread?

I'm not familiar enough with that code to answer #2, but this really
wants to be properly described and documented.
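
To make the ordering in #2 concrete, here is a minimal single-threaded
model of the two possible interleavings. It is only a sketch and not
blk-mq code: struct ctx_model and its fields are hypothetical, the
function names merely mirror the scenario above, and the rule that
drain_io() wakes a parked waiter is an assumption made for illustration.
Whether the real code has such a wakeup is exactly the open question.

```c
#include <stdbool.h>

/* Hypothetical model of one software queue context; not a kernel struct. */
struct ctx_model {
	bool inactive;       /* set once the hotplug path marks the ctx inactive */
	int  in_flight;      /* requests the drain step has to wait for */
	bool waiter_parked;  /* io_kthread is sleeping in wait_on_ctx() */
};

static void set_ctx_inactive(struct ctx_model *c)
{
	c->inactive = true;
}

/* ASSUMPTION: draining is the only event in this model that wakes a waiter. */
static void drain_io(struct ctx_model *c)
{
	c->in_flight = 0;
	c->waiter_parked = false;
}

/* Queueing fails once the ctx has been marked inactive. */
static bool try_queue(struct ctx_model *c)
{
	if (c->inactive)
		return false;
	c->in_flight++;
	return true;
}

static void wait_on_ctx(struct ctx_model *c)
{
	c->waiter_parked = true;
}

/*
 * Replays the scenario with the kthread running either before or after
 * drain_io(). Returns true if the kthread ends up parked with no further
 * wakeup source left in the model.
 */
bool kthread_stuck_after_unplug(bool kthread_runs_before_drain)
{
	struct ctx_model c = { 0 };

	set_ctx_inactive(&c);

	if (kthread_runs_before_drain) {
		if (!try_queue(&c))
			wait_on_ctx(&c);
		drain_io(&c);
	} else {
		drain_io(&c);
		if (!try_queue(&c))
			wait_on_ctx(&c);
	}
	return c.waiter_parked;
}
```

In this model the kthread is only left stranded when it reaches
wait_on_ctx() after drain_io() has already completed, which is the
ordering shown in #2.
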
Thanks,
tglx