[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <448d3660-0d83-889b-001f-a09ea53fa117@kernel.dk>
Date: Wed, 20 May 2020 09:20:50 -0600
From: Jens Axboe <axboe@...nel.dk>
To: Christoph Hellwig <hch@....de>, Ming Lei <ming.lei@...hat.com>
Cc: linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
linux-block@...r.kernel.org, John Garry <john.garry@...wei.com>,
Bart Van Assche <bvanassche@....org>,
Hannes Reinecke <hare@...e.com>, io-uring@...r.kernel.org
Subject: Re: io_uring vs CPU hotplug, was Re: [PATCH 5/9] blk-mq: don't set
data->ctx and data->hctx in blk_mq_alloc_request_hctx
On 5/20/20 8:45 AM, Jens Axboe wrote:
> On 5/20/20 2:03 AM, Christoph Hellwig wrote:
>> On Wed, May 20, 2020 at 11:04:24AM +0800, Ming Lei wrote:
>>> On Wed, May 20, 2020 at 09:18:23AM +0800, Ming Lei wrote:
>>>> On Tue, May 19, 2020 at 05:30:00PM +0200, Christoph Hellwig wrote:
>>>>> On Tue, May 19, 2020 at 09:54:20AM +0800, Ming Lei wrote:
>>>>>> As Thomas clarified, workqueue hasn't such issue any more, and only other
>>>>>> per CPU kthreads can run until the CPU clears the online bit.
>>>>>>
>>>>>> So the question is if IO can be submitted from such kernel context?
>>>>>
>>>>> What other per-CPU kthreads even exist?
>>>>
>>>> I don't know, so expose to wider audiences.
>>>
>>> One user is io uring with IORING_SETUP_SQPOLL & IORING_SETUP_SQ_AFF, see
>>> io_sq_offload_start(), and it is a IO submission kthread.
>>
>> As far as I can tell that code is buggy, as it still needs to migrate
>> the thread away when the cpu is offlined. This isn't a per-cpu kthread
>> in the sene of having one for each CPU.
>>
>> Jens?
>
> It just uses kthread_create_on_cpu(), nothing home grown. Pretty sure
> they just break affinity if that CPU goes offline.
Just checked, and it works fine for me. If I create an SQPOLL ring with
SQ_AFF set and bound to CPU 3, if CPU 3 goes offline, then the kthread
just appears unbound but runs just fine. When CPU 3 comes online again,
the mask appears correct.
So don't think there's anything wrong on that side. The affinity is a
performance optimization, not a correctness issue. Really not much we
can do if the chosen CPU is offlined, apart from continue to chug along.
--
Jens Axboe
Powered by blists - more mailing lists