lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87eereuudh.fsf@nanos.tec.linutronix.de>
Date:   Thu, 21 May 2020 00:14:18 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
        Ming Lei <ming.lei@...hat.com>
Cc:     linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
        John Garry <john.garry@...wei.com>,
        Bart Van Assche <bvanassche@....org>,
        Hannes Reinecke <hare@...e.com>, io-uring@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: io_uring vs CPU hotplug, was Re: [PATCH 5/9] blk-mq: don't set data->ctx and data->hctx in blk_mq_alloc_request_hctx

Jens Axboe <axboe@...nel.dk> writes:

> On 5/20/20 1:41 PM, Thomas Gleixner wrote:
>> Jens Axboe <axboe@...nel.dk> writes:
>>> On 5/20/20 8:45 AM, Jens Axboe wrote:
>>>> It just uses kthread_create_on_cpu(), nothing home grown. Pretty sure
>>>> they just break affinity if that CPU goes offline.
>>>
>>> Just checked, and it works fine for me. If I create an SQPOLL ring with
>>> SQ_AFF set and bound to CPU 3, if CPU 3 goes offline, then the kthread
>>> just appears unbound but runs just fine. When CPU 3 comes online again,
>>> the mask appears correct.
>> 
>> When exactly during the unplug operation is it unbound?
>
> When the CPU has been fully offlined. I check the affinity mask, it
> reports 0. But it's still being scheduled, and it's processing work.
> Here's an example, PID 420 is the thread in question:
>
> [root@...hlinux cpu3]# taskset -p 420
> pid 420's current affinity mask: 8
> [root@...hlinux cpu3]# echo 0 > online 
> [root@...hlinux cpu3]# taskset -p 420
> pid 420's current affinity mask: 0
> [root@...hlinux cpu3]# echo 1 > online 
> [root@...hlinux cpu3]# taskset -p 420
> pid 420's current affinity mask: 8
>
> So as far as I can tell, it's working fine for me with the goals
> I have for that kthread.

Works for me is not really useful information and does not answer my
question:

>> When exactly during the unplug operation is it unbound?

The problem Ming and Christoph are trying to solve requires that the
thread is migrated _before_ the hardware queue is shut down and
drained. That's why I asked for the exact point where this happens.

When the CPU is finally offlined, i.e. the CPU cleared the online bit in
the online mask is definitely too late simply because it still runs on
that outgoing CPU _after_ the hardware queue is shut down and drained.

This needs more thought and changes to sched and kthread so that the
kthread breaks affinity once the CPU goes offline. Too tired to figure
that out right now.

Thanks,

        tglx




Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ