linux-kernel - Re: Question on handling managed IRQs when hotplugging CPUs

Message-ID: <42d149c5-0380-c357-8811-81015159ac04@huawei.com>
Date:   Tue, 5 Feb 2019 13:24:11 +0000
From:   John Garry <john.garry@...wei.com>
To:     Hannes Reinecke <hare@...e.de>,
        Thomas Gleixner <tglx@...utronix.de>
CC:     Keith Busch <keith.busch@...el.com>,
        Christoph Hellwig <hch@....de>,
        "Marc Zyngier" <marc.zyngier@....com>,
        "axboe@...nel.dk" <axboe@...nel.dk>,
        "Peter Zijlstra" <peterz@...radead.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        Linuxarm <linuxarm@...wei.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Hannes Reinecke <hare@...e.com>,
        "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>
Subject: Re: Question on handling managed IRQs when hotplugging CPUs

On 04/02/2019 07:12, Hannes Reinecke wrote:
> On 2/1/19 10:57 PM, Thomas Gleixner wrote:
>> On Fri, 1 Feb 2019, Hannes Reinecke wrote:
>>> Thing is, if we have _managed_ CPU hotplug (i.e. if the hardware provides
>>> some means of quiescing the CPU before hotplug) then the whole thing is
>>> trivial; disable SQ and wait for all outstanding commands to complete.
>>> Then trivially all requests are completed and the issue is resolved.
>>> Even with today's infrastructure.
>>>
>>> And I'm not sure if we can handle surprise CPU hotplug at all, given all
>>> the possible race conditions.
>>> But then I might be wrong.
>>
>> The kernel would completely fall apart when a CPU would vanish by surprise,
>> i.e. uncontrolled by the kernel. Then the SCSI driver exploding would be
>> the least of our problems.
>>
> Hehe. As I thought.

Hi Hannes,

>
> So, as the user then has to wait for the system to declare 'ready for
> CPU remove', why can't we just disable the SQ and wait for all I/O to
> complete?
> We can make it more fine-grained by just waiting on all outstanding I/O
> on that SQ to complete, but waiting for all I/O should be good as an
> initial try.
> With that we wouldn't need to fiddle with driver internals, and could
> make it pretty generic.

I don't fully understand this idea - specifically, at which layer would 
we be waiting for all the IO to complete?

> And we could always add more detailed logic if the driver has the means
> for doing so.
>

Thanks,
John

> Cheers,
>
> Hannes
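
For reference, the "disable the SQ and wait for all I/O to complete" idea
quoted above could be modelled roughly as follows. This is only a userspace
sketch of the ordering (stop new submissions first, then drain), not kernel
code: the SubmissionQueue class and its submit/complete/quiesce methods are
hypothetical names invented for illustration and do not correspond to any
blk-mq or SCSI API. It also deliberately leaves open the question raised in
this mail, namely at which layer the wait would live.

```python
import threading


class SubmissionQueue:
    """Toy model of one submission queue (hypothetical, not a kernel API)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._enabled = True   # cleared once a CPU remove is requested
        self._inflight = 0     # I/Os submitted but not yet completed

    @property
    def inflight(self):
        with self._cond:
            return self._inflight

    def submit(self):
        """Start one I/O. Returns False if the queue has been disabled."""
        with self._cond:
            if not self._enabled:
                return False
            self._inflight += 1
            return True

    def complete(self):
        """Models the completion path finishing one outstanding I/O."""
        with self._cond:
            self._inflight -= 1
            if self._inflight == 0:
                self._cond.notify_all()

    def quiesce(self, timeout=None):
        """Disable the queue, then wait for outstanding I/O to drain.

        Returns True once nothing is in flight, False if the timeout
        expires first (e.g. a completion that never arrives).
        """
        with self._cond:
            self._enabled = False
            return self._cond.wait_for(lambda: self._inflight == 0, timeout)
```

The timeout matters in this model: if a completion is never delivered, an
unbounded wait would block the (modelled) CPU remove forever, so the caller
needs some way to notice and fall back.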