linux-kernel - Re: Question on handling managed IRQs when hotplugging CPUs

Message-ID: <d93ff049-e96d-36ce-7e58-ec97cbb27ed0@suse.de>
Date:   Mon, 4 Feb 2019 08:12:30 +0100
From:   Hannes Reinecke <hare@...e.de>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     John Garry <john.garry@...wei.com>,
        Keith Busch <keith.busch@...el.com>,
        Christoph Hellwig <hch@....de>,
        Marc Zyngier <marc.zyngier@....com>,
        "axboe@...nel.dk" <axboe@...nel.dk>,
        Peter Zijlstra <peterz@...radead.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        Linuxarm <linuxarm@...wei.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Hannes Reinecke <hare@...e.com>,
        "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>
Subject: Re: Question on handling managed IRQs when hotplugging CPUs

On 2/1/19 10:57 PM, Thomas Gleixner wrote:
> On Fri, 1 Feb 2019, Hannes Reinecke wrote:
>> Thing is, if we have _managed_ CPU hotplug (ie if the hardware provides some
>> means of quiescing the CPU before hotplug) then the whole thing is trivial;
>> disable SQ and wait for all outstanding commands to complete.
>> Then trivially all requests are completed and the issue is resolved.
>> Even with todays infrastructure.
>>
>> And I'm not sure if we can handle surprise CPU hotplug at all, given all the
>> possible race conditions.
>> But then I might be wrong.
> 
> The kernel would completely fall apart when a CPU would vanish by surprise,
> i.e. uncontrolled by the kernel. Then the SCSI driver exploding would be
> the least of our problems.
> 
Hehe. As I thought.

So, since the user has to wait for the system to declare 'ready for
CPU remove' anyway, why can't we just disable the SQ and wait for all
I/O to complete?
We can make it more fine-grained by just waiting on all outstanding I/O 
on that SQ to complete, but waiting for all I/O should be good as an 
initial try.
With that we wouldn't need to fiddle with driver internals, and could 
make it pretty generic.
And we could always add more detailed logic if the driver has the means 
for doing so.
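
To illustrate the idea, here is a minimal user-space sketch in C. It
is a model only, not kernel or driver code: struct sq_model and all of
the sq_*() helpers are hypothetical names made up for this example, and
a real implementation would need proper locking/atomics and would hook
into the CPU hotplug path. It shows the two steps: disable the SQ so
that no new commands are accepted, then report 'ready for CPU remove'
once the outstanding commands have completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of one submission queue (SQ). */
struct sq_model {
	bool enabled;           /* false once the SQ has been disabled */
	unsigned int inflight;  /* commands submitted but not yet completed */
};

static void sq_init(struct sq_model *sq)
{
	sq->enabled = true;
	sq->inflight = 0;
}

/* Submit one command; refused once the SQ has been disabled. */
static bool sq_submit(struct sq_model *sq)
{
	if (!sq->enabled)
		return false;
	sq->inflight++;
	return true;
}

/* Complete one outstanding command. */
static void sq_complete(struct sq_model *sq)
{
	assert(sq->inflight > 0);
	sq->inflight--;
}

/* Step 1: stop accepting new commands on this SQ. */
static void sq_disable(struct sq_model *sq)
{
	sq->enabled = false;
}

/* Step 2 (fine-grained): this SQ is disabled and fully drained. */
static bool sq_ready_for_cpu_remove(const struct sq_model *sq)
{
	return !sq->enabled && sq->inflight == 0;
}

/* Step 2 (coarse initial try): every SQ is disabled and drained. */
static bool all_ready_for_cpu_remove(const struct sq_model *sqs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!sq_ready_for_cpu_remove(&sqs[i]))
			return false;
	return true;
}
```

The coarse check waits on all queues, the fine-grained one only on the
SQ in question; neither needs to know anything about driver internals
beyond a per-queue count of outstanding commands.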

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@...e.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)