Message-ID: <eee8d4b8-6b47-d675-aa6c-b0376b693e87@huawei.com>
Date: Fri, 18 Feb 2022 08:41:13 +0000
From: John Garry <john.garry@...wei.com>
To: Marc Zyngier <maz@...nel.org>
CC: <linux-kernel@...r.kernel.org>, <netdev@...r.kernel.org>,
"Greg Kroah-Hartman" <gregkh@...uxfoundation.org>,
Marcin Wojtas <mw@...ihalf.com>,
Russell King <linux@...linux.org.uk>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, <kernel-team@...roid.com>
Subject: Re: [PATCH 1/2] genirq: Extract irq_set_affinity_masks() from
devm_platform_get_irqs_affinity()
On 17/02/2022 17:17, Marc Zyngier wrote:
Hi Marc,
>> I know you mentioned it in 2/2, but it would be interesting to see how
>> network controller drivers can handle the problem of missing in-flight
>> IO completions for managed irq shutdown. For storage controllers this
>> is all now safely handled in the block layer.
>
> Do you have a pointer to this? It'd be interesting to see if there is
> a common pattern.

Check blk_mq_hctx_notify_offline() and the other hotplug handler
friends in block/blk-mq.c, and also blk_mq_get_ctx()/blk_mq_map_queue().
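
Roughly speaking, blk-mq registers those handlers as a multi-instance
CPU hotplug state and every HW queue context adds itself as an
instance. The fragment below is paraphrased from memory rather than
copied from block/blk-mq.c, so treat it as a sketch of the shape only:

	/* Sketch: blk-mq hooks its handlers into CPU hotplug once */
	cpuhp_setup_state_multi(CPUHP_AP_BLK_MQ_ONLINE,
				"block/mq:online",
				blk_mq_hctx_notify_online,
				blk_mq_hctx_notify_offline);

	/* ... and each HW queue context registers itself as an
	 * instance at init time, so the notifiers above run per queue:
	 */
	cpuhp_state_add_instance_nocalls(CPUHP_AP_BLK_MQ_ONLINE,
					 &hctx->cpuhp_online);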

So the key steps in CPU offlining are (see the rough sketch below):
- when the last CPU in the HW queue context cpumask is going offline we
mark the HW queue as inactive and no longer queue requests there
- drain all in-flight requests before we allow that last CPU to go
offline, meaning that we always have a CPU online to service any
completion interrupts
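
A rough sketch of the offline handler, heavily trimmed and paraphrased
from blk_mq_hctx_notify_offline() (barriers, queue refcounting etc.
omitted, and helper names simplified), just to show those two steps:

static int blk_mq_hctx_notify_offline(unsigned int cpu,
				      struct hlist_node *node)
{
	struct blk_mq_hw_ctx *hctx = hlist_entry_safe(node,
			struct blk_mq_hw_ctx, cpuhp_online);

	/* Only act if @cpu is the last online CPU in this hctx's cpumask */
	if (!cpumask_test_cpu(cpu, hctx->cpumask) ||
	    !blk_mq_last_cpu_in_hctx(cpu, hctx))
		return 0;

	/* Step 1: mark the HW queue inactive so no new requests are
	 * allocated against it
	 */
	set_bit(BLK_MQ_S_INACTIVE, &hctx->state);

	/* Step 2: wait for requests already allocated on this hctx to
	 * complete before the CPU goes away, so a CPU stays online to
	 * service the completion interrupt
	 */
	while (blk_mq_hctx_has_requests(hctx))
		msleep(5);

	return 0;
}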

This scheme relies on symmetrical HW submission and completion queues
and also on the blk-mq HW queue context cpumask being the same as the
HW queue's IRQ affinity mask (see blk_mq_pci_map_queues()).
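
For completeness, a storage driver typically gets that 1:1 mapping by
spreading its IO vectors with pci_alloc_irq_vectors_affinity() and then
reusing the resulting affinity in its ->map_queues() callback. The
fragment below is only illustrative - foo_dev, pdev and first_io_vector
are made-up names, not taken from any real driver:

/* Illustrative only: make the blk-mq queue<->CPU mapping follow the
 * managed IRQ affinity of the device's MSI-X vectors.
 */
static int foo_map_queues(struct blk_mq_tag_set *set)
{
	struct foo_dev *foo = set->driver_data;	/* hypothetical */

	/* Map each HW queue to the CPUs of its PCI vector so that the
	 * hctx cpumask matches the vector's IRQ affinity mask.
	 */
	blk_mq_pci_map_queues(&set->map[HCTX_TYPE_DEFAULT], foo->pdev,
			      foo->first_io_vector);
	return 0;
}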

I am not sure how well this would fit with the networking stack or
that Marvell driver.

Thanks,
John