linux-kernel - Re: A kernel warning when entering suspend

Date:   Fri, 5 Apr 2019 06:50:15 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     Keith Busch <kbusch@...nel.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Dongli Zhang <dongli.zhang@...cle.com>,
        fin4478 fin4478 <fin4478@...mail.com>,
        "keith.busch@...el.com" <keith.busch@...el.com>,
        "axboe@...com" <axboe@...com>,
        "linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
        linux-kernel@...r.kernel.org, linux-block@...r.kernel.org
Subject: Re: A kernel warning when entering suspend

On Thu, Apr 04, 2019 at 04:29:56PM -0600, Keith Busch wrote:
> On Fri, Apr 05, 2019 at 06:19:50AM +0800, Ming Lei wrote:
> > Also, in the current blk-mq implementation, an irq may be shut down
> > because of CPU hotplug even when there is an in-flight request on the
> > queue served by that irq. We then depend on the timeout handler to
> > cover this case, and the irq may be enabled in the timeout handler too;
> > please see nvme_poll_irqdisable().
> 
> Right, but when the last CPU mapped to an hctx is taken offline, we really
> ought to have blk-mq wait for that hctx to reap all outstanding requests
> before letting the notifier continue with offlining that CPU. We just
> don't have the infrastructure to freeze an individual hctx yet.
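
To make the ordering described above concrete, here is a minimal user-space
sketch of "drain the hctx before its last mapped CPU goes offline". It is
only a model under stated assumptions: struct hctx_model, is_last_mapped_cpu(),
reap_one() and offline_cpu() are hypothetical names, not blk-mq or CPU hotplug
APIs, CPUs are limited to 0..63 so a mask fits in one word, and a completion
is reduced to decrementing a counter.

```c
/* User-space model only: every name here is hypothetical, not a blk-mq API. */
#include <stdbool.h>
#include <stdint.h>

struct hctx_model {
    uint64_t cpumask;   /* CPUs mapped to this hardware queue (bits 0..63) */
    unsigned inflight;  /* requests submitted but not yet completed */
};

/* True if 'cpu' is the only online CPU still mapped to this queue. */
static bool is_last_mapped_cpu(const struct hctx_model *h,
                               uint64_t online, unsigned cpu)
{
    uint64_t mapped_online = h->cpumask & online;

    return mapped_online == (UINT64_C(1) << cpu);
}

/* Stand-in for reaping one completion (by interrupt or by polling). */
static void reap_one(struct hctx_model *h)
{
    if (h->inflight > 0)
        h->inflight--;
}

/*
 * Model of the offline step: if 'cpu' is the queue's last mapped online
 * CPU, reap every in-flight request before clearing the CPU from the
 * online mask. Returns the number of requests reaped while draining.
 */
static unsigned offline_cpu(struct hctx_model *h, uint64_t *online,
                            unsigned cpu)
{
    unsigned reaped = 0;

    if (is_last_mapped_cpu(h, *online, cpu)) {
        while (h->inflight > 0) {
            reap_one(h);
            reaped++;
        }
    }
    *online &= ~(UINT64_C(1) << cpu);
    return reaped;
}
```

In this model, taking a CPU offline while another mapped CPU is still online
leaves the in-flight count untouched; only the last mapped CPU triggers the
drain. A real implementation would have to wait for the hardware to complete
the requests rather than decrement a counter.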

It looks like this issue isn't unique to storage devices. Does anyone know
how other device drivers deal with this situation? For example, a network
packet is submitted to the NIC and has not completed yet, and then its
interrupt may be shut down because of CPU hotplug.

Thanks,
Ming