Date:	Thu, 21 Aug 2014 14:07:13 +0200
From:	Matias Bjørling <m@...rling.me>
To:	Keith Busch <keith.busch@...el.com>
CC:	willy@...ux.intel.com, sbradshaw@...ron.com, axboe@...com,
	tom.leiming@...il.com, hch@...radead.org, rlnelson@...gle.com,
	linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org
Subject: Re: [PATCH v12] NVMe: Convert to blk-mq

On 08/19/2014 12:49 AM, Keith Busch wrote:
> On Fri, 15 Aug 2014, Matias Bjørling wrote:
>>
>> * NVMe queues are merged with the tags structure of blk-mq.
>>
>
> I see the driver's queue suspend logic is removed, but I didn't mean to
> imply it was safe to do so without replacing it with something else. I
> thought maybe we could use the blk_stop/start_queue() functions if I'm
> correctly understanding what they're for.

They're usually only used by drivers on the legacy request model, not blk-mq.
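If we do need to quiesce on suspend, the blk-mq counterparts would be
blk_mq_stop_hw_queues()/blk_mq_start_hw_queues(). Rough sketch of what I
have in mind (hand-wavy: nvme_dev_shutdown()/nvme_dev_resume() are just
placeholder names here, and error handling is omitted):

```c
/* Sketch only -- function names below are placeholders, not the
 * driver's actual entry points. */
static void nvme_dev_shutdown(struct nvme_dev *dev)
{
	/* Stop blk-mq from dispatching new requests to our hw queues,
	 * then drain/cancel outstanding commands and delete the queues. */
	blk_mq_stop_hw_queues(dev->admin_q);
}

static void nvme_dev_resume(struct nvme_dev *dev)
{
	/* Re-create the hw queues, then let blk-mq dispatch again. */
	blk_mq_start_hw_queues(dev->admin_q);
}
```

That would close the window where blk-mq keeps dispatching into queues we
are about to tear down.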

Please correct me if I'm wrong. The flow of suspend is roughly as follows:

1. Freeze user threads
2. Perform sys_sync
3. Freeze freezable kernel threads
4. Freeze devices
5. ...

On nvme suspend, we process all outstanding requests and cancel any 
outstanding IOs before suspending.

From what I can see, is it still possible for IOs to be submitted and 
then lost in the process?

>
> With what's in version 12, we could free an irq multiple times that
> doesn't even belong to the nvme queue anymore in certain error conditions.
>
> A couple other things I just noticed:
>
>   * We lose the irq affinity hint after a suspend/resume or device reset
>   because the driver's init_hctx() isn't called in these scenarios.

Ok, you're right.

>
>   * After a reset, we are not guaranteed that we even have the same number
>   of h/w queues. The driver frees ones beyond the device's capabilities,
>   so blk-mq may have references to freed memory. The driver may also
>   allocate more queues if it is capable, but blk-mq won't be able to take
>   advantage of that.

Ok. Out of curiosity, why can the number of exposed nvme queues change 
from the hw perspective on suspend/resume?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
