linux-kernel - Re: [PATCH 3/3] NVMe: Convert to blk-mq


Date:	Tue, 22 Oct 2013 13:52:25 -0600 (MDT)
From:	Keith Busch <keith.busch@...el.com>
To:	Matias Bjorling <m@...rling.me>
cc:	Keith Busch <keith.busch@...el.com>, axboe@...nel.dk,
	willy@...ux.intel.com, linux-kernel@...r.kernel.org,
	linux-nvme@...ts.infradead.org
Subject: Re: [PATCH 3/3] NVMe: Convert to blk-mq

On Tue, 22 Oct 2013, Matias Bjorling wrote:
> On 22-10-2013 18:55, Keith Busch wrote:
>> On Fri, 18 Oct 2013, Matias Bjørling wrote:
>>> On 10/18/2013 05:13 PM, Keith Busch wrote:
>>>> On Fri, 18 Oct 2013, Matias Bjorling wrote:
>>>>> The nvme driver implements itself as a bio-based driver. This is primarily
>>>>> because of high lock congestion for high-performance nvm devices. To
>>>>> remove the congestion within the traditional block layer, a multi-queue
>>>>> block layer is being implemented.
>> 
>>>>> -    result = nvme_map_bio(nvmeq, iod, bio, dma_dir, psegs);
>>>>> -    if (result <= 0)
>>>>> +    if (nvme_map_rq(nvmeq, iod, rq, dma_dir))
>>>>>         goto free_cmdid;
>>>>> -    length = result;
>>>>> 
>>>>> -    cmnd->rw.command_id = cmdid;
>>>>> +    length = blk_rq_bytes(rq);
>>>>> +
>>>>> +    cmnd->rw.command_id = rq->tag;
>>>> 
>>>> The command ids have to be unique on a submission queue. Each
>>>> namespace's blk-mq has its own 'tags', used as command ids here, but the
>>>> namespaces share submission queues. What's stopping the tags for commands
>>>> sent to namespace 1 from clashing with tags for namespace 2?
>>>> 
>>>> I think this would work better if one blk-mq was created per device
>>>> rather than namespace. It would fix the tag problem above and save a
>>>> lot of memory potentially wasted on millions of requests allocated that
>>>> can't be used.
>>> 
>>> You're right. I didn't see the connection. In v3 I'll push struct 
>>> request_queue to nvme_dev and map the queues appropriately. It will also 
>>> fix the command id issues.
>> 
>> Just anticipating a possible issue with the suggestion. Will this separate
>> the logical block size from the request_queue? Each namespace can have
>> a different format, so the block size and request_queue can't be tied
>> together like they currently are for this to work.
>
> If only a couple of different logical block sizes are to be expected (1-4),
> can we keep a list of already initialized request queues and reuse the one
> that matches?

The spec allows a namespace to have up to 16 different block formats and
they need not be the same 16 as another namespace on the same device.
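
To make the per-namespace block size concrete, here is a rough userspace
sketch, not driver code. It assumes the Identify Namespace layout from the
spec: a table of 16 LBA formats, with the low four bits of FLBAS selecting
the one in use and the data size stored as a power of two. The struct and
function names (lba_format, ns_ident, ns_block_size) are made up for
illustration and only model the fields needed here.

```c
#include <assert.h>
#include <stdint.h>

#define NVME_MAX_LBAF 16   /* the spec allows up to 16 formats per namespace */

/* Illustrative model of one LBA format entry. */
struct lba_format {
    uint16_t ms;   /* metadata size in bytes */
    uint8_t  ds;   /* data size, as a power of two (9 means 512 bytes) */
    uint8_t  rp;   /* relative performance */
};

/* Illustrative subset of the Identify Namespace data. */
struct ns_ident {
    uint8_t flbas;                          /* low 4 bits: format in use */
    struct lba_format lbaf[NVME_MAX_LBAF];  /* formats this namespace supports */
};

/* Logical block size, in bytes, of the format the namespace currently uses. */
unsigned int ns_block_size(const struct ns_ident *id)
{
    unsigned int idx = id->flbas & 0xf;

    assert(idx < NVME_MAX_LBAF);
    return 1u << id->lbaf[idx].ds;
}
```

Two namespaces on the same device can return different values here, which
is why the block size cannot simply hang off one shared request_queue.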

From a practical standpoint, I don't think devices will support more
than a few formats, but even if you kept it to that many request queues,
you just get back to conflicting command id tags and some wasted memory.
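
The command id conflict can be shown with a toy model. This is only a
sketch under simplifying assumptions: tag_set and sub_queue are
hypothetical stand-ins, not the blk-mq or driver structures, and the
allocator simply hands out the lowest free tag. With one tag set per
namespace, both namespaces get tag 0 for their first command and the shared
submission queue sees a duplicate command id. With a single per-device tag
set, the ids stay unique.

```c
#include <assert.h>
#include <stdbool.h>

#define QUEUE_DEPTH 4   /* hypothetical depth, kept tiny for illustration */

/* Hypothetical stand-in for a tag set: tracks which tags are in use. */
struct tag_set {
    bool busy[QUEUE_DEPTH];
};

/* Hypothetical stand-in for one hardware submission queue. */
struct sub_queue {
    int inflight[QUEUE_DEPTH];   /* in-flight commands per command id */
};

/* Return the lowest free tag, or -1 if the set is exhausted. */
static int tag_alloc(struct tag_set *ts)
{
    for (int i = 0; i < QUEUE_DEPTH; i++) {
        if (!ts->busy[i]) {
            ts->busy[i] = true;
            return i;
        }
    }
    return -1;
}

/* Queue a command with the tag as its command id; false means duplicate id. */
static bool submit(struct sub_queue *sq, int tag)
{
    assert(tag >= 0 && tag < QUEUE_DEPTH);
    return ++sq->inflight[tag] == 1;
}

/* Each namespace has a private tag set, but both feed one submission queue. */
bool per_namespace_tags_clash(void)
{
    struct tag_set ns1 = {0}, ns2 = {0};
    struct sub_queue sq = {0};
    bool ok1 = submit(&sq, tag_alloc(&ns1));   /* ns1 is handed tag 0 */
    bool ok2 = submit(&sq, tag_alloc(&ns2));   /* ns2 is handed tag 0 as well */

    return ok1 && !ok2;
}

/* Both namespaces draw from one per-device tag set. */
bool per_device_tags_clash(void)
{
    struct tag_set dev = {0};
    struct sub_queue sq = {0};
    bool ok1 = submit(&sq, tag_alloc(&dev));   /* tag 0 */
    bool ok2 = submit(&sq, tag_alloc(&dev));   /* tag 1 */

    return !(ok1 && ok2);
}
```

Keeping a handful of request queues, one per block size, is the first case
again with fewer tag sets: any two of them can still hand out the same tag
on a shared submission queue.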

>
> Axboe, do you know of a better solution?