Message-ID: <d17ce375-6ddb-8fd2-a8c1-f1e7d3572b37@lightnvm.io>
Date:   Fri, 17 Aug 2018 11:42:20 +0200
From:   Matias Bjørling <mb@...htnvm.io>
To:     javier@...xlabs.com
Cc:     igor.j.konopko@...el.com, marcin.dziegielewski@...el.com,
        hans.holmberg@...xlabs.com, hlitz@...c.edu,
        youngtack.jin@...cuitblvd.com, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v2 0/1] lightnvm: move bad block and chunk state logic
 to core

On 08/17/2018 11:34 AM, Javier Gonzalez wrote:
>> On 17 Aug 2018, at 11.29, Matias Bjørling <mb@...htnvm.io> wrote:
>>
>> On 08/17/2018 10:44 AM, Javier Gonzalez wrote:
>>>> On 17 Aug 2018, at 10.21, Matias Bjørling <mb@...htnvm.io> wrote:
>>>>
>>>> On 08/16/2018 05:53 PM, Javier Gonzalez wrote:
>>>>>> On 16 Aug 2018, at 13.34, Matias Bjørling <mb@...htnvm.io> wrote:
>>>>>>
>>>>>> This patch moves the 1.2 and 2.0 block/chunk metadata retrieval to
>>>>>> core.
>>>>>>
>>>>>> Hi Javier, I did not end up using your patch. I had misunderstood what
>>>>>> was implemented. Instead I implemented the detection of each chunk's state
>>>>>> by first sensing the first page, then the last page, and if the chunk
>>>>>> is sensed as open, a per-page scan is executed to update the write
>>>>>> pointer appropriately.
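
For reference, a rough, untested sketch of that first-page/last-page/per-page
scan order. Every type, helper and constant below is a made-up placeholder for
illustration; this is not the actual patch or the lightnvm API.

/* Placeholders for illustration only. */
enum { PAGE_ERASED, PAGE_WRITTEN, CHUNK_FREE, CHUNK_OPEN, CHUNK_CLOSED };
struct scan_dev;
struct scan_chunk { int state; int wp; int num_pages; };
int sense_page(struct scan_dev *dev, struct scan_chunk *chk, int page);

static int chunk_recover_state(struct scan_dev *dev, struct scan_chunk *chk)
{
	/* Sense the first page: if it is still erased, the chunk is free. */
	if (sense_page(dev, chk, 0) == PAGE_ERASED) {
		chk->state = CHUNK_FREE;
		chk->wp = 0;
		return 0;
	}

	/* Sense the last page: if it is written, the chunk is closed. */
	if (sense_page(dev, chk, chk->num_pages - 1) == PAGE_WRITTEN) {
		chk->state = CHUNK_CLOSED;
		chk->wp = chk->num_pages;
		return 0;
	}

	/* Otherwise the chunk is open: per-page scan to find the write pointer. */
	chk->state = CHUNK_OPEN;
	for (chk->wp = 1; chk->wp < chk->num_pages; chk->wp++)
		if (sense_page(dev, chk, chk->wp) == PAGE_ERASED)
			break;

	return 0;
}
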
>>>>> I see why you want to do it this way for maintaining the chunk
>>>>> abstraction, but this is potentially very inefficient as blocks not used
>>>>> by any target will be recovered unnecessarily.
>>>>
>>>> True. It will be up to the target to not ask for more metadata than necessary (similarly for 2.0).
>>>>
>>>>> Note that in 1.2, it is
>>>>> expected that targets will need to recover the write pointer themselves.
>>>>> What is more, in the normal path, this will be part of the metadata
>>>>> being stored so no wp recovery is needed. Still, this approach forces
>>>>> recovery on each 1.2 instance creation (also on factory reset). In this
>>>>> context, you are right, the patch I proposed only addresses the double
>>>>> erase issue, which was the original motivator, and left the actual
>>>>> pointer recovery to the normal pblk recovery process.
>>>>> Besides this, in order to consider this as a real possibility, we need
>>>>> to measure the impact on startup time. For this, could you implement
>>>>> nvm_bb_scan_chunk() and nvm_bb_chunk_sense() more efficiently by
>>>>> recovering (i) asynchronously and (ii) concurrently across luns so that
>>>>> we can establish the recovery cost more fairly? We can look at
>>>>> specific penalty ranges afterwards.
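
If it helps frame the request, here is a very rough, untested sketch of what
(i) plus (ii) could look like: one work item per LUN, so the sense reads
proceed in parallel across LUNs. struct scan_dev, sense_chunk() and the
geometry parameters are placeholders, not the lightnvm API.

#include <linux/slab.h>
#include <linux/workqueue.h>

struct scan_dev;					/* placeholder device handle */
int sense_chunk(struct scan_dev *dev, int lun, int chk);	/* placeholder */

struct lun_scan_work {
	struct work_struct work;
	struct scan_dev *dev;
	int lun;
	int chunks_per_lun;
};

/* Per-LUN worker: walk this LUN's chunks and issue the sense reads. */
static void lun_scan_fn(struct work_struct *work)
{
	struct lun_scan_work *lsw = container_of(work, struct lun_scan_work, work);
	int chk;

	for (chk = 0; chk < lsw->chunks_per_lun; chk++)
		sense_chunk(lsw->dev, lsw->lun, chk);
}

static int scan_all_luns(struct scan_dev *dev, int num_luns, int chunks_per_lun)
{
	struct lun_scan_work *lsw;
	int i;

	lsw = kcalloc(num_luns, sizeof(*lsw), GFP_KERNEL);
	if (!lsw)
		return -ENOMEM;

	/* Fan out: one worker per LUN on the unbound workqueue. */
	for (i = 0; i < num_luns; i++) {
		lsw[i].dev = dev;
		lsw[i].lun = i;
		lsw[i].chunks_per_lun = chunks_per_lun;
		INIT_WORK(&lsw[i].work, lun_scan_fn);
		queue_work(system_unbound_wq, &lsw[i].work);
	}

	/* Wait for all LUN scans to complete. */
	for (i = 0; i < num_luns; i++)
		flush_work(&lsw[i].work);

	kfree(lsw);
	return 0;
}
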
>>>>
>>>> Honestly, 1.2 is deprecated.
>>> For some...
>> No. OCSSD 1.2 is deprecated. Others that have a derivative of 1.2 have
>> their own storage stack and spec that they will continue to develop,
>> which cannot be expected to be compatible with the OCSSD 1.2 that
>> is implemented in the lightnvm subsystem.
>>
> 
> There are 1.2 devices out there using the current stack with no changes.

Yes, obviously, and they should continue to work, which this patch
doesn't change.

>>>> I don't care about the performance, I
>>>> care about it being easy to maintain, so it doesn't bog me down in the
>>>> future.
>>> This should be stated clearly in the commit message.
>>>> A back-of-the-envelope calculation for a 64-die SSD with 1024 blocks per
>>>> die and a 60us read time gives 4 seconds to scan if all chunks are
>>>> free, and a worst case of something like ~10 seconds. -> Not a problem for
>>>> me.
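
For reference, the arithmetic behind the ~4s figure seems to be: 64 dies x
1024 chunks/die = 65,536 chunks, and one 60us first-page sense read per chunk
gives 65,536 x 60us ~= 3.9 seconds when everything is free. Sensing the last
page as well for non-free chunks roughly doubles that, and a handful of
per-page scans of open chunks on top is presumably where the ~10 second
worst-case figure comes from.
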
>>> Worst case is _much_ worse than 10s if you need to scan the block to
>>> find the write pointer. We are talking minutes.
>>
>> I think you may be assuming that all blocks are open. My assumption is
>> that this is very rare (given the NAND characteristics). At most a
>> couple of blocks may be open per die. That leads me to the time
>> quoted.
>>
> 
> Worst case is worst case, no assumptions.
> 
>>> At least make the recovery reads asynchronous. It is low hanging fruit
>>> and will help the average case significantly.
>>>>> Also, the recovery scheme in pblk will change significantly by doing
>>>>> this, so I assume you will send a followup patchset reimplementing
>>>>> recovery for the 1.2 path?
>>>>
>>>> The 1.2 path shouldn't be necessary after this. That is the idea of
>>>> this work. Obviously, the set bad block interface will have to be
>>>> preserved and called.
>>> If we base this patch on top of my 2.0 recovery, we will still need to
>>> make changes to support all 1.2 corner cases.
>>> How do you want to do it? We get this patch in shape and I rebase on top,
>>> or the other way around?
>>
>> I'll pull this in when you've tested it with your 1.2 implementation.
> 
> Please address the asynchronous read comment before considering pulling
> this patch. There is really no reason not to improve this.
> 

I'll accept patches, but I won't spend time on it. Please let me know if 
you have other comments.
