lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <F3989C76-24EF-4E26-801A-DB3E1CC45981@cnexlabs.com>
Date:   Fri, 17 Aug 2018 10:49:28 +0000
From:   Javier Gonzalez <javier@...xlabs.com>
To:     Matias Bjørling <mb@...htnvm.io>
CC:     "Konopko, Igor J" <igor.j.konopko@...el.com>,
        "marcin.dziegielewski@...el.com" <marcin.dziegielewski@...el.com>,
        Hans Holmberg <hans.holmberg@...xlabs.com>,
        Heiner Litz <hlitz@...c.edu>,
        Young Tack Tack Jin <youngtack.jin@...cuitblvd.com>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH v2 0/1] lightnvm: move bad block and chunk state logic
 to core


> On 17 Aug 2018, at 11.42, Matias Bjørling <mb@...htnvm.io> wrote:
> 
> On 08/17/2018 11:34 AM, Javier Gonzalez wrote:
>>> On 17 Aug 2018, at 11.29, Matias Bjørling <mb@...htnvm.io> wrote:
>>> 
>>> On 08/17/2018 10:44 AM, Javier Gonzalez wrote:
>>>>> On 17 Aug 2018, at 10.21, Matias Bjørling <mb@...htnvm.io> wrote:
>>>>> 
>>>>> On 08/16/2018 05:53 PM, Javier Gonzalez wrote:
>>>>>>> On 16 Aug 2018, at 13.34, Matias Bjørling <mb@...htnvm.io> wrote:
>>>>>>> 
>>>>>>> This patch moves the 1.2 and 2.0 block/chunk metadata retrieval to
>>>>>>> core.
>>>>>>> 
>>>>>>> Hi Javier, I did not end up using your patch. I had misunderstood what
>>>>>>> was implemented. Instead I implemented the detection of the each chunk by
>>>>>>> first sensing the first page, then the last page, and if the chunk
>>>>>>> is sensed as open, a per page scan will be executed to update the write
>>>>>>> pointer appropriately.
>>>>>> I see why you want to do it this way for maintaining the chunk
>>>>>> abstraction, but this is potentially very inefficient as blocks not used
>>>>>> by any target will be recovered unnecessarily.
>>>>> 
>>>>> True. It will up to the target to not ask for more metadata than necessary (similarly for 2.0)
>>>>> 
>>>>> Note that in 1.2, it is
>>>>>> expected that targets will need to recover the write pointer themselves.
>>>>>> What is more, in the normal path, this will be part of the metadata
>>>>>> being stored so no wp recovery is needed. Still, this approach forces
>>>>>> recovery on each 1.2 instance creation (also on factory reset). In this
>>>>>> context, you are right, the patch I proposed only addresses the double
>>>>>> erase issue, which was the original motivator, and left the actual
>>>>>> pointer recovery to the normal pblk recovery process.
>>>>>> Besides this, in order to consider this as a real possibility, we need
>>>>>> to measure the impact on startup time. For this, could you implement
>>>>>> nvm_bb_scan_chunk() and nvm_bb_chunk_sense() more efficiently by
>>>>>> recovering (i) asynchronously and (ii) concurrently across luns so that
>>>>>> we can establish the recovery cost more fairly? We can look at a
>>>>>> specific penalty ranges afterwards.
>>>>> 
>>>>> Honestly, 1.2 is deprecated.
>>>> For some...
>>> No. OCSSD 1.2 is deprecated. Others that have a derivative of 1.2 have
>>> their own storage stack and spec that they will continue development
>>> on, which can not be expected to be compatible with the OCSSD 1.2 that
>>> is implemented in the lightnvm subsystem.
>> There are 1.2 devices out there using the current stack with no changes. >
> 
> Yes, obviously, and they should continue to work. Which this patch doesn't change.
> 
>>>>> I don't care about the performance, I
>>>>> care about being easy to maintain, so it doesn't borg me down in the
>>>>> future.
>>>> This should be stated clear in the commit message.
>>>>> Back of the envelope calculation for a 64 die SSD with 1024 blocks per
>>>>> die, and 60us read time, will take 4 seconds to scan if all chunks are
>>>>> free, a worst case something like ~10 seconds. -> Not a problem for
>>>>> me.
>>>> Worst case is _much_ worse than 10s if you need to scan the block to
>>>> find the write pointer. We are talking minutes.
>>> 
>>> I think you may be assuming that all blocks are open. My assumption is
>>> that this is very rare (given the NAND characteristics). At most a
>>> couple of blocks may be open per die. That leads me to the time
>>> quoted.
>> Worst case is worst case, no assumptions.
>>>> At least make the recovery reads asynchronous. It is low hanging fruit
>>>> and will help the average case significantly.
>>>>>> Also, the recovery scheme in pblk will change significantly by doing
>>>>>> this, so I assume you will send a followup patchset reimplementing
>>>>>> recovery for the 1.2 path?
>>>>> 
>>>>> The 1.2 path shouldn't be necessary after this. That is the idea of
>>>>> this work. Obviously, the set bad block interface will have to
>>>>> preserved and called.
>>>> If we base this patch on top of my 2.0 recovery, we will still need to
>>>> make changes to support all 1.2 corner cases.
>>>> How do you want to do it? We get this patch in shape and I rebase on top
>>>> or the other way around?
>>> 
>>> I'll pull this in when you're tested it with your 1.2 implementation.
>> Please, address the asynchronous read comment before considering pulling
>> this path. There is really no reason not to improve this.
> 
> I'll accept patches, but I won't spend time on it. Please let me know if you have other comments.

Your choice to ignore my comment on a RFC at this early stage of the
4.20 window.

In any case, I tested on our 1.2 devices and the patch has passed
functionality.

Please do not add reviewed-by or tested-by tags with my name as I do not
back the performance implications of the current implementation (on an
otherwise good cleanup patch).

Javier



Download attachment "signature.asc" of type "application/pgp-signature" (834 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ