linux-kernel - Re: [PATCH] lightnvm: pblk: Introduce hot-cold data separation


Message-Id: <139AF16B-E69C-4AA5-A9AC-38576BB9BD4B@javigon.com>
Date:   Wed, 1 May 2019 08:19:16 +0200
From:   Javier González <javier@...igon.com>
To:     Heiner Litz <hlitz@...c.edu>
Cc:     "Konopko, Igor J" <igor.j.konopko@...el.com>,
        Matias Bjørling <mb@...htnvm.io>,
        Hans Holmberg <hans.holmberg@...xlabs.com>,
        linux-block@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] lightnvm: pblk: Introduce hot-cold data separation

> On 26 Apr 2019, at 18.23, Heiner Litz <hlitz@...c.edu> wrote:
> 
> Nice catch Igor, I hadn't thought of that.
> 
> Nevertheless, here is what I think: In the absence of a flush we don't
> need to enforce ordering so we don't care about recovering the older
> gc'ed write. If we completed a flush after the user write, we should
> have already invalidated the gc mapping and hence will not recover it.
> Let me know if I am missing something.

I think that this problem is orthogonal to a flush on the user path. For example:

   - Write to LBA0 + completion to host
   - […]
   - GC LBA0
   - Write to LBA0 + completion to host
   - fsync() + completion
   - Power Failure

When we power up and run recovery in the current implementation, the
old copy of LBA0 (the one rewritten by GC) might end up mapped in the
L2P table, even though the newer user write was completed and flushed.

If we enforce ID ordering for GC lines, this problem goes away: we can
keep ordering lines by ID and then recover them sequentially.
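
To make the dependency on line IDs concrete, here is a toy model of the
replay (not pblk code; the Line class, recover_l2p() and the IDs 5 and 7
are made up for illustration). It assumes recovery sorts lines by seq_id
and lets the last replayed copy of an LBA win. With the same two writes
to LBA0, the result depends only on which line got the higher ID.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    """Hypothetical stand-in for an open line; not the pblk struct."""
    seq_id: int
    kind: str                                   # "user" or "gc"
    writes: list = field(default_factory=list)  # (lba, payload) in write order


def recover_l2p(lines):
    """Replay lines in ascending seq_id; the last replayed copy of an LBA wins."""
    l2p = {}
    for line in sorted(lines, key=lambda l: l.seq_id):
        for lba, payload in line.writes:
            l2p[lba] = (line.seq_id, payload)
    return l2p


# GC line has the higher ID: the stale GC copy is replayed last.
gc_above_user = [Line(5, "user", [(0, "new")]), Line(7, "gc", [(0, "old")])]

# GC line ID is ordered below the user line: the user write is replayed last.
gc_below_user = [Line(5, "gc", [(0, "old")]), Line(7, "user", [(0, "new")])]

stale = recover_l2p(gc_above_user)[0]
fresh = recover_l2p(gc_below_user)[0]
print(stale, fresh)
```

In this model the fsync() makes no difference to the outcome, which is
why I see the problem as independent of the flush.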

Thoughts?

Thanks,
Javier

> 
> On Fri, Apr 26, 2019 at 6:46 AM Igor Konopko <igor.j.konopko@...el.com> wrote:
>> On 26.04.2019 12:04, Javier González wrote:
>>>> On 26 Apr 2019, at 11.11, Igor Konopko <igor.j.konopko@...el.com> wrote:
>>>> 
>>>> On 25.04.2019 07:21, Heiner Litz wrote:
>>>>> Introduce the capability to manage multiple open lines. Maintain one line
>>>>> for user writes (hot) and a second line for gc writes (cold). As user and
>>>>> gc writes still utilize a shared ring buffer, in rare cases a multi-sector
>>>>> write will contain both gc and user data. This is acceptable, as on a
>>>>> tested SSD with minimum write size of 64KB, less than 1% of all writes
>>>>> contain both hot and cold sectors.
>>>> 
>>>> Hi Heiner
>>>> 
>>>> Generally I really like these changes. I have been thinking about something similar for a while, so it is very good to see this patch.
>>>> 
>>>> I have one question about this patch, since it is not very clear to me: how do you ensure data integrity in the following scenario?
>>>> - we have open line X for user data and line Y for GC
>>>> - GC writes LBA=N to line Y
>>>> - user writes LBA=N to line X
>>>> - we have a power failure when neither line X nor line Y has been written completely
>>>> - during pblk creation we execute OOB metadata recovery
>>>> And here is the question: how do we distinguish whether LBA=N from line Y or LBA=N from line X is the valid one?
>>>> Lines X and Y might have seq_ids in either descending or ascending order, which would create two possible scenarios too.
>>>> 
>>>> Thanks
>>>> Igor
>>> 
>>> You are right, I think this is possible in the current implementation.
>>> 
>>> We need an extra constraint so that we only GC lines above the GC line
>>> ID. This way, when we order lines on recovery, we can guarantee
>>> consistency. This potentially means that we would need several open
>>> lines for GC to avoid padding in case this constraint forces us to
>>> choose a line with an ID higher than the GC line ID.
>>> 
>>> What do you think?
>> 
>> I'm not sure yet about your approach, I need to think and analyze this a
>> little more.
>> 
>> I also believe that we probably need to ensure that the current user
>> data line seq_id is always above the current GC line seq_id, or
>> something like that. We then also cannot GC any data from lines which
>> are still open, but I believe that this is the case even right now.
>> 
>>> Thanks,
>>> Javier
