Date:   Mon, 18 Dec 2017 16:28:27 +0900
From:   Hyunchul Lee <hyc.lee@...il.com>
To:     Jaegeuk Kim <jaegeuk@...nel.org>, Chao Yu <yuchao0@...wei.com>,
        Chao Yu <chao@...nel.org>
CC:     Jens Axboe <axboe@...nel.dk>, linux-kernel@...r.kernel.org,
        linux-f2fs-devel@...ts.sourceforge.net, kernel-team@....com,
        linux-fsdevel@...r.kernel.org, Hyunchul Lee <cheol.lee@....com>
Subject: Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block layer
 for bufferd write

Hi Jaegeuk,

Agreed. If Chao agrees with this policy, I will implement it.

Thanks for the comment.
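
To check my understanding of the tables below, here is a rough user-space
C sketch of the user-data rows only. The names (life_hint, data_seg,
user_hint_to_seg, data_block_hint) are hypothetical and this is not kernel
code. I assume that "-" in the Block column means no hint reaches the block
layer, and model it as LIFE_NOT_SET. The ioctl(cold) and extension list
rows and the meta/node rows are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the write-life hints discussed in this thread. */
enum life_hint {
	LIFE_NOT_SET,
	LIFE_NONE,
	LIFE_SHORT,
	LIFE_MEDIUM,
	LIFE_LONG,
	LIFE_EXTREME,
};

/* The three F2FS DATA logs. */
enum data_seg { HOT_DATA, WARM_DATA, COLD_DATA };

/* User hint -> DATA log; the same in both tables. */
static enum data_seg user_hint_to_seg(enum life_hint user)
{
	switch (user) {
	case LIFE_SHORT:
		return HOT_DATA;
	case LIFE_EXTREME:
		return COLD_DATA;
	default:
		/* NOT_SET, NONE, MEDIUM and LONG all share WARM_DATA. */
		return WARM_DATA;
	}
}

/* Hint handed to the block layer for user data. */
static enum life_hint data_block_hint(enum life_hint user, bool fs_iohints,
				      bool direct_io)
{
	if (!fs_iohints) {
		/*
		 * Case 2: direct_io keeps the user hint. Buffered I/O shows
		 * "-", assumed here to mean that nothing is passed down.
		 */
		return direct_io ? user : LIFE_NOT_SET;
	}

	/* Case 1: the hint follows the DATA log the block lands in. */
	switch (user_hint_to_seg(user)) {
	case HOT_DATA:
		return LIFE_SHORT;
	case COLD_DATA:
		return LIFE_EXTREME;
	default:
		/* WARM_DATA: buffered -> LONG, direct_io -> user hint. */
		return direct_io ? user : LIFE_LONG;
	}
}
```

For example, data_block_hint(LIFE_MEDIUM, true, false) gives LIFE_LONG,
while the same hint with direct_io is passed through as LIFE_MEDIUM.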

On 12/15/2017 11:06 AM, Jaegeuk Kim wrote:
> On 12/14, Hyunchul Lee wrote:
>> Hi Jaegeuk,
>>
>> I need your comment about the fs_iohint mount option.
>>
>> a) w/o fs_iohint, propagate user hints to low layer.
>> b) w/ fs_iohint, ignore user hints, and use hints which are generated
>> by F2FS.
>>
>> Chao suggested this option because user hints are more accurate than
>> the file system's.
>>
>> This is reasonable, but I have some concerns about this option.
>> The first is that blocks of a segment could have different hints. This
>> could make GC less effective.
>> The second is whether the separation between LIFE_MEDIUM and LIFE_LONG is
>> really needed. I think the difference between them is a little ambiguous
>> for users, and LIFE_SHORT and LIFE_EXTREME are converted to different
>> hints by F2FS.
> 
> I think what we can really do is assign the many user hints to our 3 DATA
> logs, as rw_hint_to_seg_type() does, since they are just hints for user
> data. Then we can decide how to keep them as much as possible, since we
> have other filesystem metadata such as meta and nodes. In addition, I
> don't think we have to keep the original user hints, which would mess up
> the F2FS logs.
> 
> With that in mind, I can think of the cases below. In particular, if users
> want to keep their io_hints, we'd better recommend using direct_io w/o
> fs_iohints. To keep this policy, I think fs_iohints would be better as a
> feature set by mkfs.f2fs and exposed to users via sysfs entries.
> 
> 1) w/ fs_iohints
> 
> User                        F2FS               Block
> -------------------------------------------------------------------
>                             Meta               WRITE_LIFE_MEDIUM
>                             HOT_NODE           WRITE_LIFE_NOT_SET
>                             WARM_NODE          -'
>                             COLD_NODE          WRITE_LIFE_NONE
> ioctl(cold)                 COLD_DATA          WRITE_LIFE_EXTREME
> extension list              -'                 -'
> WRITE_LIFE_EXTREME          -'                 -'
> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
> 
> -- buffered_io
> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_LONG
> WRITE_LIFE_NONE             -'                 -'
> WRITE_LIFE_MEDIUM           -'                 -'
> WRITE_LIFE_LONG             -'                 -'
> 
> -- direct_io (not recommended)
> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
> WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
> WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
> WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
> 
> 2) w/o fs_iohints
> 
> User                        F2FS               Block
> -------------------------------------------------------------------
>                             Meta               -
>                             HOT_NODE           -
>                             WARM_NODE          -
>                             COLD_NODE          -
> ioctl(cold)                 COLD_DATA          -
> extension list              -'                 -
> 
> -- buffered_io
> WRITE_LIFE_EXTREME          COLD_DATA          -
> WRITE_LIFE_SHORT            HOT_DATA           -
> WRITE_LIFE_NOT_SET          WARM_DATA          -
> WRITE_LIFE_NONE             -'                 -
> WRITE_LIFE_MEDIUM           -'                 -
> WRITE_LIFE_LONG             -'                 -
> 
> -- direct_io
> WRITE_LIFE_EXTREME          COLD_DATA          WRITE_LIFE_EXTREME
> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
> WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
> WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
> WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
> 
> 
> Note that I don't much care about how the nvme driver manipulates the
> stream id for LIFE_NONE or LIFE_NOT_SET, since other drivers can handle
> them in different ways. Looking at the definition, at least, we don't need
> to assume that they are the same at all. For example, if we can exploit
> this in the UFS driver, we can pass all the stream ids to the device as
> context ids.
> 
> Thanks,
> 
>>
>> Thanks.
>>
>> On 12/12/2017 11:45 AM, Chao Yu wrote:
>>> Hi Hyunchul,
>>>
>>> On 2017/12/12 10:15, Hyunchul Lee wrote:
>>>> Hi Chao,
>>>>
>>>> On 12/11/2017 10:15 PM, Chao Yu wrote:
>>>>> Hi Hyunchul,
>>>>>
>>>>> On 2017/12/1 16:28, Hyunchul Lee wrote:
>>>>>> Hi Chao,
>>>>>>
>>>>>> On 11/30/2017 04:06 PM, Chao Yu wrote:
>>>>>>> Hi Hyunchul,
>>>>>>>
>>>>>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
>>>>>>>> From: Hyunchul Lee <cheol.lee@....com>
>>>>>>>>
>>>>>>>> This defines which hint is passed down to the block layer
>>>>>>>> for data from each segment type.
>>>>>>>>
>>>>>>>> segment type                     hints
>>>>>>>> ------------                     -----
>>>>>>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
>>>>>>>> WARM_DATA                        WRITE_LIFE_NONE
>>>>>>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
>>>>>>>> HOT_DATA                         WRITE_LIFE_MEDIUM
>>>>>>>> META_DATA                        WRITE_LIFE_SHORT
>>>>>>>
>>>>>>> Just noticed: if the user does not give a hint via ioctl, f2fs can
>>>>>>> provide a hint to the lower layer according to its hot/cold
>>>>>>> separation ability, and that will be okay. But once the user gives a
>>>>>>> hint, which may be more accurate than the filesystem's, the hint
>>>>>>> converted by f2fs may be wrong.
>>>>>>>
>>>>>>> So what do you think of adding an option to control whether the
>>>>>>> filesystem can convert the hint the user gave?
>>>>>>>
>>>>>>
>>>>>> I think it is okay for LIFE_SHORT and LIFE_EXTREME, because they are
>>>>>> converted to different hints.
>>>>>
>>>>> What I mean is introducing a mount option, e.g. fs_iohint,
>>>>> a) w/o fs_iohint, propagate file/inode io_hint to low layer.
>>>>> b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
>>>>> by the filesystem's private rule.
>>>>>
>>>>
>>>> Okay, I will implement this option and send this patch again.
>>>
>>> Let's wait for Jaegeuk's comments first?
>>>
>>>>
>>>> Without fs_iohint, even if data blocks are moved due to GC,
>>>> we should keep user hints. And if user hints are not given,
>>>> no hints are passed down to the block layer, right?
>>>
>>> Hmm.. that will be a problem. IMO, we can store the last user io_hint in
>>> the inode layout, so that later, when we trigger GC, we can use the last
>>> io_hint in the inode rather than giving no hint or the fs' hint.
>>>
>>> I think this needs to be discussed with the original author of IO hint:
>>> what is the IO hint policy when the filesystem moves blocks by itself
>>> after the inode has been released in the system?
>>>
>>> Thanks,
>>>
>>>>
>>>> Thank you for comments.
>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> file hint      segment type        io hint
>>>>>> ---------      ------------        -------
>>>>>> LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
>>>>>> LIFE_MEDIUM    WARM_DATA           LIFE_NONE
>>>>>> LIFE_LONG      WARM_DATA           LIFE_NONE
>>>>>> LIFE_EXTREME   COLD_DATA           LIFE_EXTREME
>>>>>>
>>>>>> The problem is that LIFE_MEDIUM and LIFE_LONG are converted to
>>>>>> the same hint, LIFE_NONE. I am not sure that the separation between
>>>>>> LIFE_MEDIUM and LIFE_LONG is really needed, because I guess that the
>>>>>> difference between them is a little ambiguous for users, and if a
>>>>>> WARM_DATA segment has two different hints, it can make GC inefficient.
>>>>>>
>>>>>> I would like to hear your thoughts on this.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>> Check out the vibrant tech community on one of the world's most
>>>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>>>>> _______________________________________________
>>>>>> Linux-f2fs-devel mailing list
>>>>>> Linux-f2fs-devel@...ts.sourceforge.net
>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>>
>>>>>
>>>>
>>>> .
>>>>
>>>
>>>
>>>
> 
> 
