Message-ID: <20171215020612.GF35234@jaegeuk-macbookpro.roam.corp.google.com>
Date:   Thu, 14 Dec 2017 18:06:12 -0800
From:   Jaegeuk Kim <jaegeuk@...nel.org>
To:     Hyunchul Lee <hyc.lee@...il.com>
Cc:     Chao Yu <yuchao0@...wei.com>, Chao Yu <chao@...nel.org>,
        Jens Axboe <axboe@...nel.dk>, linux-kernel@...r.kernel.org,
        linux-f2fs-devel@...ts.sourceforge.net, kernel-team@....com,
        linux-fsdevel@...r.kernel.org, Hyunchul Lee <cheol.lee@....com>
Subject: Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block
 layer for bufferd write

On 12/14, Hyunchul Lee wrote:
> Hi Jaegeuk,
> 
> I need your comment about the fs_iohint mount option.
> 
> a) w/o fs_iohint, propagate user hints to low layer.
> b) w/ fs_iohint, ignore user hints, and use hints which are generated
> by F2FS.
> 
> Chao suggests this option because user hints are more accurate than
> the file system's.
> 
> This is reasonable, but I have some concerns about this option.
> The first is that blocks of a segment could have different hints. This
> could make GC less effective.
> The second is whether the separation between LIFE_MEDIUM and LIFE_LONG
> is really needed. I think that the difference between them is a little
> ambiguous for users, and LIFE_SHORT and LIFE_EXTREME are converted to
> different hints by F2FS.

I think what we can really do is map the many user hints onto our 3 DATA
logs, as rw_hint_to_seg_type() does, since they are just hints for user data.
Then, we can decide how to preserve them as much as possible, since we also
have other filesystem metadata such as meta and nodes. In addition, I don't
think we have to keep the original user hints if that messes up the F2FS
logs.
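
As a rough illustration of that mapping (a sketch only, not the actual F2FS
code), the function below folds the six user hints into three DATA logs the
same way the tables further down do. The enum rw_hint here is a local copy of
the kernel's values, and enum data_log and user_hint_to_data_log() are
hypothetical names made up for this example.

```c
#include <assert.h>

/* Local mirror of the kernel's enum rw_hint (assumed, not taken from headers). */
enum rw_hint {
	WRITE_LIFE_NOT_SET = 0,
	WRITE_LIFE_NONE,
	WRITE_LIFE_SHORT,
	WRITE_LIFE_MEDIUM,
	WRITE_LIFE_LONG,
	WRITE_LIFE_EXTREME,
};

/* Hypothetical names for the three F2FS DATA logs. */
enum data_log { HOT_DATA, WARM_DATA, COLD_DATA };

/* Fold a user-supplied lifetime hint into one of the three DATA logs. */
enum data_log user_hint_to_data_log(enum rw_hint hint)
{
	assert(hint <= WRITE_LIFE_EXTREME);	/* reject out-of-range values */

	switch (hint) {
	case WRITE_LIFE_SHORT:
		return HOT_DATA;	/* short-lived data goes to the hot log */
	case WRITE_LIFE_EXTREME:
		return COLD_DATA;	/* very long-lived data goes to the cold log */
	default:
		return WARM_DATA;	/* NOT_SET, NONE, MEDIUM and LONG share one log */
	}
}
```

Only two hints get a log of their own; everything else lands in WARM_DATA,
which is why MEDIUM and LONG cannot be told apart once they are in a segment.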

With that in mind, I can think of the cases below. In particular, if users want
to keep their io_hints, we'd better recommend using direct_io w/o fs_iohints.
In order to keep this policy, I think fs_iohints would be better as a
feature set by mkfs.f2fs and exposed to users through sysfs entries.

1) w/ fs_iohints

User                        F2FS               Block
-------------------------------------------------------------------
                            Meta               WRITE_LIFE_MEDIUM
                            HOT_NODE           WRITE_LIFE_NOT_SET
                            WARM_NODE          -'
                            COLD_NODE          WRITE_LIFE_NONE
ioctl(cold)                 COLD_DATA          WRITE_LIFE_EXTREME
extension list              -'                 -'
WRITE_LIFE_EXTREME          -'                 -'
WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT

-- buffered_io
WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_LONG
WRITE_LIFE_NONE             -'                 -'
WRITE_LIFE_MEDIUM           -'                 -'
WRITE_LIFE_LONG             -'                 -'

-- direct_io (Not recommendable)
WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
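
The table for case 1) could be sketched roughly as below. This is an
illustration under assumptions, not the patch itself: enum rw_hint is a local
copy of the kernel's values, and enum seg_type and
block_hint_with_fs_iohints() are hypothetical names. The segment type is
assumed to have been chosen already (from ioctl(cold), the extension list, or
the user hint).

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirror of the kernel's enum rw_hint (assumed, not taken from headers). */
enum rw_hint {
	WRITE_LIFE_NOT_SET = 0,
	WRITE_LIFE_NONE,
	WRITE_LIFE_SHORT,
	WRITE_LIFE_MEDIUM,
	WRITE_LIFE_LONG,
	WRITE_LIFE_EXTREME,
};

/* Hypothetical names for the F2FS logs listed in the table. */
enum seg_type {
	SEG_HOT_DATA, SEG_WARM_DATA, SEG_COLD_DATA,
	SEG_HOT_NODE, SEG_WARM_NODE, SEG_COLD_NODE,
	SEG_META,
};

/* Hint handed to the block layer when fs_iohints is enabled. */
enum rw_hint block_hint_with_fs_iohints(enum seg_type seg, bool direct_io,
					enum rw_hint user_hint)
{
	switch (seg) {
	case SEG_META:
		return WRITE_LIFE_MEDIUM;
	case SEG_HOT_NODE:
	case SEG_WARM_NODE:
		return WRITE_LIFE_NOT_SET;
	case SEG_COLD_NODE:
		return WRITE_LIFE_NONE;
	case SEG_COLD_DATA:
		return WRITE_LIFE_EXTREME;
	case SEG_HOT_DATA:
		return WRITE_LIFE_SHORT;
	case SEG_WARM_DATA:
		/* buffered writes collapse to LONG; direct writes keep the user hint */
		return direct_io ? user_hint : WRITE_LIFE_LONG;
	}
	assert(!"unknown segment type");
	return WRITE_LIFE_NOT_SET;
}
```

Only WARM_DATA depends on the I/O path in this sketch; every other log gets a
fixed hint, which keeps the blocks of one segment on a single hint for
buffered writes.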

2) w/o fs_iohints

User                        F2FS               Block
-------------------------------------------------------------------
                            Meta               -
                            HOT_NODE           -
                            WARM_NODE          -
                            COLD_NODE          -
ioctl(cold)                 COLD_DATA          -
extension list              -'                 -

-- buffered_io
WRITE_LIFE_EXTREME          COLD_DATA          -
WRITE_LIFE_SHORT            HOT_DATA           -
WRITE_LIFE_NOT_SET          WARM_DATA          -
WRITE_LIFE_NONE             -'                 -
WRITE_LIFE_MEDIUM           -'                 -
WRITE_LIFE_LONG             -'                 -

-- direct_io
WRITE_LIFE_EXTREME          COLD_DATA          WRITE_LIFE_EXTREME
WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
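
Case 2) could be sketched in the same spirit. Again this is only an
illustration: enum rw_hint is a local copy of the kernel's values and
block_hint_without_fs_iohints() is a hypothetical name. The "-" entries are
assumed to mean that no hint is passed down, which is modelled here as
WRITE_LIFE_NOT_SET. The table does not say what happens to ioctl(cold) or
extension-list files under direct_io, so the sketch simply treats them like
any other user data.

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirror of the kernel's enum rw_hint (assumed, not taken from headers). */
enum rw_hint {
	WRITE_LIFE_NOT_SET = 0,
	WRITE_LIFE_NONE,
	WRITE_LIFE_SHORT,
	WRITE_LIFE_MEDIUM,
	WRITE_LIFE_LONG,
	WRITE_LIFE_EXTREME,
};

/*
 * Hint handed to the block layer when fs_iohints is disabled.
 * is_user_data is false for meta and node blocks.
 */
enum rw_hint block_hint_without_fs_iohints(bool is_user_data, bool direct_io,
					   enum rw_hint user_hint)
{
	assert(user_hint <= WRITE_LIFE_EXTREME);	/* reject out-of-range values */

	/* direct writes of user data keep the user's hint untouched */
	if (is_user_data && direct_io)
		return user_hint;

	/* "-" in the table: assumed to mean no hint reaches the block layer */
	return WRITE_LIFE_NOT_SET;
}
```

The user hint still selects the F2FS log for buffered writes in this case; it
just is not forwarded to the block layer.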


Note that I don't much care how the nvme driver manipulates the stream id
for LIFE_NONE or LIFE_NOT_SET, since other drivers can handle them
in different ways. Looking at the definitions, at least, we don't need
to assume that those two are the same at all. For example, if we can exploit
this in the UFS driver, we can pass all the stream ids to the device as
context ids.

Thanks,

> 
> Thanks.
> 
> On 12/12/2017 11:45 AM, Chao Yu wrote:
> > Hi Hyunchul,
> > 
> > On 2017/12/12 10:15, Hyunchul Lee wrote:
> >> Hi Chao,
> >>
> >> On 12/11/2017 10:15 PM, Chao Yu wrote:
> >>> Hi Hyunchul,
> >>>
> >>> On 2017/12/1 16:28, Hyunchul Lee wrote:
> >>>> Hi Chao,
> >>>>
> >>>> On 11/30/2017 04:06 PM, Chao Yu wrote:
> >>>>> Hi Hyunchul,
> >>>>>
> >>>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
> >>>>>> From: Hyunchul Lee <cheol.lee@....com>
> >>>>>>
> >>>>>> This implements which hint is passed down to the block layer
> >>>>>> for data from each specific segment type.
> >>>>>>
> >>>>>> segment type                     hints
> >>>>>> ------------                     -----
> >>>>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
> >>>>>> WARM_DATA                        WRITE_LIFE_NONE
> >>>>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
> >>>>>> HOT_DATA                         WRITE_LIFE_MEDIUM
> >>>>>> META_DATA                        WRITE_LIFE_SHORT
> >>>>>
> >>>>> Just noticed: if our user does not give a hint via ioctl, f2fs can
> >>>>> provide a hint to the lower layer according to its hot/cold separation
> >>>>> ability, and that will be okay. But once the user gives a hint, which
> >>>>> may be more accurate than the filesystem's, the hint converted by f2fs
> >>>>> may be wrong.
> >>>>>
> >>>>> So what do you think of adding an option to control whether the
> >>>>> filesystem can convert the hint given by the user?
> >>>>>
> >>>>
> >>>> I think it is okay for LIFE_SHORT and LIFE_EXTREME, because they are
> >>>> converted to different hints.
> >>>
> >>> What I mean is introducing a mount option, e.g. fs_iohint,
> >>> a) w/o fs_iohint, propagate file/inode io_hint to low layer.
> >>> b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
> >>> with filesystem's private rule.
> >>>
> >>
> >> Okay, I will implement this option and send this patch again.
> > 
> > Let's wait for Jaegeuk's comments first?
> > 
> >>
> >> Without fs_iohint, even if data blocks are moved due to GC,
> >> we should keep user hints. And if user hints are not given,
> >> no hints are passed down to the block layer, right?
> > 
> > Hmm.. that will be a problem. IMO, we can store the user's last io_hint in the
> > inode layout, so later when we trigger GC, we can use the last io_hint in the
> > inode rather than giving no hint or the fs' hint.
> > 
> > I think we need to discuss with the original author of the IO hint what the IO
> > hint policy is when the filesystem moves blocks by itself after the inode has
> > been released in the system.
> > 
> > Thanks,
> > 
> >>
> >> Thank you for comments.
> >>
> >>> Thanks,
> >>>
> >>>>
> >>>> file hint      segment type        io hint
> >>>> ---------      ------------        -------
> >>>> LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
> >>>> LIFE_MEDIUM    WARM_DATA           LIFE_NONE
> >>>> LIFE_LONG      WARM_DATA           LIFE_NONE
> >>>> LIFE_EXTREME   COLD_DATA           LIFE_EXTREME
> >>>>
> >>>> the problem is that LIFE_MEDIUM and LIFE_LONG are converted to
> >>>> the same hint, LIFE_NONE. I am not sure that the separation between
> >>>> LIFE_MEDIUM and LIFE_LONG is really needed, because I guess that the
> >>>> difference between them is a little ambiguous for users, and if a
> >>>> WARM_DATA segment has two different hints, it can make GC inefficient.
> >>>>
> >>>> I would like to hear your thoughts about this.
> >>>>
> >>>> Thanks.
> >>>>
> >>>>> Thanks,
> >>>>>
> >>>>>
> >>>>
> >>>> ------------------------------------------------------------------------------
> >>>> Check out the vibrant tech community on one of the world's most
> >>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> >>>> _______________________________________________
> >>>> Linux-f2fs-devel mailing list
> >>>> Linux-f2fs-devel@...ts.sourceforge.net
> >>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>>>
> >>>
> >>
> >> .
> >>
> > 
> > 
> > 
