Message-ID: <20181210154121.GO29289@quack2.suse.cz>
Date: Mon, 10 Dec 2018 16:41:21 +0100
From: Jan Kara <jack@...e.cz>
To: Jens Axboe <axboe@...nel.dk>
Cc: Jan Kara <jack@...e.cz>, Kanchan Joshi <joshi.k@...sung.com>,
linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
tytso@....edu, adilger.kernel@...ger.ca, jack@...e.com,
viro@...iv.linux.org.uk, darrick.wong@...cle.com,
jrdr.linux@...il.com, ebiggers@...gle.com,
jooyoung.hwang@...sung.com, chur.lee@...sung.com,
prakash.v@...sung.com
Subject: Re: [PATCH 2/2] fs/ext4,jbd2: Add support for passing write-hint
with journal.
On Mon 10-12-18 08:17:18, Jens Axboe wrote:
> On 12/10/18 7:12 AM, Jan Kara wrote:
> > On Mon 10-12-18 18:20:04, Kanchan Joshi wrote:
> >> This patch introduces "j_writehint" in the JBD2 journal,
> >> which is set by Ext4 based on the "journal_writehint"
> >> mount option (inspired by "journal_ioprio").
> >
> > Thanks for the patch! It would be good to provide the explanation you have
> > in the cover letter in this patch as well so that it gets recorded in git
> > logs.
> >
> > Also I don't like the fact that users have to set the hint via a mount
> > option for this to be enabled. OTOH the WRITE_LIFE_<foo> hints defined in
> > fs.h are generally supposed to be used by userspace so it's difficult to
> > pick anything if we don't know what the userspace is going to do. I'd argue
> > it's even difficult for the sysadmin to pick any good value even if he
> > actually knows that he might benefit from setting some. Jens, is there
> > some reasonable way for fs to automatically pick some stream value for its
> > journal?
>
> I think we have two options here:
>
> 1) It's _probably_ safe to assume that journal data is short lived. While
> hints are all relative to the specific use case, the size of the journal
> compared to the rest of the drive is most likely very small. Hence a
> default of WRITE_LIFE_SHORT is probably a good idea.

That's what I was thinking as well. But there are some exceptions, like a
heavy DB load where very little metadata is modified (and thus there is
almost no journal IO) compared to the DB data. So journal blocks may
actually have a longer lifetime than data blocks. OTOH if there's little
journal IO, there's no big benefit in specifying a stream for it, so
WRITE_LIFE_SHORT is probably a good default anyway.

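
To make option 1 concrete, here is a minimal userspace sketch of the selection logic, not the actual patch. The enum values mirror the kernel's enum rw_hint; journal_pick_write_hint() and its mount_opt parameter are hypothetical names used only for illustration. It assumes an explicit "journal_writehint" mount option, if given, overrides the default.

```c
#include <assert.h>

/* Values mirror the kernel's enum rw_hint (include/linux/fs.h). */
enum rw_hint {
	WRITE_LIFE_NOT_SET = 0,
	WRITE_LIFE_NONE    = 1,
	WRITE_LIFE_SHORT   = 2,
	WRITE_LIFE_MEDIUM  = 3,
	WRITE_LIFE_LONG    = 4,
	WRITE_LIFE_EXTREME = 5,
};

/*
 * Hypothetical helper: pick the write hint for journal IO.
 * mount_opt is the value parsed from a "journal_writehint" mount option,
 * or WRITE_LIFE_NOT_SET when the option was not given.
 */
static enum rw_hint journal_pick_write_hint(enum rw_hint mount_opt)
{
	assert(mount_opt <= WRITE_LIFE_EXTREME);

	/* An explicit choice by the admin wins. */
	if (mount_opt != WRITE_LIFE_NOT_SET)
		return mount_opt;

	/* Otherwise assume journal blocks are short lived. */
	return WRITE_LIFE_SHORT;
}
```

With this shape the mount option becomes an override rather than a requirement, so a DB-style workload like the one above could still pick a longer-lived hint.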
> 2) We add a specific internal life time hint for fs journals.
>
> #2 makes the most sense to me, but requires a bit more work...

Yeah, #2 would look more natural to me, but I guess it needs some mapping to
what the drive offers, doesn't it?
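
One possible shape of such a mapping, purely as a sketch: WRITE_LIFE_JOURNAL is a hypothetical internal hint that does not exist in the kernel, and hint_to_stream() is an illustrative name. The sketch assumes stream 0 means "no stream", that userspace hints map to stream (hint - 1), and that the journal falls back to the short-lived stream when the drive offers too few streams for a dedicated one.

```c
#include <assert.h>

/* Values 0..5 mirror the kernel's enum rw_hint (include/linux/fs.h). */
enum rw_hint {
	WRITE_LIFE_NOT_SET = 0,
	WRITE_LIFE_NONE    = 1,
	WRITE_LIFE_SHORT   = 2,
	WRITE_LIFE_MEDIUM  = 3,
	WRITE_LIFE_LONG    = 4,
	WRITE_LIFE_EXTREME = 5,
	WRITE_LIFE_JOURNAL = 6,	/* hypothetical fs-internal hint */
};

/*
 * Hypothetical mapping from a write hint to a drive stream ID.
 * nr_streams is the number of streams the drive offers; 0 is returned
 * when the write should not be tagged with any stream.
 */
static unsigned int hint_to_stream(enum rw_hint hint, unsigned int nr_streams)
{
	const unsigned int journal_stream = WRITE_LIFE_JOURNAL - 1;
	unsigned int stream;

	assert(hint <= WRITE_LIFE_JOURNAL);

	if (hint == WRITE_LIFE_NOT_SET || hint == WRITE_LIFE_NONE)
		return 0;

	/* Not enough streams for a dedicated one: share the short-lived stream. */
	if (hint == WRITE_LIFE_JOURNAL && nr_streams < journal_stream)
		hint = WRITE_LIFE_SHORT;

	stream = (unsigned int)hint - 1;

	/* Drive cannot represent this stream at all: send untagged. */
	return stream <= nr_streams ? stream : 0;
}
```

For example, under these assumptions a drive with 8 streams would give the journal stream 5, while a drive with 4 streams would put it in stream 1 together with WRITE_LIFE_SHORT data.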
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR