Message-Id: <20210413043733.28880-1-nanich.lee@samsung.com>
Date: Tue, 13 Apr 2021 13:37:33 +0900
From: Changheun Lee <nanich.lee@...sung.com>
To: ming.lei@...hat.com
Cc: Damien.LeMoal@....com, Johannes.Thumshirn@....com,
asml.silence@...il.com, axboe@...nel.dk, bvanassche@....org,
gregkh@...uxfoundation.org, hch@...radead.org,
jisoo2146.oh@...sung.com, junho89.kim@...sung.com,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
mj0123.lee@...sung.com, nanich.lee@...sung.com, osandov@...com,
patchwork-bot@...nel.org, seunghwan.hyun@...sung.com,
sookwan7.kim@...sung.com, tj@...nel.org, tom.leiming@...il.com,
woosung2.lee@...sung.com, yt0928.kim@...sung.com
Subject: Re: [RESEND,v5,1/2] bio: limit bio max size
> On Sun, Apr 11, 2021 at 10:13:01PM +0000, Damien Le Moal wrote:
> > On 2021/04/09 23:47, Bart Van Assche wrote:
> > > On 4/7/21 3:27 AM, Damien Le Moal wrote:
> > >> On 2021/04/07 18:46, Changheun Lee wrote:
> > >>> I'll prepare a new patch as you recommend. It will add setting of
> > >>> limit_bio_size automatically when the queue max sectors is determined.
> > >>
> > >> Please do that in the driver for the HW that benefits from it. Do not do this
> > >> for all block devices.
> > >
> > > Hmm ... is it ever useful to build a bio with a size that exceeds
> > > max_hw_sectors when submitting a bio directly to a block device, or in
> > > other words, if no stacked block driver sits between the submitter and
> > > the block device? Am I perhaps missing something?
> >
> > Device performance-wise, the benefits are certainly not obvious to me either.
> > But for very fast block devices, I think the CPU overhead of building many
> > smaller BIOs may be significant compared to splitting a large BIO into multiple
> > requests. Though it may be good to revisit this with some benchmark numbers.
>
> This patch tries to address the issue[1] in do_direct_IO() in which
> Changheun observed that other operations take time between adding pages
> to the bio.
>
> However, do_direct_IO() just does the following besides adding pages to
> the bio and submitting the bio:
>
> - retrieves pages in batches (pinning 64 pages from the VM each time), and
>
> - retrieves the block mapping (get_more_blocks()), which is usually done
> only a few times for 32MB; for a new mapping, clean_bdev_aliases() may
> take a bit of time.
>
> If there isn't system memory pressure, pinning 64 pages won't be slow, but
> get_more_blocks() may take a bit of time.
>
> Changheun, can you check whether get_more_blocks() is called multiple times
> for submitting 32MB in your test?
It is called almost only once.
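
For reference, here is a rough userspace model of how I understand the
do_direct_IO() loop behaves for a 32MB request. This is only an illustrative
sketch under my assumptions - 4KB pages, 64-page pin batches, one contiguous
block mapping - not the actual fs/direct-io.c code:

/*
 * Illustrative userspace model only (not kernel code): counts how often
 * each step of a do_direct_IO()-style loop runs for one 32MB direct I/O.
 * Assumes 4KB pages, 64-page pin batches, and one contiguous mapping.
 */
#include <stdio.h>

int main(void)
{
	const unsigned long io_bytes  = 32UL << 20;  /* 32MB request */
	const unsigned long page_size = 4096;        /* 4KB pages */
	const unsigned long pin_batch = 64;          /* DIO_PAGES-like batch */

	unsigned long pages = io_bytes / page_size;
	unsigned long page_loops = 0, pin_refills = 0, map_calls = 0;
	unsigned long pinned_left = 0, mapped_left = 0;

	for (unsigned long p = 0; p < pages; p++) {
		if (pinned_left == 0) {        /* refill the pinned-page batch */
			pin_refills++;
			pinned_left = pin_batch;
		}
		if (mapped_left == 0) {        /* get_more_blocks()-style lookup */
			map_calls++;
			mapped_left = pages;   /* one mapping covers the whole 32MB */
		}
		/* one pass per page: add the page to the bio under construction */
		page_loops++;
		pinned_left--;
		mapped_left--;
	}

	printf("per-page loop iterations: %lu\n", page_loops);   /* 8192 */
	printf("page pin batches:         %lu\n", pin_refills);  /* 128 */
	printf("block mapping calls:      %lu\n", map_calls);    /* 1 */
	return 0;
}

With those assumptions the per-page work runs 8,192 times while the mapping
lookup runs roughly once, which matches the single get_more_blocks() call
I see above.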
>
> In my 32MB sync dio f2fs test on an x86_64 VM, one buffer_head mapping can
> hold 32MB, but that is on a freshly created f2fs.
>
> I'd suggest understanding the issue completely before settling on a
> solution.
Thank you for your advice. I'll analyze your point in more detail later. :)
But I think this is a separate issue from finding the main time sink in
do_direct_IO(). I think the excessive looping should be controlled.
8,192 loop iterations in do_direct_IO() - for 32MB - to submit one bio is
too much on a 4KB page system. I want to apply an optional solution to avoid
the excessive looping caused by multipage bvecs.
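
To spell out the arithmetic behind that number (the 1MB figure below is only
an example cap, not necessarily what the patch would pick):

    32MB request / 4KB page size   = 8,192 page-add iterations for one bio
    with a 1MB cap on bio size     = 256 iterations per bio, 32 bios in total

So with a cap, each bio is submitted after far fewer iterations instead of
only after all 8,192.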
Thanks,
Changheun Lee