Message-ID: <CAJfpegvpnQMSRU+TW4J5+F+3KiAj8J_m+OjNrnh7f2X9DZp2Ag@mail.gmail.com>
Date: Fri, 18 Jun 2021 10:18:21 +0200
From: Miklos Szeredi <miklos@...redi.hu>
To: Michael Stapelberg <stapelberg+linux@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-mm <linux-mm@...ck.org>,
linux-fsdevel@...r.kernel.org, Tejun Heo <tj@...nel.org>,
Dennis Zhou <dennis@...nel.org>, Jens Axboe <axboe@...nel.dk>,
Roman Gushchin <guro@...com>,
Johannes Thumshirn <johannes.thumshirn@....com>,
Jan Kara <jack@...e.cz>, Song Liu <song@...nel.org>,
David Sterba <dsterba@...e.com>
Subject: Re: [PATCH] backing_dev_info: introduce min_bw/max_bw limits
On Thu, 17 Jun 2021 at 11:53, Michael Stapelberg
<stapelberg+linux@...gle.com> wrote:
>
> These new knobs allow e.g. FUSE file systems to guide kernel memory
> writeback bandwidth throttling.
>
> Background:
>
> When using mmap(2) to read/write files, the page-writeback code tries to
> measure how quick file system backing devices (BDI) are able to write data,
> so that it can throttle processes accordingly.
>
> Unfortunately, certain usage patterns, such as linkers (tested with GCC,
> but also the Go linker) seem to hit an unfortunate corner case when writing
> their large executable output files: the kernel only ever measures
> the (non-representative) rising slope of the starting bulk write, but the
> whole file write is already over before the kernel could possibly measure
> the representative steady-state.
>
> As a consequence, with each program invocation hitting this corner case,
> the FUSE write bandwidth steadily sinks in a downward spiral, until it
> eventually reaches 0 (!). This results in the kernel heavily throttling
> page dirtying in programs trying to write to FUSE, which in turn manifests
> itself in slow or even entirely stalled linker processes.
>
> Change:
>
> This commit adds 2 knobs which allow avoiding this situation entirely on a
> per-file-system basis by restricting the minimum/maximum bandwidth.
This looks like a bug in the dirty throttling heuristics that may be
affecting multiple FUSE-based filesystems.
Ideally the solution should be a fix to those heuristics rather than adding
more knobs.
Is there a fundamental reason why that can't be done? Maybe the
heuristics need to detect the fact that steady state has not been
reached, and not modify the bandwidth in that case, or something along
those lines.
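
A rough sketch of that idea, in Python for brevity and not the kernel's
actual code: the estimator below only folds a sample into its bandwidth
estimate when the measurement interval was long enough to count as steady
state. The class name, the min_window threshold and the smoothing weight are
all hypothetical, and how steady state should really be detected is exactly
the open question.

```python
from dataclasses import dataclass


@dataclass
class BandwidthEstimator:
    """Toy write-bandwidth estimator (illustrative only, not kernel code)."""

    estimate: float            # current estimate, bytes per second
    min_window: float = 0.2    # hypothetical: shortest interval (s) trusted as steady state
    weight: float = 0.25       # hypothetical smoothing weight for new samples

    def update(self, written: int, elapsed: float) -> float:
        # A burst that ended before a full window elapsed only shows the
        # ramp-up phase, so leave the estimate untouched in that case.
        if elapsed < self.min_window:
            return self.estimate
        sample = written / elapsed
        # Move the estimate a fraction of the way toward the new sample.
        self.estimate += self.weight * (sample - self.estimate)
        return self.estimate


est = BandwidthEstimator(estimate=100e6)
short = est.update(written=1_000_000, elapsed=0.05)   # too short: ignored
steady = est.update(written=40_000_000, elapsed=0.5)  # long enough: applied
```

With a rule like this, a series of short linker writes leaves the estimate
where it was instead of dragging it towards zero.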
Thanks,
Miklos