Message-ID: <CAJfpegvpnQMSRU+TW4J5+F+3KiAj8J_m+OjNrnh7f2X9DZp2Ag@mail.gmail.com>
Date:   Fri, 18 Jun 2021 10:18:21 +0200
From:   Miklos Szeredi <miklos@...redi.hu>
To:     Michael Stapelberg <stapelberg+linux@...gle.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-mm <linux-mm@...ck.org>,
        linux-fsdevel@...r.kernel.org, Tejun Heo <tj@...nel.org>,
        Dennis Zhou <dennis@...nel.org>, Jens Axboe <axboe@...nel.dk>,
        Roman Gushchin <guro@...com>,
        Johannes Thumshirn <johannes.thumshirn@....com>,
        Jan Kara <jack@...e.cz>, Song Liu <song@...nel.org>,
        David Sterba <dsterba@...e.com>
Subject: Re: [PATCH] backing_dev_info: introduce min_bw/max_bw limits

On Thu, 17 Jun 2021 at 11:53, Michael Stapelberg
<stapelberg+linux@...gle.com> wrote:
>
> These new knobs allow e.g. FUSE file systems to guide kernel memory
> writeback bandwidth throttling.
>
> Background:
>
> When using mmap(2) to read/write files, the page-writeback code tries to
> measure how quick file system backing devices (BDI) are able to write data,
> so that it can throttle processes accordingly.
>
> Unfortunately, certain usage patterns, such as linkers (tested with GCC,
> but also the Go linker) seem to hit an unfortunate corner case when writing
> their large executable output files: the kernel only ever measures
> the (non-representative) rising slope of the starting bulk write, but the
> whole file write is already over before the kernel could possibly measure
> the representative steady-state.
>
> As a consequence, with each program invocation hitting this corner case,
> the FUSE write bandwidth steadily sinks in a downward spiral, until it
> eventually reaches 0 (!). This results in the kernel heavily throttling
> page dirtying in programs trying to write to FUSE, which in turn manifests
> itself in slow or even entirely stalled linker processes.
>
> Change:
>
> This commit adds 2 knobs which allow avoiding this situation entirely on a
> per-file-system basis by restricting the minimum/maximum bandwidth.


This looks like a bug in the dirty throttling heuristics that may be
affecting multiple FUSE-based filesystems.

Ideally, the solution should fix those heuristics rather than add more knobs.

Is there a fundamental reason why that can't be done? Maybe the
heuristics need to detect the fact that steady state has not been
reached, and not modify the bandwidth in that case, or something along
those lines.
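
To make the suggestion concrete, here is a rough toy model in Python. It is
not kernel code and does not mirror the actual page-writeback estimator: the
function name update_bandwidth, the min_window_s threshold, and the
smoothing weight are all assumptions chosen for illustration. It only shows
the shape of the idea: ignore a sample whose measurement window is too short
to have reached steady state, instead of folding it into the estimate.

```python
def update_bandwidth(avg_bw, bytes_written, elapsed_s,
                     min_window_s=0.2, weight=0.125):
    """Return a new bandwidth estimate in bytes/second (toy model).

    avg_bw        -- current smoothed estimate
    bytes_written -- bytes written back during the sample window
    elapsed_s     -- length of the sample window in seconds
    min_window_s  -- hypothetical threshold below which the sample is
                     treated as "steady state not reached"
    weight        -- hypothetical smoothing factor for the moving average
    """
    if elapsed_s < min_window_s:
        # The write finished before a representative window elapsed:
        # keep the previous estimate untouched.
        return avg_bw
    sample = bytes_written / elapsed_s
    # Exponential moving average towards the new sample.
    return avg_bw + weight * (sample - avg_bw)


def simulate(bursts, min_window_s):
    """Feed repeated short, non-representative bursts into the estimator."""
    bw = 100e6  # start at 100 MB/s
    for _ in range(bursts):
        # Each burst only observes the slow rising slope:
        # 1 MB written in 0.1 s, i.e. a 10 MB/s sample.
        bw = update_bandwidth(bw, 1_000_000, 0.1, min_window_s=min_window_s)
    return bw


if __name__ == "__main__":
    print("unguarded:", simulate(50, min_window_s=0.0))
    print("guarded:  ", simulate(50, min_window_s=0.2))
```

In this model the unguarded estimate drifts towards the unrepresentative
sample with every short burst, while the guarded one keeps its previous
value. Picking a sensible threshold for a real device is the hard part and
is not addressed here.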

Thanks,
Miklos
