lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 30 Mar 2022 09:00:46 -0400
From:   Johannes Weiner <hannes@...xchg.org>
To:     CGEL <cgel.zte@...il.com>
Cc:     axboe@...nel.dk, viro@...iv.linux.org.uk,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, akpm@...ux-foundation.org,
        Yang Yang <yang.yang29@....com.cn>,
        Ran Xiaokai <ran.xiaokai@....com.cn>,
        Christoph Hellwig <hch@...radead.org>
Subject: Re: [PATCH] block/psi: make PSI annotations of submit_bio only work
 for file pages

On Wed, Mar 30, 2022 at 08:34:08AM +0000, CGEL wrote:
> On Wed, Mar 23, 2022 at 06:11:03AM +0000, CGEL wrote:
> > On Tue, Mar 22, 2022 at 09:27:58AM -0400, Johannes Weiner wrote:
> > > On Tue, Mar 22, 2022 at 02:47:42AM +0000, CGEL wrote:
> > > > On Mon, Mar 21, 2022 at 10:33:20AM -0400, Johannes Weiner wrote:
> > > > > On Wed, Mar 16, 2022 at 06:39:28AM +0000, cgel.zte@...il.com wrote:
> > > > > > From: Yang Yang <yang.yang29@....com.cn>
> > > > > > 
> > > > > > psi tracks the time spent on submitting the IO of refaulting file pages
> > > > > > and anonymous pages[1]. But after we tracks refaulting anonymous pages
> > > > > > in swap_readpage[2][3], there is no need to track refaulting anonymous
> > > > > > pages in submit_bio.
> > > > > > 
> > > > > > So this patch can reduce redundant calling of psi_memstall_enter. And
> > > > > > make it easier to track refaulting file pages and anonymous pages
> > > > > > separately.
> > > > > 
> > > > > I don't think this is an improvement.
> > > > > 
> > > > > psi_memstall_enter() will check current->in_memstall once, detect the
> > > > > nested call, and bail. Your patch checks PageSwapBacked for every page
> > > > > being added. It's more branches for less robust code.
> > > > 
> > > > We are also working for a new patch to classify different reasons cause
> > > > psi_memstall_enter(): reclaim, thrashing, compact, etc. This will help
> > > > user to tuning sysctl, for example, if user see high compact delay, he
> > > > may try do adjust THP sysctl to reduce the compact delay.
> > > > 
> > > > To support that, we should distinguish what's the reason cause psi in
> > > > submit_io(), this patch does the job.
> > > 
> > > Please submit these patches together then. On its own, this patch
> > > isn't desirable.
> > I think this patch has it's independent value, I try to make a better
> > explain.

You missed the point about it complicating semantics.

Right now, the bio layer annotates stalls from queue contention. This
is very simple. The swap code has relied on it in the past. It doesn't
now, but that doesn't change what the concept is at the bio layer.

You patch explicitly codifies that the MM handles swap IOs, and the
lower bio layer handles files. This asymmetry is ugly and error prone.

If you want type distinction, we should move it all into MM code, like
Christoph is saying. Were swap code handles anon refaults and the page
cache code handles file refaults. This would be my preferred layering,
and my original patch did that: https://lkml.org/lkml/2019/7/22/1070.

But it was NAKed, and I had to agree with the argument. The page cache
code is not very centralized, and the place where we deal with
individual pages (and detect refaults) and where we submit bios (where
the stalls occur) is spread out into multiple filesystems. There are
180 submit_bio() calls in fs/; you'd have to audit which ones are for
page cache submissions, and then add stall annotations or use a new
submit_bio_cache() wrapper that handles it. Changes in the filesystem
could easily miss this protocol and silently break pressure detection.

I would prefer the annotations to be at this level, I just don't see
how to do it cleanly/robustly. Maybe Christoph has an idea, he
understands the fs side much better than I do.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ