Message-ID: <CAGWkznEHTVJzrCqfZRSHN=HtFjKHBGy0yyxpK8paP+9W1DsX_w@mail.gmail.com>
Date:   Fri, 3 Dec 2021 17:16:47 +0800
From:   Zhaoyang Huang <huangzhaoyang@...il.com>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Nitin Gupta <ngupta@...are.org>,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        Jens Axboe <axboe@...nel.dk>, Minchan Kim <minchan@...nel.org>,
        Zhaoyang Huang <zhaoyang.huang@...soc.com>,
        "open list:MEMORY MANAGEMENT" <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH] mm: count zram read/write into PSI_IO_WAIT

On Fri, Dec 3, 2021 at 12:28 AM Johannes Weiner <hannes@...xchg.org> wrote:
>
> On Wed, Dec 01, 2021 at 07:12:30PM +0800, Zhaoyang Huang wrote:
> > Currently there is no way for zram reads/writes to be counted in
> > PSI_IO_WAIT, since zram handles the request directly in the current
> > context without invoking submit_bio and io_schedule.
>
> Hm, but you're also not waiting for a real IO device, during which
> the CPU could be doing something else. You're waiting for
> decompression. The thread also isn't in D-state during that time. What
> scenario would benefit from this accounting? How is IO pressure from
> the comp/decomp paths actionable to you?
No. D-state waits on real block devices are already counted via
psi_dequeue(io_wait). What I am proposing here is to not ignore the
non-productive time caused by large numbers of in-context swap
ins/outs (zram-like). This would make IO pressure more accurate and
consistent with the PSWPIN/PSWPOUT counters. It is similar to how the
IO time spent in filemap_fault->wait_on_page_bit_common is counted
into psi_memstall, which raises memory pressure caused by IO.
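
For reference, a rough sketch of that existing pattern, simplified from
wait_on_page_bit_common() in mm/filemap.c (the delayacct/thrashing
bookkeeping is trimmed here): a wait on a page that is being refaulted
is bracketed with the psi_memstall helpers so the stall shows up as
memory pressure.

	unsigned long pflags;
	bool thrashing = false;

	/*
	 * Waiting on a not-uptodate workingset page means we are
	 * thrashing, so account the wait as a memory stall.
	 */
	if (bit_nr == PG_locked &&
	    !PageUptodate(page) && PageWorkingset(page)) {
		psi_memstall_enter(&pflags);
		thrashing = true;
	}

	/* ... the actual wait on the page bit happens here ... */

	if (thrashing)
		psi_memstall_leave(&pflags);

The proposal is to do the analogous bookkeeping for the in-context zram
path, but under TSK_IOWAIT rather than the memstall flag.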
>
> What about when you use zram with disk writeback enabled, and you see
> a mix of decompression and actual disk IO? Wouldn't you want to be
> able to tell the two apart, to see if you're short on CPU or short on
> IO bandwidth in this setup? Your patch would make that impossible.
OK. Would it be better to start the IO accounting from pageout()
instead? Both bdev-backed and RAM-backed swap would benefit from that.
>
> This needs a much more comprehensive changelog.
>
> > > @@ -1246,7 +1247,9 @@ static int __zram_bvec_read(struct zram *zram, struct page *page, u32 index,
> > >                                 zram_get_element(zram, index),
> > >                                 bio, partial_io);
> > >         }
> > > -
> > > +#ifdef CONFIG_PSI
> > > +       psi_task_change(current, 0, TSK_IOWAIT);
> > > +#endif
>
> Add psi_iostall_enter() and leave helpers that encapsulate the ifdefs.
OK.
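
Something like this, perhaps (just a sketch of what such helpers might
look like; psi_iostall_enter()/psi_iostall_leave() don't exist today,
the names simply follow your suggestion and mirror
psi_memstall_enter()/leave()):

#ifdef CONFIG_PSI
/* Mark the current task as stalled on IO for PSI accounting. */
static inline void psi_iostall_enter(void)
{
	psi_task_change(current, 0, TSK_IOWAIT);
}

/* Clear the IO-wait flag again once the in-context IO work is done. */
static inline void psi_iostall_leave(void)
{
	psi_task_change(current, TSK_IOWAIT, 0);
}
#else
static inline void psi_iostall_enter(void) { }
static inline void psi_iostall_leave(void) { }
#endif

__zram_bvec_read()/__zram_bvec_write() would then call the enter/leave
pair around the decompression/compression instead of open-coding the
#ifdef.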
