linux-kernel - Re: [PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead


Message-ID: <20170914131446.GA12850@bgram>
Date:   Thu, 14 Sep 2017 22:14:46 +0900
From:   Minchan Kim <minchan@...nel.org>
To:     "Huang, Ying" <ying.huang@...el.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Johannes Weiner <hannes@...xchg.org>,
        Rik van Riel <riel@...hat.com>, Shaohua Li <shli@...nel.org>,
        Hugh Dickins <hughd@...gle.com>,
        Fengguang Wu <fengguang.wu@...el.com>,
        Tim Chen <tim.c.chen@...el.com>,
        Dave Hansen <dave.hansen@...el.com>
Subject: Re: [PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead

On Thu, Sep 14, 2017 at 08:01:30PM +0800, Huang, Ying wrote:
> Minchan Kim <minchan@...nel.org> writes:
> 
> > On Wed, Sep 13, 2017 at 02:02:29PM -0700, Andrew Morton wrote:
> >> On Wed, 13 Sep 2017 10:40:19 +0900 Minchan Kim <minchan@...nel.org> wrote:
> >> 
> >> > Every zram user, such as low-end Android devices, has set page-cluster
> >> > to 0 to disable swap readahead, because zram has no seek cost and works
> >> > as a synchronous IO operation, so if we read ahead multiple pages, the
> >> > swap fault latency would be (4K * readahead window size). IOW,
> >> > readahead is meaningful only if it doesn't hurt the faulted page's
> >> > latency.
> >> > 
> >> > However, this patch introduces an additional knob,
> >> > /sys/kernel/mm/swap/vma_ra_max_order, alongside page-cluster. It means
> >> > that existing setups which disabled swap readahead stop working as
> >> > intended until their users become aware of the new knob and modify
> >> > their scripts/code to disable vma_ra_max_order as well as page-cluster.
> >> > 
> >> > I say it's a *regression* and wanted to fix it, but Huang's opinion
> >> > is that it's not a functional regression, so userspace should fix
> >> > itself.
> >> > Please see the details of the discussion at
> >> > http://lkml.kernel.org/r/%3C1505183833-4739-4-git-send-email-minchan@kernel.org%3E
> >> 
> >> hm, tricky problem.  I do agree that linking the physical and virtual
> >> readahead schemes in the proposed fashion is unfortunate.  I also agree
> >> that breaking existing setups (a bit) is also unfortunate.
> >> 
> >> Would it help if, when page-cluster is written to zero, we do
> >> 
> >> printk_once("physical readahead disabled, virtual readahead still
> >> enabled.  Disable virtual readahead via
> >> /sys/kernel/mm/swap/vma_ra_max_order").
> >> 
> >> Or something like that.  It's pretty lame, but it should help alert the
> >> zram-readahead-disabling people to the issue?
> >
> > That was my last resort. If we cannot find another way after all, then yes,
> > it is the minimum we should do. But it still breaks users who don't or can't
> > read the alert and modify their programs.
> >
> > How about this?
> >
> > Can't we make vma-based readahead a config option?
> > With that, users who have no interest in readahead won't enable vma-based
> > readahead. In that case, page-cluster works as expected ("disable readahead
> > completely"), so it doesn't break anything.
> 
> Now, users can choose between VMA based readahead and the original
> readahead at runtime via the following knob:
> 
> /sys/kernel/mm/swap/vma_ra_enabled

It's not a config option, and it is enabled by default. IOW, it's under the
radar, so current users will not notice it. That's why we want to emit a big
fat warning when an existing user sets page-cluster to 0. However, as Andrew
said, it's lame.
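
To make the warn-once idea concrete, here is a rough userspace sketch of the
behaviour, not kernel code. The names set_page_cluster() and warned_once are
made up for illustration; a real patch would use printk_once() in the sysctl
write path instead of fprintf().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are kernel symbols. */
static int page_cluster = 3;      /* models /proc/sys/vm/page-cluster */
static bool warned_once = false;  /* models the "once" in printk_once() */

/*
 * Models a write to page-cluster. Returns true only if this call
 * emitted the warning, i.e. on the first write of 0.
 */
static bool set_page_cluster(int value)
{
    page_cluster = value;

    if (value != 0 || warned_once)
        return false;

    warned_once = true;
    fprintf(stderr,
            "physical readahead disabled, virtual readahead still enabled. "
            "Disable virtual readahead via "
            "/sys/kernel/mm/swap/vma_ra_max_order\n");
    return true;
}
```

The first set_page_cluster(0) prints the message and every later call stays
silent, which is exactly why a user who never looks at the log still ends up
with VMA based readahead enabled.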

If we make it a config option, product makers and users upgrading their kernel
get a chance to notice it and read the description, so they can be aware of
the two weird knobs and solve the problem in advance, without a printk_once
warning. If a user has no interest in swap readahead, or skips the new config
option by mistake, the kernel falls back to physical readahead, which means no
regression.
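
A rough userspace model of what I mean, not kernel code. The struct, the
vma_ra_configured field standing in for the proposed config option, and
swap_ra_window() are all hypothetical. I also assume that vma_ra_max_order,
like page-cluster, is the base-2 logarithm of the maximum window, and that
both values are small and non-negative.

```c
#include <stdbool.h>

/* Hypothetical model of the knobs; these are not kernel structures. */
struct swap_ra_settings {
    bool vma_ra_configured; /* proposed build-time option (made-up name) */
    bool vma_ra_enabled;    /* models /sys/kernel/mm/swap/vma_ra_enabled */
    int  vma_ra_max_order;  /* models /sys/kernel/mm/swap/vma_ra_max_order */
    int  page_cluster;      /* models /proc/sys/vm/page-cluster */
};

/*
 * Maximum number of pages read on one swap fault, including the
 * faulting page. A result of 1 means readahead is disabled.
 */
static unsigned int swap_ra_window(const struct swap_ra_settings *s)
{
    /* VMA based readahead applies only if built in and switched on. */
    if (s->vma_ra_configured && s->vma_ra_enabled)
        return 1u << s->vma_ra_max_order;

    /* Otherwise the original physical readahead follows page-cluster. */
    return 1u << s->page_cluster;
}
```

With page_cluster = 0 and the option not built in, the window is one page, so
an existing zram setup keeps working. With the option built in and enabled,
and an illustrative vma_ra_max_order of 3, the same setup would read 8 pages
per fault.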

>  
> Best Regards,
> Huang, Ying
> 
> 
> > People who want to use the upcoming vma-based readahead can enable the
> > feature, and we can explain this unfortunate situation somewhere in the
> > config/document description so new users will be aware of the two knobs.