linux-kernel - Re: [PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead

Message-ID: <20170914081547.GC5533@bbox>
Date:   Thu, 14 Sep 2017 17:15:47 +0900
From:   Minchan Kim <minchan@...nel.org>
To:     "Huang, Ying" <ying.huang@...el.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Johannes Weiner <hannes@...xchg.org>,
        Rik van Riel <riel@...hat.com>, Shaohua Li <shli@...nel.org>,
        Hugh Dickins <hughd@...gle.com>,
        Fengguang Wu <fengguang.wu@...el.com>,
        Tim Chen <tim.c.chen@...el.com>,
        Dave Hansen <dave.hansen@...el.com>
Subject: Re: [PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead

On Thu, Sep 14, 2017 at 08:53:04AM +0800, Huang, Ying wrote:
> Hi, Andrew,
> 
> Andrew Morton <akpm@...ux-foundation.org> writes:
> 
> > On Wed, 13 Sep 2017 10:40:19 +0900 Minchan Kim <minchan@...nel.org> wrote:
> >
> >> Every zram user, such as a low-end Android device, has set page-cluster
> >> to 0 to disable swap readahead, because zram has no seek cost and works
> >> as a synchronous IO operation. If we read ahead multiple pages, the
> >> swap fault latency would be (4K * readahead window size). IOW,
> >> readahead is meaningful only if it doesn't hurt the faulted page's
> >> latency.
> >> 
> >> However, this patch introduces an additional knob, /sys/kernel/mm/swap/
> >> vma_ra_max_order, as well as page-cluster. It means that for existing
> >> users, disabling swap readahead doesn't work until they become
> >> aware of the new knob and modify their script/code to disable
> >> vma_ra_max_order as well as page-cluster.
> >> 
> >> I say it's a *regression* and wanted to fix it, but Huang's opinion
> >> is that it's not a functional regression, so userspace should fix
> >> it themselves.
> >> Please see the details of the discussion in
> >> http://lkml.kernel.org/r/%3C1505183833-4739-4-git-send-email-minchan@kernel.org%3E
> >
> > hm, tricky problem.  I do agree that linking the physical and virtual
> > readahead schemes in the proposed fashion is unfortunate.  I also agree
> > that breaking existing setups (a bit) is also unfortunate.
> >
> > Would it help if, when page-cluster is written to zero, we do
> >
> > printk_once("physical readahead disabled, virtual readahead still
> > enabled.  Disable virtual readahead via
> > /sys/kernel/mm/swap/vma_ra_max_order").
> >
> > Or something like that.  It's pretty lame, but it should help alert the
> > zram-readahead-disabling people to the issue?
> 
> This sounds good to me.
> 
> Hi, Minchan, what do you think about this?  I think for low-end Android
> devices, the end user may have no opportunity to upgrade to the latest
> kernel, so the device vendor should care about this.  For desktop users,
> the warning proposed by Andrew may help to remind them of the new knob.

Yes, that would be an option. At the least, we should alert the user so
they have a chance to fix it. However, can't we make VMA-based readahead
a new config option? Please see the details in my reply to Andrew.

With that, there is no regression for current users and, as a bonus,
users can measure both algorithms with their real workload rather than
an artificial benchmark. I think recency vs. spatial locality each have
pros and cons, so that kind of soft landing would be a safer option than
a sudden replacement.
After a while, we can make the new algorithm the default.
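
For reference, here is a rough sketch of what the userspace change
discussed above might look like if the patch lands as is: a script that
writes 0 to both knobs instead of only page-cluster. The function name
disable_swap_readahead and its argument are made up for illustration,
and /proc/sys/vm/page-cluster is assumed to be the usual sysctl path.
The vma_ra_max_order path is the one named in this thread. It skips a
knob that does not exist, so the same script could run on older kernels.

```python
from pathlib import Path

# Assumed knob locations: page-cluster is the usual vm sysctl path;
# vma_ra_max_order is the sysfs knob named in this thread.
PAGE_CLUSTER = Path("/proc/sys/vm/page-cluster")
VMA_RA_MAX_ORDER = Path("/sys/kernel/mm/swap/vma_ra_max_order")


def disable_swap_readahead(knobs=(PAGE_CLUSTER, VMA_RA_MAX_ORDER)):
    """Write 0 to every knob that exists and return the ones written.

    Hypothetical helper, not part of any existing tool. A missing knob
    is skipped so the script also works on kernels without the new one.
    """
    written = []
    for knob in knobs:
        knob = Path(knob)
        if not knob.exists():
            continue  # e.g. a kernel without vma_ra_max_order
        knob.write_text("0\n")
        written.append(knob)
    return written
```

Calling disable_swap_readahead() with no arguments would need root, and
an existing script that only writes page-cluster would still leave the
VMA-based readahead enabled, which is the regression I am worried about.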