linux-kernel - Re: [PATCH v2 0/9] workingset protection/detection on the anonymous LRU list

Message-ID: <20200228095534.GA30796@js1304-desktop>
Date:   Fri, 28 Feb 2020 18:56:11 +0900
From:   Joonsoo Kim <js1304@...il.com>
To:     Aaron Lu <aaron.lwe@...il.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Michal Hocko <mhocko@...nel.org>,
        Hugh Dickins <hughd@...gle.com>,
        Minchan Kim <minchan@...nel.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Mel Gorman <mgorman@...hsingularity.net>, kernel-team@....com,
        Huang Ying <ying.huang@...el.com>
Subject: Re: [PATCH v2 0/9] workingset protection/detection on the anonymous
 LRU list

On Fri, Feb 28, 2020 at 05:17:00PM +0800, Aaron Lu wrote:
> On Fri, Feb 28, 2020 at 03:52:59PM +0900, Joonsoo Kim wrote:
> > On Fri, Feb 28, 2020 at 01:57:26PM +0800, Aaron Lu wrote:
> > > On Fri, Feb 28, 2020 at 01:03:03PM +0900, Joonsoo Kim wrote:
> > > > Hello,
> > > > 
> > > > On Fri, Feb 28, 2020 at 11:23:58AM +0800, Aaron Lu wrote:
> > > > > On Thu, Feb 27, 2020 at 08:48:06AM -0500, Johannes Weiner wrote:
> > > > > > On Wed, Feb 26, 2020 at 07:39:42PM -0800, Andrew Morton wrote:
> > > > > > > It sounds like the above simple aging changes provide most of the
> > > > > > > improvement, and that the workingset changes are less beneficial and a
> > > > > > > bit more risky/speculative?
> > > > > > > 
> > > > > > > If so, would it be best for us to concentrate on the aging changes
> > > > > > > first, let that settle in and spread out and then turn attention to the
> > > > > > > workingset changes?
> > > > > > 
> > > > > > Those two patches work well for some workloads (like the benchmark),
> > > > > > but not for others. The full patchset makes sure both types work well.
> > > > > > 
> > > > > > Specifically, the existing aging strategy for anon assumes that most
> > > > > > anon pages allocated are hot. That's why they all start active and we
> > > > > > then do second-chance with the small inactive LRU to filter out the
> > > > > > few cold ones to swap out. This is true for many common workloads.
> > > > > > 
> > > > > > The benchmark creates a larger-than-memory set of anon pages with a
> > > > > > flat access profile - to the VM a flood of one-off pages. Joonsoo's
> > > > > 
> > > > > test: swap-w-rand-mt, which is a multi thread swap write intensive
> > > > > workload so there will be swap out and swap ins.
> > > > > 
> > > > > > first two patches allow the VM to usher those pages in and out of
> > > > > 
> > > > > > The weird part is that the robot says the performance gain comes
> > > > > > from the 1st patch only, which adjusts the ratio, and not from the
> > > > > > 2nd patch, which makes anon pages start on the inactive list.
> > > > > 
> > > > > I find the performance gain hard to explain...
> > > > 
> > > > > Let me explain the reason for the performance gain.
> > > > > 
> > > > > The 1st patch gives the anonymous pages more second chances.
> > > 
> > > > By second chance, do I understand correctly that this refers to pages
> > > > on the inactive list getting moved back to the active list?
> > 
> > Yes.
> > 
> > > 
> > > > > In the swap-w-rand-mt test, the memory used by all threads together
> > > > > is greater than the amount of system memory, but the memory used by
> > > > > each thread is not that large. So, although it is a random test,
> > > > > there is locality in each thread's job. More second chances help to
> > > > > exploit this locality, so performance can improve.
> > > 
> > > Does this mean there should be fewer vmstat.pswpout and vmstat.pswpin
> > > with patch1 compared to vanilla?
> > 
> > It depends on the workload. If the workload consists of anonymous
> 
> This swap-w-rand-mt workload is anon only.

Yes, I know.

> 
> > pages only, I think, yes, pswpout/pswpin would be lower than vanilla
> 
> I think LKP robot has captured these two metrics but the report didn't
> show them, which means the number is about the same with or without
> patch #1.

The robot report did show these two metrics. See below.

  50190319 ± 31%     -35.7%   32291856 ± 14%  proc-vmstat.pswpin
  56429784 ± 21%     -42.6%   32386842 ± 14%  proc-vmstat.pswpout

Both pswpin and pswpout are improved.
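
As a quick sanity check on the two rows above, the percentages can be
recomputed from the raw counts. This is only a sketch: it assumes the
left column is the baseline and the right column is the run with patch
#1, and the helper name percent_change is made up for illustration.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Counts copied from the robot report quoted above.
# Assumption: (baseline, with patch #1).
metrics = {
    "proc-vmstat.pswpin": (50190319, 32291856),
    "proc-vmstat.pswpout": (56429784, 32386842),
}

if __name__ == "__main__":
    for name, (before, after) in metrics.items():
        print(f"{name}: {percent_change(before, after):+.1f}%")
```

The recomputed values agree with the -35.7% and -42.6% shown in the
report, so both swap-in and swap-out counts dropped by roughly a third
or more.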

> > with patch #1. With a large inactive list, we can easily find the
> > frequently referenced pages, which results in fewer swap-ins/outs.
> 
> But with small inactive list, the pages that would be on inactive list
> will stay on active list? I think the larger inactive list is mainly
> used to give the anon page a chance to be promoted to active list now
> that anon pages land on inactive list first, but on reclaim, I don't see
> how a larger inactive list can cause fewer swap outs.

The point is that a larger inactive LRU helps to find hot pages, and
these hot pages lead to more cache hits.

When a cache hit happens, no swap-out happens. But if a cache miss
happens, a new page is added to the LRU, which then triggers reclaim
and swap-out.
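
To make the argument above concrete, here is a toy two-list LRU model.
It is a sketch under loud assumptions, not the kernel's algorithm: new
pages fault onto the inactive list, a second access while inactive
promotes the page to the active list, and an overflowing active list
demotes its oldest page. The real reclaim code checks referenced bits
at scan time instead. The names simulate and build_trace, the page
labels, and all sizes are invented for illustration.

```python
from collections import OrderedDict


def simulate(trace, total_pages, inactive_target):
    """Toy two-list LRU. Returns (hits, misses, swap_outs)."""
    if not 1 <= inactive_target <= total_pages:
        raise ValueError("inactive_target must be in 1..total_pages")
    active_cap = total_pages - inactive_target
    active = OrderedDict()    # oldest entry first
    inactive = OrderedDict()  # oldest entry first
    hits = misses = swap_outs = 0
    for page in trace:
        if page in active:
            hits += 1
            active.move_to_end(page)
        elif page in inactive:
            # Second chance: referenced again before being reclaimed.
            hits += 1
            del inactive[page]
            active[page] = None
            if len(active) > active_cap:
                demoted, _ = active.popitem(last=False)
                inactive[demoted] = None
        else:
            # Miss: the new page enters the inactive list; if memory
            # is full, the oldest inactive page is swapped out.
            misses += 1
            inactive[page] = None
            if len(active) + len(inactive) > total_pages:
                inactive.popitem(last=False)
                swap_outs += 1
    return hits, misses, swap_outs


def build_trace(stale_pages=6, loop_pages=4, rounds=5):
    """Stale pages touched twice each, then a small loop with locality."""
    trace = []
    for i in range(stale_pages):
        trace += [f"s{i}", f"s{i}"]
    for _ in range(rounds):
        trace += [f"n{i}" for i in range(loop_pages)]
    return trace


if __name__ == "__main__":
    trace = build_trace()
    for inactive_target in (2, 4):
        print(inactive_target, simulate(trace, 8, inactive_target))
```

In this toy run with 8 page frames, an inactive target of 2 is too
small for the 4-page loop: every loop access misses and nearly every
miss forces a swap-out. With an inactive target of 4, the loop pages
survive until their second access, get promoted, and the misses and
swap-outs stop after the first round.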

> Forgive my curiosity, and feel free to ignore my question as I don't
> want to waste your time on this. Your patchset looks like a worthwhile
> thing to do; it's just that the robot's report on patch1 seems er...

I appreciate your attention. Feel free to ask. :)

Thanks.