[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOUHufbi7h6siHdnhsAEiXaCoNrUs9bUnEihYouE4CNMt-Zd_w@mail.gmail.com>
Date: Tue, 7 Jun 2022 13:07:11 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: Barry Song <21cnbao@...il.com>
Cc: Will Deacon <will@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux-MM <linux-mm@...ck.org>, Andi Kleen <ak@...ux.intel.com>,
Aneesh Kumar <aneesh.kumar@...ux.ibm.com>,
Catalin Marinas <catalin.marinas@....com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Hillf Danton <hdanton@...a.com>, Jens Axboe <axboe@...nel.dk>,
Johannes Weiner <hannes@...xchg.org>,
Jonathan Corbet <corbet@....net>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>,
Mel Gorman <mgorman@...e.de>,
Michael Larabel <Michael@...haellarabel.com>,
Michal Hocko <mhocko@...nel.org>,
Mike Rapoport <rppt@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Tejun Heo <tj@...nel.org>, Vlastimil Babka <vbabka@...e.cz>,
LAK <linux-arm-kernel@...ts.infradead.org>,
Linux Doc Mailing List <linux-doc@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, x86 <x86@...nel.org>,
Kernel Page Reclaim v2 <page-reclaim@...gle.com>,
Brian Geffon <bgeffon@...gle.com>,
Jan Alexander Steffens <heftig@...hlinux.org>,
Oleksandr Natalenko <oleksandr@...alenko.name>,
Steven Barrett <steven@...uorix.net>,
Suleiman Souhlal <suleiman@...gle.com>,
Daniel Byrne <djbyrne@....edu>,
Donald Carr <d@...os-reins.com>,
Holger Hoffstätte <holger@...lied-asynchrony.com>,
Konstantin Kharlamov <Hi-Angel@...dex.ru>,
Shuang Zhai <szhai2@...rochester.edu>,
Sofia Trinh <sofia.trinh@....works>,
Vaibhav Jain <vaibhav@...ux.ibm.com>, huzhanyuan@...o.com
Subject: Re: [PATCH v11 07/14] mm: multi-gen LRU: exploit locality in rmap
On Tue, Jun 7, 2022 at 1:37 AM Barry Song <21cnbao@...il.com> wrote:
>
> On Mon, Jun 6, 2022 at 9:25 PM Barry Song <21cnbao@...il.com> wrote:
> >
> > On Wed, May 18, 2022 at 4:49 PM Yu Zhao <yuzhao@...gle.com> wrote:
...
> I can't really explain why we are getting a random app/java vm crash in monkey
> test by using ptep_test_and_clear_young() only in lru_gen_look_around() on an
> armv8-a machine without hardware PTE young support.
>
> Moving to ptep_clear_flush_young() in look_around can make the random
> hang disappear according to zhanyuan(Cc-ed).
This sounds too familiar -- let me ask again: was the following commit
included during the test?
07509e10dcc7 arm64: pgtable: Fix pte_accessible()
If not, it will cause exactly the problem you described. And what
about this one?
e914d8f00391 mm: fix unexpected zeroed page mapping with zram swap
Missing it also causes userspace memory corruption on Android, i.e.,
random app crashes.
> On x86, ptep_clear_flush_young() is exactly ptep_test_and_clear_young()
> after
> 'commit b13b1d2d8692 ("x86/mm: In the PTE swapout page reclaim case clear
> the accessed bit instead of flushing the TLB")'
>
> But on arm64, they are different. according to Will's comments in this
> thread which
> tried to make arm64 same with x86,
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1793881.html
>
> "
> This is blindly copied from x86 and isn't true for us: we don't invalidate
> the TLB on context switch. That means our window for keeping the stale
> entries around is potentially much bigger and might not be a great idea.
>
> If we roll a TLB invalidation routine without the trailing DSB, what sort of
> performance does that get you?
> "
> We shouldn't think ptep_clear_flush_young() is safe enough in LRU to
> clear PTE young? Any comments from Will?
>
> >
> > btw, lru_gen_look_around() has already included 'address', are we doing
> > pte check for 'address' twice here?
Explained in the previous reply. Hope that clarifies things.
Powered by blists - more mailing lists