linux-kernel - Re: [PATCH v2 0/9] support large folio swap-out and swap-in for shmem

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <d186f398-4f40-7760-1a2a-2a88f041c6d3@google.com>
Date: Mon, 24 Jun 2024 21:54:40 -0700 (PDT)
From: Hugh Dickins <hughd@...gle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
cc: Hugh Dickins <hughd@...gle.com>, Barry Song <baohua@...nel.org>, 
    Baolin Wang <baolin.wang@...ux.alibaba.com>, willy@...radead.org, 
    david@...hat.com, wangkefeng.wang@...wei.com, chrisl@...nel.org, 
    ying.huang@...el.com, 21cnbao@...il.com, ryan.roberts@....com, 
    shy828301@...il.com, ziy@...dia.com, ioworker0@...il.com, 
    da.gomez@...sung.com, p.raghav@...sung.com, linux-mm@...ck.org, 
    linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/9] support large folio swap-out and swap-in for
 shmem

On Wed, 19 Jun 2024, Hugh Dickins wrote:
> On Wed, 19 Jun 2024, Hugh Dickins wrote:
> > 
> > and on second attempt, then a VM_BUG_ON_FOLIO(!folio_contains) from
> > find_lock_entries().
> > 
> > Or maybe that VM_BUG_ON_FOLIO() was unrelated, but a symptom of the bug
> > I'm trying to chase even when this series is reverted:
> 
> Yes, I doubt now that the VM_BUG_ON_FOLIO(!folio_contains) was related
> to Baolin's series: much more likely to be an instance of other problems.
> 
> > some kind of page
> > double usage, manifesting as miscellaneous "Bad page"s and VM_BUG_ONs,
> > mostly from page reclaim or from exit_mmap(). I'm still getting a feel
> > for it, maybe it occurs soon enough for a reliable bisection, maybe not.
> > 
> > (While writing, a run with mm-unstable cut off at 2a9964cc5d27,
> > drop KSM_KMEM_CACHE(), instead of reverting just Baolin's latest,
> > has not yet hit any problem: too early to tell but promising.)
> 
> Yes, that ran without trouble for many hours on two machines. I didn't
> do a formal bisection, but did appear to narrow it down convincingly to
> Barry's folio_add_new_anon_rmap() series: crashes soon on both machines
> with Barry's in but Baolin's out, no crashes with both out.
> 
> Yet while I was studying Barry's patches trying to explain it, one
> of the machines did at last crash: it's as if Barry's has opened a
> window which makes these crashes more likely, but not itself to blame.
> 
> I'll go back to studying that crash now: two CPUs crashed about the
> same time, perhaps they interacted and give a hint at root cause.
> 
> (I do have doubts about Barry's: the "_new" in folio_add_new_anon_rmap()
> was all about optimizing a known-exclusive case, so it surprises me
> to see it being extended to non-exclusive; and I worry over how its
> atomic_set(&page->_mapcount, 0)s can be safe when non-exclusive (but
> I've never caught up with David's exclusive changes, I'm out of date).

David answered on this, thanks: yes, I hadn't got around to seeing that
it only got to be called this way when page anon was not already set:
so agreed, no problem here with the _mapcount 0.

But eventually I came to see what's wrong with folio_add_new_anon_rmap():
this Baolin thread is the wrong place to post the fix, I'll send it now
in response to Barry's 2/3.  With that fix, and another mentioned below,
mm-unstable becomes very much more stable for me.

There is still something else causing "Bad page"s and maybe double
frees, something perhaps in the intersection of folio migration and
deferred splitting and miniTHP; but it's too rare, happened last night
when I wanted to post, but has refused to reappear since then; not
holding up testing, I'll give it more thought next time it comes.

> 
> But even if those are wrong, I'd expect them to tend towards a mapped
> page becoming unreclaimable, then "Bad page map" when munmapped,
> not to any of the double-free symptoms I've actually seen.)
> 
> > 
> > And before 2024-06-18, I was working on mm-everything-2024-06-15 minus
> > Chris Li's mTHP swap series: which worked fairly well, until it locked
> > up with __try_to_reclaim_swap()'s filemap_get_folio() spinning around
> > on a page with 0 refcount, while a page table lock is held which one
> > by one the other CPUs come to want for reclaim. On two machines.
> 
> I've not seen that symptom at all since 2024-06-15: intriguing,
> but none of us can afford the time to worry about vanished bugs.

That issue reappeared when I was testing on next-20240621:
I'll send the fix now in response to Kefeng's 2/5.

Hugh