Message-ID: <e88b1850-ca36-aec5-ad27-0b2753c836f5@google.com>
Date: Fri, 30 Aug 2024 03:18:11 -0700 (PDT)
From: Hugh Dickins <hughd@...gle.com>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>
cc: Hugh Dickins <hughd@...gle.com>, Andrew Morton <akpm@...ux-foundation.org>,
willy@...radead.org, david@...hat.com, wangkefeng.wang@...wei.com,
chrisl@...nel.org, ying.huang@...el.com, 21cnbao@...il.com,
ryan.roberts@....com, shy828301@...il.com, ziy@...dia.com,
ioworker0@...il.com, da.gomez@...sung.com, p.raghav@...sung.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 4/9] mm: filemap: use xa_get_order() to get the swap entry order

On Thu, 29 Aug 2024, Baolin Wang wrote:
> On 2024/8/29 16:07, Hugh Dickins wrote:
...
> >
> > Fix below. Successful testing on mm-everything-2024-08-24-07-21 (well,
> > that minus the commit which spewed warnings from bootup) confirmed it.
> > But testing on mm-everything-2024-08-28-21-38 very quickly failed:
> > unrelated to this series, presumably caused by patch or patches added
> > since 08-24, one kind of crash on one machine (some memcg thing called
> > from isolate_migratepages_block), another kind of crash on another (some
> > memcg thing called from __read_swap_cache_async), I'm exhausted by now
> > but will investigate later in the day (or hope someone else has).
>
> I saw the isolate_migratepages_block crash issue on
> mm-everything-2024-08-28-09-32, and I reverted Kefeng's series "[PATCH 0/4]
> mm: convert to folio_isolate_movable()", the isolate_migratepages_block issue
> seems to be resolved (at least I can not reproduce it).
>
> And I have already pointed out some potential issues in Kefeng’s series[1].
> Andrew has dropped this series from mm-everything-2024-08-28-21-38. However,
> you can still encounter the isolate_migratepages_block issue on
> mm-everything-2024-08-28-21-38, while I cannot, weird.
It was not that issue: isolate_migratepages_block() turned out to be an
innocent bystander in my case. I didn't see it crash there again; the
later crashes were in a variety of other memcg places, many of them stat
updates. The error came from a different series, fix now posted:
https://lore.kernel.org/linux-mm/56d42242-37fe-b94f-d3cb-00673f1e5efb@google.com/T/#u
>
> > [PATCH] mm: filemap: use xa_get_order() to get the swap entry order: fix
> >
> > find_lock_entries(), used in the first pass of shmem_undo_range() and
> > truncate_inode_pages_range() before partial folios are dealt with, has
> > to be careful to avoid those partial folios: as its doc helpfully says,
> > "Folios which are partially outside the range are not returned". Of
> > course, the same must be true of any value entries returned, otherwise
> > truncation and hole-punch risk erasing swapped areas - as has been seen.
> >
> > Rewrite find_lock_entries() to emphasize that, following the same pattern
> > for folios and for value entries.
> >
> > Adjust find_get_entries() slightly, to get order while still holding
> > rcu_read_lock(), and to round down the updated start: good changes, like
> > find_lock_entries() now does, but it's unclear if either is ever important.
> >
> > Signed-off-by: Hugh Dickins <hughd@...gle.com>
>
> Thanks Hugh. The changes make sense to me.
Thanks!
Hugh
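
For illustration only, the range rule quoted in the fix description
("Folios which are partially outside the range are not returned", and
the same for value entries) can be sketched as a standalone user-space
check. This is not the kernel's find_lock_entries(); the helper names
entry_base() and entry_within_range() are hypothetical, and the sketch
assumes an entry of order N covers 2^N naturally aligned indices.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not kernel code: first index covered by an entry
 * of the given order that was found at `index`. Assumes the entry is
 * naturally aligned, so the base is `index` rounded down to 2^order.
 */
static unsigned long entry_base(unsigned long index, unsigned int order)
{
	unsigned long nr = 1UL << order;

	assert(order < sizeof(unsigned long) * 8);
	return index & ~(nr - 1);
}

/*
 * Hypothetical helper: true only if every index covered by the entry
 * lies inside the inclusive range [start, end]. An entry that begins
 * before `start` or extends past `end` is "partial" and is rejected,
 * for folios and value (swap) entries alike.
 */
static bool entry_within_range(unsigned long index, unsigned int order,
			       unsigned long start, unsigned long end)
{
	unsigned long base = entry_base(index, order);
	unsigned long last = base + (1UL << order) - 1;

	return base >= start && last <= end;
}
```

With these assumptions, an order-2 entry found at index 5 covers indices
4..7, so it is accepted for the range [4, 7] but skipped for [5, 7] or
[4, 6], which is the behaviour the first truncation pass needs before
partial entries are dealt with separately.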