Message-ID: <CAOUHufZ42qn4vv+2w2MhFhqHib66s054YaXben28nddbZWRp5Q@mail.gmail.com>
Date: Tue, 9 Jul 2024 16:27:54 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Bharata B Rao <bharata@....com>, "Matthew Wilcox (Oracle)" <willy@...radead.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Mel Gorman <mgorman@...hsingularity.net>,
Johannes Weiner <hannes@...xchg.org>
Subject: Re: [PATCH mm-unstable v1] mm/truncate: batch-clear shadow entries
On Mon, Jul 8, 2024 at 4:16 PM Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> On Mon, 8 Jul 2024 15:27:53 -0600 Yu Zhao <yuzhao@...gle.com> wrote:
>
> > Make clear_shadow_entry() clear shadow entries in `struct folio_batch`
> > so that it can reduce contention on i_lock and i_pages locks, e.g.,
> >
> >   watchdog: BUG: soft lockup - CPU#29 stuck for 11s! [fio:2701649]
> >     clear_shadow_entry+0x3d/0x100
> >     mapping_try_invalidate+0x117/0x1d0
> >     invalidate_mapping_pages+0x10/0x20
> >     invalidate_bdev+0x3c/0x50
> >     blkdev_common_ioctl+0x5f7/0xa90
> >     blkdev_ioctl+0x109/0x270
>
> This will clearly reduce lock traffic a lot, but does it truly fix the
> issue? Is it the case that sufficiently extreme loads will still run
> into problems?

I think Bharata was running extreme loads. So I'd say it's good enough
for now, considering truncation doesn't happen that often.

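To make the lock-traffic argument concrete, here is a minimal userspace
sketch, not the kernel code from the patch. Everything in it is
hypothetical: `struct toy_mapping`, `toy_lock()`, `toy_clear_one()` and
`toy_clear_batch()` are stand-ins invented for illustration, and the fake
lock only counts acquisitions. It contrasts taking the lock once per entry
with taking it once per batch of indices.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_SLOTS 64

/* Hypothetical stand-in for a mapping: one lock guarding an array of slots. */
struct toy_mapping {
	int locked;                      /* 1 while the fake lock is held */
	unsigned long lock_acquisitions; /* number of lock round trips so far */
	void *slots[TOY_NR_SLOTS];       /* non-NULL means "entry present" */
};

/* Fake lock: no real synchronization, it only counts acquisitions. */
static void toy_lock(struct toy_mapping *m)
{
	assert(!m->locked);
	m->locked = 1;
	m->lock_acquisitions++;
}

static void toy_unlock(struct toy_mapping *m)
{
	assert(m->locked);
	m->locked = 0;
}

/* Per-entry shape: one lock round trip for every entry cleared. */
static void toy_clear_one(struct toy_mapping *m, size_t index)
{
	assert(index < TOY_NR_SLOTS);
	toy_lock(m);
	m->slots[index] = NULL;
	toy_unlock(m);
}

/* Batched shape: one lock round trip for the whole array of indices. */
static void toy_clear_batch(struct toy_mapping *m, const size_t *indices,
			    size_t nr)
{
	size_t i;

	if (nr == 0)
		return;

	toy_lock(m);
	for (i = 0; i < nr; i++) {
		assert(indices[i] < TOY_NR_SLOTS);
		m->slots[indices[i]] = NULL;
	}
	toy_unlock(m);
}
```

With a batch of N entries, the per-entry shape acquires the lock N times
and the batched shape acquires it once. The lock is held longer per
acquisition, so sufficiently large or numerous batches could still
contend; the sketch says nothing about where that threshold lies.
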
> > --- a/mm/truncate.c
> > +++ b/mm/truncate.c
> > @@ -39,12 +39,24 @@ static inline void __clear_shadow_entry(struct address_space *mapping,
> >  	xas_store(&xas, NULL);
> >  }
> >
> > -static void clear_shadow_entry(struct address_space *mapping, pgoff_t index,
> > -			       void *entry)
> > +static void clear_shadow_entry(struct address_space *mapping,
> > +			       struct folio_batch *fbatch, pgoff_t *indices)
> >  {
> > +	int i;
> > +
> > +	if (shmem_mapping(mapping) || dax_mapping(mapping))
> > +		return;
>
> We lost the comment which was in invalidate_exceptional_entry() and
> elsewhere. It wasn't a terribly good one but I do think a few words
> which explain why we're testing for these things would be helpful.

I'll put the original comment back. It seems to me it was stating the
obvious, and I don't really know how to improve it since it's obvious ;)

> I expect we should backport this. But identifying a Fixes: target
> looks to be challenging.

I wouldn't worry about backporting: nobody else has run into this
scalability issue (it isn't even a day-to-day performance problem).