Message-ID: <dca2824e8e88e826c6b260a831d79089b5b9c79d.camel@surriel.com>
Date: Thu, 19 Dec 2024 21:00:56 -0500
From: Rik van Riel <riel@...riel.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Shakeel Butt <shakeel.butt@...ux.dev>, David Hildenbrand
 <david@...hat.com>, Chris Li <chrisl@...nel.org>, Ryan Roberts
 <ryan.roberts@....com>, "Matthew Wilcox (Oracle)" <willy@...radead.org>,
 linux-mm@...ck.org, kernel-team@...a.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: remove unnecessary calls to lru_add_drain

On Thu, 2024-12-19 at 14:14 -0800, Andrew Morton wrote:
> On Thu, 19 Dec 2024 15:32:53 -0500 Rik van Riel <riel@...riel.com>
> wrote:
> 
> > There seem to be several categories of calls to lru_add_drain
> > and lru_add_drain_all.
> > 
> > The first are code paths that recently allocated, swapped in,
> > or otherwise processed a batch of pages, and want them all on
> > the LRU. These drain pages that were recently allocated,
> > probably on the local CPU.
> > 
> > A second category are code paths that are actively trying to
> > reclaim, migrate, or offline memory. These often use
> > lru_add_drain_all, to drain the caches on all CPUs.
> > 
> > However, there also seem to be some other callers where we
> > aren't really doing either. They are calling lru_add_drain(),
> > despite operating on pages that may have been allocated
> > long ago, and quite possibly on different CPUs.
> > 
> > Those calls are not likely to be effective at anything but
> > creating lock contention on the LRU locks.
> > 
> > Remove the lru_add_drain calls in the latter category.
> 
> These lru_add_drain() calls are the sorts of things we've added as
> bugfixes when things go weird in unexpected situations.  So the need
> for them can be obscure.
> 
> I'd be more comfortable if we'd gone through them all, hunted down
> the commits which added them, learned why these calls were added then
> explained why that reasoning is no longer valid.
> 
> 
> A lot of the ones you're removing precede a tlb_gather_mmu()
> operation. I wonder why we have (or had) that pattern?
> 

It goes all the way back to before we were using git.

In those days, computers had fewer CPUs, and much less
memory. Maybe 2-4 CPUs and 64-256 MB of memory?

That meant the value of freeing pages by calling lru_add_drain
in more places was larger, and the chance of the local
CPU holding relevant pages in its pagevecs would have
been larger, too.

On a system with 16 CPUs and 64GB of memory, the cost
of flushing the pagevecs more frequently is higher,
while the chance of encountering the right pages and
the memory benefit are both lower.
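
A back-of-the-envelope way to see the scaling argument, as a toy model
rather than kernel code. It assumes a page of interest is equally likely
to be batched on any CPU, and it uses a hypothetical batch size of 15
pages of 4 KB each; neither number comes from this thread, and the
function names are made up for illustration.

```python
# Toy model of per-CPU LRU batching. This is NOT kernel code.
# Assumptions (not from the thread): a page is equally likely to sit in
# any CPU's batch, and each CPU batches up to `batch_pages` pages.


def local_hit_chance(ncpus: int) -> float:
    """Chance that a page batched on one random CPU is on the local CPU."""
    return 1.0 / ncpus


def batched_fraction(ncpus: int, mem_mb: int,
                     batch_pages: int = 15, page_kb: int = 4) -> float:
    """Upper bound on the share of memory held in all per-CPU batches."""
    batched_kb = ncpus * batch_pages * page_kb
    return batched_kb / (mem_mb * 1024)


if __name__ == "__main__":
    # Machine sizes taken from the message: 2-4 CPUs with 64-256 MB,
    # versus 16 CPUs with 64 GB.
    for ncpus, mem_mb in [(2, 64), (4, 256), (16, 64 * 1024)]:
        print(f"{ncpus:2d} CPUs, {mem_mb:6d} MB: "
              f"local hit chance {local_hit_chance(ncpus):.4f}, "
              f"batched share of memory {batched_fraction(ncpus, mem_mb):.2e}")
```

Under these assumptions, both the chance that a local-only drain finds
the page and the share of memory a drain could release shrink as the
machine grows, which is the direction described above.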

I did not find any changeset in git history where we
had an existing tlb_gather_mmu, and an lru_add_drain
was added in front of it.

I did find some other things in 18 years of
"git log -S lru_add_drain", though :)


I found one place in git history where lru_add_drain
and tlb_gather_mmu are added together, like:

f5cc4eef9987 ("VM: make zap_page_range() callers that act on a single
VMA use separate helper")


I also found some changesets where unnecessary 
lru_add_drain calls are removed:

67e4eb076840 ("mm: thp: don't need to drain lru cache when splitting
and mlocking THP")
72b03fcd5d51 ("mm: mlock: remove lru_add_drain_all()")
586a32ac1d33 ("mm: munlock: remove unnecessary call to
lru_add_drain()")


Since 2006, most of the places that add lruvec flushing
seem to have added calls to lru_add_drain_all, for example:

7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage
collapse")
a980df33e935 ("khugepaged: drain all LRU caches before scanning pages")
9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages
allocated from CMA region")

-- 
All Rights Reversed.
