Message-Id: <20211007163554.d9088a65f0e293e2bd906a56@linux-foundation.org>
Date: Thu, 7 Oct 2021 16:35:54 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: Mel Gorman <mgorman@...hsingularity.net>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] mm: Optimise put_pages_list()
On Thu, 7 Oct 2021 21:55:21 +0100 Matthew Wilcox <willy@...radead.org> wrote:
> On Thu, Oct 07, 2021 at 12:31:09PM -0700, Andrew Morton wrote:
> > On Thu, 7 Oct 2021 20:21:37 +0100 "Matthew Wilcox (Oracle)" <willy@...radead.org> wrote:
> >
> > > Instead of calling put_page() one page at a time, pop pages off
> > > the list if their refcount was too high and pass the remainder to
> > > put_unref_page_list(). This should be a speed improvement, but I have
> > > no measurements to support that. Current callers do not care about
> > > performance, but I hope to add some which do.
> >
> > Don't you think it would actually be slower to take an additional pass
> > across the list? If the list is long enough to cause cache thrashing.
> > Maybe it's faster for small lists.
>
> My first response is an appeal to authority -- release_pages() does
> this same thing. Only it takes an array, constructs a list and passes
> that to put_unref_page_list(). So if that's slower (and lists _are_
> slower than arrays), we should have a put_unref_page_array().
And put_unref_page_list() does two passes across the list!
<quietly sobs>
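For reference, the shape being proposed is roughly this (my sketch from the
changelog above, not the actual diff; the function in the tree is spelled
free_unref_page_list(), fwiw):

void put_pages_list(struct list_head *pages)
{
	struct page *page, *next;

	list_for_each_entry_safe(page, next, pages, lru) {
		if (!put_page_testzero(page)) {
			/* Someone else still holds a reference:
			 * drop the page off the list and leave it. */
			list_del(&page->lru);
			continue;
		}
		if (PageHead(page)) {
			/* Compound pages go down the existing slow
			 * path one at a time. */
			list_del(&page->lru);
			__put_page(page);
			continue;
		}
	}

	/* Everything still on the list hit refcount zero: hand it
	 * over as one batch. */
	free_unref_page_list(pages);
	INIT_LIST_HEAD(pages);
}

So that's three walks over ->lru in total (this one plus the two in
free_unref_page_list()), versus one walk for the current while loop.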
Here is my beautiful release_pages(), as distributed in linux-2.5.33:
void release_pages(struct page **pages, int nr)
{
	int i;
	struct pagevec pages_to_free;
	struct zone *zone = NULL;

	pagevec_init(&pages_to_free);
	for (i = 0; i < nr; i++) {
		struct page *page = pages[i];
		struct zone *pagezone;

		if (PageReserved(page) || !put_page_testzero(page))
			continue;

		pagezone = page_zone(page);
		if (pagezone != zone) {
			if (zone)
				spin_unlock_irq(&zone->lru_lock);
			zone = pagezone;
			spin_lock_irq(&zone->lru_lock);
		}
		if (TestClearPageLRU(page))
			del_page_from_lru(zone, page);
		if (page_count(page) == 0) {
			if (!pagevec_add(&pages_to_free, page)) {
				spin_unlock_irq(&zone->lru_lock);
				pagevec_free(&pages_to_free);
				pagevec_init(&pages_to_free);
				spin_lock_irq(&zone->lru_lock);
			}
		}
	}
	if (zone)
		spin_unlock_irq(&zone->lru_lock);
	pagevec_free(&pages_to_free);
}
I guess the current version is some commentary on the aging process?
> Second, we can follow through the code paths and reason about it.
>
> Before:
>
>   while (!list_empty(pages)) {
>     put_page(victim);
>       page = compound_head(page);
>       if (put_page_testzero(page))
>         __put_page(page);
>           __put_single_page(page)
>             __page_cache_release(page);
>             mem_cgroup_uncharge(page);
>             <---
>             free_unref_page(page, 0);
>               free_unref_page_prepare()
>               local_lock_irqsave(&pagesets.lock, flags);
>               free_unref_page_commit(page, pfn, migratetype, order);
>               local_unlock_irqrestore(&pagesets.lock, flags);
>
> After:
>
>   free_unref_page_list(pages);
>     list_for_each_entry_safe(page, next, list, lru) {
>       if (!free_unref_page_prepare(page, pfn, 0)) {
>     }
>
>     local_lock_irqsave(&pagesets.lock, flags);
>     list_for_each_entry_safe(page, next, list, lru) {
>       free_unref_page_commit()
>     }
>     local_unlock_irqrestore(&pagesets.lock, flags);
>
> So the major win here is that we disable/enable interrupts once per
> batch rather than once per page.
Perhaps that's faster if the list is fully cached.
Any feelings for how often release_pages() will be passed a huge enough
list for this to occur?
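
(For scale, if I have my numbers right: struct page is 64 bytes on 64-bit,
so 512 entries on such a list is 512 * 64 = 32kB of struct page, roughly
one L1d. A pagevec-sized batch of 15 pages is nowhere near that, so the
extra passes presumably only start to hurt once callers hand over lists
of hundreds or thousands of pages.)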