Message-ID: <3329f63e-5671-1500-0730-cd46ba461d04@redhat.com>
Date: Thu, 2 Mar 2023 11:10:03 +0100
From: David Hildenbrand <david@...hat.com>
To: Marcelo Tosatti <mtosatti@...hat.com>
Cc: Christoph Lameter <cl@...ux.com>,
Aaron Tomlin <atomlin@...mlin.com>,
Frederic Weisbecker <frederic@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Mel Gorman <mgorman@...e.de>, Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH v2 01/11] mm/vmstat: remove remote node draining
[...]
>
>> (2) drain_zone_pages() documents that we're draining the PCP
>> (bulk-freeing them) of the current CPU on remote nodes. That bulk-
>> freeing will properly adjust free memory counters. What exactly is
>> the impact when no longer doing that? Won't the "snapshot" of some
>> counters eventually be wrong? Do we care?
>
> Don't see why the snapshot of counters will be wrong.
>
> Instead of freeing pages on pcp list of remote nodes after they are
> considered idle ("3 seconds idle till flush"), what will happen is that
> drain_all_pages() will free those pcps, for example after an allocation
> fails on direct reclaim:
>
>         page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
>
>         /*
>          * If an allocation failed after direct reclaim, it could be because
>          * pages are pinned on the per-cpu lists or in high alloc reserves.
>          * Shrink them and try again
>          */
>         if (!page && !drained) {
>                 unreserve_highatomic_pageblock(ac, false);
>                 drain_all_pages(NULL);
>                 drained = true;
>                 goto retry;
>         }
>
> In both cases the pages are freed (and counters maintained) here:
>
> static inline void __free_one_page(struct page *page,
>                 unsigned long pfn,
>                 struct zone *zone, unsigned int order,
>                 int migratetype, fpi_t fpi_flags)
> {
>         struct capture_control *capc = task_capc(zone);
>         unsigned long buddy_pfn = 0;
>         unsigned long combined_pfn;
>         struct page *buddy;
>         bool to_tail;
>
>         VM_BUG_ON(!zone_is_initialized(zone));
>         VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP, page);
>
>         VM_BUG_ON(migratetype == -1);
>         if (likely(!is_migrate_isolate(migratetype)))
>                 __mod_zone_freepage_state(zone, 1 << order, migratetype);
>
>         VM_BUG_ON_PAGE(pfn & ((1 << order) - 1), page);
>         VM_BUG_ON_PAGE(bad_range(zone, page), page);
>
>         while (order < MAX_ORDER - 1) {
>                 if (compaction_capture(capc, page, order, migratetype)) {
>                         __mod_zone_freepage_state(zone, -(1 << order),
>                                                   migratetype);
>                         return;
>                 }
>
>
>> Describing the difference between instructed refresh of vmstat and "remotely
>> drain per-cpu lists" in order to move free memory from the pcp to the buddy
>> would be great.
>
> The difference is that now remote PCPs will be drained on demand, either via
> kcompactd or direct reclaim (through drain_all_pages), when memory is
> low.
>
> For example, with the following test:
>
> dd if=/dev/zero of=file bs=1M count=32000 on a tmpfs filesystem:
>
> kcompactd0-116 [005] ...1 228232.042873: drain_all_pages <-kcompactd_do_work
> kcompactd0-116 [005] ...1 228232.042873: __drain_all_pages <-kcompactd_do_work
> dd-479485 [003] ...1 228232.455130: __drain_all_pages <-__alloc_pages_slowpath.constprop.0
> dd-479485 [011] ...1 228232.721994: __drain_all_pages <-__alloc_pages_slowpath.constprop.0
> gnome-shell-3750 [015] ...1 228232.723729: __drain_all_pages <-__alloc_pages_slowpath.constprop.0
>
> The commit message was indeed incorrect. Updated one:
>
> "mm/vmstat: remove remote node draining
>
> Draining of pages from the local pcp for a remote zone should not be
> necessary, since once the system is low on memory (or compaction on a
> zone is in effect), drain_all_pages should be called freeing any unused
> pcps."
>
> Thanks!

Thanks for the explanation, that makes sense to me. Feel free to add my

Acked-by: David Hildenbrand <david@...hat.com>

... hoping that some others (Mel, Vlastimil?) can have another look.

--
Thanks,
David / dhildenb
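
For readers following the thread, the on-demand draining discussed above can be illustrated with a toy userspace model in C. This is only a sketch and not kernel code: `toy_zone`, `toy_drain_cpu()`, `toy_drain_all()`, `toy_alloc_from_buddy()`, `toy_alloc()` and `toy_total_free()` are hypothetical names invented for illustration, and the model assumes a single zone with plain integer counters instead of real page lists. It mimics only the retry shape quoted from the allocator slow path: pages parked on per-CPU lists are invisible to the buddy counter until a drain moves them back, and the counter is adjusted at the moment of the drain, whichever path triggers it.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4  /* assumption: small fixed CPU count for the model */

/* Toy model of one zone: a buddy free counter plus per-CPU caches. */
struct toy_zone {
    long nr_free_buddy;           /* pages the allocator can hand out */
    long pcp_count[TOY_NR_CPUS];  /* pages parked on each CPU's list */
};

/* Move every page parked on one CPU's list back to the buddy counter. */
void toy_drain_cpu(struct toy_zone *z, int cpu)
{
    z->nr_free_buddy += z->pcp_count[cpu];
    z->pcp_count[cpu] = 0;
}

/* Hypothetical stand-in for drain_all_pages(): drain every CPU's list. */
void toy_drain_all(struct toy_zone *z)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        toy_drain_cpu(z, cpu);
}

/* Take nr pages from the buddy counter; fail if it cannot cover them. */
bool toy_alloc_from_buddy(struct toy_zone *z, long nr)
{
    if (z->nr_free_buddy < nr)
        return false;
    z->nr_free_buddy -= nr;
    return true;
}

/* Allocation with a single drain-and-retry, like the quoted slow path. */
bool toy_alloc(struct toy_zone *z, long nr)
{
    bool drained = false;

retry:
    if (toy_alloc_from_buddy(z, nr))
        return true;

    if (!drained) {
        toy_drain_all(z);
        drained = true;
        goto retry;
    }
    return false;
}

/* Total free pages, counting both the buddy counter and the caches. */
long toy_total_free(const struct toy_zone *z)
{
    long total = z->nr_free_buddy;

    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        total += z->pcp_count[cpu];
    return total;
}
```

In this model, a zone with 2 pages in the buddy counter and 8 pages parked on per-CPU lists can still satisfy a 6-page request: the first attempt fails, the drain moves the parked pages back, and the retry succeeds. The total number of free pages is the same before and after the drain; only where they are accounted changes.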