Message-ID: <6a378a57-a453-0318-924b-05dfa0a10c1f@intel.com>
Date: Thu, 20 Aug 2020 08:21:55 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: "Huang, Ying" <ying.huang@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>
Cc: linux-kernel@...r.kernel.org, yang.shi@...ux.alibaba.com,
rientjes@...gle.com, dan.j.williams@...el.com,
Linux-MM <linux-mm@...ck.org>
Subject: Re: [RFC][PATCH 5/9] mm/migrate: demote pages during reclaim
On 8/20/20 1:06 AM, Huang, Ying wrote:
>> + /* Migrate pages selected for demotion */
>> + nr_reclaimed += demote_page_list(&ret_pages, &demote_pages, pgdat, sc);
>> +
>> pgactivate = stat->nr_activate[0] + stat->nr_activate[1];
>>
>> mem_cgroup_uncharge_list(&free_pages);
>> _
> Generally, it's good to batch the page migration. But one side effect
> is that, if the pages fail to be migrated, they will be placed back
> on the LRU list instead of falling back to actual reclaim. This may
> cause problems in some situations. For example, if there's not enough
> space in the PMEM (slow) node and the page migration fails, OOM may be
> triggered, because direct reclaim on the DRAM (fast) node may make no
> progress, whereas previously it could actually reclaim some pages.
Yes, agreed.
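For context, the behavior in question is roughly this; a paraphrased
sketch, not the exact hunk from the patch, and the migrate_pages() call
in particular is schematic:

static unsigned int demote_page_list(struct list_head *ret_pages,
				     struct list_head *demote_pages,
				     struct pglist_data *pgdat,
				     struct scan_control *sc)
{
	int target_nid = next_demotion_node(pgdat->node_id);
	int err;

	if (list_empty(demote_pages) || target_nid == NUMA_NO_NODE)
		return 0;

	/* Demotion ignores all cpuset and mempolicy settings */
	err = migrate_pages(demote_pages, alloc_demote_page, NULL,
			    target_nid, MIGRATE_ASYNC, MR_DEMOTION);
	if (err) {
		/*
		 * Pages that failed to migrate get spliced onto
		 * 'ret_pages' and end up back on the LRU -- they
		 * never fall back to normal reclaim.
		 */
		list_splice_init(demote_pages, ret_pages);
	}

	return 0;	/* nr_reclaimed accounting elided */
}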
There are a couple of ways we could fix this. Instead of splicing
'demote_pages' back into 'ret_pages', we could try to get them back on
'page_list' and goto the beginning of shrink_page_list(). This will
probably yield the best behavior, but might be a bit ugly.
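Something like this, maybe; the 'retry' label and 'do_demote_pass' flag
are names I'm making up here, and demote_page_list() would need to
leave failed pages on 'demote_pages' rather than splicing them away:

static unsigned int shrink_page_list(struct list_head *page_list, ...)
{
	LIST_HEAD(demote_pages);
	bool do_demote_pass = true;

retry:
	/*
	 * ... the existing reclaim loop; pages are diverted onto
	 * 'demote_pages' only while do_demote_pass is true ...
	 */

	/* Migrate pages selected for demotion */
	nr_reclaimed += demote_page_list(&demote_pages, pgdat, sc);

	/* Pages that failed to demote are still on 'demote_pages' */
	if (!list_empty(&demote_pages)) {
		/* Reclaim them the normal way instead */
		list_splice_init(&demote_pages, page_list);
		do_demote_pass = false;
		goto retry;
	}
	/* ... */
}

That way a failed demotion costs one extra pass, but the pages still
get reclaimed instead of going back to the LRU.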
We could also add a field to 'struct scan_control' and just stop trying
to migrate after it has failed one or more times. The trick will be
picking a threshold that doesn't mess with either the normal reclaim
rate or the migration rate.
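That could be as small as this; the field, the helper, and the
threshold value are all invented for illustration:

struct scan_control {
	/* ... existing fields ... */
	/* demote_page_list() calls that made no progress */
	unsigned int nr_demote_failures;
};

/* Picking this number is the tricky part */
#define DEMOTE_FAILURE_LIMIT	3

static inline bool may_demote(struct scan_control *sc)
{
	return sc->nr_demote_failures < DEMOTE_FAILURE_LIMIT;
}

demote_page_list() would bump sc->nr_demote_failures whenever
migrate_pages() moves nothing, and shrink_page_list() would stop
diverting pages to 'demote_pages' once may_demote() returns false.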
This is on my list to fix up next.