Message-ID: <65A1FFA0-531C-4078-9704-3F44819C3C07@nvidia.com>
Date: Mon, 18 Feb 2019 09:31:24 -0800
From: Zi Yan <ziy@...dia.com>
To: Matthew Wilcox <willy@...radead.org>
CC: <linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Michal Hocko <mhocko@...nel.org>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Mel Gorman <mgorman@...hsingularity.net>,
John Hubbard <jhubbard@...dia.com>,
Mark Hairgrove <mhairgrove@...dia.com>,
Nitin Gupta <nigupta@...dia.com>,
David Nellans <dnellans@...dia.com>
Subject: Re: [RFC PATCH 01/31] mm: migrate: Add exchange_pages to exchange two
lists of pages.
On 17 Feb 2019, at 3:29, Matthew Wilcox wrote:
> On Fri, Feb 15, 2019 at 02:08:26PM -0800, Zi Yan wrote:
>> +struct page_flags {
>> + unsigned int page_error :1;
>> + unsigned int page_referenced:1;
>> + unsigned int page_uptodate:1;
>> + unsigned int page_active:1;
>> + unsigned int page_unevictable:1;
>> + unsigned int page_checked:1;
>> + unsigned int page_mappedtodisk:1;
>> + unsigned int page_dirty:1;
>> + unsigned int page_is_young:1;
>> + unsigned int page_is_idle:1;
>> + unsigned int page_swapcache:1;
>> + unsigned int page_writeback:1;
>> + unsigned int page_private:1;
>> + unsigned int __pad:3;
>> +};
>
> I'm not sure how to feel about this. It's a bit fragile versus somebody
> adding new page flags. I don't know whether it's needed or whether you can
> just copy page->flags directly because you're holding PageLock.
I agree with you that the current way of copying page flags individually
could miss new page flags. I will try to come up with something better.
Copying page->flags as a whole might not work as is, since the upper part
of page->flags holds the page node information, which should not be changed.
I think I need to add a helper function that just copies/exchanges all page
flags, like calling migrate_page_states() twice.
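As a rough illustration of exchanging only the state bits while leaving the
upper node information alone, here is a userspace sketch. It is not kernel
code: DEMO_STATE_BITS, DEMO_STATE_MASK and demo_exchange_state_flags() are
hypothetical names, the assumption that the state flags occupy the low 16
bits is made up for the example, and the real page->flags would need the
kernel's atomic bit operations rather than plain loads and stores.

```c
#include <stdint.h>

/*
 * Hypothetical layout for illustration only: the low DEMO_STATE_BITS bits
 * hold per-page state flags, everything above them identifies where the
 * page lives (node, zone, ...) and must stay with the page.
 */
#define DEMO_STATE_BITS 16
#define DEMO_STATE_MASK ((UINT64_C(1) << DEMO_STATE_BITS) - 1)

/*
 * Exchange the state bits of two flag words without touching the upper
 * bits. Not atomic; a sketch of the masking idea only.
 */
static void demo_exchange_state_flags(uint64_t *a, uint64_t *b)
{
	/* Bits that differ between a and b, restricted to the state mask. */
	uint64_t diff = (*a ^ *b) & DEMO_STATE_MASK;

	/* Flipping the differing bits in both words swaps them. */
	*a ^= diff;
	*b ^= diff;
}
```

Because the mask covers every state bit at once, a newly added flag inside
the masked range would be exchanged without having to list it by name.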
>> +static void exchange_page(char *to, char *from)
>> +{
>> + u64 tmp;
>> + int i;
>> +
>> + for (i = 0; i < PAGE_SIZE; i += sizeof(tmp)) {
>> + tmp = *((u64 *)(from + i));
>> + *((u64 *)(from + i)) = *((u64 *)(to + i));
>> + *((u64 *)(to + i)) = tmp;
>> + }
>> +}
>
> I have a suspicion you'd be better off allocating a temporary page and
> using copy_page(). Some architectures have put a lot of effort into
> making copy_page() run faster.
When I do exchange_pages() between two NUMA nodes on an x86_64 machine,
I can actually saturate the QPI bandwidth with this operation. I think
cache prefetching was doing its job.

The purpose of proposing exchange_pages() is to avoid allocating any new
page, so that we would not trigger any potential page reclaim or memory
compaction. Allocating a temporary page defeats the purpose.
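To make the no-allocation point concrete, here is a userspace sketch of the
same word-by-word exchange on two ordinary buffers. DEMO_PAGE_SIZE and
demo_exchange_buffers() are hypothetical names for this example, and 4096 is
only an assumed page size. It uses memcpy() for the word accesses so it is
valid outside the kernel's build flags; the only temporary storage is two
64-bit locals.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed page size for the example; real code would use PAGE_SIZE. */
#define DEMO_PAGE_SIZE 4096

/*
 * Exchange the contents of two page-sized buffers in place, one 64-bit
 * word at a time, without allocating a temporary page.
 */
static void demo_exchange_buffers(unsigned char *to, unsigned char *from)
{
	size_t i;

	for (i = 0; i < DEMO_PAGE_SIZE; i += sizeof(uint64_t)) {
		uint64_t a, b;

		/* Load one word from each buffer. */
		memcpy(&a, to + i, sizeof(a));
		memcpy(&b, from + i, sizeof(b));

		/* Store them back crosswise. */
		memcpy(to + i, &b, sizeof(b));
		memcpy(from + i, &a, sizeof(a));
	}
}
```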
>
>> + xa_lock_irq(&to_mapping->i_pages);
>> +
>> + to_pslot = radix_tree_lookup_slot(&to_mapping->i_pages,
>> + page_index(to_page));
>
> This needs to be converted to the XArray. radix_tree_lookup_slot() is
> going away soon. You probably need:
>
> XA_STATE(to_xas, &to_mapping->i_pages, page_index(to_page));
Thank you for pointing this out. I will make the change.
>
> This is a lot of code and I'm still trying to get my head around it all.
> Thanks for putting in this work; it's good to see this approach being
> explored.
Thank you for taking a look at the code.
--
Best Regards,
Yan Zi