Message-Id: <9E51ECF6-E9E8-4772-B7D8-7E528DD56A89@lca.pw>
Date: Thu, 5 Dec 2019 04:42:25 -0500
From: Qian Cai <cai@....pw>
To: Yang Shi <yang.shi@...ux.alibaba.com>
Cc: fabecassis@...dia.com, jhubbard@...dia.com, mhocko@...e.com,
cl@...ux.com, vbabka@...e.cz, mgorman@...hsingularity.net,
akpm@...ux-foundation.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [v2 PATCH] mm: move_pages: return valid node id in status if the page is already on the target node
> On Dec 4, 2019, at 11:21 PM, Yang Shi <yang.shi@...ux.alibaba.com> wrote:
>
> Felix Abecassis reports that move_pages() returns a random status if the
> pages are already on the target node, as shown by the test program below:
>
> ---8<---
>
> #include <numaif.h>
> #include <stdint.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/mman.h>
> #include <unistd.h>
>
> int main(void)
> {
>     const long node_id = 1;
>     const long page_size = sysconf(_SC_PAGESIZE);
>     const int64_t num_pages = 8;
>
>     /* Bind allocations to node_id so the pages start on the target node. */
>     unsigned long nodemask = 1UL << node_id;
>     long ret = set_mempolicy(MPOL_BIND, &nodemask, sizeof(nodemask));
>     if (ret < 0)
>         return (EXIT_FAILURE);
>
>     void **pages = malloc(sizeof(void*) * num_pages);
>     for (int i = 0; i < num_pages; ++i) {
>         pages[i] = mmap(NULL, page_size, PROT_WRITE | PROT_READ,
>                         MAP_PRIVATE | MAP_POPULATE | MAP_ANONYMOUS,
>                         -1, 0);
>         if (pages[i] == MAP_FAILED)
>             return (EXIT_FAILURE);
>     }
>
>     ret = set_mempolicy(MPOL_DEFAULT, NULL, 0);
>     if (ret < 0)
>         return (EXIT_FAILURE);
>
>     int *nodes = malloc(sizeof(int) * num_pages);
>     int *status = malloc(sizeof(int) * num_pages);
>     for (int i = 0; i < num_pages; ++i) {
>         nodes[i] = node_id;
>         status[i] = 0xd0; /* simulate garbage values */
>     }
>
>     /* Ask to move the pages to the node they are already on. */
>     ret = move_pages(0, num_pages, pages, nodes, status, MPOL_MF_MOVE);
>     printf("move_pages: %ld\n", ret);
>     for (int i = 0; i < num_pages; ++i)
>         printf("status[%d] = %d\n", i, status[i]);
> }
> ---8<---
>
> Running the program then prints nonsense status values:
> $ ./move_pages_bug
> move_pages: 0
> status[0] = 208
> status[1] = 208
> status[2] = 208
> status[3] = 208
> status[4] = 208
> status[5] = 208
> status[6] = 208
> status[7] = 208
>
> This is because the status is not set if the page is already on the
> target node, but move_pages() should return a valid status as long as it
> succeeds. A valid status is either an errno value or a node id.
>
> We can't simply initialize the status array to zero since the pages may
> not be on node 0. Fix it by updating status with the id of the node the
> page is already on. It looks like we have to update the status inside
> add_page_for_migration() since the page struct is not available outside
> it.
>
> Make add_page_for_migration() return 1 if store_status() fails, so as
> not to mix up the status value, since -EFAULT is also a valid
> status.
I don't really feel this is a bug after all. As you mentioned, the man page was rather poorly written. Why not just update the man page and/or the code comments to document the current behavior instead?
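
As a rough illustration of what documenting the current behavior would ask of callers, here is a minimal sketch. It is not taken from the patch: STATUS_UNSET, classify_status() and describe_status() are hypothetical names. It assumes the caller pre-fills status[] with a sentinel before calling move_pages(), so that entries left untouched can be told apart from node ids (non-negative) and errno values (negative):

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sentinel chosen by the caller: it is neither a node id
 * (node ids are >= 0) nor a plausible negative errno value.
 */
#define STATUS_UNSET INT_MIN

enum status_kind { STATUS_UNTOUCHED, STATUS_NODE, STATUS_ERROR };

/* Classify one entry of the status array after move_pages() returns. */
static enum status_kind classify_status(int status)
{
    if (status == STATUS_UNSET)
        return STATUS_UNTOUCHED;
    return status >= 0 ? STATUS_NODE : STATUS_ERROR;
}

/* Write a human-readable description of one entry into buf; returns buf. */
static const char *describe_status(int status, char *buf, size_t len)
{
    switch (classify_status(status)) {
    case STATUS_UNTOUCHED:
        snprintf(buf, len, "not written by the kernel");
        break;
    case STATUS_NODE:
        snprintf(buf, len, "on node %d", status);
        break;
    case STATUS_ERROR:
        snprintf(buf, len, "error: %s", strerror(-status));
        break;
    }
    return buf;
}
```

Under the behavior described in the quoted report, a caller would set every status[i] to STATUS_UNSET before move_pages() and read a STATUS_UNTOUCHED entry as "page was already on the requested node". Whether that reading is safe to rely on is exactly what the documentation would need to spell out.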