Message-ID: <0190d084-5295-83c3-98e7-eceae1b45c89@linux.alibaba.com>
Date:   Mon, 27 Jan 2020 08:34:23 -0800
From:   Yang Shi <yang.shi@...ux.alibaba.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     richardw.yang@...ux.intel.com, akpm@...ux-foundation.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        stable@...r.kernel.org
Subject: Re: [v2 PATCH] mm: move_pages: report the number of non-attempted
 pages



On 1/27/20 1:55 AM, Michal Hocko wrote:
> On Thu 23-01-20 07:38:51, Yang Shi wrote:
>> Since commit a49bd4d71637 ("mm, numa: rework do_pages_move"),
>> the semantic of move_pages() was changed to return the number of
>> non-migrated pages (those that failed to migrate) and the call would be
>> aborted immediately if migrate_pages() returns a positive value.  But it
>> didn't report the number of pages that we haven't even attempted to migrate.
>> So, fix it by including non-attempted pages in the return value.
> I would rephrase the changelog like this
> "
> Since commit a49bd4d71637 ("mm, numa: rework do_pages_move"),
> the semantic of move_pages() has changed to return the number of
> non-migrated pages if they were the result of non-fatal reasons (usually a
> busy page). This was an unintentional change that hasn't been noticed
> except for LTP tests which checked for the documented behavior.
>
> There are two ways to go around this change. We can even get back to the
> original behavior and return -EAGAIN whenever migrate_pages is not able
> to migrate pages due to non-fatal reasons. Another option would be to
> simply continue with the changed semantic and extend move_pages
> documentation to clarify that -errno is returned on an invalid input or
> when migration simply cannot succeed (e.g. -ENOMEM, -EBUSY) or the
> number of pages that couldn't have been migrated due to ephemeral
> reasons (e.g. page is pinned or locked for other reasons).
>
> This patch implements the second option because this behavior has been in
> place for some time without anybody complaining, and new users may
> depend on it. It also allows slightly easier error handling,
> as the caller knows that it is worth retrying when err > 0.
> "
>
>> Fixes: a49bd4d71637 ("mm, numa: rework do_pages_move")
>> Suggested-by: Michal Hocko <mhocko@...e.com>
>> Cc: Wei Yang <richardw.yang@...ux.intel.com>
>> Cc: <stable@...r.kernel.org>    [4.17+]
>> Signed-off-by: Yang Shi <yang.shi@...ux.alibaba.com>
> With more clarification, feel free to add
> Acked-by: Michal Hocko <mhocko@...e.com>

Thanks. Will post v3 with the rephrased commit log.
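
For illustration only, the caller-side handling described in the rephrased
changelog could be sketched as below. This is a minimal sketch, not part of
the patch: classify_move_result(), enum move_action and the retries_left
budget are hypothetical names, and ret is assumed to be the kernel-style
value (negative errno, zero, or a positive count of non-migrated pages)
rather than the -1/errno convention of the libc wrapper.

```c
#include <assert.h>

/* Hypothetical caller-side decision; none of these names exist in the kernel. */
enum move_action { MOVE_DONE, MOVE_RETRY, MOVE_FAIL };

/*
 * ret:          kernel-style result of move_pages() (assumed convention:
 *               < 0 is -errno, 0 is success, > 0 is the number of pages
 *               that were not migrated).
 * retries_left: how many more attempts the caller is willing to make.
 */
static enum move_action classify_move_result(long ret, int retries_left)
{
	assert(retries_left >= 0);

	if (ret == 0)
		return MOVE_DONE;	/* every page was migrated */
	if (ret < 0)
		return MOVE_FAIL;	/* invalid input or fatal error */

	/* ret > 0: ephemeral failure, worth retrying while budget remains */
	return retries_left > 0 ? MOVE_RETRY : MOVE_FAIL;
}
```

A caller would loop while the result is MOVE_RETRY, decrementing its own
retry budget on each pass.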

>
>> ---
>> v2: Rebased on top of the latest mainline kernel per Andrew
>>
>>   mm/migrate.c | 24 ++++++++++++++++++++++--
>>   1 file changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 86873b6..9b8eb5d 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1627,8 +1627,18 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
>>   			start = i;
>>   		} else if (node != current_node) {
>>   			err = do_move_pages_to_node(mm, &pagelist, current_node);
>> -			if (err)
>> +			if (err) {
>> +				/*
>> +				 * Positive err means the number of failed
>> +				 * pages to migrate.  Since we are going to
>> +				 * abort and return the number of non-migrated
>> +				 * pages, we need to include the rest of the
>> +				 * nr_pages that have not been attempted as well.
>> +				 */
>> +				if (err > 0)
>> +					err += nr_pages - i - 1;
>>   				goto out;
>> +			}
>>   			err = store_status(status, start, current_node, i - start);
>>   			if (err)
>>   				goto out;
>> @@ -1659,8 +1669,11 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
>>   			goto out_flush;
>>   
>>   		err = do_move_pages_to_node(mm, &pagelist, current_node);
>> -		if (err)
>> +		if (err) {
>> +			if (err > 0)
>> +				err += nr_pages - i - 1;
>>   			goto out;
>> +		}
>>   		if (i > start) {
>>   			err = store_status(status, start, current_node, i - start);
>>   			if (err)
>> @@ -1674,6 +1687,13 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
>>   
>>   	/* Make sure we do not overwrite the existing error */
>>   	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
>> +	/*
>> +	 * Don't have to report non-attempted pages here since:
>> +	 *     - If the above loop is done gracefully there is no non-attempted
>> +	 *       page.
>> +	 *     - If the above loop is aborted it means a more fatal error
>> +	 *       happened, should return err.
>> +	 */
>>   	if (!err1)
>>   		err1 = store_status(status, start, current_node, i - start);
>>   	if (!err)
>> -- 
>> 1.8.3.1
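
The accounting added by the patch above can be modelled in user space as
follows. This is a simplified sketch under stated assumptions, not kernel
code: report_move_result() and batch_err are hypothetical names, and the
function only mirrors the "err += nr_pages - i - 1" adjustment applied when a
batch flushed at loop index i reports a positive number of failed pages.

```c
#include <assert.h>

/*
 * Hypothetical user-space model of the patch's accounting.
 *
 * batch_err: result of migrating the batch flushed at index i
 *            (< 0 fatal -errno, 0 success, > 0 pages in the batch that
 *            failed to migrate).
 * i:         index of the page being examined when the batch was flushed.
 * nr_pages:  total number of pages requested by the caller.
 *
 * Returns the value that would be reported when the call aborts.
 */
static long report_move_result(long batch_err, long i, long nr_pages)
{
	assert(i >= 0 && i < nr_pages);

	/* Only a positive count is extended with the non-attempted pages. */
	if (batch_err > 0)
		batch_err += nr_pages - i - 1;

	return batch_err;
}
```

For example, with nr_pages = 10, a batch flushed at i = 3 that reports 2
failed pages yields 2 + (10 - 3 - 1) = 8, while a negative errno or zero is
passed through unchanged.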
