lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9aa3ff03-8397-4ca9-dc55-d893948f7ece@linux.alibaba.com>
Date:   Fri, 24 Jan 2020 09:48:30 -0800
From:   Yang Shi <yang.shi@...ux.alibaba.com>
To:     Wei Yang <richardw.yang@...ux.intel.com>,
        Michal Hocko <mhocko@...nel.org>
Cc:     akpm@...ux-foundation.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [v2 PATCH] mm: move_pages: report the number of non-attempted
 pages



On 1/24/20 7:26 AM, Wei Yang wrote:
> On Fri, Jan 24, 2020 at 07:46:49AM +0100, Michal Hocko wrote:
>> On Fri 24-01-20 06:56:47, Wei Yang wrote:
>>> On Thu, Jan 23, 2020 at 09:55:26AM +0100, Michal Hocko wrote:
>>>> On Thu 23-01-20 11:27:36, Wei Yang wrote:
>>>>> On Thu, Jan 23, 2020 at 07:38:51AM +0800, Yang Shi wrote:
>>>>>> Since commit a49bd4d71637 ("mm, numa: rework do_pages_move"),
>>>>>> the semantic of move_pages() was changed to return the number of
>>>>>> non-migrated pages (failed to migration) and the call would be aborted
>>>>>> immediately if migrate_pages() returns positive value.  But it didn't
>>>>>> report the number of pages that we even haven't attempted to migrate.
>>>>>> So, fix it by including non-attempted pages in the return value.
>>>>>>
>>>>> First, we want to change the semantic of move_pages(2). The return value
>>>>> indicates the number of pages we didn't managed to migrate?
>>>>>
>>>>> Second, the return value from migrate_pages() doesn't mean the number of pages
>>>>> we failed to migrate. For example, one -ENOMEM is returned on the first page,
>>>>> migrate_pages() would return 1. But actually, no page successfully migrated.
>>>> ENOMEM is considered a permanent failure and as such it is returned by
>>>> migrate pages (see goto out).
>>>>
>>>>> Third, even the migrate_pages() return the exact non-migrate page, we are not
>>>>> sure those non-migrated pages are at the tail of the list. Because in the last
>>>>> case in migrate_pages(), it just remove the page from list. It could be a page
>>>>> in the middle of the list. Then, in userspace, how the return value be
>>>>> leveraged to determine the valid status? Any page in the list could be the
>>>>> victim.
>>>> Yes, I was wrong when stating that the caller would know better which
>>>> status to check. I misremembered the original patch as it was quite some
>>>> time ago. While storing the error code would be possible after some
>>>> massaging of migrate_pages is this really something we deeply care
>>>> about. The caller can achieve the same by initializing the status array
>>>> to a non-node number - e.g. -1 - and check based on that.
>>>>
>>> So for a user, the best practice is to initialize the status array to -1 and
>>> check each status to see whether the page is migrated successfully?
>> Yes IMO. Just consider -errno return value. You have no way to find out
>> which pages have been migrated until we reached that error. The
>> possitive return value would fall into the same case.
>>
>>> Then do we need to return the number of non-migrated page? What benefit could
>>> user get from the number. How about just return an error code to indicate the
>>> failure? I may miss some point, would you mind giving me a hint?
>> This is certainly possible. We can return -EAGAIN if some pages couldn't
>> be migrated because they are pinned. But please read my previous email
>> to the very end for arguments why this might cause more problems than it
>> actually solves.
>>
> Let me put your comment here:
>
>      Because new users could have started depending on it. It
>      is not all that unlikely that the current implementation would just
>      work for them because they are migrating a set of pages on to the same
>      node so the batch would be a single list throughout the whole given
>      page set.
>
> Your idea is to preserve current semantic, return non-migrated pages number to
> userspace.
>
> And the reason is:
>
>     1. Users have started depending on it.
>     2. No real bug reported yet.
>     3. User always migrate page to the same node. (If my understanding is
>        correct)
>
> I think this gets some reason, since we want to minimize the impact to
> userland.
>
> While let's see what user probably use this syscall. Since from the man page,
> we never told the return value could be positive, the number of non-migrated
> pages, user would think only 0 means a successful migration and all other
> cases are failure. Then user probably handle negative and positive return
> value the same way, like (!err).
>
> If my guess is true, return a negative error value for this case could
> minimize the impact to userland here.
>     1. Preserve the semantic of move_pages(2): 0 means success, negative means
>        some error and needs extra handling.
>     2. Trivial change to the man page.
>     3. Suppose no change to users.
>
> Well, in case I missed your point, sorry about that.

I think we should compare the new semantic with the old one. With the 
old semantic the move_pages() return 0 for both success *and* migration 
failure. So, I'm supposed (I don't have any real usecase) the user may 
do the below with the old semantic:
     - Just check if it is failed (ignore migration failure), "!err" is 
good enough.  This usecase is fine as well with the new semantic since 
migration failure is also a kind of error cases.
      - Care about migration failure, the user needs traverse all bits 
in the status array. With the new semantic they just need check if "err 
 > 0", if they want to know what specific pages are failed to migrate, 
then traverse the status array (with initialized as -1 as Michal 
suggested in earlier email).

So, with returning errno for migration failure if the userspace wants to 
see if migration is failed, they need do:
     1. Check "!err"
     2. Read errno if #1 returns false
     3. Traverse status array to see how many pages are failed to migrate

But with the new semantic they just need check if "err > 0", one step is 
fine for the most cases. So I said this approach seems more 
straightforward to the userspace and makes more sense IMHO.

>>>> This system call has quite a complex semantic and I am not 100% sure
>>>> what is the right thing to do here. Maybe we do want to continue and try
>>>> to migrate as much as possible on non-fatal migration failures and
>>>> accumulate the number of failed pages while doing so.
>>>>
>>>> The main problem is that we can have an academic discussion but
>>>> the primary question is what do actual users want. A lack of real
>>>> bug reports suggests that nobody has actually noticed this. So I
>>>> would rather keep returning the correct number of non-migrated
>>>> pages. Why? Because new users could have started depending on it. It
>>>> is not all that unlikely that the current implementation would just
>>>> work for them because they are migrating a set of pages on to the same
>>>> node so the batch would be a single list throughout the whole given
>>>> page set.
>> -- 
>> Michal Hocko
>> SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ