Message-ID: <3c880e88-6eb7-cd6d-fbf3-394b89355e10@linux.alibaba.com>
Date: Wed, 20 Mar 2019 11:31:50 -0700
From: Yang Shi <yang.shi@...ux.alibaba.com>
To: Oscar Salvador <osalvador@...e.de>
Cc: chrubis@...e.cz, vbabka@...e.cz, kirill@...temov.name,
akpm@...ux-foundation.org, stable@...r.kernel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: mempolicy: make mbind() return -EIO when
MPOL_MF_STRICT is specified
On 3/20/19 1:16 AM, Oscar Salvador wrote:
> On Wed, Mar 20, 2019 at 02:35:56AM +0800, Yang Shi wrote:
>> Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()")
>> Reported-by: Cyril Hrubis <chrubis@...e.cz>
>> Cc: Vlastimil Babka <vbabka@...e.cz>
>> Cc: stable@...r.kernel.org
>> Suggested-by: Kirill A. Shutemov <kirill@...temov.name>
>> Signed-off-by: Yang Shi <yang.shi@...ux.alibaba.com>
>> Signed-off-by: Oscar Salvador <osalvador@...e.de>
> Hi Yang, thanks for the patch.
>
> Some observations below.
>
>> }
>> page = pmd_page(*pmd);
>> @@ -473,8 +480,15 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
>> ret = 1;
>> flags = qp->flags;
>> /* go to thp migration */
>> - if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
>> + if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
>> + if (!vma_migratable(walk->vma)) {
>> + ret = -EIO;
>> + goto unlock;
>> + }
>> +
>> migrate_page_add(page, qp->pagelist, flags);
>> + } else
>> + ret = -EIO;
> 	if (!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) ||
> 	    !vma_migratable(walk->vma)) {
> 		ret = -EIO;
> 		goto unlock;
> 	}
>
> 	migrate_page_add(page, qp->pagelist, flags);
> unlock:
> 	spin_unlock(ptl);
> out:
> 	return ret;
>
> seems cleaner to me?
Yes, that is cleaner.
>
>
>> unlock:
>> spin_unlock(ptl);
>> out:
>> @@ -499,8 +513,10 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
>> ptl = pmd_trans_huge_lock(pmd, vma);
>> if (ptl) {
>> ret = queue_pages_pmd(pmd, ptl, addr, end, walk);
>> - if (ret)
>> + if (ret > 0)
>> return 0;
>> + else if (ret < 0)
>> + return ret;
> I would go with the following, but that's a matter of taste I guess.
>
> 	if (ret < 0)
> 		return ret;
> 	else
> 		return 0;
No, this is not correct. queue_pages_pmd() may return 0, which means the
THP was split. In that case the code should fall through to the PTE loop
instead of returning.
>
>> }
>>
>> if (pmd_trans_unstable(pmd))
>> @@ -521,11 +537,16 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
>> continue;
>> if (!queue_pages_required(page, qp))
>> continue;
>> - migrate_page_add(page, qp->pagelist, flags);
>> + if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
>> + if (!vma_migratable(vma))
>> + break;
>> + migrate_page_add(page, qp->pagelist, flags);
>> + } else
>> + break;
> I might be missing something, but AFAICS neither vma nor flags is going to change
> while we are in queue_pages_pte_range(), so couldn't we move the check to just
> above the loop?
> That way, 1) we only perform the check once and 2) if we enter the loop
> we know that we are going to do some work. Something like:
>
> index af171ccb56a2..7c0e44389826 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -487,6 +487,9 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
> if (pmd_trans_unstable(pmd))
> return 0;
>
> + if (!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) || !vma_migratable(vma))
> + return -EIO;
That does not sound correct to me. We need to check whether there is an
existing page on a node that is not allowed by the policy. This is what
queue_pages_required() does.
Thanks,
Yang
> +
> pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
> for (; addr != end; pte++, addr += PAGE_SIZE) {
> if (!pte_present(*pte))
>
>
>> }
>> pte_unmap_unlock(pte - 1, ptl);
>> cond_resched();
>> - return 0;
>> + return addr != end ? -EIO : 0;
> If we can do the above, we can leave the return value as it was.
>