Date:   Wed, 20 Mar 2019 11:31:50 -0700
From:   Yang Shi <yang.shi@...ux.alibaba.com>
To:     Oscar Salvador <osalvador@...e.de>
Cc:     chrubis@...e.cz, vbabka@...e.cz, kirill@...temov.name,
        akpm@...ux-foundation.org, stable@...r.kernel.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: mempolicy: make mbind() return -EIO when
 MPOL_MF_STRICT is specified



On 3/20/19 1:16 AM, Oscar Salvador wrote:
> On Wed, Mar 20, 2019 at 02:35:56AM +0800, Yang Shi wrote:
>> Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()")
>> Reported-by: Cyril Hrubis <chrubis@...e.cz>
>> Cc: Vlastimil Babka <vbabka@...e.cz>
>> Cc: stable@...r.kernel.org
>> Suggested-by: Kirill A. Shutemov <kirill@...temov.name>
>> Signed-off-by: Yang Shi <yang.shi@...ux.alibaba.com>
>> Signed-off-by: Oscar Salvador <osalvador@...e.de>
> Hi Yang, thanks for the patch.
>
> Some observations below.
>
>>   	}
>>   	page = pmd_page(*pmd);
>> @@ -473,8 +480,15 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
>>   	ret = 1;
>>   	flags = qp->flags;
>>   	/* go to thp migration */
>> -	if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
>> +	if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
>> +		if (!vma_migratable(walk->vma)) {
>> +			ret = -EIO;
>> +			goto unlock;
>> +		}
>> +
>>   		migrate_page_add(page, qp->pagelist, flags);
>> +	} else
>> +		ret = -EIO;
> 	if (!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) ||
> 	    !vma_migratable(walk->vma)) {
> 		ret = -EIO;
> 		goto unlock;
> 	}
>
> 	migrate_page_add(page, qp->pagelist, flags);
> unlock:
> 	spin_unlock(ptl);
> out:
> 	return ret;
>
> seems more clean to me?

Yes, that looks cleaner.
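
For reference, a minimal userspace sketch (not kernel code) suggesting that 
the nested form from the patch and the flattened form quoted above give the 
same result for every combination of flags and migratability. The SK_* 
macros and the function names are made up for illustration, and the bit 
values are assumptions rather than something taken from the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Local stand-ins for the flag bits; assumed values, illustration only. */
#define SK_MF_MOVE	(1u << 1)
#define SK_MF_MOVE_ALL	(1u << 2)

/* Nested form, as in the posted patch. Returns 1 when the page is queued. */
int pmd_ret_nested(unsigned int flags, bool migratable)
{
	int ret = 1;

	if (flags & (SK_MF_MOVE | SK_MF_MOVE_ALL)) {
		if (!migratable)
			ret = -EIO;
		/* otherwise migrate_page_add() would run here */
	} else
		ret = -EIO;

	return ret;
}

/* Flattened form, as suggested in the review. */
int pmd_ret_flat(unsigned int flags, bool migratable)
{
	if (!(flags & (SK_MF_MOVE | SK_MF_MOVE_ALL)) || !migratable)
		return -EIO;

	/* migrate_page_add() would run here */
	return 1;
}

/* Compare both forms over all 3-bit flag values and both vma states. */
void check_forms_agree(void)
{
	for (unsigned int flags = 0; flags < 8; flags++)
		for (int m = 0; m <= 1; m++)
			assert(pmd_ret_nested(flags, m) == pmd_ret_flat(flags, m));
}
```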

>
>
>>   unlock:
>>   	spin_unlock(ptl);
>>   out:
>> @@ -499,8 +513,10 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
>>   	ptl = pmd_trans_huge_lock(pmd, vma);
>>   	if (ptl) {
>>   		ret = queue_pages_pmd(pmd, ptl, addr, end, walk);
>> -		if (ret)
>> +		if (ret > 0)
>>   			return 0;
>> +		else if (ret < 0)
>> +			return ret;
> I would go with the following, but that's a matter of taste I guess.
>
> if (ret < 0)
> 	return ret;
> else
> 	return 0;

No, this is not correct. queue_pages_pmd() may return 0, which means the 
THP was split. In that case the code should fall through to the PTE walk 
below instead of returning.
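
To spell out the three cases, here is a minimal userspace sketch (not kernel 
code). The enum and the function name are hypothetical; only the meaning of 
the return values follows the discussion above:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical labels for what the caller does next; illustration only. */
enum walk_step { WALK_DONE, WALK_FALL_THROUGH, WALK_ERROR };

/*
 * Models how queue_pages_pte_range() treats the value returned by
 * queue_pages_pmd():
 *   ret > 0   the huge page was handled as a whole, return 0 to the walker
 *   ret == 0  the THP was split, fall through to the per-PTE loop
 *   ret < 0   error, propagate it
 */
enum walk_step classify_pmd_result(int ret, int *err)
{
	assert(err != NULL);
	*err = 0;

	if (ret > 0)
		return WALK_DONE;
	if (ret < 0) {
		*err = ret;
		return WALK_ERROR;
	}
	return WALK_FALL_THROUGH;
}
```

With "if (ret < 0) return ret; else return 0;" the WALK_FALL_THROUGH case 
would be folded into WALK_DONE, so the pages of a split THP would never be 
looked at.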

>
>>   	}
>>   
>>   	if (pmd_trans_unstable(pmd))
>> @@ -521,11 +537,16 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
>>   			continue;
>>   		if (!queue_pages_required(page, qp))
>>   			continue;
>> -		migrate_page_add(page, qp->pagelist, flags);
>> +		if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
>> +			if (!vma_migratable(vma))
>> +				break;
>> +			migrate_page_add(page, qp->pagelist, flags);
>> +		} else
>> +			break;
> I might be missing something, but AFAICS neither vma nor flags is going to change
> while we are in queue_pages_pte_range(), so, could not we move the check just
> above the loop?
> In that way, 1) we only perform the check once and 2) if we enter the loop
> we know that we are going to do some work, so, something like:
>
> index af171ccb56a2..7c0e44389826 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -487,6 +487,9 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
>          if (pmd_trans_unstable(pmd))
>                  return 0;
>   
> +       if (!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) || !vma_migratable(vma))
> +               return -EIO;

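That does not look correct to me. We need to check whether there is an 
existing page on a node that the policy does not allow, which is what 
queue_pages_required() does. -EIO should be returned only when such a page 
is found, so the check cannot be hoisted above the loop.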
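
A toy userspace model of the loop (not kernel code) may make the difference 
clearer. Everything here is hypothetical: pages are reduced to the node they 
sit on, allowed_mask stands in for the policy nodemask, and can_move stands 
in for "MPOL_MF_MOVE or MPOL_MF_MOVE_ALL is set and the vma is migratable":

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for queue_pages_required(): is the page on a disallowed node? */
bool page_misplaced(int node, unsigned int allowed_mask)
{
	assert(node >= 0 && node < 32);
	return !(allowed_mask & (1u << node));
}

/*
 * Walk the "pages" the way the patched loop does: skip pages that already
 * satisfy the policy, stop at the first misplaced page that cannot be
 * moved, otherwise count the page as queued for migration.
 */
int walk_pages(const int *nodes, size_t n, unsigned int allowed_mask,
	       bool can_move, size_t *queued)
{
	size_t i;

	*queued = 0;
	for (i = 0; i < n; i++) {
		if (!page_misplaced(nodes[i], allowed_mask))
			continue;
		if (!can_move)
			break;
		(*queued)++;
	}

	/* Mirrors "return addr != end ? -EIO : 0;" in the patch. */
	return i != n ? -EIO : 0;
}
```

If every page is already on an allowed node, the walk returns 0 even when 
can_move is false. A check placed before the loop would return -EIO in that 
case, although nothing violates the policy.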

Thanks,
Yang

> +
>          pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
>          for (; addr != end; pte++, addr += PAGE_SIZE) {
>                  if (!pte_present(*pte))
>
>
>>   	}
>>   	pte_unmap_unlock(pte - 1, ptl);
>>   	cond_resched();
>> -	return 0;
>> +	return addr != end ? -EIO : 0;
> If we can do the above, we can leave the return value as it was.
>
