linux-kernel - Re: [PATCH] mm/huge_memory: fix the memory leak due to the race

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <576A5FE4.3090108@huawei.com>
Date:	Wed, 22 Jun 2016 17:52:36 +0800
From:	zhong jiang <zhongjiang@...wei.com>
To:	"Kirill A. Shutemov" <kirill@...temov.name>
CC:	<mhocko@...nel.org>, <akpm@...ux-foundation.org>,
	<linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm/huge_memory: fix the memory leak due to the race

On 2016/6/21 23:29, Kirill A. Shutemov wrote:
> On Tue, Jun 21, 2016 at 11:19:07PM +0800, zhong jiang wrote:
>> On 2016/6/21 22:37, Kirill A. Shutemov wrote:
>>> On Tue, Jun 21, 2016 at 10:05:56PM +0800, zhongjiang wrote:
>>>> From: zhong jiang <zhongjiang@...wei.com>
>>>>
>>>> with great pressure, I run some test cases. As a result, I found
>>>> that the THP is not freed, it is detected by check_mm().
>>>>
>>>> BUG: Bad rss-counter state mm:ffff8827edb70000 idx:1 val:512
>>>>
>>>> Consider the following race :
>>>>
>>>> 	CPU0                               CPU1
>>>>   __handle_mm_fault()
>>>>         wp_huge_pmd()
>>>>    	    do_huge_pmd_wp_page()
>>>> 		pmdp_huge_clear_flush_notify()
>>>>                 (pmd_none = true)
>>>> 					exit_mmap()
>>>> 					   unmap_vmas()
>>>> 					     zap_pmd_range()
>>>> 						pmd_none_or_trans_huge_or_clear_bad()
>>>> 						   (result in memory leak)
>>>>                 set_pmd_at()
>>>>
>>>> because of CPU0 have allocated huge page before pmdp_huge_clear_notify,
>>>> and it make the pmd entry to be null. Therefore, The memory leak can occur.
>>>>
>>>> The patch fix the scenario that the pmd entry can lead to be null.
>>> I don't think the scenario is possible.
>>>
>>> exit_mmap() called when all mm users have gone, so no parallel threads
>>> exist.
>>>
>>  Forget  this patch.  It 's my fault , it indeed don not exist.
>>  But I  hit the following problem.  we can see the memory leak when the process exit.
>>  
>>  
>>  Any suggestion will be apprecaited.
> Could you try this:
>
> http://lkml.kernel.org/r/20160621150433.GA7536@node.shutemov.name
 The patch I have seen ,  but I  don not think this patch  can fix so problem . if that race occur,  pmd entry points to
 the huge page will be changed ,  and freeze_page spilt pmd will fail. subsequent vm_bug_on() will fired.

 freeze_page()
     try_to_unmap()
         split_huge_pmd_address() (return fail) result in page_mapcount is not zero
 vm_bug_on()