lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <561D0C5A.6030505@suse.cz>
Date:	Tue, 13 Oct 2015 15:51:22 +0200
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Martin Schwidefsky <schwidefsky@...ibm.com>
Cc:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	Yong Sun <yosun@...e.com>, linux390@...ibm.com,
	linux-s390@...r.kernel.org, Zhang Yi <wetpzy@...il.com>,
	Mel Gorman <mgorman@...e.de>,
	Andrea Arcangeli <aarcange@...hat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Hugh Dickins <hughd@...gle.com>,
	Dominik Dingel <dingel@...ux.vnet.ibm.com>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	Christian Borntraeger <borntraeger@...ibm.com>
Subject: Re: [RFC] futex: prevent endless loop on s390x with emulated
 hugepages

On 10/13/2015 01:48 PM, Vlastimil Babka wrote:
> On 09/28/2015 01:49 PM, Martin Schwidefsky wrote:
>> On Thu, 24 Sep 2015 17:05:48 +0200
>> Vlastimil Babka <vbabka@...e.cz> wrote:
>
> [...]
>
>>> However, __get_user_pages_fast() is still broken. The get_user_pages_fast()
>>> wrapper will hide this in the common case. The other user of the __ variant
>>> is kvm, which is mentioned as the reason for removal of emulated hugepages.
>>> The call of page_cache_get_speculative() looks also broken in this scenario
>>> on debug builds because of VM_BUG_ON_PAGE(PageTail(page), page). With
>>> CONFIG_TINY_RCU enabled, there's plain atomic_inc(&page->_count) which also
>>> probably shouldn't happen for a tail page...
>>
>> It boils down to __get_user_pages_fast being broken for emulated large pages,
>> doesn't it? My preferred fix would be to get __get_user_page_fast to work
>> in this case.
>
> I agree, but didn't know enough of the architecture to attempt such fix
> :) Thanks!
>
>> For 3.12 a patch would look like this (needs more testing
>> though):
>
> FWIW it works for me in the particular LTP test, but as you said, it
> needs more testing and breaking stable would suck.

I'm trying to break the patch on 3.12 with trinity, let's see...
Tried also to review it, although it's unlikely I'll catch some 
s390x-specific gotchas. For example, can't say what the effect of 
_SEGMENT_ENTRY_CO removal will be - before, the bit was set for 
non-emulated hugepages, and now the same bit is set for emulated ones?
Or if pmd_bad() was also broken before, and now isn't? But otherwise the 
change seems OK, besides some nitpick below.

[...]

>> @@ -103,7 +104,7 @@ static inline int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
>>    		unsigned long end, int write, struct page **pages, int *nr)
>>    {
>>    	unsigned long next;
>> -	pmd_t *pmdp, pmd;
>> +	pmd_t *pmdp, pmd, pmd_orig;
>>
>>    	pmdp = (pmd_t *) pudp;
>>    #ifdef CONFIG_64BIT
>> @@ -112,7 +113,7 @@ static inline int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
>>    	pmdp += pmd_index(addr);
>>    #endif
>>    	do {
>> -		pmd = *pmdp;
>> +		pmd = pmd_orig = *pmdp;
>>    		barrier();
>>    		next = pmd_addr_end(addr, end);
>>    		/*
>> @@ -127,8 +128,9 @@ static inline int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
>>    		if (pmd_none(pmd) || pmd_trans_splitting(pmd))
>>    			return 0;
>>    		if (unlikely(pmd_large(pmd))) {
>> -			if (!gup_huge_pmd(pmdp, pmd, addr, next,
>> -					  write, pages, nr))
>> +			if (!gup_huge_pmd(pmdp, pmd_orig,
>> +					  pmd_swlarge_deref(pmd),
>> +					  addr, next, write, pages, nr))
>>    				return 0;
>>    		} else if (!gup_pte_range(pmdp, pmd, addr, next,
>>    					  write, pages, nr))

The "pmd" variable isn't changed anywhere in this loop after the initial 
assignment, so the extra "pmd_orig" variable isn't needed.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ