linux-kernel - Re: [RFC v5 09/11] mm: Try spin lock in speculative path

Message-Id: <ce7a039a-2697-f16e-b0b3-f6ae41391682@linux.vnet.ibm.com>
Date:   Thu, 6 Jul 2017 17:29:26 +0200
From:   Laurent Dufour <ldufour@...ux.vnet.ibm.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     paulmck@...ux.vnet.ibm.com, akpm@...ux-foundation.org,
        kirill@...temov.name, ak@...ux.intel.com, mhocko@...nel.org,
        dave@...olabs.net, jack@...e.cz,
        Matthew Wilcox <willy@...radead.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        haren@...ux.vnet.ibm.com, khandual@...ux.vnet.ibm.com,
        npiggin@...il.com, bsingharora@...il.com,
        Tim Chen <tim.c.chen@...ux.intel.com>
Subject: Re: [RFC v5 09/11] mm: Try spin lock in speculative path

On 06/07/2017 16:48, Peter Zijlstra wrote:
> On Thu, Jul 06, 2017 at 03:46:59PM +0200, Laurent Dufour wrote:
>> On 05/07/2017 20:50, Peter Zijlstra wrote:
>>> On Fri, Jun 16, 2017 at 07:52:33PM +0200, Laurent Dufour wrote:
>>>> @@ -2294,8 +2295,19 @@ static bool pte_map_lock(struct vm_fault *vmf)
>>>>  	if (vma_has_changed(vmf->vma, vmf->sequence))
>>>>  		goto out;
>>>>  
>>>> -	pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
>>>> -				  vmf->address, &ptl);
> 
>>>> +	ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
>>>> +	pte = pte_offset_map(vmf->pmd, vmf->address);
>>>> +	if (unlikely(!spin_trylock(ptl))) {
>>>> +		pte_unmap(pte);
>>>> +		goto out;
>>>> +	}
>>>> +
>>>>  	if (vma_has_changed(vmf->vma, vmf->sequence)) {
>>>>  		pte_unmap_unlock(pte, ptl);
>>>>  		goto out;
>>>
>>> Right, so if you look at my earlier patches you'll see I did something
>>> quite disgusting here.
>>>
>>> Not sure that wants repeating, but I cannot remember why I thought this
>>> deadlock didn't exist anymore.
>>
>> Regarding the deadlock, I did hit it on my Power victim node, so I guess it
>> is still there, and the stack traces are quite explicit.
>> Am I missing something here?
> 
> No, you are right in that the deadlock is quite real. What I cannot
> remember is what made me think to remove the really 'wonderful' code I
> had to deal with it.
> 
> That said, you might want to look at how often you terminate the
> speculation because of your trylock failing. If that shows up at all we
> might need to do something about it.

Based on the benchmarks I ran, it doesn't fail very often, but I was
thinking about adding some counters here. The system already accounts for
major and minor page faults, in current->maj_flt and current->min_flt
respectively. I was wondering whether an additional type like async_flt
would be welcome, or whether there is a smarter way to get that metric.
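
To make the counter idea concrete, here is a rough userspace sketch of the
pattern: try the lock, and on failure bump a counter before abandoning the
speculative path. Everything in it is an assumption for illustration only.
toy_spinlock stands in for the kernel's spinlock_t, and struct
spec_fault_stats and spec_lock_or_abort() are hypothetical names that do
not exist in the series or in the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for the kernel's spinlock_t (assumption, userspace only). */
typedef struct {
	atomic_flag locked;
} toy_spinlock;

/* Returns true if the lock was taken, false if it was already held. */
static bool toy_spin_trylock(toy_spinlock *lock)
{
	return !atomic_flag_test_and_set_explicit(&lock->locked,
						  memory_order_acquire);
}

static void toy_spin_unlock(toy_spinlock *lock)
{
	atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}

/* Hypothetical counters; nothing like this exists in the patch series. */
struct spec_fault_stats {
	atomic_ulong attempts;       /* speculative lock attempts */
	atomic_ulong trylock_failed; /* attempts aborted by contention */
};

/*
 * Mirrors the shape of the trylock step in pte_map_lock(): on contention,
 * record the failure and return false so the caller can fall back to the
 * non-speculative path instead of spinning.
 */
static bool spec_lock_or_abort(toy_spinlock *ptl, struct spec_fault_stats *st)
{
	atomic_fetch_add_explicit(&st->attempts, 1, memory_order_relaxed);
	if (!toy_spin_trylock(ptl)) {
		atomic_fetch_add_explicit(&st->trylock_failed, 1,
					  memory_order_relaxed);
		return false;
	}
	return true;
}
```

The ratio trylock_failed / attempts would be the number Peter is asking
about. Whether that belongs next to maj_flt and min_flt or in some other
statistics mechanism is exactly the open question.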

Feel free to advise.

Thanks
Laurent.