Open Source and information security mailing list archives
 
Date:   Thu, 8 Jun 2017 10:29:38 -0500
From:   Larry Finger <Larry.Finger@...inger.net>
To:     David Rientjes <rientjes@...gle.com>,
        Vlastimil Babka <vbabka@...e.cz>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org
Subject: Re: Sleeping BUG in khugepaged for i586

On 06/07/2017 03:56 PM, David Rientjes wrote:
> On Wed, 7 Jun 2017, Vlastimil Babka wrote:
> 
>>>> Hmm I'd expect such spin lock to be reported together with mmap_sem in
>>>> the debugging "locks held" message?
>>>
>>> My bisection of the problem is about half done. My latest good version is commit
>>> 7b8cd33 and the latest bad one is 2ea659a. Only about 7 steps to go.
>>
>> Hmm, your bisection will most likely just find commit 338a16ba15495
>> which added the cond_resched() at mm/khugepaged.c:655. CCing David who
>> added it.
>>
> 
> I agree it's probably going to bisect to 338a16ba15495 since it's the
> cond_resched() at the line number reported, but I think there must be
> something else going on.  I think the list of locks held by khugepaged is
> correct because it matches with the implementation.  The preempt_count(),
> as suggested by Andrew, does not.  If this is reproducible, I'd like to
> know what preempt_count() is.
> 

The BUG output is reproducible. By the time the box finishes booting, there are 
at least 2 of them logged. My bisection shows that commit 338a16ba15495 is the 
bad one. I added a pr_info() to output the value of preempt_count() just before 
the cond_resched() statement. The count was always 1 whether the BUG was 
triggered or not.
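
For reference, the instrumentation described above amounts to roughly the
following. This is only a sketch, not the exact patch: in the kernel the change
is just the pr_info() line placed in front of the existing cond_resched() call.
The stubs for preempt_count(), pr_info() and cond_resched(), and the names
fake_preempt_count, resched_calls and logged_cond_resched(), are hypothetical
userspace stand-ins so that the fragment compiles outside the kernel tree.

```c
#include <stdio.h>

/*
 * Userspace stand-ins, NOT the kernel implementations. They exist only
 * so this sketch builds and runs outside the kernel tree.
 */
static int fake_preempt_count = 1;  /* assumed: the value seen in the report */
static int resched_calls;           /* counts calls to the stubbed cond_resched() */

static int preempt_count(void)
{
	return fake_preempt_count;
}

#define pr_info(...) printf(__VA_ARGS__)

static int cond_resched(void)
{
	resched_calls++;    /* the real function may reschedule; this stub does not */
	return 0;
}

/*
 * The instrumentation itself: log the preempt count immediately before
 * the cond_resched() call, then return the logged value.
 */
static int logged_cond_resched(void)
{
	int count = preempt_count();

	pr_info("khugepaged: preempt_count() = %d\n", count);
	cond_resched();
	return count;
}
```

Calling logged_cond_resched() prints one line with the current count and then
invokes cond_resched(), so each log entry can be matched against the BUG
reports that do or do not follow it.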

If there are other things you would like logged at that point, or any other 
diagnostics, please let me know.

Larry
