Date:	Fri, 21 Feb 2014 11:50:29 -0500
From:	Sasha Levin <sasha.levin@...cle.com>
To:	Naoya Horiguchi <n-horiguchi@...jp.nec.com>
CC:	linux-mm@...ck.org, akpm@...ux-foundation.org, mpm@...enic.com,
	cpw@....com, kosaki.motohiro@...fujitsu.com, hannes@...xchg.org,
	kamezawa.hiroyu@...fujitsu.com, mhocko@...e.cz,
	aneesh.kumar@...ux.vnet.ibm.com, xemul@...allels.com,
	riel@...hat.com, kirill.shutemov@...ux.intel.com,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 01/11] pagewalk: update page table walker core

On 02/21/2014 11:35 AM, Naoya Horiguchi wrote:
> On Fri, Feb 21, 2014 at 01:43:20AM -0500, Sasha Levin wrote:
>> On 02/20/2014 10:20 PM, Naoya Horiguchi wrote:
>>> Hi Sasha,
>>>
>>> On Thu, Feb 20, 2014 at 06:47:56PM -0500, Sasha Levin wrote:
>>>> Hi Naoya,
>>>>
>>>> This patch seems to trigger a NULL ptr deref here. I didn't have a chance to look into it yet
>>>> but here's the spew:
>>>
>>> Thanks for reporting.
>>> I can't tell from the kernel message what caused this bug. My guess is that
>>> the NULL pointer is deep inside the lockdep routine __lock_acquire(),
>>> so if we find out which pointer was NULL, it might help narrow down where
>>> the problem is (page table walker, lockdep, or both).
>>
>> This actually points to walk_pte_range() trying to lock a NULL spinlock. It happens when we call
>> pte_offset_map_lock() and get a NULL ptl out of pte_lockptr().
> 
> I don't think page->ptl was NULL, because if so we would hit the NULL pointer dereference
> outside __lock_acquire() (it's dereferenced in __raw_spin_lock()).
> Maybe page->ptl->lock_dep was NULL. I'll dig into it more to find out how we failed
> to set this lock_dep thing.

I don't see __raw_spin_lock() derefing it before calling __lock_acquire():

	static inline void __raw_spin_lock(raw_spinlock_t *lock)
	{
		preempt_disable();
		spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
		LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
	}

So after we disable preemption, spin_acquire() is basically a macro that ends up pointing to
lock_acquire().

__raw_spin_lock() would dereference 'lock' only after the lockdep call.
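
To illustrate why '&lock->dep_map' alone does not fault, here is a minimal user-space sketch. The struct names and layout below are hypothetical stand-ins, not the kernel's real definitions; the only assumption is that dep_map is not the first member of the lock. Taking the member's address is pure arithmetic on the lock pointer, so for a NULL lock the callee is handed a small non-NULL address and the first faulting access happens inside the callee:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT the kernel's real layouts. */
struct fake_dep_map {
	const char *name;
};

struct fake_raw_spinlock {
	int raw_lock;			/* placeholder for the arch lock word */
	struct fake_dep_map dep_map;	/* placeholder for the lockdep map */
};

/*
 * Address that "&lock->dep_map" corresponds to for a lock at lock_addr.
 * This is plain arithmetic on the address; no memory is read.
 */
uintptr_t dep_map_addr(uintptr_t lock_addr)
{
	return lock_addr + offsetof(struct fake_raw_spinlock, dep_map);
}

/*
 * For a NULL lock the callee receives a small, non-NULL address, so the
 * first faulting access happens when the callee reads through it.
 */
uintptr_t dep_map_addr_for_null_lock(void)
{
	uintptr_t addr = dep_map_addr(0);

	assert(addr != 0);
	return addr;
}
```

Calling dep_map_addr(0) returns just the member offset, which is what a callee would receive for a NULL lock under these assumptions.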

>>> BTW, just out of curiosity: in my build environment many kernel functions
>>> are inlined, so they don't show up in kernel messages. But in your report
>>> we can see symbols like walk_pte_range() and __lock_acquire(), which never
>>> appear in my kernel. How did you do it? I turned off CONFIG_OPTIMIZE_INLINING,
>>> but that didn't do it.
>>
>> I'm really not sure. I've got a bunch of debug options enabled and it just seems to do the trick.
>>
>> Try CONFIG_READABLE_ASM maybe?
> 
> Hmm, it makes no difference. Can I have your config?

Sure, attached.


Thanks,
Sasha


Download attachment "config.gz" of type "application/gzip" (39429 bytes)
