linux-kernel - Re: mm: kernel BUG at mm/memory.c:1230

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20120526235447.GA4016@redhat.com>
Date:	Sun, 27 May 2012 01:54:47 +0200
From:	Andrea Arcangeli <aarcange@...hat.com>
To:	Hugh Dickins <hughd@...gle.com>
Cc:	Sasha Levin <levinsasha928@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	viro <viro@...iv.linux.org.uk>, oleg@...hat.com,
	"a.p.zijlstra" <a.p.zijlstra@...llo.nl>, mingo <mingo@...nel.org>,
	Dave Jones <davej@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	linux-mm <linux-mm@...ck.org>
Subject: Re: mm: kernel BUG at mm/memory.c:1230

Hello everyone,

On Sat, May 26, 2012 at 01:26:48PM -0700, Hugh Dickins wrote:
> I've been round this loop before with that particular VM_BUG_ON.
> 
> At first I thought like Andrew, that it's glaringly wrong on the exit
> path; but then changed my mind.
> 
> When munmapping, we certainly can arrive here with an unaligned addr
> and next; but in that case rwsem_is_locked.
> 
> Whereas in exiting, rwsem is not locked, but we're going linearly upwards,
> and whenever we walk into a pmd_trans_huge area, both addr and next should
> be hpage aligned: the vma bounds are unsuited to THP if they're unaligned.
> 
> Other cases equally should not arise: madvise MADV_DONTNEED should
> have rwsem_is_locked; and truncation or hole-punching shouldn't be
> possible on a pure-anonymous (!vma->vm_ops) area considered for THP.
> 
> But I cannot remember what brought me here before: a crash in testing
> on one of my machines, which further investigation root-caused elsewhere?
> or a report from someone else? or noticed when auditing another problem?
> I'm frustrated not to recall.

I agree it's not a false positive.

The reason I introduced that VM_BUG_ON was to verify if any
vma_adjust_trans_huge() was missing anywhere (so that it doesn't crash
later in split_huge_page with an obscure mapcount != page_mapcount
BUG_ON, there it would be much less obvious to see why it crashed than
here).

We should printk addr, end and the vma->vm_start/vm_end to debug this
further.

> > I'm not sure if that's indeed the issue or not, but note that this is
> > the first time I've managed to trigger that with the fuzzer, and it's
> > not that easy to reproduce. Which is a bit odd for code that was there
> > for 4 months...
> 
> I'm keeping off the linux-next for the moment; I'll worry about this
> more if it shows up when we try 3.5-rc1.  Your fuzzing tells that my
> logic above is wrong, but maybe it's just a passing defect in next.

If it's a missing vma_adjust_trans_huge() it shouldn't go unnoticed
even with DEBUG_VM=n, so I agree that if it only happens on linux-next
it's worth trying to reproduce it with 3.5-rc/3.4 too just in
case. It's actually the first time I hear of this bugcheck triggering.

Thanks!
Andrea
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/