Message-ID: <alpine.LFD.2.00.1004120854440.26679@i5.linux-foundation.org>
Date: Mon, 12 Apr 2010 09:02:03 -0700 (PDT)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Borislav Petkov <bp@...en8.de>
cc: Johannes Weiner <hannes@...xchg.org>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Rik van Riel <riel@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Minchan Kim <minchan.kim@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Lee Schermerhorn <Lee.Schermerhorn@...com>,
Nick Piggin <npiggin@...e.de>,
Andrea Arcangeli <aarcange@...hat.com>,
Hugh Dickins <hugh.dickins@...cali.co.uk>,
sgunderson@...foot.com
Subject: Re: [PATCH -v2] rmap: make anon_vma_prepare link in all the anon_vmas of a mergeable VMA
On Mon, 12 Apr 2010, Borislav Petkov wrote:
> >
> > If the warnings do happen, they are not going to be printing out any
> > hugely informative data apart from the fact that the bad case happened at
> > all. But if they do trigger, I can try to improve on them - it's just not
> > worth trying to make them any more interesting if they never trigger.
>
> Haa, I think you're gonna want to improve them :)
>
> WARN_ONCE(1, "page->mapping does not exist in vma chain");
>
> triggered on the first resume showing a rather messy 4 WARN_ONCEs. Had I
> more cores, there maybe would've been more of them :) Maybe need locking
> if clean output is of interest (see below).
Goodie.
I can't trigger this on my machine (not that I tried very hard - but I did
do some swapping loads etc by limiting my memory to just 1GB etc). So I'm
pretty sure my verification code is "correct", and verifies things that
should be right.
And the fact that it triggers under the exact load that you use to then
trigger the bug is a damn good thing. That means that we are finally on
the right track, and we have something that correlates well with the
actual bug.
> So, anyway, if I can read this correctly, there is a page->mapping
> anon_vma which is _not_ in the anon_vmas chain of the vma
> (avc->same_vma).
Yes, and that is supposed to be a no-no. The page is clearly associated
with the vma in question (since we are unmapping it through that vma), but
the vma list of 'anon_vma's doesn't actually have the one that
'page->mapping' points to.
And that, in turn, means that we've lost sight of the 'page->mapping'
anon_vma, and THAT in turn means that it could well have been free'd as
being no longer referenced.
And if it was free'd, it could be re-allocated as something else (after
the RCU grace period), and that directly explains your oops.
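As a rough illustration of the invariant described above, here is a minimal userspace sketch in C. It is not the kernel code and not the verification patch from this thread: the structs are simplified stand-ins, the singly linked `next` pointer stands in for the `same_vma` list, and the helper name `page_anon_vma_in_chain` is hypothetical. In the real kernel, `page->mapping` also carries a flag bit that this sketch ignores.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration only; NOT the kernel definitions. */
struct anon_vma {
	int id;
};

struct anon_vma_chain {
	struct anon_vma *anon_vma;
	struct anon_vma_chain *next;	/* stands in for the same_vma list */
};

struct vma {
	struct anon_vma_chain *chain;	/* all anon_vmas linked to this vma */
};

struct page {
	struct anon_vma *mapping;	/* assumption: plain pointer, no flag bits */
};

/*
 * Hypothetical helper: returns true if the anon_vma that the page points
 * to is reachable from the vma's chain. A false result is the "no-no"
 * case: nothing on the vma side holds on to that anon_vma any more.
 */
static bool page_anon_vma_in_chain(const struct vma *vma,
				   const struct page *page)
{
	const struct anon_vma_chain *avc;

	for (avc = vma->chain; avc; avc = avc->next)
		if (avc->anon_vma == page->mapping)
			return true;
	return false;
}
```

A warning like the one quoted earlier would correspond to this helper returning false for a page that is being unmapped through the vma.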
> By the way, I completely understand when you say that your head hurts
> from looking at this :).
Well, I have to say that I'm happy I've spent the time on it, because this
way I got to learn all the new rules. It's just that I really wish I
wouldn't have _had_ to.
Anyway, I'll have to think way more about this to see if I can come up
with a debugging patch that shows more details about what actually caused
this to happen in the first place. But we definitely have a smoking gun.
Linus