[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20060727110425.5cd40bd9.akpm@osdl.org>
Date: Thu, 27 Jul 2006 11:04:25 -0700
From: Andrew Morton <akpm@...l.org>
To: Ingo Molnar <mingo@...e.hu>
Cc: aia21@....ac.uk, nickpiggin@...oo.com.au, eike-kernel@...tec.de,
linux-kernel@...r.kernel.org, aia21@...tab.net, arjan@...radead.org
Subject: Re: [BUG?] possible recursive locking detected
On Thu, 27 Jul 2006 16:45:43 +0200
Ingo Molnar <mingo@...e.hu> wrote:
>
> * Anton Altaparmakov <aia21@....ac.uk> wrote:
>
> > Note that even the above patch is not a 100% solution. What
> > guarantees are there that the page faulted in will still be around
> > when it is read a few lines down the line in the code? Given
> > sufficient parallel memory pressure/io pressure it can still cause the
> > page to be evicted again immediately after it is faulted in...
> >
> > All the above patch does is to _dramatically_ reduce the race window
> > for this happening but it does not eliminate it in theory (AFAICS).
> >
> > So if your stance is that deadlocks are completely unacceptable it
> > still is not fixed. If your stance is that _really_ unlikely
> > deadlocks are acceptable then it is fixed.
>
> my 'stance' is pretty common-sense: exploitable deadlocks (it's possible
> to force eviction of a page), or even hard-to-trigger but possible
> deadlocks (which are not associated with hopeless resource exhaustation)
> must be fixed.
Yeah. It's super-hard to hit though - I spent some time trying to do so
back in 2.5.<late> and was unable to do so.
And nobody is likely to hit it in production because nobody will go and
write() into a pagecache page from a mmapped copy of the same page
(surely?). So it's the deliberately-triggered deadlocks we need to be
concerned of here.
That's for ext2/3. I didn't know about the reiserfs problem.
> couldnt we exclude the case of 'write writing to the same page it is
> reading from' abuse, to avoid the deadlock problem?
That would involve doing a follow_page() to get at the other pageframe. If
we were to do that, we could just pin the page. But we've always been
reluctant to add the cost of that.
I guess we could fix it by making the copy_to/from_user be atomic and if it
faults, drop the page lock, loop around and try again.
There's a more serious deadlock in there: an ab/ba deadlock between
journal_start() and lock_page(). It's hard to fix.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists