Message-Id: <1153992484.21849.36.camel@imp.csi.cam.ac.uk>
Date: Thu, 27 Jul 2006 10:28:04 +0100
From: Anton Altaparmakov <aia21@....ac.uk>
To: Andrew Morton <akpm@...l.org>
Cc: nickpiggin@...oo.com.au, eike-kernel@...tec.de,
linux-kernel@...r.kernel.org, aia21@...tab.net
Subject: Re: [BUG?] possible recursive locking detected
On Thu, 2006-07-27 at 01:53 -0700, Andrew Morton wrote:
> On Thu, 27 Jul 2006 09:19:58 +0100
> Anton Altaparmakov <aia21@....ac.uk> wrote:
>
> > b) is impossible for ntfs.
>
> ntfs write() is already doing GFP_HIGHUSER allocations inside i_mutex.
Yes.
> Presumably there's some reason why it isn't deadlocking at present.
NTFS always also supplies GFP_NOFS so it should never deadlock on itself
(except in cases where it has no control of memory allocations because
VFS / kernel functions do the allocation).
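As a rough illustration of why GFP_NOFS protects against this, here is a
small userspace model. The DEMO_* names and bit values are made up for
this sketch and are not the kernel's definitions; the only point is that
a NOFS-style mask leaves out the bit that lets reclaim call back into
filesystem code.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; the real kernel values differ. */
#define DEMO_GFP_WAIT 0x1u /* allocator may sleep */
#define DEMO_GFP_IO   0x2u /* allocator may start block I/O */
#define DEMO_GFP_FS   0x4u /* allocator may call into filesystem code */

/* A "user" style mask allows everything, a "NOFS" style mask drops the FS bit. */
#define DEMO_GFP_USER (DEMO_GFP_WAIT | DEMO_GFP_IO | DEMO_GFP_FS)
#define DEMO_GFP_NOFS (DEMO_GFP_WAIT | DEMO_GFP_IO)

/*
 * Models the check reclaim makes before re-entering a filesystem: only
 * allowed when the caller's mask includes the FS bit.
 */
static bool demo_reclaim_may_enter_fs(unsigned int gfp_mask)
{
	return (gfp_mask & DEMO_GFP_FS) != 0;
}
```

With the FS bit cleared, an allocation made while holding a filesystem
lock cannot recurse into that filesystem through reclaim.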
I would assume the likelihood of a cross-fs ab/ba deadlock is so small
that it effectively never happens...
> Could
> be that we'll end up deciding to make lockdep shut up about cross-fs
> i_mutex-takings, but that's a bit lame because if some other fs starts
> taking i_mutex in the reclaim path we're exposed to ab/ba deadlocks, and
> they won't be reported.
True, but I think that will be the only solution to get rid of the
messages. This is not the only place in the kernel where the code has
theoretical problems that are considered acceptable because they are hit
very rarely. I feel this is such a case.
An example is the potential deadlock in generic buffered file write
where we fault in a page via fault_in_pages_readable() but there is
nothing to guarantee that page will not go away between us doing this
and us using the page.
This deadlock is hit tons when using reiserfs (unmodified) and even ext3
is affected on servers that run very high i/o loads. (By "tons" I mean
between once a day and once a week. In the unmodified reiserfs case it
requires a reboot to sort out, but we have a local patch that improves
matters dramatically.)
And still the code is there and no-one complains, as the real solutions
are either too horrible to contemplate or need a lot of intricate
changes to the kernel's page handling (e.g. a "page is pinned" flag
that then needs to be honoured in all the right places)...
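To make the race and the "page is pinned" idea concrete, here is a toy
userspace model. Everything in it (struct demo_page, the demo_*
functions) is hypothetical and does not exist in the kernel; it only
shows how a faulted-in page can vanish before it is used, and how a pin
flag honoured by reclaim would close that window.

```c
#include <stdbool.h>

/* Hypothetical page descriptor for this sketch; not the kernel's struct page. */
struct demo_page {
	bool present; /* page is resident in memory */
	bool pinned;  /* hypothetical "page is pinned" flag */
};

/* Models faulting the user page in before the copy, optionally pinning it. */
static void demo_fault_in(struct demo_page *page, bool pin)
{
	page->present = true;
	page->pinned = pin;
}

/* Models reclaim: evicts the page unless it is pinned. Returns true if evicted. */
static bool demo_reclaim(struct demo_page *page)
{
	if (page->pinned)
		return false;
	page->present = false;
	return true;
}

/* Models the later copy from the page: it only works if the page is still there. */
static bool demo_copy_from_page(const struct demo_page *page)
{
	return page->present;
}
```

In this model an unpinned page can be reclaimed between demo_fault_in()
and demo_copy_from_page(), which is the window described above; a pinned
page cannot.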
> But sorry, we just cannot go and require that write()'s pagecache
> allocations not be able to write dirty data, not be able to strip buffers
> from clean pages and not be able to reclaim slab.
As I said, perhaps all those code paths need to be able to take a "no"
for an answer from the fs, and the fs needs to try-lock and abort if it
fails. The problem at the moment is that the fs must succeed or
deadlock trying (or panic, or silently cause errors on the file system;
take your pick of whichever you consider the least bad).
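The try-lock-and-abort idea could look roughly like the userspace sketch
below, using pthreads in place of the kernel's mutex API. struct
demo_inode and demo_try_writeback() are hypothetical names for this
example only; the assumption is that the caller (reclaim) can cope with
a "no" and move on to another page.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode; only the lock and a counter matter here. */
struct demo_inode {
	pthread_mutex_t i_mutex;
	int dirty_pages;
};

/*
 * Hypothetical reclaim-side callback. Returns true if it cleaned the
 * inode's pages, false if it declined because the lock was busy.
 * It never blocks on the lock, so it cannot deadlock against a holder.
 */
static bool demo_try_writeback(struct demo_inode *inode)
{
	if (pthread_mutex_trylock(&inode->i_mutex) != 0)
		return false; /* lock is held: say "no" and let the caller move on */

	inode->dirty_pages = 0; /* stands in for the real writeback work */
	pthread_mutex_unlock(&inode->i_mutex);
	return true;
}
```

The cost is that the caller has to handle failure everywhere, which is
exactly the change to those code paths suggested above.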
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://www.linux-ntfs.org/ & http://www-stu.christs.cam.ac.uk/~aia21/