[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFwHMQd-VDeTDh-gm3jyj+5+FSoAHOeU47mwU-mKtEj9RQ@mail.gmail.com>
Date: Tue, 16 Jul 2013 14:02:12 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Dave Chinner <david@...morbit.com>
Cc: Ben Myers <bpm@....com>, Peter Zijlstra <peterz@...radead.org>,
Oleg Nesterov <oleg@...hat.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Dave Jones <davej@...hat.com>, xfs@....sgi.com
Subject: Re: splice vs execve lockdep trace.
On Tue, Jul 16, 2013 at 1:43 PM, Dave Chinner <david@...morbit.com> wrote:
>
> Yes - IO is serialised based on the ip->i_iolock, not i_mutex. We
> don't use i_mutex for many things IO related, and so internal
> locking is needed to serialise against stuff like truncate, hole
> punching, etc, that are run through non-vfs interfaces.
Umm. But the page IO isn't serialized by i_mutext *either*. You don't
hold it across page faults. In fact you don't even take it at all
across page faults.
That's kind of my point. splice is about the page IO, and it's
serialized purely by the page lock. And then "->readpage()" will get
whatever IO mutex in order to do that right, but think about the case
where things are already in the page cache. There's no reason for any
serialization what-so-ever.
So this isn't about i_mutex. At all.
> Read isn't the problem - it's write that's the deadlock issue...
I agree, and I think your patches are needed, as I said in that email
you replied to. But due to this issue, I was looking at the XFS splice
support, and the read-side splice support seems inefficient and overly
complex. I'm not seeing why it needs that i_iolock.
And no, this really has nothing to do with i_mutex. Go look at
generic_file_splice_read(). There's no i_mutex there at all. It's more
like a series of magic page-faults without the actual page table
actions. Which is kind of the whole point of splice - zero-copy
without bothering with page table setup/teardown.
Now, it's perfectly possible that XFS really needs some odd locking
here, but your reply about i_mutex makes me think that you did it
because you were confused about what it actually wants.
*Every* other local filesystem uses generic_file_splice_read() with
just a single
.splice_read = generic_file_splice_read,
in the file ops initializer. Sure, nfs and ocfs2 wrap things like xfs
does, but they basically do it to revalidate their caches.
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists