[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140114212513.GB23999@fieldses.org>
Date: Tue, 14 Jan 2014 16:25:13 -0500
From: "J. Bruce Fields" <bfields@...ldses.org>
To: Andy Lutomirski <luto@...capital.net>
Cc: Jeff Layton <jlayton@...hat.com>,
Linux FS Devel <linux-fsdevel@...r.kernel.org>,
nfs-ganesha-devel@...ts.sourceforge.net,
samba-technical@...ts.samba.org,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
drh@...ci.com
Subject: Re: [PATCH v5 13/14] locks: skip deadlock detection on FL_FILE_PVT
locks
On Tue, Jan 14, 2014 at 01:17:20PM -0800, Andy Lutomirski wrote:
> On Tue, Jan 14, 2014 at 1:10 PM, J. Bruce Fields <bfields@...ldses.org> wrote:
> > On Tue, Jan 14, 2014 at 12:29:17PM -0800, Andy Lutomirski wrote:
> >> [cc: drh, who I suspect is responsible for the most widespread
> >> userspace software that uses this stuff]
> >>
> >> On Tue, Jan 14, 2014 at 11:27 AM, J. Bruce Fields <bfields@...ldses.org> wrote:
> >> > On Thu, Jan 09, 2014 at 04:58:59PM -0800, Andy Lutomirski wrote:
> >> >> For this kind of deadlock detection, nothing global is needed -- I'm
> >> >> only talking about detecting deadlocks due to two tasks upgrading
> >> >> locks on the same file (with overlapping ranges) at the same time.
> >> >>
> >> >> This is actually useful for SQL-like things. Imagine this scenario:
> >> >>
> >> >> Program 1:
> >> >>
> >> >> Open a file
> >> >> BEGIN;
> >> >> SELECT whatever; -- acquires a read lock
> >> >>
> >> >> Program 2:
> >> >>
> >> >> Open the same file
> >> >> BEGIN;
> >> >> SELECT whatever; -- acquires a read lock
> >> >>
> >> >> Program 1:
> >> >> UPDATE something; -- upgrades to write
> >> >>
> >> >> Now program 1 is waiting for program 2 to release its lock. But if
> >> >> program 2 tries to UPDATE, then it deadlocks. A friendly MySQL
> >> >> implementation (which, sadly, does not include sqlite) will fail the
> >> >> abort the transaction instead.
> >> >
> >> > And then I suppose you'd need to get an exclusive lock when you retry,
> >> > to guarantee forward progress in the face of multiple processes retrying
> >> > at once.
> >>
> >> I don't think so -- as long as deadlock detection is 100% reliable and
> >> if you have writer priority,
> >
> > We don't have writer priority. Depending on how it worked I'm not
> > convinced it would help. E.g. consider the above but with 3 processes:
> >
> > processes 1, 2, and 3 each get a whole-file read lock.
> >
> > process 1 requests a write lock, blocks because it conflicts
> > with read locks held by 2 and 3.
> >
> > process 2 requests a write lock, gets -EDEADLK, unlocks and
> > requests a new read lock. That request succeeds because there
> > is no conflicting lock. (Note the lock manager had no
> > opportunity to upgrade 1's lock here thanks to the conflict with
> > 3's lock.)
>
> Writer priority here would detect that someone's waiting for write
> access and would cause new readers to block.
OK, got it, thanks.
I don't *think* traditional posix locks are allowed to block on a lock
that's not yet applied. But I haven't thought it through. If not, it
might be worth adding to a new lock type, though that could be a
slightly subtle distinction to document.
...
> FWIW, at least last time I checked, sqlite didn't implement deadlock
> detection (it uses timeouts instead). That was one of my least
> favorite things about sqlite.
>
> With this feature in fcntl, I think that sqlite could add deadlock
> detection and a true blocking mode without changing the file/locking
> format, at least if it still works the way I remember it working.
>
> Anyway, I still don't think that this feature should be a prerequisite
> for the new lock types.
Well, if there's another posix lock brokenness here that we could easily
fix at the same time, it might make sense to. At least it seems worth
understanding.
--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists