[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YcXfm6U/6+Xmv7be@codewreck.org>
Date: Fri, 24 Dec 2021 23:56:27 +0900
From: Dominique Martinet <asmadeus@...ewreck.org>
To: Vasily Averin <vvs@...tuozzo.com>
Cc: Eric Van Hensbergen <ericvh@...il.com>,
Latchesar Ionkov <lucho@...kov.net>, kernel@...nvz.org,
v9fs-developer@...ts.sourceforge.net, linux-kernel@...r.kernel.org,
"J. Bruce Fields" <bfields@...ldses.org>
Subject: Re: [PATCH] v9fs: handle async processing of F_SETLK with FL_SLEEP
flag
Vasily Averin wrote on Fri, Dec 24, 2021 at 01:18:54PM +0300:
> > On the other hand, just clearing the FL_SLEEP flag like you've done for
> > 9p will make the server think the lock has been queued when it hasn't
> > really been.
> > That means the client lock request will hang forever and never be
> > granted even when the lock becomes available later on, so unless I
> > misunderstood something here I don't think that's a reasonable fallback.
>
> I did not get your this statement. Could you please elaborate it in more details?
>
> Right now nfsd/lockd/ksmbd drop FL_SLEEP on own side, and this looks acceptable for them:
> instead of blocking lock they submit non-blocking SETLK and it's enough to avoid server deadlock.
>
> If the lock is already taken: SETLK just return an error and will not wait.
> I'm agree it isn't ideal, and perhaps can cause server will return some unexpected errno,
> but I do not see how it can make the server think the lock has been queued.
Right, sorry I was still under the assumption that SETLK+sleep would
return error + enqueue something.
Reading again it must return FILE_LOCK_DEFERRED if enqueued, so the
server can make the difference, and we're "just" not respecting the
client's request to enqueue the lock as you say -- I guess that's
acceptable.
> diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
> index 612e297f3763..27ede4a4a6f4 100644
> --- a/fs/9p/vfs_file.c
> +++ b/fs/9p/vfs_file.c
> @@ -142,10 +142,15 @@ static int v9fs_file_do_lock(struct file *filp, int cmd, struct file_lock *fl)
> if ((fl->fl_flags & FL_POSIX) != FL_POSIX)
> BUG();
>
> - res = locks_lock_file_wait(filp, fl);
> - if (res < 0)
> - goto out;
> -
> + if ((fl->fl_flags & FL_SLEEP) && IS_SETLK(cmd)) {
> + res = posix_lock_file(filp, fl, NULL);
Should we also check fl->fl_flags & (FL_POSIX|FL_FLOCK) like
locks_lock_file_wait does, to call either posix_lock_file or ... there
doesn't seem to be an exported flock_lock_file equivalent in the other
case, so back to wait variant there?
(or rephrasing the question, what happens if the lock is a FL_FLOCK lock
and we call posix_lock_file on it? Or are we guaranted that if FL_SLEEP
is set we're about posix locks?)
> + if (res)
> + goto out;
> + } else {
> + res = locks_lock_file_wait(filp, fl);
> + if (res < 0)
> + goto out;
> + }
> /* convert posix lock to p9 tlock args */
> memset(&flock, 0, sizeof(flock));
> /* map the lock type */
Vasily Averin wrote on Fri, Dec 24, 2021 at 03:07:38PM +0300:
> On 24.12.2021 10:31, Dominique Martinet wrote:
> > If that process is made asynchronous, we need a way to run more
> > 9p-specific code in that one's lm_grant callback, so we can proceed onto
> > the second step which is...
> >
> > - send the lock request to the 9p server and wait for its reply
> > (note that the current code is always synchronous here: even if you
> > request SETLK without the SLEEP flag you can be made to wait here.
> > I have work in the closest to make some requests asynchronous, so
> > locking could be made asynchronous when that lands, but my code
> > introduced a race somewhere I haven't had the time to fix so this
> > improvement will come later)
> >
> > What would you suggest with that?
>
> It isn't necessary to make request asynchronous,
> it's enough to avoid blocking locks.
Well, it depends on what you have in mind with blocking.
A synchronous RPC to some server which might never reply if it doesn't
feel like it (bug or whatever) is very much blocking in my opinion.
> As far as I understand blocking does not happen for SETLK command,
> so it should be enough to chenge first part and call non-blocking
> posix_file_lock() instead of blocking locks_lock_file_wait().
>
> It would be great to make processing of 2nd part asynchronous too,
> but I think it looks like over-engineering, because we even are not
> 100% sure that someone really uses this functionality.
Yes, it will have to wait anyway, that asynchronous code has been
waiting for me to debug it for at least two years so it won't be ready
tomorrow... Let's start with what we can do; I'm happy with your
approach if it doesn't bring obvious problems.
--
Dominique
Powered by blists - more mailing lists