[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20071026170750.GC13033@fieldses.org>
Date: Fri, 26 Oct 2007 13:07:50 -0400
From: "J. Bruce Fields" <bfields@...ldses.org>
To: "George G. Davis" <gdavis@...sta.com>
Cc: linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH] Fix hang in posix_locks_deadlock()
On Thu, Oct 18, 2007 at 02:57:59PM -0400, George G. Davis wrote:
> On Wed, Oct 17, 2007 at 02:51:57PM -0400, George G. Davis wrote:
> > ---
> > Not sure if this is the correct fix but it does resolve the hangs we're
> > observing in posix_locks_deadlock().
>
> Please disregard the previous patch, it's not quite right - causes occasional
> segfaults and clearly did not retain the posix_same_owner() checks implemented
> in the original code. Here's a new version which I believe retains the
> intent of the original code:
>
> diff --git a/fs/locks.c b/fs/locks.c
> index 7f9a3ea..e012b27 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -702,14 +702,12 @@ static int posix_locks_deadlock(struct file_lock *caller_fl,
> {
> struct file_lock *fl;
>
> -next_task:
> if (posix_same_owner(caller_fl, block_fl))
> return 1;
> list_for_each_entry(fl, &blocked_list, fl_link) {
> if (posix_same_owner(fl, block_fl)) {
> - fl = fl->fl_next;
> - block_fl = fl;
> - goto next_task;
> + if (posix_same_owner(caller_fl, fl))
> + return 1;
> }
> }
> return 0;
It may take multiple steps to identify a deadlock. With the above
you'll miss deadlocks like
process 1 is requesting a lock held by process 2
process 2 is blocking on a lock held by process 3
process 3 is blocking on a lock held by process 1.
Could you give more details about how you're causing
posix_locks_deadlock to hang? Is there a simple test-case you can post?
--b.
>
>
> I'm not sure about those "fl = fl->fl_next; block_fl = fl;" statements,
> first, the order of those statements seems reversed to me. Otherwise,
> I think the intent was to advance the "fl" for loop variable to the next
> entry in the list but it doesn't work out that way at all - the for
> loop restarts from the beginning - this is where we get into an
> infinite loop condition. Whether the test case I posted before is
> valid or not, I reckon it shouldn't be possible for non-root Joe user
> to contrive a test case which can hang the system as we're observing
> with that test case. The above patch fixes the hang.
>
> Comments greatly appreciated...
>
> --
> Regards,
> George
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists