[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <E1I0Dx9-00071X-00@dorka.pomaz.szeredi.hu>
Date: Mon, 18 Jun 2007 11:55:19 +0200
From: Miklos Szeredi <miklos@...redi.hu>
To: davem@...emloft.net
CC: akpm@...ux-foundation.org, viro@....linux.org.uk,
alan@...rguk.ukuu.org.uk, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] fix race in AF_UNIX
> > > Secondarily, this bug has been around for years and nobody noticed.
> > > The world will not explode if this bug takes a few more days or
> > > even a week to work out. Let's do it right instead of ramming
> > > arbitrary turds into the kernel.
> >
> > Fine, but just wishing a bug to get fixed won't accomplish anything.
> > I've spent a fair amount of time debugging this thing, and I'm out of
> > ideas. Really. So unless somebody steps up to look at this, it won't
> > _ever_ get fixed.
>
> Somone just needs to find a way to only lock the socket as it is
> being operated upon.
No, you are still not getting it. The problem is that the socket
needs to be locked for the _whole_ of the garbage collection. This is
because the gc is done in two phases, in the first phase sockets which
are installed into file descriptors are collected as a "root set",
then the set is expanded by iterating over everything it's in there
and that's later added. If a socket moves from in-flight to installed
_during_ the gc, then it can miss being collected. So the socket must
be locked for the duration of _both_ phases.
> The race you are dealing with is rather simple, the queue check
> and the state check need to be done atomically. The only chore
> is to find a way to make that happen in the context of what the
> garbage allocator is trying to do.
>
> I'm not even convinced that your most recent attempt is deadlock free.
> Locking multiple objects the same way all at once like that is
> something that needs to be seriously audited.
It's doing trylocks and releasing all those locks within the same spin
locked region, how the f**k can that deadlock?
The only problem I've experienced with this patch (other than being
ugly) is that the multitude of locks it acquires makes the lockdep
code give up. But I think we can live with that.
Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists