Message-ID: <20120406180905.GA10835@llnl.gov>
Date: Fri, 6 Apr 2012 11:09:05 -0700
From: Jim Garlick <garlick@...l.gov>
To: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc: "levinsasha928@...il.com" <levinsasha928@...il.com>,
"ericvh@...il.com" <ericvh@...il.com>,
"oleg@...hat.com" <oleg@...hat.com>,
"eric.dumazet@...il.com" <eric.dumazet@...il.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"kuznet@....inr.ac.ru" <kuznet@....inr.ac.ru>,
"jmorris@...ei.org" <jmorris@...ei.org>,
"yoshfuji@...ux-ipv6.org" <yoshfuji@...ux-ipv6.org>,
"kaber@...sh.net" <kaber@...sh.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"davej@...hat.com" <davej@...hat.com>
Subject: Re: ipv6: tunnel: hang when destroying ipv6 tunnel
Hi Tetsuo,
I am sorry if my patch is causing you grief!
On Fri, Apr 06, 2012 at 04:44:37AM -0700, Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > Most suspicious change is net/9p/client.c because it is changing handling of
> > ERESTARTSYS case.
> >
> > --- linux-3.3.1/net/9p/client.c
> > +++ linux-next/net/9p/client.c
> > @@ -740,10 +740,18 @@
> >  		c->status = Disconnected;
> >  		goto reterr;
> >  	}
> > +again:
> >  	/* Wait for the response */
> >  	err = wait_event_interruptible(*req->wq,
> >  				       req->status >= REQ_STATUS_RCVD);
> >
> > +	if ((err == -ERESTARTSYS) && (c->status == Connected)
> > +				  && (type == P9_TFLUSH)) {
> > +		sigpending = 1;
> > +		clear_thread_flag(TIF_SIGPENDING);
> > +		goto again;
> > +	}
> > +
>
> I think this loop is bad with regard to response to SIGKILL.
> If wait_event_interruptible() was interrupted by SIGKILL, it will
> spin until req->status >= REQ_STATUS_RCVD becomes true.
> Rather,
>
> 	if ((c->status == Connected) && (type == P9_TFLUSH))
> 		err = wait_event_killable(*req->wq,
> 					  req->status >= REQ_STATUS_RCVD);
> 	else
> 		err = wait_event_interruptible(*req->wq,
> 					       req->status >= REQ_STATUS_RCVD);
>
> would be safer.
Does that work? What prevents p9_client_rpc() from recursing via
p9_client_flush() on receipt of SIGKILL?
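To make the question concrete, here is a small user-space sketch of the scenario being asked about. It is not the kernel code: every `model_*` name is hypothetical, and it assumes that an interrupted request is followed by a flush sent through the same RPC path, and that a pending fatal signal stays visible to the nested wait. Whether the second assumption holds in the kernel (for example, depending on the state of TIF_SIGPENDING at the nested wait) is exactly what the question leaves open. A depth cap is added only so the model terminates.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_ERESTARTSYS 512   /* same numeric value as the kernel's ERESTARTSYS */

enum model_type { MODEL_TOTHER, MODEL_TFLUSH };

struct model_state {
    bool signal_pending;  /* stand-in for an ordinary pending signal */
    bool fatal_pending;   /* stand-in for a pending SIGKILL */
    int depth;            /* current nesting of model_rpc() */
    int max_depth;        /* deepest nesting observed */
    int depth_limit;      /* safety cap so the model always terminates */
};

int model_rpc(struct model_state *s, enum model_type type);

/* Stand-in for p9_client_flush(): sends a TFLUSH through the same RPC path. */
int model_flush(struct model_state *s)
{
    return model_rpc(s, MODEL_TFLUSH);
}

/*
 * Stand-in for the proposed wait: killable for TFLUSH (only a fatal
 * signal interrupts it), interruptible for everything else.
 */
int model_wait(const struct model_state *s, enum model_type type)
{
    bool interrupted = (type == MODEL_TFLUSH)
        ? s->fatal_pending
        : (s->signal_pending || s->fatal_pending);

    return interrupted ? -MODEL_ERESTARTSYS : 0;
}

int model_rpc(struct model_state *s, enum model_type type)
{
    int err;

    assert(s->depth_limit > 0);
    s->depth++;
    if (s->depth > s->max_depth)
        s->max_depth = s->depth;

    err = model_wait(s, type);

    /* Assumed behaviour: an interrupted request triggers a flush. */
    if (err == -MODEL_ERESTARTSYS && s->depth < s->depth_limit)
        model_flush(s);

    s->depth--;
    return err;
}
```

With only `signal_pending` set, the model nests two levels deep (the request plus one flush). With `fatal_pending` set, the flush is itself interrupted and flushes again, so nesting grows until it hits `depth_limit`.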
> >  error:
> >  	/*
> >  	 * Fid is not valid even after a failed clunk
> > +	 * If interrupted, retry once then give up and
> > +	 * leak fid until umount.
> >  	 */
> > -	p9_fid_destroy(fid);
> > +	if (err == -ERESTARTSYS) {
> > +		if (retries++ == 0)
> > +			goto again;
>
> I think it is possible that the process is interrupted again upon retrying.
> I suspect the handling of err == -ERESTARTSYS case when retries != 0.
> It is returning without calling p9_fid_destroy(), which will be
> unexpected behaviour for the various callers.
Yes, but in the unlikely event that this happens, the effect is a small
memory leak for the duration of the mount. On the other hand, if the
fid is destroyed without successfully informing the server, then
subsequent operations that involve new file references will fail
when that fid number is reused, and the mount becomes unusable.
> > +	} else
> > +		p9_fid_destroy(fid);
> >  	return err;
> >  }
> >  EXPORT_SYMBOL(p9_client_clunk);
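The retry-once-then-leak policy in the hunk above can be modelled in user space as follows. This is only a sketch for reasoning about the trade-off: all `model_*` names are hypothetical, the clunk RPC is replaced by a counter of how many upcoming attempts get interrupted, and "destroying" the fid is reduced to setting a flag.

```c
#include <stdbool.h>

#define MODEL_ERESTARTSYS 512   /* same numeric value as the kernel's ERESTARTSYS */

struct model_fid {
    bool destroyed;        /* true once the local fid has been released */
};

struct model_client {
    int interrupts_left;   /* how many upcoming clunk RPCs will be interrupted */
    int rpcs_sent;         /* how many clunk RPCs were attempted */
};

/* Stand-in for the clunk RPC: fails while interruptions remain. */
int model_clunk_rpc(struct model_client *c)
{
    c->rpcs_sent++;
    if (c->interrupts_left > 0) {
        c->interrupts_left--;
        return -MODEL_ERESTARTSYS;
    }
    return 0;
}

/* Mirrors the quoted control flow: retry once, then keep the fid. */
int model_clunk(struct model_client *c, struct model_fid *fid)
{
    int retries = 0;
    int err;

again:
    err = model_clunk_rpc(c);
    if (err == -MODEL_ERESTARTSYS) {
        if (retries++ == 0)
            goto again;
        /*
         * Interrupted twice: leave the fid allocated so its number
         * cannot be reused while the server may still hold it.
         */
    } else {
        fid->destroyed = true;
    }
    return err;
}
```

In this model a single interruption costs one extra RPC and the fid is still released. Only two interruptions in a row leave `destroyed` false, which corresponds to the leak until umount described above.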
Regards,
Jim