Message-ID: <43EC24BD-D7FF-4611-9D88-DF9C496A620A@ornl.gov>
Date: Wed, 15 Aug 2012 16:45:04 -0400
From: "Atchley, Scott" <atchleyes@...l.gov>
To: Sage Weil <sage@...tank.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"ceph-devel@...r.kernel.org" <ceph-devel@...r.kernel.org>
Subject: Re: regression with poll(2)?

On Aug 15, 2012, at 3:46 PM, Sage Weil wrote:
> I'm experiencing a stall with Ceph daemons communicating over TCP that
> occurs reliably with 3.6-rc1 (and linus/master) but not 3.5. The basic
> situation is:
>
> - the socket is two processes communicating over TCP on the same host, e.g.
>
> tcp 0 2164849 10.214.132.38:6801 10.214.132.38:51729 ESTABLISHED
>
> - one end writes a bunch of data in
> - the other end consumes data, but at some point stalls.
> - reads are nonblocking, e.g.
>
> int got = ::recv( sd, buf, len, MSG_DONTWAIT );
>
> and between those calls we wait with
>
> struct pollfd pfd;
> short evmask;
> pfd.fd = sd;
> pfd.events = POLLIN;
> #if defined(__linux__)
> pfd.events |= POLLRDHUP;
> #endif
>
> if (poll(&pfd, 1, msgr->timeout) <= 0)
> return -1;
>
> - in my case the timeout is ~15 minutes. at that point it errors out,
> and the daemons reconnect and continue for a while until hitting this
> again.
>
> - at the time of the stall, the reading process is blocked on that
> poll(2) call. There are a bunch of threads stuck on poll(2), some of them
> stuck and some not, but they all have stacks like
>
> [<ffffffff8118f6f9>] poll_schedule_timeout+0x49/0x70
> [<ffffffff81190baf>] do_sys_poll+0x35f/0x4c0
> [<ffffffff81190deb>] sys_poll+0x6b/0x100
> [<ffffffff8163d369>] system_call_fastpath+0x16/0x1b
>
> - you'll note that the netstat output shows data queued:
>
> tcp 0 1163264 10.214.132.36:6807 10.214.132.36:41738 ESTABLISHED
> tcp 0 1622016 10.214.132.36:41738 10.214.132.36:6807 ESTABLISHED
>
> etc.
>
> Is this a known regression? Or might I be misusing the API? What
> information would help track it down?
>
> Thanks!
> sage

Sage,

Do you see the same behavior when using two hosts (i.e., not loopback)? If the behavior differs, how much data is in the pipe in the localhost case?
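
One way to answer the "how much data is in the pipe" question from inside the process, rather than from netstat, might be to query the socket's queues directly. The following is only a minimal sketch, assuming Linux; the helper names queued_unread() and queued_unsent() are hypothetical and are not part of Ceph or of the code quoted above. For an established TCP socket, the two values should roughly correspond to netstat's Recv-Q and Send-Q columns.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/sockios.h>   /* SIOCINQ, SIOCOUTQ (Linux-specific) */

/*
 * Hypothetical helper: bytes the kernel has queued on the receive side
 * of sd that the application has not read yet.
 * Returns -1 if the ioctl fails (errno is set by ioctl).
 */
int queued_unread(int sd)
{
    int n = 0;
    if (ioctl(sd, SIOCINQ, &n) < 0)
        return -1;
    return n;
}

/*
 * Hypothetical helper: bytes still held in the send queue of sd.
 * For TCP this counts data that is unsent or sent but not yet
 * acknowledged by the peer.
 * Returns -1 if the ioctl fails (errno is set by ioctl).
 */
int queued_unsent(int sd)
{
    int n = 0;
    if (ioctl(sd, SIOCOUTQ, &n) < 0)
        return -1;
    return n;
}
```

Logging queued_unread(sd) on the reading end just before the poll(2) call, and queued_unsent(sd) on the writing end, would show whether the data is already readable while poll(2) sleeps or is still sitting in the sender's queue.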
Scott