Message-ID: <20141204064711.72d48317@tlielax.poochiereds.net>
Date: Thu, 4 Dec 2014 06:47:11 -0500
From: Jeff Layton <jeff.layton@...marydata.com>
To: Trond Myklebust <trond.myklebust@...marydata.com>
Cc: Jeff Layton <jeff.layton@...marydata.com>,
Tejun Heo <tj@...nel.org>, NeilBrown <neilb@...e.de>,
Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
Linux Kernel mailing list <linux-kernel@...r.kernel.org>,
Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [RFC PATCH 00/14] nfsd/sunrpc: add support for a
workqueue-based nfsd

On Wed, 3 Dec 2014 15:44:31 -0500
Trond Myklebust <trond.myklebust@...marydata.com> wrote:
> On Wed, Dec 3, 2014 at 3:21 PM, Jeff Layton <jeff.layton@...marydata.com> wrote:
> > On Wed, 3 Dec 2014 14:59:43 -0500
> > Trond Myklebust <trond.myklebust@...marydata.com> wrote:
> >
> >> On Wed, Dec 3, 2014 at 2:20 PM, Jeff Layton <jeff.layton@...marydata.com> wrote:
> >> > On Wed, 3 Dec 2014 14:08:01 -0500
> >> > Trond Myklebust <trond.myklebust@...marydata.com> wrote:
> >> >> Which workqueue are you using? Since the receive code is non-blocking,
> >> >> I'd expect you might be able to use rpciod, for the initial socket
> >> >> reads, but you wouldn't want to use that for the actual knfsd
> >> >> processing.
> >> >>
> >> >
> >> > I'm using the same (nfsd) workqueue for everything. The workqueue
> >> > isn't really the bottleneck though, it's the work_struct.
> >> >
> >> > Basically, the problem is that the work_struct in the svc_xprt was
> >> > remaining busy for far too long. So, even though the XPT_BUSY bit had
> >> > cleared, the work wouldn't get picked up again until the previous
> >> > workqueue job had returned.
> >> >
> >> > With the change I made today, I just added a new work_struct to
> >> > svc_rqst and queue that to the same workqueue to do svc_process as soon
> >> > as the receive is done. That means though that each RPC ends up waiting
> >> > in the queue twice (once to do the receive and once to process the
> >> > RPC), and I think that's probably the reason for the performance delta.
> >>
> >> Why would the queuing latency still be significant now?
> >>
> >
> > That, I'm not clear on yet and that may not be why this is slower. But,
> > I was seeing slightly faster performance with reads before I made
> > today's changes. If changing how these jobs get queued doesn't help the
> > performance, then I'll have to look elsewhere...
>
> Do you have a good method for measuring that latency? If the queuing
> latency turns out to depend on the execution latency for each job,
> then perhaps running the message receives on a separate low latency
> queue could help (hence the suggestion to use rpciod).
>

I was using ftrace with the sunrpc:* and workqueue:* tracepoints, and
had a simple Perl script to postprocess the trace info to figure out
average/min/max latency.
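
For illustration, that kind of postprocessing could look roughly like
the sketch below. It is written in Python rather than Perl and is not
the script mentioned above. It assumes the trace text contains
workqueue_queue_work and workqueue_execute_start events in the usual
ftrace text layout ("<timestamp>: <event>: work struct=<addr> ..." or
"work struct <addr>: ..."), which can differ between kernel versions.
The function names queue_latencies and summarize are made up for this
example.

```python
import re

# Assumed ftrace text layout (may vary by kernel version):
#   ... 100.000100: workqueue_queue_work: work struct=<addr> function=...
#   ... 100.000500: workqueue_execute_start: work struct <addr>: function ...
EVENT_RE = re.compile(
    r"\s(?P<ts>\d+\.\d+):\s+"
    r"(?P<event>workqueue_queue_work|workqueue_execute_start):\s+"
    r"work struct[= ](?P<work>[0-9a-fx]+)"
)


def queue_latencies(lines):
    """Return queue-to-execute delays (seconds) for each work item seen."""
    pending = {}      # work_struct address -> timestamp when it was queued
    latencies = []
    for line in lines:
        m = EVENT_RE.search(line)
        if m is None:
            continue  # not one of the two workqueue events we care about
        ts = float(m.group("ts"))
        work = m.group("work")
        if m.group("event") == "workqueue_queue_work":
            pending[work] = ts
        else:
            queued_at = pending.pop(work, None)
            # Ignore executions whose queue event is missing from the trace.
            if queued_at is not None:
                latencies.append(ts - queued_at)
    return latencies


def summarize(latencies):
    """Return min/max/avg of the delays, or None if there are none."""
    if not latencies:
        return None
    return {
        "min": min(latencies),
        "max": max(latencies),
        "avg": sum(latencies) / len(latencies),
    }
```

Passing the lines of a saved trace file to queue_latencies() and the
result to summarize() gives the time each work item spent waiting in
the queue before it started running. This only covers the workqueue:*
events; correlating them with the sunrpc:* events would need extra
parsing that is not shown here.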

I don't think the queueing latency is that significant per se, but I
think the best thing is to avoid making multiple trips through the
workqueue per RPC if we can help it. I tested and pushed a newer
patchset to my repo last night that does that (at least if there's
already a svc_rqst available when the xprt needs servicing). It seems
to be pretty close speed-wise to the thread-based code.

The next step is to test this out on something larger-scale. I'm hoping
to get access to just such a test rig soon. Once we have some results
from that, I think I'll have a much better idea of how viable this
approach is and where other potential bottlenecks might be.

Thanks!
--
Jeff Layton <jlayton@...marydata.com>