Message-ID: <1518226912.22856.3.camel@primarydata.com>
Date: Sat, 10 Feb 2018 01:41:55 +0000
From: Trond Myklebust <trondmy@...marydata.com>
To: "thiago.becker@...il.com" <thiago.becker@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-nfs@...r.kernel.org" <linux-nfs@...r.kernel.org>
CC: "bfields@...ldses.org" <bfields@...ldses.org>,
"anna.schumaker@...app.com" <anna.schumaker@...app.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"jlayton@...nel.org" <jlayton@...nel.org>
Subject: Re: [PATCH] sunrpc: Add task's xid to 'not responding' messages on
call_timeout
On Fri, 2018-02-09 at 23:06 -0200, Thiago Rafael Becker wrote:
> When investigating reasons for nfs failures, packet dumps are
> eventually used.
> Finding the rpc that generated the failure is done by comparing all sent
> rpc calls and all received rpc replies for those which are unanswered,
> which is prone to errors like
> - Slow server responses
> - Incomplete and uncaptured packets in the packet dump
> - The heuristics used to inspect packets failing to interpret one
>
> This patch adds the xid of rpc_tasks to the 'not responding' messages
> in call_timeout to make this analysis more precise.
>
> Signed-off-by: Thiago Rafael Becker <thiago.becker@...il.com>
> ---
> net/sunrpc/clnt.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index e2a4184f3c5d..83c8aca951f4 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -2214,9 +2214,10 @@ call_timeout(struct rpc_task *task)
> }
> if (RPC_IS_SOFT(task)) {
> if (clnt->cl_chatty) {
> - printk(KERN_NOTICE "%s: server %s not responding, timed out\n",
> + printk(KERN_NOTICE "%s: server %s not responding, timed out (xid: %x)\n",
> clnt->cl_program->name,
> - task->tk_xprt->servername);
> + task->tk_xprt->servername,
> + be32_to_cpu(task->tk_rqstp->rq_xid));
> }
> if (task->tk_flags & RPC_TASK_TIMEOUT)
> rpc_exit(task, -ETIMEDOUT);
> @@ -2228,9 +2229,10 @@ call_timeout(struct rpc_task *task)
> if (!(task->tk_flags & RPC_CALL_MAJORSEEN)) {
> task->tk_flags |= RPC_CALL_MAJORSEEN;
> if (clnt->cl_chatty) {
> - printk(KERN_NOTICE "%s: server %s not responding, still trying\n",
> + printk(KERN_NOTICE "%s: server %s not responding, still trying (xid: %x)\n",
> clnt->cl_program->name,
> - task->tk_xprt->servername);
> + task->tk_xprt->servername,
> + be32_to_cpu(task->tk_rqstp->rq_xid));
> }
> }
> rpc_force_rebind(clnt);
NACK. We should not be logging internal information such as XIDs as
KERN_NOTICE messages. If you want this information, you can extract it
yourself; there are already plenty of ways to do so as a privileged
user.
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@...marydata.com
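
Illustration only: the packet-dump comparison described in the quoted patch (matching sent RPC calls against received replies by XID) could be sketched in userspace C roughly as below. The struct rpc_msg record and the unanswered_xids() helper are hypothetical names invented for this sketch and are not part of the kernel or of any capture tool. It assumes the capture has already been decoded into call/reply records, and it uses ntohl() as the userspace counterpart of the be32_to_cpu() conversion in the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>  /* ntohl(): userspace analogue of be32_to_cpu() */

/* Hypothetical record for one RPC message decoded from a packet dump. */
struct rpc_msg {
	uint32_t xid_be;   /* XID exactly as seen on the wire (big-endian) */
	int      is_reply; /* 0 = call, 1 = reply */
};

/*
 * Store the host-order XIDs of calls that have no matching reply in out[].
 * Returns the number of unanswered calls found; at most out_cap of them
 * are stored. A retransmitted call that is never answered is counted once
 * per transmission, because no de-duplication is attempted here.
 */
size_t unanswered_xids(const struct rpc_msg *msgs, size_t n,
		       uint32_t *out, size_t out_cap)
{
	size_t found = 0;

	for (size_t i = 0; i < n; i++) {
		int answered = 0;

		if (msgs[i].is_reply)
			continue;

		/* Look for any reply that carries the same wire XID. */
		for (size_t j = 0; j < n; j++) {
			if (msgs[j].is_reply &&
			    msgs[j].xid_be == msgs[i].xid_be) {
				answered = 1;
				break;
			}
		}

		if (!answered) {
			if (found < out_cap)
				out[found] = ntohl(msgs[i].xid_be);
			found++;
		}
	}
	return found;
}
```

The quadratic scan keeps the sketch short; a real capture would call for a hash table keyed on the XID. The sketch also inherits the weaknesses listed in the patch description: a reply that arrives late or was never captured makes its call look unanswered.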