linux-kernel - Re: [PATCH] sunrpc: Add task's xid to 'not responding' messages on call

Message-ID: <1518226912.22856.3.camel@primarydata.com>
Date:   Sat, 10 Feb 2018 01:41:55 +0000
From:   Trond Myklebust <trondmy@...marydata.com>
To:     "thiago.becker@...il.com" <thiago.becker@...il.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-nfs@...r.kernel.org" <linux-nfs@...r.kernel.org>
CC:     "bfields@...ldses.org" <bfields@...ldses.org>,
        "anna.schumaker@...app.com" <anna.schumaker@...app.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "jlayton@...nel.org" <jlayton@...nel.org>
Subject: Re: [PATCH] sunrpc: Add task's xid to 'not responding' messages on
 call_timeout

On Fri, 2018-02-09 at 23:06 -0200, Thiago Rafael Becker wrote:
> When investigating reasons for NFS failures, packet dumps are
> eventually used.
> Finding the RPC that generated the failure is done by comparing all
> sent RPC calls against all received RPC replies to find those which
> are unanswered, which is prone to errors like
> - Slow server responses
> - Incomplete and uncaptured packets in the packet dump
> - The heuristics used to inspect packets failing to interpret one
> 
> This patch adds the xid of rpc_tasks to the 'not responding' messages
> in call_timeout to make this analysis more precise.
> 
> Signed-off-by: Thiago Rafael Becker <thiago.becker@...il.com>
> ---
>  net/sunrpc/clnt.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index e2a4184f3c5d..83c8aca951f4 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -2214,9 +2214,10 @@ call_timeout(struct rpc_task *task)
>  	}
>  	if (RPC_IS_SOFT(task)) {
>  		if (clnt->cl_chatty) {
> -			printk(KERN_NOTICE "%s: server %s not responding, timed out\n",
> +			printk(KERN_NOTICE "%s: server %s not responding, timed out (xid: %x)\n",
>  				clnt->cl_program->name,
> -				task->tk_xprt->servername);
> +				task->tk_xprt->servername,
> +				be32_to_cpu(task->tk_rqstp->rq_xid));
>  		}
>  		if (task->tk_flags & RPC_TASK_TIMEOUT)
>  			rpc_exit(task, -ETIMEDOUT);
> @@ -2228,9 +2229,10 @@ call_timeout(struct rpc_task *task)
>  	if (!(task->tk_flags & RPC_CALL_MAJORSEEN)) {
>  		task->tk_flags |= RPC_CALL_MAJORSEEN;
>  		if (clnt->cl_chatty) {
> -			printk(KERN_NOTICE "%s: server %s not responding, still trying\n",
> +			printk(KERN_NOTICE "%s: server %s not responding, still trying (xid: %x)\n",
>  			clnt->cl_program->name,
> -			task->tk_xprt->servername);
> +			task->tk_xprt->servername,
> +			be32_to_cpu(task->tk_rqstp->rq_xid));
>  		}
>  	}
>  	rpc_force_rebind(clnt);

NACK. We should not be logging internal information such as XIDs as
KERN_NOTICE messages. If you want this information, you can extract it
yourself; there are already plenty of ways to do so as a privileged
user.

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@...marydata.com