linux-kernel - Re: [PATCH] netfs: Add retry stat counters

Message-ID: <2986469.1739185956@warthog.procyon.org.uk>
Date: Mon, 10 Feb 2025 11:12:36 +0000
From: David Howells <dhowells@...hat.com>
To: "Ihor Solodrai" <ihor.solodrai@...ux.dev>
Cc: dhowells@...hat.com, "Marc Dionne" <marc.dionne@...istor.com>,
    "Steve French" <stfrench@...rosoft.com>,
    "Eric Van Hensbergen" <ericvh@...nel.org>,
    "Latchesar Ionkov" <lucho@...kov.net>,
    "Dominique Martinet" <asmadeus@...ewreck.org>,
    "Christian Schoenebeck" <linux_oss@...debyte.com>,
    "Paulo Alcantara" <pc@...guebit.com>,
    "Jeff Layton" <jlayton@...nel.org>,
    "Christian Brauner" <brauner@...nel.org>, v9fs@...ts.linux.dev,
    linux-cifs@...r.kernel.org, netfs@...ts.linux.dev,
    linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
    ast@...nel.org, bpf@...r.kernel.org
Subject: Re: [PATCH] netfs: Add retry stat counters

Ihor Solodrai <ihor.solodrai@...ux.dev> wrote:

> Bash piece starting a process collecting /proc/fs/netfs/stats:
> 
>     function tail_netfs {
>         echo -n > /mnt/vmtest/netfs-stats.log
>         while true; do
>             echo >> /mnt/vmtest/netfs-stats.log
>             cat /proc/fs/netfs/stats >> /mnt/vmtest/netfs-stats.log
>             sleep 1
>         done
>     }
>     export -f tail_netfs
>     nohup bash -c 'tail_netfs' & disown

I'm afraid the intermediate snapshots of this file aren't particularly useful;
only the last snapshot matters:

> Last recorded /proc/fs/netfs/stats (note 0 retries):
> 
>     Reads  : DR=0 RA=15184 RF=5 RS=0 WB=0 WBZ=0
>     Writes : BW=488 WT=0 DW=0 WP=488 2C=0
>     ZeroOps: ZR=7964 sh=0 sk=0
>     DownOps: DL=15189 ds=15189 df=0 di=0
>     CaRdOps: RD=0 rs=0 rf=0
>     UpldOps: UL=488 us=488 uf=0
>     CaWrOps: WR=0 ws=0 wf=0
>     Retries: rq=0 rs=0 wq=0 ws=0
>     Objs   : rr=2 sr=1 foq=1 wsc=0
>     WbLock : skip=0 wait=0
>     -- FS-Cache statistics --
>     Cookies: n=0 v=0 vcol=0 voom=0
>     Acquire: n=0 ok=0 oom=0
>     LRU    : n=0 exp=0 rmv=0 drp=0 at=0
>     Invals : n=0
>     Updates: n=0 rsz=0 rsn=0
>     Relinqs: n=0 rtr=0 drop=0
>     NoSpace: nwr=0 ncr=0 cull=0
>     IO     : rd=0 wr=0 mis=0

Could you collect some tracing by enabling these events:

echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_read/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_write/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_write_iter/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_rreq/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_rreq_ref/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_sreq/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_sreq_ref/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_failure/enable

and then collect the tracelog:

trace-cmd show | bzip2 >some_file_somewhere.bz2
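
As a rough sketch, the eight enable commands above could be wrapped in a loop.
The function name and its optional directory argument are illustrative only
and not part of any existing tool; the default path is the one used above:

```shell
# Enable the netfs trace events listed above.
# $1 (optional, hypothetical): events directory; defaults to the tracefs
# path used in the commands above.
enable_netfs_events() {
    dir="${1:-/sys/kernel/debug/tracing/events/netfs}"
    rc=0
    for ev in netfs_read netfs_write netfs_write_iter netfs_rreq \
              netfs_rreq_ref netfs_sreq netfs_sreq_ref netfs_failure; do
        if [ -w "$dir/$ev/enable" ]; then
            echo 1 > "$dir/$ev/enable"
        else
            # Report the event that could not be enabled and carry on.
            echo "cannot enable $ev (missing or not writable)" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Run as root, this would be followed by the same collection step, e.g.
"enable_netfs_events && trace-cmd show | bzip2 >some_file_somewhere.bz2".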

And if you could collect /proc/fs/netfs/requests as well, that will show the
debug IDs of the hanging requests.  These can be used to grep the trace by
prepending "R=".  For example, if you see:

	REQUEST  OR REF FL ERR  OPS COVERAGE
	======== == === == ==== === =========
	00000043 WB   1 2120    0   0 @34000000 0/0

then:

	trace-cmd show | grep R=00000043
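
If several requests are hanging, the same grep could be repeated per debug ID.
The following is only a sketch: the function names are made up for
illustration, and it assumes the trace has first been saved as plain text
(for example "trace-cmd show >trace.txt") and that the requests file has the
layout shown above, with the debug ID in the first column:

```shell
# Print the debug ID (first column) of each request in a
# /proc/fs/netfs/requests-style file.  The header and "====" lines are
# skipped because their first field is not a hex number.
# $1 (optional): requests file; defaults to /proc/fs/netfs/requests.
netfs_request_ids() {
    awk '$1 ~ /^[0-9a-fA-F]+$/ { print $1 }' "${1:-/proc/fs/netfs/requests}"
}

# Write the trace lines for each listed request to its own file.
# $1: requests file, $2: saved plain-text trace, $3: output directory
split_trace_by_request() {
    mkdir -p "$3" || return 1
    netfs_request_ids "$1" | while read -r id; do
        # A request with no matching trace lines is not treated as an error.
        grep "R=$id" "$2" > "$3/R_$id.log" || true
    done
}
```

For example, "split_trace_by_request /proc/fs/netfs/requests trace.txt out"
would leave out/R_00000043.log for the request shown above.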

Thanks,
David