Message-ID: <9f9a740f-3db5-4078-8135-0ec224b26a90@sandeen.net>
Date: Wed, 7 Feb 2024 13:29:59 -0600
From: Eric Sandeen <sandeen@...deen.net>
To: Miklos Szeredi <miklos@...redi.hu>, Jan Kara <jack@...e.cz>
Cc: lsf-pc <lsf-pc@...ts.linux-foundation.org>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] tracing the source of errors
On 2/7/24 5:23 AM, Miklos Szeredi wrote:
> On Wed, 7 Feb 2024 at 12:00, Jan Kara <jack@...e.cz> wrote:
>
>> The problem always has been how to implement this functionality in a
>> transparent way so the code does not become a mess. So if you have some
>> idea, I'd say go for it :)
>
> My first idea would be to wrap all instances of E* (e.g. ERR(E*)).
> But this could be made completely transparent by renaming current
> definition of E* to _E* and defining E* to be the wrapped ones.
> There's probably a catch (or several catches) somewhere, though.
>
> Thanks,
> Miklos
>
Just FWIW, XFS has kind of been there and back again on wrapping error returns
with macros.
Long ago, we had an XFS_ERROR() macro, e.g.
if (error)
return -XFS_ERROR(error);
sprinkled (randomly) throughout the code.
(It didn't make it out through strace and was pretty clunky, but it could printk or
BUG based on which error you were looking for, IIRC.)
In 2014(!) I removed it, pointing out that systemtap could essentially do the
same thing, and do it more flexibly (see: [PATCH 2/2] xfs: Nuke XFS_ERROR macro):
# probe module("xfs").function("xfs_*").return { if (@defined($return) &&
$return == VALUE) { ... } }
hch pointed out that systemtap was not a viable option for many, and further
discussion turned up a slightly kludgey way to use kprobes:
-- from dchinner --
#!/bin/bash
TRACEDIR=/sys/kernel/debug/tracing
grep -i 't xfs_' /proc/kallsyms | awk '{print $3}' | while read F; do
	echo "r:ret_$F $F \$retval" >> $TRACEDIR/kprobe_events
done
for E in $TRACEDIR/events/kprobes/ret_xfs_*/enable; do
	echo 1 > $E
done
echo 'arg1 > 0xffffffffffffff00' > $TRACEDIR/events/kprobes/filter
for T in $TRACEDIR/events/kprobes/ret_xfs_*/trigger; do
	echo 'traceoff if arg1 > 0xffffffffffffff00' > $T
done
--------
which yields, e.g.:
# dd if=/dev/zero of=/mnt/scratch/newfile bs=513 oflag=direct
dd: error writing ‘/mnt/scratch/newfile’: Invalid argument
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000259882 s, 0.0 kB/s
root@...t4:~# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 1/1   #P:16
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
           <...>-8073  [006] d... 145740.460546: ret_xfs_file_dio_aio_write: (xfs_file_aio_write+0x170/0x180 <- xfs_file_dio_aio_write) arg1=0xffffffffffffffea
where that last negative number is the errno.
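For readers who want to turn that arg1 value back into a familiar errno, here is a minimal sketch. The helper name retval_to_errno is made up for illustration. It assumes the value falls in the kernel's error range (-4095..-1, i.e. up to MAX_ERRNO), so only the last three hex digits matter; it avoids relying on how a particular shell handles 64-bit overflow.

```shell
# Hypothetical helper: decode a 64-bit $retval as printed in the trace.
# Assumes the value is a kernel error return in the range -4095..-1,
# so the last three hex digits carry all the information.
retval_to_errno() {
    hex=${1#0x}                    # drop the leading "0x"
    low=${hex#"${hex%???}"}        # keep the last three hex digits, e.g. "fea"
    echo $(( 0x$low - 0x1000 ))    # two's-complement wrap within 12 bits
}

retval_to_errno 0xffffffffffffffea   # prints -22, i.e. -EINVAL
```

That matches the "Invalid argument" that dd reported above. The filter threshold 0xffffffffffffff00 is -256 by the same arithmetic, so the script only catches returns between -255 and -1.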
Not the prettiest thing but something that works today and could maybe be improved?
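One small step in that direction, purely as a sketch: print the kprobe definitions instead of registering them, so the list can be reviewed before anything is written to kprobe_events. The helper name gen_xfs_retprobes and the sample symbol line are invented for illustration; the pipeline mirrors the first loop of the script above and reads kallsyms-format lines on stdin.

```shell
# Dry run: emit the kretprobe definitions rather than writing them to tracefs.
# gen_xfs_retprobes is a hypothetical helper; it expects kallsyms-format
# input ("address type symbol [module]") on stdin.
gen_xfs_retprobes() {
    grep -i ' t xfs_' | awk '{ printf "r:ret_%s %s $retval\n", $3, $3 }'
}

# Example with a made-up symbol line instead of the live /proc/kallsyms:
printf '%s\n' 'ffffffffc0000000 t xfs_example_fn	[xfs]' | gen_xfs_retprobes
# prints: r:ret_xfs_example_fn xfs_example_fn $retval
```

On a real system the input would be /proc/kallsyms, and the output could be appended to kprobe_events once it looks right.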
-Eric