lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9f9a740f-3db5-4078-8135-0ec224b26a90@sandeen.net>
Date: Wed, 7 Feb 2024 13:29:59 -0600
From: Eric Sandeen <sandeen@...deen.net>
To: Miklos Szeredi <miklos@...redi.hu>, Jan Kara <jack@...e.cz>
Cc: lsf-pc <lsf-pc@...ts.linux-foundation.org>,
 linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] tracing the source of errors

On 2/7/24 5:23 AM, Miklos Szeredi wrote:
> On Wed, 7 Feb 2024 at 12:00, Jan Kara <jack@...e.cz> wrote:
> 
>> The problem always has been how to implement this functionality in a
>> transparent way so the code does not become a mess. So if you have some
>> idea, I'd say go for it :)
> 
> My first idea would be to wrap all instances of E* (e.g. ERR(E*)).
> But this could be made completely transparent by renaming current
> definition of E* to _E* and defining E* to be the wrapped ones.
> There's probably a catch (or several catches) somewhere, though.
> 
> Thanks,
> Miklos
> 

Just FWIW, XFS has kind of been there and back again on wrapping error returns
with macros.

Long ago, we had an XFS_ERROR() macro, i.e.

 	if (error)
		return -XFS_ERROR(error);

sprinkled (randomly) throughout the code.

(it didn't make it out through strace, and was pretty clunky but could printk or
BUG based on which error you were looking for, IIRC.)

In 2014(!) I removed it, pointing out that systemtap could essentially do the
same thing, and do it more flexibly (see: [PATCH 2/2] xfs: Nuke XFS_ERROR macro):

# probe module("xfs").function("xfs_*").return { if (@defined($return) &&
$return == VALUE) { ... } }

hch pointed out that systemtap was not a viable option for many, and further
discussion turned up a slightly kludgey way to use kprobes:

-- from dchinner --
#!/bin/bash

TRACEDIR=/sys/kernel/debug/tracing

grep -i 't xfs_' /proc/kallsyms | awk '{print $3}' ; while read F; do
	echo "r:ret_$F $F \$retval" >> $TRACEDIR/kprobe_events
done

for E in $TRACEDIR/events/kprobes/ret_xfs_*/enable; do
	echo 1 > $E
done;

echo 'arg1 > 0xffffffffffffff00' > $TRACEDIR/events/kprobes/filter

for T in $TRACEDIR/events/kprobes/ret_xfs_*/trigger; do
	echo 'traceoff if arg1 > 0xffffffffffffff00' > $T
done
--------

which yields i.e.:

# dd if=/dev/zero of=/mnt/scratch/newfile bs=513 oflag=direct
dd: error writing ¿/mnt/scratch/newfile¿: Invalid argument
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000259882 s, 0.0 kB/s
root@...t4:~# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 1/1   #P:16
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
           <...>-8073  [006] d... 145740.460546: ret_xfs_file_dio_aio_write:
(xfs_file_aio_write+0x170/0x180 <- xfs_file_dio_aio_write) arg1=0xffffffffffffffea

where that last negative number is the errno.

Not the prettiest thing but something that works today and could maybe be improved?

-Eric

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ