Message-ID: <3cf6f319-4153-fd20-fef0-f445cbf66c3e@free.fr>
Date: Mon, 1 Aug 2022 11:19:06 +0200
From: Bernard f6bvp <f6bvp@...e.fr>
To: Thomas Osterried <thomas@...erried.de>,
Francois Romieu <romieu@...zoreil.com>
Cc: Eric Dumazet <edumazet@...gle.com>, linux-hams@...r.kernel.org,
Thomas Osterried DL9SAU <thomas@...erg.in-berlin.de>,
netdev@...r.kernel.org
Subject: Re: rose timer t error displayed in /proc/net/rose
The reason I am looking at /proc/net/rose is that I want to check whether
a ROSE connect request is pending.
I am investigating a bug in ROSE socket connect, which worked up to
kernel 5.4.79 and is broken in kernel 5.4.83.
All the ROSE network client applications I am working with (fpad,
fpacwpd, wplist, wpedit...) need to connect a ROSE socket, and they have
all been failing since that kernel bug appeared.
The connect request starts rose->timer at 200; the timer counts down to
0 and the connect fails with a timeout.
This makes me think that there is a bug somewhere in the library that
handles ROSE sockets.
I have been told to run a kernel git bisect from 5.4.79 (good) to
5.4.83 (bad), but I am having difficulty rebooting into the newly
compiled kernel on my Ubuntu machine.
The reason is that the Ubuntu boot process is very complex to me.
I can see that the new vmlinuz is present in /boot, but the boot process
is using /boot/efi/EFI/ instead, which I don't know how to manage...
-rw-r--r-- 1 root root 173964 Jul 29 19:04 config-5.18.11-F6BVP-4
-rw-r--r-- 1 root root 176329 Aug 1 10:44 config-5.19.0-rc8-next-20220728-F6BVP-next-patchs+
drwx------ 3 root root 4096 Jan 1 1970 efi/
drwxr-xr-x 5 root root 4096 Aug 1 10:44 grub/
lrwxrwxrwx 1 root root 54 Jul 30 10:46 initrd.img -> initrd.img-5.19.0-rc8-next-20220728-F6BVP-next-patchs+
-rw-r--r-- 1 root root 62359030 Jul 26 17:38 initrd.img-5.15.0-41-generic
-rw-r--r-- 1 root root 17810668 Jul 29 19:04 initrd.img-5.18.11-F6BVP-4
-rw-r--r-- 1 root root 18144144 Aug 1 10:44 initrd.img-5.19.0-rc8-next-20220728-F6BVP-next-patchs+
-rw-r--r-- 1 root root 182800 Feb 6 21:35 memtest86+.bin
-rw-r--r-- 1 root root 184476 Feb 6 21:35 memtest86+.elf
-rw-r--r-- 1 root root 184980 Feb 6 21:35 memtest86+_multiboot.bin
lrwxrwxrwx 1 root root 51 Aug 1 10:44 vmlinuz -> vmlinuz-5.19.0-rc8-next-20220728-F6BVP-next-patchs+
-rw------- 1 root root 11086240 Jun 22 15:24 vmlinuz-5.15.0-41-generic
-rw-r--r-- 1 root root 9686976 Jul 29 19:04 vmlinuz-5.18.11-F6BVP-4
-rw-r--r-- 1 root root 10211648 Aug 1 10:44 vmlinuz-5.19.0-rc8-next-20220728-F6BVP-next-patchs+
-rw-r--r-- 1 root root 10211648 Jul 30 10:46 vmlinuz-5.19.0-rc8-next-20220728-F6BVP-next-patchs+.old
lrwxrwxrwx 1 root root 55 Aug 1 10:44 vmlinuz.old -> vmlinuz-5.19.0-rc8-next-20220728-F6BVP-next-patchs+.old
On 01/08/2022 at 10:06, Thomas Osterried wrote:
> Hello,
>
> 1. why do you check for pending timer anymore?
> 2. I'm not really sure what value jiffies_delta_to_clock_t() returns. jiffies / HZ?
> jiffies_delta_to_clock_t() returns clock_t.
>
> ax25_display_timer() is used in ax25, netrom and rose, mostly for displaying states in /proc.
>
> In ax25_subr.c, ax25_calculate_rtt() uses it for the RTT calculation. Is it proven that the values ax25_display_timer() returns are still as expected?
> I ask because there we subtract ax25_display_timer(&ax25->t1timer) from ax25->t1, and we need to be sure that your change will not break the AX.25 stack.
>
> # grep ax25_display_timer ../*/*c|cut -d/ -f2-
> ax25/af_ax25.c: ax25_info.t1timer = ax25_display_timer(&ax25->t1timer) / HZ;
> ax25/af_ax25.c: ax25_info.t2timer = ax25_display_timer(&ax25->t2timer) / HZ;
> ax25/af_ax25.c: ax25_info.t3timer = ax25_display_timer(&ax25->t3timer) / HZ;
> ax25/af_ax25.c: ax25_info.idletimer = ax25_display_timer(&ax25->idletimer) / (60 * HZ);
> ax25/af_ax25.c: ax25_display_timer(&ax25->t1timer) / HZ, ax25->t1 / HZ,
> ax25/af_ax25.c: ax25_display_timer(&ax25->t2timer) / HZ, ax25->t2 / HZ,
> ax25/af_ax25.c: ax25_display_timer(&ax25->t3timer) / HZ, ax25->t3 / HZ,
> ax25/af_ax25.c: ax25_display_timer(&ax25->idletimer) / (60 * HZ),
> ax25/ax25.mod.c:SYMBOL_CRC(ax25_display_timer, 0x14cecd59, "");
> ax25/ax25_subr.c: ax25->rtt = (9 * ax25->rtt + ax25->t1 - ax25_display_timer(&ax25->t1timer)) / 10;
> ax25/ax25_timer.c:unsigned long ax25_display_timer(struct timer_list *timer)
> ax25/ax25_timer.c:EXPORT_SYMBOL(ax25_display_timer);
> netrom/af_netrom.c: ax25_display_timer(&nr->t1timer) / HZ,
> netrom/af_netrom.c: ax25_display_timer(&nr->t2timer) / HZ,
> netrom/af_netrom.c: ax25_display_timer(&nr->t4timer) / HZ,
> netrom/af_netrom.c: ax25_display_timer(&nr->idletimer) / (60 * HZ),
> netrom/netrom.mod.c: { 0x14cecd59, "ax25_display_timer" },
> rose/af_rose.c: ax25_display_timer(&rose->timer) / HZ,
> rose/af_rose.c: ax25_display_timer(&rose->idletimer) / (60 * HZ),
> rose/rose.mod.c: { 0x14cecd59, "ax25_display_timer" },
> rose/rose_route.c: ax25_display_timer(&rose_neigh->t0timer) / HZ,
> rose/rose_route.c: ax25_display_timer(&rose_neigh->ftimer) / HZ);
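
To make the concern about the RTT line in ax25_subr.c concrete, here is a minimal userspace sketch of that smoothing expression. It is not the kernel function: the name smooth_rtt and its parameters are made up for the illustration, and "remaining" stands in for whatever ax25_display_timer(&ax25->t1timer) returns.

```c
/*
 * Userspace copy of the smoothing expression from the ax25_subr.c line
 * in the grep output: new_rtt = (9 * old_rtt + (t1 - remaining)) / 10.
 * All values are in jiffies. "remaining" plays the role of
 * ax25_display_timer(&ax25->t1timer); smooth_rtt is a hypothetical name.
 */
static unsigned long smooth_rtt(unsigned long rtt, unsigned long t1,
				unsigned long remaining)
{
	return (9 * rtt + t1 - remaining) / 10;
}
```

The sample fed into the average is t1 minus the time left on the timer, so any change in what the helper returns for an expired timer changes that sample. For instance, smooth_rtt(100, 1000, 400) weights the old value 100 nine times against a new sample of 600.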
>
>
> 3. Back to the initial problem:
>
>>> When decreasing from 1 to 0 it displays a very large number until next clock
>>> tic as demonstrated below.
> I assume that is what gets displayed once the rose timer has expired.
> If it expired 1s ago, the computed time diff becomes negative -> -(jiffies).
> We are unsigned long (and imho need to be), but the "underflow" result is something like (2**64)-1-jiffies -- a very large positive number that represents a small negative number.
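
The wrap-around described here can be reproduced in userspace. The sketch below is only an illustration, not kernel code: remaining_unsigned, remaining_seconds and DEMO_HZ are hypothetical names, and the real HZ is a kernel build setting.

```c
#include <limits.h>

/* Hypothetical tick rate for the illustration; the real HZ is a kernel config value. */
#define DEMO_HZ 250UL

/*
 * Userspace stand-in for "expires - jiffies" computed in unsigned long.
 * When now is already past expires, the subtraction wraps around and
 * yields a huge positive value instead of a small negative one.
 */
static unsigned long remaining_unsigned(unsigned long expires, unsigned long now)
{
	return expires - now;
}

/* What a "/ HZ" style display would print for that remaining value. */
static unsigned long remaining_seconds(unsigned long expires, unsigned long now)
{
	return remaining_unsigned(expires, now) / DEMO_HZ;
}
```

With these definitions, a timer that expired one tick ago (expires = 1000, now = 1001) gives ULONG_MAX from remaining_unsigned(), and dividing by the tick rate still leaves a very large number, which matches the kind of value seen in /proc/net/rose.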
>
> => If my assumptions for rose behavior are correct:
> 1. I expect rose->timer to be restarted soon. If it does not happen, is there a bug?
> 2. The time window with that large value is large.
> 3. Are negative numbers (-> timer expired) of interest? Otherwise, 0 should be enough to indicate that the timer has expired.
>
> linux/jiffies.h:
> extern clock_t jiffies_to_clock_t(unsigned long x);
> static inline clock_t jiffies_delta_to_clock_t(long delta)
> {
> 	return jiffies_to_clock_t(max(0L, delta));
> }
>
> => Negative values may now be handled correctly thanks to Francois' patch: delta, as a signed long, may be negative, and max(0L, -nnnn) should evaluate to 0L.
> This would display 0. Perhaps Francois has already confirmed this, since he used this function and got a correct display of that idle value. Francois, am I correct, is "0" really displayed?
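
As a rough check of the clamping argument, here is a userspace sketch that mirrors the quoted inline. It is an assumption-laden stand-in: ticks_to_clock_t is a simplified replacement for jiffies_to_clock_t(), and CLAMP_DEMO_HZ and CLAMP_DEMO_USER_HZ are made-up rates, not values taken from any particular kernel build.

```c
/* Hypothetical rates for the illustration; real values come from the kernel build. */
#define CLAMP_DEMO_HZ      250UL
#define CLAMP_DEMO_USER_HZ 100UL

/* Simplified stand-in for jiffies_to_clock_t(): rescale ticks from HZ to USER_HZ. */
static long ticks_to_clock_t(unsigned long ticks)
{
	return (long)(ticks * CLAMP_DEMO_USER_HZ / CLAMP_DEMO_HZ);
}

/*
 * Mirrors the quoted jiffies_delta_to_clock_t(): the delta is signed,
 * so an expired timer gives a negative value, which is clamped to 0
 * before the conversion.
 */
static long delta_to_clock_t(long delta)
{
	long clamped = delta > 0 ? delta : 0;

	return ticks_to_clock_t((unsigned long)clamped);
}
```

Under these assumptions delta_to_clock_t(-5) is 0, so an expired timer would show up as 0 rather than as a wrapped value, while a positive delta of 500 ticks converts to 200.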
>
>
>