Message-ID: <CAEoi9W5gAMyLtf9TYKuZ7EUAQspmcHADr-bvRNDVXpL+or2dSQ@mail.gmail.com>
Date: Mon, 18 Aug 2025 18:11:55 -0400
From: Dan Cross <crossd@...il.com>
To: F6BVP <f6bvp@...e.fr>
Cc: Bernard Pidoux <bernard.pidoux@...e.fr>, David Ranch <dranch@...nnet.net>,
linux-hams@...r.kernel.org, netdev <netdev@...r.kernel.org>
Subject: Re: [ROSE] [AX25] 6.15.10 long term stable kernel oops
On Mon, Aug 18, 2025 at 2:29 PM F6BVP <f6bvp@...e.fr> wrote:
> I agree that it must be the same bug, and the mkiss module is involved
> in both cases, although the environment is quite different.
> I am using ROSE/FPAC nodes on different machines for AX25 message
> routing with LinFBB BBS.
> Nowadays I do not have a radio anymore, and all nodes are interconnected
> via the Internet using AX25 over IP encapsulation with ax25ipd (UDP ports).
>
> I am running two Raspberry Pi 3B+ with RaspiOS 64-bit and kernel 6.12.14.
> AX25 configuration is performed via kissattach to create ax0 device.
> ROSE / FPAC suite of applications manage ROSE, NetRom and AX25 protocols
> for communications. FBB BBS forwards via rose0 port and TCP port 23
> (telnet).
>
> I do not observe any issue on those RasPiOS systems.
>
> Another mini PC with Ubuntu 24.04 LTS and kernel 6.14.0-27-generic is
> configured identically with an FPAC/ROSE node and has absolutely no
> issues with mkiss, ROSE or NetRom.
>
> A few years ago I was quite active in debugging the ROSE module. As I
> wanted to restart AX25 debugging, I installed the Linux 6.15.10 stable
> kernel. This was the beginning of my kernel panic hunting...
>
> My strategy is to find the most recent kernel that does not have any
> issue with mkiss and progressively add AX25 patches in order to find the
> guilty instruction. I will use a bunch of printk calls in order to locate
> the wrong code. We will see if it works.
Bernard,
Very good. A caveat is that the bug seems to manifest itself in the
`skbuff` infrastructure, independent of the specific AX.25/NETROM/ROSE
code: it may be that some change elsewhere in the kernel was
incompatible with AX.25 and gave rise to this bug.
I've found the oops to be very reproducible. Given that you seem
to have a known working kernel version, you may get more mileage out
of using `git bisect` to narrow things down to a specific failing
commit, instead of trying to forward-apply AX.25-specific commits.
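
For illustration only: `git bisect` (driven with `git bisect start`,
`git bisect good <rev>`, and `git bisect bad <rev>`) performs a binary
search over the commits between a known-good and a known-bad revision.
Below is a minimal Python sketch of that search. The commit labels, the
`culprit`, and the `oopses` predicate are hypothetical stand-ins, not real
kernel commits or a real test; in practice the predicate is "build, boot,
bring up AX.25, and see whether the oops appears".

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits[0] is known good, commits[-1] is known bad, and
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return commits[hi]


# Hypothetical linear history; labels are placeholders.
commits = [f"c{i}" for i in range(16)]
culprit = "c11"  # pretend this commit introduced the oops

tested = []  # record which commits had to be built and tested


def oopses(commit):
    # Stand-in for a real build-boot-and-reproduce test.
    tested.append(commit)
    return commits.index(commit) >= commits.index(culprit)


found = first_bad(commits, oopses)
print(found, "found after testing", len(tested), "commits")
```

The point of the sketch is the cost: the number of kernels to build and
test grows with the logarithm of the number of commits in the range,
which is why bisecting tends to be cheaper than re-applying patches one
at a time.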
- Dan C.
> Le 18/08/2025 à 18:30, Dan Cross a écrit :
> > On Mon, Aug 18, 2025 at 6:02 AM Bernard Pidoux <bernard.pidoux@...e.fr> wrote:
> >> Hi,
> >>
> >> I captured a screen picture of kernel panic in linux-6.16.0 that
> >> displays [mkiss]. See included picture.
> >
> > Hi Bernard,
> >
> > This is the same issue that I and a few other folks have run into.
> > Please see the analysis in
> > https://lore.kernel.org/linux-hams/CAEoi9W4FGoEv+2FUKs7zc=XoLuwhhLY8f8t_xQ6MgTJyzQPxXA@mail.gmail.com/#R
> >
> > There, I traced the issue far enough to see that it comes from
> > `skb->dev` being NULL on these connections. I haven't had time to look
> > further into why that is, or what changed that made that the case. I
> > now think that this occurs on the _first_ of the two loops I
> > mentioned, not the second, however.
> >
> > - Dan C.
> >
> > (Aside: I'm pretty sure that `linux-hams@...r.kernel.org` is not a
> > Debian-specific list.)
>