Message-ID: <CAA93jw7xoaokrN4oGAyqh6JHAn8nJpUpp_wj-iy+GWNoV-ipNA@mail.gmail.com>
Date: Fri, 26 Oct 2018 13:25:33 -0700
From: Dave Taht <dave.taht@...il.com>
To: hkallweit1@...il.com
Cc: Oleksandr Natalenko <oleksandr@...alenko.name>,
Toke Høiland-Jørgensen <toke@...e.dk>,
"David S. Miller" <davem@...emloft.net>,
Jamal Hadi Salim <jhs@...atatu.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Jiří Pírko <jiri@...nulli.us>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
linux-kernel@...r.kernel.org
Subject: Re: CAKE and r8169 cause panic on upload in v4.19
On Fri, Oct 26, 2018 at 1:21 PM Heiner Kallweit <hkallweit1@...il.com> wrote:
>
> On 26.10.2018 21:26, Oleksandr Natalenko wrote:
> > Hello.
> >
> > I was excited regarding the fact that v4.19 introduced CAKE, so I've deployed it on my home router.
> >
> > I used this script of mine [1]:
> >
> > # bufferbloat enp3s0.100 20 20
> >
> > to do its job on the VLAN interface, where the 20/20 ISP link is switched from the home switch. Basically, it just follows [2] with simple bandwidth restriction and egress mirroring using ifb.
> >
> > Then I thought it would be nice to run speedtest-cli on one of the computers in the home LAN, connected to this router. The download stage went fine, but immediately after the upload started I got a panic on the router: [3] (sorry, it is a photo; netconsole didn't work because, I assume, the panic happened in the networking code). I rebooted the router and tried once more, and got the same result, again during the upload stage. Then I rebooted again, replaced the CAKE script with my former HTB script, and after running speedtest-cli a couple of times there was no panic.
> >
> > Before running speedtest-cli I was using CAKE for a couple of days without generating much traffic just fine. It seems it crashes only if lots of traffic is generated with tools like this.
> >
> > My sysctl: [4] and ethtool -k: [5]
> >
> > So far, I've found something similar only here: [6] [7]. The common thing is r8169 driver in use, so, maybe, it is a driver issue, and CAKE is just happy to reveal it.
> >
> > If it is something known, please point me to a possible fix. If it is something new, I'm open to provide more info on your request, try patches etc (as usual).
> >
> It seems to be the same problem as described here: https://bugzilla.kernel.org/show_bug.cgi?id=201063
> As I commented in bugzilla, the GPF in dev_hard_start_xmit and the values of R12/R15 make me think
> that a poisoned list pointer is accessed. It's so deep in the network stack that I can not really
> imagine the network driver is to blame. One screenshot attached to the bug report shows that the
> GPF also happened with the igb driver. Most likely we find out only once somebody spends effort
> on bisecting the issue.
> d4546c2509b1 ("net: Convert GRO SKB handling to list_head.") and some subsequent changes deal with
> skb list processing, maybe the issue is related to one of these changes.
Can you repeat your test with GSO splitting disabled in cake?
The option is "no-split-gso".
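
A rough sketch of how that could be applied, assuming CAKE sits at the root
of enp3s0.100 at 20 Mbit/s as in the report. The script in [1] may attach it
differently (for instance on the ifb device as well), so adjust the device
and rate to match. The cake_cmd helper and the DEV/RATE/APPLY variables are
made up for illustration; only the tc invocation itself is real:

```shell
#!/bin/sh
# Assumed values, taken from the report; override via the environment.
DEV="${DEV:-enp3s0.100}"
RATE="${RATE:-20mbit}"

# Print the tc command that re-applies CAKE with GSO splitting disabled.
cake_cmd() {
    echo "tc qdisc replace dev $1 root cake bandwidth $2 no-split-gso"
}

# Dry run by default; set APPLY=1 and run as root to actually change the qdisc.
if [ "${APPLY:-0}" = "1" ]; then
    eval "$(cake_cmd "$DEV" "$RATE")"
else
    cake_cmd "$DEV" "$RATE"
fi
```

Afterwards "tc -s qdisc show dev enp3s0.100" should list no-split-gso among
the cake parameters.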
>
> > Thanks.
> >
--
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740