Message-ID: <20180129083817.4pokzlwc4p7csik6@arbeitstier>
Date: Mon, 29 Jan 2018 09:38:17 +0100
From: Tobias Hommel <netdev-list@...oetigt.de>
To: Steffen Klassert <steffen.klassert@...unet.com>
Cc: netdev@...r.kernel.org
Subject: Re: BUG: 4.14.11 unable to handle kernel NULL pointer dereference in xfrm_lookup
On Wed, Jan 24, 2018 at 10:59:21AM +0100, Steffen Klassert wrote:
> On Fri, Jan 19, 2018 at 03:45:46PM +0100, Tobias Hommel wrote:
> >
> > I tried to strip down the system configuration and was able to reproduce the
> > problem with a minimal configuration:
> > * ipsets are not used anymore
> > * no firewall markings are used any longer
> > * iptables are "completely empty", i.e. all policies set to ACCEPT and there is
> > no rule in any table
> > * no additional routing policies (ip rule) except the default ones
> > * only main routing table is used
> > * using a "minimal" kernel config:
> > * run `make defconfig`
> > * add basic things (ESP, IGB driver, some crypto algorithms)
> > * add options required to boot up the system (TPM crypt, some device mapper
> > options, overlayfs)
> >
> > I attached the minimal config (minimal.config) and the defconfig for reference
> > (minimal.defconfig).
> >
> > The setup is really simple now, the gateway is forwarding HTTP connections
> > between eth1(IPSec tunnels) and eth0 without any firewall, NAT, whatsoever.
>
> Thanks a lot for your debugging effort!
>
> >
> > The only thing I can think of are the rather aggressive roadwarrior clients.
> > There are 750 roadwarriors that are controlled by a script which starts and
> > stops the IPSec connection.
>
> I still can't reproduce it with my tests. This is probably some race
> triggered due to your aggressive roadwarrior setup which I don't have.
>
> > I tried 4.15-rc8 and have the same problem here (see attached
> > kernel-4.15-rc8.log). SMP affinity for IRQs has changed in 4.15 and something's
>
> There is one patch that could influence this which is not in v4.15-rc8:
>
> commit 76a4201191814a0061cb5c861fafb9ecaa764846
> ("xfrm: Fix a race in the xdst pcpu cache.")
>
> It is included in v4.15-rc9.
I already tested that patch on 4.14 some weeks ago, when it appeared on the
mailing list, without any luck.
>
> If this does not fix your problem, I'm out of ideas. In this case
> I have to ask to do a bisection to find the offending commit.
>
I'll do a bisect session then. It will take some time, though, as the hardware
is currently occupied with other tests. I'll keep you up to date on the
results.