Date: Fri, 2 Feb 2018 09:09:45 +0100
From: Steffen Klassert <steffen.klassert@...unet.com>
To: Markus Berner <Markus.Berner@....ch>
CC: <netdev-list@...oetigt.de>, <netdev@...r.kernel.org>
Subject: Re: BUG: 4.14.11 unable to handle kernel NULL pointer dereference in xfrm_lookup
On Wed, Jan 31, 2018 at 09:26:51PM +0100, Markus Berner wrote:
> > I'm running into a NULL pointer dereference after updating from Linux
> > 4.1.6 to 4.14.11 (see kernel log below).
>
> We are running into the same problem on our production machine, running
> CoreOS 1576.5.0 Stable with the 4.14.11 kernel on a KVM Cloud VM. It is not
> as easy to reproduce though in our case – we observed a total of 5 crashes
> in the last 2 weeks - all except one on the production machine.
>
> > I still can't reproduce it with my tests. This is probably some race
> > triggered due to your aggressive roadwarrior setup which I don't have.
>
> We have a similar setup to Tobias
> - 2 Network Interfaces (KVM/virtio): Public and local VLAN
> - Strongswan VPN in Tunnel mode between local VLAN and on-premise network,
> running in a Docker container
> - Quite a few iptables NAT and forwarding rules regarding other local Docker
> containers
>
> Some Observations:
> - The workaround of locking the IRQs of the Rx/Tx queues of all network
> interfaces to CPU0 Tobias described a while back did not prevent the crashes
> in our case
> - The bug does not seem to correlate with load in our case, but load in
> general is quite low.
>
> I am happy to help if I can, but unfortunately our possibilities are a bit
> limited; both due to lack of kernel dev know-how as well as trying out
> changes to configuration on the production machine. I subscribed to LKML
> only now to respond, so I hope the reply works (and to the correct message).
Thanks for offering help, but I fear we have to wait until
Tobias has bisected it.