Message-ID: <CAEwTi7RF6s1OhEEPTrDzk61QQD_AFoZ+uX-WRhc=aAVHoHJxpw@mail.gmail.com>
Date: Fri, 6 Oct 2017 10:52:44 +0100
From: James Chapman <jchapman@...alix.com>
To: SviMik <svimik@...il.com>
Cc: netdev@...r.kernel.org, Guillaume Nault <g.nault@...halink.fr>
Subject: Re: Fw: [Bug 197099] New: Kernel panic in interrupt [l2tp_ppp]

On 6 October 2017 at 05:45, SviMik <svimik@...il.com> wrote:
> 2017-10-04 10:49 GMT+03:00 James Chapman <jchapman@...alix.com>:
>> On 3 October 2017 at 08:27, James Chapman <jchapman@...alix.com> wrote:
>>> For capturing complete oops messages, have you tried setting up
>>> netconsole? You might also find the full text in the syslog on reboot.
>
> Why, thank you! You've just told me that Santa Claus exists :)

You're welcome. Heh, my wife says I have a few more grey hairs and I
don't shave as often as I should. :)

> I've set up netconsole on 93 of my servers, and hope starting from
> tomorrow I'll have more pretty kernel panic reports, and get them even
> from servers where I had never had a chance to capture the console
> before.
>
>>> It's interesting that you are seeing l2tp issues since switching to
>>> 4.x kernels. Are you able to try earlier kernels to find the latest
>>> version that works? I'm curious whether things broke at v3.15.
>
> I'll try, but it will take some time to grab enough statistics. The
> bug is relatively rare, only a few panics per day on the whole bunch of
> 93 servers.
>
>> It's possible that this may be fixed by a patch that is already
>> upstream and merged for v4.14. The fix is from Guillaume Nault:
>>
>> f3c66d4 l2tp: prevent creation of sessions on terminated tunnels
>>
>> If it's possible that the L2TP server may try to create a session in a
>> tunnel that is being closed, this bug would be exposed.
>>
>> Guillaume's fix isn't yet pushed to stable releases. Are you able to
>> try a v4.14-rc build?
>
> Sorry, I'm not skilled enough to build a kernel for CentOS on my own.
> Will wait till it appears in elrepo. The latest version there is
> currently 4.13.5. Meanwhile I'll try to switch to 3.10 and see how it
> works.

No problem. Please keep us updated. If Guillaume's fix in v4.14
prevents the l2tp crashes on your systems, I'd like to push it out to
stable releases. I have been trying to reproduce the problem here but
have had no luck so far. My guess is that your l2tp servers have a
large ppp population and are handling a lot of traffic. Until we have
evidence that Guillaume's patch resolves this problem, it's harder to
justify pushing it out to stable.

> I have also captured a few more kernel panics in the last few days.
> Please see if they are related to this bug:
> http://svimik.com/hdmmsk1kp2.png
> http://svimik.com/hdmmsk1kp3.png
> http://svimik.com/hdmmsk1kp4.png
> http://svimik.com/hdmmsk2kp6.png

Thanks. None of these are related to this bug, but it looks like p3, p4
and p6 are all in the networking code. It might be worth opening
separate threads for these. A full oops capture with netconsole would
likely get more attention though.

To check whether the oops is related to this bug yourself, please
check for text that contains "l2tp_xmit_skb" before posting it to this
thread.
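
If you end up with many captured logs, a rough sketch of that check in
Python might look like the following. The function name and the sample
text are made up for illustration; only the "l2tp_xmit_skb" string comes
from this thread, and a plain grep would do the same job.

```python
# Illustrative helper (hypothetical name): find the lines of a captured
# oops that mention the symbol relevant to this bug.
MARKER = "l2tp_xmit_skb"


def matching_lines(text, marker=MARKER):
    """Return every line of `text` that contains `marker`."""
    return [line for line in text.splitlines() if marker in line]


def is_related(text, marker=MARKER):
    """True if the captured oops text mentions `marker` anywhere."""
    return bool(matching_lines(text, marker))


# Made-up sample text, not a real capture from any server.
sample = "Call Trace:\n some_other_function\n l2tp_xmit_skb\n"
for line in matching_lines(sample):
    print(line.strip())
```

An oops for which is_related() returns False probably belongs in a
separate thread.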