Message-ID: <20180906130328.fdmhpfk475gywgan@arbeitstier>
Date:   Thu, 6 Sep 2018 15:03:28 +0200
From:   Tobias Hommel <netdev-list@...oetigt.de>
To:     Kristian Evensen <kristian.evensen@...il.com>
Cc:     Steffen Klassert <steffen.klassert@...unet.com>,
        Markus Berner <Markus.Berner@....ch>,
        Network Development <netdev@...r.kernel.org>,
        Florian Westphal <fw@...len.de>,
        Wolfgang Walter <linux@...m.de>, Wei Wang <weiwan@...gle.com>
Subject: Re: BUG: 4.14.11 unable to handle kernel NULL pointer dereference in
 xfrm_lookup

Hey guys,

I finally got some time to do a bisect and we narrowed the problem down to:

b838d5e1c5b6e57b10ec8af2268824041e3ea911 is the first bad commit
commit b838d5e1c5b6e57b10ec8af2268824041e3ea911
Author: Wei Wang <weiwan@...gle.com>
Date:   Sat Jun 17 10:42:32 2017 -0700

    ipv4: mark DST_NOGC and remove the operation of dst_free()

    With the previous preparation patches, we are ready to get rid of the
    dst gc operation in ipv4 code and release dst based on refcnt only.
    So this patch adds DST_NOGC flag for all IPv4 dst and remove the calls
    to dst_free().
    At this point, all dst created in ipv4 code do not use the dst gc
    anymore and will be destroyed at the point when refcnt drops to 0.

    Signed-off-by: Wei Wang <weiwan@...gle.com>
    Acked-by: Martin KaFai Lau <kafai@...com>
    Signed-off-by: David S. Miller <davem@...emloft.net>

:040000 040000 9b7e7fb641de6531fc7887473ca47ef7cb6a11da 831a73b71d3df1755f3e24c0d3c86d7a93fd55e2 M      net


I also saw a new thread a few days ago reporting a similar problem, so I have
added you (Wolfgang, Wei) to Cc.

Tobi

On Thu, Jun 14, 2018 at 10:38:01AM +0200, Kristian Evensen wrote:
> Hello,
> 
> On Tue, Jun 12, 2018 at 10:29 AM, Kristian Evensen
> <kristian.evensen@...il.com> wrote:
> > Thanks for spending time on this. I will see what I can manage in
> > terms of a bisect. Our last good kernel was 4.9, so at least it
> > narrows the scope down a bit compared to 4.4 or 4.1.
> 
> I hope we might have got somewhere. While looking more into IPsec and
> 4.14, we noticed large performance regressions (roughly 20%) on some
> low-powered devices we are also using. We quickly identified the
> removal of the flow cache as the "culprit", and the performance
> regression is discussed in the netdev-thread for the removal of the
> cache ("xfrm: remove flow cache"). For the time being and in order to
> restore the performance, we have reverted the patch series removing
> the flow cache. When running our tests (on the APU) after the revert,
> we no longer see the crash. Before the revert, the APU would always
> crash within some hours. After the revert, our tests have been running
> for more than 24 hours. Our test is quite basic: we establish 1, 2, 3,
> ..., 50 tunnels and then run iperf on all tunnels in parallel. The
> tunnels are torn down between each iteration.
> 
> We are still running the test and will keep doing so, but I thought I
> should share this finding in case it can help in fixing the error. I
> will report back in case we find out something more, and please let me
> know if you have any suggestions for things I can test. For example, I
> don't know whether it is safe to revert the flow cache commits one at a
> time to narrow the crash down further.
> 
> BR,
> Kristian
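
For anyone wanting to reproduce this, the test loop Kristian describes above
(1, 2, 3, ..., 50 tunnels, iperf on all of them in parallel, teardown between
iterations) could be sketched roughly as follows. This is only an
illustration, not the script actually used: run_tunnel_test and the three
callbacks setup, run_traffic and teardown are hypothetical names, and the
real tunnel setup and iperf invocation are left to the caller.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tunnel_test(max_tunnels, setup, run_traffic, teardown):
    """Hypothetical harness: for n = 1..max_tunnels, bring up n tunnels,
    run traffic on all of them in parallel, then tear them all down."""
    for n in range(1, max_tunnels + 1):
        # setup(i) is assumed to establish tunnel i and return a handle.
        tunnels = [setup(i) for i in range(n)]
        try:
            with ThreadPoolExecutor(max_workers=n) as pool:
                # list() waits for all workers and re-raises any error.
                list(pool.map(run_traffic, tunnels))
        finally:
            # Tunnels are torn down between iterations, as in the report.
            for tunnel in tunnels:
                teardown(tunnel)
```

Calling run_tunnel_test(50, setup, run_traffic, teardown) with callbacks that
configure the IPsec tunnels and start iperf would give the 1-to-50 sweep from
the report.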
