Date:   Thu, 19 Oct 2017 08:40:54 -0700
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     "Anders K. Pedersen | Cohaesio" <akp@...aesio.com>
Cc:     "pstaszewski@...are.pl" <pstaszewski@...are.pl>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "pavlos.parissis@...il.com" <pavlos.parissis@...il.com>,
        "intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>,
        "alexander.h.duyck@...el.com" <alexander.h.duyck@...el.com>
Subject: Re: Linux 4.12+ memory leak on router with i40e NICs

On Thu, Oct 19, 2017 at 5:19 AM, Anders K. Pedersen | Cohaesio
<akp@...aesio.com> wrote:
> Hi Alex,
>
> On Wed, 2017-10-18 at 16:37 -0700, Alexander Duyck wrote:
>> When we last talked I had asked if you could do a git bisect to find
>> the memory leak and you said you would look into it. The most useful
>> way to solve this would be to do a git bisect between your current
>> kernel and the 4.11 kernel to find the point at which this started. If
>> we can do that then fixing this becomes much simpler as we just have
>> to fix the patch that introduced the issue.
>
> We're also seeing a smaller memory leak (about 1 GB per day) than the
> original one even with the "Fix memory leak related filter programming
> status" fix applied. So far I've determined that the leak is present on
> 4.13.7 and was introduced between 4.11 and 4.12, so I'll do another
> round of bisection to identify the patch that introduced this.
>
> Since the router must run for a couple of hours before I can be sure
> whether a kernel is good or bad, and I can't reboot it during working
> hours, it'll probably be about a week before I have a result.
>
> --
> Venlig hilsen / Best Regards
>
> Anders K. Pedersen
> Senior Technical Manager

Anders,

I'll do some digging on my side to see if I can find any other memory
leaks that might be floating around in the driver that could have been
introduced during that time-frame.

One thing you might try that would help with your testing would be to
just disable the ATR functionality in i40e. You can do that with the
ethtool command:

    ethtool --set-priv-flags <iface> flow-director-atr off

That should allow you to bisect this without needing to deal with the
"programming status" patches, since you won't be programming ATR
filters, which is what caused that leak.
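
As a rough sketch of how that step could be wrapped for repeated use
on each kernel under test: the function name atr_off and the DRY_RUN
variable below are illustrative assumptions, not part of ethtool or
i40e. It assumes the driver exposes the flow-director-atr private flag
named above, and it uses ethtool's --show-priv-flags option to print
the flags before and after the change so the result can be checked.

```shell
# Hypothetical helper: disable i40e ATR on one interface.
# With DRY_RUN set to a non-empty value, the commands are only printed.
atr_off() {
    if [ -z "$1" ]; then
        echo "usage: atr_off <iface>" >&2
        return 1
    fi
    iface="$1"
    # Expands to "echo" in dry-run mode, otherwise to nothing.
    run=${DRY_RUN:+echo}

    # Show the private flags before the change.
    $run ethtool --show-priv-flags "$iface"
    # Turn ATR off (the command from the message above).
    $run ethtool --set-priv-flags "$iface" flow-director-atr off
    # Show the flags again to confirm the new setting.
    $run ethtool --show-priv-flags "$iface"
}
```

Running "atr_off <iface>" as root after each boot into a kernel under
test would apply the setting. Whether the flag persists across reboots
is not something this sketch assumes, so it simply reapplies it.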

Thanks for looking into this.

- Alex
