Message-ID: <CANn89iLmzADQF5=htGsjNwXrUZxMUYaYZB+6WOM_vo6zG+JB9A@mail.gmail.com>
Date:   Wed, 10 Apr 2019 08:36:06 -0700
From:   Eric Dumazet <edumazet@...gle.com>
To:     Juha-Matti Tilli <juha-matti.tilli@...eca.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        netdev <netdev@...r.kernel.org>,
        Rafael Aquini <aquini@...hat.com>,
        Murphy Zhou <xzhou@...hat.com>,
        Yongcheng Yang <yoyang@...hat.com>,
        Jianhong Yin <jiyin@...hat.com>
Subject: Re: [PATCH] net: add big honking pfmemalloc OOM warning

On Wed, Apr 10, 2019 at 8:01 AM Juha-Matti Tilli
<juha-matti.tilli@...eca.com> wrote:
>
> On Wed, Apr 10, 2019 at 5:16 PM Eric Dumazet <edumazet@...gle.com> wrote:
> > If NFS sessions hang, then there is a bug to eventually root cause and fix.
> >
> > Just telling the user "increase the limit" is the same thing as admitting:
> >
> > Our limit system or TCP or NFS stacks are broken and unable to
> > recover, so let's disable the limit system and work around a more
> > serious bug.
> >
> > Maybe the bug is in a NIC driver; please share more details before
> > adding yet another noisy signal in syslog.
> >
> > SNMP counters are per netns, and more useful in the modern computing
> > era, where a host is shared by many different containers.
>
> Any idea where the bug might be?
>

Before diving into the details, can we first double-check exactly which
kernel version you are using?

Some pfmemalloc bugs have been fixed in the past, and I do not want to
spend time finding them a second time.



> It can't be in NFS, because I have observed the issue to be a TCP
> level issue. NFS would be working just fine if TCP worked, but the
> underlying TCP connection is not working fine, unless we bump up
> vm.min_free_kbytes.
>
> It could be in ixgbe, because the incoming SKB gets pfmemalloc pages
> for some reason, and that happens repeatedly for a duration of 5-10
> minutes for every single retransmit, until the condition clears. Ping
> is working just fine at the time the NFS connection is stuck. I think
> these 63-queue NICs use a different queue for ping than they use for the
> TCP NFS connection. I think there is some code in ixgbe for not
> reusing pfmemalloc pages, but it seems every packet nevertheless gets
> a pfmemalloc page in the queue that is used for TCP NFS. Might the
> cause be that if ixgbe gets the pages in large bunches, it gets
> multiple pfmemalloc pages at a time and then every packet is dropped
> until all the pfmemalloc pages run out (not being reused)?
>
> It could also be in the default value of vm.min_free_kbytes, but I'm
> not experienced enough in Linux kernel internals to adjust the complex
> calculations. Just saying that 90 MB sounds ridiculously low on a 256
> GB NUMA machine.
>
> Are you of the opinion that Intel as the developer of ixgbe should be informed?
>
> Anyway, I posted more details to the mailing lists about a week ago,
> search for "NFS hang, sk_drops increasing, segs_in not, pfmemalloc
> suspected" in the mailing lists, or click this direct link:
> https://lkml.org/lkml/2019/4/3/682
>
> The current situation is that we've been running the production system
> for 2 weeks with a bumped-up vm.min_free_kbytes, no NFS hangs, whereas
> before the bump, we had approximately one hang per day, so without the
> bump, the period of 2 weeks would have approximately 14 NFS hangs.
>
> To me, this OOM condition seems to be global, so having it per-netns
> offers no clear benefit in my opinion. Or is vm.min_free_kbytes
> tunable per container?
>
> BR, Juha-Matti
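
For anyone following the thread who wants to report the same numbers, here is a minimal sketch (not from either poster) that collects the kernel release asked for above and expresses vm.min_free_kbytes as a fraction of total RAM, which is the comparison behind the "90 MB on a 256 GB machine" remark. The function names parse_meminfo_kb, reserve_fraction and report are hypothetical; the only system interfaces assumed are the standard /proc/sys/vm/min_free_kbytes and /proc/meminfo files on Linux.

```python
import platform
from pathlib import Path


def parse_meminfo_kb(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are
    plain counts, so only look up fields whose unit you know.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def reserve_fraction(min_free_kb, mem_total_kb):
    """Return min_free_kbytes as a fraction of total memory."""
    return min_free_kb / mem_total_kb


def report():
    """Build a one-line summary from the running Linux system."""
    min_free_kb = int(Path("/proc/sys/vm/min_free_kbytes").read_text())
    meminfo = parse_meminfo_kb(Path("/proc/meminfo").read_text())
    mem_total_kb = meminfo["MemTotal"]
    fraction = reserve_fraction(min_free_kb, mem_total_kb)
    return (
        f"kernel {platform.release()}: "
        f"vm.min_free_kbytes={min_free_kb} kB, "
        f"MemTotal={mem_total_kb} kB, "
        f"reserve={fraction:.4%} of RAM"
    )
```

On a Linux host, print(report()) gives a line that can be pasted into a bug report. With the figures quoted in the thread, reserve_fraction(90 * 1024, 256 * 1024 * 1024) is roughly 0.00034, i.e. about 0.03% of RAM. That ratio alone does not show whether the default is wrong; it only puts a number on the observation.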
