Message-ID: <CALvZod4Ww1vzxD90HFePVweFaQn+3WDwu8G-gHMA1AeiJGprBg@mail.gmail.com>
Date:   Fri, 30 Oct 2020 10:01:38 -0700
From:   Shakeel Butt <shakeelb@...gle.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Linux MM <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Roman Gushchin <guro@...com>,
        LKML <linux-kernel@...r.kernel.org>,
        Greg Thelen <gthelen@...gle.com>
Subject: Re: Machine lockups on extreme memory pressure

On Tue, Sep 22, 2020 at 10:01 AM Michal Hocko <mhocko@...e.com> wrote:
>
> On Tue 22-09-20 09:51:30, Shakeel Butt wrote:
> > On Tue, Sep 22, 2020 at 9:34 AM Michal Hocko <mhocko@...e.com> wrote:
> > >
> > > On Tue 22-09-20 09:29:48, Shakeel Butt wrote:
> [...]
> > > > Anyways, what do you think of the in-kernel PSI based
> > > > oom-kill trigger. I think Johannes had a prototype as well.
> > >
> > > We have talked about something like that in the past and established
> > > that auto tuning for oom killer based on PSI is almost impossible to get
> > > right for all potential workloads and that so this belongs to userspace.
> > > The kernel's oom killer is there as a last resort when system gets close
> > > to meltdown.
> >
> > The system is already in meltdown state from the users perspective. I
> > still think allowing the users to optionally set the oom-kill trigger
> > based on PSI makes sense. Something like 'if all processes on the
> > system are stuck for 60 sec, trigger oom-killer'.
>
> We already do have watchdogs for that no? If you cannot really schedule
> anything then soft lockup detector should fire. In a meltdown state like
> that the reboot is likely the best way forward anyway.

Yes, the soft lockup detector can catch this situation, but I still
think we can do better than panic/reboot.

Anyway, I think we now know the reason for this extreme pressure, and
I wanted to share it in case someone else is facing a similar
situation.

There were several thousand TCP delayed ACKs queued on the system. The
system was under memory pressure, and the alloc_skb(GFP_ATOMIC) calls
for delayed ACKs were either stealing memory from reclaimers or
failing. For the delayed ACKs whose allocation failed, the kernel
rescheduled them indefinitely. So, these failing allocations for
delayed ACKs kept the system in this lockup state for hours. The
commit a37c2134bed6 ("tcp: add exponential backoff in
__tcp_send_ack()") recently added the fix for this situation.
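
To illustrate why backoff helps here, below is a rough userspace
simulation, not the kernel code from that commit. It counts how often
one socket retries a failing ACK allocation inside a time window, with
and without exponential backoff. The function name count_retries and
the numbers (200 ms base delay, 120 s cap, 60 s window) are
assumptions picked for illustration, and every allocation is assumed
to fail.

```python
def count_retries(window_ms, base_delay_ms, max_delay_ms, backoff):
    """Count failed ACK allocation attempts by one socket within window_ms.

    Hypothetical model: every attempt fails, and the next attempt is
    scheduled after `delay` milliseconds.
    """
    elapsed = 0
    attempts = 0
    retry = 0  # number of times the delay has been doubled so far
    while elapsed < window_ms:
        attempts += 1  # one allocation attempt, assumed to fail
        if backoff:
            # Double the delay on each failure, capped at max_delay_ms.
            delay = min(base_delay_ms << retry, max_delay_ms)
            if delay < max_delay_ms:
                retry += 1
        else:
            # No backoff: retry at a fixed interval for as long as it fails.
            delay = base_delay_ms
        elapsed += delay
    return attempts


if __name__ == "__main__":
    fixed = count_retries(60_000, 200, 120_000, backoff=False)
    backed_off = count_retries(60_000, 200, 120_000, backoff=True)
    print(f"fixed interval: {fixed} attempts per socket in 60 s")
    print(f"with backoff:   {backed_off} attempts per socket in 60 s")
```

Under these assumed numbers a single socket makes 300 attempts in 60
seconds at a fixed interval versus 9 with backoff. Multiplied by
several thousand queued delayed ACKs, that is the difference between
constant GFP_ATOMIC pressure and a trickle.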
