linux-kernel - Re: OOM killer not nearly agressive enough?

Message-ID: <20200109224845.GA1220@amd>
Date:   Thu, 9 Jan 2020 23:48:45 +0100
From:   Pavel Machek <pavel@....cz>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     kernel list <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...l.org>, linux-mm@...ck.org,
        akpm@...ux-foundation.org
Subject: Re: OOM killer not nearly agressive enough?

Hi!

> > > > Do we agree that OOM killer should have reacted way sooner?
> > > 
> > > This is impossible to answer without knowing what was going on at the
> > > time. Was the system thrashing over page cache/swap? In other words, is
> > > the system completely out of memory or refaulting the working set all
> > > the time because it doesn't fit into memory?
> > 
> > Swap was full, so "completely out of memory", I guess. Chromium does
> > that fairly often :-(.
> 
> The oom heuristic is based on the reclaim failure. If the reclaim makes
> some progress then the oom killer is not hit. Have a look at
> should_reclaim_retry for more details.

Thanks for the pointer.

I guess setting MAX_RECLAIM_RETRIES to 1 is not something you'd
recommend? :-).
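
To make the heuristic concrete, here is a toy model of the retry counter only. It is a sketch, not the kernel code: the real should_reclaim_retry also looks at the allocation order and at zone watermarks, which are ignored here. The value 16 for MAX_RECLAIM_RETRIES is what I believe mainline used at the time; treat it as an assumption. The function name rounds_until_oom is hypothetical.

```python
# Toy model of the "no progress" counter behind the OOM decision.
# Assumption: MAX_RECLAIM_RETRIES is 16, as in mainline kernels of that era.
MAX_RECLAIM_RETRIES = 16


def rounds_until_oom(progress_per_round, max_retries=MAX_RECLAIM_RETRIES):
    """Return the 1-based reclaim round at which this model gives up.

    progress_per_round: iterable of pages reclaimed in each round.
    Returns None if the model never gives up on the given input.
    """
    no_progress_loops = 0
    for round_no, reclaimed in enumerate(progress_per_round, start=1):
        if reclaimed > 0:
            # Any progress at all resets the counter.
            no_progress_loops = 0
        else:
            no_progress_loops += 1
        if no_progress_loops > max_retries:
            return round_no
    return None


if __name__ == "__main__":
    # Reclaim that never frees anything: gives up after 17 fruitless rounds.
    print(rounds_until_oom([0] * 100))
    # Thrashing: one page freed every other round, so the counter keeps resetting.
    print(rounds_until_oom([1, 0] * 100))
    # The same fruitless reclaim with the limit lowered to 1.
    print(rounds_until_oom([0] * 100, max_retries=1))
```

In this model a system that frees even a single page every few rounds never reaches the limit, which matches the behaviour described in this thread.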

> > PSI is completely different system, but I guess
> > I should attempt to tweak the existing one first...
> 
> PSI is measuring the cost of the allocation (among other things) and
> that can give you some idea on how much time is spent to get memory.
> Userspace can implement a policy based on that and act. The kernel oom
> killer is the last resort when there is really no memory to
> allocate.
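
A minimal sketch of such a userspace policy, assuming a kernel built with PSI so that /proc/pressure/memory exists. The parser follows the documented "some"/"full" line format; the helper names, the sample values, and the 10% threshold on the "full avg10" figure are hypothetical choices for illustration, not recommendations.

```python
# Sample contents in the format of /proc/pressure/memory (values made up).
SAMPLE = """some avg10=35.20 avg60=12.01 avg300=3.10 total=123456789
full avg10=22.75 avg60=8.40 avg300=2.05 total=98765432
"""


def parse_psi(text):
    """Parse PSI file contents into {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


def should_act(psi, full_avg10_limit=10.0):
    """Hypothetical policy: act when all tasks were stalled on memory for
    at least full_avg10_limit percent of the last 10 seconds."""
    return psi.get("full", {}).get("avg10", 0.0) >= full_avg10_limit


if __name__ == "__main__":
    try:
        with open("/proc/pressure/memory") as f:
            psi = parse_psi(f.read())
    except OSError:
        # No PSI on this system; fall back to the made-up sample.
        psi = parse_psi(SAMPLE)
    print(psi, should_act(psi))
```

What "act" means (kill the largest process, notify the user, and so on) is left open here; that is the policy part userspace would have to decide.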

So what I'm seeing is a system that is unresponsive, easily for an hour.

Sometimes I'm able to log in. When I could do that, the system was
absurdly slow, like ps printing at more than 10 seconds per line.
ps on my system normally takes 300 msec; my estimate for the slow case
would be 2000 seconds, which is a slowdown of roughly 6000x. That would
be an X terminal opening in like two hours... that's not really usable.
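
The arithmetic behind that estimate, spelled out. The 300 msec, 10 seconds per line, and 2000 seconds figures are from the paragraph above; the one-second normal start-up time for an X terminal is an assumption added here to reproduce the "two hours" figure.

```python
normal_ps_seconds = 0.3    # ps on a healthy system
seconds_per_line = 10.0    # observed output rate while thrashing
slow_ps_seconds = 2000.0   # estimated total run time while thrashing

# The 2000 s estimate corresponds to about this many lines of ps output.
implied_lines = slow_ps_seconds / seconds_per_line

# Slowdown factor: comes out somewhat above the 6000x quoted above.
slowdown = slow_ps_seconds / normal_ps_seconds

# Assumption: an X terminal normally opens in about one second.
xterm_normal_seconds = 1.0
xterm_slow_hours = xterm_normal_seconds * slowdown / 3600.0

if __name__ == "__main__":
    print(f"lines of ps output: {implied_lines:.0f}")
    print(f"slowdown factor:    {slowdown:.0f}x")
    print(f"X terminal start:   {xterm_slow_hours:.2f} hours")
```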

DRAM is in the 100 nsec range, disk is in the 10 msec range; so the
worst-case slowdown is somewhere in the 100000x range. (Actually, in the
worst case userland will make no progress at all, since you can need 4+
pages in a single CPU instruction, right?)
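
A back-of-the-envelope model of those latency figures, as a sketch only. It assumes every memory access costs either one DRAM access or one disk access and ignores caches, readahead, and queueing. Under those assumptions it gives the worst-case ratio and the fraction of accesses that would have to hit disk to produce a given slowdown, such as the roughly 6000x seen with ps.

```python
DRAM_SECONDS = 100e-9   # ~100 nsec per access, as above
DISK_SECONDS = 10e-3    # ~10 msec per access, as above

# Worst case: every access goes to disk.
worst_case = DISK_SECONDS / DRAM_SECONDS


def slowdown(fault_fraction):
    """Average slowdown when fault_fraction of accesses go to disk."""
    average = (1.0 - fault_fraction) * DRAM_SECONDS + fault_fraction * DISK_SECONDS
    return average / DRAM_SECONDS


def fault_fraction_for(target_slowdown):
    """Invert slowdown(): fraction of disk accesses giving target_slowdown."""
    return (target_slowdown - 1.0) / (worst_case - 1.0)


if __name__ == "__main__":
    print(f"worst-case slowdown: {worst_case:.0f}x")
    p = fault_fraction_for(6000.0)
    print(f"fault fraction for 6000x: {p:.3f}")
```

In this simple model, about 6% of accesses going to disk is already enough for a 6000x slowdown.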

But the kernel is happy; the system is unusable and will stay unusable
for an hour or more, and there's not much the user can do. (Besides
sysrq, thanks for the hint).

Can we do better? This is the equivalent of a system crash, and it is
_way_ too easy to trigger. Should we do better by default?

Dunno. If the user moved the mouse, and the cursor did not move for 10
seconds, perhaps it is time for an oom kill?

Or should I add more swap? Is it terrible to place swap on an SSD?

Best regards,

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
