lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1567699853.5576.98.camel@lca.pw>
Date:   Thu, 05 Sep 2019 12:10:53 -0400
From:   Qian Cai <cai@....pw>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        David Rientjes <rientjes@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH] mm, oom: disable dump_tasks by default

On Tue, 2019-09-03 at 17:13 +0200, Michal Hocko wrote:
> On Tue 03-09-19 11:02:46, Qian Cai wrote:
> > On Tue, 2019-09-03 at 16:45 +0200, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@...e.com>
> > > 
> > > dump_tasks has been introduced by quite some time ago fef1bdd68c81
> > > ("oom: add sysctl to enable task memory dump"). It's primary purpose is
> > > to help analyse oom victim selection decision. This has been certainly
> > > useful at times when the heuristic to chose a victim was much more
> > > volatile. Since a63d83f427fb ("oom: badness heuristic rewrite")
> > > situation became much more stable (mostly because the only selection
> > > criterion is the memory usage) and reports about a wrong process to
> > > be shot down have become effectively non-existent.
> > 
> > Well, I still see OOM sometimes kills wrong processes like ssh, systemd
> > processes while LTP OOM tests with staight-forward allocation patterns.
> 
> Please report those. Most cases I have seen so far just turned out to
> work as expected and memory hogs just used oom_score_adj or similar.

Here is the one where oom01 should be one to be killed.

[92598.855697][ T2588] Swap cache stats: add 105240923, delete 105250445, find
42196/101577
[92598.893970][ T2588] Free swap  = 16383612kB
[92598.913482][ T2588] Total swap = 16465916kB
[92598.932938][ T2588] 7275091 pages RAM
[92598.950212][ T2588] 0 pages HighMem/MovableOnly
[92598.971539][ T2588] 1315554 pages reserved
[92598.990698][ T2588] 16384 pages cma reserved
[92599.010760][ T2588] Tasks state (memory values in pages):
[92599.036265][ T2588] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes
swapents oom_score_adj name
[92599.080129][ T2588]
[   1662]     0  1662    29511     1034   290816      244             0 systemd-
journal
[92599.126163][ T2588]
[   2586]   998  2586   508086        0   368640     1838             0 polkitd
[92599.168706][ T2588]
[   2587]     0  2587    52786        0   421888      500             0 sssd
[92599.210082][ T2588]
[   2588]     0  2588    31223        0   139264      195             0
irqbalance
[92599.255606][ T2588]
[   2589]    81  2589    18381        0   167936      217          -900 dbus-
daemon
[92599.303678][ T2588]
[   2590]     0  2590    97260      193   372736      573             0
NetworkManager
[92599.348957][ T2588]
[   2594]     0  2594    95350        1   229376      758             0 rngd
[92599.390216][ T2588]
[   2598]   995  2598     7364        0    94208      103             0 chronyd
[92599.432447][ T2588]
[   2629]     0  2629   106234      399   442368     3836             0 tuned
[92599.473950][ T2588]
[   2638]     0  2638    23604        0   212992      240         -1000 sshd
[92599.515158][ T2588]
[   2642]     0  2642    10392        0   102400      138             0
rhsmcertd
[92599.560435][ T2588]
[   2691]     0  2691    21877        0   208896      277             0 systemd-
logind
[92599.605035][ T2588]
[   2700]     0  2700     3916        0    69632       45             0 agetty
[92599.646750][ T2588]
[   2705]     0  2705    23370        0   225280      393             0 systemd
[92599.688063][ T2588]
[   2730]     0  2730    37063        0   294912      667             0 (sd-pam)
[92599.729028][ T2588]
[   2922]     0  2922     9020        0    98304      232             0 crond
[92599.769130][ T2588]
[   3036]     0  3036    37797        1   307200      305             0 sshd
[92599.813768][ T2588]
[   3057]     0  3057    37797        0   303104      335             0 sshd
[92599.853450][ T2588]
[   3065]     0  3065     6343        1    86016      163             0 bash
[92599.892899][ T2588] [  38249]     0
38249    58330      293   221184      246             0 rsyslogd
[92599.934457][ T2588] [  11329]     0
11329    55131       73   454656      396             0 sssd_nss
[92599.976240][ T2588] [  11331]     0
11331    54424        1   434176      610             0 sssd_be
[92600.017106][ T2588] [  25247]     0
25247    25746        1   212992      300         -1000 systemd-udevd
[92600.060539][ T2588] [  25391]     0
25391     2184        0    65536       32             0 oom01
[92600.100648][ T2588] [  25392]     0
25392     2184        0    65536       39             0 oom01
[92600.143516][ T2588] oom-
kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0-
1,global_oom,task_memcg=/system.slice/tuned.service,task=tuned,pid=2629,uid=0
[92600.213724][ T2588] Out of memory: Killed process 2629 (tuned) total-
vm:424936kB, anon-rss:328kB, file-rss:1268kB, shmem-rss:0kB, UID:0
pgtables:442368kB oom_score_adj:0
[92600.297832][  T305] oom_reaper: reaped process 2629 (tuned), now anon-
rss:0kB, file-rss:0kB, shmem-rss:0kB


> 
> > I just
> > have not had a chance to debug them fully. The situation could be worse with
> > more complex allocations like random stress or fuzzy testing.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ