Message-Id: <1506473616-88120-1-git-send-email-yang.s@alibaba-inc.com>
Date:   Wed, 27 Sep 2017 08:53:33 +0800
From:   "Yang Shi" <yang.s@...baba-inc.com>
To:     cl@...ux.com, penberg@...nel.org, rientjes@...gle.com,
        iamjoonsoo.kim@....com, akpm@...ux-foundation.org,
        mhocko@...nel.org
Cc:     "Yang Shi" <yang.s@...baba-inc.com>, <linux-mm@...ck.org>,
        <linux-kernel@...r.kernel.org>
Subject: [PATCH 0/3 v7] oom: capture unreclaimable slab info in oom message when kernel panic


Recently we ran into an oom issue: the kernel panicked because there was no killable process.
The dmesg output shows that unreclaimable slabs had consumed almost 100% of memory, but kdump failed to capture a vmcore for some reason.

So it would be helpful to capture unreclaimable slab info in the oom message when the kernel panics, to aid troubleshooting and cover this corner case.
Since the kernel is already panicking, capturing more information is worthwhile and does not interfere with the normal oom killer.

With the patchset, tools/vm/slabinfo has a new option, "-U", to show unreclaimable slabs only.

In addition, the oom killer message will list all unreclaimable slabs with a non-zero footprint (num_objs * size != 0).
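
The filter described above can be sketched in user-space C as follows. This is only an illustration, not the code from the patches: the struct cache_info, its fields, and the function dump_unreclaimable() are hypothetical names chosen for the example. Only the restriction to unreclaimable caches and the num_objs * size != 0 condition come from the description.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the statistics of one kmem cache. */
struct cache_info {
    const char *name;
    bool reclaimable;        /* true if the cache counts as reclaimable */
    unsigned long num_objs;  /* total number of objects in the cache */
    unsigned long size;      /* size of one object, in bytes */
};

/*
 * Print every unreclaimable cache whose footprint is non-zero and
 * return how many caches were printed.
 */
size_t dump_unreclaimable(const struct cache_info *caches, size_t n, FILE *out)
{
    size_t printed = 0;

    for (size_t i = 0; i < n; i++) {
        const struct cache_info *c = &caches[i];

        if (c->reclaimable)
            continue;   /* only unreclaimable slabs are of interest */
        if (c->num_objs == 0 || c->size == 0)
            continue;   /* num_objs * size would be zero: skip */

        fprintf(out, "%-20s %10luKB\n", c->name,
                c->num_objs * c->size / 1024);
        printed++;
    }
    return printed;
}
```

Testing both factors against zero instead of the product keeps the check correct even if the multiplication were to wrap around.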

For details, please see the commit log for each commit.

Changelog v6 -> v7:
* Added the unreclaim_slabs_oom_ratio proc knob; unreclaimable slab info is dumped when the ratio of unreclaimable slab memory to all user memory exceeds the knob's value
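
A minimal sketch of such a threshold check, in user-space C. The helper name should_dump_unreclaimable() and its parameters are hypothetical, and the unit of the knob is an assumption here: the sketch treats the ratio as a percentage, which this cover letter does not state. See the Documentation/sysctl/vm.txt change in patch 3 for the actual semantics.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether unreclaimable slab info should be
 * dumped. ASSUMPTION: ratio_percent is a percentage; the real unit of
 * unreclaim_slabs_oom_ratio is defined by the patch, not by this sketch.
 *
 * The comparison unreclaim / user > ratio / 100 is cross-multiplied so
 * that no division is needed and user_pages == 0 is handled naturally.
 */
bool should_dump_unreclaimable(unsigned long unreclaim_pages,
                               unsigned long user_pages,
                               unsigned long ratio_percent)
{
    unsigned long long lhs = (unsigned long long)unreclaim_pages * 100ULL;
    unsigned long long rhs = (unsigned long long)user_pages * ratio_percent;

    return lhs > rhs;
}
```

With these assumptions, 60 unreclaimable pages against 100 user pages passes a threshold of 50, while exactly 50 pages does not, because the comparison is strict.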

Changelog v5 -> v6:
* Fixed a checkpatch.pl warning for patch #2

Changelog v4 -> v5:
* Addressed the comments from David
* Build-tested with SLABINFO = n

Changelog v3 -> v4:
* Addressed the comments from David
* Added David’s Acked-by in patch 1

Changelog v2 -> v3:
* Show used size and total size of each kmem cache per David’s comment

Changelog v1 -> v2:
* Removed the original patch 1 (“mm: slab: output reclaimable flag in /proc/slabinfo”) since Christoph suggested it might break compatibility and /proc/slabinfo is legacy
* Added Christoph’s Acked-by
* Removed acquiring slab_mutex per Tetsuo’s comment


Yang Shi (3):
      tools: slabinfo: add "-U" option to show unreclaimable slabs only
      mm: oom: show unreclaimable slab info when kernel panic
      doc: add description for unreclaim_slabs_oom_ratio

 Documentation/sysctl/vm.txt | 12 ++++++++++++
 include/linux/oom.h         |  1 +
 include/uapi/linux/sysctl.h |  1 +
 kernel/sysctl.c             |  9 +++++++++
 kernel/sysctl_binary.c      |  1 +
 mm/oom_kill.c               | 31 +++++++++++++++++++++++++++++++
 mm/slab.h                   |  8 ++++++++
 mm/slab_common.c            | 29 +++++++++++++++++++++++++++++
 tools/vm/slabinfo.c         | 11 ++++++++++-
 9 files changed, 102 insertions(+), 1 deletion(-)
