Message-ID: <20170928135357.GA8470@castle.DHCP.thefacebook.com>
Date:   Thu, 28 Sep 2017 14:53:57 +0100
From:   Roman Gushchin <guro@...com>
To:     Andrew Morton <akpm@...ux-foundation.org>
CC:     Linus Torvalds <torvalds@...ux-foundation.org>,
        <linux-mm@...ck.org>, Alexander Viro <viro@...iv.linux.org.uk>,
        Ingo Molnar <mingo@...nel.org>, <kernel-team@...com>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [RESEND] proc, coredump: add CoreDumping flag to /proc/pid/status

On Wed, Sep 27, 2017 at 04:31:06PM -0700, Andrew Morton wrote:
> On Wed, 20 Sep 2017 16:06:34 -0700 Roman Gushchin <guro@...com> wrote:
> 
> > Right now there is no convenient way to check if a process is being
> > coredumped at the moment.
> > 
> > It might be necessary to recognize such state to prevent killing
> > the process and getting a broken coredump.
> > Writing a large core might take significant time, and the process
> > is unresponsive during it, so it might be killed by timeout,
> > if another process is monitoring and killing/restarting
> > hanging tasks.
> > 
> > To provide an ability to detect if a process is in the state of
> > being coredumped, we can expose a boolean CoreDumping flag
> > in /proc/pid/status.
> > 
> > Example:
> > $ cat core.sh
> >   #!/bin/sh
> > 
> >   echo "|/usr/bin/sleep 10" > /proc/sys/kernel/core_pattern
> >   sleep 1000 &
> >   PID=$!
> > 
> >   cat /proc/$PID/status | grep CoreDumping
> >   kill -ABRT $PID
> >   sleep 1
> >   cat /proc/$PID/status | grep CoreDumping
> > 
> > $ ./core.sh
> >   CoreDumping:	0
> >   CoreDumping:	1
> 
> I assume you have some real-world use case which benefits from this.

Sure, we're getting a significant number of corrupted coredump files
on machines in our fleet, just because processes are being killed
by timeout in the middle of writing the core.

We do have a process health check, and an agent is responsible for
restarting processes which are not responding to health check requests.
Writing a large coredump to disk can easily exceed a reasonable timeout
(especially on an overloaded machine).

This flag will allow the agent to distinguish processes which are being
coredumped, extend the timeout for them, and let them produce a full
coredump file.
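
A minimal sketch of how such an agent might consume the flag, assuming
it is a shell script that polls /proc/<pid>/status. The function names
and the timeout arguments are hypothetical and not part of the patch;
on kernels without the flag the field is absent and the normal timeout
applies.

```shell
#!/bin/sh

# Hypothetical helper: print the CoreDumping value found in a status file.
# Prints nothing and returns non-zero if the field is absent (for example
# on a kernel without the flag) or the file cannot be read.
coredumping_flag() {
    # $1: path to a status file, e.g. /proc/1234/status
    awk '$1 == "CoreDumping:" { print $2; found = 1; exit }
         END { exit !found }' "$1" 2>/dev/null
}

# Hypothetical policy: print the timeout the agent should apply to a process.
health_timeout() {
    # $1: pid, $2: normal timeout, $3: extended timeout (units are up to the caller)
    if [ "$(coredumping_flag "/proc/$1/status")" = "1" ]; then
        echo "$3"    # a core is being written: give the process more time
    else
        echo "$2"    # not dumping, or flag unavailable: keep the normal timeout
    fi
}
```

An agent could call, say, `health_timeout "$PID" 30 600` just before
deciding to kill an unresponsive process. The check is inherently racy
(a dump may start right after the read), so it can reduce the number of
truncated cores but not rule them out.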

> 
> >  fs/proc/array.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> 
> A Documentation/ would be appropriate?   Include a brief mention of
> *why* someone might want to use this...
> 
>

Here it is. Thank you!

--

From 71f86fc2bdd6104dc7d63c0c2eeb6b414494a582 Mon Sep 17 00:00:00 2001
From: Roman Gushchin <guro@...com>
Date: Thu, 28 Sep 2017 13:47:19 +0100
Subject: [PATCH] proc: document CoreDumping flag in /proc/<pid>/status

Add description for the CoreDumping flag in /proc/<pid>/status.

The flag is intended to be used to avoid killing processes
during the generation of the coredump files and avoid getting
corrupted coredump files.

Signed-off-by: Roman Gushchin <guro@...com>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: Alexander Viro <viro@...iv.linux.org.uk>
Cc: Ingo Molnar <mingo@...nel.org>
Cc: Oleg Nesterov <oleg@...hat.com>
Cc: kernel-team@...com
Cc: linux-doc@...r.kernel.org
Cc: linux-kernel@...r.kernel.org
---
 Documentation/filesystems/proc.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index adba21b5ada7..bc832f8b7a70 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -181,6 +181,7 @@ read the file /proc/PID/status:
   VmPTE:        20 kb
   VmSwap:        0 kB
   HugetlbPages:          0 kB
+  CoreDumping:    0
   Threads:        1
   SigQ:   0/28578
   SigPnd: 0000000000000000
@@ -254,6 +255,8 @@ Table 1-2: Contents of the status files (as of 4.8)
  VmSwap                      amount of swap used by anonymous private data
                              (shmem swap usage is not included)
  HugetlbPages                size of hugetlb memory portions
+ CoreDumping                 process's memory is currently being dumped
+                             (killing the process may lead to a corrupted core)
  Threads                     number of threads
  SigQ                        number of signals queued/max. number for queue
  SigPnd                      bitmap of pending signals for the thread
-- 
2.13.5
