Message-ID: <c9427e40-10b1-49eb-9baa-dde1364e8fe5@efficios.com>
Date: Fri, 23 Feb 2024 11:54:30 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Oleg Nesterov <oleg@...hat.com>, wenyang.linux@...mail.com,
 Masami Hiramatsu <mhiramat@...nel.org>, Ingo Molnar <mingo@...nel.org>,
 Mel Gorman <mgorman@...hsingularity.net>,
 Peter Zijlstra <peterz@...radead.org>, linux-kernel@...r.kernel.org,
 lttng-dev <lttng-dev@...ts.lttng.org>,
 Karim Yaghmour <karim.yaghmour@...rsys.com>,
 Matthew Khouzam <matthew.khouzam@...csson.com>
Subject: Re: [PATCH] coredump debugging: add a tracepoint to report the
 coredumping

On 2024-02-23 09:26, Steven Rostedt wrote:
> On Mon, 19 Feb 2024 13:01:16 -0500
> Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
> 
>> Between "sched_process_exit" and "sched_process_free", the task can still be
>> observed by a trace analysis looking at sched and signal events: it's a zombie at
>> that stage.
> 
> Looking at the history of this tracepoint, it was added in 2008 by commit
> 0a16b60758433 ("tracing, sched: LTTng instrumentation - scheduler").
> Hmm, LLTng? I wonder who the author was?

[ common typo: LLTng -> LTTng ;-) ]

> 
>    Author: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
> 
>   :-D
> 
> Mathieu, I would say it's your call on where the tracepoint can be located.
> You added it, you own it!

Wow! that's now 16 years ago :)

I've checked with Matthew Khouzam (maintainer of Trace Compass,
which cares about this tracepoint), and we have not identified any
significant impact of the move on the Trace Compass model of the
scheduler, other than a slight change in the event's timing.

I've also quickly checked lttng-analyses and have not found
any code that cares about its specific placement.

So I would say go ahead and move it earlier in do_exit(); it's
fine by me.

If you are interested in a bit of archeology, "sched_process_free"
originated from my ltt-experimental 0.1.99.13 kernel patch against
2.6.12-rc4-mm2 back in September 2005 (that's 19 years ago). It was
a precursor to the LTTng 0.x kernel patchset.

https://lttng.org/files/ltt-experimental/patch-2.6.12-rc4-mm2-ltt-exp-0.1.99.13.gz

Index: kernel/exit.c
===================================================================
--- a/kernel/exit.c	(.../trunk/kernel/linux-2.6.12-rc4-mm2)	(revision 41)
+++ b/kernel/exit.c	(.../branches/mathieu/linux-2.6.12-rc4-mm2)	(revision 41)
@@ -4,6 +4,7 @@
   *  Copyright (C) 1991, 1992  Linus Torvalds
   */
  
+#include <linux/ltt/ltt-facility-process.h>
  #include <linux/config.h>
  #include <linux/mm.h>
  #include <linux/slab.h>
@@ -55,6 +56,7 @@ static void __unhash_process(struct task
  	}
  
  	REMOVE_LINKS(p);
+  trace_process_free(p->pid);
  }
  
  void release_task(struct task_struct * p)
@@ -832,6 +834,8 @@ fastcall NORET_TYPE void do_exit(long co
  	}
  	exit_mm(tsk);
  
+	trace_process_exit(tsk->pid);
+
  	exit_sem(tsk);
  	__exit_files(tsk);
  	__exit_fs(tsk);

This was a significant improvement over the prior LTT, which only
had the equivalent of "sched_process_exit". That caused issues
with the Linux scheduler model in LTTV due to zombie processes.
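
To illustrate why an analysis wants both events, here is a minimal
sketch of a per-task lifecycle model. It is not taken from LTTV,
Trace Compass or lttng-analyses; the names task_state, task_event,
task_model_step() and task_model_observable() are hypothetical and
only loosely mirror the tracepoint names discussed in this thread.

```c
#include <stdbool.h>

/* Hypothetical lifecycle states of one task as seen by a trace analysis. */
enum task_state { TASK_ALIVE, TASK_ZOMBIE, TASK_FREED };

/* Hypothetical event kinds, named after the tracepoints discussed here. */
enum task_event { EV_SCHED_PROCESS_EXIT, EV_SCHED_PROCESS_FREE };

/* Advance the modeled state of one task on one event. */
enum task_state task_model_step(enum task_state s, enum task_event ev)
{
	switch (ev) {
	case EV_SCHED_PROCESS_EXIT:
		/* The task stops running but is still a zombie. */
		return s == TASK_ALIVE ? TASK_ZOMBIE : s;
	case EV_SCHED_PROCESS_FREE:
		/* Only now can the model forget about the task. */
		return TASK_FREED;
	}
	return s;
}

/* True while the task may still show up in sched and signal events. */
bool task_model_observable(enum task_state s)
{
	return s != TASK_FREED;
}
```

With only the exit event, a model like this would have no way to tell
when a zombie can safely be dropped: it would either forget the task
too early or keep it forever.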

Here is where it appeared in LTT back in 1999:

http://www.opersys.com/ftp/pub/LTT/TracePackage-0.9.0.tgz

patch-ltt-2.2.13-991118

diff -urN linux/kernel/exit.c linux-2.2.13/kernel/exit.c
--- linux/kernel/exit.c	Tue Oct 19 20:14:02 1999
+++ linux-2.2.13/kernel/exit.c	Sun Nov  7 23:49:17 1999
@@ -14,6 +14,8 @@
  #include <linux/acct.h>
  #endif
  
+#include <linux/trace.h>
+
  #include <asm/uaccess.h>
  #include <asm/pgtable.h>
  #include <asm/mmu_context.h>
@@ -386,6 +388,8 @@
  	del_timer(&tsk->real_timer);
  	end_bh_atomic();
  
+	TRACE_PROCESS(TRACE_EV_PROCESS_EXIT, 0, 0);
+
  	lock_kernel();
  fake_volatile:
  #ifdef CONFIG_BSD_PROCESS_ACCT

And it was moved to its current location (after exit_mm()) a bit
later (2001):

http://www.opersys.com/ftp/pub/LTT/TraceToolkit-0.9.5pre2.tgz

Patches/patch-ltt-linux-2.4.5-vanilla-010909-1.10

diff -urN linux/kernel/exit.c /ext2/home/karym/kernel/linux-2.4.5/kernel/exit.c
--- linux/kernel/exit.c	Fri May  4 17:44:06 2001
+++ /ext2/home/karym/kernel/linux-2.4.5/kernel/exit.c	Wed Jun 20 12:39:24 2001
@@ -14,6 +14,8 @@
  #include <linux/acct.h>
  #endif
  
+#include <linux/trace.h>
+
  #include <asm/uaccess.h>
  #include <asm/pgtable.h>
  #include <asm/mmu_context.h>
@@ -439,6 +441,8 @@
  #endif
  	__exit_mm(tsk);
  
+	TRACE_PROCESS(TRACE_EV_PROCESS_EXIT, 0, 0);
+
  	lock_kernel();
  	sem_exit();
  	__exit_files(tsk);

So this sched_process_exit placement was actually decided
by Karim Yaghmour back in the LTT days (2001). I don't think
he will mind us moving it around some 23 years later. ;)

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

