lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Mon, 12 Feb 2024 12:24:23 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Linux trace kernel
 <linux-trace-kernel@...r.kernel.org>, Mathieu Desnoyers
 <mathieu.desnoyers@...icios.com>, Vincent Donnefort
 <vdonnefort@...gle.com>, Sven Schnelle <svens@...ux.ibm.com>, Mete Durlu
 <meted@...ux.ibm.com>
Subject: Re: [PATCH v2] tracing: Fix wasted memory in saved_cmdlines logic

On Tue, 13 Feb 2024 00:40:38 +0900
Masami Hiramatsu (Google) <mhiramat@...nel.org> wrote:

> > Now, instead of saving only 128 comms by default, by using this wasted
> > space at the end of the structure it can save over 8000 comms and even
> > saves space by removing the need for allocating the other array.  
> 
> This looks good to me. So this will allocate the saved_cmdlines in page-size
> array instead of the power of two.

Once the size is greater than page-size, it still is a power of two ;-)

But I think you meant that it will be a reminder of extra data, and not a
power of two. Looking at the code, it didn't need to be a power of two as
the saved names were just in a ring buffer anyway. The recording of a PID
must be a power of two, as that is masked.

Basically we have three arrays. One for the PIDS that hold the index into the
COMM array. The COMM array really has no limitation. The PID array is
PID_MAX_DEFAULT. But PIDs can be more than PID_MAX_DEFAULT as that's just
the "default" setting. User space can increase it. I need to add a comment
to how this works, as there are three arrays:

 unsigned map_pid_to_cmdline[PID_MAX_DEFAULT + 1];

 /* The next two are allocated based on cmdline_num size */
 unsigned map_cmdline_to_pid[cmdline_num];
 char saved_cmdlines[cmdline_num][TASK_COMM_LEN];

map_pid_to_cmdline[task->pid & (PID_MAX_DEFAULT - 1)] = index into the other two arrays;

 [ Although I'm not sure I need that "+1" in the array. This code is ancient
   and I need to re-examine it. ]

To get back to the PID, you need the map_cmdline_to_pid[idx] which contains
the full PID. Really, that array is only needed if pid max is greater than
PID_MAX_DEFAULT (which I find is the case for most distros now).

To assign a new comm, we basically do:

	idx = map_pid_to_cmdline[pid & (PID_MAX_DEFAULT - 1)];
	if (idx == -1) {
		idx = // get next location in COMM ring buffer
		map_pid_to_cmdline[pid & (PID_MAX_DEFAULT - 1)] = idx;
	}
	map_cmdline_to_pid[idx] = pid;
	strcpy(saved_cmdlines[idx], task->comm);

And Mathieu pinged me about the non-commented use of that NO_CMDLINE_MAP
which is just -1. I have it as INT_MAX, but really maybe it should be -1U,
and add a comment around the memset().

> 
> Acked-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>

Thanks, but Linus already pulled this. But I have some patches to clean
this up a bit more.

-- Steve

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ