linux-kernel - Re: [GIT PULL] tracing: Prevent trace_marker being bigger than unsigned short

Message-ID: <20240302145958.05aabdd2@rorschach.local.home>
Date: Sat, 2 Mar 2024 14:59:58 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Masami Hiramatsu
 <mhiramat@...nel.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Subject: Re: [GIT PULL] tracing: Prevent trace_marker being bigger than
 unsigned short

On Sat, 2 Mar 2024 09:24:37 -0800
Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Sat, 2 Mar 2024 at 08:10, Steven Rostedt <rostedt@...dmis.org> wrote:
> >
> > - The change to allow trace_marker writes to be as big as the trace_seq can
> >   hold, and also the change that increases the size of the trace_seq to two
> >   pages, caused PowerPC kselftest trace_marker test to fail. The trace_marker
> >   kselftest writes up to subbuffer size which is determined by PAGE_SIZE.
> >   On PowerPC, the PAGE_SIZE can be 64K, which means the selftest will write
> >   a string that is around 64K in size. The output of the trace_marker is
> >   performed with a vsnprintf("%.*s", size, string), but this write would make
> >   the size greater than 32K, which is the max precision of "%.*s", and that
> >   causes a kernel warning. The fix is simply to keep the write of trace_marker
> >   less than or equal to max signed short.  
> 
> Please don't just add random limits that are based on other random limits.

It's not random limits, it's resource limits.

> 
> That printk warning is for "you did something obviously crazy".
> 
> That does NOT MEAN that you now should limit your strings to something
> JUST BORDERLINE CRAZY.

I don't have control over the strings. Anyone can do in user space:

	fd = open("/sys/kernel/tracing/trace_marker", O_WRONLY);
	r = write(fd, huge_string, 10000000);

And this code only gives you what is returned in 'r'. It doesn't error
out. It just limits what the max write size is. I just default it to
what the available resources are.
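
For illustration, a minimal user-space sketch of how a caller can cope with
such short writes. The helper name write_all() is hypothetical and not part
of any kernel or libc API; it assumes only the standard POSIX write()
semantics described above, where the return value says how many bytes were
accepted.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the kernel tree): keep calling write()
 * until all 'len' bytes are accepted or a real error occurs.
 * Returns the number of successful write() calls, or -1 on error.
 */
static int write_all(int fd, const char *buf, size_t len)
{
	int calls = 0;

	while (len > 0) {
		ssize_t r = write(fd, buf, len);

		if (r < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (r == 0)
			return -1;	/* no progress; avoid looping forever */
		buf += r;		/* skip the part that was accepted */
		len -= (size_t)r;
		calls++;
	}
	return calls;
}
```

With a loop like this, a write-size limit in trace_marker would only change
how many write() calls it takes to push the whole string, not whether the
data gets written.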

> 
> See?
> 
> There is not a way in hell that printing a 32kB string in tracing is
> valid. EVER.

Well, the limit is really PAGE_SIZE, which on most architectures is
4096, but on PowerPC, PAGE_SIZE can be 64K. And the test in
tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc finds the
PAGE_SIZE and writes a string of that length to check that it doesn't
crash the kernel. And all the resources can hold a 60K write. The problem
that this patch addresses is that the vsnprintf() used to move the data
into seq_file has a precision variable that checks for overflow, and it
has a max of 32K.

Yes, in most cases, 4K is the limit, which is why this doesn't trigger
on any architecture that has 4K page sizes.
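
The arithmetic behind that can be sketched as follows. This is a user-space
model, assuming the precision limit is the max signed short (SHRT_MAX, 32767)
as stated above; the function name write_exceeds_precision() is made up for
this example.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Assumed model of the limit: a "%.*s" precision larger than SHRT_MAX
 * (32767) is what triggers the warning. Hypothetical helper name.
 */
static bool write_exceeds_precision(long write_size)
{
	return write_size > SHRT_MAX;
}
```

A 4096-byte write stays under the limit, while a 65536-byte write (a 64K
page) goes over it.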

> 
> So stop it. Stop making limits be some random implementation detail.
> Make limits *sane*.

The "implementation detail" is PAGE_SIZE. Similar to writing large
amounts of data to pipes and sockets. It may not write all data, but
just a smaller amount. The write doesn't error, it just says "this is
all I can write that you passed to me".

> 
> Make a *sane* limit for tracing. Not a "avoid being called crazy" limit.

What arbitrary limit should I use? It just changes how the string will be
broken up, as "echo" or "cat" into trace_marker will continue writing the rest
of the string. It doesn't cause errors in the write. It simply breaks
the string up into smaller blocks.

> 
> Honestly, I suspect that a sane limit for tracing strings is likely on
> the order of tens or maybe hundreds of bytes. Not some kind of "fits
> in a short" that is just printk saying "I refuse to waste memory on
> the stack".

The error isn't printk, it's vsnprintf() that is writing to a seq_file
to user space. There's no stack or printk involved here.

	trace_seq_printf(s, ": %.*s", max, field->buf);

Where 's' is a trace_seq with a PAGE_SIZE buffer, that later gets passed
to seq_file.
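
A rough user-space sketch of the idea behind the fix, clamping the length
before it is used as a "%.*s" precision. This is not the kernel patch itself:
marker_precision() and format_marker() are hypothetical names, and plain
snprintf() stands in for trace_seq_printf().

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: largest precision assumed safe to pass to "%.*s". */
static int marker_precision(size_t len)
{
	return len > (size_t)SHRT_MAX ? SHRT_MAX : (int)len;
}

/*
 * Format at most marker_precision(len) bytes of 'buf' into 'out', using
 * the same ": %.*s" format as the line above. Returns snprintf()'s result.
 */
static int format_marker(char *out, size_t out_size,
			 const char *buf, size_t len)
{
	return snprintf(out, out_size, ": %.*s", marker_precision(len), buf);
}
```

Lengths up to 32767 pass through unchanged; anything larger is cut to 32767
so the precision never exceeds a signed short.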

> 
> Side note: for similar reasons the field-width is a 24-bit integer.
> And no, if you think that passing a 16MB field width is sane, you need
> to rethink your life. Again, that's a small implementation detail, not
> a "let's explore how stupid we can be".

I really don't understand what you mean by the above. This code is what
user space can write into the tracing ring buffer. If the ring buffer
sub-buffer is 64K, and user space does a 32K write into it, why prevent
that? The only limit here is that vsnprintf() has a max of signed
short for its precision, which I used to prevent overflow.
That vsnprintf(), which writes into a memory buffer that will be sent out
via seq_file, doesn't like strings greater than 32K, even though
what it is writing into is big enough to hold them.

-- Steve