lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=whe12v6dDjhYcOcfHEorwZ90+cbzD1sKLRzNyB8CWsaPw@mail.gmail.com>
Date: Mon, 6 Jan 2025 19:05:22 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Andrew Morton <akpm@...ux-foundation.org>, 
	Petr Mladek <pmladek@...e.com>, Andy Shevchenko <andriy.shevchenko@...ux.intel.com>, 
	Rasmus Villemoes <linux@...musvillemoes.dk>, Sergey Senozhatsky <senozhatsky@...omium.org>, 
	Kees Cook <keescook@...omium.org>
Subject: Re: [RFC][PATCH] printf: Harden accessing pointer dereference in vsprintf()

On Mon, 6 Jan 2025 at 18:28, Steven Rostedt <rostedt@...dmis.org> wrote:
>
> vsprintf() is used in critical paths, but I wonder how much these two
> versions actually make a difference in performance?

We've had people writing patches to avoid vsnprintf because it's *so*
critical for /proc etc.

And we had actual bugs as a result, eg things like

   f9ed1f7c2e26 ("genirq/proc: Use seq_put_decimal_ull_width() for
decimal values")

then causing

   9d9f204bdf72 ("genirq/proc: Add missing space separator back")

so bad performance results in actual BUGGY CODE.

It's one of the reasons I was looking at vsnprintf even before the
whole tracing discussion: some of the vsnprintf overhead was actually
disgusting. And almost all of it was just stupid code, allegedly for
simplicity. Causing things like pointless stack frame setup, bad
calling conventions etc.

So my vsnprintf branch was literally written with me looking at the
resulting assembler too, because when core code works badly, the end
result is badness.

In random non-core code, I want people to write it clearly and
maintainably. I very much do not want random drivers doing
micro-optimizations etc.

End result: I don't want people to say "oh, vsnprintf does badly at
handling '%s' so I'm going to use some special string-printing thing",
the way we had that seq_put_decimal_ull_width() chaos.

And yes, I saw that very error_string() function inlined several times
when I was looking at code generation. gcc and clang did things
differently, I forget which way it went, but there was a reason I sent
out the test patch that I had *checked* made code generation almost
identical. The "fetch a byte" was still fetch a simple byte fetch.

Iirc the main difference was that yes, we had to do that annoying
"pagefault_disable/enable()" and that generated something like five
instructions just to get 'current' and increment a variable.. But it
didn't cause extra stack frames or function calls. It's been about a
month, I forget the exact details.

(And yes, I even considered turning pagefault_disable/enable there
into a preempt_count() thing instead. We do better at those percpu
variables than at per-thread ones. We used to do it that way, but we
made pagefault_disable its own counter back ten years ago exactly
*because* people wanted to have a separate counter, and sadly that
does mean that it's now a few instructions).

               Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ