linux-kernel - Re: [PATCH next 1/1] lib: Optimise hex_dump_to

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20250219131347.3ec72325@pumpkin>
Date: Wed, 19 Feb 2025 13:13:47 +0000
From: David Laight <david.laight.linux@...il.com>
To: Nick Child <nnac123@...ux.ibm.com>
Cc: linux-kernel@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
 Andy Shevchenko <andriy.shevchenko@...ux.intel.com>, Linus Torvalds
 <torvalds@...ux-foundation.org>, Randy Dunlap <rdunlap@...radead.org>
Subject: Re: [PATCH next 1/1] lib: Optimise hex_dump_to_buffer()

On Tue, 18 Feb 2025 16:52:38 -0600
Nick Child <nnac123@...ux.ibm.com> wrote:

> On Sun, Feb 16, 2025 at 08:19:01PM +0000, David Laight wrote:
> > Fastpath the normal case of single byte output that fits in the buffer.
> > Output byte groups (byteswapped on little-endian) without calling snprintf().
> > Remove the restriction that rowsize must be 16 or 32.
> > Remove the restriction that groupsize must be 8 or less.
> > If groupsize isn't a power of 2 or doesn't divide into both len and
> >   rowsize it is set to 1 (otherwise byteswapping is hard).
> > Change the types of the rowsize and groupsize parameters to be unsigned types.
> >  
> 
> Thank you!
> 
> > Tested in a userspace harness, code size (x86-64) halved to 723 bytes.
> > 
> > Signed-off-by: David Laight <david.laight.linux@...il.com>
> > ---
> >  include/linux/printk.h |   6 +-
> >  lib/hexdump.c          | 165 ++++++++++++++++++++---------------------
> >  2 files changed, 85 insertions(+), 86 deletions(-)
> > 
> > diff --git a/include/linux/printk.h b/include/linux/printk.h
> > index 4217a9f412b2..49e67f63277e 100644
> > --- a/include/linux/printk.h
> > +++ b/include/linux/printk.h
> > @@ -752,9 +752,9 @@ enum {
> >  	DUMP_PREFIX_ADDRESS,
> >  	DUMP_PREFIX_OFFSET
> >  };  
> ...
> > - * hex_dump_to_buffer() works on one "line" of output at a time, i.e.,
> > + * If @groupsize isn't a power of 2 that divides into both @len and @rowsize
> > + * the it is set to 1.  
> 
> s/the/then/
> 
> > + *
> > + * hex_dump_to_buffer() works on one "line" of output at a time, e.g.,
> >   * 16 or 32 bytes of input data converted to hex + ASCII output.  
> ...
> > -		linebuf[lx++] = ' ';
> > +	if (!ascii) {
> > +		*dst = 0;
> > +		return out_len;
> >  	}
> > + 
> > +	pad_len += 2;  
> 
> So at a minimum there is 2 spaces before the ascii translation?

That is the way it always was.
Does make sense that way.

> when people allocate linebuf, what should they use to calculate the len?

Enough :-)
Unchanged from before, 'rowsize * 4 + 2' is just enough.

> 
> Also side nit, this existed before this patch, the endian switch may
> occur on the hex dump but it doesn't on the ascii conversion:
> [   20.172006][  T150] 706f6e6d6c6b6a696867666564636261  abcdefghijklmnop

Correct, that is what you want.
Consider { u32 x; char y[4]; } - you don't want the ascii byte reversed.

Indeed, if the data might not be aligned it is easier to process the
hex bytes in address order than a little-endian value that is split
between two 'words'. 

> 
> 
> > +	out_len += pad_len + len;
> > +	if (dst + pad_len >= dst_end)
> > +		pad_len = dst_end - dst - 1;  
> 
> Why not jump to hex_truncate here? This feels like an error case and
> if I am understanding correctly, this will pad the rest of the buffer
> leaving no room for ascii.

I missed that, can't remember what the old code did.

> 
> > +	while (pad_len--)
> > +		*dst++ = ' ';
> > +
> > +	if (dst + len >= dst_end)  
> ...
> > -- 
> > 2.39.5  
> 
> I will like to also support a wrapper to a bitmap argument as Andy
> mentioned. Mostly for selfish reasons though: I would like an argument
> to be added to skip endian conversion, and just observe the bytes as they
> appear in memory (without having to use groupsize 1).

More useful would be an option to keep address order with a space
between the bytes, but add an extra space every 'n' bytes.
So you get: "00 11 22 33  44 55 66 77  88..."

But that is an enhancement rather than a rewrite.
Although it is all easier to do with my new 'slow path' loop.

> I had fun tracking some of these bitwise operations on power of 2
> integers, I think I must've missed that day in school. Cool stuff :)

It came the day after the rule for inverting boolean expressions.

	David