linux-kernel - Re: [PATCH v4 01/34] lib/printbuf: New data structure for printing strings

Message-ID: <20220620153043.vgtfrltebiyprufz@moria.home.lan>
Date:   Mon, 20 Jun 2022 11:30:43 -0400
From:   Kent Overstreet <kent.overstreet@...il.com>
To:     David Laight <David.Laight@...LAB.COM>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "pmladek@...e.com" <pmladek@...e.com>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        "enozhatsky@...omium.org" <enozhatsky@...omium.org>,
        "linux@...musvillemoes.dk" <linux@...musvillemoes.dk>,
        "willy@...radead.org" <willy@...radead.org>
Subject: Re: [PATCH v4 01/34] lib/printbuf: New data structure for printing
 strings

On Mon, Jun 20, 2022 at 04:44:10AM +0000, David Laight wrote:
> From: Kent Overstreet
> > Sent: 20 June 2022 01:42
> > 
> > This adds printbufs: a printbuf points to a char * buffer and knows the
> > size of the output buffer as well as the current output position.
> > 
> > Future patches will be adding more features to printbuf, but initially
> > printbufs are targeted at refactoring and improving our existing code in
> > lib/vsprintf.c - so this initial printbuf patch has the features
> > required for that.
> > 
> > Signed-off-by: Kent Overstreet <kent.overstreet@...il.com>
> > Reviewed-by: Matthew Wilcox (Oracle) <willy@...radead.org>
> > ---
> >  include/linux/printbuf.h | 122 +++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 122 insertions(+)
> >  create mode 100644 include/linux/printbuf.h
> > 
> > diff --git a/include/linux/printbuf.h b/include/linux/printbuf.h
> > new file mode 100644
> > index 0000000000..8186c447ca
> > --- /dev/null
> > +++ b/include/linux/printbuf.h
> > @@ -0,0 +1,122 @@
> > +/* SPDX-License-Identifier: LGPL-2.1+ */
> > +/* Copyright (C) 2022 Kent Overstreet */
> > +
> > +#ifndef _LINUX_PRINTBUF_H
> > +#define _LINUX_PRINTBUF_H
> > +
> > +#include <linux/kernel.h>
> > +#include <linux/string.h>
> > +
> > +/*
> > + * Printbufs: String buffer for outputting (printing) to, for vsnprintf
> > + */
> > +
> > +struct printbuf {
> > +	char			*buf;
> > +	unsigned		size;
> > +	unsigned		pos;
> 
> No naked unsigneds.

This is the way I've _always_ written kernel code - single word type names.

> 
> > +};
> > +
> > +/*
> > + * Returns size remaining of output buffer:
> > + */
> > +static inline unsigned printbuf_remaining_size(struct printbuf *out)
> > +{
> > +	return out->pos < out->size ? out->size - out->pos : 0;
> > +}
> > +
> > +/*
> > + * Returns number of characters we can print to the output buffer - i.e.
> > + * excluding the terminating nul:
> > + */
> > +static inline unsigned printbuf_remaining(struct printbuf *out)
> > +{
> > +	return out->pos < out->size ? out->size - out->pos - 1 : 0;
> > +}
> 
> Those two are so similar that mistakes will be made.

If you've got ideas for better names I'd be happy to hear them - we discussed
this and this was what we came up with.
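
To make the distinction between the two helpers concrete, here is a minimal
userspace sketch. The struct and the two helpers are copied from the quoted
patch; remaining_selfcheck() and the 8-byte buffer are hypothetical and exist
only to show the values each helper returns.

```c
#include <assert.h>

/* Userspace copy of the struct from the quoted patch, for illustration. */
struct printbuf {
	char		*buf;
	unsigned	size;	/* capacity of buf, including the nul slot */
	unsigned	pos;	/* characters requested so far; may exceed size */
};

/* Bytes left in the buffer, including the slot reserved for the nul. */
static inline unsigned printbuf_remaining_size(struct printbuf *out)
{
	return out->pos < out->size ? out->size - out->pos : 0;
}

/* Characters that can still be printed, excluding the terminating nul. */
static inline unsigned printbuf_remaining(struct printbuf *out)
{
	return out->pos < out->size ? out->size - out->pos - 1 : 0;
}

/*
 * Hypothetical self-check (not in the patch): the two results differ by
 * one while pos < size, and both are zero once the buffer has overflowed.
 */
void remaining_selfcheck(void)
{
	char storage[8];
	struct printbuf out = { .buf = storage, .size = sizeof(storage), .pos = 3 };

	assert(printbuf_remaining_size(&out) == 5);
	assert(printbuf_remaining(&out) == 4);

	out.pos = 7;	/* only the nul slot is left */
	assert(printbuf_remaining_size(&out) == 1);
	assert(printbuf_remaining(&out) == 0);

	out.pos = 12;	/* overflowed */
	assert(printbuf_remaining_size(&out) == 0);
	assert(printbuf_remaining(&out) == 0);
}
```

A caller that copies raw bytes and terminates the buffer itself would want
the first helper; a caller that prints characters and relies on the nul
being kept in place would want the second.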

> You can also just return negatives when the buffer has overflowed
> and get the callers to test < or <= as required.

Yeesh, no.

> I also wonder whether it is necessary to count the total length
> when the buffer isn't long enough?
> Unless there is a real pressing need for it I'd not bother.
> Setting pos == size (after writing the '\0') allows
> overflow to be detected without most of the dangers.

Because that's what snprintf() needs.
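
As a reminder of the semantics in question: snprintf() returns the length the
output would have had with an unlimited buffer, so pos has to keep counting
past size. The userspace sketch below assumes a condensed version of the
quoted prt_str() logic; sketch_prt_str(), sketch_snprintf_str() and
snprintf_selfcheck() are hypothetical names, and the comparison is against
the C library's snprintf(), not the kernel's.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct printbuf {
	char		*buf;
	unsigned	size;
	unsigned	pos;
};

/*
 * Condensed form of prt_str() from the quoted patch: copy what fits,
 * advance pos by the full length, then nul terminate.
 */
static inline void sketch_prt_str(struct printbuf *out, const char *str)
{
	unsigned n = strlen(str);
	unsigned room = out->pos < out->size ? out->size - out->pos - 1 : 0;
	unsigned can_print = n < room ? n : room;

	if (can_print)
		memcpy(out->buf + out->pos, str, can_print);
	out->pos += n;	/* keeps counting even when nothing more fits */

	if (out->pos < out->size)
		out->buf[out->pos] = 0;
	else if (out->size)
		out->buf[out->size - 1] = 0;
}

/* Hypothetical wrapper: the return value is simply the final pos. */
int sketch_snprintf_str(char *buf, size_t size, const char *str)
{
	struct printbuf out = { .buf = buf, .size = (unsigned)size, .pos = 0 };

	sketch_prt_str(&out, str);
	return (int)out.pos;
}

/* Hypothetical self-check against the C library's snprintf(). */
void snprintf_selfcheck(const char *str)
{
	char a[8], b[8];
	int ra = sketch_snprintf_str(a, sizeof(a), str);
	int rb = snprintf(b, sizeof(b), "%s", str);

	assert(ra == rb);		/* same "would have written" length */
	assert(strcmp(a, b) == 0);	/* same truncated contents */
}
```

If pos were clamped to size on overflow, the wrapper could report that
truncation happened but not how large a buffer the caller would need.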

> > +
> > +static inline unsigned printbuf_written(struct printbuf *out)
> > +{
> > +	return min(out->pos, out->size);
> 
> That excludes the '\0' for short buffers but includes
> it for overlong ones.

It actually doesn't.

> > +}
> > +
> > +/*
> > + * Returns true if output was truncated:
> > + */
> > +static inline bool printbuf_overflowed(struct printbuf *out)
> > +{
> > +	return out->pos >= out->size;
> > +}
> > +
> > +static inline void printbuf_nul_terminate(struct printbuf *out)
> > +{
> > +	if (out->pos < out->size)
> > +		out->buf[out->pos] = 0;
> > +	else if (out->size)
> > +		out->buf[out->size - 1] = 0;
> > +}
> > +
> > +static inline void __prt_char(struct printbuf *out, char c)
> > +{
> > +	if (printbuf_remaining(out))
> > +		out->buf[out->pos] = c;
> 
> At this point it is (should be) always safe to add the '\0'.
> Doing so would save the extra conditionals later on.

True, but at the cost of making the code less straightforward. I may have a look
at it.
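
For reference, one way the suggestion could look, as a userspace sketch
rather than a proposed patch. prt_char() follows the quoted code;
prt_char_folded() and folded_selfcheck() are hypothetical. The folded
variant only works under the assumption that the buffer was already nul
terminated before the first call, which is the kind of hidden invariant that
makes it less straightforward.

```c
#include <assert.h>
#include <string.h>

struct printbuf {
	char		*buf;
	unsigned	size;
	unsigned	pos;
};

static inline unsigned printbuf_remaining(struct printbuf *out)
{
	return out->pos < out->size ? out->size - out->pos - 1 : 0;
}

/* As in the quoted patch: store the character, then terminate separately. */
static inline void prt_char(struct printbuf *out, char c)
{
	if (printbuf_remaining(out))
		out->buf[out->pos] = c;
	out->pos++;

	if (out->pos < out->size)
		out->buf[out->pos] = 0;
	else if (out->size)
		out->buf[out->size - 1] = 0;
}

/*
 * Hypothetical variant: write the nul together with the character.
 * Assumes buf was nul terminated before the first call, because nothing
 * at all is written once the buffer is full.
 */
static inline void prt_char_folded(struct printbuf *out, char c)
{
	if (printbuf_remaining(out)) {
		out->buf[out->pos] = c;
		out->buf[out->pos + 1] = 0;	/* pos + 1 <= size - 1 here */
	}
	out->pos++;
}

/* Hypothetical self-check: both variants agree after every character. */
void folded_selfcheck(const char *str)
{
	char a[5] = "", b[5] = "";	/* both start nul terminated */
	struct printbuf pa = { .buf = a, .size = sizeof(a), .pos = 0 };
	struct printbuf pb = { .buf = b, .size = sizeof(b), .pos = 0 };

	for (; *str; str++) {
		prt_char(&pa, *str);
		prt_char_folded(&pb, *str);
		assert(pa.pos == pb.pos);
		assert(strcmp(a, b) == 0);
	}
}
```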

> 
> > +	out->pos++;
> > +}
> > +
> > +static inline void prt_char(struct printbuf *out, char c)
> > +{
> > +	__prt_char(out, c);
> > +	printbuf_nul_terminate(out);
> > +}
> > +
> > +static inline void __prt_chars(struct printbuf *out, char c, unsigned n)
> > +{
> > +	unsigned i, can_print = min(n, printbuf_remaining(out));
> > +
> > +	for (i = 0; i < can_print; i++)
> > +		out->buf[out->pos++] = c;
> > +	out->pos += n - can_print;
> > +}
> > +
> > +static inline void prt_chars(struct printbuf *out, char c, unsigned n)
> > +{
> > +	__prt_chars(out, c, n);
> > +	printbuf_nul_terminate(out);
> > +}
> > +
> > +static inline void prt_bytes(struct printbuf *out, const void *b, unsigned n)
> > +{
> > +	unsigned i, can_print = min(n, printbuf_remaining(out));
> > +
> > +	for (i = 0; i < can_print; i++)
> > +		out->buf[out->pos++] = ((char *) b)[i];
> > +	out->pos += n - can_print;
> > +
> > +	printbuf_nul_terminate(out);
> 
> jeepers - that can be written so much better.
> Something like:
> 	unsigned int i, pos = out->pos;
> 	int space = out->size - pos - 1;
> 	char *tgt = out->buf + pos;
> 	const char *src = b;
> 	out->pos = pos + n;
> 
> 	if (space <= 0)
> 		return;
> 	if (n > space)
> 		n = space;
> 
> 	for (i = 0; i < n; i++)
> 		tgt[i] = src[i];
> 	tgt[n] = 0;
> 

I find your version considerably harder to read, and I've stared at enough
assembly that I trust the compiler to generate pretty equivalent code.

> > +}
> > +
> > +static inline void prt_str(struct printbuf *out, const char *str)
> > +{
> > +	prt_bytes(out, str, strlen(str));
> 
> Do you really need to call strlen() and then process
> the buffer byte by byte?

Versus introducing a branch to check for nul into the inner loop of prt_bytes()?
You're not serious, are you?
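
To spell out the trade-off, here is a userspace sketch of both shapes.
prt_str_two_pass() mirrors the quoted prt_str()/prt_bytes() pair;
prt_str_one_pass() is a hypothetical single-pass alternative, and
two_pass_selfcheck() exists only to show the results match. Note that the
single-pass loop still has to walk the whole string, since pos must end up
at the full length.

```c
#include <assert.h>
#include <string.h>

struct printbuf {
	char		*buf;
	unsigned	size;
	unsigned	pos;
};

static inline unsigned printbuf_remaining(struct printbuf *out)
{
	return out->pos < out->size ? out->size - out->pos - 1 : 0;
}

static inline void printbuf_nul_terminate(struct printbuf *out)
{
	if (out->pos < out->size)
		out->buf[out->pos] = 0;
	else if (out->size)
		out->buf[out->size - 1] = 0;
}

/*
 * As in the quoted patch: strlen() once, then a copy loop whose only
 * per-byte test is the loop counter.
 */
static inline void prt_str_two_pass(struct printbuf *out, const char *str)
{
	unsigned n = strlen(str);
	unsigned room = printbuf_remaining(out);
	unsigned i, can_print = n < room ? n : room;

	for (i = 0; i < can_print; i++)
		out->buf[out->pos++] = str[i];
	out->pos += n - can_print;

	printbuf_nul_terminate(out);
}

/*
 * Hypothetical single-pass variant: every iteration tests both for the
 * nul and for remaining space.
 */
static inline void prt_str_one_pass(struct printbuf *out, const char *str)
{
	for (; *str; str++) {
		if (printbuf_remaining(out))
			out->buf[out->pos] = *str;
		out->pos++;
	}

	printbuf_nul_terminate(out);
}

/* Hypothetical self-check: same contents and same pos either way. */
void two_pass_selfcheck(const char *str)
{
	char a[6], b[6];
	struct printbuf pa = { .buf = a, .size = sizeof(a), .pos = 0 };
	struct printbuf pb = { .buf = b, .size = sizeof(b), .pos = 0 };

	prt_str_two_pass(&pa, str);
	prt_str_one_pass(&pb, str);
	assert(pa.pos == pb.pos);
	assert(strcmp(a, b) == 0);
}
```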