Message-ID: <20251120182707.42c225f5@pumpkin>
Date: Thu, 20 Nov 2025 18:27:07 +0000
From: David Laight <david.laight.linux@...il.com>
To: Kuan-Wei Chiu <visitorckw@...il.com>
Cc: Theodore Tso <tytso@....edu>, Guan-Chun Wu <409411716@....tku.edu.tw>,
 Andreas Dilger <adilger.kernel@...ger.ca>, linux-ext4@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] ext4: improve str2hashbuf by processing 4-byte chunks

On Fri, 21 Nov 2025 00:58:23 +0800
Kuan-Wei Chiu <visitorckw@...il.com> wrote:

> Hi Ted,
> 
> On Thu, Nov 20, 2025 at 10:58:16AM -0500, Theodore Tso wrote:
> > On Sun, Nov 16, 2025 at 07:35:13PM +0000, David Laight wrote:  
> > > 
> > > The (int) casts are unnecessary (throughout), 'char' is always promoted to
> > > 'signed int' before any arithmetic.  
> > 
> > nit: in this case the casts aren't necessary, but your comment is not
> > correct in general, so I just wanted to make sure it's corrected in
> > case someone later looks at the mail archive.
> > 
> > "char" is not always signed.  It can be signed or unsigned; the C
> > specification allows either.  In this particular case, scp is a
> > "signed char", not "char".

It doesn't matter - as pointed out below.
Both 'signed char' and 'unsigned char' are promoted to 'signed int'
before ANY arithmetic operation.
The one exception is when sizeof(char) == sizeof(int), in which case
'unsigned char' is promoted to 'unsigned int' - which is technically
valid and was true for the C compiler for an old DSP (everything was
32 bits).
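
A rough sketch of how to check this with C11 _Generic, assuming the
usual case where int is wider than char. IS_INT and the three helper
names are made up for this example; they are not from the patch.

```c
#include <assert.h>	/* static_assert (C11) */

/* Assumption: the common case, not the old-DSP one. */
static_assert(sizeof(int) > sizeof(char), "int must be wider than char");

/* Evaluates to 1 if the expression has type int, 0 otherwise. */
#define IS_INT(x) _Generic((x), int: 1, default: 0)

/* Hypothetical helper: is 'signed char + signed char' an int? */
static int sc_sum_is_int(void)
{
	signed char a = 1, b = 2;

	return IS_INT(a + b);
}

/* Hypothetical helper: is 'unsigned char + unsigned char' an int? */
static int uc_sum_is_int(void)
{
	unsigned char a = 1, b = 2;

	return IS_INT(a + b);
}

/* Hypothetical helper: unary minus promotes its operand as well. */
static int uc_negate_is_int(void)
{
	unsigned char a = 1;

	return IS_INT(-a);
}
```

All three helpers should return 1 on such a platform, whichever
signedness plain char has.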

This is one difference between K&R C and ANSI C - K&R promoted
'unsigned char' to 'unsigned int'.
So there was always the chance that compiling in ANSI mode would
break working code.
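
To illustrate the kind of code that could change behaviour, here is a
small sketch. The second helper only emulates the unsigned-preserving
rule with an explicit cast, it is not what any particular old compiler
generated, and both function names are invented for the example.

```c
#include <assert.h>	/* static_assert (C11) */
#include <limits.h>	/* UINT_MAX */

/* Assumptions needed for the values below to be representable. */
static_assert(sizeof(int) > sizeof(char), "int must be wider than char");
static_assert(sizeof(long long) > sizeof(unsigned int),
	      "long long must hold UINT_MAX");

/* ANSI (value-preserving): uc is promoted to int, so 0 - 1 is -1. */
static long long ansi_uc_minus_one(unsigned char uc)
{
	return uc - 1;
}

/*
 * Unsigned-preserving rule, emulated by a cast: the subtraction is
 * done in unsigned int, so 0 - 1 wraps to UINT_MAX.
 */
static long long unsigned_preserving_uc_minus_one(unsigned char uc)
{
	return (unsigned int)uc - 1;
}
```

With uc == 0 the first helper gives -1 and the second gives UINT_MAX,
so a test such as 'uc - 1 < 0' would flip between the two rules.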

> > 
> > Secondly, it's not that a promotion happens before "any" arithmetic.
> > If we add two 8-bit values together, promotion doesn't happen.  In
> > this case, we are adding a signed char to an int, so the promotion
> > will happen.
> >   
> I believe David was referring to the C11 spec 6.3.1.1:
> 
> If an int can represent all values of the original type (as restricted
> by the width, for a bit-field), the value is converted to an int;
> otherwise, it is converted to an unsigned int. These are called the
> integer promotions. All other types are unchanged by the integer
> promotions.
> 
> The spec explicitly mentions char + char in 5.1.2.3 example:
> 
> EXAMPLE 2 In executing the fragment
> char c1, c2;
> /* ... */
> c1 = c1 + c2;
> the ‘‘integer promotions’’ require that the abstract machine promote
> the value of each variable to int size and then add the two ints and
> truncate the sum. Provided the addition of two chars can be done
> without overflow, or with overflow wrapping silently to produce the
> correct result, the actual execution need only produce the same result,
> possibly omitting the promotions.

So with:
	char c1, c2;
	int i1, i2, i3;
	...
	i1 = c1 + c2;
	i2 = (int)c1 + (int)c2;
	i3 = (unsigned int)c1 + (unsigned int)c2;
the values of i1, i2 and i3 are all the same (on a 2's complement cpu for i3)
regardless of whether char is signed or unsigned (although the values
themselves do depend on the signedness of char).
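
The fragment above can be wrapped up as a quick check. This is only a
sketch: sums_agree() and sum_as_int() are hypothetical names, and the
i3 case relies on the implementation wrapping when an out-of-range
unsigned value is converted to int (implementation-defined before C23).

```c
#include <assert.h>	/* static_assert (C11) */

/* Assumption: 2's complement (-1 & 3 is 3 only in that representation). */
static_assert((-1 & 3) == 3, "2's complement assumed");

/* Hypothetical helper: 1 if all three ways of adding give the same int. */
static int sums_agree(char c1, char c2)
{
	int i1 = c1 + c2;
	int i2 = (int)c1 + (int)c2;
	/* Unsigned add, then converted back to int (wraps as assumed above). */
	int i3 = (int)((unsigned int)c1 + (unsigned int)c2);

	return i1 == i2 && i2 == i3;
}

/* Hypothetical helper: the sum is computed in int, so it is not truncated. */
static int sum_as_int(char c1, char c2)
{
	return c1 + c2;
}
```

sums_agree() should return 1 for any pair of inputs, and
sum_as_int(100, 100) is 200 rather than an 8-bit value.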

> 
> So IIUC conceptually the promotion happens, even if the compiler
> optimizes it out in the actual execution.

And it is pretty much only x86 and m68k that have instructions for
byte arithmetic.
So for everything else, if you assign the result of an arithmetic
operation to a char/short local variable (which is hopefully in
a register rather than on the stack) the compiler has to add extra
instructions to mask the value back to 8 (or 16) bits and likely
keep sign extending it as well.
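
A minimal sketch of the difference at the source level, assuming 8-bit
chars. It uses unsigned char so the narrowing is fully defined; the
function names are invented for the example and say nothing about what
a given compiler emits.

```c
#include <assert.h>	/* static_assert (C11) */
#include <limits.h>	/* CHAR_BIT */

static_assert(CHAR_BIT == 8, "example assumes 8-bit chars");

/* Result kept in an int: the promoted sum is used as is. */
static int sum_wide(unsigned char a, unsigned char b)
{
	return a + b;
}

/* Result stored in a char-sized local: it must be cut back to 8 bits. */
static int sum_narrow(unsigned char a, unsigned char b)
{
	unsigned char r = a + b;	/* same as (a + b) & 0xff here */

	return r;
}
```

For 200 + 100 the wide version gives 300 and the narrow one 44, which
is the masking the compiler has to implement somehow when the target
has no byte-sized add.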

People also forget that the type of 'cond ? c1 : c2' is also 'int'.
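
This one can also be checked with C11 _Generic. A sketch only, valid
for C (C++ has different rules here) and assuming int is wider than
char; IS_INT and the helper names are hypothetical.

```c
#include <assert.h>	/* static_assert (C11) */

static_assert(sizeof(int) > sizeof(char), "int must be wider than char");

/* Evaluates to 1 if the expression has type int, 0 otherwise. */
#define IS_INT(x) _Generic((x), int: 1, default: 0)

/* Hypothetical helper: type of 'cond ? c1 : c2' with two char operands. */
static int conditional_is_int(int cond)
{
	char c1 = 'a', c2 = 'b';

	return IS_INT(cond ? c1 : c2);
}

/* For comparison: a plain char variable on its own is not an int. */
static int plain_char_is_int(void)
{
	char c1 = 'a';

	return IS_INT(c1);
}

/* The size of the conditional expression matches int, not char. */
static int conditional_has_int_size(int cond)
{
	char c1 = 'a', c2 = 'b';

	return sizeof(cond ? c1 : c2) == sizeof(int);
}
```

The first and third helpers should return 1 and the second 0.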

Part of it is historic: the pdp11 is a 16-bit cpu with byte-addressable
memory and sign-extending byte memory reads (which is probably why char
defaults to signed).

	David


> 
> Link: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
> 
> Regards,
> Kuan-Wei

