Message-ID: <20250723224831.4492ec75@pumpkin>
Date: Wed, 23 Jul 2025 22:48:30 +0100
From: David Laight <david.laight.linux@...il.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
 Thomas Gleixner <tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>, Dave
 Hansen <dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>,
 "Li,Rongqing" <lirongqing@...du.com>, Steven Rostedt <rostedt@...dmis.org>,
 linux-kernel@...r.kernel.org, x86@...nel.org
Subject: Re: [PATCH] x86/math64: handle #DE in mul_u64_u64_div_u64()

On Wed, 23 Jul 2025 11:38:25 +0200
Oleg Nesterov <oleg@...hat.com> wrote:

> On 07/22, David Laight wrote:
> >
> > On Tue, 22 Jul 2025 15:21:48 +0200
> > Oleg Nesterov <oleg@...hat.com> wrote:
> >  
> > > 	static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
> > > 	{
> > > 		char ok = 0;
> > > 		u64 q;
> > >
> > > 		asm ("mulq %3; 1: divq %4; movb $1,%1; 2:\n"
> > > 			_ASM_EXTABLE(1b, 2b)
> > > 			: "=a" (q), "+q" (ok)
> > > 			: "a" (a), "rm" (mul), "rm" (div)
> > > 			: "rdx");
> > >
> > > 		if (ok)
> > > 			return q;
> > > 		BUG_ON(!div);
> > > 		WARN_ONCE(1, "muldiv overflow.\n");  
> >
> > I wonder what WARN_ON_ONCE("muldiv overflow") outputs?  
> 
> Well, it outputs "muldiv overflow." ;) So I am not sure it is better
> than just WARN_ON_ONCE(1).
> 
> > Actually, without the BUG or WARN you want:
> > 	u64 fail = ~(u64)0;
> > then
> > 	incq $1 ... "+r" (fail)
> > and finally
> > 	return q | fail;
> > to remove the conditional branches from the normal path
> > (apart from one the caller might do)  
> 
> I was thinking about
> 
> 	static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
> 	{
> 		u64 q;
> 
> 		asm ("mulq %2; 1: divq %3; jmp 3f; 2: movq $-1,%0; 3:\n"
> 			_ASM_EXTABLE(1b, 2b)
> 			: "=a" (q)
> 			: "a" (a), "rm" (mul), "rm" (div)
> 			: "rdx");
> 
> 		return q;
> 	}
> 
> to remove the conditional branch and additional variable. Your version
> is probably better... But this is without WARN/BUG.

I wish there was a way of doing a WARN_ONCE from asm with a single instruction.
Then you could put one after your 2:
Otherwise it is a conditional and a load of inlined code.

> So, which version do you prefer?

I wish I knew :-)

Yours is a few bytes shorter, uses one less register, but has that unconditional jmp.
I suspect we don't worry about the cpu not predicting a jump - especially with
the divq.
It's not as though some real-time code relies on this code being as fast
as absolutely possible.
Not using a register is probably the main win.

So maybe I lose (this time).
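
For reference, the behaviour both asm versions aim for can be written in portable C.
This is only a sketch: mul_div_sat is a made-up name, it uses the gcc/clang
unsigned __int128 extension instead of mulq/divq, and it assumes that ~0 is the
value wanted whenever divq would raise #DE.

```c
#include <stdint.h>

/*
 * Hypothetical userspace reference, not the kernel implementation:
 * computes (a * mul) / div and returns ~0 where divq would fault.
 */
static inline uint64_t mul_div_sat(uint64_t a, uint64_t mul, uint64_t div)
{
    unsigned __int128 p = (unsigned __int128)a * mul;
    uint64_t hi = (uint64_t)(p >> 64);

    /*
     * divq raises #DE exactly when the high half of the dividend is
     * >= the divisor (the quotient would not fit in 64 bits).
     * div == 0 is covered too, since hi >= 0 always holds.
     */
    if (hi >= div)
        return ~(uint64_t)0;

    return (uint64_t)(p / div);
}
```

The hi >= div test is the same condition the exception table entry catches in the
asm versions, so this can serve as a check when comparing them.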

Further work could add an 'int *' parameter that is set non-zero (from %rax)
if the divide traps; optimised out if NULL.
The easy way is two copies of the asm statement.
But I've already got two copies in the version that does (a * b + c)/d
and four copies is getting silly.
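
A rough idea of what that interface could look like, again in portable C rather
than asm. The name mul_div_flag and the exact semantics are assumptions: here the
flag is only written when the divide would trap, so the caller is expected to
clear it first, and a NULL pointer means "don't care".

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch of the 'int *' variant: same result as the
 * saturating version, plus an optional overflow indication.
 */
static inline uint64_t mul_div_flag(uint64_t a, uint64_t mul, uint64_t div,
                                    int *overflow)
{
    unsigned __int128 p = (unsigned __int128)a * mul;

    /* Same condition under which divq would raise #DE. */
    if ((uint64_t)(p >> 64) >= div) {
        if (overflow)
            *overflow = 1;  /* any non-zero value would do */
        return ~(uint64_t)0;
    }

    return (uint64_t)(p / div);
}
```

When this is inlined with a constant NULL the compiler can drop the store, which
is the "optimised out if NULL" part.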

Actually this seems ok - at least as a real function:

    u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 div)
    {
        unsigned __int128 v = (unsigned __int128)a * b + c;

        asm ("1: divq %1; jmp 3f; 2: movq $-1,%%rax; 3:\n"
             _ASM_EXTABLE(1b, 2b)
             : "+A" (v)
             : "r" (div));
        return v;
    }

But (as I found with 32bit) gcc can decide to do a 128x128 multiply.
It does do a full 128bit add - with an extra register for the zero.
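
A plain C equivalent is handy for checking the asm version against. This is a
sketch with a made-up name (mul_add_div_sat) and it assumes the same ~0 result on
a trap. Note that a * b + c cannot overflow 128 bits, since
(2^64 - 1)^2 + (2^64 - 1) = 2^128 - 2^64.

```c
#include <stdint.h>

/*
 * Hypothetical userspace reference for (a * b + c) / div,
 * returning ~0 where divq would fault.
 */
static inline uint64_t mul_add_div_sat(uint64_t a, uint64_t b, uint64_t c,
                                       uint64_t div)
{
    /* The sum always fits: the maximum is 2^128 - 2^64. */
    unsigned __int128 v = (unsigned __int128)a * b + c;

    /* High half >= divisor means the quotient needs more than 64 bits
     * (or div is zero), which is when divq raises #DE. */
    if ((uint64_t)(v >> 64) >= div)
        return ~(uint64_t)0;

    return (uint64_t)(v / div);
}
```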

Note that you should never pass "rm" to clang; it needs to be "r".
There is a #define for it.
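
To illustrate the pattern in a standalone file: EXAMPLE_INPUT_RM below is a
made-up stand-in for the kernel's #define, and mul_hi_u64 is just a small
example function. The idea is that clang only ever sees "r" while gcc is allowed
to pick a memory operand. The non-x86-64 fallback is there so the sketch still
builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical macro for this example only: "r" for clang, "rm" otherwise. */
#ifdef __clang__
#define EXAMPLE_INPUT_RM "r"
#else
#define EXAMPLE_INPUT_RM "rm"
#endif

/* Returns the high 64 bits of the 128-bit product a * b. */
static inline uint64_t mul_hi_u64(uint64_t a, uint64_t b)
{
#if defined(__x86_64__)
    uint64_t lo, hi;

    /* mulq multiplies %rax by the operand, result in %rdx:%rax. */
    __asm__ ("mulq %3"
             : "=a" (lo), "=d" (hi)
             : "a" (a), EXAMPLE_INPUT_RM (b)
             : "cc");
    (void)lo;
    return hi;
#else
    return (uint64_t)(((unsigned __int128)a * b) >> 64);
#endif
}
```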

	David


> 
> Oleg.
> 

