lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <78aab15e-5bc2-47cc-ac1c-5a348bff0e17@oracle.com>
Date: Thu, 3 Jul 2025 15:33:16 +0200
From: Vegard Nossum <vegard.nossum@...cle.com>
To: David Laight <david.laight.linux@...il.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc: Andy Lutomirski <luto@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ard Biesheuvel <ardb@...nel.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        Xiongwei Song <xiongwei.song@...driver.com>,
        Xin Li <xin3.li@...el.com>, "Mike Rapoport (IBM)" <rppt@...nel.org>,
        Brijesh Singh <brijesh.singh@....com>,
        Michael Roth <michael.roth@....com>, Tony Luck <tony.luck@...el.com>,
        Alexey Kardashevskiy <aik@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jonathan Corbet <corbet@....net>, Sohil Mehta <sohil.mehta@...el.com>,
        Ingo Molnar <mingo@...nel.org>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Daniel Sneddon <daniel.sneddon@...ux.intel.com>,
        Kai Huang <kai.huang@...el.com>, Sandipan Das <sandipan.das@....com>,
        Breno Leitao <leitao@...ian.org>,
        Rick Edgecombe
 <rick.p.edgecombe@...el.com>,
        Alexei Starovoitov <ast@...nel.org>, Hou Tao <houtao1@...wei.com>,
        Juergen Gross <jgross@...e.com>, Kees Cook <kees@...nel.org>,
        Eric Biggers <ebiggers@...gle.com>, Jason Gunthorpe <jgg@...pe.ca>,
        "Masami Hiramatsu (Google)" <mhiramat@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Luis Chamberlain <mcgrof@...nel.org>, Yuntao Wang <ytcoode@...il.com>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        Christophe Leroy <christophe.leroy@...roup.eu>,
        Tejun Heo <tj@...nel.org>, Changbin Du <changbin.du@...wei.com>,
        Huang Shijie <shijie@...amperecomputing.com>,
        Geert Uytterhoeven <geert+renesas@...der.be>,
        Namhyung Kim <namhyung@...nel.org>,
        Arnaldo Carvalho de Melo <acme@...hat.com>, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-efi@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [PATCHv8 02/17] x86/asm: Introduce inline memcpy and memset


On 03/07/2025 14:15, David Laight wrote:
> On Thu, 3 Jul 2025 13:39:57 +0300
> "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com> wrote:
>> On Thu, Jul 03, 2025 at 09:44:17AM +0100, David Laight wrote:
>>> On Tue,  1 Jul 2025 12:58:31 +0300
>>> "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com> wrote:
>>>> diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
>>>> index a508e4a8c66a..47b613690f84 100644
>>>> --- a/arch/x86/lib/clear_page_64.S
>>>> +++ b/arch/x86/lib/clear_page_64.S
>>>> @@ -55,17 +55,26 @@ SYM_FUNC_END(clear_page_erms)
>>>>   EXPORT_SYMBOL_GPL(clear_page_erms)
>>>>   
>>>>   /*
>>>> - * Default clear user-space.
>>>> + * Default memset.
>>>>    * Input:
>>>>    * rdi destination
>>>> + * rsi scratch
>>>>    * rcx count
>>>> - * rax is zero
>>>> + * al is value
>>>>    *
>>>>    * Output:
>>>>    * rcx: uncleared bytes or 0 if successful.
>>>> + * rdx: clobbered
>>>>    */
>>>>   SYM_FUNC_START(rep_stos_alternative)
>>>>   	ANNOTATE_NOENDBR
>>>> +
>>>> +	movzbq %al, %rsi
>>>> +	movabs $0x0101010101010101, %rax
>>>> +
>>>> +	/* RDX:RAX = RAX * RSI */
>>>> +	mulq %rsi
>>>
>>> NAK - you can't do that here.
>>> Neither %rsi nor %rdx can be trashed.
>>> The function has a very explicit calling convention.

That's why we have the clobbers... see below

>> What calling convention? We change the only caller to confirm to this.
> 
> The one that is implicit in:
> 
>>>> +	asm volatile("1:\n\t"
>>>> +		     ALT_64("rep stosb",
>>>> +			    "call rep_stos_alternative", ALT_NOT(X86_FEATURE_FSRM))
>>>> +		     "2:\n\t"
>>>> +		     _ASM_EXTABLE_UA(1b, 2b)
>>>> +		     : "+c" (len), "+D" (addr), ASM_CALL_CONSTRAINT
>>>> +		     : "a" ((uint8_t)v)
> 
> The called function is only allowed to change the registers that
> 'rep stosb' uses - except it can access (but not change)
> all of %rax - not just %al.
> 
> See: https://godbolt.org/z/3fnrT3x9r
> In particular note that 'do_mset' must not change %rax.
> 
> This is very specific and is done so that the compiler can use
> all the registers.

I feel like you trimmed off the clobbers from the asm() in the context
above. For reference, it is:

+		     : "memory", _ASM_SI, _ASM_DX);

I'm not saying this can't be optimized, but that doesn't seem to be your
complaint -- you say "the called function is only allowed to change
...", but this is not true when we have the clobbers, right?

This is exactly what I fixed with my v7 fixlet to this patch:

https://lore.kernel.org/all/1b96b0ca-5c14-4271-86c1-c305bf052b16@oracle.com/


Vegard

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ