lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 22 Apr 2024 12:14:13 +0100
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Puranjay Mohan <puranjay@...nel.org>
Cc: Alexei Starovoitov <ast@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>,
	Andrii Nakryiko <andrii@...nel.org>,
	Martin KaFai Lau <martin.lau@...ux.dev>,
	Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>,
	Yonghong Song <yonghong.song@...ux.dev>,
	John Fastabend <john.fastabend@...il.com>,
	KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...gle.com>,
	Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
	bpf@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
	linux-kernel@...r.kernel.org, puranjay12@...il.com
Subject: Re: [PATCH bpf] arm32, bpf: reimplement sign-extension mov
 instruction

On Fri, Apr 19, 2024 at 06:28:32PM +0000, Puranjay Mohan wrote:
> The current implementation of the mov instruction with sign extension
> has the following problems:
> 
>   1. It clobbers the source register if it is not stacked because it
>      sign extends the source and then moves it to the destination.
>   2. If the dst_reg is stacked, the current code doesn't write the value
>      back in case of 64-bit mov.
>   3. There is room for improvement by emitting fewer instructions.
> 
> The steps for fixing this and the instructions emitted by the JIT are
> explained below with examples in all combinations:
> 
> Case A: offset == 32:
> =====================
>   Case A.1: src and dst are stacked registers:
>   --------------------------------------------
>     1. Load src_lo into tmp_lo
>     2. Store tmp_lo into dst_lo
>     3. Sign extend tmp_lo into tmp_hi
>     4. Store tmp_hi to dst_hi
> 
>     Example: r3 = (s32)r3
> 	r3 is a stacked register
> 
> 	ldr     r6, [r11, #-16]	// Load r3_lo into tmp_lo
> 	// str to dst_lo is not emitted because src_lo == dst_lo
> 	asr     r7, r6, #31	// Sign extend tmp_lo into tmp_hi
> 	str     r7, [r11, #-12] // Store tmp_hi into r3_hi
> 
>   Case A.2: src is stacked but dst is not:
>   ----------------------------------------
>     1. Load src_lo into dst_lo
>     2. Sign extend dst_lo into dst_hi
> 
>     Example: r6 = (s32)r3
> 	r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
> 
> 	ldr     r4, [r11, #-16] // Load r3_lo into r6_lo
> 	asr     r5, r4, #31	// Sign extend r6_lo into r6_hi
> 
>   Case A.3: src is not stacked but dst is stacked:
>   ------------------------------------------------
>     1. Store src_lo into dst_lo
>     2. Sign extend src_lo into tmp_hi
>     3. Store tmp_hi to dst_hi
> 
>     Example: r3 = (s32)r6
> 	r3 is stacked and r6 maps to {ARM_R5, ARM_R4}
> 
> 	str     r4, [r11, #-16] // Store r6_lo to r3_lo
> 	asr     r7, r4, #31	// Sign extend r6_lo into tmp_hi
> 	str     r7, [r11, #-12]	// Store tmp_hi to dest_hi
> 
>   Case A.4: Both src and dst are not stacked:
>   -------------------------------------------
>     1. Mov src_lo into dst_lo
>     2. Sign extend src_lo into dst_hi
> 
>     Example: (bf) r6 = (s32)r6
> 	r6 maps to {ARM_R5, ARM_R4}
> 
> 	// Mov not emitted because dst == src
> 	asr     r5, r4, #31 // Sign extend r6_lo into r6_hi
> 
> Case B: offset != 32:
> =====================
>   Case B.1: src and dst are stacked registers:
>   --------------------------------------------
>     1. Load src_lo into tmp_lo
>     2. Sign extend tmp_lo according to offset.
>     3. Store tmp_lo into dst_lo
>     4. Sign extend tmp_lo into tmp_hi
>     5. Store tmp_hi to dst_hi
> 
>     Example: r9 = (s8)r3
> 	r9 and r3 are both stacked registers
> 
> 	ldr     r6, [r11, #-16] // Load r3_lo into tmp_lo
> 	lsl     r6, r6, #24	// Sign extend tmp_lo
> 	asr     r6, r6, #24	// ..
> 	str     r6, [r11, #-56] // Store tmp_lo to r9_lo
> 	asr     r7, r6, #31	// Sign extend tmp_lo to tmp_hi
> 	str     r7, [r11, #-52] // Store tmp_hi to r9_hi
> 
>   Case B.2: src is stacked but dst is not:
>   ----------------------------------------
>     1. Load src_lo into dst_lo
>     2. Sign extend dst_lo according to offset.
>     3. Sign extend tmp_lo into dst_hi
> 
>     Example: r6 = (s8)r3
> 	r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
> 
> 	ldr     r4, [r11, #-16] // Load r3_lo to r6_lo
> 	lsl     r4, r4, #24	// Sign extend r6_lo
> 	asr     r4, r4, #24	// ..
> 	asr     r5, r4, #31	// Sign extend r6_lo into r6_hi
> 
>   Case B.3: src is not stacked but dst is stacked:
>   ------------------------------------------------
>     1. Sign extend src_lo into tmp_lo according to offset.
>     2. Store tmp_lo into dst_lo.
>     3. Sign extend src_lo into tmp_hi.
>     4. Store tmp_hi to dst_hi.
> 
>     Example: r3 = (s8)r1
> 	r3 is stacked and r1 maps to {ARM_R3, ARM_R2}
> 
> 	lsl     r6, r2, #24 	// Sign extend r1_lo to tmp_lo
> 	asr     r6, r6, #24	// ..
> 	str     r6, [r11, #-16] // Store tmp_lo to r3_lo
> 	asr     r7, r6, #31	// Sign extend tmp_lo to tmp_hi
> 	str     r7, [r11, #-12] // Store tmp_hi to r3_hi
> 
>   Case B.4: Both src and dst are not stacked:
>   -------------------------------------------
>     1. Sign extend src_lo into dst_lo according to offset.
>     2. Sign extend dst_lo into dst_hi.
> 
>     Example: r6 = (s8)r1
> 	r6 maps to {ARM_R5, ARM_R4} and r1 maps to {ARM_R3, ARM_R2}
> 
> 	lsl     r4, r2, #24	// Sign extend r1_lo to r6_lo
> 	asr     r4, r4, #24	// ..
> 	asr     r5, r4, #31	// Sign extend r6_lo to r6_hi
> 
> Fixes: fc832653fa0d ("arm32, bpf: add support for sign-extension mov instruction")
> Reported-by: syzbot+186522670e6722692d86@...kaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000e9a8d80615163f2a@google.com/
> Signed-off-by: Puranjay Mohan <puranjay@...nel.org>

This looks really great, thanks for putting in the extra effort!

Reviewed-by: Russell King (Oracle) <rmk+kernel@...linux.org.uk>

Thanks!

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ