lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 2 Apr 2024 19:11:43 +0100
From: Mark Rutland <mark.rutland@....com>
To: Vineet Gupta <vineetg@...osinc.com>
Cc: Will Deacon <will@...nel.org>, Andrew Waterman <andrew@...ive.com>,
	Palmer Dabbelt <palmer@...osinc.com>,
	lkml <linux-kernel@...r.kernel.org>,
	Mark Brown <broonie@...nel.org>, Dave Martin <Dave.Martin@....com>
Subject: Re: ARM SVE ABI: kernel dropping SVE/SME state on syscalls

On Wed, Mar 27, 2024 at 05:30:00PM -0700, Vineet Gupta wrote:
> Hi Will, Marc,
> 
> In the RISC-V land we are hitting an issue and need some help
> understanding the SVE ABI about dropping the state on syscalls (and its
> implications etc - in hindsight)
> 
> If I'm reading the arm64 code correctly, SVE state is unconditionally
> (for any syscall whatsoever) dropped in following code path:
> 
> el0_svc
>     fp_user_discard
> 
> The RISC-V Vector ABI mandates something similar and kernel implements
> something similar.
> 
>     2023-06-29 9657e9b7d253 riscv: Discard vector state on syscalls  
> 
> However in recent testing with RISC-V vector builds we are running into
> an issue when this just doesn't work.
> 
> Just for some background, RISC-V vector instructions relies on
> additional state in a VTYPE register which is setup using an apriori
> VSETVLI insn.
> So consider the following piece of code:
> 
>    3ff80:    cc787057              vsetivli    zero,16,e8,mf2,ta,ma    
> <-- sets up VTYPE
>    3ff84:    44d8                    lw    a4,12(s1)
>    3ff86:    449c                    lw    a5,8(s1)
>    3ff88:    06f75563              bge    a4,a5,3fff2
>    3ff8c:    02010087              vle8.v    v1,(sp)
>    3ff90:    020980a7              vse8.v    v1,(s3)   <-- Vector store
> instruction
> Here's the sequence of events that's causing the issue 
>
> 1. The vector store instruction (in say bash) takes a page fault, enters
> kernel.
> 2. In PF return path, a SIGCHLD signal is pending (a bash sub-shell
> which exited, likely on different cpu).

At this point, surely you need to save the VTYPE into a sigframe before
delivering the signal?

> 3. kernel resumes in userspace signal handler which ends up making an
> rt_sigreturn syscall - and which as specified discards the V state (and
> makes VTYPE reg invalid).

The state is discarded at syscall entry, but rt_sigreturn() runs *after* the
discard. If you saved the original VTYPE prior to delivering the signal, it
should be able to restore it regardless of whether it'd be clobbered at syscall
entry.

Surely you *must* save/restore VTYPE in the signal frame? Otherwise the signal
handler can't make any syscall whatsoever, or it's responsible for saving and
restoring VTYPE in userspace, which doesn't seem right.

> 4. When sigreturn finally returns to original Vector store instruction,
> invalid VTYPE triggers an Illegal instruction which causes a SIGILL (as
> state was discarded above).
> 
> So there is no way dropping syscall state would work here.

As above, I don't think that's quite true. It sounds to me like that the actual
bug is that you don't save+restore VTYPE in the signal frame?

> How do you guys handle this for SVE/SME ? One way would be to not do the
> discard in rt_sigreturn codepath, but I don't see that - granted I'm not
> too familiar with arch/arm64/*/**

IIUC this works on arm64 because we'll save all the original state when we
deliver the signal, then restore that state *after* entry to the rt_sigreturn()
syscall.

I can go dig into that tomorrow, but I don't see how this can work unless we
save *all* state prior to delivering the signal, and restoring *all* that state
from the sigframe.

> Other thing I wanted to ask is, have there been any perf implications of
> this ABI decision: as in if this was other way around, userspace (and/or
> compilers) could potentially leverage the fact that SVE/SME state would
> still be valid past a syscall - and won't have to reload/resetup etc.

I believe Mark Brown has made some changes recently to try to avoid some of
that impact. He might be able to comment on that.

Mark.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ