Date: Wed, 3 Apr 2024 10:33:10 -0700
From: Vineet Gupta <vineetg@...osinc.com>
To: Björn Töpel <bjorn@...nel.org>,
 Paul Walmsley <paul.walmsley@...ive.com>, Palmer Dabbelt
 <palmer@...belt.com>, Albert Ou <aou@...s.berkeley.edu>,
 Andy Chiu <andy.chiu@...ive.com>, linux-riscv@...ts.infradead.org
Cc: Björn Töpel <bjorn@...osinc.com>,
 Conor Dooley <conor.dooley@...rochip.com>, Heiko Stuebner <heiko@...ech.de>,
 Vincent Chen <vincent.chen@...ive.com>, Ben Dooks
 <ben.dooks@...ethink.co.uk>, Greentime Hu <greentime.hu@...ive.com>,
 Haorong Lu <ancientmodern4@...il.com>, Jerry Shih <jerry.shih@...ive.com>,
 Nick Knight <nick.knight@...ive.com>, linux-kernel@...r.kernel.org,
 Charlie Jenkins <charlie@...osinc.com>, Vineet Gupta <vgupta@...nel.org>
Subject: Re: [PATCH] riscv: Fix vector state restore in rt_sigreturn()

On 4/3/24 00:26, Björn Töpel wrote:
> From: Björn Töpel <bjorn@...osinc.com>
>
> The RISC-V Vector specification states in "Appendix D: Calling
> Convention for Vector State" [1] that "Executing a system call causes
> all caller-saved vector registers (v0-v31, vl, vtype) and vstart to
> become unspecified.". In the RISC-V kernel this is called "discarding
> the vstate".
>
> When returning from a signal handler via the rt_sigreturn() syscall,
> the vector discard is also performed. However, this is not an issue,
> since the vector state should be restored from the sigcontext and is
> therefore unaffected by the vector discard.
>
> The "live state" is the actual vector register in the running context,
> and the "vstate" is the vector state of the task. A dirty live state
> means that the vstate and live state are not in sync.
>
> When vectorized copy_from_user() was introduced, a bug sneaked in at
> the restoration code, related to the discard of the live state.
>
> An example of when this goes wrong:
>
>   1. A userland application is executing vector code
>   2. The application receives a signal, and the signal handler is
>      entered.
>   3. The application returns from the signal handler, using the
>      rt_sigreturn() syscall.
>   4. The live vector state is discarded upon entering
>      rt_sigreturn(), and the live state is marked as "dirty", indicating
>      that the live state needs to be synchronized with the current
>      vstate.
>   5. rt_sigreturn() restores the vstate, except the Vector registers,
>      from the sigcontext
>   6. rt_sigreturn() restores the Vector registers from the sigcontext,
>      and now the vectorized copy_from_user() is used. The dirty live
>      state from the discard is saved to the vstate, making the vstate
>      corrupt.
>   7. rt_sigreturn() returns to the application, which crashes due to
>      corrupted vstate.
>
> Note that the vectorized copy_from_user() is invoked depending on the
> value of CONFIG_RISCV_ISA_V_UCOPY_THRESHOLD. The default is 768, which
> means that vlen has to be larger than 128b for this bug to trigger.
>
> The fix is simply to mark the live state as non-dirty/clean prior to
> performing the vstate restore.
>
> Link: https://github.com/riscv/riscv-isa-manual/releases/download/riscv-isa-release-8abdb41-2024-03-26/unpriv-isa-asciidoc.pdf # [1]
> Reported-by: Charlie Jenkins <charlie@...osinc.com>
> Reported-by: Vineet Gupta <vgupta@...nel.org>
> Fixes: c2a658d41924 ("riscv: lib: vectorize copy_to_user/copy_from_user")
> Signed-off-by: Björn Töpel <bjorn@...osinc.com>

Tested-by: Vineet Gupta <vineetg@...osinc.com>
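
As an aside on the threshold remark in the quoted text, here is a rough back-of-the-envelope check. It assumes the copy in question is the full set of 32 vector registers (32 * VLEN / 8 bytes) and that the vectorized path is taken once the copy size reaches the threshold; the helper names are made up for illustration and are not kernel symbols.

```python
# Default value of CONFIG_RISCV_ISA_V_UCOPY_THRESHOLD quoted in the patch.
THRESHOLD = 768


def vector_regfile_bytes(vlen_bits):
    """Size in bytes of 32 vector registers, each vlen_bits wide."""
    return 32 * vlen_bits // 8


def uses_vector_copy(vlen_bits, threshold=THRESHOLD):
    """Assumption: the vectorized copy is chosen when size >= threshold."""
    return vector_regfile_bytes(vlen_bits) >= threshold


for vlen in (128, 256, 512):
    print(vlen, vector_regfile_bytes(vlen), uses_vector_copy(vlen))
```

Under these assumptions a 128b vlen copies 512 bytes and stays on the scalar path, while 256b copies 1024 bytes and crosses the 768 default, which matches the "larger than 128b" statement.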

For completeness (and fun)

1. The issue was triggered on a dual-core Spike run with a seemingly
benign workload (the key is repeated fork/execve/exit with a little I/O):

    some-shell-script.sh

    #!/bin/bash

    (while true; do ls; done) &

    for i in $(seq 1 20); do
       <long running job>
    done

2. The issue initially appears as follows: a vector store instruction,
before it starts to run, invalidates its own context (page fault ->
preemption -> handle-signal -> sigreturn -> VILL / V-clobber), so when
it eventually runs, it takes an illegal instruction exception, taking
down the entire program.
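
To make the sigreturn -> VILL step concrete, here is a toy model of steps 4 to 7 from the quoted commit message. It is only a sketch and not kernel code: the class, the function names, the dict-based "state" and the "vill" marker are all invented for illustration, and the only behaviour it encodes is what the commit message describes (discard marks the live state dirty, a dirty live state is saved into the vstate before the kernel uses the vector unit, and the fix clears the dirty mark before the restore).

```python
from dataclasses import dataclass

VILL = "vill"  # stand-in for an invalid vtype left behind by a discard


@dataclass
class Task:
    vstate: dict          # saved per-task vector state
    live: dict            # what the vector unit currently holds
    dirty: bool = False   # True when live and vstate are out of sync


def discard(task):
    """Step 4: syscall entry leaves the live state unspecified and dirty."""
    task.live = {"vtype": VILL, "regs": "garbage"}
    task.dirty = True


def begin_kernel_vector_use(task):
    """Hypothetical helper: a dirty live state is saved to vstate first."""
    if task.dirty:
        task.vstate = dict(task.live)
        task.dirty = False


def vector_copy_from_user(task, src):
    """Model of a vectorized copy: it borrows the vector unit, then copies."""
    begin_kernel_vector_use(task)
    return src


def rt_sigreturn(task, sigcontext, fixed):
    discard(task)
    if fixed:
        task.dirty = False  # the fix: mark the live state clean up front
    task.vstate["vtype"] = sigcontext["vtype"]                # step 5
    regs = vector_copy_from_user(task, sigcontext["regs"])    # step 6
    task.vstate["regs"] = regs
    task.live = dict(task.vstate)  # step 7: user resumes with vstate loaded
    return task.live


ctx = {"vtype": "e8m1", "regs": "user-data"}
print(rt_sigreturn(Task(vstate={}, live={}), ctx, fixed=False))
print(rt_sigreturn(Task(vstate={}, live={}), ctx, fixed=True))
```

In the unfixed run the vtype restored in step 5 is overwritten by the discarded live state during step 6, so the task resumes with an invalid vtype; with the dirty mark cleared first, the resumed state matches the sigcontext.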

Thx,
-Vineet
   
