Message-ID: <CAJF2gTSpCGP55=VyOmRR+439H6PymPVk6cRUc_8LRX74-bHqKQ@mail.gmail.com>
Date:   Sat, 28 Jan 2023 17:45:18 +0800
From:   Guo Ren <guoren@...nel.org>
To:     Mark Rutland <mark.rutland@....com>
Cc:     anup@...infault.org, paul.walmsley@...ive.com, palmer@...belt.com,
        conor.dooley@...rochip.com, heiko@...ech.de, rostedt@...dmis.org,
        mhiramat@...nel.org, jolsa@...hat.com, bp@...e.de,
        jpoimboe@...nel.org, suagrfillet@...il.com, andy.chiu@...ive.com,
        e.shatokhin@...ro.com, linux-riscv@...ts.infradead.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH -next V7 1/7] riscv: ftrace: Fixup panic by disabling preemption

On Thu, Jan 12, 2023 at 8:57 PM Mark Rutland <mark.rutland@....com> wrote:
>
> On Thu, Jan 12, 2023 at 12:16:02PM +0000, Mark Rutland wrote:
> > Hi Guo,
> >
> > On Thu, Jan 12, 2023 at 04:05:57AM -0500, guoren@...nel.org wrote:
> > > From: Andy Chiu <andy.chiu@...ive.com>
> > >
> > > In RISC-V, we must use an AUIPC + JALR pair to encode an immediate,
> > > forming a jump to an address more than 4K away. This may cause errors
> > > if we want to enable kernel preemption and remove the patching code's
> > > dependency on stop_machine(). For example, if a task is switched out
> > > on the auipc, and the ftrace function is changed before it is switched
> > > back, it will jump to an address whose updated 11:0 bits are mixed
> > > with the previous XLEN-1:12 part.
> > >
> > > p: area patched by dynamic ftrace
> > > ftrace_prologue:
> > > p|      REG_S   ra, -SZREG(sp)
> > > p|      auipc   ra, 0x? ------------> preempted
> > >                                     ...
> > >                             change ftrace function
> > >                                     ...
> > > p|      jalr    -?(ra) <------------- switched back
> > > p|      REG_L   ra, -SZREG(sp)
> > > func:
> > >     xxx
> > >     ret
> >
> > As mentioned on the last posting, I don't think this is sufficient to fix the
> > issue. I've replied with more detail there:
> >
> >   https://lore.kernel.org/lkml/Y7%2F3hoFjS49yy52W@FVFF77S0Q05N/
> >
> > Even in a non-preemptible SMP kernel, one CPU can be in the middle of
> > executing the ftrace_prologue while another CPU is patching it, and you
> > have the exact same issue.
> >
> > For example, if CPU X is in the prologue and fetches the old AUIPC and the new
> > JALR (because it races with CPU Y modifying those), CPU X will branch to the
> > wrong address. The race window is much smaller in the absence of preemption,
> > but it's still there (and will be exacerbated in virtual machines since the
> > hypervisor can preempt a vCPU at any time).
>
> With that in mind, I think your current implementations of ftrace_make_call()
> and ftrace_make_nop() have a similar bug. A caller might execute:
>
>         NOP     // not yet patched to AUIPC
>
>                                 < AUIPC and JALR instructions both patched >
>
>         JALR
>
> ... and go to the wrong place.
>
> Assuming individual instruction fetches are atomic, and that you only ever
> branch to the same trampoline, you could fix that by always leaving the AUIPC
> in place, so that you only patch the JALR to enable/disable the callsite.
Yes, always branching to the same trampoline is one way to fix this.
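
To make that concrete, here is a rough sketch of the two callsite states
when the AUIPC is written once and left resident, so that enabling or
disabling the callsite only ever rewrites the single 4-byte jalr/nop slot.
The register choice t0, the labels, and the %pcrel relocations are
illustrative, not the actual patch code:

Disabled:
ftrace_prologue:
        REG_S   ra, -SZREG(sp)
1:      auipc   t0, %pcrel_hi(ftrace_caller)   # written once, never repatched
        nop                                    # 4-byte slot
        REG_L   ra, -SZREG(sp)

Enabled:
ftrace_prologue:
        REG_S   ra, -SZREG(sp)
1:      auipc   t0, %pcrel_hi(ftrace_caller)   # unchanged
        jalr    ra, %pcrel_lo(1b)(t0)          # the only word ever rewritten
        REG_L   ra, -SZREG(sp)

Since only one instruction word ever changes, a CPU racing with the patch
sees either the old nop or the new jalr, but always pairs it with the same
resident auipc. The cost is that the ra save/restore and the auipc stay in
place even when tracing is off.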

>
> Depending on your calling convention, if you have two free GPRs, you might be
> able to avoid the stacking of RA by always saving it to a GPR in the callsite,
> using a different GPR for the address generation, and having the ftrace
> trampoline restore the original RA value, e.g.
>
>         MV      GPR1, ra
>         AUIPC   GPR2, high_bits_of(ftrace_caller)
>         JALR    ra, low_bits_of(ftrace_caller)(GPR2)   // only patch this
I think you mean temp registers here. We are at the prologue of a
function, so we have all of them.

But why do you need the extra "MV      GPR1, ra"?

         AUIPC   GPR2, high_bits_of(ftrace_caller)
         JALR    GPR2, low_bits_of(ftrace_caller)(GPR2)   // only patch this

We could preserve ra in the trampoline instead:
        MV      XX, ra
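
For illustration only, a sketch of how the trampoline entry could look
under that convention, assuming the callsite's jalr links through t1
(standing in for GPR2 above) so the traced function's original ra is never
clobbered at the callsite; the register names and the handler call are
placeholders, not the real ftrace_caller:

ftrace_caller:
        # on entry: ra = return address into the traced function's caller
        #           t1 = address of the instruction following the callsite jalr
        addi    sp, sp, -2*SZREG
        REG_S   ra, 0(sp)              # preserve the original ra (parent ip)
        REG_S   t1, SZREG(sp)          # preserve the return point into the function
        ...                            # invoke the ftrace handler here
        REG_L   ra, 0(sp)
        REG_L   t1, SZREG(sp)
        addi    sp, sp, 2*SZREG
        jr      t1                     # resume the traced function with ra intact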

>
> ... which'd save an instruction per callsite.
>
> Thanks,
> Mark.



-- 
Best Regards
 Guo Ren
