Date:   Thu, 20 Jun 2019 09:01:20 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Vineet Gupta <Vineet.Gupta1@...opsys.com>
Cc:     Eugeniy Paltsev <Eugeniy.Paltsev@...opsys.com>,
        "linux-snps-arc@...ts.infradead.org" 
        <linux-snps-arc@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Alexey Brodkin <Alexey.Brodkin@...opsys.com>,
        Jason Baron <jbaron@...mai.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Ard Biesheuvel <ard.biesheuvel@...aro.org>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
Subject: Re: [PATCH] ARC: ARCv2: jump label: implement jump label patching

On Wed, Jun 19, 2019 at 11:55:41PM +0000, Vineet Gupta wrote:
> On 6/19/19 1:12 AM, Peter Zijlstra wrote:

> > I'm assuming you've looked at what x86 currently does and found
> > something like that doesn't work for ARC?
> 
> Just looked at x86 code and it seems similar

I think you missed a bit.

> >>> +	WRITE_ONCE(*instr_addr, instr);
> >>> +	flush_icache_range(entry->code, entry->code + JUMP_LABEL_NOP_SIZE);
> > So do you have a 2 byte opcode that traps unconditionally? In that case
> > I'm thinking you could do something like x86 does. And it would avoid
> > that NOP padding you do to get the alignment.
> 
> Just to be clear there is no trapping going on in the canonical sense of it. There
> are regular instructions for NO-OP and Branch.
> We do have 2 byte opcodes for both but as described the branch offset is too
> limited so not usable.

In particular we do not need the alignment.

So what the x86 code does is:

 - overwrite the first byte of the instruction with a single byte trap
   instruction

 - machine wide IPI which synchronizes I$

At this point, any CPU that encounters this instruction will trap; and
the trap handler will emulate the 'new' instruction -- typically a jump.

  - overwrite the tail of the instruction (if there is a tail)

  - machine wide IPI which synchronizes I$

At this point, nobody will execute the tail, because we'll still trap on
that first single byte instruction, but if they were to read the
instruction stream, the tail must be there.

  - overwrite the first byte of the instruction to now have a complete
    instruction.

  - machine wide IPI which synchronizes I$

At this point, any CPU will encounter the new instruction as a whole,
irrespective of alignment.
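
The sequence above can be sketched in plain C. This is only a user-space
model under stated assumptions, not the kernel's implementation: the text
is an ordinary byte buffer, poke_insn() and sync_all_cpus() are
hypothetical names, and sync_all_cpus() merely counts calls where the real
code would send the machine wide IPI. TRAP_OPCODE is 0xCC, the single byte
INT3 on x86. The trap handler that emulates the 'new' instruction is not
modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRAP_OPCODE 0xCC	/* x86 INT3, a single byte trap */

/* Counts synchronization points; hypothetical bookkeeping for the sketch. */
static unsigned int sync_count;

/* Stand-in for the machine wide IPI which synchronizes I$ (assumption). */
static void sync_all_cpus(void)
{
	sync_count++;
}

/* Replace the len-byte instruction at addr with new_insn in three steps. */
static void poke_insn(uint8_t *addr, const uint8_t *new_insn, size_t len)
{
	assert(len >= 1);

	/* Step 1: the first byte becomes a trap; CPUs that hit it trap. */
	addr[0] = TRAP_OPCODE;
	sync_all_cpus();

	/* Step 2: write the tail (if there is one) behind the trap byte. */
	if (len > 1) {
		memcpy(addr + 1, new_insn + 1, len - 1);
		sync_all_cpus();
	}

	/* Step 3: replace the trap byte, completing the new instruction. */
	addr[0] = new_insn[0];
	sync_all_cpus();
}
```

Calling poke_insn() on a 5 byte NOP with a 5 byte jump leaves the buffer
holding the jump after three synchronization points; at no point does the
buffer start with a mix of old first byte and new tail.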


So the benefit of this scheme is that it works irrespective of the
instruction fetch window size and doesn't need the 'funny' alignment
stuff.

Now, I've no idea if something like this is feasible on ARC; for it to
work you need that 2 byte trap instruction -- since all instructions are
2 byte aligned, you can always poke that without issue.
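
One way to read the alignment argument, as a small C check: a write of
'size' bytes can only be seen half-done by a fetch if it crosses a fetch
window boundary. crosses_fetch_window() is a hypothetical helper and the
window sizes in the usage note are made-up values for illustration; the
actual ARC fetch window is not stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if the byte range [addr, addr + size) crosses a boundary
 * between two windows of 'window' bytes, 0 if it lies within one window.
 */
static int crosses_fetch_window(uintptr_t addr, unsigned int size,
				unsigned int window)
{
	assert(size >= 1 && window >= size);

	return (addr / window) != ((addr + size - 1) / window);
}
```

With an assumed 8 byte window, a 4 byte write at the 2 byte aligned
address 6 crosses a boundary (bytes 6..9), while a 2 byte write at the
same address does not (bytes 6..7). A 2 byte write at any even address
stays inside one window for any even window size, which is why the 2 byte
trap can be poked without the padding.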
