netdev - Re: [PATCH net-next] bpf/verifier: improve disassembly of BPF

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAH3MdRVK-=pL8YCz44BPCmHhuZE_CRfih0QdZPLa=mpE_uoA_Q@mail.gmail.com>
Date:   Thu, 21 Sep 2017 16:11:41 -0700
From:   Y Song <ys114321@...il.com>
To:     Edward Cree <ecree@...arflare.com>
Cc:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        David Miller <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next] bpf/verifier: improve disassembly of BPF_END instructions

On Thu, Sep 21, 2017 at 12:58 PM, Edward Cree <ecree@...arflare.com> wrote:
> On 21/09/17 20:44, Alexei Starovoitov wrote:
>> On Thu, Sep 21, 2017 at 09:29:33PM +0200, Daniel Borkmann wrote:
>>> More intuitive, but agree on the from_be/le. Maybe we should
>>> just drop the "to_" prefix altogether, and leave the rest as is since
>>> it's not surrounded by braces, it's also not a cast but rather an op.
> That works for me.
>> 'be16 r4' is ambiguous regarding upper bits.
>>
>> what about my earlier suggestion:
>> r4 = (be16) (u16) r4
>> r4 = (le64) (u64) r4
>>
>> It will be pretty clear what instruction is doing (that upper bits become zero).
> Trouble with that is that's very *not* what C will do with those casts
>  and it doesn't really capture the bidirectional/symmetry thing.  The
>  closest I could see with that is something like `r4 = (be16/u16) r4`,
>  but that's quite an ugly mongrel.
> I think Daniel's idea of `be16`, `le32` etc one-arg opcodes is the
>  cleanest and clearest.  Should it be
>     r4 = be16 r4
>  or just
>     be16 r4
> ?  Personally I incline towards the latter, but admit it doesn't really
>  match the syntax of other opcodes.

I did some quick prototyping in llvm to make sure we have a syntax
llvm is happy. Apparently, llvm does not like the syntax
   r4 = be16 r4    or    r4 = (be16) (u16) r4.

In llvm:utils/TableGen/AsmMatcherEmitter.cpp:

    // Verify that any operand is only mentioned once.
    // We reject aliases and ignore instructions for now.
    if (Tok[0] == '$' && !OperandNames.insert(Tok).second) {
      if (!Hack)
        PrintFatalError(TheDef->getLoc(),
                        "ERROR: matchable with tied operand '" + Tok +
                        "' can never be matched!");
      // FIXME: Should reject these.  The ARM backend hits this with $lane in a
      // bunch of instructions.  It is unclear what the right answer is.
      DEBUG({
        errs() << "warning: '" << TheDef->getName() << "': "
               << "ignoring instruction with tied operand '"
               << Tok << "'\n";
      });
      return false;
    }

Later on, such insn will be ignored in table matching and assember
will not work any more.

Note that here bswap16/32/64 require source and destination register
must be the same.

So it looks like "be16/be32/be64/le16/le32/le64 #register" is a good idea.
We could use "be16 (u16)#register", but not sure whether extra u16
conversion adds value or
rather adds more confusion.

>
> To shed a few more bikes, I did also wonder about the BPF_NEG opcode,
>  which (if I'm reading the code correctly) currently renders as
>     r4 = neg r4 0
>     (u32) r4 = neg (u32) r4 0
> That printing of the insn->imm, while harmless, is needless and
>  potentially confusing.  Should we get rid of it?

Currently, llvm does not issue "neg" insn yet. Maybe you can issue
     r3 = -r4          // 64bit
     r3 = (u32) -r4 // 32bit

This matches what interpreter does. This will be similar to other ALU
operations.