netdev - Re: [PATCH bpf] bpf: fix narrower loads on s390

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CAH3MdRUXJBf5qQFkUZ5x3X=ziXE8Ou7O1cH_WJvyXVpEtQQ0nQ@mail.gmail.com>
Date:   Wed, 17 Jul 2019 15:23:32 -0700
From:   Y Song <ys114321@...il.com>
To:     Ilya Leoshkevich <iii@...ux.ibm.com>
Cc:     bpf <bpf@...r.kernel.org>, netdev <netdev@...r.kernel.org>,
        gor@...ux.ibm.com, heiko.carstens@...ibm.com
Subject: Re: [PATCH bpf] bpf: fix narrower loads on s390

On Wed, Jul 17, 2019 at 1:52 PM Ilya Leoshkevich <iii@...ux.ibm.com> wrote:
>
> > Am 17.07.2019 um 18:25 schrieb Y Song <ys114321@...il.com>:
> >
> > On Wed, Jul 17, 2019 at 3:36 AM Ilya Leoshkevich <iii@...ux.ibm.com> wrote:
> >>
> >>
> >> Here is a better one: len=0x11223344 and we would like to do
> >> ((u8 *)&len)[3].
> >>
> >> len is represented as `11 22 33 44` in memory, so the desired result is
> >> 0x44. It can be obtained by doing (*(u32 *)&len) & 0xff, but today the
> >> verifier does ((*(u32 *)&len) >> 24) & 0xff instead.
> >
> > What you described above for the memory layout all makes sense.
> > The root cause is for big endian, we should do *((u8 *)&len + 3).
> > This is exactly what macros in test_pkt_md_access.c tries to do.
> >
> > if  __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
> > #define TEST_FIELD(TYPE, FIELD, MASK)                                   \
> >        {                                                               \
> >                TYPE tmp = *(volatile TYPE *)&skb->FIELD;               \
> >                if (tmp != ((*(volatile __u32 *)&skb->FIELD) & MASK))   \
> >                        return TC_ACT_SHOT;                             \
> >        }
> > #else
> > #define TEST_FIELD_OFFSET(a, b) ((sizeof(a) - sizeof(b)) / sizeof(b))
> > #define TEST_FIELD(TYPE, FIELD, MASK)                                   \
> >        {                                                               \
> >                TYPE tmp = *((volatile TYPE *)&skb->FIELD +             \
> >                              TEST_FIELD_OFFSET(skb->FIELD, TYPE));     \
> >                if (tmp != ((*(volatile __u32 *)&skb->FIELD) & MASK))   \
> >                        return TC_ACT_SHOT;                             \
> >        }
> > #endif
> >
> > Could you check whether your __BYTE_ORDER__ is set
> > correctly or not for this case? You may need to tweak Makefile
> > if you are doing cross compilation, I am not sure how as I
> > did not have environment.
>
> I’m building natively on s390.
>
> Here is the (formatted) preprocessed C code for the first condition:
>
> {
>         __u8 tmp = *((volatile __u8 *)&skb->len +
>                 ((sizeof(skb->len) - sizeof(__u8)) / sizeof(__u8)));
>         if (tmp != ((*(volatile __u32 *)&skb->len) & 0xFF)) return 2;
> };
>
> So I believe the endianness is chosen correctly.
>
> Here is the clang-generated BPF bytecode for the first condition:
>
> # llvm-objdump -d test_pkt_md_access.o
> 0000000000000000 process:
>        0:       71 21 00 03 00 00 00 00 r2 = *(u8 *)(r1 + 3)
>        1:       61 31 00 00 00 00 00 00 r3 = *(u32 *)(r1 + 0)
>        2:       57 30 00 00 00 00 00 ff r3 &= 255
>        3:       5d 23 00 1d 00 00 00 00 if r2 != r3 goto +29 <LBB0_10>
>
> This also looks good to me.
>
> Finally, here is the verifier-generated BPF bytecode:
>
> # bpftool prog dump xlated id 14
> ; TEST_FIELD(__u8,  len, 0xFF);
>    0: (61) r2 = *(u32 *)(r1 +104)
>    1: (bc) w2 = w2
>    2: (74) w2 >>= 24
>    3: (bc) w2 = w2
>    4: (54) w2 &= 255
>    5: (bc) w2 = w2
>
> Here we can see the shift that I'm referring to. I believe we should
> translate *(u8 *)(r1 + 3) in this case without this shift on big-endian
> machines.

Thanks for the detailed illustration.
Now, with your detailed output of byte codes and xlated program, it
indeed becomes apparent
that verifier should not do shift at insn 2.

I was correct that after insn 0, register r2 should hold the same
value for big and little endian.
But I missed the fact in the first review that insn->off for original
first insn is different.
r2 = *(u8 *)(r1 + 3), the first insn on big endian, and r2 = *(u8 *)r1
for little endian.
They should really have the same shift amount.

Therefore, indeed, shifting amount is actually different between big
and little endians.
So your code is correct. Could you add a macro in linux/filter.h? Most
narrow load related
macros are there? This way, we maintain verifier.c __BYTE_ORDER__ macro free.

Also, could you put your above analysis in the commit message? This will help
reasoning the change easily later on.

Thanks!

>
> Best regards,
> Ilya