netdev - Re: [RFC bpf-next v2 7/9] bpf: when doing BPF_PROG_TEST

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190320190225.GL7431@mini-arch.hsd1.ca.comcast.net>
Date:   Wed, 20 Mar 2019 12:02:25 -0700
From:   Stanislav Fomichev <sdf@...ichev.me>
To:     Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc:     Stanislav Fomichev <sdf@...gle.com>,
        Network Development <netdev@...r.kernel.org>,
        bpf@...r.kernel.org, David Miller <davem@...emloft.net>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        simon.horman@...ronome.com, Willem de Bruijn <willemb@...gle.com>,
        Petar Penkov <peterpenkov96@...il.com>
Subject: Re: [RFC bpf-next v2 7/9] bpf: when doing BPF_PROG_TEST_RUN for flow
 dissector use no-skb mode

On 03/20, Willem de Bruijn wrote:
> On Wed, Mar 20, 2019 at 12:57 PM Stanislav Fomichev <sdf@...ichev.me> wrote:
> >
> > On 03/19, Willem de Bruijn wrote:
> > > On Tue, Mar 19, 2019 at 6:21 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> > > >
> > > > Now that we have __flow_bpf_dissect which works on raw data (by
> > > > constructing temporary on-stack skb), use it when doing
> > > > BPF_PROG_TEST_RUN for flow dissector.
> > > >
> > > > This should help us catch any possible bugs due to missing shinfo on
> > > > the per-cpu skb.
> > > >
> > > > Note that existing __skb_flow_bpf_dissect swallows L2 headers and returns
> > > > nhoff=0, we need to preserve the existing behavior.
> > > >
> > > > Signed-off-by: Stanislav Fomichev <sdf@...gle.com>
> > > > ---
> > > >  net/bpf/test_run.c | 48 ++++++++++++++--------------------------------
> > > >  1 file changed, 14 insertions(+), 34 deletions(-)
> > > >
> > >
> > > > @@ -300,9 +277,13 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
> > > >         preempt_disable();
> > > >         time_start = ktime_get_ns();
> > > >         for (i = 0; i < repeat; i++) {
> > > > -               retval = bpf_flow_dissect_skb(prog, skb,
> > > > -                                             &flow_keys_dissector,
> > > > -                                             &flow_keys);
> > > > +               retval = bpf_flow_dissect(prog, data, eth->h_proto, ETH_HLEN,
> > > > +                                         size, &flow_keys_dissector,
> > > > +                                         &flow_keys);
> > > > +               if (flow_keys.nhoff >= ETH_HLEN)
> > > > +                       flow_keys.nhoff -= ETH_HLEN;
> > > > +               if (flow_keys.thoff >= ETH_HLEN)
> > > > +                       flow_keys.thoff -= ETH_HLEN;
> > >
> > > why are these conditional?
> > Hm, I didn't want these to be negative, because bpf flow program can set
> > them to zero and clamp_flow_keys makes sure they are in a "sensible"
> > range. For this particular case, I think we need to amend
> > clamp_flow_keys to make sure that flow_keys.nhoff is in the range of
> > initial_nhoff..hlen, not 0..hlen (and then we can drop these checks).
> 
> So, previously eth_type_trans would call with data at the network
> header. Now it is called with data at the link layer. How would
> __skb_flow_bpf_dissect "swallows L2 headers and returns nhoff=0"? That
s/__skb_flow_bpf_dissect/eth_type_trans/, I'll clarify that in the patch
description.

> sounds incorrect.
Previously, for skb case, eth_type_trans would pull ETH_HLEN (L2) and
after that we did skb_reset_network_header. So when later we initialized
flow keys (flow_keys->nhoff = skb_network_offset(skb)), that would
yield nhoff == 0.

For example, see:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/prog_tests/flow_dissector.c

Now, we explicitly call bpf_flow_dissect with nhoff=ETH_HLEN and have to
undo it, otherwise, it breaks those tests.

We could do something like the following instead:
retval = bpf_flow_dissect(prog, data + ETH_HLEN, eth->h_proto, 0,
                          size, &flow_keys_dissector,
                          &flow_keys);

But I wanted to make sure nhoff != 0 works.

> 
> Agreed that the output should lie between nhoff and hlen, but as far
> as I can tell it is always zero indexed to the data passed, here the
> link layer:
> 
>         if (!data) {
>                 data = skb->data;
>                 proto = skb_vlan_tag_present(skb) ?
>                          skb->vlan_proto : skb->protocol;
>                 nhoff = skb_network_offset(skb);
That's for the skb != NULL case. eth_get_headlen calls with skb == NULL
and passes data and nhoff=sizeof(*eth):

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/ethernet/eth.c#n139