[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190320190225.GL7431@mini-arch.hsd1.ca.comcast.net>
Date: Wed, 20 Mar 2019 12:02:25 -0700
From: Stanislav Fomichev <sdf@...ichev.me>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: Stanislav Fomichev <sdf@...gle.com>,
Network Development <netdev@...r.kernel.org>,
bpf@...r.kernel.org, David Miller <davem@...emloft.net>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
simon.horman@...ronome.com, Willem de Bruijn <willemb@...gle.com>,
Petar Penkov <peterpenkov96@...il.com>
Subject: Re: [RFC bpf-next v2 7/9] bpf: when doing BPF_PROG_TEST_RUN for flow
dissector use no-skb mode
On 03/20, Willem de Bruijn wrote:
> On Wed, Mar 20, 2019 at 12:57 PM Stanislav Fomichev <sdf@...ichev.me> wrote:
> >
> > On 03/19, Willem de Bruijn wrote:
> > > On Tue, Mar 19, 2019 at 6:21 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> > > >
> > > > Now that we have __flow_bpf_dissect which works on raw data (by
> > > > constructing temporary on-stack skb), use it when doing
> > > > BPF_PROG_TEST_RUN for flow dissector.
> > > >
> > > > This should help us catch any possible bugs due to missing shinfo on
> > > > the per-cpu skb.
> > > >
> > > > Note that existing __skb_flow_bpf_dissect swallows L2 headers and returns
> > > > nhoff=0, we need to preserve the existing behavior.
> > > >
> > > > Signed-off-by: Stanislav Fomichev <sdf@...gle.com>
> > > > ---
> > > > net/bpf/test_run.c | 48 ++++++++++++++--------------------------------
> > > > 1 file changed, 14 insertions(+), 34 deletions(-)
> > > >
> > >
> > > > @@ -300,9 +277,13 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
> > > > preempt_disable();
> > > > time_start = ktime_get_ns();
> > > > for (i = 0; i < repeat; i++) {
> > > > - retval = bpf_flow_dissect_skb(prog, skb,
> > > > - &flow_keys_dissector,
> > > > - &flow_keys);
> > > > + retval = bpf_flow_dissect(prog, data, eth->h_proto, ETH_HLEN,
> > > > + size, &flow_keys_dissector,
> > > > + &flow_keys);
> > > > + if (flow_keys.nhoff >= ETH_HLEN)
> > > > + flow_keys.nhoff -= ETH_HLEN;
> > > > + if (flow_keys.thoff >= ETH_HLEN)
> > > > + flow_keys.thoff -= ETH_HLEN;
> > >
> > > why are these conditional?
> > Hm, I didn't want these to be negative, because bpf flow program can set
> > them to zero and clamp_flow_keys makes sure they are in a "sensible"
> > range. For this particular case, I think we need to amend
> > clamp_flow_keys to make sure that flow_keys.nhoff is in the range of
> > initial_nhoff..hlen, not 0..hlen (and then we can drop these checks).
>
> So, previously eth_type_trans would call with data at the network
> header. Now it is called with data at the link layer. How would
> __skb_flow_bpf_dissect "swallows L2 headers and returns nhoff=0"? That
s/__skb_flow_bpf_dissect/eth_type_trans/, I'll clarify that in the patch
description.
> sounds incorrect.
Previously, for skb case, eth_type_trans would pull ETH_HLEN (L2) and
after that we did skb_reset_network_header. So when later we initialized
flow keys (flow_keys->nhoff = skb_network_offset(skb)), that would
yield nhoff == 0.
For example, see:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
Now, we explicitly call bpf_flow_dissect with nhoff=ETH_HLEN and have to
undo it, otherwise, it breaks those tests.
We could do something like the following instead:
retval = bpf_flow_dissect(prog, data + ETH_HLEN, eth->h_proto, 0,
size, &flow_keys_dissector,
&flow_keys);
But I wanted to make sure nhoff != 0 works.
>
> Agreed that the output should lie between nhoff and hlen, but as far
> as I can tell it is always zero indexed to the data passed, here the
> link layer:
>
> if (!data) {
> data = skb->data;
> proto = skb_vlan_tag_present(skb) ?
> skb->vlan_proto : skb->protocol;
> nhoff = skb_network_offset(skb);
That's for the skb != NULL case. eth_get_headlen calls with skb == NULL
and passes data and nhoff=sizeof(*eth):
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/ethernet/eth.c#n139
Powered by blists - more mailing lists