Message-ID: <CALx6S34623bJP=L2EVNV6Gfx4oF=io5TGsGXCDQ=WH3VnHweMg@mail.gmail.com>
Date: Sun, 10 Jul 2016 15:30:44 -0500
From: Tom Herbert <tom@...bertland.com>
To: Brenden Blanco <bblanco@...mgrid.com>
Cc: Jesper Dangaard Brouer <brouer@...hat.com>,
"David S. Miller" <davem@...emloft.net>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
Martin KaFai Lau <kafai@...com>, Ari Saha <as754m@....com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Or Gerlitz <gerlitz.or@...il.com>,
john fastabend <john.fastabend@...il.com>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
Thomas Graf <tgraf@...g.ch>,
Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [PATCH v6 01/12] bpf: add XDP prog type for early driver filter
On Sun, Jul 10, 2016 at 12:09 PM, Brenden Blanco <bblanco@...mgrid.com> wrote:
> On Sun, Jul 10, 2016 at 03:37:31PM +0200, Jesper Dangaard Brouer wrote:
>> On Sat, 9 Jul 2016 08:47:52 -0500
>> Tom Herbert <tom@...bertland.com> wrote:
>>
>> > On Sat, Jul 9, 2016 at 3:14 AM, Jesper Dangaard Brouer
>> > <brouer@...hat.com> wrote:
>> > > On Thu, 7 Jul 2016 19:15:13 -0700
>> > > Brenden Blanco <bblanco@...mgrid.com> wrote:
>> > >
>> > >> Add a new bpf prog type that is intended to run in early stages of the
>> > >> packet rx path. Only minimal packet metadata will be available, hence a
>> > >> new context type, struct xdp_md, is exposed to userspace. So far only
>> > >> expose the packet start and end pointers, and only in read mode.
>> > >>
>> > >> An XDP program must return one of the well known enum values, all other
>> > >> return codes are reserved for future use. Unfortunately, this
>> > >> restriction is hard to enforce at verification time, so take the
>> > >> approach of warning at runtime when such programs are encountered. The
>> > >> driver can choose to implement unknown return codes however it wants,
>> > >> but must invoke the warning helper with the action value.
>> > >
>> > > I believe we should define a stronger semantics for unknown/future
>> > > return codes than the one stated above:
>> > > "driver can choose to implement unknown return codes however it wants"
>> > >
>> > > The mlx4 driver implementation in:
>> > > [PATCH v6 04/12] net/mlx4_en: add support for fast rx drop bpf program
>> > >
>> > > On Thu, 7 Jul 2016 19:15:16 -0700 Brenden Blanco <bblanco@...mgrid.com> wrote:
>> > >
>> > >> + /* A bpf program gets first chance to drop the packet. It may
>> > >> + * read bytes but not past the end of the frag.
>> > >> + */
>> > >> + if (prog) {
>> > >> + struct xdp_buff xdp;
>> > >> + dma_addr_t dma;
>> > >> + u32 act;
>> > >> +
>> > >> + dma = be64_to_cpu(rx_desc->data[0].addr);
>> > >> + dma_sync_single_for_cpu(priv->ddev, dma,
>> > >> + priv->frag_info[0].frag_size,
>> > >> + DMA_FROM_DEVICE);
>> > >> +
>> > >> + xdp.data = page_address(frags[0].page) +
>> > >> + frags[0].page_offset;
>> > >> + xdp.data_end = xdp.data + length;
>> > >> +
>> > >> + act = bpf_prog_run_xdp(prog, &xdp);
>> > >> + switch (act) {
>> > >> + case XDP_PASS:
>> > >> + break;
>> > >> + default:
>> > >> + bpf_warn_invalid_xdp_action(act);
>> > >> + case XDP_DROP:
>> > >> + goto next;
>> > >> + }
>> > >> + }
>> > >
>> > > Thus, mlx4's choice is to drop packets for unknown/future return codes.
>> > >
>> > > I think this is the wrong choice. I think the choice should be
>> > > XDP_PASS, to pass the packet up the stack.
>> > >
>> > > I find "XDP_DROP" problematic because it happens so early in the
>> > > driver that we lose all ability to debug which packets get dropped.
>> > > We get a single kernel log warning, but we can no longer inspect the
>> > > packets. By defaulting to XDP_PASS, all the normal stack tools (e.g.
>> > > tcpdump) are available.
> The goal of XDP is performance, and therefore the driver-specific choice
> I am making here is to drop, because it preserves resources to do so. I
> cannot say for all drivers whether this is the right choice or not.
> Therefore, in the user-facing API I leave it undefined, so that future
> drivers can have flexibility to choose the most performant
> implementation for themselves.
>
I don't think this should be undefined. If a driver receives a code
from XDP that it doesn't understand, this is an API mismatch and a bug
somewhere.
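To make the drop-on-unknown policy concrete, here is a minimal userspace C model of the dispatch logic (a sketch only -- the enum values and the warn-counter argument are illustrative stand-ins, not the kernel's actual definitions):

```c
#include <stdio.h>

/* Illustrative stand-ins for the XDP action codes in this patch set;
 * the real values come from the kernel UAPI headers. */
enum xdp_action { XDP_DROP, XDP_PASS };

/* Model of the mlx4-style policy: recognized actions are honored,
 * anything else is warned about (first time only) and treated as
 * XDP_DROP. */
static int handle_action(unsigned int act, int *warned)
{
	switch (act) {
	case XDP_PASS:
		return XDP_PASS;
	default:
		if (!(*warned)++)
			fprintf(stderr, "unknown XDP action %u, dropping\n", act);
		/* fall through */
	case XDP_DROP:
		return XDP_DROP;
	}
}
```

Note the deliberate fallthrough from default into XDP_DROP, mirroring the mlx4 hunk quoted above.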
> Consider the UDP DDoS use case that we have mentioned before. Suppose an
> attacker has knowledge of the particular XDP program that is being used
> to filter their packets, and can somehow overflow the return code of the
> program. The attacker would prefer that the overflow case consumes
> time/memory/both, which if the mlx4 driver were to pass to stack it
> would certainly do, and so we must choose the opposite if we have
> network security in mind (we do!).
Yes.
>> > >
>> >
>> > It's an API issue though, not a problem with the packet. Allowing
>> > unknown return codes to pass seems like a major security problem also.
>>
>> We have the full power and flexibility of the normal Linux stack to
>> drop these packets. And from a usability perspective it gives insight
>> into what is wrong, plus counters and metrics. Would you rather blindly
>> drop e.g. 0.01% of the packets in your data-centers without knowing?
> Full power, but not full speed, and in the case of DDoS mitigation this
> is a strong enough argument IMHO.
>>
>> We already talk about XDP as an offload mechanism. Normally, when
>> loading an (XDP) "offload" program, it should be rejected, e.g. by the
>> validator. BUT we cannot validate all eBPF return codes, because they
>> can originate from a table lookup. Thus, we _do_ allow programs with
>> future unknown return codes to be loaded.
>> This then corresponds to only part of the program being offloadable;
>> the natural response is to fall back, handling this in the
>> non-offloaded slower path.
>>
>> I see the XDP_PASS fallback as a natural way of supporting loading
>> newer/future programs on older "versions" of XDP.
>> E.g. I can have an XDP program that has a valid filter protection
>> mechanism but also uses a newer mechanism, and my server fleet contains
>> different NIC vendors, where some NICs only support the filter part.
>> Then I want to avoid having to compile and maintain different XDP/eBPF
>> programs per NIC vendor. (Instead I prefer having a Linux stack
>> fallback mechanism, and transparently XDP-offloading as much as the NIC
>> driver supports.)
> I would then argue to only support offloading of XDP programs with
> verifiable return codes. We're not at that stage yet, and I think we can
> choose different defaults for these two cases.
>
> We have conflicting examples here, which lead to different conclusions.
> Reiterating an earlier argument that I made for others on the list to
> consider:
> """
> Besides, I don't see how PASS is any more correct than DROP. Consider a
> future program that is intended to rewrite a packet and forward it out
> another port (with some TX_OTHER return code or whatever). If the driver
> PASSes the packet, it will still not be interpreted by the stack, since
> it may have been destined for some other machine.
> """
> So, IMHO there is not a clear right or wrong, and I still fall back to
> the security argument to resolve the dilemma. The point there is not
> drop/pass, but resource preservation.
>
Blind pass is a security risk; drop is always a correct action in that sense.
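Jesper's observation that return codes can come out of a table lookup -- and so cannot be bounded by the verifier -- can be sketched in userspace C. Everything here is hypothetical (the map contents, the key scheme, and the "future" action code 7); it only models why a default-drop dispatch contains an out-of-range code:

```c
/* Illustrative stand-ins for the known action codes. */
enum xdp_action { XDP_DROP, XDP_PASS };

/* An action table populated at runtime, e.g. from a control plane.
 * A static verifier sees only the lookup, not the stored values. */
static unsigned int action_map[4] = {
	XDP_PASS,	/* key 0: known-good traffic */
	XDP_DROP,	/* key 1: blocked */
	XDP_PASS,	/* key 2 */
	7		/* key 3: a "future" action this driver predates */
};

/* The program's return value is whatever the lookup yields. */
static unsigned int run_prog(unsigned int key)
{
	return action_map[key & 3];
}

/* Default-drop dispatch: an unrecognized code consumes no further
 * resources, which is the security argument made in this thread. */
static int dispatch(unsigned int act)
{
	return act == XDP_PASS ? XDP_PASS : XDP_DROP;
}
```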
Tom
>>
>>
>> > > I can also imagine that defaulting to XDP_PASS can be an important
>> > > feature in the future.
>> > >
>> > > In the future we will likely have features where XDP can "offload"
>> > > packet delivery from the normal stack (e.g. delivery into a VM). On a
>> > > running production system you can then load your XDP program. If the
>> > > driver was too old and defaulted to XDP_DROP, you would lose your
>> > > service; if it defaulted to XDP_PASS instead, your service would
>> > > survive, falling back to normal delivery.
>> > >
>> > > (For the VM delivery use case, there will likely be a need for a
>> > > fallback delivery method in place, when the XDP program is not
>> > > active, in order to support VM migration).
>> > >
>> > >
>> > >
> [...]