Message-ID: <878sljcung.fsf@toke.dk>
Date:   Mon, 03 Feb 2020 20:56:03 +0100
From:   Toke Høiland-Jørgensen <toke@...hat.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     David Ahern <dsahern@...il.com>, David Ahern <dsahern@...nel.org>,
        netdev@...r.kernel.org, prashantbhole.linux@...il.com,
        jasowang@...hat.com, davem@...emloft.net, brouer@...hat.com,
        mst@...hat.com, toshiaki.makita1@...il.com, daniel@...earbox.net,
        john.fastabend@...il.com, ast@...nel.org, kafai@...com,
        songliubraving@...com, yhs@...com, andriin@...com,
        David Ahern <dahern@...italocean.com>
Subject: Re: [PATCH bpf-next 03/12] net: Add IFLA_XDP_EGRESS for XDP programs in the egress path

Jakub Kicinski <kuba@...nel.org> writes:

> On Sat, 01 Feb 2020 21:05:28 +0100, Toke Høiland-Jørgensen wrote:
>> Jakub Kicinski <kuba@...nel.org> writes:
>> > On Sat, 01 Feb 2020 17:24:39 +0100, Toke Høiland-Jørgensen wrote:  
>> >> > I'm wary of partially implemented XDP features, EGRESS prog does us
>> >> > no good when most drivers didn't yet catch up with the REDIRECTs.    
>> >> 
>> >> I kinda agree with this; but on the other hand, if we have to wait for
>> >> all drivers to catch up, that would mean we couldn't add *anything*
>> >> new that requires driver changes, which is not ideal either :/  
>> >
>> > If EGRESS is only for XDP frames we could try to hide the handling in
>> > the core (with slight changes to XDP_TX handling in the drivers),
>> > making drivers smaller and XDP feature velocity higher.  
>> 
>> But if it's only for XDP frames that are REDIRECTed, then one might as
>> well perform whatever action the TX hook was doing before REDIRECTing
>> (as you yourself argued)... :)
>
> Right, that's why I think the design needs to start from queuing which
> can't be done today, and has to be done in context of the destination.
> Solving queuing justifies the added complexity if you will :)

Right, that makes sense. Hmm, I wonder if a TX driver hook is enough?
I.e., a driver callback in the TX path could also just queue that packet
(returning XDP_QUEUED?), without the driver needing any more handling
than it will already have? I'm spitballing a little bit here, but it may
be quite straight-forward? :)
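To make the spitballing a bit more concrete, here's a rough userspace sketch of what I mean. None of this is kernel code: the XDP_QUEUED verdict, the hook name, and the toy FIFO are all made up for illustration; the point is just that the driver-side change could be limited to checking one extra return value:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical verdicts for a TX-path hook; XDP_QUEUED is invented for
 * this sketch and does not exist in the kernel today. */
enum tx_hook_verdict { XDP_TX_PASS, XDP_QUEUED };

struct xdp_frame { int id; };

/* Toy FIFO standing in for whatever queueing structure the core owns. */
#define QLEN 64
struct frame_queue {
	struct xdp_frame *slots[QLEN];
	unsigned int head, tail; /* free-running; masked on access */
};

static bool queue_put(struct frame_queue *q, struct xdp_frame *f)
{
	if (q->tail - q->head == QLEN)
		return false;
	q->slots[q->tail++ % QLEN] = f;
	return true;
}

/* The hook itself: here a trivial policy that queues every frame.
 * In the real proposal this decision would come from a BPF program. */
static enum tx_hook_verdict tx_egress_hook(struct frame_queue *q,
					   struct xdp_frame *f)
{
	return queue_put(q, f) ? XDP_QUEUED : XDP_TX_PASS;
}

/* Driver TX path: the only added handling is the verdict check;
 * XDP_QUEUED means "the core took the frame, don't transmit now". */
static void ndo_xdp_xmit_one(struct frame_queue *q, struct xdp_frame *f,
			     int *transmitted)
{
	if (tx_egress_hook(q, f) == XDP_QUEUED)
		return;
	(*transmitted)++; /* would program the HW descriptor here */
}
```

I.e., the driver keeps its existing xmit logic and falls through to it whenever the hook declines to queue.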

>> > I think loading the drivers with complexity is hurting us in so many
>> > ways..  
>> 
>> Yeah, but having the low-level details available to the XDP program
>> (such as HW queue occupancy for the egress hook) is one of the benefits
>> of XDP, isn't it?
>
> I think I glossed over the hope for having access to HW queue occupancy
> - what exactly are you after? 
>
> I don't think one can get anything beyond a BQL type granularity.
> Reading over PCIe is out of question, device write back on high
> granularity would burn through way too much bus throughput.
>
>> Ultimately, I think Jesper's idea of having drivers operate exclusively
>> on XDP frames and have the skb handling entirely in the core is an
>> intriguing way to resolve this problem. Though this is obviously a
>> long-term thing, and one might reasonably doubt we'll ever get there for
>> existing drivers...
>> 
>> >> > And we're adding this before we considered the queuing problem.
>> >> >
>> >> > But if I'm alone in thinking this, and I'm not convincing anyone we
>> >> > can move on :)    
>> >> 
>> >> I do share your concern that this will end up being incompatible with
>> >> whatever solution we end up with for queueing. However, I don't
>> >> necessarily think it will: I view the XDP egress hook as something
>> >> that in any case will run *after* packets are dequeued from whichever
>> >> intermediate queueing it has been through (if any). I think such a
>> >> hook is missing in any case; for instance, it's currently impossible
>> >> to implement something like CoDel (which needs to know how long a
>> >> packet spent in the queue) in eBPF.  
>> >
>> > Possibly 🤔 I don't have a good mental image of how the XDP queuing
>> > would work.
>> >
>> > Maybe once the queuing primitives are defined they can easily be
>> > hooked into the Qdisc layer. With Martin's recent work all we need is 
>> > a fifo that can store skb pointers, really...
>> >
>> > It'd be good if the BPF queuing could replace TC Qdiscs, rather than 
>> > layer underneath.  
>> 
>> Hmm, hooking into the existing qdisc layer is an interesting idea.
>> Ultimately, I fear it won't be feasible for performance reasons; but
>> it's certainly something to consider. Maybe at least as an option?
>
> For forwarding sure, but for locally generated traffic.. 🤷‍♂️

Right, well, for locally generated traffic we already have the qdisc?
This was kinda the reason why my original thought was to add the
queueing for XDP only at REDIRECT time. I.e., we already have the
xdp_dev_bulk_queue (which now even lives in struct net_device), so we
could "just" extend that and make it into a proper queueing structure,
and call it a day? :)
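By "extend that" I mean something like the following userspace sketch: turning the flat flush array into a ring so frames can actually wait. The field names are loose approximations of the real struct (which is roughly a `struct xdp_frame *q[DEV_MAP_BULK_SIZE]` plus a count), and the dequeue budget is invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

struct xdp_frame { int id; };

/* Today's xdp_dev_bulk_queue is a flat array flushed wholesale; this
 * replaces it with a ring buffer so frames can persist across flushes. */
#define XDP_QUEUE_SIZE 256
struct xdp_egress_queue {
	struct xdp_frame *ring[XDP_QUEUE_SIZE];
	unsigned int head, tail; /* free-running; masked on access */
};

static int xdpq_enqueue(struct xdp_egress_queue *q, struct xdp_frame *f)
{
	if (q->tail - q->head == XDP_QUEUE_SIZE)
		return -1; /* queue full: caller drops, as REDIRECT does today */
	q->ring[q->tail++ % XDP_QUEUE_SIZE] = f;
	return 0;
}

/* Flush up to 'budget' frames; in a real driver this would hand them
 * to ndo_xdp_xmit() in a single bulk call. */
static unsigned int xdpq_flush(struct xdp_egress_queue *q,
			       struct xdp_frame **out, unsigned int budget)
{
	unsigned int n = 0;

	while (n < budget && q->head != q->tail)
		out[n++] = q->ring[q->head++ % XDP_QUEUE_SIZE];
	return n;
}
```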

But from your comments it sounds like by "BPF queueing" you mean that the
queueing itself should be programmable using BPF? To me, the most urgent
thing has been to figure out a way to do *any* kind of queueing with
XDP-forwarded frames, so I haven't given much thought to what a "fully
programmable" queue would look like... Do you have any thoughts?
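For what it's worth, the CoDel point above boils down to the dequeue path needing a per-packet enqueue timestamp, which is exactly what an XDP program can't see today. A compressed sketch of that decision (target/interval constants from RFC 8289; the drop-rate sqrt scheduling of full CoDel is omitted):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified CoDel-style state; full CoDel (RFC 8289) also scales the
 * drop rate with the square root of the drop count, omitted here. */
#define CODEL_TARGET_NS   (5 * 1000 * 1000ULL)	 /* 5 ms */
#define CODEL_INTERVAL_NS (100 * 1000 * 1000ULL) /* 100 ms */

struct codel_state {
	uint64_t first_above_ns; /* 0 = sojourn not currently above target */
};

/* Decide at dequeue time whether to drop, given the packet's enqueue
 * timestamp and the current time. */
static bool codel_should_drop(struct codel_state *st,
			      uint64_t enq_ns, uint64_t now_ns)
{
	uint64_t sojourn = now_ns - enq_ns;

	if (sojourn < CODEL_TARGET_NS) {
		st->first_above_ns = 0; /* back under target: reset */
		return false;
	}
	if (st->first_above_ns == 0) {
		st->first_above_ns = now_ns; /* start the grace interval */
		return false;
	}
	/* Above target continuously for a full interval: drop. */
	return now_ns - st->first_above_ns >= CODEL_INTERVAL_NS;
}
```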

-Toke
