Message-ID: <20190724075700.GA15878@splinter>
Date: Wed, 24 Jul 2019 10:57:00 +0300
From: Ido Schimmel <idosch@...sch.org>
To: David Ahern <dsahern@...il.com>
Cc: Toke Høiland-Jørgensen <toke@...hat.com>,
netdev@...r.kernel.org, davem@...emloft.net, nhorman@...driver.com,
roopa@...ulusnetworks.com, nikolay@...ulusnetworks.com,
jakub.kicinski@...ronome.com, andy@...yhouse.net,
f.fainelli@...il.com, andrew@...n.ch, vivien.didelot@...il.com,
mlxsw@...lanox.com, Ido Schimmel <idosch@...lanox.com>
Subject: Re: [RFC PATCH net-next 00/12] drop_monitor: Capture dropped packets and metadata

On Tue, Jul 23, 2019 at 08:47:57AM -0700, David Ahern wrote:
> On 7/23/19 8:14 AM, Ido Schimmel wrote:
> > On Tue, Jul 23, 2019 at 02:17:49PM +0200, Toke Høiland-Jørgensen wrote:
> >> Ido Schimmel <idosch@...sch.org> writes:
> >>
> >>> On Mon, Jul 22, 2019 at 09:43:15PM +0200, Toke Høiland-Jørgensen wrote:
> >>>> Is there a mechanism for the user to filter the packets before they are
> >>>> sent to userspace? A bpf filter would be the obvious choice I guess...
> >>>
> >>> Hi Toke,
> >>>
> >>> Yes, it's on my TODO list to write an eBPF program that only lets
> >>> "unique" packets be enqueued on the netlink socket. Where "unique" is
> >>> defined as {5-tuple, PC}. The rest of the copies will be counted in an
> >>> eBPF map, which is just a hash table keyed by {5-tuple, PC}.
> >>
> >> Yeah, that's a good idea. Or even something simpler like tcpdump-style
> >> filters for the packets returned by drop monitor (say if I'm just trying
> >> to figure out what happens to my HTTP requests).
> >
> > Yep, that's a good idea. I guess different users will use different
> > programs. Will look into both options.
>
> Perhaps I am missing something, but the dropmon code only allows a
> single user at the moment (in my attempts to run 2 instances the second
> one failed).

Yes, you're correct. By "different users" I meant users on different
systems with different needs. For example, someone trying to monitor
dropped packets on a laptop versus someone trying to do the same on a
ToR switch.

> If that part stays with the design

This stays the same.

> it affords better options for the design. e.g., attributes that control
> the enqueued packets when the event occurs as opposed to bpf filters
> which run much later when the message is enqueued to the socket.

I'm going to add an attribute that will control the number of packets
we're enqueuing on the per-CPU drop list. Are you suggesting adding even
more attributes? If so, what do you imagine these would look like?

>
> >
> >>> I think it would be good to have the program as part of the bcc
> >>> repository [1]. What do you think?
> >>
> >> Sure. We could also add it to the XDP tutorial[2]; it could go into a
> >> section on introspection and debugging (just added a TODO about that[3]).
> >
> > Great!
> >
> >>>> For integrating with XDP the trick would be to find a way to do it that
> >>>> doesn't incur any overhead when it's not enabled. Are you envisioning
> >>>> that this would be enabled separately for the different "modes" (kernel,
> >>>> hardware, XDP, etc)?
> >>>
> >>> Yes. Drop monitor has commands to enable and disable tracing, but they
> >>> don't carry any attributes at the moment. My plan is to add an attribute
> >>> (e.g., 'NET_DM_ATTR_DROP_TYPE') that will specify the type of drops
> >>> you're interested in - SW/HW/XDP. If the attribute is not specified,
> >>> then current behavior is maintained and all the drop types are traced.
> >>> But if you're only interested in SW drops, then overhead for the rest
> >>> should be zero.
> >>
> >> Makes sense (although "should be" is the key here ;)).
>
> static_key is used in other parts of the packet fast path.
>
> Toke/Jesper: Any reason to believe it is too much overhead for this path?
>
> >>
> >> I'm also worried about the drop monitor getting overwhelmed; if you turn
> >> it on for XDP and you're running a filtering program there, you'll
> >> suddenly get *a lot* of drops.
> >>
> >> As I read your patch, the current code can basically queue up an
> >> unbounded number of packets waiting to go out over netlink, can't it?
> >
> > That's a very good point. Each CPU holds a drop list. It probably makes
> > sense to limit it by default (to 1000?) and allow user to change it
> > later, if needed. I can expose a counter that shows how many packets
> > were dropped because of this limit. It can be used as an indication to
> > adjust the queue length (or flip to 'summary' mode).
> >
>
> And then with a single user limit, you can have an attribute that
> controls the backlog.

Yep, already on my list of changes for v1 :)
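
To make the backlog limit discussed above concrete, here is a minimal
userspace sketch of one bounded drop list with an overflow counter. It is
only an illustration under stated assumptions: it uses a tail-drop policy
and fixed storage, and every name in it (dm_queue, dm_queue_enqueue,
DM_QUEUE_CAP and so on) is hypothetical and does not come from the patch
set or from the drop_monitor code.

```c
#include <stddef.h>

/* Hypothetical userspace model of one per-CPU drop list (not kernel code). */
#define DM_QUEUE_CAP 1024	/* fixed storage; the limit is clamped to this */

struct dm_queue {
	const void *pkts[DM_QUEUE_CAP];	/* queued packets, kept as a FIFO ring */
	size_t head;			/* index of the oldest entry */
	size_t len;			/* number of entries currently queued */
	size_t limit;			/* user-configurable backlog limit */
	unsigned long overflow;		/* packets discarded because of the limit */
};

static void dm_queue_init(struct dm_queue *q, size_t limit)
{
	q->head = 0;
	q->len = 0;
	q->limit = limit > DM_QUEUE_CAP ? DM_QUEUE_CAP : limit;
	q->overflow = 0;
}

/* Returns 1 if the packet was queued, 0 if it was discarded (tail drop). */
static int dm_queue_enqueue(struct dm_queue *q, const void *pkt)
{
	if (q->len >= q->limit) {
		q->overflow++;	/* this is the counter that would be exposed */
		return 0;
	}
	q->pkts[(q->head + q->len) % DM_QUEUE_CAP] = pkt;
	q->len++;
	return 1;
}

/* Returns the oldest queued packet, or NULL if the list is empty. */
static const void *dm_queue_dequeue(struct dm_queue *q)
{
	const void *pkt;

	if (q->len == 0)
		return NULL;
	pkt = q->pkts[q->head];
	q->head = (q->head + 1) % DM_QUEUE_CAP;
	q->len--;
	return pkt;
}
```

With a limit of 2, a third enqueue is refused and overflow becomes 1. A
steadily growing overflow value would be the hint to raise the limit or
to flip to 'summary' mode.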
Thanks, David.