Message-ID: <20181204095608.7046b24e@cakuba.netronome.com>
Date: Tue, 4 Dec 2018 09:56:08 -0800
From: Jakub Kicinski <jakub.kicinski@...ronome.com>
To: Jesper Dangaard Brouer <brouer@...hat.com>
Cc: Toke Høiland-Jørgensen <toke@...e.dk>,
David Miller <davem@...emloft.net>, dsahern@...il.com,
saeedm@...lanox.com, mst@...hat.com, netdev@...r.kernel.org,
pstaszewski@...are.pl, jasowang@...hat.com
Subject: Re: consistency for statistics with XDP mode
On Tue, 4 Dec 2018 10:29:04 +0100, Jesper Dangaard Brouer wrote:
> On Mon, 3 Dec 2018 23:24:18 -0800
> Jakub Kicinski <jakub.kicinski@...ronome.com> wrote:
>
> > On Tue, 04 Dec 2018 09:03:52 +0200, Toke Høiland-Jørgensen wrote:
> > > David Miller <davem@...emloft.net> writes:
> > >
> > > > From: David Ahern <dsahern@...il.com>
> > > > Date: Mon, 3 Dec 2018 17:15:03 -0700
> > > >
> > > >> So, instead of a program tag which the program writer controls, how
> > > >> about some config knob that an admin controls that says at attach time
> > > >> use standard stats?
> > > >
> > > > How about: instead of replacing it, it is in addition, and the admin
> > > > can override?
> > >
> > > Yeah, I was also thinking about something the program writer can set,
> > > but the admin can override. There could even be a system-wide setting
> > > that would make the verifier inject it into all programs at load time?
> >
> > That'd be far preferable; having all drivers include the code when we
> > have JITs seems like a step backward.
>
> There is one problem with it being part of the eBPF prog. Once an eBPF
> prog is loaded you cannot change it (it is stored in a read-only page
> for security reasons), so an admin would have no way to enable stats
> later when troubleshooting a system. The use-case I think we want to
> support is allowing an opt-out due to performance concerns, but
> letting an admin enable this system-wide when they need to
> troubleshoot the system.
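FWIW a static_branch would make that system-wide switch close to free
when disabled. Rough sketch on top of the stats array from the mockup
below (the key name and the sysctl wired to it are made up):

#include <linux/jump_label.h>

/* Hypothetical system-wide knob; off by default, so the increment
 * below compiles down to a patched-out jump until an admin flips it,
 * e.g. from a sysctl handler via static_branch_enable().
 */
DEFINE_STATIC_KEY_FALSE(xdp_stats_key);

static __always_inline void xdp_stats_account(struct xdp_buff *xdp,
					      u32 action)
{
	if (static_branch_unlikely(&xdp_stats_key))
		xdp->rxq->stats[action <= XDP_REDIRECT ? action : 0]++;
}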
>
> Besides, placing this in every eBPF program in the system would
> replicate the stats update code (and put more pressure on the
> I-cache). The code needed is actually very simple:
>
> [PATCH] xdp: add stats for XDP action codes in xdp_rxq_info
>
> Code mockup of adding XDP stats
>
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index cc17f5f32fbb..600a95e0cbcc 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -628,7 +628,10 @@ static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog,
> * already takes rcu_read_lock() when fetching the program, so
> * it's not necessary here anymore.
> */
> - return BPF_PROG_RUN(prog, xdp);
> + u32 action = BPF_PROG_RUN(prog, xdp);
> + // Q: will adding a branch here cost more than always accounting?
> + xdp->rxq->stats[action <= XDP_REDIRECT ? action : 0]++;
> + return action;
> }
Now if we can turn it into a trampoline dynamically inserted on program
invocation, that would be perfect :)
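Short of a real generated trampoline, even swapping the dispatch
function at enable time would keep the cost out of the disabled path.
A plain-C approximation (untested; none of these names exist in the
kernel):

typedef u32 (*xdp_run_fn)(const struct bpf_prog *prog,
			  struct xdp_buff *xdp);

static u32 xdp_run_plain(const struct bpf_prog *prog, struct xdp_buff *xdp)
{
	return BPF_PROG_RUN(prog, xdp);
}

static u32 xdp_run_counted(const struct bpf_prog *prog, struct xdp_buff *xdp)
{
	u32 action = BPF_PROG_RUN(prog, xdp);

	/* unknown action codes land in slot 0 (XDP_ABORTED) */
	xdp->rxq->stats[action <= XDP_REDIRECT ? action : 0]++;
	return action;
}

/* Switched by the admin knob; a real version would need more care
 * (RCU, or patching the call site) than a bare function pointer, and
 * the indirect call is not free either with retpolines around.
 */
static xdp_run_fn xdp_run_dispatch __read_mostly = xdp_run_plain;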
> static inline u32 bpf_prog_insn_size(const struct bpf_prog *prog)
> diff --git a/include/net/xdp.h b/include/net/xdp.h
> index 4a0ca7a3d5e5..3409dd9e0fbc 100644
> --- a/include/net/xdp.h
> +++ b/include/net/xdp.h
> @@ -6,6 +6,7 @@
> #ifndef __LINUX_NET_XDP_H__
> #define __LINUX_NET_XDP_H__
>
> +#include <uapi/linux/bpf.h>
> /**
> * DOC: XDP RX-queue information
> *
> @@ -61,6 +62,8 @@ struct xdp_rxq_info {
> u32 queue_index;
> u32 reg_state;
> struct xdp_mem_info mem;
> + // TODO: benchmark if stats should be placed on different cache-line
> + u64 stats[XDP_REDIRECT + 1];
> } ____cacheline_aligned; /* perf critical, avoid false-sharing */
>
> struct xdp_buff {
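On the read side a driver could then fold these per-queue counters into
its ethtool stats; a hypothetical helper (xdp_rxq_stats_fold is not an
existing API):

/* Sum one XDP action's counter across a driver's RX queues. Each
 * counter is only written from its own queue's NAPI context, so plain
 * u64 reads give the usual best-effort snapshot.
 */
static u64 xdp_rxq_stats_fold(struct xdp_rxq_info *rxqs[], unsigned int n,
			      enum xdp_action action)
{
	u64 sum = 0;
	unsigned int i;

	for (i = 0; i < n; i++)
		sum += rxqs[i]->stats[action];
	return sum;
}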
> > We could probably fit the stats into the enormous padding of struct
> > xdp_rxq_info, the question is: how do we get to it in a clean way?
>
> Struct xdp_rxq_info is explicitly a read-only cache-line, which
> contains static information for each RX-queue. We could place the
> stats record in the next cache-line (most HW systems prefetch two
> cache-lines). But we could also benchmark whether it matters to turn
> xdp_rxq_info into a written cache-line.
I see. I was thinking the xdp_frame work made it generally less of an
issue, but I haven't double-checked. Perhaps keeping the first cache
line read-only is a good rule to stick to regardless.
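Something along these lines would keep it that way, at the cost of one
more cache line per queue (layout sketch only, untested):

struct xdp_rxq_info {
	struct net_device *dev;
	u32 queue_index;
	u32 reg_state;
	struct xdp_mem_info mem;
	/* writable counters pushed onto their own cache line, so the
	 * read-only fields above stay clean for all consumers
	 */
	u64 stats[XDP_REDIRECT + 1] ____cacheline_aligned;
} ____cacheline_aligned; /* perf critical, avoid false-sharing */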