Date:   Tue, 4 Dec 2018 10:29:04 +0100
From:   Jesper Dangaard Brouer <brouer@...hat.com>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc:     Toke Høiland-Jørgensen <toke@...e.dk>,
        David Miller <davem@...emloft.net>, dsahern@...il.com,
        saeedm@...lanox.com, mst@...hat.com, netdev@...r.kernel.org,
        pstaszewski@...are.pl, jasowang@...hat.com, brouer@...hat.com
Subject: Re: consistency for statistics with XDP mode

On Mon, 3 Dec 2018 23:24:18 -0800
Jakub Kicinski <jakub.kicinski@...ronome.com> wrote:

> On Tue, 04 Dec 2018 09:03:52 +0200, Toke Høiland-Jørgensen wrote:
> > David Miller <davem@...emloft.net> writes:
> >   
> > > From: David Ahern <dsahern@...il.com>
> > > Date: Mon, 3 Dec 2018 17:15:03 -0700
> > >    
> > >> So, instead of a program tag which the program writer controls, how
> > >> about some config knob that an admin controls that says at attach time
> > >> use standard stats?    
> > >
> > > How about, instead of replacing, it is in addition to, and admin
> > > can override?
> > 
> > Yeah, I was also thinking about something the program writer can set,
> > but the admin can override. There could even be a system-wide setting
> > that would make the verifier inject it into all programs at load time?  
> 
> That'd be far preferable, having all drivers include the code when we
> have JITs seems like a step backward.

There is one problem with it being part of the eBPF prog.  Once an
eBPF prog is loaded you cannot change it (it is stored in a read-only
page for security reasons).  That makes it impossible for an admin to
enable stats on a running system when troubleshooting it.  The
use-case I think we want to support is to allow opting out due to
performance concerns, but still let an admin enable this system-wide
when the system needs troubleshooting.
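
One way to get that runtime toggle without modifying the loaded prog
is to keep the update on the kernel side behind a static_key, so the
per-packet cost is a patched-out NOP until an admin flips it.  Rough
sketch, assuming the stats array from the mockup further down (the
xdp_stats_set() sysctl hook is hypothetical, the jump-label calls are
existing API):

#include <linux/jump_label.h>

DEFINE_STATIC_KEY_FALSE(xdp_stats_key);

/* hypothetical sysctl handler an admin flips at runtime */
static void xdp_stats_set(bool on)
{
	if (on)
		static_branch_enable(&xdp_stats_key);
	else
		static_branch_disable(&xdp_stats_key);
}

/* called from bpf_prog_run_xdp() with the action code */
static __always_inline void xdp_stats_inc(struct xdp_rxq_info *rxq,
					  u32 action)
{
	/* NOP when disabled, patched to a jump once enabled */
	if (static_branch_unlikely(&xdp_stats_key))
		rxq->stats[action <= XDP_REDIRECT ? action : 0]++;
}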

Besides, placing this in every eBPF program in the system would
replicate the stats update code (and put more pressure on the
I-cache).  The code needed is actually very simple:

[PATCH] xdp: add stats for XDP action codes in xdp_rxq_info
    
Code mockup of adding XDP stats

diff --git a/include/linux/filter.h b/include/linux/filter.h
index cc17f5f32fbb..600a95e0cbcc 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -628,7 +628,10 @@ static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog,
         * already takes rcu_read_lock() when fetching the program, so
         * it's not necessary here anymore.
         */
-       return BPF_PROG_RUN(prog, xdp);
+       u32 action = BPF_PROG_RUN(prog, xdp);
+       // Q: will adding a branch here cost more than always accounting?
+       xdp->rxq->stats[action <= XDP_REDIRECT ? action : 0]++;
+       return action;
 }
 
 static inline u32 bpf_prog_insn_size(const struct bpf_prog *prog)
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 4a0ca7a3d5e5..3409dd9e0fbc 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -6,6 +6,7 @@
 #ifndef __LINUX_NET_XDP_H__
 #define __LINUX_NET_XDP_H__
 
+#include <uapi/linux/bpf.h>
 /**
  * DOC: XDP RX-queue information
  *
@@ -61,6 +62,8 @@ struct xdp_rxq_info {
        u32 queue_index;
        u32 reg_state;
        struct xdp_mem_info mem;
+       // TODO: benchmark if stats should be placed on different cache-line
+       u64 stats[XDP_REDIRECT + 1];
 } ____cacheline_aligned; /* perf critical, avoid false-sharing */
 
 struct xdp_buff {
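
For reporting, note the array is indexed directly by the XDP action
code (with slot 0, XDP_ABORTED, doubling as the catch-all for invalid
actions in the mockup above).  A hypothetical debug dump could be:

static const char * const xdp_action_names[XDP_REDIRECT + 1] = {
	[XDP_ABORTED]  = "aborted",
	[XDP_DROP]     = "drop",
	[XDP_PASS]     = "pass",
	[XDP_TX]       = "tx",
	[XDP_REDIRECT] = "redirect",
};

static void xdp_rxq_dump_stats(const struct xdp_rxq_info *rxq)
{
	int i;

	for (i = 0; i <= XDP_REDIRECT; i++)
		pr_info("rxq[%u] xdp_%s: %llu\n", rxq->queue_index,
			xdp_action_names[i],
			(unsigned long long)rxq->stats[i]);
}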



> We could probably fit the stats into the enormous padding of struct
> xdp_rxq_info, the question is - how do we get to it in a clean way..

Struct xdp_rxq_info is explicitly a read-only cache-line, which
contains static information for each RX-queue.  We could place the
stats record in the next cache-line (most HW systems fetch 2
cache-lines anyway).  But we could also benchmark whether it matters
to turn xdp_rxq_info into a written-to cache-line.
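
I.e. something along these lines (pure layout sketch, not benchmarked;
the dev member is from the current struct, rest trimmed for brevity):

/* Keep the existing read-only fields in the first cache-line and
 * push the written-to counters onto the next one, so readers of
 * the static info do not see false sharing.
 */
struct xdp_rxq_info {
	struct net_device *dev;
	u32 queue_index;
	u32 reg_state;
	struct xdp_mem_info mem;
	/* counters start on their own cache-line */
	u64 stats[XDP_REDIRECT + 1] ____cacheline_aligned;
} ____cacheline_aligned;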

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
