Message-ID: <20210129085808.4e023d3f@carbon>
Date:   Fri, 29 Jan 2021 08:58:08 +0100
From:   Jesper Dangaard Brouer <brouer@...hat.com>
To:     David Ahern <dsahern@...il.com>
Cc:     netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>, bpf@...r.kernel.org,
        Eric Dumazet <eric.dumazet@...il.com>,
        Daniel Borkmann <borkmann@...earbox.net>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        brouer@...hat.com
Subject: Re: [PATCH net-next V1] net: adjust net_device layout for cacheline
 usage

On Thu, 28 Jan 2021 20:51:23 -0700
David Ahern <dsahern@...il.com> wrote:

> On 1/26/21 10:39 AM, Jesper Dangaard Brouer wrote:
> > The current layout of net_device is not optimal for cacheline usage.
> > 
> > The member adj_list.lower linked list is split between cacheline 2 and 3.
> > The ifindex is placed together with stats (struct net_device_stats),
> > although most modern drivers don't update this stats member.
> > 
> > The members netdev_ops, mtu and hard_header_len are placed on three
> > different cachelines. These members are accessed for XDP redirect into
> > devmap, which was noticeable with the perf tool. When not using the map
> > redirect variant (like TC-BPF does), then ifindex is also used, which is
> > placed on a separate fourth cacheline. These members are also accessed
> > during forwarding with the regular network stack. The members priv_flags and
> > flags are on fast-path for network stack transmit path in __dev_queue_xmit
> > (currently located together with mtu cacheline).
> > 
> > This patch creates a read mostly cacheline, with the purpose of keeping the
> > above mentioned members on the same cacheline.
> > 
> > Some netdev_features_t members also become part of this cacheline, which is
> > on purpose, as function netif_skb_features() is on fast-path via
> > validate_xmit_skb().  
> 
> A long overdue look at the organization of this struct. Do you have
> performance numbers for the XDP case?

Yes, my measurements are documented here:
 https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp_redir01_net_device.org

Calculated improvement for xdp_redirect_map on the i40e driver:
 * (1/12115061-1/12906785)*10^9 = 5.06 ns
 * ((12906785/12115061)-1)*100  = 6.54%
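
The arithmetic above can be reproduced with a short script. This is only a
sketch: it assumes the two figures are packets-per-second rates before and
after the patch (which the 1/x * 10^9 conversion to nanoseconds implies), and
the function names are made up for illustration.

```python
def per_packet_ns(pps):
    """Time spent per packet in nanoseconds, given a packets-per-second rate."""
    return 1e9 / pps


def improvement(before_pps, after_pps):
    """Return (nanoseconds saved per packet, throughput gain in percent)."""
    saved_ns = per_packet_ns(before_pps) - per_packet_ns(after_pps)
    gain_pct = (after_pps / before_pps - 1.0) * 100.0
    return saved_ns, gain_pct


# Assumed to be the before/after rates quoted in the calculation above.
saved_ns, gain_pct = improvement(12115061, 12906785)
print(f"{saved_ns:.2f} ns saved per packet, {gain_pct:.2f}% higher rate")
```

Expressing the gain in nanoseconds per packet as well as in percent makes it
easier to compare against results measured at a different baseline rate.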

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
