Open Source and information security mailing list archives
Date: Mon, 11 Jul 2016 14:48:17 +0300
From: Saeed Mahameed <saeedm@....mellanox.co.il>
To: Brenden Blanco <bblanco@...mgrid.com>
Cc: Tariq Toukan <ttoukan.linux@...il.com>, "David S. Miller" <davem@...emloft.net>,
	Linux Netdev List <netdev@...r.kernel.org>, Martin KaFai Lau <kafai@...com>,
	Jesper Dangaard Brouer <brouer@...hat.com>, Ari Saha <as754m@....com>,
	Alexei Starovoitov <alexei.starovoitov@...il.com>, Or Gerlitz <gerlitz.or@...il.com>,
	john fastabend <john.fastabend@...il.com>, hannes@...essinduktion.org,
	Thomas Graf <tgraf@...g.ch>, Tom Herbert <tom@...bertland.com>,
	Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [PATCH v6 04/12] net/mlx4_en: add support for fast rx drop bpf program

On Sun, Jul 10, 2016 at 7:05 PM, Brenden Blanco <bblanco@...mgrid.com> wrote:
> On Sun, Jul 10, 2016 at 06:25:40PM +0300, Tariq Toukan wrote:
>>
>> On 09/07/2016 10:58 PM, Saeed Mahameed wrote:
>> >On Fri, Jul 8, 2016 at 5:15 AM, Brenden Blanco <bblanco@...mgrid.com> wrote:
>> >>+	/* A bpf program gets first chance to drop the packet. It may
>> >>+	 * read bytes but not past the end of the frag.
>> >>+	 */
>> >>+	if (prog) {
>> >>+		struct xdp_buff xdp;
>> >>+		dma_addr_t dma;
>> >>+		u32 act;
>> >>+
>> >>+		dma = be64_to_cpu(rx_desc->data[0].addr);
>> >>+		dma_sync_single_for_cpu(priv->ddev, dma,
>> >>+					priv->frag_info[0].frag_size,
>> >>+					DMA_FROM_DEVICE);
>> >In case of XDP_PASS we will dma_sync again in the normal path; this
>> >can be improved by doing the dma_sync as soon as we can, once and
>> >for all, regardless of the path the packet is going to take
>> >(XDP_DROP/mlx4_en_complete_rx_desc/mlx4_en_rx_skb).
>> I agree with Saeed, dma_sync is a heavy operation that is now done
>> twice for all packets with XDP_PASS.
>> We should try our best to avoid performance degradation in the flow
>> of unfiltered packets.
> Makes sense, do folks here see a way to do this cleanly?
Yes, we need something like:

+static inline void
+mlx4_en_sync_dma(struct mlx4_en_priv *priv,
+		 struct mlx4_en_rx_desc *rx_desc,
+		 int length)
+{
+	dma_addr_t dma;
+	int nr;
+
+	/* Sync dma addresses from HW descriptor */
+	for (nr = 0; nr < priv->num_frags; nr++) {
+		struct mlx4_en_frag_info *frag_info = &priv->frag_info[nr];
+
+		if (length <= frag_info->frag_prefix_size)
+			break;
+
+		dma = be64_to_cpu(rx_desc->data[nr].addr);
+		dma_sync_single_for_cpu(priv->ddev, dma, frag_info->frag_size,
+					DMA_FROM_DEVICE);
+	}
+}

@@ -790,6 +808,10 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 		goto next;
 	}

+	length = be32_to_cpu(cqe->byte_cnt);
+	length -= ring->fcs_del;
+
+	mlx4_en_sync_dma(priv, rx_desc, length);
 	/* data is available continue processing the packet */

and make sure to remove all explicit dma_sync_single_for_cpu calls.