Message-ID: <20160902175006.GA14176@gmail.com>
Date: Fri, 2 Sep 2016 10:50:10 -0700
From: Brenden Blanco <bblanco@...mgrid.com>
To: Tom Herbert <tom@...bertland.com>
Cc: Saeed Mahameed <saeedm@....mellanox.co.il>,
Tariq Toukan <tariqt@...lanox.com>,
"David S. Miller" <davem@...emloft.net>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Tariq Toukan <ttoukan.linux@...il.com>,
Or Gerlitz <gerlitz.or@...il.com>
Subject: Re: [PATCH] net/mlx4_en: protect ring->xdp_prog with rcu_read_lock
On Thu, Sep 01, 2016 at 04:30:28PM -0700, Tom Herbert wrote:
[...]
> > Yep, but this is an unlikely condition, and the critical code here is
> > much smaller; it is also clearer that the rcu_read_lock here is meant
> > to protect ring->xdp_prog within this small xdp critical section,
> > compared to your patch, where it is held across the whole RX
> > function.
>
> Note that there is already an rcu_read_lock, potentially per packet,
> buried in the function; if the whole function is under rcu_read_lock,
> then that one can be removed.
Yes, I was aware of that; I had left it as-is since: 1. it seemed to be
on an exception path and less performance-sensitive to the nested call,
and 2. had I removed it and some future developer later dropped the
top-level rcu_read_lock, the code that the finer-grained one protects
would have been left unprotected unless that change were reviewed
carefully.
I'll instead add a note at the top pointing out the dual need for the
lock, to address both your comments and Saeed's.
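For illustration, the pattern I have in mind looks roughly like the
sketch below. This is not the actual mlx4 code; the struct and function
names (my_ring, my_process_rx) are stand-ins, and the descriptor
handling is elided:

	/* A single rcu_read_lock() at the top of the RX function covers
	 * both the xdp_prog dereference and any nested RCU usage deeper
	 * in the receive path.
	 */
	struct my_ring {
		struct bpf_prog __rcu *xdp_prog;
		/* ... rest of the ring state ... */
	};

	static int my_process_rx(struct my_ring *ring, int budget)
	{
		struct bpf_prog *xdp_prog;
		int polled = 0;

		rcu_read_lock();
		xdp_prog = rcu_dereference(ring->xdp_prog);

		while (polled < budget) {
			/* ... fetch the next completed descriptor ... */
			if (xdp_prog) {
				/* run the XDP program here; XDP_DROP and
				 * XDP_TX are handled in place, XDP_PASS
				 * falls through to normal delivery */
			}
			/* ... build the skb and hand it to the stack,
			 * which may take its own nested rcu_read_lock ... */
			polled++;
		}

		rcu_read_unlock();
		return polled;
	}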
As a side note, while considering the idea of moving the rcu_read_lock
to a more generic location (napi), I toyed with the idea of benchmarking
whether removing the rcu_read_lock that actually sits in the fast path,
in netif_receive_skb_internal, could yield any performance benefit for
the universal (non-xdp) use case. However, that seems completely out of
scope at the moment, would only benefit non-standard (IMO) .configs, and
would be much harder to review. It was showing up in perf at about 1-2%
overhead in preempt=y kernels.
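Roughly, that idea would have looked like the sketch below: take the
read lock once around the napi poll callback, so the per-packet
rcu_read_lock() in netif_receive_skb_internal() becomes redundant. The
function name (my_napi_poll) is a stand-in, not actual net/core/dev.c
code:

	static int my_napi_poll(struct napi_struct *n, int budget)
	{
		int work;

		/* hold the RCU read lock across the whole poll; the
		 * driver RX path and stack delivery run inside it */
		rcu_read_lock();
		work = n->poll(n, budget);
		rcu_read_unlock();

		return work;
	}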
>
> Tom