Message-ID: <CAM_iQpUL9xCrZCjCpcaBMyjSWEJ+1DsFMO6dwJVtVvz=zKJDxA@mail.gmail.com>
Date: Fri, 20 Oct 2017 13:31:43 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Jamal Hadi Salim <jhs@...atatu.com>,
Chris Mi <chrism@...lanox.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Eric Dumazet <edumazet@...gle.com>,
David Miller <davem@...emloft.net>,
Jiri Pirko <jiri@...nulli.us>
Subject: Re: Get rid of RCU callbacks in TC filters?
On Fri, Oct 20, 2017 at 9:56 AM, Paul E. McKenney
<paulmck@...ux.vnet.ibm.com> wrote:
> On Thu, Oct 19, 2017 at 08:26:01PM -0700, Cong Wang wrote:
>> On Wed, Oct 18, 2017 at 12:35 PM, Paul E. McKenney
>> <paulmck@...ux.vnet.ibm.com> wrote:
>> > 5) Keep call_rcu(), but have the RCU callback schedule a workqueue.
>> > The workqueue could then use blocking primitives, for example, acquiring
>> > RTNL.
>>
>> Yeah, this could work too but we would get one more async...
>>
>> filter delete -> call_rcu() -> schedule_work() -> action destroy
>
> True, but on the other hand you get to hold RTNL.
I can get RTNL too by converting call_rcu() to synchronize_rcu(). ;)

So this comes back to the same question: do we mind synchronize_rcu()
on slow paths or not?

Actually, I just tried this approach. It makes it harder for the core
tc filter code to wait for in-flight callbacks: currently rcu_barrier()
is enough, but with one more schedule_work() added we probably need
flush_workqueue() as well... Huh, this also means I can't use the
global workqueue, so I should add a new workqueue for tc filters.

The good news is that I seem to have made it work without adding much
code. Stay tuned. ;)
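
The ordering problem described above can be sketched in user space. This
is only a single-threaded model, not kernel code: model_rcu_barrier() and
model_flush_workqueue() are hypothetical stand-ins for rcu_barrier() and
flush_workqueue(), and the queues, function names, and counter are all
assumptions made for illustration. It shows that draining the RCU
callbacks alone leaves the scheduled work still pending.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only: none of these names are real kernel APIs. */
#define MAX_PENDING 16

typedef void (*deferred_fn)(void *arg);

struct queue {
    deferred_fn fn[MAX_PENDING];
    void *arg[MAX_PENDING];
    size_t head, tail;
};

static struct queue rcu_q;   /* models callbacks queued via call_rcu() */
static struct queue work_q;  /* models a dedicated tc filter workqueue */
int actions_destroyed;       /* how many "action destroy" steps have run */

static void enqueue(struct queue *q, deferred_fn fn, void *arg)
{
    assert(q->tail < MAX_PENDING);
    q->fn[q->tail] = fn;
    q->arg[q->tail] = arg;
    q->tail++;
}

/* Run every pending entry in order, then reset the queue to empty. */
static void drain(struct queue *q)
{
    while (q->head < q->tail) {
        size_t i = q->head++;
        q->fn[i](q->arg[i]);
    }
    q->head = q->tail = 0;
}

/* Stage 3: models process context, where a blocking lock could be taken. */
static void destroy_action_work(void *arg)
{
    (void)arg;
    actions_destroyed++;
}

/* Stage 2: the modeled RCU callback must not block, so it only queues work. */
static void filter_rcu_cb(void *arg)
{
    enqueue(&work_q, destroy_action_work, arg);
}

/* Stage 1: filter delete defers everything to the modeled RCU callback. */
void filter_delete(void *filter)
{
    enqueue(&rcu_q, filter_rcu_cb, filter);
}

/* Waits only for the first deferral stage. */
void model_rcu_barrier(void)
{
    drain(&rcu_q);
}

/* Waits for the second deferral stage. */
void model_flush_workqueue(void)
{
    drain(&work_q);
}
```

In this model, calling filter_delete() twice and then model_rcu_barrier()
leaves actions_destroyed at 0. Only a following model_flush_workqueue()
brings it to 2, which is why a teardown path would need both waits, in
that order.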
>
>> > 6) As with #5, have the RCU callback schedule a workqueue, but aggregate
>> > workqueue scheduling using a timer. This would reduce the number of
>> > RTNL acquisitions.
>>
>> Ouch, sounds like even one more async:
>>
>> filter delete -> call_rcu() -> schedule_work() -> timer -> flush_work()
>> -> action destroy
>>
>> :-(
>
> Indeed, the price of scalability and performance is often added
> asynchronous action at a distance. But sometimes you can have
> scalability, performance, -and- synchronous action. Not sure that this
> is one of those cases, but perhaps someone will come up with some trick
> that we are not yet seeing.
>
> And again, one benefit you get from the added asynchrony is the ability
> to acquire RTNL. Another is increased batching, allowing the overhead
> of acquiring RTNL to be amortized over a larger number of updates.
Understood. My point is that it might not be worth optimizing a slow
path that already holds the RTNL lock...
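
For reference, the amortization argument quoted above can be modeled in a
few lines of user-space C. This is a sketch under stated assumptions, not
the tc filter code: model_lock() is a hypothetical stand-in for taking
RTNL, and struct batch, batch_add(), batch_flush(), and destroy_one() are
names invented for illustration. It only counts how many lock
acquisitions each strategy needs.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only: none of these names are real kernel APIs. */
#define BATCH_MAX 32

struct batch {
    int ids[BATCH_MAX];   /* hypothetical identifiers of removed items */
    size_t len;
};

int lock_acquisitions;    /* how many times the modeled lock was taken */
int destroyed;            /* how many items have been destroyed */

/* Stand-in for acquiring a global lock such as RTNL. */
static void model_lock(void)
{
    lock_acquisitions++;
}

static void model_unlock(void)
{
}

/* Unbatched: one lock acquisition per destroyed item. */
void destroy_one(int id)
{
    (void)id;
    model_lock();
    destroyed++;
    model_unlock();
}

/* Batched: only record the item; no lock is taken here. */
void batch_add(struct batch *b, int id)
{
    assert(b->len < BATCH_MAX);
    b->ids[b->len++] = id;
}

/* Batched: one lock acquisition covers every accumulated item. */
void batch_flush(struct batch *b)
{
    size_t i;

    if (b->len == 0)
        return;
    model_lock();
    for (i = 0; i < b->len; i++)
        destroyed++;
    model_unlock();
    b->len = 0;
}
```

Destroying ten items with destroy_one() takes the modeled lock ten times,
while ten batch_add() calls followed by one batch_flush() take it once.
Whether that saving matters on a path that already holds the lock is the
open question in this thread.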
>
>> > 7) As with #5, have the RCU callback schedule a workqueue, but have each
>> > iterator accumulate a list of things removed and do call_rcu() on the
>> > list. This is an alternative way of aggregating to reduce the number
>> > of RTNL acquisitions.
>>
>> Yeah, this seems working too.
>>
>> > There are many other ways to skin this cat.
>>
>> We still have to pick one. :) Any preference? I want to keep it as simple
>> as possible, otherwise some day I would not understand it either.
>
> I must defer to the people who actually fully understand this code.
I understand that code; I am just not sure which approach I should
pick.
I will keep you Cc'ed on any further changes I make.
Thanks!