Message-Id: <ACF5E134-0C17-484B-AF9E-D4E6FAC9535F@joelfernandes.org>
Date: Thu, 3 Nov 2022 14:36:30 -0400
From: Joel Fernandes <joel@...lfernandes.org>
To: paulmck@...nel.org
Cc: Uladzislau Rezki <urezki@...il.com>, rcu@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] rcu/kfree: Do not request RCU when not needed
> On Nov 3, 2022, at 1:51 PM, Paul E. McKenney <paulmck@...nel.org> wrote:
>
> On Thu, Nov 03, 2022 at 01:41:43PM +0100, Uladzislau Rezki wrote:
>>>>>> /**
>>>>>> @@ -3066,10 +3068,12 @@ static void kfree_rcu_work(struct work_struct *work)
>>>>>> struct kfree_rcu_cpu_work *krwp;
>>>>>> int i, j;
>>>>>>
>>>>>> - krwp = container_of(to_rcu_work(work),
>>>>>> + krwp = container_of(work,
>>>>>> struct kfree_rcu_cpu_work, rcu_work);
>>>>>> krcp = krwp->krcp;
>>>>>>
>>>>>> + cond_synchronize_rcu(krwp->gp_snap);
>>>>>
>>>>> Might this provoke OOMs in case of callback flooding?
>>>>>
>>>>> An alternative might be something like this:
>>>>>
>>>>> if (!poll_state_synchronize_rcu(krwp->gp_snap)) {
>>>>> queue_rcu_work(system_wq, &krwp->rcu_work);
>>>>> return;
>>>>> }
>>>>>
>>>>> Either way gets you a non-lazy callback in the case where a grace
>>>>> period has not yet elapsed.
>>>>> Or am I missing something that prevents OOMs here?
>>>>
>>>> The memory consumption appears to be much less in his testing with the onslaught of kfree, which probably makes OOM less likely.
>>>>
>>>> Though, was your reasoning that in case of a grace period not elapsing, we need a non-lazy callback queued, so as to make the reclaim happen sooner?
>>>>
>>>> If so, the cond_synchronize_rcu() should already be conditionally queueing a non-lazy CB, since we don’t make synchronous users wait for seconds. Or did I miss something?
>>>
>>> My concern is that the synchronize_rcu() will block a kworker kthread
>>> for some time, and that in callback-flood situations this might slow
>>> things down due to exhausting the supply of kworkers.
>>>
>> This concern applies in both cases, i.e. in the default configuration
>> and with the posted patch. The reclaim work, named kfree_rcu_work(),
>> only makes progress once a GP has passed, so that rcu_work_rcufn()
>> can queue our reclaim kworker.
>>
>> As it is now:
>>
>> 1. Collect pointers; once we decide to drop them, we queue the
>>    monitor_work() worker to the system_wq.
>>
>> 2. The monitor work, kfree_rcu_work(), tries to attach, or in other
>>    words move, the "backlog" to the "free" channels.
>>
>> 3. It invokes queue_rcu_work(), which does call_rcu_flush(), and the
>>    callback handler in turn queues our worker. So the worker runs
>>    only after a GP has passed.
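For reference, step [3] today boils down to something like this
(simplified sketch, eliding the channel management and locking; names
follow the quoted code above):

	/* Hand the batch to RCU. rcu_work_rcufn() then queues the
	 * reclaim worker from the RCU callback, so kfree_rcu_work()
	 * runs only after a full grace period has elapsed. */
	queue_rcu_work(system_wq, &krwp->rcu_work);
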
>
> So as it is now, we are not tying up a kworker kthread while waiting
> for the grace period, correct? We instead have an RCU callback queued
> during that time, and the kworker kthread gets involved only after the
> grace period ends.
>
>> With the patch:
>>
>> Steps [1] and [2] are the same, but in the third step we:
>>
>> 1. Record the GP state for the last-in channel;
>> 2. Directly queue the drain work without any call_rcu() helpers;
>> 3. On reclaim-worker entry, check whether the GP has passed;
>> 4. If not, invoke synchronize_rcu().
>
> And #4 changes that, by (sometimes) tying up a kworker kthread for the
> full grace period.
>
>> The patch eliminates the extra steps by not going via the RCU-core
>> route; instead it directly queues the reclaim worker, which either
>> proceeds or waits for a GP as needed.
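As I read the patch, the reclaim-worker entry then looks roughly like
this (sketch based on the quoted diff above; gp_snap is the cookie
recorded earlier via get_state_synchronize_rcu()):

	krwp = container_of(work, struct kfree_rcu_cpu_work, rcu_work);
	krcp = krwp->krcp;

	/* Block this kworker until the recorded grace period has
	 * elapsed, unless it already has. */
	cond_synchronize_rcu(krwp->gp_snap);
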
>
> I agree that the use of the polled API could be reducing delays, which
> is a good thing. Just being my usual greedy self and asking "Why not
> both?", that is use queue_rcu_work() instead of synchronize_rcu() in
> conjunction with the polled APIs so as to avoid both the grace-period
> delay and the tying up of the kworker kthread.
>
> Or am I missing something here?
Yeah, I am with Paul on this: NAK on “blocking in kworker” instead of “checking for the grace period and queuing either regular work or RCU work”. Note that blocking also adds a pointless and fully avoidable scheduler round trip.
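
Something along these lines, perhaps (untested sketch of the “why not
both” idea; it assumes gp_snap was recorded with
get_state_synchronize_rcu() when the batch was filled):

	/* At queue time: if the recorded grace period has already
	 * elapsed, skip RCU entirely and queue the reclaim worker
	 * directly; otherwise let queue_rcu_work() post a (non-lazy)
	 * callback and run the worker once the GP ends. Either way,
	 * the kworker itself never blocks on a grace period. */
	if (poll_state_synchronize_rcu(krwp->gp_snap))
		queue_work(system_wq, &krwp->rcu_work.work);
	else
		queue_rcu_work(system_wq, &krwp->rcu_work);
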
- Joel
>
> Thanx, Paul