Message-ID: <d37e86bb-3ddd-2c61-bbbe-ebb5cc4801dc@arm.com>
Date: Mon, 23 Mar 2020 08:28:13 +0000
From: Cristian Marussi <cristian.marussi@....com>
To: Lukasz Luba <lukasz.luba@....com>, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, sudeep.holla@....com,
james.quinlan@...adcom.com, Jonathan.Cameron@...wei.com
Subject: Re: [PATCH v4 07/13] firmware: arm_scmi: Add notification dispatch
and delivery
Hi
On 3/18/20 8:26 AM, Lukasz Luba wrote:
> Hi Cristian,
>
> On 3/16/20 2:46 PM, Cristian Marussi wrote:
>> On Thu, Mar 12, 2020 at 09:43:31PM +0000, Lukasz Luba wrote:
>>>
>>>
>>> On 3/12/20 6:34 PM, Cristian Marussi wrote:
>>>> On 12/03/2020 13:51, Lukasz Luba wrote:
>>>>> Hi Cristian,
>>>>>
>> Hi Lukasz
>>
>>>>> just one comment below...
>> [snip]
>>>>>> + eh.timestamp = ts;
>>>>>> + eh.evt_id = evt_id;
>>>>>> + eh.payld_sz = len;
>>>>>> + kfifo_in(&r_evt->proto->equeue.kfifo, &eh, sizeof(eh));
>>>>>> + kfifo_in(&r_evt->proto->equeue.kfifo, buf, len);
>>>>>> + queue_work(r_evt->proto->equeue.wq,
>>>>>> + &r_evt->proto->equeue.notify_work);
>>>>>
>>>>> Is it safe to ignore the return value from the queue_work here?
>>>>>
>>>>
[snip]
>> On the other hand, considering the impact of such a scenario, it is not simply
>> that delivery could be delayed: if the delayed event happens to be the last one
>> ever, it would remain undelivered forever. This is particularly worrying when
>> that last event is an important one: imagine a system shutdown where a final
>> system-power-off notification remains undelivered.
>
> Agree, another example could be a thermal notification for some critical
> trip point.
>
>>
>> As a consequence, I think this rare race condition should be addressed somehow.
>>
>> Looking at this scenario, it seems the classic situation in which you would
>> want to use some sort of completion to avoid missing event delivery, BUT in
>> our use case:
>>
>> - placing the workers loaned from cmwq into an unbounded wait_for_completion()
>>   once the queue is empty does not seem the best use of resources (and is
>>   probably frowned upon)...using a few dedicated kernel threads just to let
>>   them sit idle most of the time seems equally frowned upon (I could be wrong...)
>> - the needed complete() in the ISR would introduce a spinlock_irqsave into the
>>   interrupt path (there's already one inside queue_work in fact), so it is not
>>   desirable, at least not if used on a regular basis (for each event notified)
>>
>> So I was thinking to try to significantly reduce the above race window, rather
>> than eliminate it completely, by adding an "exiting" flag to be checked under
>> specific conditions, so as to retry the queue_work() a few times when the race
>> is hit, something like:
>>
>> ISR (core N)             |  WQ (core N+1)
>> -----------------------------------------------------------------------------------------------
>>                          |  atomic_set(&exiting, 0);
>>                          |
>>                          |  do {
>>                          |      ...
>>                          |      if (queue_is_empty)               - WORK_PENDING 0 events queued
>>                          +          atomic_set(&exiting, 1)       - WORK_PENDING 0 events queued
>> static int cnt=3         |          --> breakout of while         - WORK_PENDING 0 events queued
>> kfifo_in()               |      ....
>>                          |  } while (scmi_process_event_payload);
>> kfifo_in()               |
>> exiting = atomic_read()  |  ...cmwq backing out                   - WORK_PENDING 1 events queued
>> do {                     |  ...cmwq backing out                   - WORK_PENDING 1 events queued
>>     ret = queue_work()   |  ...cmwq backing out                   - WORK_PENDING 1 events queued
>>     if (ret || !exiting) |  ...cmwq backing out                   - WORK_PENDING 1 events queued
>>         break;           |  ...cmwq backing out                   - WORK_PENDING 1 events queued
>>     mdelay(5);           |  ...cmwq backing out                   - WORK_PENDING 1 events queued
>>     exiting =            |  ...cmwq backing out                   - WORK_PENDING 1 events queued
>>         atomic_read;     |  ...cmwq backing out                   - WORK_PENDING 1 events queued
>> } while (--cnt);         |  ...cmwq backing out                   - WORK_PENDING 1 events queued
>>                          |  ---- WORKER EXIT                      - !WORK_PENDING 0 events queued
>>
>> like down below between the scissors.
>>
>> Not tested or tried yet...I could be missing something...and the mdelay is
>> horrible (and probably not the cleanest thing you've ever seen :D)...I'll have
>> a chat with Sudeep too.
>
> Indeed it looks more complicated. If you like, I can join your offline
> discussion when Sudeep is back.
>
Yes, this is as of now my main remaining issue to address for v6.
I'll wait for Sudeep's general review/feedback and raise this point; for
reference, a rough sketch of the flag-plus-retry idea follows below.
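
Just to make its shape concrete, here is a minimal, untested sketch of the
dispatcher/ISR pairing I have in mind (all struct, field and function names
below are purely illustrative placeholders, not the code between the scissors
in my previous mail):

#include <linux/atomic.h>
#include <linux/delay.h>
#include <linux/kernel.h>
#include <linux/kfifo.h>
#include <linux/types.h>
#include <linux/workqueue.h>

/* Illustrative container only; the real field names live in the patch. */
struct scmi_events_queue {
	struct kfifo		kfifo;
	struct work_struct	notify_work;
	struct workqueue_struct	*wq;
	atomic_t		exiting;
};

/* Deferred side: drain the kfifo, flagging when we are about to bail out. */
static void scmi_events_dispatcher(struct work_struct *work)
{
	struct scmi_events_queue *eq =
		container_of(work, struct scmi_events_queue, notify_work);

	atomic_set(&eq->exiting, 0);
	smp_mb__after_atomic();

	do {
		u8 buf[128];
		unsigned int n;

		if (kfifo_is_empty(&eq->kfifo)) {
			/*
			 * Tell a racing ISR that its queue_work() may have
			 * been swallowed by a still-set WORK_PENDING while
			 * we were already on our way out.
			 */
			atomic_set(&eq->exiting, 1);
			smp_mb__after_atomic();
			break;
		}

		/*
		 * The real dispatcher parses an event header first and then
		 * its payload; raw bytes are drained here just to keep the
		 * sketch short.
		 */
		n = kfifo_out(&eq->kfifo, buf, sizeof(buf));
		if (!n)
			break;

		/* ...deliver the dequeued event to registered callbacks... */
	} while (1);
}

/* ISR side: retry queue_work() a few times only when the race is hit. */
static void scmi_notify_event(struct scmi_events_queue *eq,
			      const void *hdr, size_t hdr_sz,
			      const void *payld, size_t payld_sz)
{
	int cnt = 3;
	bool exiting;

	kfifo_in(&eq->kfifo, hdr, hdr_sz);
	kfifo_in(&eq->kfifo, payld, payld_sz);

	exiting = atomic_read(&eq->exiting);
	do {
		/*
		 * queue_work() returning false while the dispatcher has
		 * already passed its emptiness check is the lost-event
		 * window; retry only in that case.
		 */
		if (queue_work(eq->wq, &eq->notify_work) || !exiting)
			break;

		mdelay(5);	/* horrible, as said; it only narrows the window */
		exiting = atomic_read(&eq->exiting);
	} while (--cnt);
}

The mdelay()+retry path is only taken when queue_work() fails while the worker
has already flagged itself as exiting, so the common case remains a single
queue_work() call from the ISR.
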
Regards
Cristian
> Regards,
> Lukasz