Message-ID: <364309dc-60d6-3250-b77b-f27935ab41a0@quicinc.com>
Date: Tue, 14 Nov 2023 18:00:58 +0530
From: Mukesh Ojha <quic_mojha@...cinc.com>
To: Tejun Heo <tj@...nel.org>
CC: <myungjoo.ham@...sung.com>, <kyungmin.park@...sung.com>,
<cw00.choi@...sung.com>, <jstultz@...gle.com>,
<tglx@...utronix.de>, <sboyd@...nel.org>, <jiangshanlai@...il.com>,
<linux-kernel@...r.kernel.org>, <linux-pm@...r.kernel.org>
Subject: Re: timer list corruption in devfreq
On 11/10/2023 12:38 AM, Tejun Heo wrote:
> Hello,
>
> On Wed, Nov 08, 2023 at 09:39:57PM +0530, Mukesh Ojha wrote:
>> We are facing an issue on 6.1 kernel while using devfreq framework
>> and looks like the devfreq_monitor_stop()/devfreq_monitor_start is
>> vulnerable if frequent governor change is being done from user space
>> in a loop.
>>
>> echo simple_ondemand > /sys/class/devfreq/1d84000.ufshc/governor
>> echo performance > /sys/class/devfreq/1d84000.ufshc/governor
>>
>> Here, we are using ufs device, but could be any device.
>>
>> Issue is because same instance of timer is being queued from two
>> places one from devfreq_monitor() and one from devfreq_monitor_start() as
>> cancel_delayed_work_sync() from devfreq_monitor_stop() was not
>> able to delete the delayed work time completely due to which
>> devfreq_monitor() work rearmed the same timer.
>>
>> But there looks to be issue in the timer framework where
>> it was initially discussed in [1] and later fixed in [2]
>> but not sure being whether is it issue in cancel_delayed_work_sync()
>> where del_timer() inside try_to_grab_pending() need to be replaced
>> with timer_delete[_sync]() or devfreq_monitor_stop() need to use
>> this api's and then delete the work.
>
> So, having shutdown can be more convenient in some cases and that'd be a
> useful addition to workqueue both for immediate and delayed work items. That
> said, that's usually not essential in fixing these issues - e.g. Can't you
> just synchronize devfreq_monitor_start() and stop()?
Thanks for the feedback.

This issue can be fixed by synchronizing devfreq_monitor_start() and
devfreq_monitor_stop(). I have posted a patch here:
https://lore.kernel.org/all/1699957648-31299-1-git-send-email-quic_mojha@quicinc.com/

However, it forces the client to add a check in the delayed work
callback so that it does not queue the delayed work timer again.
There can also be cases where del_timer() in the sequence below [1]
returns 'false', yet the caller does not want another instance of the
timer to be queued after a call to cancel_delayed_work_sync(). That
could be achieved with a timer_shutdown() version of
__cancel_work_timer(), or maybe by introducing a separate
__cancel_work_timer_shutdown().
[1]
__cancel_work_timer=>try_to_grab_pending=>del_timer()
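
To make the two options above concrete, here is a small single-threaded
userspace model in C. It is only a sketch: every name in it (model_timer,
model_monitor, model_replay, and so on) is hypothetical and none of it is
kernel API. Locking is left out, and "shutdown" is modelled simply as a
state in which arming the timer is refused. The model replays the
sequence start, timer fires, stop, stale callback, start, and counts how
often the timer is armed while it is already pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a delayed-work timer; not kernel code. */
struct model_timer {
    bool pending;     /* timer is currently queued */
    bool shutdown;    /* shutdown-style state: arming is refused */
    int double_arms;  /* times the timer was armed while already pending */
};

struct model_monitor {
    struct model_timer timer;
    bool stop_polling; /* client-side flag checked by the callback */
};

/* Arm the timer; a double arm is counted instead of corrupting a list. */
static bool model_arm(struct model_timer *t)
{
    if (t->shutdown)
        return false;
    if (t->pending) {
        t->double_arms++;
        return false;
    }
    t->pending = true;
    return true;
}

/* The timer expires: it is no longer pending and the callback will run. */
static void model_fire(struct model_timer *t)
{
    assert(t->pending);
    t->pending = false;
}

/* Option 1: stop sets a flag that the callback has to honour. */
static void model_stop_with_flag(struct model_monitor *m)
{
    m->stop_polling = true;
    m->timer.pending = false; /* stands in for cancelling the work */
}

/* Option 2: stop shuts the timer down; no client-side check is needed. */
static void model_stop_with_shutdown(struct model_monitor *m)
{
    m->timer.shutdown = true;
    m->timer.pending = false;
}

/* Start clears both states (assumed to stand in for re-initialising
 * the timer before reuse) and arms the timer once. */
static void model_start(struct model_monitor *m)
{
    m->stop_polling = false;
    m->timer.shutdown = false;
    model_arm(&m->timer);
}

/* Re-arm step of the work callback; check_flag selects whether the
 * client-side stop_polling check is present. */
static bool model_work(struct model_monitor *m, bool check_flag)
{
    if (check_flag && m->stop_polling)
        return false;
    return model_arm(&m->timer);
}

/* Replay: start, fire, stop, stale callback, start.
 * Returns the number of double arms observed. */
static int model_replay(bool check_flag, bool use_shutdown)
{
    struct model_monitor m = {0};

    model_start(&m);
    model_fire(&m.timer);
    if (use_shutdown)
        model_stop_with_shutdown(&m);
    else
        model_stop_with_flag(&m);
    model_work(&m, check_flag); /* stale callback tries to re-arm */
    model_start(&m);
    return m.timer.double_arms;
}
```

In this model, model_replay(false, false) reports one double arm, while
model_replay(true, false) and model_replay(false, true) report none. The
second case needs the extra check inside the callback; the third moves
that responsibility into the cancel path.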
Let me know if anything is wrong with my understanding.
-Mukesh
>
> Thanks.
>