Message-ID: <38602241-3d4e-7348-526c-80b44fb4cba7@quicinc.com>
Date: Wed, 13 Apr 2022 19:48:12 +0530
From: Mukesh Ojha <quic_mojha@...cinc.com>
To: Greg KH <gregkh@...uxfoundation.org>,
<linux-remoteproc@...r.kernel.org>
CC: <linux-kernel@...r.kernel.org>, <tglx@...utronix.de>,
<sboyd@...nel.org>, <johannes@...solutions.net>,
<rafael@...nel.org>
Subject: Re: Possible race in dev_coredumpm()-del_timer() path
On 4/13/2022 4:28 PM, Greg KH wrote:
> On Wed, Apr 13, 2022 at 03:46:39PM +0530, Mukesh Ojha wrote:
>> On Wed, Apr 13, 2022 at 07:34:24AM +0200, Greg KH wrote:
>>> On Wed, Apr 13, 2022 at 10:59:22AM +0530, Mukesh Ojha wrote:
>>>> Hi All,
>>>>
>>>> We are hitting a race that leaves try_to_grab_pending() stuck.
>>>
>>> What kernel version are you using?
>>
>> 5.10
>
> 5.10.0 was released a very long time ago. Please use a more modern
> kernel release :)
>
Switching to the latest kernel is not feasible for us, and I think
this issue could be present in recent kernels as well.
>> Sorry for the formatting mess.
>>
>>>> In the following scenario, while (p1) dev_coredumpm() is running, the devcd
>>>> device is added to the framework and a uevent notification is sent to
>>>> userspace. That results in a call to (p2) devcd_data_write(), which
>>>> eventually tries to delete the queued timer; in the racy scenario the
>>>> timer is not queued yet.
>>>> So debug objects reports a warning, and in the meantime the timer is
>>>> initialized and queued from the p1 path.
>>>> Then, from the p2 path, it gets overridden again with
>>>> timer->entry.pprev=NULL, and try_to_grab_pending() gets stuck.
>>   p1                                      p2(X)
>>
>>   dev_coredump()                          uevent sent to userspace
>>   device_add() =========================> userspace process X reads the uevents
>>                                           writes to devcd fd which
>>                                           results into writes to
>>
>>                                           devcd_data_write()
>>                                             mod_delayed_work()
>>                                               try_to_grab_pending()
>>                                                 del_timer()
>>                                                   debug_assert_init()
>>   INIT_DELAYED_WORK
>>   schedule_delayed_work
>>                                                   debug_object_fixup()
>
> Why do you have object debugging enabled?
We have enabled object debugging to catch more issues across the kernel.
> That's going to take a LONG
> time, and will find bugs in your code. Perhaps like this one?
>
> What type of device is this? What bus? What driver?
A remoteproc client device driver calls dev_coredumpm(), and the devcd
device gets added as part of that call.
>
> And if you turn object debugging off, what happens?
We have not observed the issue after turning object debugging off.
Regards,
Mukesh
>
> thanks,
>
> greg k-h