Message-ID: <e52578f3-e93a-1c1a-f898-36ed0c8c621a@alibaba-inc.com>
Date: Tue, 28 Nov 2017 08:28:43 +0800
From: "Yang Shi" <yang.s@...baba-inc.com>
To: Waiman Long <longman@...hat.com>, tglx@...utronix.de
Cc: linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 2/2] lib: debugobjects: touch watchdog to avoid softlockup when !CONFIG_PREEMPT

On 11/27/17 11:36 AM, Waiman Long wrote:
> On 11/27/2017 01:52 PM, Yang Shi wrote:
>>
>>
>> On 11/27/17 10:18 AM, Waiman Long wrote:
>>> On 11/27/2017 12:54 PM, Yang Shi wrote:
>>>> Hi Waiman,
>>>>
>>>> The second patch of this series.
>>>>
>>>> Thanks,
>>>> Yang
>>>>
>>>>
>>>> On 11/17/17 11:43 AM, Yang Shi wrote:
>>>>> There are nested loops on the debug objects free path; sometimes they
>>>>> may take over a hundred thousand iterations, which occasionally causes
>>>>> a soft lockup with !CONFIG_PREEMPT, like below:
>>>>>
>>>>> ...
>>>>>
>>>>> The code path might be called in either atomic or non-atomic context,
>>>>> so touch the softlockup watchdog instead of calling cond_resched(),
>>>>> which might sleep. However, it is unnecessary to touch the watchdog
>>>>> on every iteration, so just touch it every 10000 (best estimate)
>>>>> iterations.
>>>>>
>>>>> Signed-off-by: Yang Shi <yang.s@...baba-inc.com>
>>>
>>> I do have some concern about suppressing the soft lockup warning
>>> entirely. If the system feels unresponsive for a certain period of time
>>> (e.g. 22s), most users would like to know what is going on. It can be a
>>> custom message with less scary warning. Alternatively, some opt-out
>>
>> I'm not sure if it is necessary for debug code since the
>> unresponsiveness is introduced by debug config and is expected
>> somehow, so the user is supposed to know what they are doing, and it
>> sounds preferable to disregard the soft lockup message reported by
>> object debug most of the time.
>>
>
> Yes, it may be normal to cause the soft lockup while running some kind
> of stress tests. However, it isn't normal for a real user application to
> cause the lockup on a debug kernel. That is why I am suggesting some
> kind of opt-out mechanism so that one can turn off the soft lockup
> warning for stress testing.

Yes, I just saw such a lockup when running a stress test, i.e. stress-ng.

>
>> We do have some other examples which suppress soft lockup completely
>> in kernel, i.e. kdb debug, printing some verbose debug or error
>> message, some slow driver code, etc.
>>
>>> mechanism can be added to explicitly disable soft lockup warning for
>>> debugobjs is OK as long as it is not the default.
>> If we really want some opt-out, we should be able to add a proc
>> knob to disable soft lockup as the patch does, but not by default. If
>> this is overkill, we may just add some comment in the Kconfig
>> help text to tell users the side effect.
>
> You may add a new debugfs file under the debug_objects sub-directory for
> suppressing the soft lockup message.

OK, will add a knob in v2.

Thanks,
Yang

>
> Cheers,
> Longman
>
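
For readers following the thread, here is a minimal userspace sketch of the
direction agreed above: touch the watchdog only every 10000 iterations, and
only when an opt-in knob has been set. It is not the v2 patch. The names
suppress_lockup, touch_watchdog_stub() and free_objects_sim() are
hypothetical; in the kernel the stub would correspond to
touch_softlockup_watchdog(), and the flag is assumed to be backed by a
debugfs file under the debug_objects directory.

```c
#include <stdbool.h>

/* Interval taken from the patch description ("best estimate"). */
#define TOUCH_INTERVAL 10000UL

/*
 * Hypothetical opt-in knob. Off by default, so soft lockup warnings are
 * still reported unless the user explicitly asks to suppress them.
 */
static bool suppress_lockup;

/* Counts simulated watchdog touches so the behaviour can be observed. */
static unsigned long watchdog_touches;

/* Stand-in for the kernel's touch_softlockup_watchdog(). */
static void touch_watchdog_stub(void)
{
	watchdog_touches++;
}

/*
 * Simulate the long free-path loop over nr_objects objects.
 * Returns the number of iterations performed.
 */
static unsigned long free_objects_sim(unsigned long nr_objects)
{
	unsigned long i, cnt = 0;

	for (i = 0; i < nr_objects; i++) {
		/* ... per-object checking and freeing would happen here ... */

		/* Touch the watchdog once per TOUCH_INTERVAL iterations. */
		if (suppress_lockup && ++cnt >= TOUCH_INTERVAL) {
			touch_watchdog_stub();
			cnt = 0;
		}
	}
	return i;
}
```

With the knob left at its default, the loop never touches the watchdog, so
a long-running free path would still trigger the warning. Setting it trades
that warning for quieter stress-test runs.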