lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 26 Jan 2010 14:46:36 -0600
From:	Jason Wessel <jason.wessel@...driver.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
CC:	mingo@...hat.com, hpa@...or.com, linux-kernel@...r.kernel.org,
	johnstul@...ibm.com, schwidefsky@...ibm.com, tglx@...utronix.de,
	linux-tip-commits@...r.kernel.org
Subject: Re: [tip:timers/urgent] clocksource: Prevent potential kgdb dead
 lock

Andrew Morton wrote:
> On Tue, 26 Jan 2010 14:09:45 GMT tip-bot for Thomas Gleixner <tglx@...utronix.de> wrote:
>
>   
>> --- a/kernel/time/clocksource.c
>> +++ b/kernel/time/clocksource.c
>> @@ -343,7 +343,19 @@ static void clocksource_resume_watchdog(void)
>>  {
>>  	unsigned long flags;
>>  
>> -	spin_lock_irqsave(&watchdog_lock, flags);
>> +	/*
>> +	 * We use trylock here to avoid a potential dead lock when
>> +	 * kgdb calls this code after the kernel has been stopped with
>> +	 * watchdog_lock held. When watchdog_lock is held we just
>> +	 * return and accept, that the watchdog might trigger and mark
>> +	 * the monitored clock source (usually TSC) unstable.
>> +	 *
>> +	 * This does not affect the other caller clocksource_resume()
>> +	 * because at this point the kernel is UP, interrupts are
>> +	 * disabled and nothing can hold watchdog_lock.
>> +	 */
>> +	if (!spin_trylock_irqsave(&watchdog_lock, flags))
>> +		return;
>>  	clocksource_reset_watchdog();
>>  	spin_unlock_irqrestore(&watchdog_lock, flags);
>>  }
>>
>> @@ -458,8 +470,8 @@ void clocksource_resume(void)
>>   * clocksource_touch_watchdog - Update watchdog
>>   *
>>   * Update the watchdog after exception contexts such as kgdb so as not
>> - * to incorrectly trip the watchdog.
>> - *
>> + * to incorrectly trip the watchdog. This might fail when the kernel
>> + * was stopped in code which holds watchdog_lock.
>>   */
>>  void clocksource_touch_watchdog(void)
>>  {
>>     
>
> It would be neater and a shade more reliable to add a separate
> clocksource_try_touch_watchdog() for kgdb's use.  Which could be inside
> #ifdef CONFIG_KGDB.
>
>   

The kernel debugger is the only user of this API function so
realistically you do not need a new function, and could simply rename
this one, if the name does not fit.

-- more notes on the problem --

I further diagnosed the problem, and it is reasonably rare that it
happens in the first place.  I had an instrumented version of the kernel
debugger which showed the problem happening quite frequently, while
trying to fix hw breakpoints.

The problem is not actually an interprocessor deadlock, but that of
re-entering the spin_lock() when it is already held by the interrupted
task on the same cpu.  There is no obvious way that I could see to check
for this specific condition and to abort.

For now the patch Thomas created is sufficient, and given some time
we'll decide how to best live with the side effects or to consider any
further changes.

Jason.

Tested-by: Jason Wessel <jason.wessel@...driver.com>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ