linux-kernel - Re: [RFC PATCH] hrtimer: remove deadlock due to waiting on IPI in softirq context


Message-ID: <53179D06.2050707@redhat.com>
Date:	Wed, 05 Mar 2014 16:54:14 -0500
From:	Rik van Riel <riel@...hat.com>
To:	Thomas Gleixner <tglx@...utronix.de>
CC:	linux-kernel@...r.kernel.org, Mateusz Guzik <mguzik@...hat.com>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Ingo Molnar <mingo@...hat.com>,
	Prarit Bhargava <prarit@...hat.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Clark Williams <williams@...hat.com>
Subject: Re: [RFC PATCH] hrtimer: remove deadlock due to waiting on IPI in
 softirq context

On 03/05/2014 04:51 PM, Thomas Gleixner wrote:
> On Wed, 5 Mar 2014, Rik van Riel wrote:
>> There appears to be a deadlock in the hrtimer code. Specifically,
>> clock_was_set() calls an IPI with wait=1, from softirq context.
>
> This should not be called from softirq context.
>
>> Waiting for IPIs to complete in irq context can lead to a deadlock,
>> because the current code (that was interrupted) might be holding some
>> kind of lock, that another CPU is waiting for with spin_lock_irq or
>> similar.
>>
>> In other words, the current CPU may need to release a resource, before
>> the IPI can be handled by one of the destination CPUs.
>>
>> To my untrained eye, it does not look like this patch introduces a
>> new bug to the timer code, but that is hard to ascertain with the
>> timer code, so I am posting this as an RFC for the timer gods to hurt
>> their brains on :)
>>
>> This bug was introduced by 54cdfdb4 in early 2007 (the original
>> hrtimer code patch).
>
> Right and we had some issues with that until we moved the calls to
> clock_was_set() out of lock held regions.

Ahh indeed, the bug got fixed already :)

> The only call which happens from interrupt context is in
> update_wall_time(). And that one definitely holds no locks which are
> relevant.
>
> On which kernel are you observing the issue?

This was RHEL6, and I saw that the immediate function
was still the same upstream.

I forgot to check that clock_was_set() is now called
in a different way. My bad.
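
For readers following the thread, the deadlock described in the quoted
report can be sketched as a wait-for cycle between two CPUs. What follows
is a userspace toy model in C, not kernel code: struct cpu_state,
spin_on_lock_irqs_off(), send_ipi_and_wait() and is_deadlocked() are
hypothetical names made up for illustration. It assumes that a CPU
spinning with interrupts disabled (as with spin_lock_irq) cannot
acknowledge an IPI, and that a sender using wait=1 blocks until the
target acknowledges.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4
#define NO_CPU  (-1)

/* Toy model only: each CPU is blocked on at most one other CPU. */
struct cpu_state {
	int  waits_for;	/* CPU this one is blocked on, or NO_CPU */
	bool irqs_off;	/* true while spinning with interrupts disabled */
};

static void init_cpus(struct cpu_state cpus[NR_CPUS])
{
	for (int i = 0; i < NR_CPUS; i++) {
		cpus[i].waits_for = NO_CPU;
		cpus[i].irqs_off = false;
	}
}

/* 'waiter' spins, interrupts off, on a lock currently held by 'owner'. */
static void spin_on_lock_irqs_off(struct cpu_state cpus[NR_CPUS],
				  int waiter, int owner)
{
	assert(waiter >= 0 && waiter < NR_CPUS);
	assert(owner >= 0 && owner < NR_CPUS);
	cpus[waiter].waits_for = owner;
	cpus[waiter].irqs_off = true;
}

/*
 * 'sender' issues an IPI with wait=1 to 'target'. In this model the
 * sender stays blocked only if the target has interrupts disabled;
 * otherwise the IPI is assumed to be handled promptly.
 */
static void send_ipi_and_wait(struct cpu_state cpus[NR_CPUS],
			      int sender, int target)
{
	assert(sender >= 0 && sender < NR_CPUS);
	assert(target >= 0 && target < NR_CPUS);
	if (cpus[target].irqs_off)
		cpus[sender].waits_for = target;
}

/* Returns true if following waits_for edges from 'start' leads back to it. */
static bool is_deadlocked(const struct cpu_state cpus[NR_CPUS], int start)
{
	int cur = cpus[start].waits_for;

	for (int steps = 0; steps < NR_CPUS && cur != NO_CPU; steps++) {
		if (cur == start)
			return true;
		cur = cpus[cur].waits_for;
	}
	return false;
}
```

In this model, if CPU 1 spins with interrupts off on a lock held by the
code that CPU 0 interrupted, and CPU 0 then sends a wait=1 IPI to CPU 1,
is_deadlocked() reports a cycle for both CPUs. If CPU 0 holds no lock
that anyone is spinning on, which is the upstream situation described
above, no cycle forms.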