linux-kernel

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-Id: <1650680264.0sj0tnkzx7.astroid@bobo.none>
Date:   Sat, 23 Apr 2022 12:29:30 +1000
From:   Nicholas Piggin <npiggin@...il.com>
To:     Michael Ellerman <mpe@...erman.id.au>, paulmck@...nel.org,
        Thomas Gleixner <tglx@...utronix.de>,
        Zhouyi Zhou <zhouzhouyi@...il.com>
Cc:     Daniel
 Lezcano <daniel.lezcano@...aro.org>,
        linux-kernel@...r.kernel.org,
        linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
        Miguel Ojeda <miguel.ojeda.sandonis@...il.com>,
        rcu <rcu@...r.kernel.org>,
        Viresh
 Kumar <viresh.kumar@...aro.org>
Subject: Re:

Excerpts from Thomas Gleixner's message of April 23, 2022 1:53 am:
> On Wed, Apr 13 2022 at 15:11, Nicholas Piggin wrote:
>> So we traced the problem down to possibly a misunderstanding between 
>> decrementer clock event device and core code.
>>
>> The decrementer is only oneshot*ish*. It actually needs to either be 
>> reprogrammed or shut down otherwise it just continues to cause 
>> interrupts.
> 
> I always thought that PPC had sane timers. That's really disillusioning.

My comment was probably a bit misleading explanation of the whole
situation. This weirdness is actually in software in the powerpc
clock event driver due to a recent change I made assuming the clock 
event goes to oneshot-stopped.

The hardware is relatively sane I think, global synchronized constant
rate high frequency clock distributed to the CPUs so reads don't
go off-core. And per-CPU "decrementer" event interrupt at the same
frequency as the clock -- program it to a +ve value and it decrements
until zero then creates basically a level triggered interrupt.

Before my change, the decrementer interrupt would always clear the
interrupt at entry. The event_handler usually programs another
timer in so I tried to avoid that first clear counting on the
oneshot_stopped callback to clear the interrupt if there was no
other timer.

>> Before commit 35de589cb879, it was sort of two-shot. The initial 
>> interrupt at the programmed time would set its internal next_tb variable 
>> to ~0 and call the ->event_handler(). If that did not set_next_event or 
>> stop the timer, the interrupt will fire again immediately, notice 
>> next_tb is ~0, and only then stop the decrementer interrupt.
>>
>> So that was already kind of ugly, this patch just turned it into a hang.
>>
>> The problem happens when the tick is stopped with an event still 
>> pending, then tick_nohz_handler() is called, but it bails out because 
>> tick_stopped == 1 so the device never gets programmed again, and so it 
>> keeps firing.
>>
>> How to fix it? Before commit a7cba02deced, powerpc's decrementer was 
>> really oneshot, but we would like to avoid doing that because it requires 
>> additional programming of the hardware on each timer interrupt. We have 
>> the ONESHOT_STOPPED state which seems to be just about what we want.
>>
>> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason 
>> we don't stop it here? This patch seems to fix the hang (not heavily
>> tested though).
> 
> This was definitely overlooked, but it's arguable it is is not required
> for real oneshot clockevent devices. This should only handle the case
> where the interrupt was already pending.
> 
> The ONESHOT_STOPPED state was introduced to handle the case where the
> last timer gets canceled, so the already programmed event does not fire.
> 
> It was not necessarily meant to "fix" clockevent devices which are
> pretending to be ONESHOT, but keep firing over and over.
> 
> That, said. I'm fine with the change along with a big fat comment why
> this is required.

Thanks for taking a look and confirming. I just sent a patch with a
comment and what looks like another missed case. Hopefully it's okay.

Thanks,
Nick