linux-kernel - Re: [PATCH] Revert "timers: Don't wake ktimersoftd on every tick"

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <b1486490-5a96-7ba3-c280-5d5622c8c488@ni.com>
Date:   Fri, 2 Jun 2017 09:37:04 -0500
From:   Haris Okanovic <haris.okanovic@...com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Anna-Maria Gleixner <anna-maria@...utronix.de>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org,
        julia.cartwright@...com, gratian.crisan@...com
Subject: Re: [PATCH] Revert "timers: Don't wake ktimersoftd on every tick"

On 05/26/2017 03:50 PM, Thomas Gleixner wrote:
> On Fri, 26 May 2017, Haris Okanovic wrote:
>
>> Oh crap. I think I see the problem. I decrement expired_count before
>> processing the list. Dropping the lock permits another run of
>> tick_find_expired()->find_expired_timers() in the middle of __expire_timers()
>> since it uses expired_count==0 as a condition.
>>
>> This should fix it, but I'll wait for Anna-Maria's test next week before
>> submitting a patch.
>>
>>>  static void expire_timers(struct timer_base *base)
>>>  {
>>>         struct hlist_head *head;
>>> +       int expCount = base->expired_count;
>
> No camel case for heavens sake!
>
> And this requires:
>
>    	 cnt = READ_ONCE(base->expired_count);
>
>>> -       while (base->expired_count--) {
>>> -               head = base->expired_lists + base->expired_count;
>>> +       while (expCount--) {
>>> +               head = base->expired_lists + expCount;
>>>                 __expire_timers(base, head);
>>>         }
>
> Plus a comment.

Fixed, thanks.

Are your recommending READ_ONCE() purely for documentation purposes?
All reads and writes to base->expired_count happen while base->lock is 
held. It just can't reach zero until expired_lists is ready to be rewritten.

>
>>>         base->expired_count = 0;
>
> Anna-Maria spotted the same issue, but I voted for the revert right now
> because I was worried about the consistency of base->clk under all
> circumstances.
>
> The other thing I noticed was this weird condition which does not do the
> look ahead when base->clk is back for some time.

The soft interrupt fires unconditionally if base->clk hasn't advanced in 
some time to limit how long cpu spends in hard interrupt context.

> Why don't you use the
> existing optimization which uses the bitmap for fast forward?
>

Are you referring to forward_timer_base()/base->next_expiry? I think 
it's only updated in the nohz case. Can you share function name/line 
number(s) if you're thinking of something else.

> The other issue I have is that this can race at all. If you raised the
> softirq in the look ahead then you should not go into that function until
> the softirq has actually completed. There is no point in wasting time in
> the hrtimer interrupt if the softirq is running anyway.
>

Makes sense. Skipping the large `if` block in run_local_timers() when 
`local_softirq_pending() & TIMER_SOFTIRQ`.

> Thanks,
>
> 	tglx
>
>
>
>

I also ran Anna-Maria's test for 12h without failure; I.e. no "Stalled" 
messages. It fails withing 10-15m on my qemu VM without the fix (4-core 
Nehalem, 1GB RAM).

You can view a diff at
https://github.com/harisokanovic/linux/tree/dev/hokanovi/timers-race

-- Haris