linux-kernel - Re: [RFC v2] timers: Don't wake ktimersoftd on every tick

Date:   Wed, 28 Dec 2016 12:06:42 -0600
From:   Haris Okanovic <haris.okanovic@...com>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org,
        tglx@...utronix.de, julia.cartwright@...com, gratian.crisan@...com
Subject: Re: [RFC v2] timers: Don't wake ktimersoftd on every tick

On 12/23/2016 11:28 AM, Sebastian Andrzej Siewior wrote:
> On 2016-12-13 15:44:05 [-0600], Haris Okanovic wrote:
>> Changed the way timers are collected per Julia and Thomas'
>
> I can only see Julia's response to the initial thread.
>

I should have been clearer. Thomas commented on IRC and recommended
Julia's approach.

>> recommendation: Expired timers are now collected in interrupt context
>> and fired in ktimersoftd to avoid double-walk of `pending_map`.
>>
>> This is implemented by storing lists of expired timers in timer_base,
>> which carries a memory overhead of 9*sizeof(pointer) per CPU. The timer
>> system uses hlists, which don't have end-node references, making it
>> impossible to merge two hlists in constant time, i.e. merging requires
>> walking one list. I also considered switching `vectors` to regular
>> lists, which don't have this limitation, but that approach has the same
>> memory overhead. list_head is bigger than hlist_head by sizeof(pointer)
>> and is instantiated 9+ times per CPU as `vectors`. I believe the only
>> way to trim overhead is to spend more CPU cycles in interrupt context
>> either in list merging (unbounded operation) or the original double-walk
>> implementation. Any suggestions/preferences?
>>
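
To make the hlist vs. list trade-off above concrete, here is a minimal
userland sketch. The types and helpers (hnode, hhead, lnode, hhead_merge,
lhead_splice_tail, ...) are simplified stand-ins invented for illustration;
they are not the kernel's hlist/list_head definitions or the code in the
patch. The sketch assumes a head-only singly anchored list on one side and a
circular doubly linked list on the other.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an hlist: the head stores one pointer only. */
struct hnode { struct hnode *next, **pprev; };
struct hhead { struct hnode *first; };

/* Hypothetical stand-in for a circular list: the head stores two pointers. */
struct lnode { struct lnode *next, *prev; };

static void hhead_add(struct hhead *h, struct hnode *n)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

static size_t hhead_len(const struct hhead *h)
{
	size_t len = 0;
	const struct hnode *n;

	for (n = h->first; n; n = n->next)
		len++;
	return len;
}

/*
 * Move every node of src to the front of dst. There is no tail pointer,
 * so src must be walked to find its last node: O(len(src)).
 * Returns the number of nodes visited during that walk.
 */
static size_t hhead_merge(struct hhead *dst, struct hhead *src)
{
	struct hnode *tail = src->first;
	size_t visited = 0;

	assert(dst != src);
	if (!tail)
		return 0;
	visited = 1;
	while (tail->next) {
		tail = tail->next;
		visited++;
	}
	tail->next = dst->first;
	if (dst->first)
		dst->first->pprev = &tail->next;
	dst->first = src->first;
	dst->first->pprev = &dst->first;
	src->first = NULL;
	return visited;
}

static void lhead_init(struct lnode *h)
{
	h->next = h->prev = h;
}

static void lhead_add_tail(struct lnode *h, struct lnode *n)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static size_t lhead_len(const struct lnode *h)
{
	size_t len = 0;
	const struct lnode *n;

	for (n = h->next; n != h; n = n->next)
		len++;
	return len;
}

/* Append all of src to dst in O(1): head->prev already is the tail. */
static void lhead_splice_tail(struct lnode *dst, struct lnode *src)
{
	struct lnode *first = src->next;
	struct lnode *last = src->prev;

	assert(dst != src);
	if (first == src)
		return;
	first->prev = dst->prev;
	dst->prev->next = first;
	last->next = dst;
	dst->prev = last;
	lhead_init(src);
}
```

In this sketch the constant-time splice costs one extra pointer per list
head, which is the same per-bucket overhead mentioned above for switching
`vectors` to regular lists.
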
>> As before, a 6h run of cyclictest without CPU affinity shows a decrease in
>> the 22-70us latency range.
> what does this mean? Your cyclictest runs on a random CPU with one thread
> only?
>

Yes. My point is that cyclictest only shows a significant difference 
(before and after this change) when `-S` is not used.

>> No change in max jitter.
> Does this mean your average latency went down 20-70us and your max is
> the same?
>

Yes. Average latency (20-70us range) goes down in a single-threaded run
of cyclictest. Max jitter stays the same in both single- and
multi-threaded runs.

>> No change when `-S` is
>> used.
>
> -S gives you one thread per core, makes sure it stays on that core and
> uses clock_nanosleep().
>
> clock_nanosleep() should be used no matter what.
>
>
>> [Before/after traces]
>>
>> ftp://ftp.ni.com/outgoing/tp02-timer-peek-traces.tgz
>> (Email me if link dies. Server periodically purges old files.)
>>
>> [Hardware/software/config]
>>
>>  NI cRIO-9033
>>   2 core Intel Atom CPU
>>
>>  Kernel 4.8.6-rt5
>>   CONFIG_HZ_PERIODIC=y
>>
>> [Outstanding concerns/issues/questions]
>>
>> I'm relatively new to the timer subsystem, so please feel free to poke
>> as many holes as possible in this change. A few things that concern me
>> at the moment are:
>>
>> Can jiffies change while one or more cpus is inside tick_sched_timer(),
>>  in interrupt context? I'm copying jiffies to a local variable in
>>  find_expired_timers() to ensure it doesn't run unbounded, but I'm not
>>  sure if that's necessary.
>
> It could change. Only the house keeping does update jiffies in
> tick_sched_do_timer().
>
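
The reason for copying jiffies to a local can be sketched in userland as
follows. Everything here is hypothetical: fake_jiffies, struct fake_base and
collect_until_snapshot() are illustration-only names, not the kernel's
jiffies, timer_base or find_expired_timers(). The counter is bumped inside
the loop to mimic another CPU advancing it while the loop runs.

```c
#include <stddef.h>

/* Hypothetical stand-in for jiffies; may advance while we are looping. */
static volatile unsigned long fake_jiffies;

/* Hypothetical stand-in for the per-CPU base; clk is the next tick to scan. */
struct fake_base {
	unsigned long clk;
};

/*
 * Scan every tick from base->clk up to a snapshot of the counter taken
 * once on entry. Because the bound is fixed, the loop terminates even if
 * the counter keeps advancing. Returns the number of ticks scanned.
 */
static size_t collect_until_snapshot(struct fake_base *base)
{
	const unsigned long now = fake_jiffies;	/* single read */
	size_t ticks = 0;

	/* Wrap-safe "now >= clk" comparison, in the style of time_after_eq(). */
	while ((long)(now - base->clk) >= 0) {
		fake_jiffies++;	/* simulate a concurrent update */
		base->clk++;
		ticks++;
	}
	return ticks;
}
```

Had the loop condition re-read fake_jiffies on every iteration, this
simulation would never catch up, which is the unbounded case the local copy
is meant to rule out.
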
>> Any special considerations for testing NO_HZ builds? (Other than letting
>> it run idle for a while)
>>
>> timers_dead_cpu() presently asserts no timer callback is actively
>> running, which suggests that timers must be canceled prior to disabling
>> CPUs; otherwise, there's a race between active timers and hotplug
>> which can crash the whole kernel. Is this a safe assumption to make and
>> are there any special considerations for CPU hotplug testing?
>
> timers_dead_cpu() and hrtimers_dead_cpu() migrate timer away. At that
> point the CPU should be down already so a timer can't run on that CPU.
>
>> Other tests/performance benchmark I should run?
>>
>> Source: https://github.com/harisokanovic/linux/tree/dev/hokanovi/timer-peek-v2
>>
>> Thanks,
>> Haris
>
> Sebastian
>

-- Haris