Message-ID: <alpine.DEB.2.11.1504222021240.13914@nanos>
Date: Wed, 22 Apr 2015 20:56:35 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Eric Dumazet <eric.dumazet@...il.com>
cc: Peter Zijlstra <peterz@...radead.org>,
viresh kumar <viresh.kumar@...aro.org>,
Ingo Molnar <mingo@...hat.com>, linaro-kernel@...ts.linaro.org,
linux-kernel@...r.kernel.org, Steven Miao <realmz6@...il.com>,
shashim@...eaurora.org
Subject: Re: [PATCH 1/2] timer: Avoid waking up an idle-core by migrate
running timer
On Wed, 22 Apr 2015, Eric Dumazet wrote:
> Check commit 4a8e320c929991c9480 ("net: sched: use pinned timers")
> for a specific example of the problems that can be raised.
If you have a problem with the core timer code then it should be fixed
there and not worked around in some place which will ruin stuff for
users who care about power saving. I'm so tired of this 'I fix it in my
sandbox' attitude, really. If the core code has a shortcoming we fix
it there right away because you are probably not the only one who runs
into that shortcoming. So if we don't fix it in the core we end up
with a metric ton of slightly different (or broken) workarounds which
affect the workload/system characteristics of other people.
Just for the record, even the changelog of this commit is blatantly
wrong:
"We can see that timers get migrated into a single cpu, presumably
idle at the time timers are set up."
The timer migration moves timers to non-idle CPUs to leave the idle
ones alone for the sake of power saving.
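To make the direction of the migration concrete, here is a minimal user-space sketch of the policy described above. It is not the kernel implementation: the function name `pick_timer_cpu`, the `idle[]` array and the `migration_enabled` flag are all hypothetical, and the real code considers CPU topology rather than scanning CPUs in index order.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model, not kernel code.
 * idle[i] is true when CPU i is idle.
 */
static int pick_timer_cpu(int this_cpu, const bool *idle, int ncpus,
                          bool pinned, bool migration_enabled)
{
    assert(ncpus > 0 && this_cpu >= 0 && this_cpu < ncpus);

    /* Pinned timer, migration disabled, or a busy local CPU: stay put. */
    if (pinned || !migration_enabled || !idle[this_cpu])
        return this_cpu;

    /* The local CPU is idle: hand the timer to the first busy CPU. */
    for (int cpu = 0; cpu < ncpus; cpu++) {
        if (!idle[cpu])
            return cpu;
    }

    /* Every CPU is idle, so nothing is gained by moving the timer. */
    return this_cpu;
}
```

In this model a timer armed on an idle CPU ends up on a busy one, never the other way around, while a pinned timer always stays on the CPU that armed it.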
I can see and understand the reason why you want to avoid that, but I
have to ask the question whether this pinning is the correct behaviour
under all workloads and system characteristics. If yes, then the patch
is the right answer; if no, then it is simply the wrong approach.
> but /proc/sys/kernel/timer_migration adds a fair overhead in many
> workloads.
>
> get_nohz_timer_target() has to touch 3 cache lines per cpu...
And this is something we can fix and completely avoid if we think
about it. Looking at the code I have to admit that the out of line
call and the sysctl variable lookup are silly. But it's not rocket
science to fix this.
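One possible shape for avoiding the out-of-line call in the common case is an inline check in front of it. This is only a user-space sketch under stated assumptions, not a proposed patch: `timer_migration_enabled`, `find_busy_cpu_slow`, `timer_target_cpu` and the call counter are hypothetical names, and the flag is still read from a plain global here, so the sketch does not address the cost of the sysctl lookup itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the sysctl-controlled setting. */
static bool timer_migration_enabled = true;

/* Counts slow-path calls so the effect of the fast path is observable. */
static unsigned long slow_path_calls;

/* Hypothetical out-of-line search; a real one would walk the CPU topology. */
static int find_busy_cpu_slow(int this_cpu)
{
    slow_path_calls++;
    return this_cpu; /* placeholder result, no real search is done here */
}

/* Inline fast path: skip the call when it cannot change the answer. */
static inline int timer_target_cpu(int this_cpu, bool pinned,
                                   bool this_cpu_idle)
{
    if (pinned || !timer_migration_enabled || !this_cpu_idle)
        return this_cpu;

    return find_busy_cpu_slow(this_cpu);
}
```

With this split, pinned timers and timers armed on a busy CPU never reach the slow path; only a timer armed on an idle CPU with migration enabled pays for the search.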
Thanks,
tglx