linux-kernel - Re: [Query] Ticks happen in pair for NO_HZ

Message-ID: <87wqj34eqs.fsf@linaro.org>
Date:	Tue, 17 Dec 2013 08:35:39 -0800
From:	Kevin Hilman <khilman@...aro.org>
To:	Viresh Kumar <viresh.kumar@...aro.org>
Cc:	Frederic Weisbecker <fweisbec@...il.com>,
	Lists linaro-kernel <linaro-kernel@...ts.linaro.org>,
	Linaro Networking <linaro-networking@...aro.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Tejun Heo <tj@...nel.org>
Subject: Re: [Query] Ticks happen in pair for NO_HZ_FULL cores ?

Viresh Kumar <viresh.kumar@...aro.org> writes:

> Sorry for the delay, was on holidays..
>
> On 11 December 2013 18:52, Frederic Weisbecker <fweisbec@...il.com> wrote:
>> On Tue, Dec 03, 2013 at 01:57:37PM +0530, Viresh Kumar wrote:
>>> - again got arch_timer interrupt after 5 ms (HZ=200)
>>
>> Right, looking at the details, the 2nd interrupt is caused by workqueue delayed
>> work bdi writeback.
>
> I am not that great at reading traces or kernelshark output, but I
> still feel I haven't
> seen anything wrong. And I wasn't talking about the delayed workqueue here..
>
> I am looking at the trace I attached with kernelshark after filtering
> out CPU0 events:
> - Event 41, timestamp: 159.891973
> - it ends at event 56, timestamp: 159.892043

For future reference, to generate email-friendly trace output for
discussions like this, you can use something like:

   trace-cmd report --cpu=1 trace.dat
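
To narrow that output down to the events that matter for this kind of
question, a filter along these lines may help.  It is only a sketch: it
assumes the trace was recorded with the irq, timer and workqueue events
enabled, and the helper name tick_events is made up for illustration.

```shell
# Hypothetical helper: keep only IRQ-entry, hrtimer-expiry and
# workqueue-execution events from trace-cmd report output on stdin.
tick_events() {
    grep -E 'irq_handler_entry|hrtimer_expire_entry|workqueue_execute_start'
}

# Usage (assumes trace.dat exists and contains those events):
#   trace-cmd report --cpu=1 trace.dat | tick_events
```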

> And after that the next event comes after 5 Seconds.
>
> And so I was talking for the Event 41.

That first event (Event 41) is an interrupt, and comes from the
scheduler tick.  The tick is happening because the writeback workqueue
just ran and we're not in NO_HZ mode.

However, as soon as that IRQ (and the resulting softirqs) has finished,
we enter NO_HZ mode again.  But as you mention, that only lasts for
~5 sec, until the timer fires again.  Once again, it fires because of
the writeback workqueue, and soon thereafter the CPU switches back to
NO_HZ mode.

So the solution to avoid this jitter on the NO_HZ CPU is to set the
affinity of the writeback workqueue to CPU0:

  # pin the writeback workqueue to CPU0
  echo 1 > /sys/bus/workqueue/devices/writeback/cpumask

I suspect by doing that, you will no longer see the jitter.
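
The value written to cpumask is a hexadecimal bitmask with bit N set for
CPU N, so 1 selects CPU0 only.  For other CPU sets, a small helper like
the following could build the mask.  This is a sketch under stated
assumptions: the name cpus_to_mask is hypothetical, and it only handles
CPU numbers below 32.

```shell
# Hypothetical helper: print the hex cpumask for the given CPU numbers.
# Assumes every CPU number is below 32 (larger masks need comma-separated
# 32-bit groups, which this sketch does not produce).
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

cpus_to_mask 0      # prints 1
cpus_to_mask 0 2 3  # prints d
```

Writing the result to the cpumask file needs root, and assumes the
kernel exposes the writeback workqueue in sysfs.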

Kevin