linux-kernel - Re: [PATCH 4/4] sched,time: only call account_{user,sys,guest,idle}


Message-ID: <1455052848.15821.12.camel@redhat.com>
Date:	Tue, 09 Feb 2016 16:20:48 -0500
From:	Rik van Riel <riel@...hat.com>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	linux-kernel@...r.kernel.org, tglx@...utronix.de, mingo@...nel.org,
	luto@...capital.net, peterz@...radead.org, clark@...hat.com,
	eric.dumazet@...il.com
Subject: Re: [PATCH 4/4] sched,time: only call
 account_{user,sys,guest,idle}_time once a jiffy

On Tue, 2016-02-09 at 18:11 +0100, Frederic Weisbecker wrote:
> 
> So for any T_slice being a given cpu timeslice (in secs) executed
> between two ring switches (user <-> kernel), we are going to account:
> 1 * P(T_slice*HZ) (P() stands for probability here).
> 
> Now after this patch, the scenario is rather different. We are
> accounting the real time spent in a slice with a similar probability.
> This becomes: T_slice * P(T_slice*HZ).
> 
> So it seems it could result in logarithmic accounting: timeslices
> of 1 second will be accounted right, whereas repeating tiny
> timeslices may result in much lower values than expected.
> 
> To fix this we should instead account
> jiffies_to_nsecs(jiffies - t->vtime_jiffies).
> Well, that would drop the use of the fine-grained clock and even the
> need for nsecs-based cputime. But why not, if we still get acceptable
> results with much better performance?

Looking over the code some more, you are right.

My changes to vtime_account_idle and
vtime_account_system will cause them to do
nothing a lot of the time, when called from
vtime_common_task_switch.

This causes a discrepancy between the time
accounted at task switch time, and the time
delta accounted later on.
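
The discrepancy can be illustrated with a small userspace toy model.
This is only a sketch of the argument quoted above, not kernel code:
HZ=1000, back-to-back slices of equal length, and the names
struct totals and simulate() are all assumptions made up for the
illustration. "tick" accounting charges one jiffy per tick that lands
in a slice; "gated" accounting charges the real slice length, but
only for slices in which a jiffy boundary was crossed.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define HZ           1000ULL                /* assumed tick rate */
#define JIFFY_NS     (NSEC_PER_SEC / HZ)

/* Hypothetical result record, not a kernel structure. */
struct totals {
	uint64_t real_ns;   /* CPU time actually consumed */
	uint64_t tick_ns;   /* one jiffy charged per tick inside a slice */
	uint64_t gated_ns;  /* slice length charged only if a tick hit it */
};

/* Run nr_slices back-to-back slices of slice_ns each, starting at t=0. */
static struct totals simulate(uint64_t slice_ns, uint64_t nr_slices)
{
	struct totals t = { 0, 0, 0 };
	uint64_t now = 0;

	assert(slice_ns > 0);
	for (uint64_t i = 0; i < nr_slices; i++) {
		uint64_t end = now + slice_ns;
		/* number of jiffy boundaries crossed during this slice */
		uint64_t ticks = end / JIFFY_NS - now / JIFFY_NS;

		t.real_ns += slice_ns;
		t.tick_ns += ticks * JIFFY_NS;
		if (ticks)
			t.gated_ns += slice_ns;
		now = end;
	}
	return t;
}
```

In this model, simulate(100000, 10000) (ten thousand 100 usec slices,
one second in total) gives tick_ns equal to the full second, while
gated_ns comes out at only a tenth of it. A single one-second slice,
simulate(NSEC_PER_SEC, 1), is accounted in full by both schemes.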

I see two ways to fix this:
1) Always do fine granularity accounting at
   task switch time, and revert to coarser
   granularity at syscall time only. This may
   or may not be more accurate (not sure how
   much, or whether we care).
2) Always account at coarser granularity,
   like you suggest above. This has the
   advantage of leading to faster context
   switch times (no TSC reads).
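
A rough userspace sketch of what option 2 could look like follows.
It is not the kernel implementation: struct task_model,
model_jiffies_to_nsecs() and account_coarse() are hypothetical names,
the jiffies value is passed in as a parameter, and HZ=1000 is
assumed. The only idea taken from the thread is accounting
jiffies_to_nsecs(jiffies - t->vtime_jiffies) without reading a
fine-grained clock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define HZ           1000ULL    /* assumed tick rate */

/* Hypothetical stand-in for the per-task vtime state. */
struct task_model {
	uint64_t vtime_jiffies;  /* jiffies value at the last accounting */
	uint64_t system_ns;      /* accumulated system time */
};

/* Model of jiffies_to_nsecs() for the assumed HZ. */
static uint64_t model_jiffies_to_nsecs(uint64_t j)
{
	return j * (NSEC_PER_SEC / HZ);
}

/*
 * Account whole jiffies elapsed since the last call. Returns the
 * number of nanoseconds charged; 0 when no jiffy has passed.
 */
static uint64_t account_coarse(struct task_model *t, uint64_t jiffies_now)
{
	uint64_t delta, ns;

	assert(t != NULL);
	delta = jiffies_now - t->vtime_jiffies;
	if (!delta)
		return 0;   /* cheap early exit: nothing to account yet */

	t->vtime_jiffies = jiffies_now;
	ns = model_jiffies_to_nsecs(delta);
	t->system_ns += ns;
	return ns;
}
```

Because the snapshot only moves forward by whole jiffies, time that
is skipped in one call is still picked up by whichever later call
sees the jiffies counter change.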

I am happy to implement either.

Frederic, Thomas, Ingo, do you have a
preference between these approaches?

Would you like me to opt for the higher
performance option, or go for the potentially
higher accuracy one?

As an aside, I am wondering whether the call
to vtime_account_user() from
vtime_common_task_switch() ever does anything.

After all, the preceding call to vtime_account_system
would have already accounted all the CPU time that
passed to system time, and there will be no time left
to account to userspace.

Unless I am missing something, I suspect that line
can just go.
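
The reasoning in this aside can be sketched with a toy model,
assuming both accounting calls take their delta from the same
snapshot and advance it. The names struct vtime_model, take_delta()
and model_task_switch() are hypothetical, and this is not the kernel
code path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-task vtime state. */
struct vtime_model {
	uint64_t snap_ns;    /* timestamp of the last accounting */
	uint64_t system_ns;
	uint64_t user_ns;
};

/* Return the time since the last snapshot and advance the snapshot. */
static uint64_t take_delta(struct vtime_model *v, uint64_t now_ns)
{
	uint64_t delta;

	assert(v != NULL && now_ns >= v->snap_ns);
	delta = now_ns - v->snap_ns;
	v->snap_ns = now_ns;
	return delta;
}

/* Model of a task switch that accounts system time, then user time. */
static void model_task_switch(struct vtime_model *v, uint64_t now_ns)
{
	v->system_ns += take_delta(v, now_ns);
	/* The snapshot was just advanced, so this delta is always 0. */
	v->user_ns += take_delta(v, now_ns);
}
```

Under that assumption the second call never adds anything, which is
the case being questioned here.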

-- 
All rights reversed
