Message-ID: <20111017160736.6fc6caef@de.ibm.com>
Date: Mon, 17 Oct 2011 16:07:36 +0200
From: Martin Schwidefsky <schwidefsky@...ibm.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Simon Kirby <sim@...tway.ca>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Dave Jones <davej@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>
Subject: Re: Linux 3.1-rc9
On Mon, 17 Oct 2011 12:34:18 +0200
Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> > And
> > why does "cputime_add()" exist at all? It seems to always be just a
> > plain add, and nothing else would seem to ever make sense *anyway*?
>
> Martin and me were discussing the merit of that only a few weeks ago ;-)
I took my old cputime debug patch and compiled the latest git tree with it.
The compiler found a few places where fishy things happen:
1) fs/proc/uptime.c
static int uptime_proc_show(struct seq_file *m, void *v)
{
	...
	cputime_t idletime = cputime_zero;
	for_each_possible_cpu(i)
		idletime = cputime64_add(idletime, kstat_cpu(i).cpustat.idle);
	...
	cputime_to_timespec(idletime, &idle);
	...
}
idletime is a 32-bit integer on x86-32. The sum of the idle time over all
cpus will quickly overflow. Consider HZ=1000 on a quad-core: with all four
cpus idle, the sum would overflow after about 12.4 days
(2^32 / 1000 / 4 / 86400).
2) kernel/posix-cpu-timers.c
/*
 * Divide and limit the result to res >= 1
 *
 * This is necessary to prevent signal delivery starvation, when the result of
 * the division would be rounded down to 0.
 */
static inline cputime_t cputime_div_non_zero(cputime_t time, unsigned long div)
{
	cputime_t res = cputime_div(time, div);

	return max_t(cputime_t, res, 1);
}
A cputime of 1 on s390 is 0.244 nanoseconds; I doubt that a lower bound
that small will prevent signal starvation. Fortunately the function is
unused and can be removed.
3) kernel/itimer.c
enum hrtimer_restart it_real_fn(struct hrtimer *timer)
{
	struct signal_struct *sig =
		container_of(timer, struct signal_struct, real_timer);

	trace_itimer_expire(ITIMER_REAL, sig->leader_pid, 0);
	kill_pid_info(SIGALRM, SEND_SIG_PRIV, sig->leader_pid);

	return HRTIMER_NORESTART;
}
trace_itimer_expire takes a cputime as its third argument. That should be
cputime_zero in the current notation; the same applies in do_setitimer.
After the conversion all cputime_zero occurrences would be replaced with 0.
4) kernel/sched.c
#define CPUACCT_BATCH	\
	min_t(long, percpu_counter_batch * cputime_one_jiffy, INT_MAX)
If cputime_t is defined as a 64-bit type on a 32-bit architecture, the
CPUACCT_BATCH definition can break, because min_t casts the 64-bit product
to a 32-bit long before comparing. It should work for the existing code
though.
--
blue skies,
Martin.
"Reality continues to ruin my life." - Calvin.