linux-kernel - Re: Linux 3.1-rc9

Message-ID: <20111017160736.6fc6caef@de.ibm.com>
Date:	Mon, 17 Oct 2011 16:07:36 +0200
From:	Martin Schwidefsky <schwidefsky@...ibm.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Simon Kirby <sim@...tway.ca>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Dave Jones <davej@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>
Subject: Re: Linux 3.1-rc9

On Mon, 17 Oct 2011 12:34:18 +0200
Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:

> > And
> > why does "cputime_add()" exist at all? It seems to always be just a
> > plain add, and nothing else would seem to ever make sense *anyway*?
> 
> Martin and me were discussing the merit of that only a few weeks ago ;-)

I took my old cputime debug patch and compiled the latest git tree with it.
The compiler found a few places where fishy things happen:

1) fs/proc/uptime.c
static int uptime_proc_show(struct seq_file *m, void *v)
{
	...
	cputime_t idletime = cputime_zero;

	for_each_possible_cpu(i)
		idletime = cputime64_add(idletime, kstat_cpu(i).cpustat.idle);
	...
	cputime_to_timespec(idletime, &idle);
	...
}

idletime is a 32-bit integer on x86-32. The sum of the idle time over all
CPUs will quickly overflow: with HZ=1000 on a fully idle quad-core it would
overflow after about 12.4 days (2^32 / 1000 / 4 / 86400).
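
To make the arithmetic above concrete, here is a small userspace sketch (not
kernel code). The helper names days_until_wrap32, sum_idle32 and sum_idle64
are made up for illustration, and it assumes the worst case where every CPU
is idle all the time:

```c
#include <stdint.h>

/* Days until an unsigned 32-bit tick counter wraps when `cpus` CPUs each
 * add `hz` idle ticks per second (assumption: all CPUs permanently idle). */
static double days_until_wrap32(unsigned int hz, unsigned int cpus)
{
	const double ticks = 4294967296.0;	/* 2^32 */

	return ticks / hz / cpus / 86400.0;
}

/* Sum per-CPU idle ticks into a 32-bit accumulator, the way a 32-bit
 * cputime_t would; the result wraps modulo 2^32. */
static uint32_t sum_idle32(const uint64_t *idle, unsigned int cpus)
{
	uint32_t sum = 0;

	for (unsigned int i = 0; i < cpus; i++)
		sum += (uint32_t)idle[i];
	return sum;
}

/* Same sum with a 64-bit accumulator, which does not wrap here. */
static uint64_t sum_idle64(const uint64_t *idle, unsigned int cpus)
{
	uint64_t sum = 0;

	for (unsigned int i = 0; i < cpus; i++)
		sum += idle[i];
	return sum;
}
```

With four CPUs at 2^30 idle ticks each, the 32-bit sum wraps to 0 while the
64-bit sum gives 2^32.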

2) kernel/posix-cpu-timers.c
/*
 * Divide and limit the result to res >= 1
 *
 * This is necessary to prevent signal delivery starvation, when the result of
 * the division would be rounded down to 0.
 */
static inline cputime_t cputime_div_non_zero(cputime_t time, unsigned long div)
{
        cputime_t res = cputime_div(time, div);

        return max_t(cputime_t, res, 1);
}

A cputime of 1 on s390 is 0.244 nanoseconds; I doubt that will prevent
signal starvation. Fortunately the function is unused and can be removed.
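
For reference, the 0.244 ns figure is what you get if one s390 cputime unit
is 2^-12 microseconds. That unit size is an assumption of this sketch, as
are the names CPUTIME_UNITS_PER_USEC and cputime_units_to_ns; it is plain
userspace C, not the kernel's conversion code:

```c
/* Assumption: one s390 cputime unit is 2^-12 microseconds. */
#define CPUTIME_UNITS_PER_USEC	4096.0

/* Convert a number of cputime units to nanoseconds. */
static double cputime_units_to_ns(unsigned long long units)
{
	return (double)units * 1000.0 / CPUTIME_UNITS_PER_USEC;
}
```

Under that assumption, clamping a result to 1 means clamping it to roughly a
quarter of a nanosecond, far below anything a timer could act on.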

3) kernel/itimer.c
enum hrtimer_restart it_real_fn(struct hrtimer *timer)
{
        struct signal_struct *sig =
                container_of(timer, struct signal_struct, real_timer);

        trace_itimer_expire(ITIMER_REAL, sig->leader_pid, 0);
        kill_pid_info(SIGALRM, SEND_SIG_PRIV, sig->leader_pid);

        return HRTIMER_NORESTART;
}

trace_itimer_expire takes a cputime as its third argument. That should be
cputime_zero in the current notation; the same applies in do_setitimer.
After the conversion all cputime_zero occurrences would be replaced with 0.

4) kernel/sched.c
#define CPUACCT_BATCH   \
        min_t(long, percpu_counter_batch * cputime_one_jiffy, INT_MAX)

If cputime_t is defined as a 64-bit type on a 32-bit architecture, the
CPUACCT_BATCH definition can break. It should work for the existing code,
though.
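
A rough userspace sketch of how that could go wrong, assuming min_t(long, ...)
casts both operands to a 32-bit long before comparing. int32_t stands in for
the 32-bit long, and trunc_to_long32, cpuacct_batch32 and cpuacct_batch64 are
hypothetical names, not kernel functions:

```c
#include <stdint.h>
#include <limits.h>

/* Keep only the low 32 bits, as a cast to a 32-bit long would. The final
 * conversion is implementation-defined for values above INT32_MAX; gcc and
 * clang wrap. */
static int32_t trunc_to_long32(int64_t v)
{
	return (int32_t)(uint32_t)v;
}

/* Mimics min_t(long, batch * cputime_one_jiffy, INT_MAX) with a 32-bit
 * long and a 64-bit cputime: the product is truncated before the compare. */
static int32_t cpuacct_batch32(int32_t batch, uint64_t cputime_one_jiffy)
{
	int32_t prod = trunc_to_long32((int64_t)batch * (int64_t)cputime_one_jiffy);

	return prod < INT_MAX ? prod : INT_MAX;
}

/* Same computation with the compare done in 64 bits, so the clamp to
 * INT_MAX works as intended. */
static int32_t cpuacct_batch64(int32_t batch, uint64_t cputime_one_jiffy)
{
	int64_t prod = (int64_t)batch * (int64_t)cputime_one_jiffy;

	return prod < INT_MAX ? (int32_t)prod : INT_MAX;
}
```

For a batch of 32 and a jiffy of 2^27 units the product is 2^32: the
truncating version yields 0, while the 64-bit compare clamps to INT_MAX.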

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
