Message-Id: <20090203214609.39c9c6b2.akpm@linux-foundation.org>
Date: Tue, 3 Feb 2009 21:46:09 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Arjan van de Ven <arjan@...radead.org>
Cc: Corrado Zoccolo <czoccolo@...il.com>, linux-kernel@...r.kernel.org
Subject: Re: Negative values in /proc/latency_stats
On Tue, 3 Feb 2009 16:16:48 -0800 Arjan van de Ven <arjan@...radead.org> wrote:
> On Mon, 2 Feb 2009 20:55:45 -0800
> Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> > On Sat, 31 Jan 2009 14:42:20 +0100 Corrado Zoccolo
> > <czoccolo@...il.com> wrote:
> >
> > > Hello,
> > > I found negative numbers sometimes appear in /proc/latency_stats
> > > (vanilla kernel 2.6.29-rc3, on x86 cpu, configuration attached)
> > >
> > > [corrado@...alhost ~]$ while sleep 1; do grep -- - /proc/latency_stats >> neg_stats; done
> > > ^Z
> > > [1]+ Stopped sleep 1
> > > [corrado@...alhost ~]$ cat neg_stats
> > > 1 -486373534 -486373534 sys_rt_sigsuspend sysenter_do_call
> > > 1 -486373534 -486373534 sys_rt_sigsuspend sysenter_do_call
> > > 1 -486373534 -486373534 sys_rt_sigsuspend sysenter_do_call
> > >
> > >
> > > I suspect this may be the cause of
> > > https://bugs.launchpad.net/ubuntu/+source/latencytop/+bug/297776 ,
> > > since I saw it happen on my machine during a kernel recompilation.
> > >
> >
> > <discovers kernel/latencytop.c>
>
> how about this?
>
lavish ;)
> ...
>
> +/*
> + * CONFIG_LATENCYTOP enables a kernel latency tracking infrastructure that is
> + * used by the "latencytop" userspace tool. The latency that is tracked is not
> + * the 'traditional' interrupt latency (which is primarily caused by something
> + * else consuming CPU), but instead, it is the latency an application encounters
> + * because the kernel sleeps on its behalf for various reasons.
> + *
> + * This code tracks 2 levels of statistics:
> + * 1) System level latency
> + * 2) Per process latency
> + *
> + * The latency is stored in fixed sized datastructures in an accumulated form;
> + * if the "same" latency cause is hit twice, this will be tracked as one entry
> + * in the datastructure. The count, the total accumulated latency, and the
> + * maximum latency are all tracked per entry. When the fixed size structure is
> + * full, no new causes are tracked until the buffer is flushed by writing to
> + * the /proc file; the userspace tool does this on a regular basis.
s/datastructure/data structure/
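(For readers without the patch in front of them, the fixed-size record being
described looks roughly like this; a minimal sketch from memory of
include/linux/latencytop.h and kernel/latencytop.c, so the exact names and
bounds should be treated as assumptions:

	#define LT_BACKTRACEDEPTH	12	/* frames kept per cause */
	#define MAXLR			128	/* system-wide table size */

	struct latency_record {
		unsigned long	backtrace[LT_BACKTRACEDEPTH];
		unsigned int	count;	/* times this cause was hit */
		unsigned long	time;	/* accumulated latency, usecs */
		unsigned long	max;	/* worst single latency, usecs */
	};

	/* Fixed size, so the scheduler path never allocates; a full table
	 * simply stops taking new causes until userspace clears it. */
	static struct latency_record latency_record[MAXLR];

)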
> + * A latency cause is identified by a stringified backtrace at the point that
> + * the scheduler gets invoked. The userland tool will use this string to
> + * identify the cause of the latency in human readable form.
> + *
> + * The information is exported via /proc/latency_stats and /proc/<pid>/latency.
> + * These files look like this:
> + *
> + * Latency Top version : v0.1
> + * 70 59433 4897 i915_irq_wait drm_ioctl vfs_ioctl do_vfs_ioctl sys_ioctl
> + * |  |     |    |
> + * |  |     |    +----> the stringified backtrace
> + * |  |     +---------> The maximum latency for this entry in microseconds
> + * |  +---------------> The accumulated latency for this entry (microseconds)
> + * +------------------> The number of times this entry is hit
> + *
> + * (note: the average latency is the accumulated latency divided by the
> + * number of times this entry was hit)
I was surprised at this. Is the stack backtracer sufficiently
reliable/repeatable for this to work?
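(Working the quoted example through: 59433 usecs accumulated over 70 hits is
an average of ~849 usecs per occurrence, against a 4897 usec worst case. A
userspace consumer needs nothing fancier than the sketch below to recover
that; print_avg() is a hypothetical helper, not latencytop's actual parser:

	#include <stdio.h>

	/* One /proc/latency_stats line: <count> <total us> <max us> <backtrace> */
	static void print_avg(const char *line)
	{
		unsigned long count, total, max;
		char trace[256];

		/* the "Latency Top version" header line fails the match
		 * and is skipped naturally */
		if (sscanf(line, "%lu %lu %lu %255[^\n]",
			   &count, &total, &max, trace) == 4 && count > 0)
			printf("%s: avg %lu usec, max %lu usec\n",
			       trace, total / count, max);
	}

)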
> + * Negative latencies (caused by time going backwards) are also explicitly
> + * skipped.
Are we sure that this was the cause of the problem?
Which time source is the caller using here, and why is it going backwards?
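(For what it's worth, the sleep time handed to the accounting code comes from
per-CPU scheduler clock deltas, and sched_clock() is not guaranteed monotonic
across CPUs, so a task that sleeps on one CPU and wakes on another can see the
delta come out negative. A sketch of the skip at the top of the accounting
path; this is the idea of the patch, not necessarily its exact shape:

	void account_scheduler_latency(struct task_struct *task,
				       int usecs, int inter)
	{
		/*
		 * Negative sleeps mean time went backwards (e.g. TSC drift
		 * between CPUs); zero-time sleeps are uninteresting. Either
		 * way, don't let them pollute the statistics.
		 */
		if (usecs <= 0)
			return;
		...
	}

)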