Message-Id: <20090724220423.11828b85.akpm@linux-foundation.org>
Date: Fri, 24 Jul 2009 22:04:23 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Arjan van de Ven <arjan@...ux.intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...e.hu>,
Peter Zijlstra <peterz@...radead.org>,
"Kok, Auke-jan H" <auke-jan.h.kok@...el.com>
Subject: Re: [PATCH] sched: Provide iowait counters
On Fri, 24 Jul 2009 21:48:22 -0700 Arjan van de Ven <arjan@...ux.intel.com> wrote:
> Andrew Morton wrote:
> > On Fri, 24 Jul 2009 21:33:02 -0700 Arjan van de Ven <arjan@...ux.intel.com> wrote:
> >
> >> Andrew Morton wrote:
> >>> On Mon, 20 Jul 2009 11:31:47 -0700 Arjan van de Ven <arjan@...ux.intel.com> wrote:
> >>>
> >>>> For counting how long an application has been waiting for (disk) IO,
> >>>> there currently is only the HZ sample driven information available, while
> >>>> for all other counters in this class, a high resolution version is
> >>>> available via CONFIG_SCHEDSTATS.
> >>>>
> >>>> In order to make an improved bootchart tool possible, we also need
> >>>> a higher resolution version of the iowait time.
> >>>>
> >>>> This patch below adds this scheduler statistic to the kernel.
> >>> Doesn't this duplicate the delay accounting already available via the
> >>> taskstats interface?
> >> we have how long we wait. we do not have how long we iowait afaik...
> >> at least not in nanosecond granularity. (We do have the sampled data, but that
> >> is millisecond sampled data, not very useful for making charts based on time
> >> to show the sequence of events)
> >
> > See include/linux/sched.h's definition of task_delay_info - u64
> > blkio_delay is in nanoseconds. It uses
> > do_posix_clock_monotonic_gettime() internally.
>
> looks like it does.. too bad we don't expose that data in a /proc/<pid>/delay or something field
> like we do with the scheduler info...
>
I thought we did deliver a few of the taskstats counters via procfs,
but maybe I dreamed it. It would have been a rather bad thing to do.

taskstats has a large advantage over /proc-based things: it delivers a
packet to the monitoring process(es) when the monitored task exits. So
with no polling at all it is possible to gather all that information
about the just-completed task. This isn't possible with /proc.

There's a patch on the list now to teach taskstats to emit a packet at
fork- and exec-time too.

The monitored task can be polled at any time during its execution also,
like /proc files.

Please consider switching whatever-you're-working-on over to use
taskstats rather than adding (duplicative) things to /proc (which
require CONFIG_SCHED_DEBUG, btw).

If there's stuff missing from taskstats then we can add it - it's
versioned and upgradeable and is a better interface. It's better
to make taskstats stronger than it is to add /proc/pid fields, methinks.
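
For illustration only (a rough sketch, not taken from any posted patch):
once a monitoring tool has received a struct taskstats for a task, the
nanosecond block I/O delay discussed above can be reduced to a per-wait
average roughly as below. The fields blkio_count and blkio_delay_total
come from <linux/taskstats.h>; the helper names avg_blkio_delay_ns() and
ns_to_ms() are made up for this example, and the netlink code that
actually fetches the struct is omitted (getdelays.c in the kernel tree
shows one way to do that).

```c
#include <linux/taskstats.h>

/*
 * Hypothetical helper: average block I/O delay per recorded wait, in
 * nanoseconds. Returns 0 when the task has no recorded block I/O waits,
 * which also avoids dividing by zero.
 */
static unsigned long long avg_blkio_delay_ns(const struct taskstats *ts)
{
	if (ts->blkio_count == 0)
		return 0;
	return ts->blkio_delay_total / ts->blkio_count;
}

/* Hypothetical helper: convert nanoseconds to milliseconds for charting. */
static double ns_to_ms(unsigned long long ns)
{
	return (double)ns / 1e6;
}
```

A bootchart-style tool could call these on the packet delivered at task
exit, or on a struct obtained by polling a live pid.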