linux-kernel - Re: [RFC PATCHSET take#2] ioblame: IO tracer with origin tracking

Message-ID: <CAJL_ektD=A5ATqt11+tAWeoCDGYRpi=_JVgr0ubM4ih96WAuxg@mail.gmail.com>
Date:	Wed, 11 Jan 2012 14:45:56 -0800
From:	David Sharp <dhsharp@...gle.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Frederic Weisbecker <fweisbec@...il.com>, axboe@...nel.dk,
	mingo@...hat.com, rostedt@...dmis.org, teravest@...gle.com,
	slavapestov@...gle.com, ctalbott@...gle.com,
	linux-kernel@...r.kernel.org, winget@...gle.com, namhyung@...il.com
Subject: Re: [RFC PATCHSET take#2] ioblame: IO tracer with origin tracking

On Wed, Jan 11, 2012 at 9:02 AM, Tejun Heo <tj@...nel.org> wrote:
> Hello, Frederic.
>
> On Wed, Jan 11, 2012 at 03:40:14PM +0100, Frederic Weisbecker wrote:
>> I think this has been asked before. So sorry for asking twice.
>
> I thought Namhyung was primarily asking about stat gathering which is
> chopped now.
>
>> But I'm wondering why the post processing is done in the kernel. Do you think
>> it would be possible to pull that out into userspace? We have a nice scripting
>> framework for post processing of trace events in perf tools, for example.
>>
>> If it's not possible please tell us why. We really would like to avoid adding such
>> a big piece of code in the tracing subsystem if possible.
>
> I suppose you're talking about the state tracking by post-processing,
> right?
>
> * ioblame tracks stack trace for each dirtying operation.  If we don't
>  want further state tracking in kernel, we would have to export the
>  whole stack trace on each dirtying operation which can be high
>  frequency.  Also, is there an efficient way to export variable
>  length data via TPs?  If so, it can be somewhat better but still not
>  very good.

See __dynamic_array. It imposes a 4-byte overhead to store the offset
and length of data within the trace event.
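
To make the cost concrete, here is a rough userspace sketch of that
4-byte descriptor and of how much ring buffer a stack-trace payload
would take per event. The helper names and event_size() are hypothetical
and are not kernel API. The layout (offset in the low 16 bits, length in
the high 16 bits) is how I understand the descriptor to be encoded, so
treat it as an assumption and check the TRACE_EVENT macros before relying
on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pack the location of a variable-length field into one 32-bit word.
 * Assumed layout: offset from the start of the record in the low 16 bits,
 * payload length in bytes in the high 16 bits.
 */
uint32_t data_loc_pack(size_t offset, size_t len)
{
    assert(offset <= UINT16_MAX && len <= UINT16_MAX);
    return (uint32_t)offset | ((uint32_t)len << 16);
}

/* Recover the offset from a packed descriptor. */
size_t data_loc_offset(uint32_t loc)
{
    return loc & 0xffff;
}

/* Recover the payload length from a packed descriptor. */
size_t data_loc_len(uint32_t loc)
{
    return loc >> 16;
}

/*
 * Bytes one event would occupy: the fixed fields, the 4-byte descriptor,
 * and nr_frames 64-bit return addresses as the dynamic payload.
 */
size_t event_size(size_t fixed_bytes, size_t nr_frames)
{
    return fixed_bytes + sizeof(uint32_t) + nr_frames * sizeof(uint64_t);
}
```

With 16 bytes of fixed fields and a 16-frame trace, event_size(16, 16)
comes to 148 bytes, of which 128 are the trace itself. The descriptor is
cheap; the payload is what fills the buffer.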

That said, I'm always very wary of adding large amounts of data to
tracepoints, especially if they are high frequency, as that just leads
to faster ring buffer exhaustion.

>
> * Even if we track dirtying state in userland, when an io is issued,
>  it needs to be mapped back to the dirtying actions.  If the dirtier
>  state is in userland, we have to export all physaddrs of pages in
>  the IO so that userland can match them up and clear dirtied states.
>  Again, the same problem.
>
> * As implemented, most of state tracking should be fairly stable and
>  shouldn't require much modification as code base evolves but it's
>  still trying to extract pretty high level semantics from disjoint
>  events across multiple layers.  It's reasonable to expect future
>  changes would require updates to how those semantics are
>  established.  Exporting higher level semantics, we don't get tied to
>  keeping the relevant raw tracepoints and, more importantly, their
>  exact interactions stable.
>
> * It isn't trivial but still pretty straightforward.  Most of what it
>  does is abbreviating stack trace to an identifier (which BTW could
>  be useful for other tracing purposes and may be worthwhile to
>  generalize) and tracking page and inode dirtiers using those
>  identifiers.  It stays mostly out of the way and doesn't noticeably
>  harm maintainability.  It fits the role of in-kernel tracers -
>  building information from domain knowledge and states and exporting
>  to userland in sensible form.
>
> Thanks.
>
> --
> tejun
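
For readers following the last point, abbreviating a stack trace to an
identifier amounts to interning: hash the frame addresses, look them up
in a table, and hand back a small id that later events can carry instead
of the whole trace. The userspace sketch below only illustrates that
idea. It is not ioblame's implementation; intern_trace(), the table
size, and the frame limit are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FRAMES 8    /* hypothetical depth limit */
#define TABLE_SIZE 64   /* hypothetical, must be a power of two */

struct trace_slot {
    int used;
    size_t nr;
    uint64_t frames[MAX_FRAMES];
};

static struct trace_slot table[TABLE_SIZE];

/* FNV-1a hash over the bytes of the frame addresses. */
static uint64_t hash_frames(const uint64_t *frames, size_t nr)
{
    uint64_t h = 0xcbf29ce484222325ULL;

    for (size_t i = 0; i < nr; i++) {
        for (int b = 0; b < 8; b++) {
            h ^= (frames[i] >> (8 * b)) & 0xff;
            h *= 0x100000001b3ULL;
        }
    }
    return h;
}

/*
 * Return a small id for the given stack trace, reusing the id if the same
 * trace was seen before. Returns -1 if the trace is too deep or the table
 * is full. Collisions are resolved by linear probing.
 */
int intern_trace(const uint64_t *frames, size_t nr)
{
    assert(frames != NULL);
    if (nr > MAX_FRAMES)
        return -1;

    size_t start = hash_frames(frames, nr) & (TABLE_SIZE - 1);

    for (size_t probe = 0; probe < TABLE_SIZE; probe++) {
        size_t i = (start + probe) & (TABLE_SIZE - 1);
        struct trace_slot *s = &table[i];

        if (!s->used) {
            /* First time this trace is seen: claim the slot. */
            s->used = 1;
            s->nr = nr;
            memcpy(s->frames, frames, nr * sizeof(*frames));
            return (int)i;
        }
        if (s->nr == nr &&
            memcmp(s->frames, frames, nr * sizeof(*frames)) == 0)
            return (int)i;
    }
    return -1;
}
```

An event emitted on each dirtying operation then only needs to carry the
integer id, and the full trace is looked up once per distinct id.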