Message-ID: <20170713091911.aj7e7dvrbqcyxh7l@gmail.com>
Date: Thu, 13 Jul 2017 11:19:11 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Josh Poimboeuf <jpoimboe@...hat.com>,
Andres Freund <andres@...razel.de>, x86@...nel.org,
linux-kernel@...r.kernel.org, live-patching@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>, Jiri Slaby <jslaby@...e.cz>,
"H. Peter Anvin" <hpa@...or.com>, Mike Galbraith <efault@....de>,
Jiri Olsa <jolsa@...hat.com>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Namhyung Kim <namhyung@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>
Subject: Re: [PATCH v3 00/10] x86: ORC unwinder (previously undwarf)
* Peter Zijlstra <peterz@...radead.org> wrote:
> > One gloriously ugly hack would be to delay the userspace unwind to
> > return-to-userspace, at which point we have a schedulable context and can take
> > faults.
I don't think it's ugly at all.
> > Of course, then you have to somehow identify this later unwind sample with all
> > relevant prior samples and stitch the whole thing back together, but that
> > should be doable.
> >
> > In fact, it would not be at all hard to do, just queue a task_work from the
> > NMI and have that do the EH based unwind.
This would have a couple of advantages:
- as you mention, being able to fault in debug info and generally do
IO/scheduling,
- profiling overhead would be accounted to the task context that generates it,
not the NMI context,
- there would be a natural batching/coalescing optimization if multiple events
hit the same system call: the user-space backtrace would only have to be looked
up once for all samples that got collected.
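
The coalescing point can be illustrated with a toy model. This is only a
sketch in Python, not kernel code: the names Task, on_nmi_sample() and
on_return_to_user() are hypothetical, and the unwind_pending flag merely stands
in for "a task_work has already been queued for this task".

```python
class Task:
    """Hypothetical stand-in for a task; not a kernel structure."""
    def __init__(self, tid):
        self.tid = tid
        self.unwind_pending = False  # models "task_work already queued"


def on_nmi_sample(task, ring, kernel_bt):
    # NMI context: record the kernel backtrace only, no user-space unwind.
    ring.append(("kernel", task.tid, kernel_bt))
    # Request the deferred user-space unwind at most once per kernel entry.
    if not task.unwind_pending:
        task.unwind_pending = True


def on_return_to_user(task, ring, unwind_user):
    # Schedulable context: faults and IO are allowed here, so the
    # (potentially expensive) user-space unwind runs now, once.
    if task.unwind_pending:
        ring.append(("user", task.tid, unwind_user()))
        task.unwind_pending = False
```

With three samples hitting the same system call, unwind_user() is called once
and the ring ends up with three kernel records followed by one user record.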
This could be done by splitting the user-space backtrace out into its own event;
perf tooling would then apply that user-space backtrace to all prior kernel
samples that do not have one yet.
I.e. the ring-buffer would have trace entries like:
[ kernel sample #1, with kernel backtrace #1 ]
[ kernel sample #2, with kernel backtrace #2 ]
[ kernel sample #3, with kernel backtrace #3 ]
[ user-space backtrace #1 at syscall return ]
...
Note how the three kernel samples didn't have to do any user-space unwinding at
all, so the user-space unwinding overhead got reduced by a factor of 3.
Tooling would know that 'user-space backtrace #1' applies to the previous three
kernel samples.
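
On the tooling side, the stitching could look roughly like the sketch below.
It is an illustration under assumptions, not perf's actual record format: the
(kind, tid, frames) tuples and the stitch() helper are hypothetical, and keying
the pending samples by task ID is an assumption made here so that interleaved
tasks do not get mixed up.

```python
def stitch(records):
    """Attach each user-space backtrace to the kernel samples preceding it.

    records: iterable of (kind, tid, frames), where kind is "kernel" or
    "user" and frames is a list ordered innermost frame first.
    Returns a list of (tid, full_backtrace) tuples.
    """
    pending = {}   # tid -> kernel backtraces still waiting for a user part
    stitched = []
    for kind, tid, frames in records:
        if kind == "kernel":
            pending.setdefault(tid, []).append(frames)
        elif kind == "user":
            # One user-space backtrace completes every pending kernel sample.
            for kernel_frames in pending.pop(tid, []):
                stitched.append((tid, kernel_frames + frames))
    # Kernel samples never followed by a user record stay in 'pending'
    # and are simply not returned by this sketch.
    return stitched
```

Fed the four ring-buffer entries from the example above, this yields three
complete backtraces that all share the same user-space part.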
Or so?
Thanks,
Ingo