linux-kernel - Re: [RFC] perf record: missing buildid for callstack modules

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160113152737.GK18367@kernel.org>
Date:	Wed, 13 Jan 2016 12:27:37 -0300
From:	Arnaldo Carvalho de Melo <acme@...nel.org>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Stephane Eranian <eranian@...gle.com>,
	Namhyung Kim <namhyung@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Jiri Olsa <jolsa@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	"ak@...ux.intel.com" <ak@...ux.intel.com>
Subject: Re: [RFC] perf record: missing buildid for callstack modules

Em Wed, Jan 13, 2016 at 10:57:38AM +0100, Ingo Molnar escreveu:
> * Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
> > Em Tue, Jan 12, 2016 at 11:39:43AM +0100, Ingo Molnar escreveu:
> > > But perf tooling cares very much: it can lead to subtle bugs and bad data if we 
> > > display a profile with the wrong DSO or binary. 'Bad' profiles resulting out of 
> > > binary mismatch can be very convincing and can send developers down the wrong path 
> > > for hours. I'd expect my tooling to not do that.
 
> > > Path names alone (the thing that exec() cares about) are not unique enough to 
> > > identify the binary that was profiled. So we need a content hash - hence the 
> > > build-ID.
 
> > > Can you suggest a better solution than a build-time calculated content hash?
  
> > > As for binary formats that suck and don't allow for a content hash: we do our 
> > > best, but of course the risk of data mismatch is there. We could perhaps cache the 
> > > binary inode's mtime field to at least produce a 'profile data is older than 
> > > binary/DSO modification date!' warning. (Which check won't catch all cases, like 
> > > cross-system profiling data matches.)

> > So, we could think of this as: binary formats that want to aid
> > observability tools to:

> > 1) Detect mismatches in contents for DSOs present at recording time to
> >    those to be used at analysis time.

> > 2) Find symtabs, DSO binary contents, CFI tables, present in the DSO
> >    where samples were taken.

> > Using mtime, as suggested in other messages will help with #1, but not
> > with #2.
> 
> But but ... why is #2 a problem with mtime? If we have an out of date record in 
> the perf.data, then the perf.data is uninteresting in 99% of the usecases! It's 
> out of date, most likely because the binary the developer is working on got 
> rebuilt, or the system got upgraded - in both cases the developer does not care 
> about the old records anymore...

Oh, with mtime we'll be able to, with some effort, most of the time,
find the right symtab/DSO contents, it is not as good as the
content-based build-id, but it improves the current situation, for, to
reuse an euphemism, 99% of the cases ;-)

It is just a pity that what would arguably be the last step to make
content-based DSO identifiers a first class citizen, already available
mostly everywhere, i.e. in ELF binaries will have to wait a bit more.

With PeterZ's change to make the filename length part of the
PERF_RECORD_MMAP3 we at least leave the door open to including that
cookie without having to introduce PERF_RECORD_MMAP4 :-)

> What matters is #1, to detect mismatches, to be a reliable tool. Once we've
> detected that, we can inform the user and our job is mostly done.

Well, we could tell the user to install the package with that symtab, in fedora
it is called 'foo-debuginfo' and can be installed via, for instance (example
taken from a gdb post somewhere):

  [root@zoo ~]# dnf --enablerepo='*debug*' install /usr/lib/debug/.build-id/3d/f5385c6be529423a8ae3dd39a3deb9425201cc
  <SNIP>
  Using metadata from Wed Jan 13 12:12:16 2016 (0:02:56 hours old)
  Dependencies resolved.
  ==================================================================
   Package            Arch    Version      Repository          Size
  ==================================================================
  Installing:
   glibc-debuginfo    x86_64  2.20-8.fc21  updates-debuginfo  9.2 M

  Transaction Summary
  ==================================================================
  Install  1 Package

  Total download size: 9.2 M
  Installed size: 58 M
  Is this ok [y/N]:

-----------------------------------

I.e. infrastructure is in place to get this "time machine" you mention below, is somewhat
in place to get a symtab for a DSO, be it the latest version of some versions ago.

Which could be useful to understand the behaviour of code in production while an
update can't be applied (not vetted by powers that be, whatever reason).
 
> But reliable != perfect time machine. Really, #2 is a second, third order concern 
> that should never cause slowdowns on the magnitude that Peter is complaining 
> about!

Sure, mtime will improve peterz's usecase, no question about it.
 
> I realize that there might be special workflows (such as system-wide monitoring) 
> where collecting at recording time might be useful, but those are not the common 
> case - and they should not slow down the common case.

Sure, no question about this.
 
> > Checking for inefficiencies in the current approach of
> > right-after-recording post-processing looking for PERF_RECORD_MMAPs,
> > Adrian suggested something here, also disabling the saving into
> > ~/.debug/ will help, collecting numbers would be great.
> 
> I think Peter mentioned a number: the kernel build time almost _doubles_ with 
> this. That's clearly unacceptable.

And for him, most of the time, not a problem, in fact we could argue that he
has not a problem, Stephane seems to have ;-)
 
> > But the mtime thing also requires traversing the whole perf.data
> > contents looking for those paths in PERF_RECORD_MMAP records.
 
> But why? Why cannot we do it at perf report time, when we will parse them anyway?

Hey, I was talking only about 'record' time.

And at record, we don'have to traverse the whole perf.data contents as soon as
we add a new PERF_RECORD_MMAP3, just not with the initial motivation for it (a
content-based cookie), but instead the DSOs's mtime :-)

- Arnaldo