Message-ID: <20120326172313.GM25820@infradead.org>
Date:	Mon, 26 Mar 2012 14:23:13 -0300
From:	Arnaldo Carvalho de Melo <acme@...stprotocols.net>
To:	David Miller <davem@...emloft.net>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: perf crashes related to map/sym mismatch

On Sun, Mar 25, 2012 at 01:57:19AM -0400, David Miller wrote:
> 
> I have perf crashes that emanate in two different ways, but the cause
> seems to be identical.  The two symptoms are:
> 
> 1) SYM rbtree corruption, pointers have their low bits set.
> 
> 2) segmentation fault in symbol__inc_addr_samples
> 
>    h->addr[offset]++ crashes because offset is "huge" and
>    offset is "huge" because addr < sym->start
> 
> It turns out that #2 is what causes #1, incrementing random addresses
> eventually hits a SYM rbtree linkage address thus corrupting the
> pointer to be odd.

Agreed
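
A minimal sketch of symptom #2, not the actual perf code: with unsigned
64-bit addresses, addr - start wraps to a huge value whenever addr is
below the symbol start, so an unchecked increment lands far outside the
array. struct addr_hist, raw_offset() and hist_inc_checked() are
hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-symbol sample histogram. */
struct addr_hist {
	uint64_t sym_start;   /* first address covered by the symbol */
	size_t nr_slots;      /* number of entries in counts[] */
	unsigned int *counts; /* one counter per byte offset */
};

/* What the crashing path effectively computes: wraps when addr < sym_start. */
static uint64_t raw_offset(const struct addr_hist *h, uint64_t addr)
{
	return addr - h->sym_start;
}

/* Guarded variant: returns 0 on success, -1 if addr is outside the symbol. */
static int hist_inc_checked(struct addr_hist *h, uint64_t addr)
{
	if (addr < h->sym_start || addr - h->sym_start >= h->nr_slots)
		return -1;
	h->counts[addr - h->sym_start]++;
	return 0;
}
```

A bounds check like this would only hide the symptom; the sample would
still be attributed through the wrong map.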
 
> Why is "addr" smaller than sym->start?  It's because the 'map' used to
> perform ->map_ip() and adjust "ip" in perf_top__record_precise_ip() is
> different from the 'map' used earlier to invoke ->map_ip() to
> calculate the final al->addr value in thread__find_addr_map().
> 
> Basically if al->map != he->ms.map we are in trouble.
> 
> As best I can tell this happens because the hist entry sort routines
> do not take the map into account when doing comparisons of whether the
> symbols of two hist entries are equal.

And this seems to be the problematic part.
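
Roughly, the difference is between the two comparisons below. This is a
sketch under simplified assumptions, not the real sort routines: the
*_sketch structures and both comparison functions are hypothetical, and
pointers are compared by identity only to keep the example short.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the perf structures. */
struct map_sketch { uint64_t start, end; };
struct sym_sketch { uint64_t start, end; };

struct entry_sketch {
	const struct map_sketch *map;
	const struct sym_sketch *sym;
};

/* Three-way comparison of two pointer values. */
static int cmp_addr(uintptr_t a, uintptr_t b)
{
	return (a > b) - (a < b);
}

/* Symbol-only comparison: entries that differ only in map look equal. */
static int entry_cmp_sym_only(const struct entry_sketch *l,
			      const struct entry_sketch *r)
{
	return cmp_addr((uintptr_t)l->sym, (uintptr_t)r->sym);
}

/* Map-aware comparison: the map breaks the tie, so a stale map no
 * longer matches an entry created through the current one. */
static int entry_cmp_with_map(const struct entry_sketch *l,
			      const struct entry_sketch *r)
{
	int c = cmp_addr((uintptr_t)l->sym, (uintptr_t)r->sym);

	if (c)
		return c;
	return cmp_addr((uintptr_t)l->map, (uintptr_t)r->map);
}
```

With the map-aware version, a lookup after the exec() would not find the
pre-exec entry and would create a new one instead.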
 
> So you can end up with a hist entry from a lookup which uses a 'map'
> on a DSO which is stale and has subsequently been updated from a more
> recent MMAP event.  In my case we have two map objects of libc, the
> older one in the hist_entry covers:
> 
> start = 0xf77bc000
> end = 0xf7928000
> 
> whereas the newer one in the al->map covers:
> 
> start = 0xf765c000
> end = 0xf77c8000

Having multiple hists seems the way to go: the hits that happened on the
old map would stay constrained to it, so annotation etc. would still work
on them, and the new hits, which take place after the previous map was
retired, would be associated with the new one, no?
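
To make the two ranges above concrete, here is a small sketch. It
assumes the end address is exclusive and a page offset of zero;
struct map_range, map_contains() and map_rel() are hypothetical helpers,
not perf functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical address range of one mapping. */
struct map_range { uint64_t start, end; };

/* The two libc mappings quoted above. */
static const struct map_range old_libc = { 0xf77bc000ULL, 0xf7928000ULL };
static const struct map_range new_libc = { 0xf765c000ULL, 0xf77c8000ULL };

/* True if ip falls inside the mapping (end treated as exclusive). */
static bool map_contains(const struct map_range *m, uint64_t ip)
{
	return ip >= m->start && ip < m->end;
}

/* Map-relative address, assuming a page offset of zero.
 * Wraps around if ip is below the start of the mapping. */
static uint64_t map_rel(const struct map_range *m, uint64_t ip)
{
	return ip - m->start;
}
```

An ip of 0xf7700000 is covered only by the new map, where it translates
to 0xa4000; run through the old map, the subtraction wraps. The two
ranges also overlap between 0xf77bc000 and 0xf77c8000, so the address
alone does not say which map a hit belongs to.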
 
> The hist_entry map is probably in the current thread's removed_maps
> tree, and indeed there is an explicit comment about this in
> map_groups__flush()
> 
> So it looks like we did a lookup on a symbol in libc pre-exec() and
> created a hist_entry for it, then we flush the map groups on the
> exec() and in the newly exec'd program we then do a lookup on the same
> symbol and this is where we find the hist_entry with the out-of-date
> map information.
> 
> Perhaps the right thing to do is to explicitly detect this
> situation and flush out the hist_entry with the stale map information
> so we can create one with more up-to-date mapping info.

Checking hist_entry->ms.map would solve this case, I think.

Will read this code a bit more to check possible complications...

- Arnaldo
