Message-Id: <20120325.015719.1170644257291462749.davem@davemloft.net>
Date:	Sun, 25 Mar 2012 01:57:19 -0400 (EDT)
From:	David Miller <davem@...emloft.net>
To:	acme@...stprotocols.net
CC:	linux-kernel@...r.kernel.org
Subject: perf crashes related to map/sym mismatch


I have perf crashes that manifest in two different ways, but the cause
seems to be identical.  The two symptoms are:

1) SYM rbtree corruption, pointers have their low bits set.

2) segmentation fault in symbol__inc_addr_samples

   h->addr[offset]++ crashes because offset is "huge" and
   offset is "huge" because addr < sym->start

It turns out that #2 is what causes #1: incrementing random addresses
eventually hits a SYM rbtree linkage address, thus corrupting the
pointer to be odd.
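
To make symptom #2 concrete, here is a minimal sketch of the wraparound.
It is not the perf source: sample_offset() and offset_in_bounds() are
hypothetical helpers, and the 64-bit unsigned offset type is an
assumption.  The only point is that unsigned "addr - sym->start" wraps
to a huge index when addr < sym->start, and that a bounds check would
catch it before the increment.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from perf): the histogram index is the
 * sample address relative to the symbol start.  The arithmetic is
 * unsigned, so addr < sym_start wraps around to a huge value.
 */
static uint64_t sample_offset(uint64_t addr, uint64_t sym_start)
{
	return addr - sym_start;
}

/*
 * Hypothetical guard: returns 1 only if the offset is a valid index
 * into a histogram with sym_len entries, 0 otherwise.
 */
static int offset_in_bounds(uint64_t addr, uint64_t sym_start,
			    uint64_t sym_len)
{
	return addr >= sym_start && addr - sym_start < sym_len;
}
```

With such a guard, a sample whose address falls below the symbol start
would be rejected instead of scribbling on unrelated memory.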

Why is "addr" smaller than sym->start?  It's because the 'map' used to
perform ->map_ip() and adjust "ip" in perf_top__record_precise_ip() is
different from the 'map' used earlier to invoke ->map_ip() to
calculate the final al->addr value in thread__find_addr_map().

Basically if al->map != he->ms.map we are in trouble.

As best I can tell this happens because the hist entry sort routines
do not take the map into account when doing comparisons of whether the
symbols of two hist entries are equal.
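
As a rough illustration of the comparison problem, the sketch below
contrasts a comparator that looks only at the symbol with one that also
takes the map into account.  The struct layouts and function names are
made up for the example, and the real perf sort routines compare
symbols differently; this only shows why two entries with the same
symbol but different maps collapse into one.

```c
#include <stdint.h>

/* Simplified stand-ins for illustration; not perf's real types. */
struct sym {
	uint64_t start;
};

struct map_sym {
	const void *map;	/* opaque map identity */
	const struct sym *sym;
};

/* Three-way comparison of two pointers by address. */
static int cmp_ptr(const void *a, const void *b)
{
	uintptr_t x = (uintptr_t)a, y = (uintptr_t)b;

	return (x > y) - (x < y);
}

/* Symbol-only comparison: entries with different maps look equal. */
static int cmp_sym_only(const struct map_sym *l, const struct map_sym *r)
{
	return cmp_ptr(l->sym, r->sym);
}

/* Map-aware comparison: a different map makes the entries distinct. */
static int cmp_map_and_sym(const struct map_sym *l, const struct map_sym *r)
{
	int c = cmp_ptr(l->map, r->map);

	return c ? c : cmp_ptr(l->sym, r->sym);
}
```

Under the symbol-only comparison a lookup with a new map would land on
the existing entry that still carries the old map, which is the
situation described above.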

So you can end up with a hist entry from a lookup which uses a 'map'
on a DSO which is stale and has subsequently been updated from a more
recent MMAP event.  In my case we have two map objects of libc, the
older one in the hist_entry covers:

start = 0xf77bc000
end = 0xf7928000

whereas the newer one in the al->map covers:

start = 0xf765c000
end = 0xf77c8000
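
To show how far apart the two translations end up, here is a small
sketch using the two ranges above.  It assumes the usual
"ip - start + pgoff" translation with pgoff = 0; struct map_range and
this map_ip() are simplified stand-ins, not the perf definitions.

```c
#include <stdint.h>

/* Simplified stand-in for a map; pgoff is assumed to be 0 here. */
struct map_range {
	uint64_t start;
	uint64_t end;
	uint64_t pgoff;
};

/* Translate an absolute ip into a map-relative address. */
static uint64_t map_ip(const struct map_range *m, uint64_t ip)
{
	return ip - m->start + m->pgoff;
}

/* The older map held by the hist_entry. */
static const struct map_range stale_map = {
	0xf77bc000ULL, 0xf7928000ULL, 0
};

/* The newer map found in al->map. */
static const struct map_range fresh_map = {
	0xf765c000ULL, 0xf77c8000ULL, 0
};
```

For an ip of 0xf77c0000, which lies inside both ranges, the newer map
gives 0x164000 while the older map gives 0x4000.  Any symbol located
via the newer map near 0x164000 has a start far above 0x4000, so an
address recomputed through the older map ends up below sym->start.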

The hist_entry map is probably in the current thread's removed_maps
tree, and indeed there is an explicit comment about this in
map_groups__flush().

So it looks like we did a lookup on a symbol in libc pre-exec() and
created a hist_entry for it, then we flush the map groups on the
exec() and in the newly exec'd program we then do a lookup on the same
symbol and this is where we find the hist_entry with the out-of-date
map information.

Perhaps the right thing to do is to explicitly detect this
situation and flush out the hist_entry with the stale map information
so we can create one with more up-to-date mapping info.
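
One possible shape for such a check, purely as a sketch: struct entry
and entry_refresh_map() are hypothetical and do not exist in perf, and
resetting the sample count is an assumption about what "flushing" the
entry would mean.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified hist entry for illustration. */
struct entry {
	const void *map;		/* map the entry was created with */
	unsigned long nr_samples;	/* samples accumulated so far */
};

/*
 * If the entry was created against a different map than the one the
 * current lookup resolved to, discard its accumulated state and rebind
 * it to the current map.  Returns 1 if the entry was refreshed, 0 if
 * the maps already matched.
 */
static int entry_refresh_map(struct entry *he, const void *current_map)
{
	if (he->map == current_map)
		return 0;

	he->map = current_map;
	he->nr_samples = 0;	/* assumption: stale counts are dropped */
	return 1;
}
```

The caller would run this right after finding an existing entry and
before adjusting "ip", so both translations use the same map.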
