Message-ID: <20170711125116.GD27350@kernel.org>
Date:   Tue, 11 Jul 2017 09:51:16 -0300
From:   Arnaldo Carvalho de Melo <acme@...nel.org>
To:     Krister Johansen <kjlx@...pleofstupid.com>
Cc:     Thomas-Mich Richter <tmricht@...ux.vnet.ibm.com>,
        Brendan Gregg <brendan.d.gregg@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 tip/perf/core 1/6] perf symbols: find symbols in
 different mount namespace

On Mon, Jul 10, 2017 at 04:29:43PM -0700, Krister Johansen wrote:
> On Mon, Jul 10, 2017 at 07:52:49PM -0300, Arnaldo Carvalho de Melo wrote:
> > I will work on testing them soon, I just wanted this discussion to take
> > place, what you did seems to be the best we can do with the existing
> > kernel infrastructure, and is a clear advance, so we need to test and
> > merge it.
 
> Happy to have the discussion. Apologies if having the patches
> iteratively add to one another isn't the best way to have this reviewed
> and understood.  If you just apply the first few, you don't get the
> support to pull these into the build-id cache.
 
> > Getting the build-ids for the binaries is the key here, then it's just a
> > matter of populating a database where to get the matching binaries, we
> > wouldn't even need to copy the actual binaries at record time.
 
> Unfortunately, it's not sufficient to save the path to the target binary
> because it's possible that after the container exits, and the namespace

The path is not that important, as "/usr/lib64/libc-2.24.so" is not
enough to uniquely identify a binary. For instance, on this machine
I have:

[root@...et ~]# ls -la /root/.debug/usr/lib64/libc-2.24.so/
total 16
drwxr-xr-x.  4 root root 4096 Jun 29 15:46 .
drwxr-xr-x. 40 root root 4096 Jul  7 12:28 ..
drwxr-xr-x.  2 root root 4096 Jun 29 15:46 1c80f527d122e71f3dd3bd7d7f8a00a80143ae53
drwxr-xr-x.  2 root root 4096 Jun 23 10:43 b0fa2afea4d9239b66a0a260cbaceb1b9532299a
[root@...et ~]#

[root@...et ~]# file /root/.debug/usr/lib64/libc-2.24.so/*/elf 
/root/.debug/usr/lib64/libc-2.24.so/1c80f527d122e71f3dd3bd7d7f8a00a80143ae53/elf: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=1c80f527d122e71f3dd3bd7d7f8a00a80143ae53, for GNU/Linux 2.6.32, not stripped
/root/.debug/usr/lib64/libc-2.24.so/b0fa2afea4d9239b66a0a260cbaceb1b9532299a/elf: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=b0fa2afea4d9239b66a0a260cbaceb1b9532299a, for GNU/Linux 2.6.32, not stripped
[root@...et ~]#

[root@...et ~]# readelf -sW /root/.debug/usr/lib64/libc-2.24.so/1c80f527d122e71f3dd3bd7d7f8a00a80143ae53/elf > /tmp/a
[root@...et ~]# readelf -sW /root/.debug/usr/lib64/libc-2.24.so/b0fa2afea4d9239b66a0a260cbaceb1b9532299a/elf > /tmp/b
[root@...et ~]# diff -u /tmp/a /tmp/b | wc -l
16398
[root@...et ~]# diff -u /tmp/a /tmp/b | head
@@ -13,298 +13,298 @@
      9: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  UND _dl_argv@...BC_PRIVATE (27)
     10: 000000000009fbd0    29 FUNC    GLOBAL DEFAULT   13 __strspn_c1@...BC_2.2.5
     11: 0000000000072690   333 FUNC    GLOBAL DEFAULT   13 putwchar@@GLIBC_2.2.5
-    12: 00000000001195c0    19 FUNC    GLOBAL DEFAULT   13 __gethostname_chk@@GLIBC_2.4
+    12: 0000000000119630    19 FUNC    GLOBAL DEFAULT   13 __gethostname_chk@@GLIBC_2.4
     13: 000000000009fbf0    37 FUNC    GLOBAL DEFAULT   13 __strspn_c2@...BC_2.2.5
-    14: 0000000000132e80   192 FUNC    GLOBAL DEFAULT   13 setrpcent@@GLIBC_2.2.5
[root@...et ~]# 

We need to get the content-based unique identifier for a binary as soon
as possible, then try to use just that, not the pathname.
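
A minimal sketch of what "content-based identifier" means in practice,
assuming the caller has already extracted the raw bytes of an ELF note
section or segment. It walks the notes looking for the GNU build-id and
then forms a cache entry laid out like the directory listing above. The
names find_build_id and cache_entry are hypothetical, and this is not
perf's own code.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type the GNU toolchain uses for build-ids


def find_build_id(notes, byteorder="<"):
    """Return the build-id as a hex string from raw ELF note bytes, or None."""
    offset = 0
    while offset + 12 <= len(notes):
        # Each note starts with three 32-bit words: namesz, descsz, type.
        namesz, descsz, ntype = struct.unpack_from(byteorder + "III", notes, offset)
        offset += 12
        name = notes[offset:offset + namesz]
        offset += (namesz + 3) & ~3  # name is padded to a 4-byte boundary
        desc = notes[offset:offset + descsz]
        offset += (descsz + 3) & ~3  # descriptor is padded the same way
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\0" and len(desc) == descsz:
            return desc.hex()
    return None


def cache_entry(debug_dir, pathname, build_id):
    """Mirror the layout shown above: <debug_dir><pathname>/<build-id>/elf."""
    return debug_dir.rstrip("/") + pathname + "/" + build_id + "/elf"
```

Two binaries with the same pathname but different contents then map to
different cache entries, which is the point of keying on the build-id.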

> is destroyed, there may be no path that describes to the host how to
> access the files in the container.  There are two different interactions

Right, we need to use the build-id and look it up in a database
populated somehow.

Right now perf, by default, collects the build-ids into a table at the
end of the recording session. During recording it tries not to disrupt
the monitored workload: it processes nothing, just reads from the
buffers and dumps to a file.

It will also try to populate the build-id cache, first trying to make a
hardlink and copying the file if that fails.
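
The hardlink-then-copy fallback can be sketched as follows. This is an
illustration of the strategy only, not perf's implementation, and
add_to_cache is a hypothetical name.

```python
import os
import shutil


def add_to_cache(src, dst):
    """Place src at dst, preferring a hardlink and copying if linking fails."""
    parent = os.path.dirname(dst)
    if parent:
        os.makedirs(parent, exist_ok=True)
    try:
        os.link(src, dst)  # cheap: no data is duplicated
        return "hardlink"
    except FileExistsError:
        return "exists"  # already cached, nothing to do
    except OSError:
        # Linking can fail, e.g. when src and dst are on different
        # filesystems, so fall back to copying the contents.
        shutil.copyfile(src, dst)
        return "copy"
```

A hardlink only works while the source file is still reachable by a path
and lives on the same filesystem as the cache, so the copy path is the
one that matters for binaries in another mount namespace.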

If we can get the build-id at the time of the mmap(binary), as part of
the loading of binaries, that would be ideal, as we're touching the file
headers anyway and the build-id is a small enough cookie.

But again, we should first try to go as far as we can with the
infrastructure we have in the kernel and tooling libraries; lots of
workloads will be served just fine by that.

> here that frustrate this:
> 
> 1. Containers run under a pivoted root, so the container's view of the
> path may be different from the host's view of the path.  E.g. /usr/bin/node
> in the container may actually be /var/container_a/root/usr/bin/node, or
> something like that.  However, see #2.
> 
> 2. It's also entirely possible for a container to have mounted a
> filesystem that's not accessible or mounted from the host.  If, for
> example, you're using docker with the direct-lvm storage driver, then
> your storage device may be mounted in the vfs attached to the container,
> but have no mount in the host's vfs.  In a situation like this, once the
> container exits, that lvm filesystem is unmounted.  In order to
> access the files in that container, you basically need to setns(2) into
> the container's mount namespace and look up the files using a path
> that resolves in the mount namespace of perf's target.

That all frustrates accessing the binary via a pathname, agreed.
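
For reference, the setns(2) approach described in the quoted text could
look roughly like the sketch below. It assumes a Linux host, sufficient
privileges (typically root), a single-threaded caller, and a target
process that is still alive. The helper names mnt_ns_path and
read_in_mount_ns are hypothetical, and this is not what the patches
themselves do line for line.

```python
import ctypes
import os

CLONE_NEWNS = 0x00020000  # mount namespace flag from <sched.h>


def mnt_ns_path(pid):
    """Path of the mount namespace handle for a process."""
    return "/proc/%d/ns/mnt" % pid


def read_in_mount_ns(pid, path):
    """Read path as it resolves inside the mount namespace of process pid."""
    libc = ctypes.CDLL(None, use_errno=True)
    # Open both handles first: /proc may look different after switching.
    own_fd = os.open(mnt_ns_path(os.getpid()), os.O_RDONLY)
    target_fd = os.open(mnt_ns_path(pid), os.O_RDONLY)
    try:
        if libc.setns(target_fd, CLONE_NEWNS) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        try:
            with open(path, "rb") as f:
                return f.read()
        finally:
            # Return to the original namespace; note that the working
            # directory is reset to the root as a side effect.
            libc.setns(own_fd, CLONE_NEWNS)
    finally:
        os.close(target_fd)
        os.close(own_fd)
```

This only helps while the namespace still exists, which is why copying
the binary into the build-id cache at that point matters.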

- Arnaldo
