Message-ID: <e9212ccf-b02d-c2d0-f45f-a94ec2b82c5b@meta.com>
Date:   Mon, 1 May 2023 10:20:21 -0700
From:   Yonghong Song <yhs@...a.com>
To:     Espen Grindhaug <espen.grindhaug@...il.com>
Cc:     Andrii Nakryiko <andrii@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Martin KaFai Lau <martin.lau@...ux.dev>,
        Song Liu <song@...nel.org>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>,
        Stanislav Fomichev <sdf@...gle.com>,
        Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
        Mykola Lysenko <mykolal@...com>, Shuah Khan <shuah@...nel.org>,
        bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-kselftest@...r.kernel.org
Subject: Re: [PATCH v2] libbpf: Improve version handling when attaching uprobe



On 5/1/23 9:30 AM, Espen Grindhaug wrote:
> On Mon, May 01, 2023 at 08:23:35AM -0700, Yonghong Song wrote:
>>
>>
>> On 5/1/23 6:00 AM, Espen Grindhaug wrote:
>>> On Thu, Apr 27, 2023 at 06:19:29PM -0700, Yonghong Song wrote:
>>>>
>>>>
>>>> On 4/27/23 12:19 PM, Espen Grindhaug wrote:
>>>>> On Wed, Apr 26, 2023 at 02:47:27PM -0700, Yonghong Song wrote:
>>>>>>
>>>>>>
>>>>>> On 4/23/23 11:55 AM, Espen Grindhaug wrote:
>>>>>>> This change fixes the handling of versions in elf_find_func_offset.
>>>>>>> In the previous implementation, we incorrectly assumed that the
>>>>>>
>>>>>> Could you give more explanation/example in the commit message
>>>>>> what does 'incorrectly' mean here? In which situations the
>>>>>> current libbpf implementation will not be correct?
>>>>>>
>>>>>
>>>>> How about something like this?
>>>>>
>>>>>
>>>>> libbpf: Improve version handling when attaching uprobe
>>>>>
>>>>> This change fixes the handling of versions in elf_find_func_offset.
>>>>>
>>>>> For example, let's assume we are trying to attach a uprobe to pthread_create in
>>>>> glibc. Prior to this commit, it would fail with an error message saying 'elf:
>>>>> ambiguous match [...]', because there are two entries in the symbol
>>>>> table with that name.
>>>>>
>>>>> $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep pthread_create
>>>>> 0000000000094cc0 T pthread_create@GLIBC_2.2.5
>>>>> 0000000000094cc0 T pthread_create@@GLIBC_2.34
>>>>>
>>>>> So we go ahead and modify our code to attach to 'pthread_create@@GLIBC_2.34',
>>>>> and this also fails, but this time with the error 'elf: failed to find symbol
>>>>> [...]'. This fails because we incorrectly assumed that the version information
>>>>> would be present in the string found in the string table, but there is only the
>>>>> string 'pthread_create'.
>>>>
>>>> I tried one example with my centos8 libpthread library.
>>>>
>>>> $ llvm-readelf -s /lib64/libc-2.28.so | grep pthread_cond_signal
>>>>       39: 0000000000095f70    43 FUNC    GLOBAL DEFAULT    14 pthread_cond_signal@@GLIBC_2.3.2
>>>>       40: 0000000000096250    43 FUNC    GLOBAL DEFAULT    14 pthread_cond_signal@GLIBC_2.2.5
>>>>     3160: 0000000000096250    43 FUNC    LOCAL  DEFAULT    14 __pthread_cond_signal_2_0
>>>>     3589: 0000000000095f70    43 FUNC    LOCAL  DEFAULT    14 __pthread_cond_signal
>>>>     5522: 0000000000095f70    43 FUNC    GLOBAL DEFAULT    14 pthread_cond_signal@@GLIBC_2.3.2
>>>>     5545: 0000000000096250    43 FUNC    GLOBAL DEFAULT    14 pthread_cond_signal@GLIBC_2.2.5
>>>> $ nm -D /lib64/libc-2.28.so | grep pthread_cond_signal
>>>> 0000000000095f70 T pthread_cond_signal@@GLIBC_2.3.2
>>>> 0000000000096250 T pthread_cond_signal@GLIBC_2.2.5
>>>> $
>>>>
>>>> Note that the two pthread_cond_signal functions have different addresses,
>>>> which is expected as they are implemented for different versions.
>>>>
>>>> But in your case,
>>>>> $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep pthread_create
>>>>> 0000000000094cc0 T pthread_create@GLIBC_2.2.5
>>>>> 0000000000094cc0 T pthread_create@@GLIBC_2.34
>>>>
>>>> The two functions have the same address, which is very weird. I suspect
>>>> there is an issue here that at least needs some investigation.
>>>>
>>>
>>> I am no expert on this, but as far as I can tell, this is normal,
>>> although much more common on my Ubuntu machine than my Fedora machine.
>>>
>>> Script to find duplicates:
>>>
>>> nm -D /usr/lib64/libc-2.33.so | awk '
>>> {
>>>       addr = $1;
>>>       symbol = $3;
>>>       sub(/[@].*$/, "", symbol);
>>>
>>>       if (addr == prev_addr && symbol == prev_symbol) {
>>>           if (prev_symbol_printed == 0) {
>>>               print prev_line;
>>>               prev_symbol_printed = 1;
>>>           }
>>>           print;
>>>       } else {
>>>           prev_symbol_printed = 0;
>>>       }
>>>       prev_addr = addr;
>>>       prev_symbol = symbol;
>>>       prev_line = $0;
>>> }'
>>>
>>>
>>>> Second, for the symbol table, the following is ELF encoding,
>>>>
>>>> typedef struct {
>>>>           Elf64_Word      st_name;
>>>>           unsigned char   st_info;
>>>>           unsigned char   st_other;
>>>>           Elf64_Half      st_shndx;
>>>>           Elf64_Addr      st_value;
>>>>           Elf64_Xword     st_size;
>>>> } Elf64_Sym;
>>>>
>>>> where
>>>> st_name
>>>>
>>>>       An index into the object file's symbol string table, which holds the
>>>> character representations of the symbol names. If the value is nonzero, the
>>>> value represents a string table index that gives the symbol name. Otherwise,
>>>> the symbol table entry has no name.
>>>>
>>>> So, the function name (including @..., @@...) should be in string table
>>>> which is the same for the above two pthread_cond_signal symbols.
>>>>
>>>> I think it is worthwhile to debug why in your situation
>>>> pthread_create@GLIBC_2.2.5 and pthread_create@@GLIBC_2.34 do not
>>>> have them in the string table.
>>>>
>>>
>>> I think you are mistaken here; the strings in the strings table don't contain
>>> the version. Take a look at this partial dump of the strings table.
>>>
>>> 	$ readelf -W -p .dynstr /usr/lib64/libc-2.33.so
>>>
>>> 	String dump of section '.dynstr':
>>> 		[     1]  xdrmem_create
>>> 		[     f]  __wctomb_chk
>>> 		[    1c]  getmntent
>>> 		[    26]  __freelocale
>>> 		[    33]  __rawmemchr
>>> 		[    3f]  _IO_vsprintf
>>> 		[    4c]  getutent
>>> 		[    55]  __file_change_detection_for_path
>>> 	(...)
>>> 		[  350e]  memrchr
>>> 		[  3516]  pthread_cond_signal
>>> 		[  352a]  __close
>>> 	(...)
>>> 		[  61b6]  GLIBC_2.2.5
>>> 		[  61c2]  GLIBC_2.2.6
>>> 		[  61ce]  GLIBC_2.3
>>> 		[  61d8]  GLIBC_2.3.2
>>> 		[  61e4]  GLIBC_2.3.3
>>>
>>> As you can see, the strings have no versions, and the version strings
>>> themselves are also in this table as entries at the end of the table.
>>
>> I see you searched the .dynstr section. Do you think we should
>> search .strtab instead, since it contains versioned symbols?
>>
> 
> I searched .dynstr since my libc files only have that section, but I do see
> your point. If const char *binary_path points to an executable and not an
> .so file, then we would find some versioned symbols in the .strtab section.
> However, since libbpf supports using the .so as binary_path, would we not
> need the functionality to build the complete name regardless?

Okay, so you do not have a .strtab section; it was probably removed
with `llvm-strip --strip-all <binary>`. In this particular case, I think
your approach of searching SHT_GNU_versym and SHT_GNU_verdef for versioned
symbols is probably the right choice. Please add this information
to the commit message.
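
To make the versym/verdef lookup concrete, here is a minimal sketch of
mapping one SHT_GNU_versym entry to its version string by walking
SHT_GNU_verdef. It is not the code from the patch: the helper name
version_name, its parameters, and the SKETCH_* constants are hypothetical,
and it assumes the section contents have already been read into memory and
that glibc's <elf.h> is available. It only covers symbols defined in the
object (verdef), not imported ones (verneed).

```c
#include <elf.h>     /* Elf64_Verdef, Elf64_Verdaux, VER_NDX_* (glibc) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for the versym bit layout. */
#define SKETCH_VERSYM_HIDDEN 0x8000u /* set: non-default version, shown as '@' */
#define SKETCH_VERSYM_INDEX  0x7fffu /* low 15 bits: version definition index */

/*
 * Hypothetical helper: return the version string for one versym entry,
 * or NULL if the symbol is unversioned or no definition matches.
 * 'verdef' and 'dynstr' point at the raw section contents.
 */
static const char *version_name(uint16_t versym,
                                const unsigned char *verdef, size_t verdef_size,
                                const char *dynstr, size_t dynstr_size)
{
    uint16_t ndx = versym & SKETCH_VERSYM_INDEX;
    size_t off = 0;

    /* Indexes 0 and 1 are reserved for local and unversioned global symbols. */
    if (ndx == VER_NDX_LOCAL || ndx == VER_NDX_GLOBAL)
        return NULL;

    while (off + sizeof(Elf64_Verdef) <= verdef_size) {
        Elf64_Verdef vd;

        /* memcpy avoids alignment assumptions about the raw buffer. */
        memcpy(&vd, verdef + off, sizeof(vd));

        if ((vd.vd_ndx & SKETCH_VERSYM_INDEX) == ndx) {
            Elf64_Verdaux aux;
            size_t aux_off = off + vd.vd_aux;

            if (aux_off + sizeof(aux) > verdef_size)
                return NULL;
            memcpy(&aux, verdef + aux_off, sizeof(aux));

            /* The first aux entry names the version itself. */
            if (aux.vda_name >= dynstr_size)
                return NULL;
            return dynstr + aux.vda_name;
        }

        if (vd.vd_next == 0)
            break;
        off += vd.vd_next;
    }
    return NULL;
}
```

A caller would test versym & SKETCH_VERSYM_HIDDEN to decide between '@'
and '@@' when building the full name, which matches how nm prints the two
forms.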

> 
> Adding a check to not build the full name if it already contains an '@' is
> probably a good idea, though.

If you search strtab, you will find names containing '@', but this won't
be the case if you are using SHT_GNU_versym/SHT_GNU_verdef. Since both
dynstr and strtab are searched, I guess adding this check is a good idea
for the strtab case when the version is not NULL.
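
As an illustration of that check, the comparison could look roughly like
the sketch below. This is not libbpf's implementation: split_name and
symbol_matches are hypothetical helpers, and treating '@' and '@@' as
equivalent and letting an unversioned request match any version are
assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Base-name length plus a pointer to the version part (NULL if none). */
struct name_parts {
    size_t base_len;
    const char *ver;
};

/* Hypothetical helper: split "func", "func@VER" or "func@@VER". */
static struct name_parts split_name(const char *name)
{
    struct name_parts p;
    const char *at = strchr(name, '@');

    if (!at) {
        p.base_len = strlen(name);
        p.ver = NULL;
        return p;
    }
    p.base_len = (size_t)(at - name);
    p.ver = at + 1;
    if (*p.ver == '@') /* skip the second '@' of a default version */
        p.ver++;
    return p;
}

/*
 * Hypothetical helper: does the user-supplied name match this symbol?
 * 'sym_name' is the string-table name; 'sym_ver' is the version found via
 * versym/verdef, or NULL. If 'sym_name' already contains '@' (the .strtab
 * case), its embedded version is used and 'sym_ver' is ignored.
 */
static bool symbol_matches(const char *requested, const char *sym_name,
                           const char *sym_ver)
{
    struct name_parts req = split_name(requested);
    struct name_parts sym = split_name(sym_name);

    if (!sym.ver)
        sym.ver = sym_ver;

    if (req.base_len != sym.base_len ||
        strncmp(requested, sym_name, req.base_len) != 0)
        return false;

    /* Assumption: an unversioned request matches any version. */
    if (!req.ver)
        return true;

    return sym.ver && strcmp(req.ver, sym.ver) == 0;
}
```

With this shape, symbol_matches("pthread_create@@GLIBC_2.34",
"pthread_create", "GLIBC_2.34") and symbol_matches(
"pthread_create@@GLIBC_2.34", "pthread_create@@GLIBC_2.34", NULL) both
succeed, so the dynstr and strtab paths share one comparison.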

> 
>>>
>>>>>
>>>>> This patch reworks how we compare the symbol name provided by the user if it is
>>>>> qualified with a version (using @ or @@). We now look up the correct version
>>>>> string in the version symbol table before constructing the full name, as also
>>>>> done above by nm, before comparing.
>>>>>
>>>>>>> version information would be present in the string found in the
>>>>>>> string table.
>>>>>>>
>>>>>>> We now look up the correct version string in the version symbol
>>>>>>> table before constructing the full name and then comparing.
>>>>>>>
>>>>>>> This patch adds support for both name@version and name@@version to
>>>>>>> match output of the various elf parsers.
>>>>>>>
>>>>>>> Signed-off-by: Espen Grindhaug <espen.grindhaug@...il.com>
>>>>>>
>>>>>> [...]
