[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220711172706.rtfd6pp2pochmdre@google.com>
Date: Mon, 11 Jul 2022 10:27:06 -0700
From: Fangrui Song <maskray@...gle.com>
To: Ian Rogers <irogers@...gle.com>
Cc: Leo Yan <leo.yan@...aro.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Namhyung Kim <namhyung@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, Chang Rui <changruinj@...il.com>
Subject: Re: [RFC PATCH v1] perf symbol: Correct address for bss symbols
On 2022-07-11, Ian Rogers wrote:
>On Sat, Jul 9, 2022 at 6:22 PM Leo Yan <leo.yan@...aro.org> wrote:
>>
>> When using 'perf mem' and 'perf c2c', an issue is observed that tool
>> reports the wrong offset for global data symbols. This is a common
>> issue on both x86 and Arm64 platforms.
>>
>> Let's see an example, for a test program, below is the disassembly for
>> its .bss section which is dumped with objdump:
>>
>> ...
>>
>> Disassembly of section .data:
>>
>> 0000000000004000 <__data_start>:
>> ...
>>
>> 0000000000004008 <__dso_handle>:
>> 4008: 08 40 00 or %al,0x0(%rax)
>> 400b: 00 00 add %al,(%rax)
>> 400d: 00 00 add %al,(%rax)
>> ...
>>
>> 0000000000004010 <wait_to_begin>:
>> 4010: 01 00 add %eax,(%rax)
>> 4012: 00 00 add %al,(%rax)
>> 4014: 00 00 add %al,(%rax)
>> ...
>>
>> 0000000000004018 <lock_thd_name>:
>> 4018: 08 20 or %ah,(%rax)
>> 401a: 00 00 add %al,(%rax)
>> 401c: 00 00 add %al,(%rax)
>> ...
>>
>> 0000000000004020 <reader_thd_name>:
>> 4020: 10 20 adc %ah,(%rax)
>> 4022: 00 00 add %al,(%rax)
>> 4024: 00 00 add %al,(%rax)
>> ...
>>
>> Disassembly of section .bss:
>>
>> 0000000000004040 <completed.0>:
>> ...
>>
>> 0000000000004080 <buf1>:
>> ...
>>
>> 00000000000040c0 <buf2>:
>> ...
>>
>> 0000000000004100 <thread>:
>> ...
>>
>> First we used 'perf mem record' to run the test program and then used
>> 'perf --debug verbose=4 mem report' to observe what's the symbol info
>> for 'buf1' and 'buf2' structures.
>>
>> # ./perf mem record -e ldlat-loads,ldlat-stores -- false_sharing.exe 8
>> # ./perf --debug verbose=4 mem report
>> ...
>> dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 sh_addr: 0x4040 sh_offset: 0x3028
>> symbol__new: buf2 0x30a8-0x30e8
>> ...
>> dso__load_sym_internal: adjusting symbol: st_value: 0x4080 sh_addr: 0x4040 sh_offset: 0x3028
>> symbol__new: buf1 0x3068-0x30a8
>> ...
It seems unclear how 0x30a8 and 0x3068 are derived,
>> Perf tool relies on libelf to parse symbols, here 'st_value' is the
>> address from executable file, 'sh_addr' is the belonged section's linked
>> start address, and 'sh_offset' is the dynamic loaded address for this
>> section, then perf tool uses below formula to adjust symbol address:
>>
>> adjusted_address = st_value - sh_addr + sh_offset
>>
>> So we can see the final adjusted address ranges for buf1 and buf2 are
>> [0x30a8-0x30e8) and [0x3068-0x30a8) respectively, apparently this is
>> incorrect, in the code, the structure for 'buf1' and 'buf2' specifies
>> compiler attribute with 64-byte alignment.
so I cannot judge this paragraph.
>> The problem happens for 'sh_offset', libelf returns it as 0x3028 which
>> is not 64-byte aligned, on the other hand, we can see both 'st_value'
>> and 'sh_addr' are 64-byte aligned. Combining with disassembly, it's
>> likely libelf uses the .data section end address as .bss section
>> start address, therefore, it doesn't respect the alignment attribute for
>> structures in .bss section.
>>
>> Since .data and .bss sections are in the continuous virtual address
>> space, and .data section info returned by libelf is reliable, to fix
>> this issue, if detects it's a bss symbol, it rolls back to use .data
>> section info to adjust symbol's virtual address.
This is not necessarily true.
* In GNU ld's internal linker script, .data1 sits between .data and .bss.
* A linker script can add other sections between .data and .bss
* A linker script may place .data and .bss in two PT_LOAD program headers.
% readelf -WS aa
There are 13 section headers, starting at offset 0x10a8:
With a linker script like
% cat a/a.lds
SECTIONS {
.text : { *(.text) }
data1 : { *(data1) }
data2 : { *(data2) }
.bss : { *(.bss) }
}
I can get something like
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 0000000000000000 001000 000001 00 AX 0 0 1
[ 2] data1 PROGBITS 0000000000000001 001001 000001 00 WA 0 0 1
[ 3] data2 PROGBITS 0000000000000002 001002 000001 00 WA 0 0 1
[ 4] .data PROGBITS 0000000000000003 001003 000000 00 WA 0 0 1
[ 5] data3 PROGBITS 0000000000000003 001003 000001 00 WA 0 0 1
[ 6] .bss NOBITS 0000000000000020 001004 000001 00 WA 0 0 32
.bss's sh_offset does not need to be aligned per http://www.sco.com/developers/gabi/latest/ch4.sheader.html
sh_offset
This member's value gives the byte offset from the beginning of the file to the first byte in the section. One section type, SHT_NOBITS described below, occupies no space in the file, and its sh_offset member locates the conceptual placement in the file.
I don't have more context why the file offset is needed for a variable in the all-zero section.
If the file offset has to be used and we want to use a heuristic, a better one is to find the section index of .bss, say, i.
const uint64_t align = shdr[i].sh_addralign;
assert(i > 0);
if (shdr[i].offset % align == 0)
return shdr[i].offset;
return (shdr[i-1].sh_offset + shdr[i-1].sh_size + align - 1) & -align;
Really, it is better to use the program header to derive the virtual address of a variable residing in .bss.
>> Essentially, we need to fix libelf to return correct offsets for
>> sections, on the other hand, we live commonly with existed versions of
>> libelf. So we also need this change in perf tool.
>>
>> Fixes: f17e04afaff8 ("perf report: Fix ELF symbol parsing")
>> Reported-by: Chang Rui <changruinj@...il.com>
>> Signed-off-by: Leo Yan <leo.yan@...aro.org>
>
>This looks good to me, I'm happy to add my:
>Acked-by: Ian Rogers <irogers@...gle.com>
>I've added Fangrui Song who is more knowledge-able on ELF, libelf,
>etc. than me and may have additional thoughts.
>
>Thanks,
>Ian
>
>> ---
>> tools/perf/util/dso.h | 1 +
>> tools/perf/util/symbol-elf.c | 26 ++++++++++++++++++++++++++
>> 2 files changed, 27 insertions(+)
>>
>> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
>> index 3a9fd4d389b5..00f57f4ac6bc 100644
>> --- a/tools/perf/util/dso.h
>> +++ b/tools/perf/util/dso.h
>> @@ -180,6 +180,7 @@ struct dso {
>> u8 rel;
>> struct build_id bid;
>> u64 text_offset;
>> + int data_sec_index;
>> const char *short_name;
>> const char *long_name;
>> u16 long_name_len;
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index ecd377938eea..ed65dd26d58e 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -1095,6 +1095,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
>> Elf *elf;
>> int nr = 0;
>> bool remap_kernel = false, adjust_kernel_syms = false;
>> + size_t sec_index;
>>
>> if (kmap && !kmaps)
>> return -1;
>> @@ -1113,6 +1114,10 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
>> ".text", NULL))
>> dso->text_offset = tshdr.sh_addr - tshdr.sh_offset;
>>
>> + if (elf_section_by_name(runtime_ss->elf, &runtime_ss->ehdr, &tshdr,
>> + ".data", &sec_index))
>> + dso->data_sec_index = sec_index;
>> +
>> if (runtime_ss->opdsec)
>> opddata = elf_rawdata(runtime_ss->opdsec, NULL);
>>
>> @@ -1227,6 +1232,27 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
>>
>> gelf_getshdr(sec, &shdr);
>>
>> + /*
>> + * When the first data structure in .bss section is attributed
>> + * with alignment (e.g. 64-byte aligned), libelf doesn't reflect
>> + * the alignment in the 'shdr.sh_offset' field, at the end the
>> + * field is filled with the end loading address of a prior
>> + * section rather than the aligned address of .bss section.
>> + * This leads to mess for later parsing .bss symbols.
>> + *
>> + * Since .data and .bss sections are in the continuous virtual
>> + * address space, and .data section's info is reliable. So if
>> + * detects it's a bss symbol, we retrieve .data section info
>> + * for adjusting address.
>> + */
>> + if (!strcmp(elf_sec__name(&shdr, secstrs_sym), ".bss")) {
>> + sec = elf_getscn(syms_ss->elf, dso->data_sec_index);
>> + if (!sec)
>> + goto out_elf_end;
>> +
>> + gelf_getshdr(sec, &shdr);
>> + }
>> +
>> secstrs = secstrs_sym;
>>
>> /*
>> --
>> 2.25.1
>>
Powered by blists - more mailing lists