Message-ID: <CAP-5=fU0o1iKL2c35sNN9XNvzdufQhSAYn0DiE3hnvft4aAsmQ@mail.gmail.com>
Date:   Tue, 30 May 2023 07:45:09 -0700
From:   Ian Rogers <irogers@...gle.com>
To:     Andi Kleen <ak@...ux.intel.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Adrian Hunter <adrian.hunter@...el.com>,
        "Masami Hiramatsu (Google)" <mhiramat@...nel.org>,
        "Steven Rostedt (Google)" <rostedt@...dmis.org>,
        Ross Zwisler <zwisler@...omium.org>,
        Leo Yan <leo.yan@...aro.org>,
        Tiezhu Yang <yangtiezhu@...ngson.cn>,
        Yang Jihong <yangjihong1@...wei.com>,
        Kan Liang <kan.liang@...ux.intel.com>,
        Ravi Bangoria <ravi.bangoria@....com>,
        Sean Christopherson <seanjc@...gle.com>,
        K Prateek Nayak <kprateek.nayak@....com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org
Subject: Re: [PATCH v2 00/16] Address some perf memory/data size issues

On Tue, May 30, 2023 at 12:59 AM Andi Kleen <ak@...ux.intel.com> wrote:
>
> > BSS won't count toward file size, which the patches were primarily
> > going after - but checking the size numbers I have miscalculated from
> > reading size's output that I'm not familiar with. The numbers are
> > still improved, but I just see a 37kb saving, with 5kb more in
> > .rodata. Something but not much. .data.rel.ro is larger, which imo is
> > good, but those pages will still be dirtied so a moot point wrt file
> > size and memory overhead.
>
> The way perf is written (lots of separate code depending on a single high level
> switch) most pages probably won't be dirtied.

For data everything is relocated when perf is loaded. Setting a
breakpoint on main and then dumping smaps (edited for brevity) I see:
```
555555554000-5555555f8000 r--p 00000000 fe:01 32936368  /tmp/perf/perf
Size:                656 kB
Pss:                 656 kB
Pss_Dirty:             0 kB
5555555f8000-555555828000 r-xp 000a4000 fe:01 32936368  /tmp/perf/perf
Size:               2240 kB
Pss:                  32 kB
Pss_Dirty:             8 kB
555555828000-555555f23000 r--p 002d4000 fe:01 32936368  /tmp/perf/perf
Size:               7148 kB
Pss:                  64 kB
Pss_Dirty:             0 kB
555555f23000-555555f6d000 r--p 009cf000 fe:01 32936368  /tmp/perf/perf
Size:                296 kB
Pss:                 288 kB
Pss_Dirty:           288 kB
555555f6d000-555555f87000 rw-p 00a19000 fe:01 32936368  /tmp/perf/perf
Size:                104 kB
Pss:                 104 kB
Pss_Dirty:           104 kB
```
These are roughly the ELF headers, text, .rodata, .data.rel.ro and
.data. So at the point we enter main we already have 392kB of dirty
pages (288kB + 104kB) in .data.rel.ro and .data.
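
As a rough sketch of how that total can be reproduced from a saved
smaps dump, the helper below adds up every Pss_Dirty line in a text
buffer. The function name sum_pss_dirty_kb is hypothetical and not part
of perf; it assumes the "Pss_Dirty: N kB" field format shown above.
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from perf): sum the values of all
 * "Pss_Dirty: N kB" lines found in smaps-formatted text.
 */
static long sum_pss_dirty_kb(const char *text)
{
	static const char key[] = "Pss_Dirty:";
	long total = 0;
	const char *p = text;

	assert(text != NULL);
	while ((p = strstr(p, key)) != NULL) {
		long kb;

		p += strlen(key);
		/* %ld skips the padding spaces before the number. */
		if (sscanf(p, "%ld", &kb) == 1)
			total += kb;
	}
	return total;
}
```
Fed only the .data.rel.ro and .data entries from the dump above it
would return 392; fed all five mappings it would also pick up the 8kB
of dirty text pages.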

For x86 a large contributor to the relocations comes from the insn-x86.c test:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/arch/x86/tests/insn-x86.c?h=perf-tools-next#n21
The test_data_32 and test_data_64 arrays are 75,024 bytes and 93,600
bytes respectively and are in .data.rel.ro; they account for nearly
40% of it.

In gdb at main entry:
```
(gdb) p test_data_32[0]
$1 = {data = "\017\061", '\000' <repeats 12 times>, expected_length = 2, expected_rel = 0,
 expected_op_str = 0x555555866adc "", expected_branch_str = 0x555555866adc "",
 asm_rep = 0x55555586fa2a "0f 31", ' ' <repeats 16 times>, "\trdtsc  "}
```
You can see that all the strings in test_data_32 have been relocated
(even though we haven't run any part of perf yet) and point to data in
.rodata. To avoid these relocations in the output of jevents.py
(pmu-events.c), all the strings are merged into one big string and the
offsets within that string are stored instead of pointers. No
relocations means everything goes in the nice non-dirty .rodata. As
the data in the insn-x86.c test is also generated, a similar trick
could be performed there. There is also the possibility of separating
all the perf builtins into libraries...
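
A minimal sketch of the offset trick, for illustration only. The names
big_str, struct entry_off, entries and entry_str are made up here and
are not the actual layout of pmu-events.c or insn-x86.c. The point is
that a table of integer offsets contains no addresses, so it needs no
load-time relocation, unlike a table of const char * fields.
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * With pointers, each field needs a relocation in a PIE:
 *   struct entry_ptr { const char *op_str; const char *asm_rep; };
 * With offsets, the table is plain integers and can be read-only.
 */

/* All strings merged into one NUL-separated blob (hypothetical data). */
static const char big_str[] =
	/* offset 0 */ "\0"
	/* offset 1 */ "rdtsc\0"
	/* offset 7 */ "jmp\0";

struct entry_off {
	unsigned int op_str;	/* offset into big_str */
	unsigned int asm_rep;	/* offset into big_str */
};

static const struct entry_off entries[] = {
	{ .op_str = 0, .asm_rep = 1 },	/* "" and "rdtsc" */
	{ .op_str = 7, .asm_rep = 1 },	/* "jmp" and "rdtsc" */
};

/* Turn a stored offset back into a C string at the point of use. */
static const char *entry_str(unsigned int off)
{
	assert(off < sizeof(big_str));
	return &big_str[off];
}
```
A caller would use entry_str(entries[i].asm_rep) wherever it previously
dereferenced the pointer field. Identical strings can share one offset,
as "rdtsc" does in both entries above.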

Thanks,
Ian

> >
> > For huge pages I thought it was correct that things are aligned by max
> > page size which I thought on x86-64 was 2MB, so I tried:
> > EXTRA_LDFLAGS="-z max-page-size=4096"
> > but it made no difference to anything, and with:
> > EXTRA_CFLAGS="-Wl,-z,max-page-size=4096"
> > EXTRA_CXXFLAGS="-Wl,-z,max-page-size=4096"
> > file size just got worse.
>
> The default alignment to 2MB was dropped in the GNU toolchain in 2018 or
> so.
>
> -Andi
