linux-kernel - Re: perf: Question about machine__create_extra_kernel


Message-ID: <CAOQCU67xtf4ndP2fo6fFxgsb7q_6uUooHQK4mb+Xi4fZR_ir0g@mail.gmail.com>
Date: Thu, 13 Feb 2025 19:17:00 +0100
From: Krzysztof Łopatowski <krzysztof.m.lopatowski@...il.com>
To: Ian Rogers <irogers@...gle.com>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>, Peter Zijlstra <peterz@...radead.org>, 
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: perf: Question about machine__create_extra_kernel_maps and
 trampoline symbols

Hi Ian,

> We do have a kallsyms parsing benchmark:

Yes, I've looked at `perf bench internals kallsyms-parse`. On my machine it
reports:
    Average kallsyms__parse took: 99,994 ms (+- 0,199 ms)
However, this benchmark only measures the raw parsing speed of the kallsyms
file, without any of the symbol processing that happens in real usage.

> I was curious to know if the regression is also visible there?

You can call it a regression if you mean from 2018 ;-)
I gave the measurements at the top to give a sense of scale and to show that
this is not an already-solved problem.

The core issue is that we call 'kallsyms__parse' multiple times when we
could likely consolidate those calls, since most of the overhead comes from
reading and parsing the file, not from processing the symbols.
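
To illustrate the consolidation idea, here is a minimal sketch of a single
pass over kallsyms-style input that feeds every parsed symbol to several
consumers, so the file is read and parsed only once. This is not perf's
actual code: `parse_once`, `struct sym_consumer`, `count_text` and
`match_name` are hypothetical names made up for this example, and the line
format ("<hex addr> <type> <name> [module]") is assumed.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: one consumer of parsed symbols. */
typedef int (*sym_cb_t)(void *arg, const char *name, char type, uint64_t addr);

/* Hypothetical pairing of a callback with its private state. */
struct sym_consumer {
	sym_cb_t cb;
	void *arg;
};

/*
 * Read kallsyms-style lines once and hand each symbol to every consumer.
 * Returns the number of symbols parsed, or -1 if a consumer returns non-zero.
 */
static long parse_once(FILE *fp, const struct sym_consumer *consumers, size_t n)
{
	char line[512], name[256], type;
	uint64_t addr;
	long count = 0;

	while (fgets(line, sizeof(line), fp)) {
		if (sscanf(line, "%" SCNx64 " %c %255s", &addr, &type, name) != 3)
			continue; /* skip lines that do not match the assumed format */
		for (size_t i = 0; i < n; i++)
			if (consumers[i].cb(consumers[i].arg, name, type, addr))
				return -1;
		count++;
	}
	return count;
}

/* Example consumer: count text symbols (type 't' or 'T'). */
static int count_text(void *arg, const char *name, char type, uint64_t addr)
{
	(void)name;
	(void)addr;
	if (type == 't' || type == 'T')
		++*(long *)arg;
	return 0;
}

/* State for the second example consumer. */
struct name_match {
	const char *name; /* symbol to look for */
	uint64_t addr;    /* filled in on the first match */
	int found;
};

/* Example consumer: remember the address of one named symbol. */
static int match_name(void *arg, const char *name, char type, uint64_t addr)
{
	struct name_match *m = arg;

	(void)type;
	if (!m->found && strcmp(name, m->name) == 0) {
		m->addr = addr;
		m->found = 1;
	}
	return 0;
}
```

A caller would open the file once, build an array of `struct sym_consumer`
entries (for example one that counts symbols and one that looks for a
specific name), and pass them all to a single `parse_once()` call instead of
parsing the file once per consumer.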

Notably, the third call I mentioned (in machine__create_extra_kernel_maps)
accounts for about half of the total kallsyms parsing time, yet appears to
have no effect on my test system. This is why I'm questioning whether we
need to keep this functionality.

Ultimately, I believe we should explore ways to avoid reading /proc/kallsyms
altogether, given how expensive this operation is.

Best regards,
Krzysztof