lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zqz_BwG1fcQaUsoY@pathway.suse.cz>
Date: Fri, 2 Aug 2024 17:45:11 +0200
From: Petr Mladek <pmladek@...e.com>
To: Song Liu <songliubraving@...a.com>
Cc: Masami Hiramatsu <mhiramat@...nel.org>, Song Liu <song@...nel.org>,
	"live-patching@...r.kernel.org" <live-patching@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	"linux-trace-kernel@...r.kernel.org" <linux-trace-kernel@...r.kernel.org>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Jiri Kosina <jikos@...nel.org>, Miroslav Benes <mbenes@...e.cz>,
	Joe Lawrence <joe.lawrence@...hat.com>,
	Nathan Chancellor <nathan@...nel.org>,
	"morbo@...gle.com" <morbo@...gle.com>,
	Justin Stitt <justinstitt@...gle.com>,
	Luis Chamberlain <mcgrof@...nel.org>,
	Leizhen <thunder.leizhen@...wei.com>,
	"kees@...nel.org" <kees@...nel.org>,
	Kernel Team <kernel-team@...a.com>,
	Matthew Maurer <mmaurer@...gle.com>,
	Sami Tolvanen <samitolvanen@...gle.com>,
	Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH 2/3] kallsyms: Add APIs to match symbol without
 .llmv.<hash> suffix.

On Wed 2024-07-31 01:00:34, Song Liu wrote:
> Hi Masami, 
> 
> > On Jul 30, 2024, at 6:03 AM, Masami Hiramatsu <mhiramat@...nel.org> wrote:
> > 
> > On Mon, 29 Jul 2024 17:54:32 -0700
> > Song Liu <song@...nel.org> wrote:
> > 
> >> With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names
> >> to avoid duplication. This causes confusion with users of kallsyms.
> >> On one hand, users like livepatch are required to match the symbols
> >> exactly. On the other hand, users like kprobe would like to match to
> >> original function names.
> >> 
> >> Solve this by splitting kallsyms APIs. Specifically, existing APIs now
> >> should match the symbols exactly. Add two APIs that matches the full
> >> symbol, or only the part without .llvm.suffix. Specifically, the following
> >> two APIs are added:
> >> 
> >> 1. kallsyms_lookup_name_or_prefix()
> >> 2. kallsyms_on_each_match_symbol_or_prefix()
> > 
> > Since this API only removes the suffix, "match prefix" is a bit confusing.
> > (this sounds like matching "foo" with "foo" and "foo_bar", but in reality,
> > it only matches "foo" and "foo.llvm.*")
> > What about the name below?
> > 
> > kallsyms_lookup_name_without_suffix()
> > kallsyms_on_each_match_symbol_without_suffix()

This looks like the best variant to me. A reasonable compromise.

> >> These APIs will be used by kprobe.
> > 
> > No other user need this?
> 
> AFACIT, kprobe is the only use case here. Sami, please correct 
> me if I missed any users. 
> 
> 
> More thoughts on this: 
> 
> I actually hope we don't need these two new APIs, as they are 
> confusing. Modern compilers can do many things to the code 
> (inlining, etc.). So when we are tracing a function, we are not 
> really tracing "function in the source code". Instead, we are 
> tracing "function in the binary". If a function is inlined, it 
> will not show up in the binary. If a function is _partially_ 
> inlined (inlined by some callers, but not by others), it will 
> show up in the binary, but we won't be tracing it as it appears
> in the source code. Therefore, tracing functions by their names 
> in the source code only works under certain assumptions. And 
> these assumptions may not hold with modern compilers. Ideally, 
> I think we cannot promise the user can use name "ping_table" to
> trace function "ping_table.llvm.15394922576589127018"
> 
> Does this make sense?

IMHO, it depends on the use case. Let's keep "ping_table/"
as an example. Why people would want to trace this function?
There might be various reasons, for example:

  1. ping_table.llvm.15394922576589127018 appeared in a backtrace

  2. ping_table.llvm.15394922576589127018 appeared in a histogram

  3. ping_table looks interesting when reading code sources

  4. ping_table need to be monitored on all systems because
     of security/performance.

The full name "ping_table.llvm.15394922576589127018" is perfectly
fine in the 1st and 2nd scenario. People knew this name already
before they start thinking about tracing.

The short name is more practical in 3rd and 4th scenario. Especially,
when there is only one static symbol with this short name. Otherwise,
the user would need an extra step to find the full name.

The full name is even more problematic for system monitors. These
applications might need to probe particular symbols. They might
have hard times when the symbol is:

    <symbol_name_from_sources>.<random_suffix_generated_by_compiler>

They will have to deal with it. But it means that every such tool
would need an extra (non-trivial) code for this. Every tool would
try its own approach => a lot of problems.

IMHO, the two APIs could make the life easier.

Well, even kprobe might need two APIs to allow probing by
full name or without the suffix.

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ