lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230823093614.fd704a98387d0ba1d23a29fb@kernel.org>
Date:   Wed, 23 Aug 2023 09:36:14 +0900
From:   Masami Hiramatsu (Google) <mhiramat@...nel.org>
To:     Francis Laniel <flaniel@...ux.microsoft.com>
Cc:     linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>,
        linux-trace-kernel@...r.kernel.org,
        Song Liu <songliubraving@...com>, bpf@...r.kernel.org
Subject: Re: [RFC PATCH v1 1/1] tracing/kprobe: Add multi-probe support for
 'perf_kprobe' PMU

Hi,

On Mon, 21 Aug 2023 14:55:14 +0200
Francis Laniel <flaniel@...ux.microsoft.com> wrote:

> > Could you tell me how do you use this feature, for what perpose?
> 
> Sure (I think I detailed this in the cover letter but I only sent it to the 
> "main" mailing list and not the tracing one, sorry for this inconvenience)!
> 
> Basically, I was adding NTFS tracing to an existing tool which monitors slow 
> I/Os using BPF [1].
> To test the tool, I compiled a kernel with both NTFS module built-in and 
> figured out the write operations when done on ntfs3 were missing from the 
> output of the tool.
> The problem comes from the library I use in the tool which does not handle 
> well when it exists different symbols with the same name.
> Contrary to perf, which only handles kprobes through sysfs, the library 
> handles it in both way (sysfs and PMU) with a preference for PMU when 
> available [2].
> 
> After some discussion, I thought there could be a way to handle this 
> automatically in the kernel when using PMU kprobes, hence this patch.
> I totally understand the case I described above is really a corner one, but I 
> thought this feature could be useful for other people.
> In the case of the library itself, we could indeed find the address in /proc/
> kallsyms but it would mean having CAP_SYS_ADMIN which is not forcefully 
> something we want to enforce.
> Also, if we need to read /boot/vmlinuz or /boot/System.map we also need to be 
> root as these files often belong to root and cannot be read by others.
> So, this patch solves the above problem while not needing specific capabilities 
> as the kernel will solve it for us.

Thanks for the explanation. I got the background, and still have some questions.

- Is the analysis tool really necessary to be used by users other than
  CAP_SYS_ADMIN? Even if it is useful, I still doubt CAP_PERFMON is safer
  than CAP_SYS_ADMIN, because BPF program can access any kernel register.

- My concern about this solution (enabling kprobe PMU on all symbols which
  have the same name) makes it hard to run the same BPF program on it.
  This is because symbols with the same name do not necessarily have the
  same arguments (theoretically). Also, the BPF will trace unwanted symbols
  at unwanted timing.

- Can you expand that library to handle the same name symbols differently?
  I think this should be done in the user space, or in the kallsyms like
  storing symbols with source line information.

I understand this demand, but solving that with probing *all* symbols seems
like a brute force solution and may cause another problem later.

But this is a good discussion item. Last month Alessandro sent a script which
makes such symbols unique. Current problem is that the numbering is not enough
to identify which one is from which source code.

https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@gmail.com/ 

> 
> > If you just need to trace/profile a specific function which has the same
> > name symbols, you might be better to use `perf probe` +
> > `/sys/kernel/tracing` or `perf record -e EVENT`.
> > 
> > Or if you need to run it with CAP_PERFMON, without CAP_SYS_ADMIN,
> > we need to change a userspace tool to find the correct address and
> > pass it to the perf_event_open().
> > 
> > > > > Added new events:
> > > > >   probe:ntfs_file_write_iter (on ntfs_file_write_iter)
> > > > >   probe:ntfs_file_write_iter (on ntfs_file_write_iter)
> > > > > 
> > > > > You can now use it in all perf tools, such as:
> > > > >         perf record -e probe:ntfs_file_write_iter -aR sleep 1
> > > > > 
> > > > > root@...amd64:~# cat /sys/kernel/tracing/kprobe_events
> > > > > p:probe/ntfs_file_write_iter _text+5088544
> > > > > p:probe/ntfs_file_write_iter _text+5278560
> > > > > 
> > > > > > Thought?
> > > > > 
> > > > > This contribution is basically here to sort of mimic what perf does
> > > > > but
> > > > > with PMU kprobes, as this is not possible to write in a sysfs file
> > > > > with
> > > > > this type of probe.
> > > > 
> > > > OK, I see it is for BPF only. Maybe BPF program can filter correct one
> > > > to access the argument etc.
> > > 
> > > I am not sure I understand, can you please precise?
> > > The eBPF program will be run when the kprobe will be triggered, so if the
> > > kprobe is armed for the function (e.g. old ntfs_file_write_iter()), the
> > > eBPF program will never be called.
> > 
> > As I said above, it is userspace BPF loader issue, because it has to specify
> > the correct address via unique symbol + offset, instead of attaching all of
> > them. I think that will be more side-effects.
> > 
> > But anyway, thanks for pointing this issue. I should fix kprobe event to
> > reject the symbols which is not unique. That should be pointed by other
> > unique symbols.
> 
> You are welcome and I thank you for the discussion.
> Can you please precise more what you think about "reject the symbols which is 
> not unique"?
> Basically something like this:

Yes, that's what I said.

> struct trace_event_call *
> create_local_trace_kprobe(char *func, void *addr, unsigned long offs,
> 			  bool is_return)
> {
> 	...
> 	if (!addr && func) {

if (func) {  /* because anyway if user specify "func" we have to solve
 the symbol address */

> 		array.addrs = NULL;
> 		array.size = 0;
> 		ret = kallsyms_on_each_match_symbol(add_addr, func, &array);
> 		if (ret)
> 			goto error_free;
> 
> 		if (array.size != 1) {
> 			/* 
> 			 * Function name corresponding to several symbols must
> 			 * be passed by address only.
> 			 */
> 			return -EINVAL;

This case may return a unique error code so that the caller can notice
the problem.

Thank you,

> 		}
> 	}



> 	...
> }
> ?
> If my understanding is correct, I think I can write a patch to achieve this.
> 

> 
> Best regards.
> ---
> [1]: https://github.com/inspektor-gadget/inspektor-gadget/pull/1879
> [2]: https://github.com/cilium/ebpf/blob/
> 270c859894bd38cdd0c7783317b16343409e4df8/link/kprobe.go#L165-L191
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@...nel.org>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ