Message-ID: <20250619045659.390cc014@batman.local.home>
Date: Thu, 19 Jun 2025 04:56:59 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
 bpf@...r.kernel.org, x86@...nel.org, Masami Hiramatsu
 <mhiramat@...nel.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
 Josh Poimboeuf <jpoimboe@...nel.org>, Ingo Molnar <mingo@...nel.org>, Jiri
 Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>, Thomas
 Gleixner <tglx@...utronix.de>, Andrii Nakryiko <andrii@...nel.org>, Indu
 Bhagat <indu.bhagat@...cle.com>, "Jose E. Marchesi" <jemarch@....org>, Beau
 Belgrave <beaub@...ux.microsoft.com>, Jens Remus <jremus@...ux.ibm.com>,
 Linus Torvalds <torvalds@...ux-foundation.org>, Andrew Morton
 <akpm@...ux-foundation.org>
Subject: Re: [PATCH v10 06/14] unwind_user/deferred: Add deferred unwinding
 interface

On Thu, 19 Jun 2025 09:50:08 +0200
Peter Zijlstra <peterz@...radead.org> wrote:

> On Wed, Jun 18, 2025 at 03:09:15PM -0400, Steven Rostedt wrote:
> > On Wed, 18 Jun 2025 20:46:20 +0200
> > Peter Zijlstra <peterz@...radead.org> wrote:
> >   
> > > > +struct unwind_work;
> > > > +
> > > > +typedef void (*unwind_callback_t)(struct unwind_work *work, struct unwind_stacktrace *trace, u64 timestamp);
> > > > +
> > > > +struct unwind_work {
> > > > +	struct list_head		list;    
> > > 
> > > Does this really need to be a list? Single linked list like
> > > callback_head not good enough?  
> > 
> > Doesn't a list head make it easier to remove without having to iterate the
> > list?  
> 
> Yeah, but why would you ever want to remove it? You asked for an unwind,
> you get an unwind, no?

No, the registration isn't unique per tracing infrastructure, but per
tracing instance. That is, there's one per perf program or per tracing
instance, and when that instance goes away its callback needs to be
removed.
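
To illustrate what the list_head buys (just a sketch; the function name
and the exact locking are my assumptions, not the patch):

/*
 * Sketch: with a doubly linked list_head, the instance that goes away
 * can unlink its own unwind_work directly.  A singly linked
 * callback_head style chain would have to walk from the head to find
 * the predecessor before it could unlink.
 */
static void unwind_deferred_cancel(struct unwind_work *work)
{
	guard(mutex)(&callback_mutex);
	list_del(&work->list);
}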

> 
> > > >  static __always_inline void unwind_exit_to_user_mode(void)
> > > >  {
> > > >  	if (unlikely(current->unwind_info.cache))
> > > >  		current->unwind_info.cache->nr_entries = 0;
> > > > +	current->unwind_info.timestamp = 0;    
> > > 
> > > Surely clearing that timestamp is only relevant when there is a cache
> > > around? Better to not add this unconditional write to the exit path.  
> > 
> > That's actually not quite true. If the allocation fails, we still want to
> > clear the timestamp. But later patches add more data to check and it does
> > exit out if there's been no requests:  
> 
> Well, you could put in an error value on alloc fail or somesuch. Then
> its non-zero.

OK.
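
Something like this is how I read that suggestion (sketch only; the
sentinel name is made up):

/* Hypothetical error sentinel stored in the timestamp on alloc failure */
#define UNWIND_TIMESTAMP_ERR	((u64)-1)

static __always_inline void unwind_exit_to_user_mode(void)
{
	/* Nothing requested (and no alloc failure recorded): do nothing */
	if (likely(!current->unwind_info.timestamp))
		return;

	if (current->unwind_info.cache)
		current->unwind_info.cache->nr_entries = 0;
	current->unwind_info.timestamp = 0;
}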

> 
> > But for better reviewing, I could add a comment in this patch that states
> > that this will eventually exit out early when it does more work.  
> 
> You're making this really hard to review, why not do it right from the
> get-go?

Because the value that is to be checked isn't here yet.

> 
> > > > +/* Guards adding to and reading the list of callbacks */
> > > > +static DEFINE_MUTEX(callback_mutex);
> > > > +static LIST_HEAD(callbacks);    
> > > 
> > > Global state.. smells like failure.  
> > 
> > Yes, the unwind infrastructure is global, as it is the way tasks know what
> > tracer's callbacks to call.  
> 
> Well, that's apparently how you've set it up. I don't immediately see
> this has to be like this.
> 
> And there's no comments no nothing.
> 
> I don't see why you can't have something like:
> 
> struct unwind_work {
> 	struct callback_head task_work;
> 	void *data;
> 	void (*func)(struct unwind_work *work, void *data);
> };
> 
> void unwind_task_work_func(struct callback_head *task_work)
> {
> 	struct unwind_work *uw = container_of(task_work, struct unwind_work, task_work);
> 
> 	// do actual unwind
> 
> 	uw->func(uw, uw->data);
> }
> 
> or something along those lines. No global state involved.

We have a many-to-many relationship here, where a single task_work
doesn't work.

That is, a tracer can expect callbacks from several tasks at the same
time, while some of those tasks need to send callbacks to several
different tracers.

Later patches add a bitmask to every task that gets set to record which
tracers to call back.

Since the number of tracers that can be called back is limited to the
number of bits in a long (for the bitmask), I can get rid of the linked
list and use an array instead. That would make this easier.
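
Something along these lines (a sketch; the unwind_mask field and the
array name are placeholders):

#define UNWIND_MAX_CALLBACKS	BITS_PER_LONG

/* One slot per registered tracer, claimed bit by bit */
static struct unwind_work *callbacks[UNWIND_MAX_CALLBACKS];

static void unwind_deferred_task_work(struct callback_head *head)
{
	struct unwind_stacktrace trace;
	u64 timestamp = current->unwind_info.timestamp;
	unsigned long bits = current->unwind_info.unwind_mask;	/* hypothetical */
	int bit;

	/* ... fill 'trace' by unwinding the user stack ... */

	/* Only call back the tracers that requested an unwind for this task */
	for_each_set_bit(bit, &bits, UNWIND_MAX_CALLBACKS) {
		struct unwind_work *work = READ_ONCE(callbacks[bit]);

		if (work)
			work->func(work, &trace, timestamp);
	}
}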


> 
> 
> > > > +	guard(mutex)(&callback_mutex);
> > > > +	list_for_each_entry(work, &callbacks, list) {
> > > > +		work->func(work, &trace, timestamp);
> > > > +	}    
> > > 
> > > So now you're globally serializing all return-to-user instances. How is
> > > that not a problem?  
> > 
> > It was the original way we did things. The next patch changes this to SRCU.
> > But it requires a bit more care. For breaking up the series, I preferred
> > not to add that logic and make it a separate patch.
> > 
> > For better reviewing, I'll add a comment here that says:
> > 
> > 	/* TODO switch this global lock to SRCU */  
> 
> Oh ffs :-(
> 
> So splitting up patches is for ease of review, but now you're making
> splits that make review harder, how does that make sense?

Actually, a comment isn't the right place; I should have mentioned this
in the change log.
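
For reference, the rough shape of what the next patch does, keeping the
list but iterating it under SRCU instead of the global mutex (sketch,
not the patch itself):

DEFINE_STATIC_SRCU(unwind_srcu);

static void unwind_deferred_task_work(struct callback_head *head)
{
	struct unwind_stacktrace trace;
	u64 timestamp = current->unwind_info.timestamp;
	struct unwind_work *work;
	int idx;

	/* ... unwind the user stack into 'trace' ... */

	/* Readers only take the SRCU read lock; add/remove still holds the mutex */
	idx = srcu_read_lock(&unwind_srcu);
	list_for_each_entry_srcu(work, &callbacks, list,
				 srcu_read_lock_held(&unwind_srcu)) {
		work->func(work, &trace, timestamp);
	}
	srcu_read_unlock(&unwind_srcu, idx);
}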

-- Steve
