Message-ID: <20240130224224.GA435-beaub@linux.microsoft.com>
Date: Tue, 30 Jan 2024 14:42:24 -0800
From: Beau Belgrave <beaub@...ux.microsoft.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Masami Hiramatsu <mhiramat@...nel.org>, linux-kernel@...r.kernel.org,
	linux-trace-kernel@...r.kernel.org, mathieu.desnoyers@...icios.com
Subject: Re: [PATCH 2/4] tracing/user_events: Introduce multi-format events

On Tue, Jan 30, 2024 at 01:52:30PM -0500, Steven Rostedt wrote:
> On Tue, 30 Jan 2024 10:05:15 -0800
> Beau Belgrave <beaub@...ux.microsoft.com> wrote:
> 
> > On Mon, Jan 29, 2024 at 09:24:07PM -0500, Steven Rostedt wrote:
> > > On Mon, 29 Jan 2024 09:29:07 -0800
> > > Beau Belgrave <beaub@...ux.microsoft.com> wrote:
> > >   
> > > > Thanks, yeah ideally we wouldn't use special characters.
> > > > 
> > > > I'm not picky about this. However, I did want something that clearly
> > > > allowed a glob pattern to find all versions of a given register name of
> > > > user_events by user programs that record. The dot notation will pull in
> > > > more than expected if dotted namespace style names are used.
> > > > 
> > > > An example is "Asserts" and "Asserts.Verbose" from different programs.
> > > > If we tried to find all versions of "Asserts" via glob of "Asserts.*" it
> > > > will pull in "Asserts.Verbose.1" in addition to "Asserts.0".  
> > > 
> > > Do you prevent brackets in names?
> > >   
> > 
> > No. However, since brackets have distinct start and end tokens,
> > finding all versions of your event is trivial compared to a single dot.
> > 
> > Imagine two events:
> > Asserts
> > Asserts[MyCoolIndex]
> > 
> > Resolves to tracepoints of:
> > Asserts:[0]
> > Asserts[MyCoolIndex]:[1]
> > 
> > Regardless of brackets in the names, a simple glob of Asserts:\[*\] only
> > finds Asserts:[0]. This is because we have that end bracket in the glob
> > and the full event name including the start bracket.
> > 
> > If I register another "version" of Asserts, then I'll have:
> > Asserts:[0]
> > Asserts[MyCoolIndex]:[1]
> > Asserts:[2]
> > 
> > The glob of Asserts:\[*\] will return both:
> > Asserts:[0]
> > Asserts:[2]
> 
> But what if you had registered "Asserts:[MyCoolIndex]:[1]"
> 

Good point. That name would still be pulled in by the glob, so excluding
it would require a regex-style pattern.
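
To make the problem concrete, here is a small Python sketch (not part of
the series) that runs the glob from above against the tracepoint names in
this thread. The helper name glob_matches is hypothetical, and Python's
fnmatch spells a literal "[" as "[[]" rather than "\[".

```python
from fnmatch import fnmatchcase

# Tracepoint names taken from the examples in this thread.
names = [
    "Asserts:[0]",
    "Asserts:[2]",
    "Asserts:[MyCoolIndex]:[1]",
]

# Same intent as the shell glob Asserts:\[*\]; fnmatch has no backslash
# escape, so a literal "[" is written as the character class "[[]".
GLOB = "Asserts:[[]*]"


def glob_matches(candidates, pattern=GLOB):
    """Return the candidates that the glob pattern matches (case-sensitive)."""
    return [name for name in candidates if fnmatchcase(name, pattern)]


print(glob_matches(names))
```

All three names match, because "*" is free to swallow "MyCoolIndex]:[1".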

> Do you prevent colons?
> 

No, nothing is prevented at this point.

It seems we could either prevent certain characters to make it easier or
define a good regex that we should document.

I'm leaning toward just doing a simple suffix and documenting the regex
well.
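
As a rough idea of what such a documented regex could look like, here is a
Python sketch. It assumes the bracket suffix discussed above (":[" plus a
hex ID plus "]"), which is not settled, and the helper names version_regex
and find_versions are made up for illustration.

```python
import re


def version_regex(event_name):
    """Build a regex for every version of one registered event name.

    Assumption: the tracepoint name is the registered name followed by
    ":[<hex id>]". The name is escaped so characters such as "." or "["
    in it are matched literally.
    """
    return re.compile(re.escape(event_name) + r":\[[0-9a-fA-F]+\]")


def find_versions(event_name, tracepoints):
    """Return the tracepoints that are versions of event_name."""
    rx = version_regex(event_name)
    # fullmatch anchors both ends, so the suffix must be the final one.
    return [tp for tp in tracepoints if rx.fullmatch(tp)]


tracepoints = [
    "Asserts:[0]",
    "Asserts[MyCoolIndex]:[1]",
    "Asserts:[2]",
    "Asserts:[MyCoolIndex]:[1]",
    "Asserts.Verbose:[1]",
]
print(find_versions("Asserts", tracepoints))
```

Only Asserts:[0] and Asserts:[2] come back. Since the suffix is always
appended last and the match is anchored at both ends, this holds even when
no characters are prevented in the registered name.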

> > 
> > At this point the program can either record all versions or scan further
> > to find which version of Asserts is wanted.
> > 
> > > > 
> > > > While a glob of "Asserts.[0-9]" works when the unique ID is 0-9, it
> > > > doesn't work if the number is higher, like 128. If we ever decide to
> > > > change the ID from an integer to say hex to save space, these globs
> > > > would break.
> > > > 
> > > > Is there some scheme that fits the C-variable name that addresses the
> > > > above scenarios? Brackets gave me a simple glob that seemed to prevent a
> > > > lot of this ("Asserts.\[*\]" in this case).  
> > > 
> > > Prevent a lot of what? I'm not sure what your example here is.
> > >   
> > 
> > I'll try again :)
> > 
> > We have 2 events registered via user_events:
> > Asserts
> > Asserts.Verbose
> > 
> > Using dot notation these would result in tracepoints of:
> > user_events_multi/Asserts.0
> > user_events_multi/Asserts.Verbose.1
> > 
> > Using bracket notation these would result in tracepoints of:
> > user_events_multi/Asserts:[0]
> > user_events_multi/Asserts.Verbose:[1]
> > 
> > A recording program only wants to enable the Asserts tracepoint. It does
> > not want to record the Asserts.Verbose tracepoint.
> > 
> > The program must find the right tracepoint by scanning tracefs under the
> > user_events_multi system.
> > 
> > A single dot suffix does not allow a simple glob to be used. The glob
> > Asserts.* will return both Asserts.0 and Asserts.Verbose.1.
> > 
> > A simple glob of Asserts:\[*\] will only find Asserts:[0], it will not
> > find Asserts.Verbose:[1].
> > 
> > We could just use brackets and not have the colon (Asserts[0] in this
> > case). But brackets are still special for bash.
> 
> Are these shell scripts or programs? I use regex in programs all the time.
> And if you have shell scripts, use awk or something.
> 

They could be both. In our case, it is a program.
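
For the program case, the scan could look roughly like the Python sketch
below. It assumes tracefs is mounted at /sys/kernel/tracing, that each
tracepoint shows up as a directory entry under the user_events_multi
system, and that the suffix is ":[<hex id>]" as in this series. The
function name scan_versions is hypothetical.

```python
import os
import re

# Assumption: usual tracefs mount point plus the user_events_multi system.
DEFAULT_EVENTS_DIR = "/sys/kernel/tracing/events/user_events_multi"


def scan_versions(event_name, events_dir=DEFAULT_EVENTS_DIR):
    """Map each version ID of event_name to its tracepoint name.

    The ID is parsed as base-16, matching the assumed suffix format.
    """
    rx = re.compile(re.escape(event_name) + r":\[([0-9a-fA-F]+)\]")
    versions = {}
    for entry in os.listdir(events_dir):
        match = rx.fullmatch(entry)
        if match:
            versions[int(match.group(1), 16)] = entry
    return versions
```

The caller can then either enable every returned tracepoint or look at
each one's format to pick the version it wants.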

> Unless you prevent something from being added, I don't see the protection.
> 

Yeah, it just makes it way less likely. Given that, I'm starting to lean
toward just documenting the regex well and not trying to get fancy.

> > 
> > > > 
> > > > Are we confident that we always want to represent the ID as a base-10
> > > > integer vs a base-16 integer? The suffix will be ABI to ensure recording
> > > > programs can find their events easily.  
> > > 
> > > Is there a difference to what we choose?
> > >   
> > 
> > If a simple glob of event_name:\[*\] cannot be used, then we must document
> > what the suffix format is, so an appropriate regex can be created. If we
> > start with base-10 then later move to base-16 we will break existing regex
> > patterns on the recording side.
> > 
> > I prefer, and have in this series, a base-16 output since it saves on
> > the tracepoint name size.
> 
> I honestly don't care which base you use. So if you want to use base 16,
> I'm fine with that.
> 
> > 
> > Either way we go, we need to define how recording programs should find
> > the events they care about. So we must be very clear, IMHO, about the
> > format of the tracepoint names in our documentation.
> > 
> > I personally think recording programs are likely to get this wrong
> > without proper guidance.
> > 
> 
> Agreed.
> 
> -- Steve

Thanks,
-Beau
