linux-kernel - Re: [GIT PULL v2] bkl tracepoints + filter regex support

Date:	Sat, 26 Sep 2009 06:44:21 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>,
	Tom Zanussi <tzanussi@...il.com>,
	Li Zefan <lizf@...fujitsu.com>
Subject: Re: [GIT PULL v2] bkl tracepoints + filter regex support

On Fri, 2009-09-25 at 12:38 +0200, Frederic Weisbecker wrote:

>  
> > Using globs in string matches most certainly is useful, no question
> > about that.
> > 
> > But I had understood from previous communications we were going to have
> > a C syntax, and there == is a straight comparison.
> > 
> > If however people have changed their minds (fine with me) and we're now
> > going to script like things..
> 
> 
> 
> Well, indeed we talked about C syntax, but I didn't think the idea
> was set in stone, which is why I was surprised.

Once we add globs, we just blew away C syntax.

> 
> 
> > Anyway, a glob in == just means we have to use another operator if we
> > ever want to support actual regexes, ~ would then be recommended I think,
> > since that's what awk and I think perl do.

Perhaps when we put full perl regex into the kernel (my goal ;-) then we
should look to keep different kinds of equals.

  ==  - is a direct match. Only strcmp is needed.

  ~ - is globbing. We can add a '*' which means match anything.

and if we do add true regex...

  =~ could be that.  field =~ '^spin.*(lock|unlock)$'
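
A rough userspace illustration of the three levels above, assuming POSIX
strcmp(), fnmatch() and regcomp(). This is not kernel code (the kernel
provides neither fnmatch() nor regcomp()), and the match_*() names are made
up for the example. Alternation is grouped with parentheses in POSIX extended
syntax, so the example pattern is spelled '^spin.*(lock|unlock)$' in this
sketch. Note also that fnmatch() understands '?' and '[...]' in addition to
'*', which is more than the '~' proposal asks for.

```c
#include <assert.h>
#include <fnmatch.h>
#include <regex.h>
#include <stdbool.h>
#include <string.h>

/* '==' : direct match, nothing but strcmp(). */
bool match_exact(const char *field, const char *pattern)
{
	return strcmp(field, pattern) == 0;
}

/* '~' : glob match, where '*' matches anything. */
bool match_glob(const char *field, const char *pattern)
{
	return fnmatch(pattern, field, 0) == 0;
}

/* '=~' : true regex, here POSIX extended syntax. */
bool match_regex(const char *field, const char *pattern)
{
	regex_t re;
	bool hit;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;	/* treat a bad pattern as "no match" */
	hit = regexec(&re, field, 0, NULL, 0) == 0;
	regfree(&re);
	return hit;
}

/* Usage: the same kind of field tested with each operator. */
void match_demo(void)
{
	assert(!match_exact("spin_lock", "spin*"));	/* '*' is literal here */
	assert(match_glob("spin_lock", "spin*"));
	assert(match_regex("spin_unlock", "^spin.*(lock|unlock)$"));
}
```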

> 
> 
> Yeah. For example one may know python but not perl or awk,
> other people may be in the opposite situation. But most
> developers know the C (at least its basic syntax).

awk is much better known than either python or perl. It is expected that
any unix person has a basic idea of sed and awk. If not, a simple search
on the internet can help them.

It takes 5 minutes to figure out how to do something with awk, whereas
we all know it takes a much longer time to figure out python or perl.

> 
> So I'm not sure using such a ~ operator is a good idea. I think you're
> right that we should stay close to the C syntax.

I disagree.

> 
> 
> > Personally I wouldn't mind things like:
> > 
> >  glob_match(string, pattern)
> >  regex_match(string, pattern)

In a filter string. Yuck!

Note that I don't like python, which is probably why I don't like the above.

> 
> 
> 
> Yeah, actually that sounds more flexible and more like something people
> are familiar with, once we consider future evolutions.

Please no! I hate that syntax. Again, this is probably one of the major
reasons I avoid python. (That, and column forcing.)


> 
> 
> 
> > But everybody involved in this filter stuff needs to agree what
> > direction you want to take the language in.
> 
> 
> 
> Right!

Yes, and I agree that == should not mean globbing. We should have another
syntax, but I really don't want "functions" for matching.
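
To make the point concrete without any library help: a minimal sketch of
keeping '==' literal and giving '*' a meaning only under a separate operator.
The enum, the function names, and the hand-rolled matcher are all
hypothetical and exist only for this example. They are not the filter code
from this pull request.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum filter_op { OP_EQ, OP_GLOB };	/* hypothetical: '==' and '~' */

/* Match str against pat, where '*' in pat matches any run of characters. */
bool star_match(const char *str, const char *pat)
{
	const char *star = NULL;	/* pattern position just after the last '*' */
	const char *retry = NULL;	/* where in str that '*' last resumed */

	while (*str) {
		if (*pat == '*') {
			star = ++pat;
			retry = str;
		} else if (*pat == *str) {
			pat++;
			str++;
		} else if (star) {
			/* Mismatch: let the last '*' swallow one more character. */
			pat = star;
			str = ++retry;
		} else {
			return false;
		}
	}
	while (*pat == '*')
		pat++;
	return *pat == '\0';
}

/* Dispatch on the operator: only OP_GLOB treats '*' as special. */
bool filter_match(enum filter_op op, const char *field, const char *pat)
{
	switch (op) {
	case OP_EQ:
		return strcmp(field, pat) == 0;
	case OP_GLOB:
		return star_match(field, pat);
	}
	return false;
}

/* Usage: the same pattern behaves differently under each operator. */
void filter_demo(void)
{
	assert(!filter_match(OP_EQ, "spin_lock", "spin*"));
	assert(filter_match(OP_GLOB, "spin_lock", "spin*"));
}
```

With this split, a filter written with == keeps its C meaning, and anyone who
wants a wildcard has to ask for it explicitly.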

> 
>  
> 
> > > I just don't want this bridge to turn every use of ftrace through debugfs
> > > into overkill.
> > > Instead I'd prefer to satisfy both, hence the above proposition.
> > 
> > So you're proposing to split the filter language? I'm sure that's going
> > to confuse a few people ;-)
> 
> 
> 
> Hmm, just at this level. That could even be a trace option.
> Anyway, it would be nice to have other tracing developers'
> opinions.

Finally getting around to it ;-)

> 
> 
>  
> > Thing is, if you (or others) have a need to experiment with the
> > language, then I'm not sure it's the right moment to freeze bits into an
> > ABI.

Correct, and this is why I propose a "tracefs" that can become the place
that we add a stable API, and let debugfs be our playground.

> > 
> > I'm really fine with thing, as long as everybody on the filter side
> > knows experimenting isn't really an option and agrees on the direction
> > they want to take the language.
> 
> 
> Well, I talked about experimenting with the language before pushing it as
> an ABI because I was afraid we were going too fast.
> 
> But I guess the ABI is a requirement to use it through perf ioctl,
> and delaying that would hold it hostage, maybe even slow its
> development.
> 
> 
> > Is there no existing language with a proper license and clean code-base
> > we can 'borrow'? That would avoid creating yet another funny language,
> > and learning how to implement things all over again.
> > 
> > Personally I don't think the kernel is the place to experiment in script
> > language design, but that's me ;-)
> 
> 
> Python? :-)

Perl is considered a much better language for regex. It has one of the
most (if not the most) powerful regex engines. I'm sure recordmcount.pl
would be much larger if I chose to do it in python. Same goes with
streamline_config.pl.  They both have strong needs for complex regex.

> 
> More seriously, as I said above, I think most developers are familiar with C
> syntax, so IMHO this is one of our best options.
> 

To avoid the Python vs Perl debate, I say we stick with sed/awk. Knowing
those is already expected of most unix developers.

-- Steve