Date:	Thu, 19 Nov 2009 16:55:58 -0500
From:	Jason Baron <jbaron@...hat.com>
To:	Roland McGrath <roland@...hat.com>
Cc:	linux-kernel@...r.kernel.org, mingo@...e.hu,
	mathieu.desnoyers@...ymtl.ca, hpa@...or.com, tglx@...utronix.de,
	rostedt@...dmis.org, andi@...stfloor.org, rth@...hat.com,
	mhiramat@...hat.com
Subject: Re: [RFC PATCH 0/6] jump label v3

On Wed, Nov 18, 2009 at 07:54:24PM -0800, Roland McGrath wrote:
> 2. optimal compiled hot path code
> 
>    You and Richard have been working on this in gcc and we know the state
>    of it now.  When we get the cold labels feature done, it will be ideal
>    for -O(2?).  But people mostly use -Os and there no block reordering
>    gets done now (I think perhaps this even means likely/unlikely don't
>    really change which path is the straight line, just the source order
>    of the blocks still determines it).  So we hope for more incremental
>    improvements here, and maybe even really optimal code for -O2 soon.
>    But at least for -Os it may not be better than "unconditional jump
>    around" as the "straight line" path in the foreseeable future.  As
>    noted, that alone is still a nice savings over the status quo for the
>    disabled case.  (You gave an "average cycles saved" for this vs a load
>    and test, but do you have any comparisons of how those two compare to
>    no tracepoint at all?)
> 

I've run that in the past, and for the nop + jump sequence it's between
2 and 4 cycles on average vs. no tracepoint.


> 3. bookkeeping magic to find all the jumps to enable for a given tracepoint
> 
>    Here you have a working first draft, but it looks pretty clunky.
>    That strcmp just makes me gag.  For a first version that's still
>    pretty simple, I think it should be trivial to use a pointer
>    comparison there.  For tracepoints, it can be the address of the
>    struct tracepoint.  For the general case, it can be the address of
>    the global that would be flag variable in case of no asm goto support.
> 
>    For more incremental improvements, we could cut down on running
>    through the entire table for every switch.  If there are many
>    different switches (as there are already for many different
>    tracepoints), then you really just want to run through the
>    insn-patch list for the particular switch when you toggle it.  
> 
>    It's possible to group this all statically at link time, but all
>    the linker magic hacking required to get that to go is probably
>    more trouble than it's worth.  
> 
>    A simple hack is to run through the big unsorted table at boot time
>    and turn it into a contiguous table for each switch.  Then
>    e.g. hang each table off the per-switch global variable by the same
>    name that in a no-asm-goto build would be the simple global flag.
> 

That probably makes the most sense: sort the jump table and then store an
(offset, length) pair with each switch. I was thinking of this as a follow-on
optimization (the tracepoint code is already O(N) per switch toggle, where
N = total number of tracepoint site locations, and not O(n), where
n = number of sites per tracepoint). Certainly, if this is a gating issue for
this patchset, I can fix it now.
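
A minimal sketch of the sort-then-index idea, written as plain user-space C
rather than kernel code. The struct layouts and the names jump_entry,
jump_key, jump_table_index and jump_key_sites are assumptions for
illustration only, not what the patchset uses. It also assumes every key
starts out zeroed, so a switch with no sites keeps length 0.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table entry: one per patched site. */
struct jump_entry {
	uintptr_t code;   /* address of the nop/jmp to patch */
	uintptr_t target; /* label to jump to when the switch is on */
	uintptr_t key;    /* address of the per-switch global */
};

/* Hypothetical per-switch global: its contiguous run in the sorted table. */
struct jump_key {
	size_t offset;
	size_t length;
};

/* Order entries by key address so each switch's sites become adjacent. */
static int cmp_by_key(const void *a, const void *b)
{
	const struct jump_entry *x = a, *y = b;

	return (x->key > y->key) - (x->key < y->key);
}

/* Sort once (e.g. at boot), then record each key's (offset, length). */
static void jump_table_index(struct jump_entry *tab, size_t n)
{
	size_t i = 0;

	qsort(tab, n, sizeof(*tab), cmp_by_key);
	while (i < n) {
		struct jump_key *k = (struct jump_key *)tab[i].key;
		size_t start = i;

		while (i < n && tab[i].key == tab[start].key)
			i++;
		k->offset = start;
		k->length = i - start;
	}
}

/* A toggle then walks only this switch's sites: O(n) instead of O(N). */
static const struct jump_entry *jump_key_sites(const struct jump_entry *tab,
					       const struct jump_key *k,
					       size_t *len)
{
	*len = k->length;
	return tab + k->offset;
}
```

The sort is a one-time O(N log N) cost; after that each enable/disable only
touches the sites that belong to the switch being toggled.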

> 
> Finally, for using this for general purposes unrelated to tracepoints,
> I envision something like:
> 
> 	DECLARE_MOSTLY_NOT(foobar);
> 
> 	foo(int x, int y)
> 	{
> 		if (x > y && mostly_not(foobar))
> 			do_foobar(x - y);
> 	}
> 
> 	... set_mostly_not(foobar, onoff);
> 
> where it's:
> 
> #define DECLARE_MOSTLY_NOT(name) ... __something_##name
> #define mostly_not(name) ({ int _doit = 0; __label__ _yes; \
> 			    JUMP_LABEL(name, _yes, __something_##name); \
> 			    if (0) _yes: __cold _doit = 1; \
> 			    unlikely (_doit); })
> 
> I don't think we've tried to figure out how well this compiles yet.
> But it shows the sort of thing that we can do to expose this feature
> in a way that's simple and unrestrictive for kernel code to use casually.
> 
> 

Cool. The assembly output would be interesting here...
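
For comparison, here is a rough sketch of what the same interface might
reduce to in a build without asm goto, where each switch is just a global
flag that gets loaded and tested. The mostly_not_flag_ prefix and the
do_foobar/foobar_calls helpers are made up for illustration, and
__builtin_expect is the GCC builtin standing in for unlikely(). This is not
the macro quoted above, only the fallback it would be measured against.

```c
/* Fallback sketch: no asm goto, so each switch is a plain global int. */
#define DECLARE_MOSTLY_NOT(name) int mostly_not_flag_##name
#define mostly_not(name) \
	__builtin_expect(!!(mostly_not_flag_##name), 0)
#define set_mostly_not(name, onoff) \
	((void)(mostly_not_flag_##name = !!(onoff)))

DECLARE_MOSTLY_NOT(foobar);

/* Hypothetical slow-path work; the counter just makes calls observable. */
static int foobar_calls;

static void do_foobar(int d)
{
	foobar_calls += d;
}

static void foo(int x, int y)
{
	/* Load-and-test of the flag; the jump label version patches this out. */
	if (x > y && mostly_not(foobar))
		do_foobar(x - y);
}
```

Comparing the -O2 and -Os output of foo() for this version against the
asm goto version would show how much the hot path actually changes.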

thanks,

-Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
