Message-ID: <1355148168.17101.165.camel@gandalf.local.home>
Date:	Mon, 10 Dec 2012 09:02:48 -0500
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Will Deacon <will.deacon@....com>
Cc:	"Jon Medhurst (Tixy)" <tixy@...aro.org>,
	Russell King - ARM Linux <linux@....linux.org.uk>,
	Frederic Weisbecker <fweisbec@...il.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Rabin Vincent <rabin@....in>, Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...ux.intel.com>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH] ARM: ftrace: Ensure code modifications are synchronised
 across all cpus

On Mon, 2012-12-10 at 11:24 +0000, Will Deacon wrote:
> On Mon, Dec 10, 2012 at 11:04:05AM +0000, Jon Medhurst (Tixy) wrote:
> > On Fri, 2012-12-07 at 19:02 +0000, Will Deacon wrote:
> > > For ARMv7, there are small subsets of instructions for ARM and Thumb which
> > > are guaranteed to be atomic wrt concurrent modification and execution of
> > > the instruction stream between different processors:
> > > 
> > > Thumb:	The 16-bit encodings of the B, NOP, BKPT, and SVC instructions.
> > > ARM:	The B, BL, NOP, BKPT, SVC, HVC, and SMC instructions.
> > > 
> > 
> > So this means for things like kprobes which can modify arbitrary kernel
> > code we are going to need to continue to always use some form of
> > stop_the_whole_system() function?
> > 
> > Also, kprobes currently uses patch_text() which only uses stop_machine
> > for Thumb2 instructions which straddle a word boundary, so this needs
> > changing?
> 
> Yes; if you're modifying instructions other than those mentioned above, then
> you'll need to synchronise the CPUs, update the instructions, perform
> cache-maintenance on the writing CPU and then execute an isb on the
> executing core (this last bit isn't needed if you're going to go through an
> exception return to get back to the new code -- depends on how your
> stop/resume code works).

Yeah, kprobes optimization will probably always require stop_machine(),
as it's modifying random code, or adding breakpoints into random
places. That's another adventure to deal with at another time.
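
Roughly, the stop_machine() approach looks like this (a minimal sketch
of the idea, not the actual patch_text() code in
arch/arm/kernel/patch.c; the names here are made up):

#include <linux/types.h>
#include <linux/stop_machine.h>
#include <asm/cacheflush.h>

struct patch {
	void *addr;
	u32 insn;
};

static int __patch_one(void *data)
{
	struct patch *p = data;

	/*
	 * Every other CPU is spinning in stop_machine() with
	 * interrupts disabled, so nothing can be executing the
	 * old instruction while we replace it.
	 */
	*(u32 *)p->addr = p->insn;

	/*
	 * Writer-side cache maintenance: clean the D-cache and
	 * invalidate the I-cache. Per Will, the executing CPUs
	 * still need an isb (or an exception return) before they
	 * run the new code.
	 */
	flush_icache_range((unsigned long)p->addr,
			   (unsigned long)p->addr + sizeof(p->insn));
	return 0;
}

static void patch_one(void *addr, u32 insn)
{
	struct patch p = { .addr = addr, .insn = insn };

	stop_machine(__patch_one, &p, NULL);
}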

> 
> For ftrace we can (hopefully) avoid a lot of this when we have known points
> of modification.

I'm also thinking about tracepoints, which behave almost the same as
ftrace. They have nop placeholders too; these happen to be 32 bits,
but may only need to be 16. Tracepoints work by using asm goto. For
example, we have:

arch/arm/include/asm/jump_label.h

#ifdef CONFIG_THUMB2_KERNEL
#define JUMP_LABEL_NOP	"nop.w"
#else
#define JUMP_LABEL_NOP	"nop"
#endif

static __always_inline bool arch_static_branch(struct static_key *key)
{
	asm goto("1:\n\t"
		 JUMP_LABEL_NOP "\n\t"
		 ".pushsection __jump_table,  \"aw\"\n\t"
		 ".word 1b, %l[l_yes], %c0\n\t"
		 ".popsection\n\t"
		 : :  "i" (key) :  : l_yes);

	return false;
l_yes:
	return true;
}

Tracepoints use the jump-label "static branch" logic, which relies on a
gcc 4.6 feature called asm goto. The asm goto construct allows the
inline asm to reference a label outside the asm statement, and the
compiler is aware that the asm may jump to that label. Thus the
compiler treats the asm statement as a possible branch to the given
label, and it won't optimize away statements after the asm if they are
needed when the jump to the label is taken.
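
To see what the feature gives us, here's a toy userspace example of
asm goto (gcc 4.6+; this is not kernel code, just an illustration):

#include <stdio.h>

static int check(void)
{
	/*
	 * The asm body is only a nop, but gcc must assume the asm
	 * can branch to the C label "taken", so the code after the
	 * label cannot be optimized away.
	 */
	asm goto("nop" : /* no outputs */ : /* no inputs */
		 : /* no clobbers */ : taken);
	return 0;		/* fall-through (nop) path */
taken:
	return 1;		/* reached only if the nop gets patched
				   into a branch to this label */
}

int main(void)
{
	printf("%d\n", check());
	return 0;
}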

Now in include/linux/tracepoint.h we have:

	static inline void trace_##name(proto)				\
	{								\
		if (static_key_false(&__tracepoint_##name.key))		\
			__DO_TRACE(&__tracepoint_##name,		\
				TP_PROTO(data_proto),			\
				TP_ARGS(data_args),			\
				TP_CONDITION(cond),,);			\
	}								\

Here static_key_false() is an "unlikely" version of static_branch()
that tells gcc the true case of the if statement belongs in an
unlikely location (the end of the function, perhaps).
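
A user of this pattern looks something like the sketch below (the key
and trace_foo_hook() are made-up names):

static struct static_key foo_key = STATIC_KEY_INIT_FALSE;

void foo(void)
{
	/* Compiles down to a single nop in the fast path. */
	if (static_key_false(&foo_key))
		trace_foo_hook();	/* out-of-line unlikely path */
}

Enabling it with static_key_slow_inc(&foo_key) then patches the nop
into a branch to the unlikely code.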

But this doesn't guarantee that it becomes part of some if statement,
so it doesn't have all the limitations that the ftrace mcount call has.

-- Steve
