Message-ID: <525EBB92.2050103@gmail.com>
Date:	Thu, 17 Oct 2013 00:15:14 +0800
From:	Jiang Liu <liuj97@...il.com>
To:	Will Deacon <will.deacon@....com>
CC:	Steven Rostedt <rostedt@...dmis.org>,
	Catalin Marinas <Catalin.Marinas@....com>,
	Sandeepa Prabhu <sandeepa.prabhu@...aro.org>,
	Jiang Liu <jiang.liu@...wei.com>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 2/7] arm64: introduce interfaces to hotpatch kernel
 and module code

On 10/16/2013 07:11 PM, Will Deacon wrote:
> On Wed, Oct 16, 2013 at 04:18:07AM +0100, Jiang Liu wrote:
>> From: Jiang Liu <jiang.liu@...wei.com>
>>
>> Introduce three interfaces to patch kernel and module code:
>> aarch64_insn_patch_text_nosync():
>> 	patch code without synchronization, it's caller's responsibility
>> 	to synchronize all CPUs if needed.
>> aarch64_insn_patch_text_sync():
>> 	patch code and always synchronize with stop_machine()
>> aarch64_insn_patch_text():
>> 	patch code and synchronize with stop_machine() if needed
>>
>> Signed-off-by: Jiang Liu <jiang.liu@...wei.com>
>> Cc: Jiang Liu <liuj97@...il.com>
>> ---
>>  arch/arm64/include/asm/insn.h |  7 +++-
>>  arch/arm64/kernel/insn.c      | 95 +++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 101 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
>> index e7d1bc8..2dfcdb4 100644
>> --- a/arch/arm64/include/asm/insn.h
>> +++ b/arch/arm64/include/asm/insn.h
>> @@ -47,7 +47,12 @@ __AARCH64_INSN_FUNCS(nop,	0xFFFFFFFF, 0xD503201F)
>>  #undef	__AARCH64_INSN_FUNCS
>>  
>>  enum aarch64_insn_class aarch64_get_insn_class(u32 insn);
>> -
>> +u32 aarch64_insn_read(void *addr);
>> +void aarch64_insn_write(void *addr, u32 insn);
>>  bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
>>  
>> +int aarch64_insn_patch_text_nosync(void *addrs[], u32 insns[], int cnt);
>> +int aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt);
>> +int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt);
>> +
>>  #endif	/* _ASM_ARM64_INSN_H */
>> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
>> index 1be4d11..ad4185f 100644
>> --- a/arch/arm64/kernel/insn.c
>> +++ b/arch/arm64/kernel/insn.c
>> @@ -16,6 +16,8 @@
>>   */
>>  #include <linux/compiler.h>
>>  #include <linux/kernel.h>
>> +#include <linux/stop_machine.h>
>> +#include <asm/cacheflush.h>
>>  #include <asm/insn.h>
>>  
>>  /*
>> @@ -84,3 +86,96 @@ bool __kprobes aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn)
>>  	return __aarch64_insn_hotpatch_safe(old_insn) &&
>>  	       __aarch64_insn_hotpatch_safe(new_insn);
>>  }
>> +
>> +/*
>> + * In ARMv8-A, A64 instructions have a fixed length of 32 bits and are always
>> + * little-endian. On the other hand, SCTLR_EL1.EE (bit 25, Exception Endianness)
>> + * flag controls endianness for EL1 explicit data accesses and stage 1
>> + * translation table walks as below:
>> + *	0: little-endian
>> + *	1: big-endian
>> + * So need to handle endianness when patching kernel code.
>> + */
>> +u32 __kprobes aarch64_insn_read(void *addr)
>> +{
>> +	u32 insn;
>> +
>> +#ifdef	__AARCH64EB__
>> +	insn = swab32(*(u32 *)addr);
>> +#else
>> +	insn = *(u32 *)addr;
>> +#endif
> 
> le32_to_cpu ?
> 
>> +
>> +	return insn;
>> +}
>> +
>> +void __kprobes aarch64_insn_write(void *addr, u32 insn)
>> +{
>> +#ifdef	__AARCH64EB__
>> +	*(u32 *)addr = swab32(insn);
>> +#else
>> +	*(u32 *)addr = insn;
>> +#endif
>> +}
> 
> cpu_to_le32 ?
Good suggestion, that is much simpler.
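
For illustration only, here is a rough userspace model of the semantics the
two accessors would have once they go through le32_to_cpu()/cpu_to_le32().
It is a sketch, not the patch: the insn_read()/insn_write() names are made up
for this example, and the byte shuffling is written out by hand so that it
builds outside the kernel.

```c
#include <stdint.h>

/*
 * A64 instructions are always stored little-endian, whatever the data
 * endianness of the CPU. Assembling the word byte by byte gives the same
 * result on LE and BE hosts, which is what le32_to_cpu() provides in the
 * kernel without an #ifdef __AARCH64EB__ / swab32() pair.
 */
static uint32_t insn_read(const void *addr)
{
	const uint8_t *p = addr;

	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Counterpart of cpu_to_le32(): least significant byte goes first. */
static void insn_write(void *addr, uint32_t insn)
{
	uint8_t *p = addr;

	p[0] = (uint8_t)(insn & 0xFF);
	p[1] = (uint8_t)((insn >> 8) & 0xFF);
	p[2] = (uint8_t)((insn >> 16) & 0xFF);
	p[3] = (uint8_t)((insn >> 24) & 0xFF);
}
```

In the kernel version the bodies would collapse to a single conversion call
around the load or store.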

> 
>> +int __kprobes aarch64_insn_patch_text_nosync(void *addrs[], u32 insns[],
>> +					     int cnt)
>> +{
>> +	int i;
>> +	u32 *tp;
>> +
>> +	if (cnt <= 0)
>> +		return -EINVAL;
>> +
>> +	for (i = 0; i < cnt; i++) {
>> +		tp = addrs[i];
>> +		/* A64 instructions must be word aligned */
>> +		if ((uintptr_t)tp & 0x3)
>> +			return -EINVAL;
>> +		aarch64_insn_write(tp, insns[i]);
>> +		flush_icache_range((uintptr_t)tp, (uintptr_t)tp + sizeof(u32));
> 
> What are you trying to achieve with this cache maintenance for the nosync
> case? If you're not synchronising, then you will always have races with the
> instruction patching, so I'd argue that this cache flush doesn't buy you
> anything.
aarch64_insn_patch_text_nosync() may be used in the following cases:
1) during early boot, with only the master CPU running;
2) at runtime, with all other CPUs in a controlled state, such as under kgdb;
3) when patching hotpatch-safe instructions.
So flush_icache_range() is there to support case 1 and to ensure that the
local CPU sees the new instructions.
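
To make the intent concrete, here is a minimal userspace model of the nosync
loop. It is only a sketch under a few assumptions: patch_text_nosync_model()
is a made-up name, the compiler builtin __builtin___clear_cache() stands in
for flush_icache_range(), and endianness handling is left out.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical userspace model of the nosync patch loop. */
static int patch_text_nosync_model(void *addrs[], const uint32_t insns[],
				   int cnt)
{
	int i;

	if (cnt <= 0)
		return -EINVAL;

	for (i = 0; i < cnt; i++) {
		uint32_t *tp;

		/* A64 instructions must be word aligned */
		if ((uintptr_t)addrs[i] & 0x3)
			return -EINVAL;

		tp = addrs[i];
		*tp = insns[i];

		/*
		 * Stand-in for flush_icache_range(): make sure the CPU doing
		 * the patching fetches the new instruction. Nothing here
		 * synchronizes with other CPUs; that is the caller's job.
		 */
		__builtin___clear_cache((char *)tp, (char *)(tp + 1));
	}

	return 0;
}
```

As in the patch, entries before a misaligned address are already written
when the error is returned.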

> 
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +struct aarch64_insn_patch {
>> +	void	**text_addrs;
>> +	u32	*new_insns;
>> +	int	insn_cnt;
>> +};
>> +
>> +static int __kprobes aarch64_insn_patch_text_cb(void *arg)
>> +{
>> +	struct aarch64_insn_patch *pp = arg;
>> +
>> +	return aarch64_insn_patch_text_nosync(pp->text_addrs, pp->new_insns,
>> +					      pp->insn_cnt);
>> +}
>> +
>> +int __kprobes aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt)
>> +{
>> +	struct aarch64_insn_patch patch = {
>> +		.text_addrs = addrs,
>> +		.new_insns = insns,
>> +		.insn_cnt = cnt,
>> +	};
>> +
>> +	if (cnt <= 0)
>> +		return -EINVAL;
>> +
>> +	/*
>> +	 * Execute __aarch64_insn_patch_text() on every online CPU,
>> +	 * which ensure serialization among all online CPUs.
>> +	 */
>> +	return stop_machine(aarch64_insn_patch_text_cb, &patch, NULL);
>> +}
>> +
>> +int __kprobes aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt)
>> +{
>> +	if (cnt == 1 && aarch64_insn_hotpatch_safe(aarch64_insn_read(addrs[0]),
>> +						   insns[0]))
>> +		return aarch64_insn_patch_text_nosync(addrs, insns, cnt);
>> +	else
>> +		return aarch64_insn_patch_text_sync(addrs, insns, cnt);
>> +}
> 
> The other way of doing this for cnt > 1 would be to patch in a branch to
> the insns array and then a branch over the original code at the end of the
> array. Obviously, this relies on insns being allocated somewhere persistent
> (and non-pageable!).
Kprobes uses the optimization described above, but it is rather complex
and depends on the instructions in the insns array.
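
One building block of that scheme is the branch that gets patched in. A
minimal sketch of encoding an unconditional A64 "B" follows; encode_b() is a
hypothetical helper for this mail, not an existing kernel function, and it
returns 0 on failure purely as a convention of the sketch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: encode "B target" for a branch located at pc.
 * Returns 0 if the branch cannot be encoded.
 */
static uint32_t encode_b(uint64_t pc, uint64_t target)
{
	int64_t offset = (int64_t)(target - pc);

	/* Both ends must be word aligned. */
	if ((pc | target) & 0x3)
		return 0;

	/* imm26 is a signed word offset, so the reach is +/-128MB. */
	if (offset < -(INT64_C(1) << 27) || offset >= (INT64_C(1) << 27))
		return 0;

	return UINT32_C(0x14000000) |
	       (uint32_t)(((uint64_t)offset >> 2) & 0x03FFFFFF);
}
```

The range limit is one reason the insns array cannot live just anywhere, and
any PC-relative instruction copied into the array would need fixing up, which
is where the dependency on the array contents comes from.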

> 
> I was wondering whether you could do something clever with BKPT, but
> everything I've thought of so far is racy.
I need more time to think about this optimization.

As I understand it, it is also not safe to patch a normal instruction
(anything other than B, BL, SVC, HVC, SMC, NOP, BRK) with BRK.
If that is true, it will be challenging to avoid stop_machine()
here.
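
To spell out the rule I have in mind, here is a small standalone sketch of the
check that decides whether stop_machine() is needed. The function and table
names are made up for this mail, and apart from NOP (taken from insn.h) the
mask/value pairs are my reading of the A64 encodings, so please check them
against the architecture manual.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct insn_class {
	uint32_t mask;
	uint32_t val;
};

/* Instructions assumed safe to patch while other CPUs may execute them. */
static const struct insn_class hotpatch_safe[] = {
	{ 0xFC000000, 0x14000000 },	/* B   */
	{ 0xFC000000, 0x94000000 },	/* BL  */
	{ 0xFFE0001F, 0xD4000001 },	/* SVC */
	{ 0xFFE0001F, 0xD4000002 },	/* HVC */
	{ 0xFFE0001F, 0xD4000003 },	/* SMC */
	{ 0xFFE0001F, 0xD4200000 },	/* BRK */
	{ 0xFFFFFFFF, 0xD503201F },	/* NOP */
};

static bool insn_is_hotpatch_safe(uint32_t insn)
{
	size_t i;

	for (i = 0; i < sizeof(hotpatch_safe) / sizeof(hotpatch_safe[0]); i++)
		if ((insn & hotpatch_safe[i].mask) == hotpatch_safe[i].val)
			return true;

	return false;
}

/* Both the old and the new instruction must be in the safe set. */
static bool needs_stop_machine(uint32_t old_insn, uint32_t new_insn)
{
	return !(insn_is_hotpatch_safe(old_insn) &&
		 insn_is_hotpatch_safe(new_insn));
}
```

With this rule, replacing a NOP with a B needs no stop_machine(), while
replacing an ordinary data-processing instruction with BRK still does.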

Thanks!
Gerry
> 
> Will
> 
