lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 20 Jun 2024 10:39:03 +0800
From: "Liao, Chang" <liaochang1@...wei.com>
To: Oleg Nesterov <oleg@...hat.com>
CC: <jolsa@...nel.org>, <rostedt@...dmis.org>, <mhiramat@...nel.org>,
	<ast@...nel.org>, <daniel@...earbox.net>, <andrii@...nel.org>,
	<nathan@...nel.org>, <peterz@...radead.org>, <mingo@...hat.com>,
	<mark.rutland@....com>, <linux-perf-users@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <bpf@...r.kernel.org>
Subject: Re: [PATCH bpf-next] uprobes: Fix the xol slots reserved for
 uretprobe trampoline

Hi, Oleg

在 2024/6/19 22:38, Oleg Nesterov 写道:
> On 06/19, Liao Chang wrote:
>>
>> When the new uretprobe system call was added [1], the xol slots reserved
>> for the uretprobe trampoline might be insufficient on some architecture.
> 
> Confused... this doesn't depend on the change above?

You are right, the uretprobe syscall is specifc to x86_64. This patch wouldn't
address that issue. However, when i asm porting uretprobe trampoline to arm64
to explore its benefits on that architecture, i discovered the problem that
single slot is not large enought for trampoline code.

> 
>> For example, on arm64, the trampoline is consist of three instructions
>> at least. So it should mark enough bits in area->bitmaps and
>> and area->slot_count for the reserved slots.
> 
> Do you mean that on arm64 UPROBE_SWBP_INSN_SIZE > UPROBE_XOL_SLOT_BYTES ?
> 
>>>From arch/arm64/include/asm/uprobes.h
> 
> 	#define MAX_UINSN_BYTES		AARCH64_INSN_SIZE
> 
> 	#define UPROBE_SWBP_INSN	cpu_to_le32(BRK64_OPCODE_UPROBES)
> 	#define UPROBE_SWBP_INSN_SIZE	AARCH64_INSN_SIZE
> 	#define UPROBE_XOL_SLOT_BYTES	MAX_UINSN_BYTES
> 
> 	typedef __le32 uprobe_opcode_t;
> 
> 	struct arch_uprobe_task {
> 	};
> 
> 	struct arch_uprobe {
> 		union {
> 			u8 insn[MAX_UINSN_BYTES];
> 			u8 ixol[MAX_UINSN_BYTES];
> 
> So it seems that UPROBE_SWBP_INSN_SIZE == MAX_UINSN_BYTES and it must
> be less than UPROBE_XOL_SLOT_BYTES, otherwise
> 
> arch_uprobe_copy_ixol(..., uprobe->arch.ixol, sizeof(uprobe->arch.ixol))
> in xol_get_insn_slot() won't fit the slot as well?
This real reason is that the current implmentation seems to assume the
ixol slot has enough space for the uretprobe trampoline. This assumption
is true for x86_64, since the ixol slot is size of 128 bytes.

>From arch/x86/include/asm/uprobes.h
	#define UPROBE_XOL_SLOT_BYTES	128

However, it is not large enough for arm64, the size of which is MAX_UINSN_BYTES
(4bytes).

>From arch/arm64/include/asm/uprobes.h

 	#define MAX_UINSN_BYTES		AARCH64_INSN_SIZE // 4bytes
 	...
 	#define UPROBE_XOL_SLOT_BYTES	MAX_UINSN_BYTES

Here is an exmample of the uretprobe trampoline code used for arm64:

uretprobe_trampoline_for_arm64:
	str x8, [sp, #-8]!
	mov x8, __NR_uretprobe
	svc #0

Above code is 12 bytes in size, exceeding the capacity of single ixol slot
on arm64, so three slots are necessary to be reserved for the entire trampoline.

Thanks.

> 
> OTOH, it look as if UPROBE_SWBP_INSN_SIZE == UPROBE_XOL_SLOT_BYTES, so
> I don't understand the problem...
> 
> Oleg.
> 
>> [1] https://lore.kernel.org/all/20240611112158.40795-4-jolsa@kernel.org/
>>
>> Signed-off-by: Liao Chang <liaochang1@...wei.com>
>> ---
>>  kernel/events/uprobes.c | 11 +++++++----
>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
>> index 2816e65729ac..efd2d7f56622 100644
>> --- a/kernel/events/uprobes.c
>> +++ b/kernel/events/uprobes.c
>> @@ -1485,7 +1485,7 @@ void * __weak arch_uprobe_trampoline(unsigned long *psize)
>>  static struct xol_area *__create_xol_area(unsigned long vaddr)
>>  {
>>  	struct mm_struct *mm = current->mm;
>> -	unsigned long insns_size;
>> +	unsigned long insns_size, slot_nr;
>>  	struct xol_area *area;
>>  	void *insns;
>>  
>> @@ -1508,10 +1508,13 @@ static struct xol_area *__create_xol_area(unsigned long vaddr)
>>  
>>  	area->vaddr = vaddr;
>>  	init_waitqueue_head(&area->wq);
>> -	/* Reserve the 1st slot for get_trampoline_vaddr() */
>> -	set_bit(0, area->bitmap);
>> -	atomic_set(&area->slot_count, 1);
>>  	insns = arch_uprobe_trampoline(&insns_size);
>> +	/* Reserve enough slots for the uretprobe trampoline */
>> +	for (slot_nr = 0;
>> +	     slot_nr < max((insns_size / UPROBE_XOL_SLOT_BYTES), 1);
>> +	     slot_nr++)
>> +		set_bit(slot_nr, area->bitmap);
>> +	atomic_set(&area->slot_count, slot_nr);
>>  	arch_uprobe_copy_ixol(area->pages[0], 0, insns, insns_size);
>>  
>>  	if (!xol_add_vma(mm, area))
>> -- 
>> 2.34.1
>>
> 
> 

-- 
BR
Liao, Chang

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ