Message-ID: <CAHsH6Gst+UGCtiCaNq2ikaknZGghpTq2SFZX7S0A8=uDsXt=Zw@mail.gmail.com>
Date: Tue, 14 Jan 2025 06:08:36 -0800
From: Eyal Birger <eyal.birger@...il.com>
To: Jiri Olsa <olsajiri@...il.com>
Cc: oleg@...hat.com, Aleksa Sarai <cyphar@...har.com>, mhiramat@...nel.org, 
	linux-kernel <linux-kernel@...r.kernel.org>, linux-trace-kernel@...r.kernel.org, 
	BPF-dev-list <bpf@...r.kernel.org>, Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>, 
	John Fastabend <john.fastabend@...il.com>, peterz@...radead.org, tglx@...utronix.de, 
	bp@...en8.de, x86@...nel.org, linux-api@...r.kernel.org, 
	Andrii Nakryiko <andrii@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	Alexei Starovoitov <ast@...nel.org>, Andrii Nakryiko <andrii.nakryiko@...il.com>, 
	"rostedt@...dmis.org" <rostedt@...dmis.org>, rafi@....io, 
	Shmulik Ladkani <shmulik.ladkani@...il.com>
Subject: Re: Crash when attaching uretprobes to processes running in Docker

Hi Jiri,

On Tue, Jan 14, 2025 at 1:22 AM Jiri Olsa <olsajiri@...il.com> wrote:
>
> On Sat, Jan 11, 2025 at 07:40:15PM +0100, Jiri Olsa wrote:
> > On Sat, Jan 11, 2025 at 02:25:37AM +1100, Aleksa Sarai wrote:
> > > On 2025-01-10, Eyal Birger <eyal.birger@...il.com> wrote:
> > > > Hi,
> > > >
> > > > When attaching uretprobes to processes running inside docker, the attached
> > > > process is segfaulted when encountering the retprobe. The offending commit
> > > > is:
> > > >
> > > > ff474a78cef5 ("uprobe: Add uretprobe syscall to speed up return probe")
> > > >
> > > > To my understanding, the reason is that now that uretprobe is a system call,
> > > > the default seccomp filters in docker block it as they only allow a specific
> > > > set of known syscalls.
> > >
> > > FWIW, the default seccomp profile of Docker _should_ return -ENOSYS for
> > > uretprobe (runc has a bunch of ugly logic to try to guarantee this if
> > > Docker hasn't updated their profile to include it). Though I guess that
> > > isn't sufficient for the magic that uretprobe(2) does...
> > >
> > > > This behavior can be reproduced by the below bash script, which works before
> > > > this commit.
> > > >
> > > > Reported-by: Rafael Buchbinder <rafi@....io>
> >
> > hi,
> > nice ;-) thanks for the report, the problem seems to be that uretprobe syscall
> > is blocked and uretprobe trampoline does not expect that
> >
> > I think we could add code to the uretprobe trampoline to detect this and
> > execute standard int3 as fallback to process uretprobe, I'm checking on that
>
> hack below seems to fix the issue, it's using rbx to signal that uretprobe
> syscall got executed, if not, trampoline does int3 and executes uretprobe
> handler in the old way

FWIW, if I change the seccomp policy to SCMP_ACT_KILL, this still fails.
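
A rough userspace model of why that would be expected (not kernel code; the
enum and function names below are made up for illustration). With a kill
action the task is terminated at the syscall instruction, so the rbx check in
the trampoline never gets a chance to run. Only an errno-style verdict leaves
rbx untouched and reaches the int3 fallback:

```c
#include <stdbool.h>

/* Hypothetical model of the patched trampoline, not kernel code. */
enum filter_action { FILTER_ALLOW, FILTER_ERRNO, FILTER_KILL };
enum outcome { VIA_SYSCALL, VIA_INT3_FALLBACK, TASK_KILLED };

enum outcome trampoline_outcome(enum filter_action action)
{
	unsigned long rbx = 1;	/* trampoline: movq $1, %rbx before syscall */

	switch (action) {
	case FILTER_KILL:
		/* Task dies at the syscall instruction; no check is reached. */
		return TASK_KILLED;
	case FILTER_ALLOW:
		rbx = 0;	/* syscall handler: regs->bx = 0 */
		break;
	case FILTER_ERRNO:
		/* Syscall body is skipped, rbx keeps its value. */
		break;
	}

	/* uretprobe_syscall_check: rbx == 0 means the syscall really ran. */
	return rbx == 0 ? VIA_SYSCALL : VIA_INT3_FALLBACK;
}
```

So the fallback can only help for profiles that fail the syscall with an
error, not for ones that kill on unknown syscalls.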

Eyal.

>
> unfortunately now the uretprobe trampoline size crosses the xol slot limit so
> will need to come up with some generic/arch code solution for that, code below
> is neglecting that for now


>
> jirka
>
>
> ---
>  arch/x86/kernel/uprobes.c | 24 ++++++++++++++++++++++++
>  include/linux/uprobes.h   |  1 +
>  kernel/events/uprobes.c   | 10 ++++++++--
>  3 files changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> index 5a952c5ea66b..b54863f6fa25 100644
> --- a/arch/x86/kernel/uprobes.c
> +++ b/arch/x86/kernel/uprobes.c
> @@ -315,14 +315,25 @@ asm (
>         ".global uretprobe_trampoline_entry\n"
>         "uretprobe_trampoline_entry:\n"
>         "pushq %rax\n"
> +       "pushq %rbx\n"
>         "pushq %rcx\n"
>         "pushq %r11\n"
> +       "movq $1, %rbx\n"
>         "movq $" __stringify(__NR_uretprobe) ", %rax\n"
>         "syscall\n"
>         ".global uretprobe_syscall_check\n"
>         "uretprobe_syscall_check:\n"
> +       "or %rbx,%rbx\n"
> +       "jz uretprobe_syscall_return\n"
>         "popq %r11\n"
>         "popq %rcx\n"
> +       "popq %rbx\n"
> +       "popq %rax\n"
> +       "int3\n"
> +       "uretprobe_syscall_return:\n"
> +       "popq %r11\n"
> +       "popq %rcx\n"
> +       "popq %rbx\n"
>
>         /* The uretprobe syscall replaces stored %rax value with final
>          * return address, so we don't restore %rax in here and just
> @@ -338,6 +349,16 @@ extern u8 uretprobe_trampoline_entry[];
>  extern u8 uretprobe_trampoline_end[];
>  extern u8 uretprobe_syscall_check[];
>
> +#define UINSNS_PER_PAGE                 (PAGE_SIZE/UPROBE_XOL_SLOT_BYTES)
> +
> +bool arch_is_uretprobe_trampoline(unsigned long vaddr)
> +{
> +       unsigned long start = uprobe_get_trampoline_vaddr();
> +       unsigned long end = start + 2*UINSNS_PER_PAGE;
> +
> +       return vaddr >= start && vaddr < end;
> +}
> +
>  void *arch_uprobe_trampoline(unsigned long *psize)
>  {
>         static uprobe_opcode_t insn = UPROBE_SWBP_INSN;
> @@ -418,6 +439,9 @@ SYSCALL_DEFINE0(uretprobe)
>         regs->r11 = regs->flags;
>         regs->cx  = regs->ip;
>
> +       /* zero rbx to signal trampoline that uretprobe syscall was executed */
> +       regs->bx  = 0;
> +
>         return regs->ax;
>
>  sigill:
> diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
> index e0a4c2082245..dbde57a68a1b 100644
> --- a/include/linux/uprobes.h
> +++ b/include/linux/uprobes.h
> @@ -213,6 +213,7 @@ extern void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
>  extern void uprobe_handle_trampoline(struct pt_regs *regs);
>  extern void *arch_uprobe_trampoline(unsigned long *psize);
>  extern unsigned long uprobe_get_trampoline_vaddr(void);
> +bool arch_is_uretprobe_trampoline(unsigned long vaddr);
>  #else /* !CONFIG_UPROBES */
>  struct uprobes_state {
>  };
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index fa04b14a7d72..73df64109f38 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1703,6 +1703,11 @@ void * __weak arch_uprobe_trampoline(unsigned long *psize)
>         return &insn;
>  }
>
> +bool __weak arch_is_uretprobe_trampoline(unsigned long vaddr)
> +{
> +       return vaddr == uprobe_get_trampoline_vaddr();
> +}
> +
>  static struct xol_area *__create_xol_area(unsigned long vaddr)
>  {
>         struct mm_struct *mm = current->mm;
> @@ -1725,8 +1730,9 @@ static struct xol_area *__create_xol_area(unsigned long vaddr)
>
>         area->vaddr = vaddr;
>         init_waitqueue_head(&area->wq);
> -       /* Reserve the 1st slot for get_trampoline_vaddr() */
> +       /* Reserve the first two slots for get_trampoline_vaddr() */
>         set_bit(0, area->bitmap);
> +       set_bit(1, area->bitmap);
>         insns = arch_uprobe_trampoline(&insns_size);
>         arch_uprobe_copy_ixol(area->page, 0, insns, insns_size);
>
> @@ -2536,7 +2542,7 @@ static void handle_swbp(struct pt_regs *regs)
>         int is_swbp;
>
>         bp_vaddr = uprobe_get_swbp_addr(regs);
> -       if (bp_vaddr == uprobe_get_trampoline_vaddr())
> +       if (arch_is_uretprobe_trampoline(bp_vaddr))
>                 return uprobe_handle_trampoline(regs);
>
>         rcu_read_lock_trace();
> --
> 2.47.1
>
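
For reference, the address check that handle_swbp() relies on in the hack can
be modelled in userspace roughly as below. This is only a sketch: it assumes
the intent is to cover exactly the two reserved xol slots, sizes the range in
bytes as slots times slot size, and assumes 128 bytes for
UPROBE_XOL_SLOT_BYTES. The names SLOT_BYTES, RESERVED_SLOTS and
in_trampoline_range() are made up for illustration.

```c
#include <stdbool.h>

#define SLOT_BYTES	128UL	/* assumed value of UPROBE_XOL_SLOT_BYTES */
#define RESERVED_SLOTS	2UL	/* slots 0 and 1 hold the trampoline */

/* True if vaddr falls inside the reserved trampoline slots of the area. */
bool in_trampoline_range(unsigned long area_start, unsigned long vaddr)
{
	unsigned long end = area_start + RESERVED_SLOTS * SLOT_BYTES;

	return vaddr >= area_start && vaddr < end;
}
```

With this model, an int3 anywhere in the first 256 bytes of the area is
treated as a trampoline hit, while a breakpoint in the third slot or later
goes through the normal uprobe lookup.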
