Message-ID: <Y2je9dJxUjEchB9k@FVFF77S0Q05N>
Date:   Mon, 7 Nov 2022 10:33:25 +0000
From:   Mark Rutland <mark.rutland@....com>
To:     Alexander Potapenko <glider@...gle.com>
Cc:     Baisong Zhong <zhongbaisong@...wei.com>, elver@...gle.com,
        Catalin Marinas <catalin.marinas@....com>, edumazet@...gle.com,
        keescook@...omium.org, kuba@...nel.org, ast@...nel.org,
        daniel@...earbox.net, davem@...emloft.net, pabeni@...hat.com,
        linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
        netdev@...r.kernel.org
Subject: Re: [PATCH -next,v2] bpf, test_run: fix alignment problem in
 bpf_prog_test_run_skb()

On Fri, Nov 04, 2022 at 06:06:05PM +0100, Alexander Potapenko wrote:
> On Wed, Nov 2, 2022 at 9:16 AM Baisong Zhong <zhongbaisong@...wei.com> wrote:
> >
> > we got a syzkaller problem because of aarch64 alignment fault
> > if KFENCE enabled.
> >
> > When the size from user bpf program is an odd number, like
> > 399, 407, etc, it will cause the struct skb_shared_info's
> > unaligned access. As seen below:
> >
> > BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
> 
> It's interesting that KFENCE is reporting a UAF without a deallocation
> stack here.
> 
> Looks like an unaligned access to 0xffff6254fffac077 causes the ARM
> CPU to throw a fault handled by __do_kernel_fault()

Importantly, an unaligned *atomic*, which is a bug regardless of KFENCE.

> This isn't technically a page fault, but anyway the access address
> gets passed to kfence_handle_page_fault(), which defaults to a
> use-after-free, because the address belongs to the object page, not
> the redzone page.
> 
> Catalin, Mark, what is the right way to only handle traps caused by
> reading/writing to a page for which `set_memory_valid(addr, 1, 0)` was
> called?

That should appear as a translation fault, so we could add an
is_el1_translation_fault() helper for that. I can't immediately recall how
misaligned atomics are presented, but I presume as something other than a
translation fault.

If the below works for you, I can go spin that as a real patch.

Mark.

---->8----
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 5b391490e045b..1de4b6afa8515 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -239,6 +239,11 @@ static bool is_el1_data_abort(unsigned long esr)
        return ESR_ELx_EC(esr) == ESR_ELx_EC_DABT_CUR;
 }
 
+static bool is_el1_translation_fault(unsigned long esr)
+{
+       return (esr & ESR_ELx_FSC_TYPE) == ESR_ELx_FSC_FAULT;
+}
+
 static inline bool is_el1_permission_fault(unsigned long addr, unsigned long esr,
                                           struct pt_regs *regs)
 {
@@ -385,7 +390,8 @@ static void __do_kernel_fault(unsigned long addr, unsigned long esr,
        } else if (addr < PAGE_SIZE) {
                msg = "NULL pointer dereference";
        } else {
-               if (kfence_handle_page_fault(addr, esr & ESR_ELx_WNR, regs))
+               if (is_el1_translation_fault(esr) &&
+                   kfence_handle_page_fault(addr, esr & ESR_ELx_WNR, regs))
                        return;
 
                msg = "paging request";
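
For illustration only, and not part of the patch above: a minimal standalone sketch of why a check on the fault status code separates the two cases. It assumes the constant values from arch/arm64/include/asm/esr.h (ESR_ELx_FSC_TYPE = 0x3C, ESR_ELx_FSC_FAULT = 0x04) and the architectural DFSC encoding for an alignment fault (0b100001). The sketch_* names and FSC_* macros are hypothetical and do not exist in the kernel. Whether a misaligned atomic is actually reported as an alignment fault is the presumption stated in the mail, not something this sketch establishes.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed values, copied by hand for a standalone build:
 *   0x3C mirrors ESR_ELx_FSC_TYPE  (masks off the level bits [1:0])
 *   0x04 mirrors ESR_ELx_FSC_FAULT (translation fault, level 0..3)
 *   0x21 is the architectural DFSC encoding for an alignment fault
 */
#define FSC_TYPE_MASK   0x3CUL
#define FSC_FULL_MASK   0x3FUL
#define FSC_TRANSLATION 0x04UL
#define FSC_ALIGNMENT   0x21UL

/* Same test as the proposed is_el1_translation_fault() helper. */
static bool sketch_is_translation_fault(uint64_t esr)
{
	return (esr & FSC_TYPE_MASK) == FSC_TRANSLATION;
}

/* Hypothetical helper: matches the full 6-bit status code, no level bits. */
static bool sketch_is_alignment_fault(uint64_t esr)
{
	return (esr & FSC_FULL_MASK) == FSC_ALIGNMENT;
}
```

With these values, a status code of 0x07 (translation fault, level 3) passes the translation check, while 0x21 masks to 0x20 and does not, so an alignment fault would no longer reach kfence_handle_page_fault().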
