Date:   Mon, 27 Feb 2023 03:13:45 +0100
From:   Andrey Konovalov <andreyknvl@...il.com>
To:     袁帅(Shuai Yuan) <yuanshuai@...u.com>,
        Catalin Marinas <catalin.marinas@....com>
Cc:     Dmitry Vyukov <dvyukov@...gle.com>,
        欧阳炜钊(Weizhao Ouyang) 
        <ouyangweizhao@...u.com>, Andrey Ryabinin <ryabinin.a.a@...il.com>,
        Alexander Potapenko <glider@...gle.com>,
        Vincenzo Frascino <vincenzo.frascino@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "kasan-dev@...glegroups.com" <kasan-dev@...glegroups.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Weizhao Ouyang <o451686892@...il.com>,
        任立鹏(Peng Ren) <renlipeng@...u.com>,
        Peter Collingbourne <pcc@...gle.com>
Subject: Re: [PATCH v2] kasan: fix deadlock in start_report()

On Wed, Feb 15, 2023 at 2:22 PM 袁帅(Shuai Yuan) <yuanshuai@...u.com> wrote:
>
> I have gathered enough information to clarify the problem and possible
> solutions. I made a few changes to the code to do this.
>
> a) I was testing on a device that had hardware issues with MTE,
>     and the memory tags sometimes changed randomly.

Ah, I see. Faulty hardware explains the problem. Thank you!

> f) From the above log, you can see that the system tried to call kasan_report() twice,
>    because we access the kmem_cache through a tagged address and its tag has changed.
>    Normally this does not happen easily. So I think we can add kasan_reset_tag() to handle
>    the kmem_cache address.
>
>    For example, the following changes are used for the latest kernel version.
> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> --- a/mm/kasan/report.c
> +++ b/mm/kasan/report.c
> @@ -412,7 +412,7 @@ static void complete_report_info(struct kasan_report_info *info)
>         slab = kasan_addr_to_slab(addr);
>         if (slab) {
> -               info->cache = slab->slab_cache;
> +               info->cache = kasan_reset_tag(slab->slab_cache);

This fixes the problem for accesses to slab_cache, but KASAN reporting
code also accesses stack depot memory and calls other routines that
might access (faulty) tagged memory. And the accessed addresses aren't
exposed to KASAN code, so we can't use kasan_reset_tag for those.

I wonder what would be a good solution here. I really don't want to
use kasan_depth or some other global/per-cpu flag here, as it would be
too attractive a target for attackers wishing to bypass MTE. Perhaps
disabling MTE once reporting has started would be a better option:
calling the disabling routine would arguably be harder for an attacker
than overwriting a flag.

+Catalin, would it be acceptable to implement a routine that disables
in-kernel MTE tag checking (until the next
mte_enable_kernel_sync/async/asymm call)? This would be similar to what
an MTE fault does, but without the fault itself. I.e., expose the
do_tag_recovery part of the functionality without report_tag_fault?

TL;DR on the problem: Besides relying on CPU tag checks, KASAN also
does explicit tag checks to detect double-frees and similar problems,
see the calls to kasan_report_invalid_free. Thus, when e.g. a
double-free report is printed, MTE checking is still on. This results
in a deadlock in case invalid memory is accessed during KASAN
reporting.

Thanks!
