linux-kernel - Re: [PATCH v2] kasan: fix deadlock in start

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Y/4nJEHeUAEBsj6y@arm.com>
Date:   Tue, 28 Feb 2023 16:09:08 +0000
From:   Catalin Marinas <catalin.marinas@....com>
To:     Andrey Konovalov <andreyknvl@...il.com>
Cc:     袁帅(Shuai Yuan) <yuanshuai@...u.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        欧阳炜钊(Weizhao Ouyang) 
        <ouyangweizhao@...u.com>, Andrey Ryabinin <ryabinin.a.a@...il.com>,
        Alexander Potapenko <glider@...gle.com>,
        Vincenzo Frascino <vincenzo.frascino@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "kasan-dev@...glegroups.com" <kasan-dev@...glegroups.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Weizhao Ouyang <o451686892@...il.com>,
        任立鹏(Peng Ren) <renlipeng@...u.com>,
        Peter Collingbourne <pcc@...gle.com>
Subject: Re: [PATCH v2] kasan: fix deadlock in start_report()

On Mon, Feb 27, 2023 at 03:13:45AM +0100, Andrey Konovalov wrote:
> +Catalin, would it be acceptable to implement a routine that disables
> in-kernel MTE tag checking (until the next
> mte_enable_kernel_sync/async/asymm call)? In a similar way an MTE
> fault does this, but without the fault itself. I.e., expose the part
> of do_tag_recovery functionality without report_tag_fault?

I don't think we ever re-enable MTE after do_tag_recovery(). The
mte_enable_kernel_*() are called at boot. We do call
kasan_enable_tagging() explicitly in the kunit tests but that's a
controlled fault environment.

IIUC, the problem is that the kernel already got an MTE fault, so at
that point the error is not really recoverable. If we want to avoid a
fault in the first place, we could do something like
__uaccess_enable_tco() (Vincenzo has some patches to generalise these
routines) but if an MTE fault already triggered and MTE is to stay
disabled after the reporting anyway, I don't think it's worth it.

So I wonder whether it's easier to just disable MTE before calling
report_tag_fault() so that it won't trigger additional faults:

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index f4cb0f85ccf4..1449d2bc6f10 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -329,8 +329,6 @@ static void do_tag_recovery(unsigned long addr, unsigned long esr,
 			   struct pt_regs *regs)
 {
 
-	report_tag_fault(addr, esr, regs);
-
 	/*
 	 * Disable MTE Tag Checking on the local CPU for the current EL.
 	 * It will be done lazily on the other CPUs when they will hit a
@@ -339,6 +337,8 @@ static void do_tag_recovery(unsigned long addr, unsigned long esr,
 	sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF_MASK,
 			 SYS_FIELD_PREP_ENUM(SCTLR_EL1, TCF, NONE));
 	isb();
+
+	report_tag_fault(addr, esr, regs);
 }
 
 static bool is_el1_mte_sync_tag_check_fault(unsigned long esr)

-- 
Catalin