linux-kernel - Re: [PATCH v13 4/7] arm64: mte: Enable TCO in functions that can read beyond buffer limits


Message-ID: <c3d565da-c446-dea2-266e-ef35edabca9c@arm.com>
Date:   Mon, 22 Feb 2021 12:08:07 +0000
From:   Vincenzo Frascino <vincenzo.frascino@....com>
To:     Catalin Marinas <catalin.marinas@....com>
Cc:     linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        kasan-dev@...glegroups.com,
        Andrew Morton <akpm@...ux-foundation.org>,
        Will Deacon <will@...nel.org>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Alexander Potapenko <glider@...gle.com>,
        Marco Elver <elver@...gle.com>,
        Evgenii Stepanov <eugenis@...gle.com>,
        Branislav Rankov <Branislav.Rankov@....com>,
        Andrey Konovalov <andreyknvl@...gle.com>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>
Subject: Re: [PATCH v13 4/7] arm64: mte: Enable TCO in functions that can read
 beyond buffer limits



On 2/12/21 5:21 PM, Catalin Marinas wrote:
>> +
>> +	/*
>> +	 * This function is called on each active smp core at boot
>> +	 * time, hence we do not need to take cpu_hotplug_lock again.
>> +	 */
>> +	static_branch_enable_cpuslocked(&mte_async_mode);
>>  }
> Sorry, I missed the cpuslocked aspect before. Is there any reason you
> need to use this API here? I suggested to add it to the
> mte_enable_kernel_sync() because kasan may at some point do this
> dynamically at run-time, so the boot-time argument doesn't hold. But
> it's also incorrect as this function will be called for hot-plugged
> CPUs as well after boot.
> 
> The only reason for static_branch_*_cpuslocked() is if it's called from
> a region that already invoked cpus_read_lock() which I don't think is
> the case here.

I agree with your analysis of when static_branch_*_cpuslocked() is needed. In
fact, cpus_read_lock() takes cpu_hotplug_lock, which is the lock that the
comment above that line of code refers to.

If I try to take that lock when enabling the secondary cores I end up in the
situation below:

[    0.283402] smp: Bringing up secondary CPUs ...
....
[    5.890963] Call trace:
[    5.891050]  dump_backtrace+0x0/0x19c
[    5.891212]  show_stack+0x18/0x70
[    5.891373]  dump_stack+0xd0/0x12c
[    5.891531]  dequeue_task_idle+0x28/0x40
[    5.891686]  __schedule+0x45c/0x6c0
[    5.891851]  schedule+0x70/0x104
[    5.892010]  percpu_rwsem_wait+0xe8/0x104
[    5.892174]  __percpu_down_read+0x5c/0x90
[    5.892332]  percpu_down_read.constprop.0+0xbc/0xd4
[    5.892497]  cpus_read_lock+0x10/0x1c
[    5.892660]  static_key_enable+0x18/0x3c
[    5.892823]  mte_enable_kernel_async+0x40/0x70
[    5.892988]  kasan_init_hw_tags_cpu+0x50/0x60
[    5.893144]  cpu_enable_mte+0x24/0x70
[    5.893304]  verify_local_cpu_caps+0x58/0x120
[    5.893465]  check_local_cpu_capabilities+0x18/0x1f0
[    5.893626]  secondary_start_kernel+0xe0/0x190
[    5.893790]  0x0
[    5.893975] bad: scheduling from the idle thread!
[    5.894065] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W 5.11.0-rc7-10587-g22cd50bcfcf-dirty #6

and the kernel panics.

Note: a lot of messages are dropped between bringing up the secondary CPUs and
the first clean stack trace.
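
To make the trace above easier to follow, here is a minimal userspace sketch of
the two paths. It is only a model: every model_* name is hypothetical and none
of it is the kernel's implementation. It assumes that the CPU driving the
bring-up holds cpu_hotplug_lock for writing, and that the secondary CPU runs in
its idle thread, which must not sleep.

```c
#include <stdbool.h>

/* Hypothetical model of the calling context; not a kernel structure. */
struct model_ctx {
	bool can_sleep;     /* false for the idle thread of a CPU coming up */
	bool writer_active; /* assumption: bring-up holds cpu_hotplug_lock for write */
};

enum model_result { MODEL_OK, MODEL_BAD_SCHEDULE };

/*
 * Models cpus_read_lock(): with no writer the read lock is taken at once.
 * With a writer the caller has to sleep, which is fatal in a context that
 * cannot sleep ("bad: scheduling from the idle thread!").
 */
static enum model_result model_cpus_read_lock(const struct model_ctx *c)
{
	if (c->writer_active && !c->can_sleep)
		return MODEL_BAD_SCHEDULE;
	return MODEL_OK; /* a sleepable caller simply waits for the writer */
}

/* Models static_key_enable(): takes the hotplug lock around the update. */
static enum model_result model_static_key_enable(const struct model_ctx *c,
						 bool *key)
{
	enum model_result r = model_cpus_read_lock(c);

	if (r != MODEL_OK)
		return r;
	*key = true;
	/* the matching cpus_read_unlock() would go here */
	return MODEL_OK;
}

/*
 * Models static_key_enable_cpuslocked(): never touches the lock, so it
 * cannot sleep. Hotplug exclusion is entirely the caller's responsibility.
 */
static enum model_result
model_static_key_enable_cpuslocked(const struct model_ctx *c, bool *key)
{
	(void)c;
	*key = true;
	return MODEL_OK;
}
```

With can_sleep = false and writer_active = true, the first function returns
MODEL_BAD_SCHEDULE and leaves the key untouched, while the _cpuslocked variant
sets the key. Whether skipping the lock is actually safe for hot-plugged CPUs
after boot is the open question in this thread, and the sketch does not
answer it.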

-- 
Regards,
Vincenzo