linux-kernel - Re: [RFC PATCH v1 2/5] arm64: Add BBM Level 2 cpu feature

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <21e72a5d-d488-4011-9abc-9f717e1c6643@os.amperecomputing.com>
Date: Fri, 13 Dec 2024 14:34:57 -0800
From: Yang Shi <yang@...amperecomputing.com>
To: Mikołaj Lenczewski <miko.lenczewski@....com>,
 Will Deacon <will@...nel.org>
Cc: catalin.marinas@....com, corbet@....net, maz@...nel.org,
 oliver.upton@...ux.dev, joey.gouly@....com, suzuki.poulose@....com,
 yuzenghui@...wei.com, linux-arm-kernel@...ts.infradead.org,
 liunx-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
 kvmarm@...r.kernel.org
Subject: Re: [RFC PATCH v1 2/5] arm64: Add BBM Level 2 cpu feature



On 12/13/24 8:17 AM, Mikołaj Lenczewski wrote:
>>> +static int do_conflict_abort(unsigned long far, unsigned long esr,
>>> +			     struct pt_regs *regs)
>>> +{
>>> +	if (!system_supports_bbml2())
>>> +		return do_bad(far, esr, regs);
>>> +
>>> +	/* if we receive a TLB conflict abort, we know that there are multiple
>>> +	 * TLB entries that translate the same address range. the minimum set
>>> +	 * of invalidations to clear these entries is implementation defined.
>>> +	 * the maximum set is defined as either tlbi(vmalls12e1) or tlbi(alle1).
>>> +	 *
>>> +	 * if el2 is enabled and stage 2 translation enabled, this may be
>>> +	 * raised as a stage 2 abort. if el2 is enabled but stage 2 translation
>>> +	 * disabled, or if el2 is disabled, it will be raised as a stage 1
>>> +	 * abort.
>>> +	 *
>>> +	 * local_flush_tlb_all() does a tlbi(vmalle1), which is enough to
>>> +	 * handle a stage 1 abort.
>>> +	 */
>>> +
>>> +	local_flush_tlb_all();
>>> +
>>> +	return 0;
>>> +}
>> Can we actually guarantee that we make it this far without taking another
>> abort? Given that I'm yet to see one of these things in the wild, I'm
>> fairly opposed to pretending that we can handle them. We'd be much better
>> off only violating BBM on CPUs that are known to handle the conflict
>> gracefully. Judging by your later patch, this is practically keyed off
>> the MIDR _anyway_...
>>
>> Will

Hi Mikołaj,

> Thanks for reviewing. Apologies for the delay in responding, and for
> spam (replied instead of group-replied).
>
> There should not be an option to take another fault while performing the
> handler, as long as the mappings covering the fault handler table or any
> code in this path are not screwed with. This is discussed further in the
> resent patch series [1].

Will lead me to this thread when we discussed about my series 
(https://lore.kernel.org/lkml/20241211223034.GA17836@willie-the-truck/#t). 
My series tried to use large mapping for kernel linear mapping when 
rodata=full. The common part of the both series is BBM lv2 support. My 
series depends on BBM lv2 to change page table block size when changing 
permission for kernel linear mapping.

Handling TLB conflict should be ok for your usecase since the conflict 
should just happen in user address space. But it may be not fine for my 
usecase. At least the TLB conflict handler needs to depend on 
CONFIG_VMAP_STACK otherwise the kernel stack pointer points to kernel 
linear address space. I'm not quite sure whether there is any other 
corner case which can trigger recursive abort in the path or not, maybe 
bad stack handler? So Will suggested MIDR based lookup. IIUC, he 
suggested just enable BBM lv2 support on the CPUs which can handle TLB 
conflict gracefully. This should make our lives easier. But the downside 
is maybe fewer CPUs can actually advertise BBM lv2 support.

Thanks,
Yang

> The MIDR revisions will be fixed. I was confused as to which revisions
> were affected on an earlier version of the series, and had missed
> updating them. The kconfig workarounds should be correct in this regard.
>
> [1]:https://lore.kernel.org/all/084c5ada-51af-4c1a-b50a-4401e62ddbd6@arm.com/
>