[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20241219162535.GD24724@willie-the-truck>
Date: Thu, 19 Dec 2024 16:25:35 +0000
From: Will Deacon <will@...nel.org>
To: Yang Shi <yang@...amperecomputing.com>
Cc: Mikołaj Lenczewski <miko.lenczewski@....com>,
catalin.marinas@....com, corbet@....net, maz@...nel.org,
oliver.upton@...ux.dev, joey.gouly@....com, suzuki.poulose@....com,
yuzenghui@...wei.com, linux-arm-kernel@...ts.infradead.org,
liunx-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
kvmarm@...r.kernel.org
Subject: Re: [RFC PATCH v1 2/5] arm64: Add BBM Level 2 cpu feature
On Fri, Dec 13, 2024 at 02:34:57PM -0800, Yang Shi wrote:
> On 12/13/24 8:17 AM, Mikołaj Lenczewski wrote:
> > > > +static int do_conflict_abort(unsigned long far, unsigned long esr,
> > > > + struct pt_regs *regs)
> > > > +{
> > > > + if (!system_supports_bbml2())
> > > > + return do_bad(far, esr, regs);
> > > > +
> > > > + /* if we receive a TLB conflict abort, we know that there are multiple
> > > > + * TLB entries that translate the same address range. the minimum set
> > > > + * of invalidations to clear these entries is implementation defined.
> > > > + * the maximum set is defined as either tlbi(vmalls12e1) or tlbi(alle1).
> > > > + *
> > > > + * if el2 is enabled and stage 2 translation enabled, this may be
> > > > + * raised as a stage 2 abort. if el2 is enabled but stage 2 translation
> > > > + * disabled, or if el2 is disabled, it will be raised as a stage 1
> > > > + * abort.
> > > > + *
> > > > + * local_flush_tlb_all() does a tlbi(vmalle1), which is enough to
> > > > + * handle a stage 1 abort.
> > > > + */
> > > > +
> > > > + local_flush_tlb_all();
> > > > +
> > > > + return 0;
> > > > +}
> > > Can we actually guarantee that we make it this far without taking another
> > > abort? Given that I'm yet to see one of these things in the wild, I'm
> > > fairly opposed to pretending that we can handle them. We'd be much better
> > > off only violating BBM on CPUs that are known to handle the conflict
> > > gracefully. Judging by your later patch, this is practically keyed off
> > > the MIDR _anyway_...
> > >
> > > Will
>
> Hi Mikołaj,
>
> > Thanks for reviewing. Apologies for the delay in responding, and for
> > spam (replied instead of group-replied).
> >
> > There should not be an option to take another fault while performing the
> > handler, as long as the mappings covering the fault handler table or any
> > code in this path are not screwed with. This is discussed further in the
> > resent patch series [1].
>
> Will lead me to this thread when we discussed about my series
> (https://lore.kernel.org/lkml/20241211223034.GA17836@willie-the-truck/#t).
> My series tried to use large mapping for kernel linear mapping when
> rodata=full. The common part of the both series is BBM lv2 support. My
> series depends on BBM lv2 to change page table block size when changing
> permission for kernel linear mapping.
>
> Handling TLB conflict should be ok for your usecase since the conflict
> should just happen in user address space. But it may be not fine for my
> usecase. At least the TLB conflict handler needs to depend on
> CONFIG_VMAP_STACK otherwise the kernel stack pointer points to kernel linear
> address space. I'm not quite sure whether there is any other corner case
> which can trigger recursive abort in the path or not, maybe bad stack
> handler? So Will suggested MIDR based lookup. IIUC, he suggested just enable
> BBM lv2 support on the CPUs which can handle TLB conflict gracefully. This
> should make our lives easier. But the downside is maybe fewer CPUs can
> actually advertise BBM lv2 support.
Since most of them seem to be broken anyway, I still think this (having
an allowlist and not handling the conflict abort in the kernel) is the
right approach.
Will
Powered by blists - more mailing lists