Message-ID: <Ztrqie427P_Lxaoh@J2N7QTR9R3>
Date: Fri, 6 Sep 2024 12:42:01 +0100
From: Mark Rutland <mark.rutland@....com>
To: "tiantao (H)" <tiantao6@...ilicon.com>
Cc: catalin.marinas@....com, will@...nel.org, jonathan.cameron@...wei.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linuxarm@...wei.com
Subject: Re: [PATCH] arm64: Add ARM64_HAS_LSE2 CPU capability
On Fri, Sep 06, 2024 at 06:58:19PM +0800, tiantao (H) wrote:
>
> > On 2024/9/6 17:44, Mark Rutland wrote:
> > On Fri, Sep 06, 2024 at 05:08:12PM +0800, Tian Tao wrote:
> > > When FEAT_LSE2 is implemented, bit 6 of SCTLR_ELx is nAA, whose
> > > full name is Not-aligned Access. The nAA bit has two values:
> > > 0b0 Unaligned accesses by the specified instructions generate an
> > > Alignment fault.
> > > 0b1 Unaligned accesses by the specified instructions do not generate
> > > an Alignment fault.
> > >
> > > This patch sets the nAA bit to 1. The following instructions will
> > > then not generate an Alignment fault even if all bytes being
> > > accessed are not within a single 16-byte quantity:
> > > • LDAPR, LDAPRH, LDAPUR, LDAPURH, LDAPURSH, LDAPURSW, LDAR, LDARH, LDLAR,
> > > LDLARH.
> > > • STLLR, STLLRH, STLR, STLRH, STLUR, and STLURH
> > >
> > > Signed-off-by: Tian Tao <tiantao6@...ilicon.com>
> > What is going to depend on this? Nothing in the kernel depends on being
> > able to make unaligned accesses with these instructions, and (since you
> > haven't added a HWCAP), userspace has no idea that these accesses won't
> > generate an alignment fault.
> >
> > Mark.
>
> I've come across a situation where the simplified code is as follows:
>
> long address = (long) mmap(NULL, 1024*1024*2, PROT_READ|PROT_WRITE,
>                            MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
>
> long new_address = address + 9;
>
> long *p = (long*) new_address;
> long v = -1;
>
> __atomic_store(p, &v, __ATOMIC_RELEASE);
Hold on; looking at the ARM ARM (ARM DDI 0487K.a), the example and the
patch are both bogus. NAK to this patch, explanation below.
Per section B2.2.1.1 "Changes to single-copy atomicity in Armv8.4", all of the
LSE2 relaxations on alignment require:
| all bytes being accessed are within a 16-byte quantity that is aligned
| to 16 bytes
In your example you perform an 8-byte access at an offset of 9 bytes,
which means the access is *not* contained within a single aligned
16-byte quantity, and is *not* guaranteed to be atomic. That code
simply has to be fixed; the kernel cannot magically make it safe.
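If what you want is an atomic 8-byte store, the pointer has to be kept
within a single aligned 16-byte quantity. An untested sketch of a fixed
version (the rounding below is just for illustration; the real fix
depends on your data layout):

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	long address = (long) mmap(NULL, 1024*1024*2,
				   PROT_READ|PROT_WRITE,
				   MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

	/* Round down to the natural alignment of the access. An
	 * 8-byte-aligned 8-byte access always falls within a single
	 * aligned 16-byte quantity, so with LSE2 it is single-copy
	 * atomic and cannot raise an alignment fault. */
	long new_address = (address + 9) & ~7L;

	long *p = (long *) new_address;
	long v = -1;

	__atomic_store(p, &v, __ATOMIC_RELEASE);

	printf("stored %ld at %p\n", *p, (void *)p);
	return 0;
}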
Regardless of that, SCTLR_ELx.nAA=0 *only* causes an alignment fault
for accesses which cannot be atomic. If a CPU has LSE2 and SCTLR_ELx.nAA=0,
an unaligned access within 16 bytes (which would be atomic) does not
cause an alignment fault. That's pretty clear from the description of
nAA and the AArch64.UnalignedAccessFaults() pseudocode:
| boolean AArch64.UnalignedAccessFaults(AccessDescriptor accdesc, bits(64) address, integer size)
|     if AlignmentEnforced() then
|         return TRUE;
|     elsif accdesc.acctype == AccessType_GCS then
|         return TRUE;
|     elsif accdesc.rcw then
|         return TRUE;
|     elsif accdesc.ls64 then
|         return TRUE;
|     elsif accdesc.exclusive || accdesc.atomicop then
|         return !IsFeatureImplemented(FEAT_LSE2) || !AllInAlignedQuantity(address, size, 16);
|     elsif accdesc.acqsc || accdesc.acqpc || accdesc.relsc then
|         return (!IsFeatureImplemented(FEAT_LSE2) ||
|                 (SCTLR_ELx[].nAA == '0' && !AllInAlignedQuantity(address, size, 16)));
|     else
|         return FALSE;
Note that when FEAT_LSE2 is implemented, unaligned atomic accesses
within an aligned 16-byte window never raise an alignment fault,
regardless of the value of the nAA bit.
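For reference, the AllInAlignedQuantity(address, size, 16) check above
boils down to the following (my paraphrase in C, not the ARM ARM
pseudocode):

#include <stdbool.h>
#include <stdint.h>

/* True iff all bytes in [address, address + size) fall within one
 * 16-byte quantity aligned to 16 bytes. */
static bool all_in_aligned_16(uint64_t address, uint64_t size)
{
	return (address & 15) + size <= 16;
}

In your example, mmap() returns a page-aligned pointer, so the store is
at offset 9 within an aligned 16-byte quantity, and 9 + 8 > 16; LSE2
gives you no atomicity guarantee there whether nAA is 0 or 1.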
Setting the nAA bit only hides where atomicity is lost; it does not
make code function correctly. Consequently, it only serves to mask
bugs, making them harder to detect, debug, and fix. Thus I see no
reason to set the nAA bit.
So as above, NAK to this patch. Please fix your broken userspace code.
Mark.