[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <54980e73-4a1c-1eb2-98b4-fbb49e9a9b8f@hisilicon.com>
Date: Fri, 6 Sep 2024 20:20:19 +0800
From: "tiantao (H)" <tiantao6@...ilicon.com>
To: Mark Rutland <mark.rutland@....com>
CC: <catalin.marinas@....com>, <will@...nel.org>,
<jonathan.cameron@...wei.com>, <linux-arm-kernel@...ts.infradead.org>,
<linux-kernel@...r.kernel.org>, <linuxarm@...wei.com>
Subject: Re: [PATCH] arm64: Add ARM64_HAS_LSE2 CPU capability
在 2024/9/6 19:42, Mark Rutland 写道:
> On Fri, Sep 06, 2024 at 06:58:19PM +0800, tiantao (H) wrote:
>> 在 2024/9/6 17:44, Mark Rutland 写道:
>>> On Fri, Sep 06, 2024 at 05:08:12PM +0800, Tian Tao wrote:
>>>> When FEAT_LSE2 is implemented and Bit 6 of sctlr_elx is nAA, the
>>>> full name of the Not-aligned access. nAA bit has two values:
>>>> 0b0 Unaligned accesses by the specified instructions generate an
>>>> Alignment fault.
>>>> 0b1 Unaligned accesses by the specified instructions do not generate
>>>> an Alignment fault.
>>>>
>>>> this patch sets the nAA bit to 1,The following instructions will not
>>>> generate an Alignment fault if all bytes being accessed are not within
>>>> a single 16-byte quantity:
>>>> • LDAPR, LDAPRH, LDAPUR, LDAPURH, LDAPURSH, LDAPURSW, LDAR, LDARH,LDLAR,
>>>> LDLARH.
>>>> • STLLR, STLLRH, STLR, STLRH, STLUR, and STLURH
>>>>
>>>> Signed-off-by: Tian Tao <tiantao6@...ilicon.com>
>>> What is going to depend on this? Nothing in the kernel depends on being
>>> able to make unaligned accesses with these instructions, and (since you
>>> haven't added a HWCAP), userspace has no idea that these accesses won't
>>> generate an alignment fault.
>>>
>>> Mark.
>> I've come across a situation where the simplified code is as follows:
>>
>> long address = (long) mmap(NULL,1024*1024*2,PROT_READ|PROT_WRITE,
>> MAP_PRIVATE|MAP_ANONYMOUS,-1,0);
>>
>> long new_address = address + 9;
>>
>> long *p = (long*) new_address;
>> long v = -1;
>>
>> __atomic_store(p, &v, __ATOMIC_RELEASE);
> Hold on; looking at the ARM ARM (ARM DDI 0487K.a), the example and the
> patch are both bogus. NAK to this patch, explanation below.
>
> Per section B2.2.1.1 "Changes to single-copy atomicity in Armv8.4", all of the
> LSE2 relaxations on alignment require:
>
> | all bytes being accessed are within a 16-byte quantity that is aligned
> | to 16 bytes
>
> In your example you perform an 8-byte access at an offset of 9 bytes,
> which means the access is *not* contained within 16 bytes, and is *not*
> guaranteed to be atomic. That code simply has to be fixed, the kernel
> cannot magically make it safe.
>
> Regardless of that, the nAA bit *only* causes an alignment fault for
> accesses which cannot be atomic. If a CPU has LSE2 and SCTLR_ELx.nAA=0,
> an unaligned access within 16 bytes (which would be atomic) does not
> cause an alignment fault. That's pretty clear from the description of
> nAA and the AArch64.UnalignedAccessFaults() pseudocode:
Sorry, this example is just for verifying nnA, it's not an example of a
real scenario,
we have scenarios that don't require atomicity to be guaranteed, they
just require that coredump doesn't occur when non-aligned
> | boolean AArch64.UnalignedAccessFaults(AccessDescriptor accdesc, bits(64) address, integer size)
> | if AlignmentEnforced() then
> | return TRUE;
> | elsif accdesc.acctype == AccessType_GCS then
> | return TRUE;
> | elsif accdesc.rcw then
> | return TRUE;
> | elsif accdesc.ls64 then
> | return TRUE;
> | elsif accdesc.exclusive || accdesc.atomicop then
> | return !IsFeatureImplemented(FEAT_LSE2) || !AllInAlignedQuantity(address, size, 16);
> | elsif accdesc.acqsc || accdesc.acqpc || accdesc.relsc then
> | return (!IsFeatureImplemented(FEAT_LSE2) ||
> | (SCTLR_ELx[].nAA == '0' && !AllInAlignedQuantity(address, size, 16)));
> | else
> | return FALSE;
>
> Note that when FEAT_LSE2 is implemented, unaligned atomic accesss within
> a 16-byte window never raise an alignment fault, regardless of the value
> of the nAA bit.
> Setting the nAA bit only hides where atomicity is lost, and does not
> make code function correctly. Consequently, that only serves to hide
> bugs, making those harder to detect, debug, and fix. Thus I see no
> reason to set the nAA bit.
>
> So as above, NAK to this path. Please fir your broken userspace code.
>
> Mark.
>
> .
Powered by blists - more mailing lists