linux-kernel - Re: [PATCH v5 4/4] ARM: Add support for Hisilicon Kunpeng L3 cache controller

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <b236bcbc-8610-dfc4-50f2-a4b71162735d@huawei.com>
Date:   Sat, 30 Jan 2021 10:51:12 +0800
From:   "Leizhen (ThunderTown)" <thunder.leizhen@...wei.com>
To:     Russell King - ARM Linux admin <linux@...linux.org.uk>,
        Arnd Bergmann <arnd@...nel.org>
CC:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Will Deacon <will.deacon@....com>,
        Haojian Zhuang <haojian.zhuang@...il.com>,
        "Arnd Bergmann" <arnd@...db.de>, Rob Herring <robh+dt@...nel.org>,
        Wei Xu <xuwei5@...ilicon.com>,
        devicetree <devicetree@...r.kernel.org>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5 4/4] ARM: Add support for Hisilicon Kunpeng L3 cache
 controller



On 2021/1/29 18:33, Russell King - ARM Linux admin wrote:
> On Fri, Jan 29, 2021 at 11:26:38AM +0100, Arnd Bergmann wrote:
>> Another clarification, as there are actually two independent
>> points here:
>>
>> * if you can completely remove the readl() above and just write a
>>   hardcoded value into the register, or perhaps read the original
>>   value once at boot time, that is probably a win because it
>>   avoids one of the barriers in the beginning. The datasheet should
>>   tell you if there are any bits in the register that have to be
>>   preserved
>>
>> * Regarding the _relaxed() accessors, it's a lot harder to know
>>   whether that is safe, as you first have to show, in particular in case
>>   any of the accesses stop being guarded by the spinlock in that
>>   case, and whether there may be a case where you have to
>>   serialize the memory access against accesses that are still in the
>>   store queue or prefetched.
>>
>> Whether this matters at all depends mostly on the type of devices
>> you are driving on your SoC. If you have any high-speed network
>> interfaces that are unable to do cache coherent DMA, any extra
>> instruction here may impact the number of packets you can transfer,
>> but if all your high-speed devices are connected to a coherent
>> interconnect, I would just go with the obvious approach and use
>> the safe MMIO accessors everywhere.
> 
> For L2 cache code, I would say the opposite, actually, because it is
> all too easy to get into a deadlock otherwise.
> 
> If you implement the sync callback, that will be called from every
> non-relaxed accessor, which means if you need to take some kind of
> lock in the sync callback and elsewhere in the L2 cache code, you will
> definitely deadlock.
> 
> It is safer to put explicit barriers where it is necessary.
> 
> Also remember that the barrier in readl() etc is _after_ the read, not
> before, and the barrier in writel() is _before_ the write, not after.
> The point is to ensure that DMA memory accesses are properly ordered
> with the IO-accessing instructions.

Yes, I known it. writel() must be used for the write operations that control
"start/stop" or "enable/disable" function, to ensure that the data of previous
write operations reaches the target. I've met this kind of problem before.

> 
> So, using readl_relaxed() with a read-modify-write is entirely sensible
> provided you do not access DMA memory inbetween.

Actually, I don't think this register is that complicated. I copied the code
back below. All the bits of L3_MAINT_CTRL are not affected by DMA access operations.
The software change the "range | op_type" to specify the operation type and scope,
the set the bit "L3_MAINT_STATUS_START" to start the operation. Then wait for that
bit to change from 1 to 0 by hardware.

+	reg = readl(l3_ctrl_base + L3_MAINT_CTRL);
+	reg &= ~(L3_MAINT_RANGE_MASK | L3_MAINT_TYPE_MASK);
+	reg |= range | op_type;
+	reg |= L3_MAINT_STATUS_START;
+	writel(reg, l3_ctrl_base + L3_MAINT_CTRL);
+
+	/* Wait until the hardware maintenance operation is complete. */
+	do {
+		cpu_relax();
+		reg = readl(l3_ctrl_base + L3_MAINT_CTRL);
+	} while ((reg & L3_MAINT_STATUS_MASK) != L3_MAINT_STATUS_END);

>