Message-ID: <ee43f507-c0a2-45ed-818e-f24babf07d60@gmail.com>
Date: Sat, 15 Mar 2025 01:34:03 +0530
From: Sahil Siddiq <icegambit91@...il.com>
To: Stafford Horne <shorne@...il.com>
Cc: jonas@...thpole.se, stefan.kristiansson@...nalahti.fi,
sahilcdq@...ton.me, linux-openrisc@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC] openrisc: Add cacheinfo support
Hi,
On 3/14/25 2:54 AM, Stafford Horne wrote:
> On Tue, Mar 11, 2025 at 12:43:57AM +0530, Sahil Siddiq wrote:
>> Add cacheinfo support for OpenRISC.
>>
>> [...]
>> None of the functions in drivers/base/cacheinfo.c that are capable of
>> pulling these details (e.g.: cache_size) have been used. This is because
>> they pull these details by reading properties present in the device tree
>> file. In setup.c, for example, the value of "clock-frequency" is pulled
>> from the device tree file.
>>
>> Cache related properties are currently not present in OpenRISC's device
>> tree files.
>
> If we want to add L2 caches and define them in the device tree would
> it "just work" or is more work needed?
>
A little more work will have to be done. The implementation of "init_cache_level"
and "populate_cache_leaves" will have to be extended. To pull L2 cache attributes,
they'll need to make calls to the "of_get_property" family of functions similar to
what's being done for RISC-V and PowerPC.
Shall I resubmit this patch with those extensions? I think I'll be able to test
those changes with a modified device tree file that has an L2 cache component.
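For reference, a rough sketch of what the extension might look like, modeled on the
standard device tree cache properties that RISC-V and PowerPC read. The function name
and its integration point in the OpenRISC code are assumptions, not part of the patch:

```c
/* Sketch (kernel context assumed): fill an L2 leaf from a device tree
 * node using the standard DT cache properties. */
static void or1k_init_l2_leaf(struct device_node *np,
			      struct cacheinfo *this_leaf)
{
	u32 size, sets, line;

	if (!of_property_read_u32(np, "cache-size", &size))
		this_leaf->size = size;
	if (!of_property_read_u32(np, "cache-sets", &sets))
		this_leaf->number_of_sets = sets;
	if (!of_property_read_u32(np, "cache-block-size", &line))
		this_leaf->coherency_line_size = line;
	this_leaf->level = 2;
	this_leaf->type = CACHE_TYPE_UNIFIED;
}
```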
>> Regarding the "shared_cpu_map" cache attribute, I wasn't able to find
>> anything in the OpenRISC architecture manual to indicate that processors
>> in a multi-processor system may share the same cache component. MIPS uses
>> "globalnumber" to detect siblings. LoongArch uses a "CACHE_PRIVATE" flag
>> to detect siblings sharing the same cache.
>
> In SMP environment the L1 caches are not shared they are specific to each CPU.
>
> Also, we do not have hyperthreading in OpenRISC so shared_cpu_map should be a
> 1-to-1 mapping with the cpu. Do you need to do extra work to setup that
> mapping?
>
No extra work has to be done to set up the 1-to-1 mapping. This is already being
done in "ci_leaf_init()".
>> I am running with the assumption that every OpenRISC core has its own
>> icache and dcache. Given that OpenRISC does not support a multi-level
>> cache architecture and that icache and dcache are like L1 caches, I
>> think this assumption is reasonable. What are your thoughts on this?
>
> Currently this is the case, but it could be possible to create an SoC with L2
> caches. I could imagine these would be outside of the CPU and we could define
> them with the device tree.
In this case, some extra work will have to be done to set the "shared_cpu_map"
appropriately. But I think the modifications will be quite small. If the L2 cache
is external to all CPUs, then all online CPUs will have their corresponding bit
set in the "shared_cpu_map".
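Something along these lines, assuming a single L2 shared by every CPU (a sketch,
kernel context assumed):

```c
/* For an L2 external to all CPUs, mark every online CPU as a sharer. */
for_each_online_cpu(cpu)
	cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
```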
>> Another issue I noticed is that the unit used in ...cache/indexN/size
>> is KB. The actual value of the size is right-shifted by 10 before being
>> reported. When testing these changes using QEMU (and without making any
>> modifications to the values stored in DCCFGR and ICCFGR), the cache size
>> is far smaller than 1KB. Consequently, this is reported as 0K. For cache
>> sizes smaller than 1KB, should something be done to report it in another
>> unit? Reporting 0K seems a little misleading.
>
> I think this is fine, as long as we pass in the correct size in bytes.
Understood.
>> [...]
>> +
>> +int init_cache_level(unsigned int cpu)
>> +{
>> + struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
>> + struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>> + int leaves = 0, levels = 0;
>> + unsigned long upr = mfspr(SPR_UPR);
>> + unsigned long iccfgr, dccfgr;
>> +
>> + if (!(upr & SPR_UPR_UP)) {
>> + printk(KERN_INFO
>> + "-- no UPR register... unable to detect configuration\n");
>> + return -ENOENT;
>> + }
>> +
>> + if (upr & SPR_UPR_DCP) {
>> + dccfgr = mfspr(SPR_DCCFGR);
>> + cpuinfo->dcache.ways = 1 << (dccfgr & SPR_DCCFGR_NCW);
>> + cpuinfo->dcache.sets = 1 << ((dccfgr & SPR_DCCFGR_NCS) >> 3);
>> + cpuinfo->dcache.block_size = 16 << ((dccfgr & SPR_DCCFGR_CBS) >> 7);
>> + cpuinfo->dcache.size =
>> + cpuinfo->dcache.sets * cpuinfo->dcache.ways * cpuinfo->dcache.block_size;
>> + leaves += 1;
>> + printk(KERN_INFO
>> + "-- dcache: %4d bytes total, %2d bytes/line, %d way(s)\n",
>> + cpuinfo->dcache.size, cpuinfo->dcache.block_size,
>> + cpuinfo->dcache.ways);
>
> Can we print the number of sets here too? Also is there a reason to pad these
> int's with 4 and 2 spaces? I am not sure the padding is needed.
>
>> + } else
>> + printk(KERN_INFO "-- dcache disabled\n");
>> +
>> + if (upr & SPR_UPR_ICP) {
>> + iccfgr = mfspr(SPR_ICCFGR);
>> + cpuinfo->icache.ways = 1 << (iccfgr & SPR_ICCFGR_NCW);
>> + cpuinfo->icache.sets = 1 << ((iccfgr & SPR_ICCFGR_NCS) >> 3);
>> + cpuinfo->icache.block_size = 16 << ((iccfgr & SPR_ICCFGR_CBS) >> 7);
>> + cpuinfo->icache.size =
>> + cpuinfo->icache.sets * cpuinfo->icache.ways * cpuinfo->icache.block_size;
>> + leaves += 1;
>> + printk(KERN_INFO
>> + "-- icache: %4d bytes total, %2d bytes/line, %d way(s)\n",
>> + cpuinfo->icache.size, cpuinfo->icache.block_size,
>> + cpuinfo->icache.ways);
>
> Same here.
Sure, I'll print the number of sets as well.
I don't think there's any reason for the padding. It was part of the original
implementation in setup.c. There shouldn't be any issues in removing it.
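For the dcache case, the revised message might look something like this (a sketch of
the change discussed above, with the set count added and the field-width padding dropped):

```c
printk(KERN_INFO
       "-- dcache: %d bytes total, %d bytes/line, %d set(s), %d way(s)\n",
       cpuinfo->dcache.size, cpuinfo->dcache.block_size,
       cpuinfo->dcache.sets, cpuinfo->dcache.ways);
```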
>> [...]
>> seq_printf(m, "frequency\t\t: %ld\n", loops_per_jiffy * HZ);
>> - seq_printf(m, "dcache size\t\t: %d bytes\n", cpuinfo->dcache_size);
>> - seq_printf(m, "dcache block size\t: %d bytes\n",
>> - cpuinfo->dcache_block_size);
>> - seq_printf(m, "dcache ways\t\t: %d\n", cpuinfo->dcache_ways);
>> - seq_printf(m, "icache size\t\t: %d bytes\n", cpuinfo->icache_size);
>> - seq_printf(m, "icache block size\t: %d bytes\n",
>> - cpuinfo->icache_block_size);
>> - seq_printf(m, "icache ways\t\t: %d\n", cpuinfo->icache_ways);
>> seq_printf(m, "immu\t\t\t: %d entries, %lu ways\n",
>> 1 << ((mfspr(SPR_DMMUCFGR) & SPR_DMMUCFGR_NTS) >> 2),
>> 1 + (mfspr(SPR_DMMUCFGR) & SPR_DMMUCFGR_NTW));
>> --
>> 2.48.1
>>
>
> This pretty much looks ok to me.
>
Thank you for the review.
Thanks,
Sahil