Message-ID: <Z9SX7thyConoDjLT@antec>
Date: Fri, 14 Mar 2025 20:56:14 +0000
From: Stafford Horne <shorne@...il.com>
To: Sahil Siddiq <icegambit91@...il.com>
Cc: jonas@...thpole.se, stefan.kristiansson@...nalahti.fi,
sahilcdq@...ton.me, linux-openrisc@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC] openrisc: Add cacheinfo support
On Sat, Mar 15, 2025 at 01:34:03AM +0530, Sahil Siddiq wrote:
> Hi,
>
> On 3/14/25 2:54 AM, Stafford Horne wrote:
> > On Tue, Mar 11, 2025 at 12:43:57AM +0530, Sahil Siddiq wrote:
> > > Add cacheinfo support for OpenRISC.
> > >
> > > [...]
> > > None of the functions in drivers/base/cacheinfo.c that are capable of
> > > pulling these details (e.g.: cache_size) have been used. This is because
> > > they pull these details by reading properties present in the device tree
> > > file. In setup.c, for example, the value of "clock-frequency" is pulled
> > > from the device tree file.
> > >
> > > Cache related properties are currently not present in OpenRISC's device
> > > tree files.
> >
> > If we want to add L2 caches and define them in the device tree would
> > it "just work" or is more work needed?
> >
>
> A little more work will have to be done. The implementation of "init_cache_level"
> and "populate_cache_leaves" will have to be extended. To pull L2 cache attributes,
> they'll need to call the "of_get_property" family of functions, similar to
> what's done for RISC-V and PowerPC.
>
> Shall I resubmit this patch with those extensions? I think I'll be able to test
> those changes with a modified device tree file that has an L2 cache component.
Since we don't have any such hardware now, I don't think it's needed.
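
That said, for the archives, here is roughly the shape I would expect that
extension to take. This is an untested sketch: the helper name is made up,
and I am assuming we would describe the L2 with the standard device tree
"cache-size"/"cache-sets"/"cache-block-size" bindings, as RISC-V and
PowerPC do:

#include <linux/cacheinfo.h>
#include <linux/of.h>

/* Sketch only: populate one L2 leaf from a hypothetical DT cache node. */
static void or1k_fill_l2_from_dt(struct device_node *np,
				 struct cacheinfo *this_leaf)
{
	u32 val;

	this_leaf->type = CACHE_TYPE_UNIFIED;
	this_leaf->level = 2;

	/* Each property is optional; leave the field zero if absent. */
	if (!of_property_read_u32(np, "cache-size", &val))
		this_leaf->size = val;
	if (!of_property_read_u32(np, "cache-sets", &val))
		this_leaf->number_of_sets = val;
	if (!of_property_read_u32(np, "cache-block-size", &val))
		this_leaf->coherency_line_size = val;
}

The node itself could be found with of_find_next_cache_node() starting from
the CPU's node.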
> > > Regarding the "shared_cpu_map" cache attribute, I wasn't able to find
> > > anything in the OpenRISC architecture manual to indicate that processors
> > > in a multi-processor system may share the same cache component. MIPS uses
> > > "globalnumber" to detect siblings. LoongArch uses a "CACHE_PRIVATE" flag
> > > to detect siblings sharing the same cache.
> >
> > In an SMP environment the L1 caches are not shared; they are specific to each CPU.
> >
> > Also, we do not have hyperthreading in OpenRISC, so shared_cpu_map should be a
> > 1-to-1 mapping with the CPU. Do you need to do extra work to set up that
> > mapping?
> >
>
> No extra work has to be done to set up the 1-to-1 mapping. This is already being
> done in "ci_leaf_init()".
OK.
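
For anyone reading along, that mapping comes down to a single
cpumask_set_cpu() call when the leaf is initialized, along these lines
(paraphrased sketch, not the exact patch code):

static void ci_leaf_init(struct cacheinfo *this_leaf, enum cache_type type,
			 unsigned int level, struct cache_desc *cache,
			 int cpu)
{
	this_leaf->type = type;
	this_leaf->level = level;
	this_leaf->coherency_line_size = cache->block_size;
	this_leaf->number_of_sets = cache->sets;
	this_leaf->ways_of_associativity = cache->ways;
	this_leaf->size = cache->size;
	/* L1 caches are per-CPU, so each map holds exactly one CPU. */
	cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
}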
> > > I am running with the assumption that every OpenRISC core has its own
> > > icache and dcache. Given that OpenRISC does not support a multi-level
> > > cache architecture and that icache and dcache are like L1 caches, I
> > > think this assumption is reasonable. What are your thoughts on this?
> >
> > Currently this is the case, but it could be possible to create an SoC with L2
> > caches. I could imagine these would be outside of the CPU and we could define
> > them with the device tree.
>
> In this case, some extra work will have to be done to set the "shared_cpu_map"
> appropriately. But I think the modifications will be quite small. If the L2 cache
> is external to all CPUs, then all online CPUs will have their corresponding bit
> set in the "shared_cpu_map".
Yes, it could be so. For now, let's not do this, as no such hardware exists.
> > > Another issue I noticed is that the unit used in ...cache/indexN/size
> > > is KB. The actual value of the size is right-shifted by 10 before being
> > > reported. When testing these changes using QEMU (and without making any
> > > modifications to the values stored in DCCFGR and ICCFGR), the cache size
> > > is far smaller than 1KB. Consequently, this is reported as 0K. For cache
> > > sizes smaller than 1KB, should something be done to report it in another
> > > unit? Reporting 0K seems a little misleading.
> >
> > I think this is fine, as long as we pass in the correct size in bytes.
>
> Understood.
OK.
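
(To make the rounding concrete: a 1-way cache with 16 sets of 16-byte
blocks holds 16 * 16 * 1 = 256 bytes, and 256 >> 10 == 0, so the sysfs
"size" file shows 0K. The cacheinfo structure itself still carries the
exact byte count.)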
> > > [...]
> > > +
> > > +int init_cache_level(unsigned int cpu)
> > > +{
> > > + struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
> > > + struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> > > + int leaves = 0, levels = 0;
> > > + unsigned long upr = mfspr(SPR_UPR);
> > > + unsigned long iccfgr, dccfgr;
> > > +
> > > + if (!(upr & SPR_UPR_UP)) {
> > > + printk(KERN_INFO
> > > + "-- no UPR register... unable to detect configuration\n");
> > > + return -ENOENT;
> > > + }
> > > +
> > > + if (upr & SPR_UPR_DCP) {
> > > + dccfgr = mfspr(SPR_DCCFGR);
> > > + cpuinfo->dcache.ways = 1 << (dccfgr & SPR_DCCFGR_NCW);
> > > + cpuinfo->dcache.sets = 1 << ((dccfgr & SPR_DCCFGR_NCS) >> 3);
> > > + cpuinfo->dcache.block_size = 16 << ((dccfgr & SPR_DCCFGR_CBS) >> 7);
> > > + cpuinfo->dcache.size =
> > > + cpuinfo->dcache.sets * cpuinfo->dcache.ways * cpuinfo->dcache.block_size;
> > > + leaves += 1;
> > > + printk(KERN_INFO
> > > + "-- dcache: %4d bytes total, %2d bytes/line, %d way(s)\n",
> > > + cpuinfo->dcache.size, cpuinfo->dcache.block_size,
> > > + cpuinfo->dcache.ways);
> >
> > Can we print the number of sets here too? Also, is there a reason to pad these
> > ints with 4 and 2 spaces? I am not sure the padding is needed.
> >
> > > + } else
> > > + printk(KERN_INFO "-- dcache disabled\n");
> > > +
> > > + if (upr & SPR_UPR_ICP) {
> > > + iccfgr = mfspr(SPR_ICCFGR);
> > > + cpuinfo->icache.ways = 1 << (iccfgr & SPR_ICCFGR_NCW);
> > > + cpuinfo->icache.sets = 1 << ((iccfgr & SPR_ICCFGR_NCS) >> 3);
> > > + cpuinfo->icache.block_size = 16 << ((iccfgr & SPR_ICCFGR_CBS) >> 7);
> > > + cpuinfo->icache.size =
> > > + cpuinfo->icache.sets * cpuinfo->icache.ways * cpuinfo->icache.block_size;
> > > + leaves += 1;
> > > + printk(KERN_INFO
> > > + "-- icache: %4d bytes total, %2d bytes/line, %d way(s)\n",
> > > + cpuinfo->icache.size, cpuinfo->icache.block_size,
> > > + cpuinfo->icache.ways);
> >
> > Same here.
>
>
> Sure, I'll print the number of sets as well.
>
> I don't think there's any reason for the padding. It was part of the original
> implementation in setup.c. There shouldn't be any issue in removing it.
Right, it would be good to fix.
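
As a sanity check on the decoding above, take a hypothetical DCCFGR value
of 0x41:

	NCW = 0x41 & 0x7         = 1 -> ways = 1 << 1 = 2
	NCS = (0x41 & 0x78) >> 3 = 8 -> sets = 1 << 8 = 256
	CBS = (0x41 & 0x80) >> 7 = 0 -> block_size = 16 << 0 = 16

	size = sets * ways * block_size = 256 * 2 * 16 = 8192 bytes

which the message above would print as "8192 bytes total, 16 bytes/line,
2 way(s)" (plus the sets once they are added).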
> > > [...]
> > > seq_printf(m, "frequency\t\t: %ld\n", loops_per_jiffy * HZ);
> > > - seq_printf(m, "dcache size\t\t: %d bytes\n", cpuinfo->dcache_size);
> > > - seq_printf(m, "dcache block size\t: %d bytes\n",
> > > - cpuinfo->dcache_block_size);
> > > - seq_printf(m, "dcache ways\t\t: %d\n", cpuinfo->dcache_ways);
> > > - seq_printf(m, "icache size\t\t: %d bytes\n", cpuinfo->icache_size);
> > > - seq_printf(m, "icache block size\t: %d bytes\n",
> > > - cpuinfo->icache_block_size);
> > > - seq_printf(m, "icache ways\t\t: %d\n", cpuinfo->icache_ways);
> > > seq_printf(m, "immu\t\t\t: %d entries, %lu ways\n",
> > > 1 << ((mfspr(SPR_DMMUCFGR) & SPR_DMMUCFGR_NTS) >> 2),
> > > 1 + (mfspr(SPR_DMMUCFGR) & SPR_DMMUCFGR_NTW));
> > > --
> > > 2.48.1
> > >
> >
> > This pretty much looks ok to me.
> >
>
> Thank you for the review.
Thank you.
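
For anyone landing on this thread later: once this is merged, the cache
geometry moves from /proc/cpuinfo to the generic sysfs interface, i.e.
/sys/devices/system/cpu/cpu*/cache/index*/ with the usual type, level,
size, number_of_sets, ways_of_associativity, coherency_line_size and
shared_cpu_map files.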
-Stafford