Message-ID: <OSBPR01MB203749DA00C7BEE5741AFEB980AA9@OSBPR01MB2037.jpnprd01.prod.outlook.com>
Date: Tue, 14 Jun 2022 11:55:39 +0000
From: "tarumizu.kohei@...itsu.com" <tarumizu.kohei@...itsu.com>
To: 'Greg KH' <gregkh@...uxfoundation.org>
CC: "catalin.marinas@....com" <catalin.marinas@....com>,
"will@...nel.org" <will@...nel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"bp@...en8.de" <bp@...en8.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"x86@...nel.org" <x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>,
"rafael@...nel.org" <rafael@...nel.org>,
"lenb@...nel.org" <lenb@...nel.org>,
"mchehab+huawei@...nel.org" <mchehab+huawei@...nel.org>,
"eugenis@...gle.com" <eugenis@...gle.com>,
"tony.luck@...el.com" <tony.luck@...el.com>,
"pcc@...gle.com" <pcc@...gle.com>,
"peterz@...radead.org" <peterz@...radead.org>,
"marcos@...a.pet" <marcos@...a.pet>,
"marcan@...can.st" <marcan@...can.st>,
"linus.walleij@...aro.org" <linus.walleij@...aro.org>,
"nicolas.ferre@...rochip.com" <nicolas.ferre@...rochip.com>,
"conor.dooley@...rochip.com" <conor.dooley@...rochip.com>,
"arnd@...db.de" <arnd@...db.de>, "ast@...nel.org" <ast@...nel.org>,
"peter.chen@...nel.org" <peter.chen@...nel.org>,
"kuba@...nel.org" <kuba@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>
Subject: RE: [PATCH v5 0/6] Add hardware prefetch control driver for A64FX and
x86
Thanks for the comment.
> Why does userspace want to even do this?
Because the optimal settings may differ from application to
application.
Examples of performance improvements for applications with simple
memory access characteristics are described in the [merit] section.
However, some applications have complex access characteristics, so it
is difficult to predict whether an application will improve without
actually trying it.
Not every application needs this. However, I want to provide a minimal
interface for those who want to improve their application's
performance, even by a small amount.
> How will they do this?
I expect the interface to be used to tune a specific core and then run
an application on that core. An example follows.
1) The user tunes the parameters of a specific core before executing
the program.
```
# echo 1024 > /sys/devices/system/cpu/cpu12/cache/index0/prefetch_control/stream_detect_prefetcher_dist
# echo 1024 > /sys/devices/system/cpu/cpu12/cache/index2/prefetch_control/stream_detect_prefetcher_dist
# echo 1024 > /sys/devices/system/cpu/cpu13/cache/index0/prefetch_control/stream_detect_prefetcher_dist
# echo 1024 > /sys/devices/system/cpu/cpu13/cache/index2/prefetch_control/stream_detect_prefetcher_dist
```
2) Execute the program bound to the target core.
```
# taskset -c 12-13 a.out
```
If the interface is exposed, a user can also develop a library that
performs steps 1) and 2) on the application's behalf.
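As a rough illustration of such a wrapper, steps 1) and 2) could be
combined in a small shell helper like the one below. This is only a
sketch under assumptions: the function names set_prefetch_dist and
run_tuned and the SYSFS_CPU_ROOT override are hypothetical and not part
of the patch series, and the index0/index2 levels are simply the ones
used in the example above.
```shell
#!/bin/sh
# Hypothetical helper, not part of the patch series.
# SYSFS_CPU_ROOT can be overridden (an assumption made here for testing).
SYSFS_CPU_ROOT="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"

# set_prefetch_dist DIST CPU...
# Step 1): write DIST to stream_detect_prefetcher_dist for index0 and
# index2 of every listed CPU. Returns 1 if an attribute is not writable.
set_prefetch_dist() {
    dist=$1
    shift
    for cpu in "$@"; do
        for index in index0 index2; do
            attr="$SYSFS_CPU_ROOT/cpu$cpu/cache/$index/prefetch_control/stream_detect_prefetcher_dist"
            if [ -w "$attr" ]; then
                echo "$dist" > "$attr"
            else
                echo "cannot write $attr" >&2
                return 1
            fi
        done
    done
}

# run_tuned DIST CPULIST COMMAND [ARGS...]
# CPULIST is a comma-separated list such as 12,13.
# Step 2): run COMMAND bound to the tuned CPUs with taskset.
run_tuned() {
    tuned_dist=$1
    tuned_cpus=$2
    shift 2
    # Word splitting is intended: turn "12,13" into separate arguments.
    set_prefetch_dist "$tuned_dist" $(echo "$tuned_cpus" | tr ',' ' ') || return 1
    taskset -c "$tuned_cpus" "$@"
}
```
With these definitions, "run_tuned 1024 12,13 ./a.out" would perform the
same writes as step 1) and then start the program as in step 2).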
> What programs will do this?
I expect it to be used by programs that perform many contiguous
memory accesses. It may also be useful for other applications, but I
cannot describe those cases in detail right now.
> And why isn't just automatic and why does this hardware require manual
> intervention to work properly?
It is difficult for the hardware to determine the optimal parameters
in advance. I think that is why the register is provided: so that
software can change the behavior of the hardware.