Message-ID: <d93f4256-4554-e031-9730-4ca2a7de6aaf@linaro.org>
Date: Wed, 12 Apr 2023 01:36:52 +0300
From: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
To: Rob Clark <robdclark@...il.com>
Cc: dri-devel@...ts.freedesktop.org,
Rob Clark <robdclark@...omium.org>,
Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
linux-arm-msm@...r.kernel.org,
Emil Velikov <emil.l.velikov@...il.com>,
Christopher Healy <healych@...zon.com>,
open list <linux-kernel@...r.kernel.org>,
Sean Paul <sean@...rly.run>,
Boris Brezillon <boris.brezillon@...labora.com>,
freedreno@...ts.freedesktop.org
Subject: Re: [Freedreno] [PATCH v2 0/2] drm: fdinfo memory stats

On 11/04/2023 21:28, Rob Clark wrote:
> On Tue, Apr 11, 2023 at 10:36 AM Dmitry Baryshkov
> <dmitry.baryshkov@...aro.org> wrote:
>>
>> On Tue, 11 Apr 2023 at 20:13, Rob Clark <robdclark@...il.com> wrote:
>>>
>>> On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter <daniel@...ll.ch> wrote:
>>>>
>>>> On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
>>>>> On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@...il.com> wrote:
>>>>>>
>>>>>> From: Rob Clark <robdclark@...omium.org>
>>>>>>
>>>>>> Similar motivation to other similar recent attempt[1]. But with an
>>>>>> attempt to have some shared code for this. As well as documentation.
>>>>>>
>>>>>> It is probably a bit UMA-centric, I guess devices with VRAM might want
>>>>>> some placement stats as well. But this seems like a reasonable start.
>>>>>>
>>>>>> Basic gputop support: https://patchwork.freedesktop.org/series/116236/
>>>>>> And already nvtop support: https://github.com/Syllo/nvtop/pull/204
>>>>>
>>>>> On a related topic, I'm wondering if it would make sense to report
>>>>> some more global things (temp, freq, etc) via fdinfo? Some of this,
>>>>> tools like nvtop could get by trawling sysfs or other driver specific
>>>>> ways. But maybe it makes sense to have these sort of things reported
>>>>> in a standardized way (even though they aren't really per-drm_file)
>>>>
>>>> I think that's a bit much layering violation, we'd essentially have to
>>>> reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
>>>> be in :-)
>>>
>>> I guess this is true for temp (where there are thermal zones with
>>> potentially multiple temp sensors.. but I'm still digging my way thru
>>> the thermal_cooling_device stuff)
>>
>> It is slightly ugly. All thermal zones and cooling devices are virtual
>> devices (so, even no connection to the particular tsens device). One
>> can either enumerate them by checking
>> /sys/class/thermal/thermal_zoneN/type or enumerate them through
>> /sys/class/hwmon. For cooling devices again the only enumeration is
>> through /sys/class/thermal/cooling_deviceN/type.
>>
>> Probably it should be possible to push cooling devices and thermal
>> zones under corresponding providers. However I do not know if there is
>> a good way to correlate cooling device (ideally a part of GPU) to the
>> thermal_zone (which in our case is provided by tsens / temp_alarm
>> rather than GPU itself).
>>
>>>
>>> But what about freq? I think, esp for cases where some "fw thing" is
>>> controlling the freq we end up needing to use gpu counters to measure
>>> the freq.
>>
>> For the freq it is slightly easier: /sys/class/devfreq/*, devices are
>> registered under proper parent (IOW, GPU). So one can read
>> /sys/class/devfreq/3d00000.gpu/cur_freq or
>> /sys/bus/platform/devices/3d00000.gpu/devfreq/3d00000.gpu/cur_freq.
>>
>> However because of the components usage, there is no link from
>> /sys/class/drm/card0
>> (/sys/devices/platform/soc@...e00000.display-subsystem/ae01000.display-controller/drm/card0)
>> to /sys/devices/platform/soc@...d00000.gpu, the GPU unit.
>>
>> Getting all these items together in a platform-independent way would
>> be definitely an important but complex topic.
>
> But I don't believe any of the pci gpu's use devfreq ;-)
>
> And also, you can't expect the CPU to actually know the freq when fw
> is the one controlling freq. We can, currently, have a reasonable
> approximation from devfreq but that stops if IFPC is implemented. And
> other GPUs have even less direct control. So freq is a thing that I
> don't think we should try to get from "common frameworks"

I think it might be useful to add another passive devfreq governor type
for externally controlled frequencies. That way we could use the same
interface to export frequencies that are not controlled by the CPU.
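
To illustrate what "the same interface" means for userspace, here is a
minimal sketch of a reader for the devfreq sysfs nodes mentioned above
(/sys/class/devfreq/*/cur_freq). The function name and the class_dir
parameter are hypothetical, chosen for illustration only. It assumes
cur_freq holds a single integer, typically in Hz. It does not show the
proposed governor, which does not exist yet.

```python
from pathlib import Path


def read_devfreq_frequencies(class_dir="/sys/class/devfreq"):
    """Return {device_name: cur_freq} for every readable devfreq device.

    class_dir is a parameter only so the function can be pointed at a
    fake directory tree; on a real system the default is the sysfs path.
    """
    freqs = {}
    root = Path(class_dir)
    if not root.is_dir():
        # No devfreq class on this system (e.g. most PCI GPU setups).
        return freqs
    for dev in sorted(root.iterdir()):
        try:
            # Assumption: cur_freq contains one integer value.
            freqs[dev.name] = int((dev / "cur_freq").read_text().strip())
        except (OSError, ValueError):
            # Skip devices whose cur_freq is missing or unparsable.
            continue
    return freqs
```

A tool such as nvtop could call read_devfreq_frequencies() and look up
the entry for the GPU (3d00000.gpu in the example above), provided it
already knows which devfreq device belongs to which DRM card. That
mapping is the part that is still missing.
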
>
> BR,
> -R
>
>>>
>>>> What might be needed is better glue to go from the fd or fdinfo to the
>>>> right hw device and then crawl around the hwmon in sysfs automatically. I
>>>> would not be surprised at all if we really suck on this, probably more
>>>> likely on SoC than pci gpus where at least everything should be under the
>>>> main pci sysfs device.
>>>
>>> yeah, I *think* userspace would have to look at /proc/device-tree to
>>> find the cooling device(s) associated with the gpu.. at least I don't
>>> see a straightforward way to figure it out just for sysfs
>>>
>>> BR,
>>> -R
>>>
>>>> -Daniel
>>>>
>>>>>
>>>>> BR,
>>>>> -R
>>>>>
>>>>>
>>>>>> [1] https://patchwork.freedesktop.org/series/112397/
>>>>>>
>>>>>> Rob Clark (2):
>>>>>> drm: Add fdinfo memory stats
>>>>>> drm/msm: Add memory stats to fdinfo
>>>>>>
>>>>>> Documentation/gpu/drm-usage-stats.rst | 21 +++++++
>>>>>> drivers/gpu/drm/drm_file.c | 79 +++++++++++++++++++++++++++
>>>>>> drivers/gpu/drm/msm/msm_drv.c | 25 ++++++++-
>>>>>> drivers/gpu/drm/msm/msm_gpu.c | 2 -
>>>>>> include/drm/drm_file.h | 10 ++++
>>>>>> 5 files changed, 134 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> --
>>>>>> 2.39.2
>>>>>>
>>>>
>>>> --
>>>> Daniel Vetter
>>>> Software Engineer, Intel Corporation
>>>> http://blog.ffwll.ch
>>
>>
>>
>> --
>> With best wishes
>> Dmitry
--
With best wishes
Dmitry