linux-kernel - Re: [PATCH v3 5/6] arm64: dts: qcom: sm6150: Add gpu and rgmu nodes

Message-ID: <1afebfb7-00aa-4f19-b6c7-dd6fadb83664@oss.qualcomm.com>
Date: Mon, 22 Dec 2025 12:49:28 +0530
From: Akhil P Oommen <akhilpo@....qualcomm.com>
To: Dmitry Baryshkov <dmitry.baryshkov@....qualcomm.com>
Cc: Konrad Dybcio <konrad.dybcio@....qualcomm.com>,
        Rob Clark <robin.clark@....qualcomm.com>, Sean Paul <sean@...rly.run>,
        Konrad Dybcio <konradybcio@...nel.org>,
        Dmitry Baryshkov <lumag@...nel.org>,
        Abhinav Kumar <abhinav.kumar@...ux.dev>,
        Marijn Suijten <marijn.suijten@...ainline.org>,
        David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
        Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
        Maxime Ripard <mripard@...nel.org>,
        Thomas Zimmermann <tzimmermann@...e.de>, Rob Herring <robh@...nel.org>,
        Krzysztof Kozlowski <krzk+dt@...nel.org>,
        Conor Dooley <conor+dt@...nel.org>,
        Bjorn Andersson <andersson@...nel.org>,
        Jessica Zhang <jesszhan0024@...il.com>,
        Dan Carpenter <dan.carpenter@...aro.org>,
        linux-arm-msm@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        freedreno@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
        devicetree@...r.kernel.org, Jie Zhang <quic_jiezh@...cinc.com>
Subject: Re: [PATCH v3 5/6] arm64: dts: qcom: sm6150: Add gpu and rgmu nodes

On 12/13/2025 12:58 AM, Dmitry Baryshkov wrote:
> On Fri, Dec 12, 2025 at 01:01:44AM +0530, Akhil P Oommen wrote:
>> On 12/11/2025 6:56 PM, Dmitry Baryshkov wrote:
>>> On Thu, Dec 11, 2025 at 05:22:40PM +0530, Akhil P Oommen wrote:
>>>> On 12/11/2025 4:42 PM, Akhil P Oommen wrote:
>>>>> On 12/11/2025 6:06 AM, Dmitry Baryshkov wrote:
>>>>>> On Thu, Dec 11, 2025 at 02:40:52AM +0530, Akhil P Oommen wrote:
>>>>>>> On 12/6/2025 2:04 AM, Dmitry Baryshkov wrote:
>>>>>>>> On Fri, Dec 05, 2025 at 03:59:09PM +0530, Akhil P Oommen wrote:
>>>>>>>>> On 12/4/2025 7:49 PM, Dmitry Baryshkov wrote:
>>>>>>>>>> On Thu, Dec 04, 2025 at 03:43:33PM +0530, Akhil P Oommen wrote:
>>>>>>>>>>> On 11/26/2025 6:12 AM, Dmitry Baryshkov wrote:
>>>>>>>>>>>> On Sat, Nov 22, 2025 at 03:03:10PM +0100, Konrad Dybcio wrote:
>>>>>>>>>>>>> On 11/21/25 10:52 PM, Akhil P Oommen wrote:
>>>>>>>>>>>>>> From: Jie Zhang <quic_jiezh@...cinc.com>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Add gpu and rgmu nodes for qcs615 chipset.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Signed-off-by: Jie Zhang <quic_jiezh@...cinc.com>
>>>>>>>>>>>>>> Signed-off-by: Akhil P Oommen <akhilpo@....qualcomm.com>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>
>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>
>>>>>>>>>>>>>> +			gpu_opp_table: opp-table {
>>>>>>>>>>>>>> +				compatible = "operating-points-v2";
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +				opp-845000000 {
>>>>>>>>>>>>>> +					opp-hz = /bits/ 64 <845000000>;
>>>>>>>>>>>>>> +					required-opps = <&rpmhpd_opp_turbo>;
>>>>>>>>>>>>>> +					opp-peak-kBps = <7050000>;
>>>>>>>>>>>>>> +				};
>>>>>>>>>>>>>
>>>>>>>>>>>>> I see another speed of 895 @ turbo_l1, perhaps that's for speedbins
>>>>>>>>>>>>> or mobile parts specifically?
>>>>>>>>>>>>
>>>>>>>>>>>> msm-4.14 defines 7 speedbins for SM6150. Akhil, I don't see any of them
>>>>>>>>>>>> here.
>>>>>>>>>>>
>>>>>>>>>>> The IoT/Auto variants have a different frequency plan compared to the
>>>>>>>>>>> mobile variant. I reviewed the downstream code and this aligns with that
>>>>>>>>>>> except the 290 MHz corner. We can remove that one.
>>>>>>>>>>>
>>>>>>>>>>> Here we are describing the IoT variant of Talos. So we can ignore the
>>>>>>>>>>> speedbins from the mobile variant until that is supported.
>>>>>>>>>>
>>>>>>>>>> No, we are describing just Talos, which hopefully covers both mobile and
>>>>>>>>>> non-mobile platforms.
>>>>>>>>>
>>>>>>>>> We cannot assume that.
>>>>>>>>>
>>>>>>>>> Even if we assume that there is no variation in silicon, the firmware
>>>>>>>>> (AOP, TZ, HYP etc) is different between mobile and IoT version. So it is
>>>>>>>>> wise to use the configuration that is commercialized, especially when it
>>>>>>>>> is power related.
>>>>>>>>
>>>>>>>> How does it affect the speed bins? I'd really prefer if we:
>>>>>>>> - describe OPP tables and speed bins here
>>>>>>>> - remove speed bins cell for the Auto / IoT boards
>>>>>>>> - make sure that the driver uses the IoT bin if there is no speed bin
>>>>>>>>   declared in the GPU.
>>>>>>>>
>>>>>>>
>>>>>>> The frequency plan is different between mobile and IoT. Are you
>>>>>>> proposing to describe a union of OPP table from both mobile and IoT?
>>>>>>
>>>>>> Okay, this prompted me to check the sa6155p.dtsi from msm-4.14... And it
>>>>>> has speed bins. How come we don't have bins for the IoT variant?
>>>>>>
>>>>>> Mobile bins: 0, 177, 187, 156, 136, 105, 73
>>>>>> Auto bins:   0, 177,      156, 136, 105, 73
>>>>>>
>>>>>> Both Mobile and Auto chips used the same NVMEM cell (0x6004, 8 bits
>>>>>> starting from bit 21).
>>>>>>
>>>>>> Mobile freqs:
>>>>>> 0:         845M, 745M, 700M,       550M,       435M,       290M
>>>>>> 177:       845M, 745M, 700M,       550M,       435M,       290M
>>>>>> 187: 895M, 845M, 745M, 700M,       550M,       435M,       290M
>>>>>> 156:             745M, 700M,       550M,       435M,       290M
>>>>>> 136:                         650M, 550M,       435M,       290M
>>>>>> 105:                                     500M, 435M,       290M
>>>>>> 73:                                                  350M, 290M
>>>>>>
>>>>>> Auto freqs:
>>>>>> 0:         845M, 745M, 650M, 500M, 435M
>>>>>> 177:       845M, 745M, 650M, 500M, 435M
>>>>>> 156:             745M, 650M, 500M, 435M
>>>>>> 136:                   650M, 500M, 435M
>>>>>> 105:                         500M, 435M
>>>>>> 73:                                      350M
>>>>>>
>>>>>> 290M was a part of the freq table, but later it was removed as "not
>>>>>> required", so probably it can be brought back, but I'm not sure how to
>>>>>> handle 650 MHz vs 700 MHz and 500 MHz vs 550 MHz differences.
>>>>>>
>>>>>> I'm a bit persistent here because I really want to avoid the situation
>>>>>> where we define a bin-less OPP table and later we face binned QCS615
>>>>>> chips (which is possible since both SM and SA were binned).
>>>>>
>>>>> Why is that a problem as long as KMD can handle it without breaking
>>>>> backward compatibility?
>>>>
>>>> I replied too soon. I see your point. Can't we keep separate OPP tables
>>>> when that happens? That is neater than complicating the driver, isn't it?
>>>
>>> I have a different story in mind. We ship DTB for IQ-615 listing 845 MHz
>>> as a max freq without speed bins. Later some of the chips shipped in
>>> IQ-615 are characterized as not belonging to bin 0 / not supporting 845
>>> MHz. The users end up overclocking those chips, because the DTB doesn't
>>> make any difference.
>>
>> That is unlikely, because the characterization and other similar
>> activities are completed and finalized before ramp up at foundries.
>> Nobody likes to RMA tons of chipsets.
>>
>> Anyway, this hypothetical scenario is a problem even when we use the
>> hard fuse.
> 
> So, are you promising that while there were several characterization
> bins for SM6150 and SA6155P, there is only one bin for QCS615, going up
> to the max freq?

I have cross-checked with our Product team. I can confirm that for both
internal and external SKUs of Talos IoT currently, there is only a
single bin for GPU with Fmax 845 MHz.

> 
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> Also I don't see separate QFPROM memory map definitions for Mobile, IoT
>>>>>> and Auto SKUs. If you have access to the QCS615 hardware, what is the
>>>>>> value written in that fuse area?
>>>>>>
>>>>>>> Another wrinkle we need to address is that, so far, we have never had a
>>>>>>> dt binding where opp-supp-hw property exists without the speedbin cells.
>>>>>>> And that adds a bit of complexity on the driver side because, today, the
>>>>>>> KMD relies on the presence of speed bin cells to decide whether to
>>>>>>> select bin via opp_supp_hw API or not. Also, we may have to reserve this
>>>>>>> combination (opp bins without speedbin cells) to help KMD detect that it
>>>>>>> should use socinfo APIs instead of speedbin cells on certain chipsets.
>>>>
>>>>> If it is a soft fuse, it could fall into an unused region in qfprom. On
>>>>> other IoT chipsets like Lemans, Product teams preferred a soft fuse
>>>>> instead of the hard fuse. The downside of the hard fuse is that it must
>>>>> be blown at the factory and cannot be updated from software later in
>>>>> the program.
>>>>
>>>> This response is for your comment above. Adding to that, the value for
>>>> the hard fuse is most likely to be '0' (unfused), but all internal
>>>> parts are always unfused. Maybe it is 'practically' harmless to use the
>>>> freq-limiter hard fuse for now, because 845 MHz is the Fmax for '0' on
>>>> mobile, Auto and IoT. I am not sure.
>>>>
>>>> I am trying to play safe here as this is dt. We don't want to configure
>>>> the wrong thing now and later struggle to correct it. It is safe to
>>>> defer things which we don't know.
>>>
>>> What is "soft fuse"? Why do we need an extra entity in addition to the
>>> one that was defined for auto / mobile units?
>>
>> The hard fuse (freq limiter one) has to be blown at a very early stage
>> in the chip manufacturing. Instead of that, a soft fuse region which is
>> updated by the firmware during the cold boot is used to provide a hint
>> to KMD about the supported GPU fmax. I was told that this provides
>> better operational flexibility to the Product team.
> 
> Do you have an upstream example by chance?

We use soft fuse for Lemans IoT.

-Akhil.

> 
>>
>> -Akhil
>>
>>>
>>>>
>>>> -Akhil.
>>>>
>>>>>
>>>>> -Akhil.
>>>>>
>>>>>>
>>>>>> We already have "machine" as another axis in the GPU catalog. I'd
>>>>>> suggest defining separate speed bins for mobile and auto/IoT in the DT
>>>>>> (0x1 - 0x20 for mobile, 0x100 - 0x1000 for auto) and then in the driver
>>>>>> mapping those by the machine compat.
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>
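
Editorial illustration (not part of the original message, and not code from
the patch or the driver): the binning scheme debated in this thread can be
sketched in a few lines of Python. The fuse layout (8 bits starting from bit
21 of the word at 0x6004) and the Auto frequency table are copied from the
quoted discussion. Everything else is an assumption made for illustration
only: the function names, the idea that the fuse is read as one 32-bit word,
and the assignment of one bitmask bit per fuse value in table order. The
sketch mimics the general "each OPP carries a supported-hardware bitmask,
the driver keeps OPPs whose mask includes the chip's bin bit" idea. It does
not claim to reproduce how the msm driver or any DT binding actually encodes
these values.

```python
# Fuse layout as quoted in the thread: 8 bits starting from bit 21.
SPEED_BIN_SHIFT = 21
SPEED_BIN_WIDTH = 8

# Auto frequency plan from the thread (MHz), keyed by raw fuse value.
AUTO_FREQS_MHZ = {
    0:   [845, 745, 650, 500, 435],
    177: [845, 745, 650, 500, 435],
    156: [745, 650, 500, 435],
    136: [650, 500, 435],
    105: [500, 435],
    73:  [350],
}


def read_speed_bin(fuse_word: int) -> int:
    """Extract the speed-bin field from a fuse word (assumed 32-bit)."""
    return (fuse_word >> SPEED_BIN_SHIFT) & ((1 << SPEED_BIN_WIDTH) - 1)


def build_supported_hw(freqs_by_fuse):
    """Give each fuse value one bit (hypothetical: table order) and
    compute, per frequency, the mask of bins that may run it."""
    bit_of = {fuse: i for i, fuse in enumerate(freqs_by_fuse)}
    masks = {}
    for fuse, freqs in freqs_by_fuse.items():
        for freq in freqs:
            masks[freq] = masks.get(freq, 0) | (1 << bit_of[fuse])
    return bit_of, masks


def opps_for(fuse_word: int, freqs_by_fuse):
    """Return the frequencies allowed for the chip, highest first."""
    bit_of, masks = build_supported_hw(freqs_by_fuse)
    fuse = read_speed_bin(fuse_word)
    if fuse not in bit_of:
        raise ValueError(f"unknown speed bin fuse value {fuse}")
    hw = 1 << bit_of[fuse]
    return sorted((f for f, m in masks.items() if m & hw), reverse=True)
```

With this table, a part fused as 156 would be limited to 745 MHz and below,
which is the overclocking concern raised in the thread: a bin-less OPP table
has no mask to filter on, so every part is offered the full list up to
845 MHz.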