Message-ID: <4a17190d-3d02-9c8c-699e-62e5aff63f08@huawei.com>
Date: Thu, 18 Sep 2025 22:19:56 +0800
From: Tangnianyao <tangnianyao@...wei.com>
To: Marc Zyngier <maz@...nel.org>
CC: <tglx@...utronix.de>, <linux-kernel@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>, <guoyang2@...wei.com>,
<wangwudi@...ilicon.com>, jiangkunkun <jiangkunkun@...wei.com>,
<wangzhou1@...ilicon.com>, <guozicheng3@...ilicon.com>
Subject: Re: [PATCH] irqchip/gic-v4.1: Check whether indirect table is
supported in allocate_vpe_l1_table
On 9/18/2025 17:50, Marc Zyngier wrote:
> On Thu, 18 Sep 2025 03:56:04 +0100,
> Tangnianyao <tangnianyao@...wei.com> wrote:
>>
>>
>> On 5/15/2024 17:34, Marc Zyngier wrote:
>>> On Wed, 15 May 2024 09:56:10 +0100,
>>> Tangnianyao <tangnianyao@...wei.com> wrote:
>>>>
>>>> On 1/22/2024 22:02, Marc Zyngier wrote:
>>>>> On Mon, 22 Jan 2024 13:13:09 +0000,
>>>>> Tangnianyao <tangnianyao@...wei.com> wrote:
>>>>>> On 1/22/2024 17:00, Marc Zyngier wrote:
>>>>>>> [Fixing the LKML address, which has bits of Stephan's address embedded
>>>>>>> in it...]
>>>>>>>
>>>>>>> On Mon, 22 Jan 2024 16:06:07 +0000,
>>>>>>> Nianyao Tang <tangnianyao@...wei.com> wrote:
>>>>>>>> In allocate_vpe_l1_table, when we fail to inherit the VPE table from other
>>>>>>>> redistributors or ITSs, we allocate a new VPE table for the current common
>>>>>>>> affinity field without checking whether an indirect table is supported.
>>>>>>>> Let's fix it.
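[Editor's note: the check the patch describes can be sketched as a capacity test. This is an illustrative standalone sketch, not the actual kernel patch; the function name and all parameters are hypothetical.]

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative sketch of the missing check: before committing to a
 * flat (one-level) VPE table, verify that either the redistributor
 * supports an indirect table, or that the flat table is large enough
 * for the requested number of VPEIDs.  All names and parameters are
 * hypothetical, not the kernel's.
 */
static bool vpe_table_fits(bool indirect_supported, uint32_t nr_vpes,
                           uint32_t nr_pages, uint32_t page_size,
                           uint32_t entry_size)
{
	/* Entries reachable through a flat table of nr_pages pages */
	uint64_t flat_entries = (uint64_t)nr_pages * page_size / entry_size;

	if (indirect_supported)
		return true;	/* a two-level table can be grown to fit */

	return nr_vpes <= flat_entries;
}
```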
>>>>>>> Is there an actual implementation that doesn't support the indirect
>>>>>>> property for the VPE table? I know this is allowed for consistency
>>>>>>> with the original revision of the architecture, but I never expected
>>>>>>> an actual GICv4.1 implementation to be *that* bad.
>>>>>>>
>>>>>>> If that's the case, I'm a bit puzzled/worried.
>>>>>> I met this problem in a developing implementation and found it is allowed by the GIC spec.
>>>>>> In such an environment, in a common affinity field containing only redistributors and
>>>>>> no ITS, forcing its_vpe_id_alloc to allocate a large vpeid (like 65000) produces the
>>>>>> error message "VPE IRQ allocation failure". It originally comes from
>>>>>> allocate_vpe_l2_table, which reads GICR_VPROPBASER with GICR_VPROPBASER_4_1_SIZE=1
>>>>>> and GICR_VPROPBASER_4_1_INDIRECT=0.
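[Editor's note: the reported failure can be reproduced arithmetically. With SIZE=1 (the field encodes "pages minus one", so two pages), INDIRECT=0, a 16KB page and 32B entries, a flat table reaches only 1024 entries, so vpeid 65000 is out of range. A minimal sketch, assuming those sizes:]

```c
#include <stdint.h>

/*
 * Capacity of a flat VPE table.  The Size field encodes "number of
 * pages minus one".  The 16KB page and 32B entry sizes follow the
 * values quoted in the report; they are implementation parameters,
 * not architectural constants.
 */
static uint64_t flat_vpe_capacity(unsigned int size_field,
                                  uint64_t page_size, uint64_t entry_size)
{
	return (uint64_t)(size_field + 1) * page_size / entry_size;
}
```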
>>>>> Really, you should get your HW engineers to fix their GIC
>>>>> implementation. I'm OK with working around this issue for
>>>>> completeness, but shipping such an implementation would be a mistake.
>>>>>
>>>>> [...]
>>>>>
>>>>>> I have another question here. The maximum number of pages differs between GITS_BASER
>>>>>> and GICR_VPROPBASER: GITS_BASER.Size is bits[7:0], allowing up to 256 pages, while
>>>>>> the GICv4.1 GICR_VPROPBASER.Size is bits[6:0], allowing up to 128.
>>>>>> The kernel usually probes the ITS BASERs first and then GICR_VPROPBASER in
>>>>>> a common affinity group. Maybe we need to check this in "inherit_vpe_l1_table_from_its"?
>>>>> This is because GITS_BASER[] is generic (also works for devices and
>>>>> collections), while GICR_VPROPBASER is tailored to the VPE table which
>>>>> is usually smaller.
>>>>>
>>>>> I would expect GICD_TYPER2.VID to report a value that cannot
>>>>> result in anything going wrong (in this case, the L1 allocation
>>>>> cannot be more than 128 pages).
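[Editor's note: the width difference discussed here can be made concrete. Both Size fields encode "pages minus one", so the 8-bit GITS_BASER.Size reaches 256 pages while the 7-bit GICR_VPROPBASER.Size reaches only 128. A hedged sketch of the resulting bound check; the macro and function names are illustrative, not the kernel's:]

```c
#include <stdbool.h>

/* Both Size fields encode the number of pages minus one. */
#define GITS_BASER_MAX_PAGES		(1U << 8)	/* 8-bit field: 256 pages */
#define GICR_VPROPBASER_MAX_PAGES	(1U << 7)	/* 7-bit field: 128 pages */

/*
 * Illustrative: an L1 table inherited from an ITS only fits in the
 * redistributor's Size field if it uses at most 128 pages.
 */
static bool its_alloc_fits_rd(unsigned int its_npages)
{
	return its_npages <= GICR_VPROPBASER_MAX_PAGES;
}
```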
>>>>>
>>>>> Overall, the kernel isn't a validation suite for the HW, and we expect
>>>>> it to have some level of sanity. So if none of this is in shipping HW
>>>>> but only in some model with crazy parameters, I don't think we should
>>>>> go out of our way to support it.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> M.
>>>>>
>>>> Hi Marc,
>>>> Friendly ping. Do we have a plan to fix this problem in the kernel, or any other plan?
>>> Hi Nianyao,
>>>
>>> My earlier question still stands: is this something that affects a
>>> shipping implementation? If not, then I don't think we should support
>>> this upstream, as this doesn't seem like a realistic configuration.
>>>
>>> If your employer has actually built this (which I still consider as a
>>> mistake), then we can add the workaround I suggested.
>>>
>>> Thanks,
>>>
>>> M.
>>>
>> Hi Marc,
>>
>> For GICv4.1 on HIP09, HIP10 and HIP10C, only a one-level VPE table is supported and GITS_BASER<n>.Indirect
>> is RAZ/WI. The implementation uses a 16KB page size and a 32B entry size, so each page supports 512 VPEs. Our
>> clients require 1 or 2 pages at most, so it only supports a flat table to simplify the implementation.
>>
>> Our designer has confirmed with an ARM architect that we can waive this rule.
>> Quoting the GIC spec, section 12.19.1, GITS_BASER<n> description: if the maximum width of the scaling factor
>> that is identified by GITS_BASER<n>.Type and the smallest page size that is supported result in a single
>> level table that requires multiple pages, then implementing this bit
>> as RAZ/WI is DEPRECATED.
> I can read the spec just as well. Doesn't make it a good option.
It may be a good option for clients that do not use many VPEs?
So we tried to convince ARM to waive this rule and received a positive response.
>> We have actually built this in HIP09,HIP10,HIP10C and would like to
>> fix this in kernel.
> Isn't that the broken systems that can't even do vSGIs properly?
>
>> Can we merge the above bug-fix patch to kernel? Or we need to fix with errata for HIP09,HIP10,HIP10C?
> Frankly, I've completely lost track of what this patch is doing. This
> has been going on for over 18 months, and you have been silent on the
> subject for over a year. What do you expect?
>
> If you want anything to be done on this front, please repost a patch,
> and we'll review it again.
>
> M.
>
We have tested the last patch in our internal build and now aim to fix the upstream
kernel to support our SoC. We hope to address this either through a community patch or an errata fix.
I will repost the patch shortly. Thanks for your review.
Nianyao Tang