lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7e44b7a1-4a12-86bf-4651-aa6a03c4f832@huawei.com>
Date:   Mon, 29 Mar 2021 18:38:04 +0800
From:   Jingyi Wang <wangjingyi11@...wei.com>
To:     Marc Zyngier <maz@...nel.org>
CC:     <linux-kernel@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>, <tglx@...utronix.de>,
        <wanghaibin.wang@...wei.com>, <yuzenghui@...wei.com>,
        <zhukeqian1@...wei.com>
Subject: Re: [RFC PATCH 1/3] irqchip/gic-v3: Make use of ICC_SGI1R IRM bit



On 3/29/2021 5:55 PM, Marc Zyngier wrote:
> On Mon, 29 Mar 2021 09:52:08 +0100,
> Jingyi Wang <wangjingyi11@...wei.com> wrote:
>>
>> IRM, bit[40] in ICC_SGI1R, determines how the generated SGIs
>> are distributed to PEs. If the bit is set, interrupts are routed
>> to all PEs in the system excluding "self". We use cpumask to
>> determine if this bit should be set and make use of that.
>>
>> This will reduce vm trap when broadcast IPIs are sent.
> 
> I remember writing similar code about 4 years ago, only to realise
> what:
> 
> - the cost of computing the resulting mask is pretty high for large
> machines
> - Linux almost never sends broadcast IPIs, so the complexity was all
> in vain
> 
> What changed? Please provide supporting data showing how many IPIs we
> actually save, and for which workload.
Maybe we can implement send_IPI_allbutself hooks as other some other 
archs instead of computing cpumask here?

>>
>> Signed-off-by: Jingyi Wang <wangjingyi11@...wei.com>
>> ---
>>   drivers/irqchip/irq-gic-v3.c | 12 ++++++++++++
>>   1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>> index eb0ee356a629..8ecc1b274ea8 100644
>> --- a/drivers/irqchip/irq-gic-v3.c
>> +++ b/drivers/irqchip/irq-gic-v3.c
>> @@ -1127,6 +1127,7 @@ static void gic_send_sgi(u64 cluster_id, u16 tlist, unsigned int irq)
>>   static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
>>   {
>>   	int cpu;
>> +	cpumask_t tmp;
>>   
>>   	if (WARN_ON(d->hwirq >= 16))
>>   		return;
>> @@ -1137,6 +1138,17 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
>>   	 */
>>   	wmb();
>>   
>> +	if (!cpumask_and(&tmp, mask, cpumask_of(smp_processor_id()))) {
> 
> Are you sure this does the right thing? This is checking that the
> current CPU is not part of the mask. But it not checking that the mask
> is actually "all but self".
> 
> This means you are potentially sending IPIs to CPUs that are not part
> of the mask, making performance potentially worse.
> 
> Thanks,
> 
> 	M.
> 
I will fix that,thanks.

>> +		/* Set Interrupt Routing Mode bit */
>> +		u64 val;
>> +		val = (d->hwirq) << ICC_SGI1R_SGI_ID_SHIFT;
>> +		val |= BIT_ULL(ICC_SGI1R_IRQ_ROUTING_MODE_BIT);
>> +		gic_write_sgi1r(val);
>> +
>> +		isb();
>> +		return;
>> +	}
>> +
>>   	for_each_cpu(cpu, mask) {
>>   		u64 cluster_id = MPIDR_TO_SGI_CLUSTER_ID(cpu_logical_map(cpu));
>>   		u16 tlist;
>> -- 
>> 2.19.1
>>
>>
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ