Message-ID: <343E0E168479F04FACCB176989D12DE7EE3206@dggemi522-mbs.china.huawei.com>
Date:   Wed, 16 Sep 2020 07:04:24 +0000
From:   lushenming <lushenming@...wei.com>
To:     Marc Zyngier <maz@...nel.org>
CC:     Thomas Gleixner <tglx@...utronix.de>,
        Jason Cooper <jason@...edaemon.net>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Wanghaibin (D)" <wanghaibin.wang@...wei.com>,
        yuzenghui <yuzenghui@...wei.com>
Subject: RE: [PATCH] irqchip/gic-v4.1: Optimize the delay time of the poll
 on the GICR_VPENDBASER.Dirty bit

Hi,

Our team discussed this issue again and consulted our GIC hardware 
design team. They think the RD can afford busy-waiting, so we still think 
a delay of 0 is the better choice, at least for our hardware.

If the delay is not 0, then, as I said before, our measurements show that 
parsing the VPT takes only hundreds of nanoseconds, or 1~2 microseconds, 
in most cases. So a delay of 1 microsecond, or less, would be more 
appropriate. In any case, 10 microseconds is too much.
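
To illustrate the point, here is a rough simulated-time model of a poll 
loop (a sketch only, not the kernel code). The function name 
observed_latency_ns, the 100 ns cost per register read, and the timeout 
value are all assumptions made for the example, not measurements. It 
shows how the delay between reads, rather than the parse time itself, 
can dominate the latency that software observes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model, in simulated nanoseconds, of polling a "Dirty" bit:
 * read the register, and if the bit is still set, wait delay_ns before
 * reading again.
 *
 * parse_ns   - time at which the hardware clears the bit (assumed)
 * delay_ns   - wait inserted between two reads
 * read_ns    - cost of a single register read (assumed)
 * timeout_ns - give up once this much time has elapsed
 *
 * Returns the time at which the poller sees the bit clear, or -1 on
 * timeout.
 */
static int64_t observed_latency_ns(int64_t parse_ns, int64_t delay_ns,
                                   int64_t read_ns, int64_t timeout_ns)
{
    int64_t now = 0;

    assert(parse_ns >= 0 && delay_ns >= 0 && read_ns > 0 && timeout_ns > 0);

    for (;;) {
        now += read_ns;        /* one register read completes */
        if (now >= parse_ns)   /* hardware has cleared the bit by now */
            return now;
        if (now >= timeout_ns) /* bit still set and time is up */
            return -1;
        now += delay_ns;       /* wait before the next read */
    }
}
```

Under these assumed numbers, a parse that finishes after 2000 ns is 
observed at 10200 ns with a 10 microsecond delay, at 2300 ns with a 
1 microsecond delay, and at 2000 ns with no delay (at the price of 20 
reads of the register instead of 2 or 3).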

Admittedly, this does depend on the hardware implementation.

Besides, I'm not sure where the start and end points of the total vcpu 
scheduling latency you mentioned are, since it includes many events. Is the 
parse time of the VPT not a clear enough metric?

-----Original Message-----
From: Marc Zyngier [mailto:maz@...nel.org] 
Sent: 2020-09-15 22:48
To: lushenming <lushenming@...wei.com>
Cc: Thomas Gleixner <tglx@...utronix.de>; Jason Cooper <jason@...edaemon.net>; linux-kernel@...r.kernel.org; Wanghaibin (D) <wanghaibin.wang@...wei.com>; yuzenghui <yuzenghui@...wei.com>
Subject: Re: [PATCH] irqchip/gic-v4.1: Optimize the delay time of the poll on the GICR_VPENDBASER.Dirty bit

On 2020-09-15 15:04, lushenming wrote:
> Thanks for your quick response.
> 
> Okay, I agree that busy-waiting may add more overhead at the RD level.
> But I think that the delay time can be adjusted. In our latest 
> hardware implementation, we optimize the search of the VPT, now even 
> the VPT full of interrupts (56k) can be parsed within 2 microseconds.

It's not so much when the VPT is full that it is bad. It is when the pending interrupts are not cached, and that you don't know *where* to look for them in the VPT.

> It is true that the parse speeds of various hardware are different, 
> but does directly waiting for 10 microseconds make the optimization of 
> those fast hardware be completely masked? Maybe we can set the delay 
> time smaller, like 1 microseconds?

That certainly would be more acceptable. But I still question the impact of such a change compared to the cost of a vcpu entry. I suggest you come up with measurements that actually show that polling this register more often significantly reduces the entry latency. Only then can we make an educated decision.

Thanks,

         M.
--
Jazz is not dead. It just smells funny...
