Message-Id: <527F52AB-0070-43EA-BE82-945280CA2AEE@gmail.com>
Date: Wed, 21 Feb 2024 10:57:38 -0600
From: Andrew Geissler <geissonator@...il.com>
To: Andrew Jeffery <andrew@...econstruct.com.au>
Cc: minyard@....org,
Paul Menzel <pmenzel@...gen.mpg.de>,
Joel Stanley <joel@....id.au>,
openipmi-developer@...ts.sourceforge.net,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
linux-aspeed <linux-aspeed@...ts.ozlabs.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
openbmc@...ts.ozlabs.org
Subject: Re: [PATCH] ipmi: kcs: Update OBF poll timeout to reduce latency
> On Feb 20, 2024, at 4:36 PM, Andrew Jeffery <andrew@...econstruct.com.au> wrote:
>
> On Tue, 2024-02-20 at 13:33 -0600, Corey Minyard wrote:
>> On Tue, Feb 20, 2024 at 04:51:21PM +0100, Paul Menzel wrote:
>>> Dear Andrew,
>>
>> It's because increasing that number causes it to poll longer for the
>> event, the host takes longer than 100us to generate the event, and if
>> the event is missed the time when it is checked again is very long.
>>
>> Polling for 100us is already pretty extreme. 200us is really too long.
>>
>> The real problem is that there is no interrupt for this. I'd also guess
>> there is no interrupt on the host side, because that would solve this
>> problem, too, as it would certainly get around to handling the interrupt
>> in 100us. I'm assuming the host driver is not the Linux driver, as it
>> should also handle this in a timely manner, even when polling.
>
> I expect the issues Andrew G is observing are with the Power10 boot
> firmware. The boot firmware only polls. The runtime firmware enables
> interrupts.
Yep, this is with the low level host boot firmware.
Also, further testing overnight showed that 200us wasn’t enough for
our larger Everest P10 machines, I needed to go to 300us. As we
were struggling to allow 200us, I assume 300us is going to be a no-go.
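
To make the trade-off Corey describes concrete, here is a toy model (not
driver code). The function name and all the numbers are assumptions for
illustration only; in particular the 10ms deferred re-check time is made up
and is not taken from the kernel source. It shows why a poll window shorter
than the host's response time gives the long latency, and why a longer
window only helps once it exceeds that response time.

```python
def completion_latency_us(host_delay_us, poll_window_us, fallback_us):
    """Time until the BMC notices that the host cleared OBF.

    If the host clears OBF inside the busy-poll window, the BMC sees it
    right away. Otherwise the BMC only notices at the next deferred check,
    which is assumed to happen fallback_us after the poll started.
    """
    if host_delay_us <= poll_window_us:
        return host_delay_us
    return fallback_us


# Hypothetical numbers: the host firmware needs 250us to read ODR, and the
# deferred re-check happens 10ms later (assumed, not a measured value).
for window_us in (100, 200, 300):
    print(window_us, completion_latency_us(250, window_us, 10_000))
```

The cost that the model leaves out is the one Corey objects to: the BMC
spends the whole window spinning, so a larger window adds latency for
everything else on the BMC.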
>>
>
>>
>> The right way to fix this is probably to do the same thing the host side
>> Linux driver does. It has a kernel thread that is kicked off to do
>> this. Unfortunately, that's more complicated to implement, but it
>> avoids polling in this location (which causes latency issues on the BMC
>> side) and lets you poll longer without causing issues.
>
> In Andrew G's case he's talking MCTP over KCS using a vendor-defined
> transport binding (that also leverages LPC FWH cycles for bulk data
> transfers)[1]. I think it could have taken more inspiration from the
> IPMI KCS protocol: It might be worth an experiment to write the dummy
> command value to IDR from the host side after each ODR read to signal
> the host's clearing of OBF (no interrupt for the BMC) with an IBF
> (which does interrupt the BMC). And doing the obverse for the BMC. Some
> brief thought suggests that if the dummy value is read there's no need
> to send a dummy value in reply (as it's an indicator to read the status
> register). With that the need for the spin here (or on the host side)
> is reduced at the cost of some constant protocol overhead.
>
Thanks for the quick reviews and ideas.
I’ll see if I can find someone on the team to help out with Andrew J’s
thoughts and if that doesn’t work, look into the kernel thread idea.
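
As a starting point for that experiment, a rough sketch of Andrew J's
handshake as I understand it, written as a toy Python model rather than
firmware or kernel code. The class, the method names and the dummy value
0xFF are all hypothetical; the real dummy command value would have to come
from the transport binding. Only the BMC-sending direction is modelled.

```python
DUMMY_CMD = 0xFF  # hypothetical value; the real one would be binding-defined


class KcsModel:
    """Toy model of one KCS channel as seen from both sides."""

    def __init__(self):
        self.idr = 0
        self.odr = 0
        self.ibf = False  # set by a host IDR write; this interrupts the BMC
        self.obf = False  # set by a BMC ODR write; cleared by a host ODR read
        self.bmc_can_send = True

    def bmc_send(self, byte):
        """BMC writes ODR, then waits for an IBF interrupt instead of spinning."""
        assert self.bmc_can_send
        self.odr = byte
        self.obf = True
        self.bmc_can_send = False

    def host_receive(self):
        """Host reads ODR (clearing OBF), then writes the dummy value to IDR."""
        byte = self.odr
        self.obf = False
        self.idr = DUMMY_CMD
        self.ibf = True
        self.bmc_ibf_interrupt()  # the IDR write is what interrupts the BMC
        return byte

    def bmc_ibf_interrupt(self):
        """BMC IBF handler: a dummy value means 're-read the status register'."""
        value = self.idr
        self.ibf = False
        if value == DUMMY_CMD and not self.obf:
            self.bmc_can_send = True  # no reply to the dummy value is needed


chan = KcsModel()
chan.bmc_send(0x42)
received = chan.host_receive()
print(hex(received), chan.bmc_can_send)
```

The point of the design is that the BMC never polls OBF: it learns that
ODR was consumed from the IBF interrupt, at the cost of one extra IDR
write per ODR read.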
>
>
> Andrew J