Message-ID: <a6bb55aa-5c47-ba7a-2f74-56da4aef4a42@kontron.de>
Date:   Wed, 27 May 2020 12:50:01 +0000
From:   Schrempf Frieder <frieder.schrempf@...tron.de>
To:     Russell King - ARM Linux admin <linux@...linux.org.uk>
CC:     Shawn Guo <shawnguo@...nel.org>,
        Sascha Hauer <s.hauer@...gutronix.de>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        Kate Stewart <kstewart@...uxfoundation.org>,
        Enrico Weigelt <info@...ux.net>,
        Thomas Gleixner <tglx@...utronix.de>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>
Subject: Re: High interrupt latency with low power idle mode on i.MX6

On 27.05.20 13:53, Russell King - ARM Linux admin wrote:
> On Wed, May 27, 2020 at 10:39:12AM +0000, Schrempf Frieder wrote:
>> Hi,
>>
>> on our i.MX6UL/ULL boards running mainline kernels, we see an issue with
>> RS485 collisions on the bus. These are caused by the resetting of the
>> RTS signal being delayed after each transmission. The TXDC interrupt
>> takes several milliseconds to trigger and the slave on the bus already
>> starts to send a reply in the meantime.
>>
>> We found out that these delays only happen when the CPU is in "low power
>> idle" mode (ARM power off). When we disable cpuidle state 2 or put some
>> background load on the CPU everything works fine and the delays are gone.
>>
>> echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state2/disable
>>
>> It seems like also other interfaces (I2C, etc.) might be affected by
>> these increased latencies, we haven't investigated this more closely,
>> though.
>>
>> We currently apply a patch to our kernel, that disables low power idle
>> mode by default, but I'm wondering if there's a way to fix this
>> properly? Any ideas?
> 
> Let's examine a basic fact about power management:
> 
> The deeper PM modes that the system enters, the higher the latency to
> resume operation.
> 
> So, I'm not surprised that you have higher latency when you allow the
> system to enter lower power modes.  Does that mean that the kernel
> should not permit entering lower power modes - no, it's policy and
> application dependent.
> 
> If the hardware is designed to use software to manage the RTS signal
> to control the RS485 receiver, then I'm afraid that your report really
> does not surprise me - throwing that at software to manage is a really
> stupid idea, but it seems lots of people do this.  I've held this view
> since I worked on a safety critical system that used RS485 back in the
> 1990s (London Underground Jubilee Line Extension public address system.)
> 
> So, what we have here is several things that come together to create a
> problem:
> 
> 1) higher power savings produce higher latency to resume from
> 2) lack of hardware support for RS485 half duplex communication needing
>     software support
> 3) an application that makes use of RS485 half duplex communication
>     without disabling the higher latency power saving modes
> 
> The question is, who should disable those higher latency power saving
> modes - the kernel, or userspace?
> 
> The kernel knows whether it needs to provide software control of the
> RTS signal or not, but the kernel does not know the maximum permissible
> latency (which is application specific.)  So, the kernel doesn't have
> all the information it needs.  However, there is a QoS subsystem which
> may help you.
> 
> There's also tweaks available via
> /sys/devices/system/cpu/cpu*/power/pm_qos_resume_latency_us
> 
> which can be poked to configure the latency that is required, and will
> prevent the deeper PM states being entered.

Thanks for the detailed explanation. This all makes perfect sense to me.
I will keep in mind that we need to consider the trade-off between power
saving and latency when designing systems, and that we need to give the
kernel the information it needs to decide which of the two is more
important.

Also thanks for pointing out the QoS subsystem. I'm not quite sure
whether pm_qos_resume_latency_us would work in our specific case.
The latency we observe is about 2 to 3 milliseconds longer with low
power idle than without, but the exit_latency specified for low power
idle in the cpuidle driver is only 300 us.

So as far as I can see, given this difference, even if we set
pm_qos_resume_latency_us to 1000 us (which should be fast enough for
RS485 to work properly), low power idle wouldn't be disabled.
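
To illustrate the arithmetic, here is a minimal sketch of the filtering
assumed above: a state stays eligible as long as its declared exit
latency does not exceed the QoS limit. The helper allowed_states() and
the IdleState type are hypothetical and only model the comparison; they
are not kernel code. The 300 us figure and the "2 to 3 ms" observation
come from this thread; the other latencies are placeholders.

```python
from typing import List, NamedTuple


class IdleState(NamedTuple):
    name: str
    exit_latency_us: int  # latency as declared by the cpuidle driver


def allowed_states(states: List[IdleState], qos_limit_us: int) -> List[IdleState]:
    """Keep the states whose declared exit latency fits within the limit.

    Assumes a positive limit; special values are not modelled here.
    """
    return [s for s in states if s.exit_latency_us <= qos_limit_us]


# Declared latencies. Only the 300 us value is taken from this thread,
# the first two entries are illustrative placeholders.
states = [
    IdleState("WFI", 1),
    IdleState("WAIT", 50),
    IdleState("low power idle", 300),
]

# With a 1000 us limit the deepest state is still eligible, because
# the declared 300 us is below the limit.
print([s.name for s in allowed_states(states, 1000)])

# If the driver declared what is actually observed (at least 2000 us
# extra, the lower bound of the "2 to 3 ms" seen here), the same limit
# would filter the state out.
OBSERVED_EXTRA_US = 2000
observed = [
    s._replace(exit_latency_us=s.exit_latency_us + OBSERVED_EXTRA_US)
    if s.name == "low power idle" else s
    for s in states
]
print([s.name for s in allowed_states(observed, 1000)])
```

The limit would have to be set below 300 us to exclude low power idle
with the declared values, which is much tighter than what the RS485
timing itself requires.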

It's this discrepancy between the latency declared in the driver and
what we see in reality that makes me wonder whether I'm missing
something.
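
For comparing the declared values against measurements, a small sketch
that reads what the cpuidle driver exports through sysfs may be useful.
It assumes the usual per-state "name", "latency" and "disable"
attributes are present below the CPU directory; the function name
read_cpuidle_states() is made up for this example.

```python
from pathlib import Path
from typing import List, Tuple


def read_cpuidle_states(cpu_dir: str) -> List[Tuple[str, str, int, bool]]:
    """Return (directory, state name, declared exit latency in us, disabled)
    for every cpuidle state found below cpu_dir.

    Directories are sorted lexicographically, which is good enough for
    the handful of states discussed here.
    """
    result = []
    for state_dir in sorted(Path(cpu_dir, "cpuidle").glob("state[0-9]*")):
        name = (state_dir / "name").read_text().strip()
        latency_us = int((state_dir / "latency").read_text())
        disabled = (state_dir / "disable").read_text().strip() != "0"
        result.append((state_dir.name, name, latency_us, disabled))
    return result
```

On a target board this would be called as
read_cpuidle_states("/sys/devices/system/cpu/cpu0"), and the reported
latency for state2 can then be held against the delay measured on the
RTS line.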
