Date:	Wed, 9 Jun 2010 22:22:29 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Suresh Rajashekara <suresh.raj+linuxomap@...il.com>
cc:	linux-omap@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
	linux-pm@...ts.linux-foundation.org,
	John Stultz <johnstul@...ibm.com>
Subject: Re: Timekeeping issue on aggressive suspend/resume

On Wed, 9 Jun 2010, Suresh Rajashekara wrote:

> I have an application (running on 2.6.29-omap1) which puts an OMAP1
> system to suspend aggressively. The system wakes up every 4 seconds
> and stays awake for about 35 milliseconds and sleeps again for another
> 4 seconds. This design is to save power on a battery operated device.
> 
> This aggressive suspend resume action seems like creating an issue to
> other applications in the system waiting for some timeout to happen
> (especially an application which is waiting using the mq_timedreceive
> and is supposed to timeout every 30 seconds. It seems to wake up every
> 90 seconds). Seems like the timekeeping is not happening properly
> inside the kernel.
> 
> If the suspend duration is changed from 4 second to 1 second, then
> things work somewhat better. On reducing it to 0.5 second (which was
> our earlier design on 2.6.16-rc3), the problem seems to disappear.
> 
> Is this expected?

Yes, that's caused by the fact that suspend (via /sys/power/state)
freezes the kernel internal timers and the user space visible timers
which are based on CLOCK_MONOTONIC or jiffies (like mq_timedreceive on
your .29 kernel). Only CLOCK_REALTIME based timers are kept correct, as
we have to align to the wall clock time.
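
One way to observe this from user space is to sample both clocks around
a suspend cycle. The following is only a minimal sketch, assuming a
POSIX system that provides clock_gettime(); the helper names
read_clock() and realtime_minus_monotonic() are made up for
illustration and are not part of any kernel or libc API.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <time.h>

/* Hypothetical helper: read one POSIX clock as seconds in a double. */
static double read_clock(clockid_t id)
{
    struct timespec ts;
    int rc = clock_gettime(id, &ts);

    /* For the two standard clock ids used here this should not fail. */
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical helper: offset of CLOCK_REALTIME over CLOCK_MONOTONIC. */
static double realtime_minus_monotonic(void)
{
    return read_clock(CLOCK_REALTIME) - read_clock(CLOCK_MONOTONIC);
}
```

If realtime_minus_monotonic() is called once before and once after a
suspend, and CLOCK_MONOTONIC really was frozen while CLOCK_REALTIME was
realigned, the second value should be larger by roughly the time spent
suspended (provided nothing else stepped the wall clock in between).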

The reason for this is that otherwise almost all timers would have
expired by the time we resume, and we would get a thundering herd of
apps and kernel facilities woken by firing timeouts.

Another problem is that jiffies can wrap around on 32 bit systems
during a long suspend, though I don't think that's a real world problem
as it takes between 49 and 497 days of suspend depending on the HZ
setting. So for your use case it would not matter.
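
The 49 and 497 day figures follow from dividing 2^32 ticks by the tick
rate. A small sketch of that arithmetic, assuming a 32 bit counter that
advances HZ times per second and assuming HZ values of 1000 and 100 as
the two ends of the range; jiffies_wrap_days() is a made-up name, not a
kernel function.

```c
#include <assert.h>

/* Hypothetical helper: days until a 32-bit tick counter wraps at hz ticks/s. */
static double jiffies_wrap_days(unsigned int hz)
{
    assert(hz > 0);

    const double ticks = 4294967296.0;      /* 2^32 counter values */
    const double seconds_per_day = 86400.0;

    return ticks / (double)hz / seconds_per_day;
}
```

jiffies_wrap_days(1000) comes out at about 49.7 days and
jiffies_wrap_days(100) at about 497 days.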

I'm more concerned about code getting surprised by firing timers, as
the kernel has had this behaviour for a long time now.

Though we could change that conditionally - the default would still be
the freeze of jiffies and CLOCK_MONOTONIC for historical compatibility.

There will probably be some accounting issues (uptime, CPU time of the
suspend task and some others), but that needs to be found out.

Thanks,

	tglx


