lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200811201211.19552.david-b@pacbell.net>
Date:	Thu, 20 Nov 2008 12:11:19 -0800
From:	David Brownell <david-b@...bell.net>
To:	yi.y.yang@...el.com
Cc:	linux-kernel@...r.kernel.org, gregkh@...e.de,
	akpm@...ux-foundation.org, Alan Stern <stern@...land.harvard.edu>
Subject: Re: [PATCH] USB: improve ehci_watchdog's side effect in CPU power management

On Thursday 25 September 2008, Yi Yang wrote:
> ehci_watchdog will wake up CPU very frequently so that CPU
> stays at C3 very short, average residence time is about 50
> ms on Aspire One, but we expect it should be about 1 second
> or more, so this kind of periodic timer is very bad for power
> saving.

How did you finger this timer?  I think you don't understand
what it's really doing.

Near as I can tell, f0d781d59cb621e1795d510039df973d0f8b23fc
should just be reverted.

Alan Stern had some good comments.  Although GCC will usually
end up optimizing most of this code away, this function may now
be too large now for inlining.

 
> We can't remove this timer because of some bad USB controller
> chipset, but at least we should reduce its side effect to as
> possible as low.

That's actually a different timer ... the IAA watchdog timer
is coping with a quirk that's been observed on most VIA chips.
On sane hardware, it should never fire (since the IAA interrupt
actually happens).


> This patch can make CPU stay at C3 longer, average residence time
> is about twice as long as original. 

Did you actually measure this patch?  It looks very wrong to me.

Recall that the primary purpose of *this* timer is to reduce
the DMA load ... first by shrinking the async ring by taking
out unused QH entries, and then by disabling the async ring
entirely.

So leaving needless DMA loads active for longer, you prevent
entry to C3... contrary to what you're attempting.

To improve C3 times, you'd want to stop DMA *sooner* not later...
there's a tradeoff of course, since stopping DMA too soon will
shrink performance (and causes various hardware races to be
more common).


> Please consider to apply it, thanks
> 
> Signed-off-by: Yi Yang <yi.y.yang@...el.com>
> ---
>  ehci.h |   12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/usb/host/ehci.h b/drivers/usb/host/ehci.h
> index 5799298..9d530d9 100644
> --- a/drivers/usb/host/ehci.h
> +++ b/drivers/usb/host/ehci.h
> @@ -181,14 +181,16 @@ timer_action (struct ehci_hcd *ehci, enum ehci_timer_action action)
>  	 * the async ring; just the I/O watchdog.  Note that if a
>  	 * SHRINK were pending, OFF would never be requested.
>  	 */
> -	if (timer_pending(&ehci->watchdog)
> -			&& ((BIT(TIMER_ASYNC_SHRINK) | BIT(TIMER_ASYNC_OFF))
> -				& ehci->actions))
> -		return;
> +	enum ehci_timer_action oldactions = ehci->actions;

Moving that test invalidates the comment desribing what it was
doing.  This change *also* looks fishy... either it's not needed,
or it was thet only change you needed (but, with comment fix).


>  
>  	if (!test_and_set_bit (action, &ehci->actions)) {
>  		unsigned long t;
>  
> +		if (timer_pending(&ehci->watchdog)
> +			&& ((BIT(TIMER_ASYNC_SHRINK) | BIT(TIMER_ASYNC_OFF))
> +				& oldactions))
> +			return;
> +
>  		switch (action) {
>  		case TIMER_IO_WATCHDOG:
>  			t = EHCI_IO_JIFFIES;
> @@ -204,7 +206,7 @@ timer_action (struct ehci_hcd *ehci, enum ehci_timer_action action)
>  			t = DIV_ROUND_UP(EHCI_SHRINK_FRAMES * HZ, 1000) + 1;
>  			break;
>  		}
> -		mod_timer(&ehci->watchdog, t + jiffies);
> +		mod_timer(&ehci->watchdog, round_jiffies(t + jiffies));

So basically, instead of having the DMA load shrink down to zero
and finally to off, in on the order of 1/20 of a second ... you
instead leave DMA active for much longer periods.

An async ring with three empty entries will be doing DMA for one,
two, three seconds ... preventing C3 all the while ... before
turning off.

Vs previous behavior where will shrink to empty and then stop
doing DMA, in under 1/10 of a second.

This round_jiffies() call is the very least that needs reverting.



>  	}
>  }
>  
> 
> 
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ