[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <476ABDDF.8080607@intel.com>
Date: Thu, 20 Dec 2007 11:09:19 -0800
From: "Kok, Auke" <auke-jan.h.kok@...el.com>
To: Stephen Hemminger <shemminger@...ux-foundation.org>
CC: parag.warudkar@...il.com, netdev@...r.kernel.org,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
Arjan van de Ven <arjan@...ux.intel.com>
Subject: Re: [PATCH] sky2: Use deferrable timer for watchdog
Stephen Hemminger wrote:
> On Thu, 20 Dec 2007 17:29:23 +0000
>
>> -----Original Message-----
>> From: Stephen Hemminger <shemminger@...ux-foundation.org>
>>
>> Date: Thu, 20 Dec 2007 09:16:03
>> To:parag.warudkar@...il.com
>> Cc:netdev@...r.kernel.org, akpm@...ux-foundation.org, linux-kernel@...r.kernel.org
>> Subject: Re: [PATCH] sky2: Use deferrable timer for watchdog
>>
>>
>> On Tue, 18 Dec 2007 20:13:28 -0500 (EST)
>> Parag Warudkar <parag.warudkar@...il.com> wrote:
>>
>>> sky2 can use deferrable timer for watchdog - reduces wakeups from idle per
>>> second.
>>>
>>> Signed-off-by: Parag Warudkar <parag.warudkar@...il.com>
>>>
>>> --- linux-2.6/drivers/net/sky2.c 2007-12-07 10:04:39.000000000 -0500
>>> +++ linux-2.6-work/drivers/net/sky2.c 2007-12-18 20:07:58.000000000 -0500
>>> @@ -4230,7 +4230,10 @@
>>> sky2_show_addr(dev1);
>>> }
>>>
>>> - setup_timer(&hw->watchdog_timer, sky2_watchdog, (unsigned long) hw);
>>> + hw->watchdog_timer.function = sky2_watchdog;
>>> + hw->watchdog_timer.data = (unsigned long) hw;
>>> + init_timer_deferrable(&hw->watchdog_timer);
>>> +
>>> INIT_WORK(&hw->restart_work, sky2_restart);
>>>
>>> pci_set_drvdata(pdev, hw);
>> Does it really reduce the wakeup's or only change who gets charged by powertop?
>> The system is going to wakeup once a second anyway. Looks to me that if the
>> timer is using round_jiffies(), that setting deferrable just changes the accounting.
>>
>> My interpretation of the api is:
>> * round_jiffies() - timer wants to wakeup but isn't precise about when so schedule
>> on next second when system will wake up anyway;
>> e.g why meetings are usually scheduled on the hour
>>
>> * deferrable - timer doesn't have to really wakeup but wants to happen near
>> a particular time. e.g. "I'll meet you at the pub around 8pm"
>>
>> Therefore doing deferrable is unnecessary for timers using round_jiffies unless system
>> is so good at doing timers that it is going to skip doing timer once per second.
>>
>
> parag.warudkar@...il.com wrote:
>
>> NO_HZ kernels don't do timers every second - if you do round_jiffies() the kernel will wakeup and run the timer at that time no matter what.
>>
>> The reason deferrable was introduced is to avoid waking up the kernel just for this one timer that can be called when the CPU is not idle for some reason other than this timer.
>>
>> In other words let's say there were two timers - one non-deferrable expiring in 3 seconds and other deferrable, expiring in 1.5 seconds. The kernel will not wake up twice - once for 1.5 second and other for 3 second - it will wake up once at expiry of 3 second timer and execute both the 1.5 second and 3 second timers.
>>
>> And this is not just powertop accounting thing - like I said the total num of wakeups per second go down with this patch.
>>
>> Parag
>>
>> Sent via BlackBerry from T-Mobile
>
>
> Quit top-posting!
>
> If this is the case then the whole usage of round_jiffies() is bogus. All users of round_jiffies()
> should just be converted to deferrable?? I am a bit concerned that if deferrable gets used everywhere
> then a strange situation would occur where all timers were waiting for some other timer to finally
> happen, kind of a wierd timelock situation. Like the old chip/dale cartoon:
> "you first, no you first, after you mister chip, no after you mister dale,..."
that's a dangerous situation indeed and I'd really like to know what the limits
are for deferring deferrable timers.... Arjan, do you know? Anyone?
I don't see a danger just yet on normal systems - I get something like 10 wakeups
per second from just the kernel (acpi, ahci, usb) on most my systems which
guarantees that the watchdog runs often enough, but for embedded systems and
critical timers in other drivers this may be an issue quickly
Auke
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists