Before this change we had

  Normal lapic_clockevent           rating 100
  hpet MSI (percpu) clockevent      rating 110
  Always Running lapic_clockevent   rating 150

As a result, on systems that support HPET MSI, the percpu HPET clockevents
got priority over the LAPIC timer. That was OK on systems that supported
deep C-states, but it was sub-optimal on systems that did not, as HPETs are
slower than the LAPIC timer. There was also a functional issue with HPET MSI
on some platforms that do not support deep C-states, as reported here:
http://lkml.indiana.edu/hypermail/linux/kernel/0912.2/01118.html

After the change,

  hpet MSI (percpu) clockevent      rating 95
  Normal lapic_clockevent           rating 100
  Always Running lapic_clockevent   rating 150

And we reduce the rating of the non-Always-Running LAPIC timer (to 90) when
we see that deep C-states are supported, and switch to HPET MSI.

This change makes timer usage optimal in terms of performance and also
eliminates the functional issue mentioned above.

Signed-off-by: Venkatesh Pallipadi
---
 arch/x86/kernel/hpet.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index dd9370b..ccb3752 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -555,7 +555,11 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
 	hpet_setup_irq(hdev);
 	evt->irq = hdev->irq;
 
-	evt->rating = 110;
+	/*
+	 * Rating should be within 10 less than lapic timer for
+	 * timer switch to happen when deep C-states are supported.
+	 */
+	evt->rating = 95;
 	evt->features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_NO_BROADCAST;
 	if (hdev->flags & HPET_DEV_PERI_CAP)
 		evt->features |= CLOCK_EVT_FEAT_PERIODIC;
-- 
1.6.0.6
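
For illustration only, the rating ordering described in the changelog can be
modelled with a small standalone C sketch. This is not kernel code: the
toy_* names and the RATING_* constants are hypothetical, the numeric values
are taken from the changelog text, and the model assumes the device with the
highest rating is chosen. The real clockevents core also takes device
features into account, which this sketch ignores.

```c
#include <stddef.h>

/* Toy stand-in for a per-CPU clock event device (hypothetical type). */
struct toy_clockevent {
	const char *name;
	int rating;	/* in this model, the highest rating wins */
};

/* Ratings as listed in the changelog above. */
enum {
	RATING_HPET_MSI_OLD	= 110,
	RATING_HPET_MSI_NEW	= 95,
	RATING_LAPIC_NORMAL	= 100,
	RATING_LAPIC_DEEP_C	= 90,	/* non-always-running LAPIC, deep C-states seen */
	RATING_LAPIC_ARAT	= 150,	/* always-running LAPIC timer */
};

/* Return the device with the highest rating, or NULL if n == 0. */
static const struct toy_clockevent *
toy_pick(const struct toy_clockevent *devs, size_t n)
{
	const struct toy_clockevent *best = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (!best || devs[i].rating > best->rating)
			best = &devs[i];
	return best;
}

/*
 * Name of the device this model selects for a platform, given the HPET MSI
 * rating in use and two platform properties. Assumes the LAPIC rating is
 * lowered to 90 only when the timer is not always running and deep
 * C-states are supported.
 */
static const char *
toy_choice(int hpet_rating, int lapic_always_running, int deep_cstates)
{
	struct toy_clockevent devs[2];
	int lapic = RATING_LAPIC_NORMAL;

	if (lapic_always_running)
		lapic = RATING_LAPIC_ARAT;
	else if (deep_cstates)
		lapic = RATING_LAPIC_DEEP_C;

	devs[0].name = "lapic";
	devs[0].rating = lapic;
	devs[1].name = "hpet-msi";
	devs[1].rating = hpet_rating;

	return toy_pick(devs, 2)->name;
}
```

In this model, a rating of 95 lets the LAPIC timer win when deep C-states are
absent (100 > 95) and lets HPET MSI win once the LAPIC rating drops to 90,
while the old rating of 110 selects HPET MSI whenever the LAPIC timer is not
always running.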