linux-kernel - timer-sp804: sched_clock: issue after first suspend

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKUFe6YJKEU8WiHWqAKOf20_c8XizVMFXr=12xO63kW9yrByuw@mail.gmail.com>
Date:   Mon, 27 Nov 2017 17:02:16 +0530
From:   Abhishek Shah <abhishek.shah@...adcom.com>
To:     daniel.lezcano@...aro.org, tglx@...utronix.de
Cc:     open list <linux-kernel@...r.kernel.org>, mingo@...hat.com,
        peterz@...radead.org, deepa.kernel@...il.com,
        john.stultz@...aro.org, fweisbec@...il.com,
        viro@...iv.linux.org.uk,
        BCM Kernel Feedback <bcm-kernel-feedback-list@...adcom.com>,
        Vikram Prakash <vikram.prakash@...adcom.com>
Subject: timer-sp804: sched_clock: issue after first suspend

Hi Daniel/Thomas/All,

One of our arm64 SoCs have 4 instances of sp804 timers apart form the
architecture timer.

The arch timer (implemented in "drivers/clocksource/arm_arch_timer.c")
is being used as clocksource (-done with clocksource_register_hz
callback)
as well as sched_clock (arch_timer_read_counter - done with
sched_clock_register).

When sp804 driver gets probed, as its frequency (133 MHz) is higher
than that of the arch timer (25 MHz);
it overrides current sched_clock -> arch_timer_read_counter with sp804_read.
Here, the sp804_read is based on timer2( base + 0x20) of sp804. I see
timer1 is being used as clock_event_device.

*On such system, the current clocksource is "arch_sys_counter"  as
seen through sysfs, But the sched_clock is based on sp804_read.*

The problem arises when we suspend the system. After suspend, the
printk timestamp freezes, i.e., continues to show the same value as
printk appears to get its timestamp from sched_clock and timer2 gets
"reset" on suspend.

If LD = timerX_base + 0x0; VAL = timerX_base + 0x4; CTRL = timerX_base
+ 0x8 (X = 1 for timer1 and X=2 for timer2); then
Initially, following are values of timer2 registers: LD: 0xFFFFFFFF
VAL:0x35C108B2 CTRL:0x000000C2
After first suspend, following are values of timer2 registers: LD:
0x00000000 VAL:0xFFFFFFFF CTRL:0x00000020
So, the timer2 got reset..
[ Just for reference:
Initially, following are values of timer1 registers: LD: 0x00000000
VAL:0xFFFFFFFF CTRL:0x00000000
After first suspend, following are values of timer1 registers: LD:
0x00000000 VAL:0xFFFFFFFF CTRL:0x00000000
]

In order to re-init timer2, I have added resume callback in mmio
clocksource driver.
Following patch re-inits the timer being used for sched_clock.
*************************************************************************************************************************************
diff --git a/drivers/clocksource/mmio.c b/drivers/clocksource/mmio.c
index 4c4df98..55734b2 100644
--- a/drivers/clocksource/mmio.c
+++ b/drivers/clocksource/mmio.c
@@ -51,7 +51,7 @@ u64 clocksource_mmio_readw_down(struct clocksource *c)
  */
 int __init clocksource_mmio_init(void __iomem *base, const char *name,
        unsigned long hz, int rating, unsigned bits,
-       u64 (*read)(struct clocksource *))
+       u64 (*read)(struct clocksource *), void (*resume)(struct clocksource *))
 {
        struct clocksource_mmio *cs;

@@ -68,6 +68,7 @@ int __init clocksource_mmio_init(void __iomem *base,
const char *name,
        cs->clksrc.read = read;
        cs->clksrc.mask = CLOCKSOURCE_MASK(bits);
        cs->clksrc.flags = CLOCK_SOURCE_IS_CONTINUOUS;
+       cs->clksrc.resume = resume;

        return clocksource_register_hz(&cs->clksrc, hz);
 }
diff --git a/drivers/clocksource/timer-sp804.c
b/drivers/clocksource/timer-sp804.c
index 3ac9dec..7fc0a0b 100755
--- a/drivers/clocksource/timer-sp804.c
+++ b/drivers/clocksource/timer-sp804.c
@@ -72,6 +72,16 @@ static u64 notrace sp804_read(void)
        return ~readl_relaxed(sched_clock_base + TIMER_VALUE);
 }

+static void sp804_timer_resume(struct clocksource *clksrc)
+{
+       writel(0, sched_clock_base + TIMER_CTRL);
+       writel(0xffffffff, sched_clock_base + TIMER_LOAD);
+       writel(0xffffffff, sched_clock_base + TIMER_VALUE);
+       writel(TIMER_CTRL_32BIT | TIMER_CTRL_ENABLE | TIMER_CTRL_PERIODIC,
+               sched_clock_base + TIMER_CTRL);
+}
+
 void __init sp804_timer_disable(void __iomem *base)
 {
        writel(0, base + TIMER_CTRL);
@@ -105,7 +115,7 @@ int  __init
__sp804_clocksource_and_sched_clock_init(void __iomem *base,
                base + TIMER_CTRL);

        clocksource_mmio_init(base + TIMER_VALUE, name,
-               rate, 200, 32, clocksource_mmio_readl_down);
+               rate, 200, 32, clocksource_mmio_readl_down, sp804_timer_resume);

        if (use_sched_clock) {
                sched_clock_base = base;
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 7dff196..2b23e0b 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -247,7 +247,8 @@ extern u64 clocksource_mmio_readw_up(struct clocksource *);
 extern u64 clocksource_mmio_readw_down(struct clocksource *);

 extern int clocksource_mmio_init(void __iomem *, const char *,
-       unsigned long, int, unsigned, u64 (*)(struct clocksource *));
+       unsigned long, int, unsigned, u64 (*)(struct clocksource *),
+       void (*)(struct clocksource *));

 extern int clocksource_i8253_init(void);
*************************************************************************************************************************************

This solves the printk timestamp freeze issue. Does anybody have any
suggestion to implement this differently
as I think this has to general issue with platforms having sp804 timers?
One thing that I can think of is, we can re-init the required timers
in ATF during resume sequence or is there any better way?

Apart from this, I see following prints at times, can the above
discussion be related to this behavior?
[  320.448348] INFO: rcu_sched detected stalls on CPUs/tasks:
[  320.454015]  3-...: (90 GPs behind) idle=162/140000000000001/0
softirq=1220/1220 fqs=2462
[  320.462540]  (detected by 1, t=5255 jiffies, g=518, c=517, q=37)
[  320.468738] Task dump for CPU 3:
[  320.472067] swapper/3       R  running task        0     0      1 0x00000022
[  320.479341] Call trace:
[  320.481869] [<ffff000008085828>] __switch_to+0x98/0xb0


Regards,
Abhishek