linux-kernel - RE:(2) [Issue] timer callback registered with mod


Message-ID: <1062227969.4242453.1632492283698@mail-kr5-0>
Date:   Fri, 24 Sep 2021 19:34:43 +0530
From:   Maninder Singh <maninder1.s@...sung.com>
To:     Frederic Weisbecker <frederic@...nel.org>
CC:     "fweisbec@...il.com" <fweisbec@...il.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "mingo@...nel.org" <mingo@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Vaneet Narang <v.narang@...sung.com>,
        AMIT SAHRAWAT <a.sahrawat@...sung.com>,
        Chung-Ki Woo <chungki0201.woo@...sung.com>,
        Vishal Goel <vishal.goel@...sung.com>
Subject: RE:(2) [Issue] timer callback registered with mod_timer is getting
 called beforetime

Hi Frederic,

> > Is it known behaviour for timers?
> > Only one CPU is assigned the jiffies update work (calling do_timer), until it goes idle and passes ownership to another CPU.
> > 
> > We tried making all CPUs run the jiffies update code (this adds a performance hit),
> > and then no abrupt jiffies change occurred on the system.
>  
> First of all, are you meeting this issue specifically on NOHZ_FULL? Because
> there is a pending fix for a related matter there:

No, this is not our case.

>  
>       https://lore.kernel.org/lkml/20210915142303.24297-1-frederic@kernel.org/
>  
> As for what you're reporting here, I think the core problem is the fact that the
> timekeeper (jiffies updater) is stuck with IRQs disabled for way too long. Even
> one millisecond is too much to tolerate. Do you have any idea about the source of
> that situation?
>  

Yes, definitely, interrupts should not be disabled for that long.
But sometimes third-party driver or vendor module code can cause this,
and that may be the case here. We are trying to reproduce the issue again and check the code path.

So we have two questions:
(1) In the case described, the timer callback will be called early, right?
(2) What if the jiffies update could be done by any CPU rather than making one
CPU the owner? Can that cause any side effects? One we know of is performance: there will be
redundant calls from the other CPUs.

        /* Check, if the jiffies need an update */
        if (tick_do_timer_cpu == cpu)
                tick_do_update_jiffies64(now);

On our target, there is a race when a code path that disables IRQs is scheduled on the same CPU
that is responsible for updating jiffies, while in parallel CPU1 registers an event callback for 20/30 ms.
Due to the abrupt jiffies change, the callback is triggered within 1 ms of actual time, which causes
the actual issue.

Thanks
Maninder Singh
