linux-kernel - Re: [Question] Reading /proc/stat has a time backward issue


Date:   Mon, 8 Aug 2022 20:23:46 +0800
From:   "Lihua (lihua, ran)" <hucool.lihua@...wei.com>
To:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        "open list:SCHEDULER" <linux-kernel@...r.kernel.org>,
        <frederic@...nel.org>
Subject: Re: [Question] Reading /proc/stat has a time backward issue

ping...

Any suggestions would be valuable; I don't have a good way to fix this.

thanks all.

On 2022/8/4 15:44, Lihua (lihua, ran) wrote:
> ping...
> 
> Any good suggestions?
> 
> thanks all.
> 
> On 2022/7/27 12:02, Lihua (lihua, ran) wrote:
>> Hi all,
>>
>> I found a problem where the CPU time reported in /proc/stat goes backward: the first read returns 319 for cpu1's user time, and the next read returns 318, as follows.
>> First:
>> cat /proc/stat |  grep cpu1
>> cpu1    319    0    496    41665    0    0    0    0    0    0
>> Then:
>> cat /proc/stat |  grep cpu1
>> cpu1    318    0    497    41674    0    0    0    0    0    0
>>
>> Time going backward is counterintuitive.
>>
>> After debugging, I found that the problem is caused by the implementation of kcpustat_cpu_fetch_vtime(), as follows:
>>
>>                                CPU0                                                                          CPU1
>> First:
>> show_stat():
>>      ->kcpustat_cpu_fetch()
>>          ->kcpustat_cpu_fetch_vtime()
>>              ->cpustat[CPUTIME_USER] = kcpustat_cpu(cpu) + vtime->utime + delta;              rq->curr is in user mode
>>               ---> When CPU1's rq->curr is running in user space, utime and delta must be added
>>                                                                                               --->  rq->curr->vtime->utime is less than 1 tick
>> Then:
>> show_stat():
>>      ->kcpustat_cpu_fetch()
>>          ->kcpustat_cpu_fetch_vtime()
>>              ->cpustat[CPUTIME_USER] = kcpustat_cpu(cpu);                                     rq->curr is in kernel mode
>>              ---> When CPU1's rq->curr is running in kernel space, only kcpustat is used
>>
>> Because the pending utime, stime, and delta values are added to cpustat only temporarily at read time, values read from /proc/stat have two problems:
>> 1. A later read may return a smaller value than an earlier one (time goes backward);
>> 2. When there are many tasks, the statistics are not accurate enough while utime and stime have not yet exceeded one tick.
>> Time going backward is counterintuitive, and I would like to discuss whether there is a good solution that does not compromise performance.
>>
>> Thanks a lot.
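
The two reads described in the original report can be reproduced with a small user-space model. This is only a sketch under stated assumptions: `struct model_cpu`, `model_fetch_user()`, and `model_to_clock_t()` are hypothetical names, not kernel code, and the nanosecond values are made up so that the output matches the 319 and 318 seen in the report. It assumes USER_HZ = 100 and a 10 ms tick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for the model; assumes USER_HZ = 100, tick = 10 ms. */
#define MODEL_NSEC_PER_SEC 1000000000ULL
#define MODEL_USER_HZ      100ULL
#define MODEL_TICK_NS      (MODEL_NSEC_PER_SEC / MODEL_USER_HZ)

/* Simplified stand-in for the per-CPU state discussed in the thread. */
struct model_cpu {
    uint64_t kcpustat_user_ns; /* user time already flushed to kcpustat */
    uint64_t pending_utime_ns; /* unflushed vtime->utime of rq->curr */
    uint64_t delta_ns;         /* time since the last vtime update */
    bool     curr_in_user;     /* true if rq->curr is in user mode */
};

/*
 * Models the reader as described in the report: the pending utime and
 * delta are added only while rq->curr is in user mode. Once the task
 * enters the kernel, the still-unflushed utime is no longer visible.
 */
static uint64_t model_fetch_user(const struct model_cpu *c)
{
    uint64_t user_ns = c->kcpustat_user_ns;

    /* Pending utime below one tick has not been flushed to kcpustat. */
    assert(c->pending_utime_ns < MODEL_TICK_NS);

    if (c->curr_in_user)
        user_ns += c->pending_utime_ns + c->delta_ns;

    return user_ns;
}

/* Converts nanoseconds to the USER_HZ units printed in /proc/stat. */
static uint64_t model_to_clock_t(uint64_t ns)
{
    return ns / (MODEL_NSEC_PER_SEC / MODEL_USER_HZ);
}
```

With 3185 ms already flushed, 4 ms pending, and a 2 ms delta, a read taken while the task is in user mode converts to 319; the same state read after the task has entered the kernel converts to 318, because the pending 4 ms is below one tick and was never flushed.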