linux-kernel - Re: [PATCH -next] sched/cputime: Fix the bug of reading time backward from /proc/stat


Message-ID: <0b2f6919-7ed4-b30d-e92b-355c09bbfd25@huawei.com>
Date:   Mon, 5 Sep 2022 11:47:41 +0800
From:   zhengzucheng <zhengzucheng@...wei.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Li Hua <hucool.lihua@...wei.com>
CC:     <mingo@...hat.com>, <juri.lelli@...hat.com>,
        <vincent.guittot@...aro.org>, <dietmar.eggemann@....com>,
        <rostedt@...dmis.org>, <bsegall@...gle.com>, <mgorman@...e.de>,
        <bristot@...hat.com>, <vschneid@...hat.com>,
        <linux-kernel@...r.kernel.org>, <stable@...r.kernel.org>
Subject: Re: [PATCH -next] sched/cputime: Fix the bug of reading time backward
 from /proc/stat


Assume that a CPU time "A" is read from /proc/stat and, after a while,
a CPU time "B" is read. If T = B - A < 0, T is interpreted as a very
large number when it is treated as an unsigned integer. As a result, the
CPU usage calculated this way will be abnormally high. This looks like a
problem that should be fixed.
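
To illustrate the wraparound described above, here is a minimal sketch in
C. The helper names delta_naive() and delta_clamped() are hypothetical and
are not taken from the kernel or from any monitoring tool; they only show
what an unsigned subtraction does when the second sample is smaller than
the first, and one possible defensive treatment on the reader side.

```c
#include <stdint.h>

/*
 * Hypothetical helper: plain unsigned subtraction of two samples.
 * If `now` is smaller than `prev`, the result wraps around modulo 2^64
 * and becomes a huge positive number.
 */
static uint64_t delta_naive(uint64_t prev, uint64_t now)
{
	return now - prev;
}

/*
 * Hypothetical helper: treat a backward step as "no progress" instead
 * of letting the subtraction wrap. This only hides the symptom in the
 * reader; it does not fix the source of the backward value.
 */
static uint64_t delta_clamped(uint64_t prev, uint64_t now)
{
	return now >= prev ? now - prev : 0;
}
```

With the values from the report (319 read first, 318 read later),
delta_naive(319, 318) yields UINT64_MAX, while delta_clamped(319, 318)
yields 0.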

original link:
https://lore.kernel.org/lkml/20220813000102.42051-1-hucool.lihua@huawei.com/

On 2022/8/15 16:15, Peter Zijlstra wrote:
> On Sat, Aug 13, 2022 at 08:01:02AM +0800, Li Hua wrote:
>> The statistical time goes backward: the value read first is 319, and the value read later is 318, as follows:
>> first:
>> cat /proc/stat |  grep cpu1
>> cpu1    319    0    496    41665    0    0    0    0    0    0
>> then:
>> cat /proc/stat |  grep cpu1
>> cpu1    318    0    497    41674    0    0    0    0    0    0
>>
>> Time goes back, which is counterintuitive.
>>
>> After debugging, the problem turns out to be caused by the implementation of kcpustat_cpu_fetch_vtime, as follows:
>>
>>                                CPU0                                                                          CPU1
>> First:
>> show_stat():
>>      ->kcpustat_cpu_fetch()
>>          ->kcpustat_cpu_fetch_vtime()
>>              ->cpustat[CPUTIME_USER] = kcpustat_cpu(cpu) + vtime->utime + delta;              rq->curr is in user mode
>>               ---> When CPU1's rq->curr is running in userspace, utime and delta must be added
>>                                                                                               --->  rq->curr->vtime->utime is less than 1 tick
>> Then:
>> show_stat():
>>      ->kcpustat_cpu_fetch()
>>          ->kcpustat_cpu_fetch_vtime()
>>              ->cpustat[CPUTIME_USER] = kcpustat_cpu(cpu);                                     rq->curr is in kernel mode
>>              ---> When CPU1's rq->curr is running in kernel space, only kcpustat is read
> This is unreadable, what?!?
> .
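
For readers of the quoted diagram, the sequence it describes can be modeled
with a small sketch in C. This is a toy model under stated assumptions, not
kernel code: struct model_cpu and model_fetch_user() are hypothetical
stand-ins, the values are in arbitrary units, and the logic only follows
the description in the quoted patch (user mode adds the not yet flushed
utime and delta, kernel mode returns the flushed value alone).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-CPU state in the report. */
struct model_cpu {
	uint64_t kcpustat_user;	/* user time already flushed to kcpustat */
	uint64_t pending_utime;	/* accumulated utime not yet flushed (< 1 tick) */
	uint64_t delta;		/* time since the last vtime snapshot */
	bool in_user_mode;	/* whether the current task runs in userspace */
};

/*
 * Hypothetical model of the user-time read as described in the quoted
 * patch: the pending utime and delta are added only while the task is
 * in user mode, so a later read taken in kernel mode, before the
 * pending utime has been flushed, can return a smaller value.
 */
static uint64_t model_fetch_user(const struct model_cpu *c)
{
	if (c->in_user_mode)
		return c->kcpustat_user + c->pending_utime + c->delta;
	return c->kcpustat_user;
}
```

With kcpustat_user = 318, pending_utime = 1 and delta = 0, a read taken in
user mode returns 319, and a read taken after the task has entered kernel
mode returns 318, which matches the two /proc/stat samples in the report.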