linux-kernel - Re: [BUG nohz]: wrong user and system time accounting


Message-ID: <20170403152315.GA4221@lerouge>
Date:   Mon, 3 Apr 2017 17:23:17 +0200
From:   Frederic Weisbecker <fweisbec@...il.com>
To:     Luiz Capitulino <lcapitulino@...hat.com>
Cc:     Wanpeng Li <kernellwp@...il.com>, Mike Galbraith <efault@....de>,
        Rik van Riel <riel@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [BUG nohz]: wrong user and system time accounting

On Fri, Mar 31, 2017 at 11:11:19PM -0400, Luiz Capitulino wrote:
> On Sat, 1 Apr 2017 01:24:54 +0200
> Frederic Weisbecker <fweisbec@...il.com> wrote:
> 
> > On Fri, Mar 31, 2017 at 04:09:10PM -0400, Luiz Capitulino wrote:
> > > On Thu, 30 Mar 2017 17:25:46 -0400
> > > Luiz Capitulino <lcapitulino@...hat.com> wrote:
> > >   
> > > > On Thu, 30 Mar 2017 16:18:17 +0200
> > > > Frederic Weisbecker <fweisbec@...il.com> wrote:
> > > >   
> > > > > On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote:    
> > > > > > 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker <fweisbec@...il.com>:      
> > > > > > > If it works, we may want to take that solution, likely less performance sensitive
> > > > > > > than using sched_clock(). In fact sched_clock() is fast, especially as we require it to
> > > > > > > be stable for nohz_full, but using it involves costly conversion back and forth to jiffies.      
> > > > > > 
> > > > > > > So both Rik and you agree with the skew tick solution, I will try it
> > > > > > > tomorrow. Btw, should we add the random offset only to the CPUs in
> > > > > > > nohz_full mode, or to all CPUs like the code above?
> > > > > 
> > > > > > Let's just keep it on all CPUs for simplicity.
> > > > > Also please add a comment that explains why we need that skew_tick on nohz_full.    
> > > > 
> > > > I've tried all the test-cases we discussed in this thread with skew_tick=1
> > > > and it worked as expected in bare-metal and KVM guests.
> > > > 
> > > > However, I found a test-case that works on bare-metal but shows problems
> > > > in KVM guests. It could be something that's KVM specific, or it could be
> > > > something that's harder to reproduce on bare-metal.
> > > 
> > > After discussing some findings on this issue with Rik, I realized that
> > > we don't add the skew when restarting the tick in tick_nohz_restart().
> > > Adding the offset there seems to solve this problem.  
> > 
> > Are you sure? tick_nohz_restart() doesn't seem to override the initial skew. It
> > always forwards the expiration time on top of the last tick.
> 
> OK, I'll double check. Without my change the bug triggers almost
> instantly with the described reproducer. With my change it didn't
> trigger for several minutes (but it does look wrong looking at it now).

Do you observe aligned ticks with trace events (hrtimer_expire_entry)?

You might want to enforce the global clock to trace that:

    echo "global" > /sys/kernel/debug/tracing/trace_clock
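
For reference, a minimal sketch of how the whole check could be scripted. The helper names (trace_ticks_start, trace_ticks_stop, trace_ticks_show) and the TRACING variable are made up for illustration. The filter also assumes the tick handler shows up as function=tick_sched_timer in the event output, which may differ on other kernel versions. Needs root and a mounted debugfs/tracefs.

```shell
# Sketch only: TRACING defaults to the debugfs path used above; override it
# if tracefs is mounted elsewhere.
TRACING="${TRACING:-/sys/kernel/debug/tracing}"

# Switch to the global trace clock (timestamps comparable across CPUs)
# and enable the hrtimer_expire_entry event.
trace_ticks_start() {
    echo global > "$TRACING/trace_clock"
    echo 1 > "$TRACING/events/timer/hrtimer_expire_entry/enable"
}

# Disable the event again once the reproducer has run.
trace_ticks_stop() {
    echo 0 > "$TRACING/events/timer/hrtimer_expire_entry/enable"
}

# Show only the tick timer expirations from the trace buffer.
# Assumption: the tick handler is named tick_sched_timer on this kernel.
trace_ticks_show() {
    grep 'function=tick_sched_timer' "$TRACING/trace"
}
```

Run trace_ticks_start, trigger the reproducer, then trace_ticks_stop and trace_ticks_show. If the timestamps of the tick expirations on different CPUs land on (nearly) the same value, the ticks are aligned and the skew is not being applied.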