netdev - Re: [RFC PATCH V2 11/11] x86: tsc: avoid system instability in hibernation

Message-ID: <CAJZ5v0jkaw1jJVahWbvcqcYhcwWLqajm7gchn4L4WOngHJcbUA@mail.gmail.com>
Date:   Mon, 13 Jan 2020 12:48:08 +0100
From:   "Rafael J. Wysocki" <rafael@...nel.org>
To:     "Singh, Balbir" <sblbir@...zon.com>
Cc:     "peterz@...radead.org" <peterz@...radead.org>,
        "Valentin, Eduardo" <eduval@...zon.com>,
        "boris.ostrovsky@...cle.com" <boris.ostrovsky@...cle.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Agarwal, Anchal" <anchalag@...zon.com>,
        "Woodhouse, David" <dwmw@...zon.co.uk>,
        "vkuznets@...hat.com" <vkuznets@...hat.com>,
        "sstabellini@...nel.org" <sstabellini@...nel.org>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
        "Woodhouse@...-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com" 
        <Woodhouse@...-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "jgross@...e.com" <jgross@...e.com>, "pavel@....cz" <pavel@....cz>,
        "axboe@...nel.dk" <axboe@...nel.dk>,
        "x86@...nel.org" <x86@...nel.org>,
        "roger.pau@...rix.com" <roger.pau@...rix.com>,
        "hpa@...or.com" <hpa@...or.com>,
        "rjw@...ysocki.net" <rjw@...ysocki.net>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "Kamata, Munehisa" <kamatam@...zon.com>,
        "bp@...en8.de" <bp@...en8.de>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "konrad.wilk@...cle.co" <konrad.wilk@...cle.co>,
        "len.brown@...el.com" <len.brown@...el.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "fllinden@...ozn.com" <fllinden@...ozn.com>,
        "xen-devel@...ts.xenproject.org" <xen-devel@...ts.xenproject.org>
Subject: Re: [RFC PATCH V2 11/11] x86: tsc: avoid system instability in hibernation

On Mon, Jan 13, 2020 at 12:43 PM Singh, Balbir <sblbir@...zon.com> wrote:
>
> On Mon, 2020-01-13 at 11:16 +0100, Peter Zijlstra wrote:
> > On Fri, Jan 10, 2020 at 07:35:20AM -0800, Eduardo Valentin wrote:
> > > Hey Peter,
> > >
> > > On Wed, Jan 08, 2020 at 11:50:11AM +0100, Peter Zijlstra wrote:
> > > > On Tue, Jan 07, 2020 at 11:45:26PM +0000, Anchal Agarwal wrote:
> > > > > From: Eduardo Valentin <eduval@...zon.com>
> > > > >
> > > > > > System instability is seen during resume from hibernation when the system
> > > > > is under heavy CPU load. This is due to the lack of update of sched
> > > > > clock data, and the scheduler would then think that heavy CPU hog
> > > > > tasks need more time in CPU, causing the system to freeze
> > > > > during the unfreezing of tasks. For example, threaded irqs,
> > > > > and kernel processes servicing network interface may be delayed
> > > > > for several tens of seconds, causing the system to be unreachable.
> > > > > The fix for this situation is to mark the sched clock as unstable
> > > > > as early as possible in the resume path, leaving it unstable
> > > > > for the duration of the resume process. This will force the
> > > > > scheduler to attempt to align the sched clock across CPUs using
> > > > > the delta with time of day, updating sched clock data. In a post
> > > > > hibernation event, we can then mark the sched clock as stable
> > > > > again, avoiding unnecessary syncs with time of day on systems
> > > > > in which TSC is reliable.
> > > >
> > > > This makes no frigging sense what so bloody ever. If the clock is
> > > > stable, we don't care about sched_clock_data. When it is stable you get
> > > > a linear function of the TSC without complicated bits on.
> > > >
> > > > When it is unstable, only then do we care about the sched_clock_data.
> > > >
> > >
> > > > Yeah, maybe what is not clear here is that we are covering the situation
> > > > where clock stability changes over time, e.g. at regular boot the clock is
> > > > stable, hibernation happens, then restore happens on a non-stable clock.
> >
> > Still confused, who marks the thing unstable? The patch seems to suggest
> > you do yourself, but it is not at all clear why.
> >
> > If TSC really is unstable, then it needs to remain unstable. If the TSC
> > > really is stable then there is no point in marking it unstable.
> >
> > Either way something is off, and you're not telling me what.
> >
>
> Hi, Peter
>
> For your original comment, just wanted to clarify the following:
>
> 1. After hibernation, the machine can be resumed on a different but compatible
> host (these are VM images hibernated)
> 2. This means the clock between host1 and host2 can/will be different

So the problem is specific to this particular use case.

I don't see why this hack should be imposed on hibernation in all cases.
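
To make the objection concrete, here is a rough userspace sketch of gating
the workaround on the use case instead of applying it to every resume. It
is a toy model under stated assumptions, not kernel code and not the patch
under review: struct resume_ctx, may_change_host, resume_begin() and
resume_end() are all hypothetical names invented for illustration. The only
idea it shows is "override the stability flag only when the image may resume
on a different host, and restore whatever state was there before".

```c
#include <stdbool.h>

/* Hypothetical model of the resume path; none of these names exist in the kernel. */
struct resume_ctx {
	bool may_change_host;	/* assumption: guest image may resume on another host */
	bool clock_stable;	/* current "sched clock is stable" flag */
	bool saved_stable;	/* value of clock_stable before the override */
	bool overridden;	/* true while the override is in effect */
};

/* Called early in resume: force "unstable" only for the guest use case. */
static void resume_begin(struct resume_ctx *ctx)
{
	ctx->overridden = false;
	if (!ctx->may_change_host)
		return;		/* ordinary hibernation: leave the clock alone */

	ctx->saved_stable = ctx->clock_stable;
	ctx->clock_stable = false;
	ctx->overridden = true;
}

/* Called once resume has finished: put back the pre-resume state. */
static void resume_end(struct resume_ctx *ctx)
{
	if (!ctx->overridden)
		return;

	ctx->clock_stable = ctx->saved_stable;
	ctx->overridden = false;
}
```

With something along these lines, a machine that cannot change hosts never
sees its clock marked unstable, and a clock that was already unstable before
hibernation stays unstable afterwards.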