linux-kernel - Re: [BUG] perf: perf sched warning possibly due to clock granularity on AMD

Message-ID: <20120207083253.GC12821@elte.hu>
Date:	Tue, 7 Feb 2012 09:32:53 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Venki Pallipadi <venki@...gle.com>
Cc:	Borislav Petkov <bp@...64.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Stephane Eranian <eranian@...gle.com>,
	linux-kernel@...r.kernel.org, acme@...hat.com,
	robert.richter@....com, eric.dumazet@...il.com,
	Andreas Herrmann <andreas.herrmann3@....com>
Subject: Re: [BUG] perf: perf sched warning possibly due to clock granularity
 on AMD


* Venki Pallipadi <venki@...gle.com> wrote:

> On Mon, Feb 6, 2012 at 12:37 PM, Borislav Petkov <bp@...64.org> wrote:
> > On Mon, Feb 06, 2012 at 09:31:33PM +0100, Peter Zijlstra wrote:
> >> On Mon, 2012-02-06 at 21:27 +0100, Borislav Petkov wrote:
> >> > On Mon, Feb 06, 2012 at 05:54:19PM +0100, Peter Zijlstra wrote:
> >> > > On Mon, 2012-02-06 at 17:46 +0100, Borislav Petkov wrote:
> >> > > > > across all CPUs in the entire system.
> >> > > >
> >> > > > Right, by the "entire system" you mean consistent across cores and
> >> > > > sockets but not necessarily across cabinets, as in the comment above,
> >> > > > correct?
> >> > > >
> >> > > > If so, let me ask around if this holds true too.
> >> > >
> >> > > Every CPU available to the kernel. So if you run a single system image
> >> > > across your cabinets, then yes those too.
> >> >
> >> > Ok, but what about that sentence "(but not across cabinets - we turn
> >> > it off in that case explicitly.)" - I don't see any place where it is
> >> > turned off explicitly... Maybe a stale comment?
> >>
> >> I suspect it might be the sched_clock_stable = 0 in mark_tsc_unstable(),
> >> but let's ask Venki, IIRC he wrote all that.
> >
> > Yeah, I was looking at the code further and on Intel it does:
> >
> >        if (c->x86_power & (1 << 8)) {
> >                set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
> >                set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
> >                if (!check_tsc_unstable())
> >                        sched_clock_stable = 1;
> >        }
> >
> > while on AMD, in early_init_amd() we do:
> >
> >        if (c->x86_power & (1 << 8)) {
> >                set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
> >                set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
> >        }
> >
> > and, bearing in mind that tsc_unstable is set on generic x86 paths,
> > nothing stops us from doing the same on AMD too and, as a result,
> > setting sched_clock_stable there as well.
> >
> > But yeah, let's see what Venki has to say first.
> >
> 
> Looks like the cabinet comment came from Ingo (commit 83ce4009) in
> reference to
>     (We will turn this off in DMI quirks for multi-chassis
>     systems)
> 
> Yes. If these two flags are set, the TSC should be consistent and
> sched_clock_stable can be set; it will be reset if there is a
> call to mark_tsc_unstable().

Most of the details have been swapped out of my brain in the
meantime, but I have a vague memory of a DMI quirk for some
high-end system that simply did a sched_clock_stable = 0 or such.

So if the common case is that the TSC is entirely synchronized
across CPUs, then we can default to that and rely on platform
initialization code or DMI quirks to mark the TSC unstable on
the few large-NUMA systems.
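
The default-stable policy described above can be sketched as a small
userspace model. This is only an illustration, not kernel code: the
names sched_clock_stable, check_tsc_unstable() and mark_tsc_unstable()
mirror the identifiers quoted in this thread, while struct cpu_model,
the FEATURE_* bits, early_init_cpu() and multi_chassis_quirk() are
hypothetical stand-ins introduced for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for X86_FEATURE_CONSTANT_TSC / _NONSTOP_TSC. */
#define FEATURE_CONSTANT_TSC (1u << 0)
#define FEATURE_NONSTOP_TSC  (1u << 1)
/* The (1 << 8) bit of x86_power that the quoted code tests. */
#define POWER_TSC_BIT        (1u << 8)

/* Hypothetical, minimal model of the per-CPU data used in the thread. */
struct cpu_model {
    unsigned int x86_power;
    unsigned int caps;
};

static bool tsc_unstable;        /* set once, never cleared */
static int  sched_clock_stable;  /* 1 = scheduler may trust the TSC */

static bool check_tsc_unstable(void)
{
    return tsc_unstable;
}

/* Marking the TSC unstable also withdraws the "stable" default. */
static void mark_tsc_unstable(const char *reason)
{
    assert(reason != NULL);
    if (!tsc_unstable) {
        tsc_unstable = true;
        sched_clock_stable = 0;
    }
}

/* Vendor-neutral early init: default to stable when the bit is set. */
static void early_init_cpu(struct cpu_model *c)
{
    if (c->x86_power & POWER_TSC_BIT) {
        c->caps |= FEATURE_CONSTANT_TSC | FEATURE_NONSTOP_TSC;
        if (!check_tsc_unstable())
            sched_clock_stable = 1;
    }
}

/* Hypothetical DMI-style quirk for a large multi-chassis system. */
static void multi_chassis_quirk(void)
{
    mark_tsc_unstable("multi-chassis system (hypothetical quirk)");
}
```

In this model, a CPU with the bit set gets sched_clock_stable = 1 by
default; once a quirk or platform code calls mark_tsc_unstable(), the
flag drops to 0 and a later early_init_cpu() call does not set it
again.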

Thanks,

	Ingo