lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 25 Oct 2022 15:57:12 +0800
From:   Feng Tang <feng.tang@...el.com>
To:     Dave Hansen <dave.hansen@...el.com>,
        Zhang Rui <rui.zhang@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H . Peter Anvin" <hpa@...or.com>,
        "Peter Zijlstra" <peterz@...radead.org>, <x86@...nel.org>
CC:     <linux-kernel@...r.kernel.org>, <tim.c.chen@...el.com>,
        Xiongfeng Wang <wangxiongfeng2@...wei.com>,
        <liaoyu15@...wei.com>, <oliver.sang@...el.com>,
        <ying.huang@...el.com>
Subject: Re: [PATCH v1 1/2] x86/tsc: use logical_package as a better
 estimation of socket numbers

On Mon, Oct 24, 2022 at 08:43:33AM -0700, Dave Hansen wrote:
> On 10/24/22 00:37, Feng Tang wrote:
> >> For instance, I can live with the implementation being a bit goofy when
> >> kernel commandlines are in play.  We can pr_info() about those cases.
> > Something like adding
> > 
> > pr_info("Watchdog for TSC is disabled for this platform while estimating
> > 	the socket number is %d, if the real socket number is bigger than
> > 	4 (may due to some tricks like 'maxcpus=' cmdline parameter, please
> > 	add 'tsc=watchdog' to cmdline as well\n", logical_packages);
> 
> That's too wishy-washy.  Also, I *KNOW* Intel has built systems with
> wonky, opaque numbers of "sockets".  Cascade Lake was a single physical
> "socket", but in all other respects (including enumeration to software)
> it acted like two logical sockets.
> 
> So, what was the "real" socket number for Cascade Lake?  If you looked
> in a chassis, you'd see one socket.  But, there were two dies in that
> socket talking to each other over UPI, so it had a system topology which
> was indistinguishable from a 2-socket system.
 
Good to know and thanks for the info.

I have to admit I haven't checked a server's internals before, and
thanks to Oliver for helping checking some Cascade Lake boxes of 0Day.

In one box where 'lscpu' shows 4 sockets (96 cores in total), it does
only have 2 physical processors in the chassis, just like you
mentioned, it has 2 dies for each processor. And in another box,
'lscpu' shows 2 sockets (44 cores in total), it also has 2 physical
processors but with much smaller size.

And fortunately the 'logical_packages' for these 2 boxes are both
the correct value: 2.

> Let's just state the facts:
> 
> 	pr_info("Disabling TSC watchdog on %d-package system.", ...)
> 
> Then, we can have a flag elsewhere to say how reliable that number is.
> A taint flag or CPU bug is probably going to far, but something like this:
> 
> bool logical_package_count_unreliable = false;
> 
> void mark_bad_package_count(char *reason)
> {
> 	if (logical_package_count_unreliable)
> 		return true;
> 
> 	pr_warn("processor package count is unreliable");
> }
> 
> Might be OK.  Then you can call mark_bad_package_count() from multiple
> sites, like the maxcpus= code.

This should work! we can just add one more check:

	boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
	!logical_package_count_unreliable &&
	logical_packages <= 2 

And when some new case leading to a imprecise 'logical_packages' is
found in future, we could just apply to this to it.

> But, like I said in the other thread, let's make sure we're agreed on
> the precise problem that we're solving before we go down this road.

Sure.

Thanks,
Feng


> > and adding a new 'tsc=watchdog' option to force watchdog on (might be
> > over-complexed?)
> 
> Agreed, I don't think that's quite warranted yet.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ