Message-ID: <397f513f-9273-76d1-a0ba-9d1d403020c5@intel.com>
Date:   Mon, 24 Oct 2022 08:43:33 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Feng Tang <feng.tang@...el.com>, Zhang Rui <rui.zhang@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>, x86@...nel.org
Cc:     linux-kernel@...r.kernel.org, tim.c.chen@...el.com,
        Xiongfeng Wang <wangxiongfeng2@...wei.com>, liaoyu15@...wei.com
Subject: Re: [PATCH v1 1/2] x86/tsc: use logical_package as a better
 estimation of socket numbers

On 10/24/22 00:37, Feng Tang wrote:
>> For instance, I can live with the implementation being a bit goofy when
>> kernel commandlines are in play.  We can pr_info() about those cases.
> Something like adding
> 
> pr_info("Watchdog for TSC is disabled for this platform while estimating
> 	the socket number is %d, if the real socket number is bigger than
> 	4 (may due to some tricks like 'maxcpus=' cmdline parameter, please
> 	add 'tsc=watchdog' to cmdline as well\n", logical_packages);

That's too wishy-washy.  Also, I *KNOW* Intel has built systems with
wonky, opaque numbers of "sockets".  Cascade Lake was a single physical
"socket", but in all other respects (including enumeration to software)
it acted like two logical sockets.

So, what was the "real" socket number for Cascade Lake?  If you looked
in a chassis, you'd see one socket.  But, there were two dies in that
socket talking to each other over UPI, so it had a system topology which
was indistinguishable from a 2-socket system.

Let's just state the facts:

	pr_info("Disabling TSC watchdog on %d-package system.\n", ...)

Then, we can have a flag elsewhere to say how reliable that number is.
A taint flag or CPU bug is probably going too far, but something like this:

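bool logical_package_count_unreliable = false;

void mark_bad_package_count(char *reason)
{
	/* Warn only once, no matter how many sites call this. */
	if (logical_package_count_unreliable)
		return;

	logical_package_count_unreliable = true;
	pr_warn("processor package count is unreliable: %s\n", reason);
}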

Might be OK.  Then you can call mark_bad_package_count() from multiple
sites, like the maxcpus= code.
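
To make the "call it from multiple sites" idea concrete, here is a rough
userspace sketch, not kernel code. The two handle_*() call sites, their
arguments and the warnings_printed counter are hypothetical and exist only
for illustration; fprintf() stands in for pr_warn(). The only point is that
the first caller sets the flag and warns, and later callers stay silent.

```c
#include <stdbool.h>
#include <stdio.h>

bool logical_package_count_unreliable = false;

/* Hypothetical counter, present only so the sketch is observable. */
int warnings_printed = 0;

void mark_bad_package_count(const char *reason)
{
	/* Warn only once, no matter how many sites call this. */
	if (logical_package_count_unreliable)
		return;

	logical_package_count_unreliable = true;
	warnings_printed++;
	/* Stand-in for pr_warn() in this userspace sketch. */
	fprintf(stderr, "processor package count is unreliable: %s\n", reason);
}

/* Hypothetical call site: a CPU limit below the number of present CPUs. */
void handle_maxcpus_param(unsigned int limit, unsigned int present_cpus)
{
	if (limit < present_cpus)
		mark_bad_package_count("maxcpus= limits the CPUs brought up");
}

/* Hypothetical second call site, to show that repeat calls stay quiet. */
void handle_nr_cpus_param(unsigned int limit, unsigned int present_cpus)
{
	if (limit < present_cpus)
		mark_bad_package_count("nr_cpus= limits the possible CPUs");
}
```

Code that consumes the package count could then check
logical_package_count_unreliable before trusting the number, while the
pr_info() itself keeps stating only the fact.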

But, like I said in the other thread, let's make sure we're agreed on
the precise problem that we're solving before we go down this road.

> and adding a new 'tsc=watchdog' option to force watchdog on (might be
> over-complexed?)

Agreed, I don't think that's quite warranted yet.
