lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZIwMstkB7CG3pDYu@feng-clx>
Date:   Fri, 16 Jun 2023 15:18:10 +0800
From:   Feng Tang <feng.tang@...el.com>
To:     Peter Zijlstra <peterz@...radead.org>
CC:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        David Woodhouse <dwmw@...zon.co.uk>,
        "Paul E . McKenney" <paulmck@...nel.org>, <x86@...nel.org>,
        <linux-kernel@...r.kernel.org>, <rui.zhang@...el.com>,
        <tim.c.chen@...el.com>
Subject: Re: [Patch v2 2/2] x86/tsc: use logical_packages as a better
 estimation of socket numbers

On Thu, Jun 15, 2023 at 11:20:21AM +0200, Peter Zijlstra wrote:
> On Tue, Jun 13, 2023 at 01:25:23PM +0800, Feng Tang wrote:
> > Commit b50db7095fe0 ("x86/tsc: Disable clocksource watchdog for TSC
> > on qualified platorms") was introduced to solve problem that
> > sometimes TSC clocksource is wrongly judged as unstable by watchdog
> > like 'jiffies', HPET, etc.
> > 
> > In it, the hardware socket number is a key factor for judging whether
> > to disable the watchdog for TSC, and 'nr_online_nodes' was chosen as
> > an estimation due to it is needed in early boot phase before
> > registering 'tsc-early' clocksource, where all none-boot CPUs are not
> > brought up yet.
> > 
> > In recent patch review, Dave and Rui pointed out there are many cases
> > in which 'nr_online_nodes' is not accurate, like:
> > 
> > * numa emulation (numa=fake=4 etc.)
> > * numa=off
> > * platforms with CPU-less HBM nodes, CPU-less Optane memory nodes.
> > * SNC (sub-numa cluster) mode enabled
> > * 'nr_cpus=' and 'maxcpus=' kernel cmdline parameter setup
> > 
> > Peter Zijlstra suggested 'logical_packages', and it seems to be the
> > best option we have as it covers all the cases above except the
> > last one. But it is only usable after smp_init() and all CPUs are
> > brought up, while 'tsc-early' is initialized before that.
> > 
> > One solution is to skip the watchdog for 'tsc-early' clocksource,
> > and move the check after smp_init(), while before 'tsc' clocksource
> > is registered, where 'logical_packages' could be used.
> > 
> > For 'nr_cpus' and 'maxcpus' cmdline case, the global flag
> > 'package_count_unreliable' will be set to true and the watchdog
> > will be kept as is.
> 
> So I have at least two machines where I boot with 'possible_cpus=#'
> because the BIOS MADT is reporting a stupid number of CPUs that aren't
> actually there.
> 
> So I think I'm lucky and side-stepped this nonsense, but if someone were
> to use nr_cpus= for this same purpose, they get screwed over and get the
> watchdog. Sad day for them I suppose.

Thanks for sharing the info! Now I know one reason why those cmdline
parameters were created.

> I realize there might not be a perfect solution,

Yes. Rui is working on a MADT based parsing which may take a while
before being stable, given all kinds of fancy firmware out there.

> but I'm also struggling
> to see the value of the whole package_count_unreliable thing.

'possible_cpus' and 'nr_cpus' parameters are a little different from
"maxcpus' that they limit the possible cpus during boot, and after
boot user has no ways to online other CPUs beyond that limit.

But for 'maxcpus' case, user can online a small number of CPU during
boot and onlined all CPUs later on, which has a possible under
estimation issue for package number, while the above 2 don't have. 

So how about only keeping the 'package_count_unreliable' check for
'maxcpus' case? 

Thanks,
Feng




Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ