Message-ID: <2687702.9iZYToFQE1@tjmaciei-mobl5>
Date:   Sat, 12 Nov 2022 18:35:23 -0800
From:   Thiago Macieira <thiago.macieira@...el.com>
To:     Borislav Petkov <bp@...en8.de>, "Luck, Tony" <tony.luck@...el.com>
Cc:     "Joseph, Jithu" <jithu.joseph@...el.com>,
        "hdegoede@...hat.com" <hdegoede@...hat.com>,
        "markgross@...nel.org" <markgross@...nel.org>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "x86@...nel.org" <x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "platform-driver-x86@...r.kernel.org" 
        <platform-driver-x86@...r.kernel.org>,
        "patches@...ts.linux.dev" <patches@...ts.linux.dev>,
        "Shankar, Ravi V" <ravi.v.shankar@...el.com>,
        "Jimenez Gonzalez, Athenas" <athenas.jimenez.gonzalez@...el.com>,
        "Mehta, Sohil" <sohil.mehta@...el.com>
Subject: Re: [PATCH v2 12/14] platform/x86/intel/ifs: Add current_batch sysfs entry

On Saturday, 12 November 2022 15:32:47 PST Luck, Tony wrote:
> > Because if this is going to be run during downtime, as Thiago says, then
> > you can just as well use debugfs for this. And then there's no need to
> > cast any API in stone and so on.
> 
> Did Thiago say “during downtime”? I think
> he talked about some users' opportunistic
> use of scan tests. But that’s far from only
> during downtime. We fully expect CSPs to
> run these scans periodically on production
> machines.

Let me clarify. I did not mean full system downtime for maintenance, but I did 
mean that there's a gap in consumer workload, for both threads of one or more 
cores. As Tony said, it should have little observable effect on any other core, 
meaning an IFS run can be scheduled *as* any other workload (albeit a 
privileged one) for a subset of the machine, while the rest of the system 
remains in production. This gives CSPs a lot of flexibility and is the reason 
I am talking about containers, with the implied constraint that the 
container's view of the filesystem is narrower than the kernel's.

There'll be some coordination required to get all cores to have run all tests, 
but it should be doable over a period of time, and I'm thinking days, not 
years. That should still be short enough for the scans to catch a defect or 
wear-out before any real workload is impacted by it.
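
As a rough illustration of that coordination (not part of the IFS driver or 
this patch series), the sketch below splits the cores into successive scan 
windows, given how many cores can be idled at once. The function name and the 
idle_per_window budget are hypothetical; a real orchestrator would take that 
budget from its own scheduler.

```python
from itertools import islice


def plan_scan_windows(cores, idle_per_window):
    """Split cores into successive scan windows.

    cores: iterable of core IDs that still need a scan.
    idle_per_window: how many cores can be taken out of the consumer
    workload at the same time (hypothetical scheduler budget).
    Returns a list of lists; each inner list is one window's cores.
    """
    if idle_per_window < 1:
        raise ValueError("need at least one idle core per window")
    it = iter(cores)
    windows = []
    while True:
        # Take the next group of cores that fits in one idle gap.
        batch = list(islice(it, idle_per_window))
        if not batch:
            break
        windows.append(batch)
    return windows


# Example: 8 cores, at most 3 idle at a time.
plan = plan_scan_windows(range(8), idle_per_window=3)
```

Every core appears in exactly one window, so the whole machine is covered 
once all windows have run, while the cores outside the current window stay 
in production.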

If an issue is detected, the admin can decide whether to offline the core(s) 
reporting problems but keep the rest serving workloads and generating revenue, 
or offline the entire machine for full maintenance and to run more invasive and 
time-consuming tests.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Cloud Software Architect - Intel DCAI Cloud Engineering


