lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20191019081001.GA5571@zn.tnic>
Date:   Sat, 19 Oct 2019 10:10:53 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     "Luck, Tony" <tony.luck@...el.com>
Cc:     Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "hpa@...or.com" <hpa@...or.com>,
        "bberg@...hat.com" <bberg@...hat.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "hdegoede@...hat.com" <hdegoede@...hat.com>,
        "ckellner@...hat.com" <ckellner@...hat.com>
Subject: Re: [PATCH 1/2] x86, mce, therm_throt: Optimize logging of thermal
 throttle messages

On Fri, Oct 18, 2019 at 01:38:32PM -0700, Luck, Tony wrote:
> Sorry to have caused confusion.

Ditto. But us causing confusion is fine - this way we can talk about
what we really wanna do!

:-)))

> The thoughts behind that statement are that we currently have an issue
> with too many noisy high severity messages. The interim solution we
> are going with is to downgrade the severity. But if we apply a time
> based filter to remove most of the noise by not printing at all, maybe
> what we have left is a very small number of high severity messages.
>
> But that's completely up for debate.

Well, I think those messages being pr_warn are fine if one wants to
inspect dmesg for signs of overheating and the platform is hitting some
thermal limits.

And if the time-based filter is not too accurate, that's fine too, I
guess, as long as we don't flood dmesg.

What I don't like is the command line parameter and us putting the onus
on the user to decide although we have all that info in the kernel
already and we can do that decision automatically.

> I agree it is a good thing to look at. I'm not so sure we will find
> a good enough method that works all the way from tablet to server,
> so we might end up with "#define MAX_THERM_TIME 8000" ... but some
> study of options would either turn up a good heuristic, or provide
> evidence for why that is either hard, or no better than a constant.

Yeah, I still think a simple avg filter which starts from a sufficiently
high value and improves it over time, should be good enough.

Hell, even the trivial formula we use in the CMCI interrupt for polling,
might work, where we either double the interval or halve it, depending
on recent history.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ