Message-ID: <ea5621698508a800cea59b5533f8845b9f0befc6.camel@intel.com>
Date:   Mon, 2 Aug 2021 11:37:30 +0000
From:   "Winiarska, Iwona" <iwona.winiarska@...el.com>
To:     "zweiss@...inix.com" <zweiss@...inix.com>
CC:     "corbet@....net" <corbet@....net>,
        "jae.hyun.yoo@...ux.intel.com" <jae.hyun.yoo@...ux.intel.com>,
        "Lutomirski, Andy" <luto@...nel.org>,
        "linux-hwmon@...r.kernel.org" <linux-hwmon@...r.kernel.org>,
        "Luck, Tony" <tony.luck@...el.com>,
        "andrew@...id.au" <andrew@...id.au>,
        "mchehab@...nel.org" <mchehab@...nel.org>,
        "jdelvare@...e.com" <jdelvare@...e.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "linux@...ck-us.net" <linux@...ck-us.net>,
        "linux-aspeed@...ts.ozlabs.org" <linux-aspeed@...ts.ozlabs.org>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "yazen.ghannam@....com" <yazen.ghannam@....com>,
        "robh+dt@...nel.org" <robh+dt@...nel.org>,
        "openbmc@...ts.ozlabs.org" <openbmc@...ts.ozlabs.org>,
        "bp@...en8.de" <bp@...en8.de>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "pierre-louis.bossart@...ux.intel.com" 
        <pierre-louis.bossart@...ux.intel.com>,
        "andriy.shevchenko@...ux.intel.com" 
        <andriy.shevchenko@...ux.intel.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
Subject: Re: [PATCH 13/14] docs: hwmon: Document PECI drivers

On Tue, 2021-07-27 at 22:58 +0000, Zev Weiss wrote:
> On Mon, Jul 12, 2021 at 05:04:46PM CDT, Iwona Winiarska wrote:
> > From: Jae Hyun Yoo <jae.hyun.yoo@...ux.intel.com>
> > 
> > Add documentation for peci-cputemp driver that provides DTS thermal
> > readings for CPU packages and CPU cores and peci-dimmtemp driver that
> > provides DTS thermal readings for DIMMs.
> > 
> > Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@...ux.intel.com>
> > Co-developed-by: Iwona Winiarska <iwona.winiarska@...el.com>
> > Signed-off-by: Iwona Winiarska <iwona.winiarska@...el.com>
> > Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>
> > ---
> > Documentation/hwmon/index.rst         |  2 +
> > Documentation/hwmon/peci-cputemp.rst  | 93 +++++++++++++++++++++++++++
> > Documentation/hwmon/peci-dimmtemp.rst | 58 +++++++++++++++++
> > MAINTAINERS                           |  2 +
> > 4 files changed, 155 insertions(+)
> > create mode 100644 Documentation/hwmon/peci-cputemp.rst
> > create mode 100644 Documentation/hwmon/peci-dimmtemp.rst
> > 
> > diff --git a/Documentation/hwmon/index.rst b/Documentation/hwmon/index.rst
> > index bc01601ea81a..cc76b5b3f791 100644
> > --- a/Documentation/hwmon/index.rst
> > +++ b/Documentation/hwmon/index.rst
> > @@ -154,6 +154,8 @@ Hardware Monitoring Kernel Drivers
> >    pcf8591
> >    pim4328
> >    pm6764tr
> > +   peci-cputemp
> > +   peci-dimmtemp
> >    pmbus
> >    powr1220
> >    pxe1610
> > diff --git a/Documentation/hwmon/peci-cputemp.rst b/Documentation/hwmon/peci-cputemp.rst
> > new file mode 100644
> > index 000000000000..d3a218ba810a
> > --- /dev/null
> > +++ b/Documentation/hwmon/peci-cputemp.rst
> > @@ -0,0 +1,93 @@
> > +.. SPDX-License-Identifier: GPL-2.0-only
> > +
> > +Kernel driver peci-cputemp
> > +==========================
> > +
> > +Supported chips:
> > +       One of Intel server CPUs listed below which is connected to a PECI bus.
> > +               * Intel Xeon E5/E7 v3 server processors
> > +                       Intel Xeon E5-14xx v3 family
> > +                       Intel Xeon E5-24xx v3 family
> > +                       Intel Xeon E5-16xx v3 family
> > +                       Intel Xeon E5-26xx v3 family
> > +                       Intel Xeon E5-46xx v3 family
> > +                       Intel Xeon E7-48xx v3 family
> > +                       Intel Xeon E7-88xx v3 family
> > +               * Intel Xeon E5/E7 v4 server processors
> > +                       Intel Xeon E5-16xx v4 family
> > +                       Intel Xeon E5-26xx v4 family
> > +                       Intel Xeon E5-46xx v4 family
> > +                       Intel Xeon E7-48xx v4 family
> > +                       Intel Xeon E7-88xx v4 family
> > +               * Intel Xeon Scalable server processors
> > +                       Intel Xeon D family
> > +                       Intel Xeon Bronze family
> > +                       Intel Xeon Silver family
> > +                       Intel Xeon Gold family
> > +                       Intel Xeon Platinum family
> > +
> > +       Datasheet: Available from http://www.intel.com/design/literature.htm
> > +
> > +Author: Jae Hyun Yoo <jae.hyun.yoo@...ux.intel.com>
> > +
> > +Description
> > +-----------
> > +
> > +This driver implements a generic PECI hwmon feature which provides Digital
> > +Thermal Sensor (DTS) thermal readings of the CPU package and CPU cores that are
> > +accessible via the processor PECI interface.
> > +
> > +All temperature values are given in millidegree Celsius and will be measurable
> > +only when the target CPU is powered on.
> > +
> > +Sysfs interface
> > +-------------------
> > +
> > +======================= =======================================================
> > +temp1_label            "Die"
> > +temp1_input            Provides current die temperature of the CPU package.
> > +temp1_max              Provides thermal control temperature of the CPU package
> > +                       which is also known as Tcontrol.
> > +temp1_crit             Provides shutdown temperature of the CPU package which
> > +                       is also known as the maximum processor junction
> > +                       temperature, Tjmax or Tprochot.
> > +temp1_crit_hyst                Provides the hysteresis value from Tcontrol to Tjmax of
> > +                       the CPU package.
> > +
> > +temp2_label            "DTS"
> > +temp2_input            Provides current DTS temperature of the CPU package.
> 
> Would this be a good place to note the slightly counter-intuitive nature
> of DTS readings?  i.e. add something along the lines of "The DTS sensor
> produces a delta relative to Tjmax, so negative values are normal and
> values approaching zero are hot."  (In my experience people who aren't
> already familiar with it tend to think something's wrong when a CPU
> temperature reading shows -50C.)

I believe what you're referring to is the result of "GetTemp", which we use to
calculate the "Die" sensor values (temp1).
The sensor value is absolute - we don't expose the "raw" thermal sensor value
(the delta) anywhere.

The DTS sensor exposes a temperature value scaled to fit the DTS 2.0 thermal
profile:
https://www.intel.com/content/www/us/en/processors/xeon/scalable/xeon-scalable-thermal-guide.html
(section 5.2.3.2)

Like the "Die" sensor, it's also exposed in absolute form.

I'll try to change the description to avoid confusion.
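
As a rough illustration of the delta-versus-absolute distinction discussed
above, here is a minimal sketch. It is not the driver code. The function names
are hypothetical, the 100 C Tjmax is an example value only, and the raw reading
is assumed to be a 16-bit two's-complement number with 6 fractional bits
(1/64 C steps) expressed as an offset from Tjmax.

```python
def fixed_10_6_to_millidegree(raw: int) -> int:
    """Convert an assumed 16-bit two's-complement 10.6 fixed-point value
    (1/64 degree C per step) to millidegrees Celsius (floor-rounded)."""
    raw &= 0xFFFF                # keep only the low 16 bits
    if raw & 0x8000:             # sign bit set: value is negative
        raw -= 0x10000
    return raw * 1000 // 64


def absolute_die_temp(raw_delta: int, tjmax_millidegree: int) -> int:
    """Turn a raw delta-from-Tjmax reading into an absolute temperature."""
    return tjmax_millidegree + fixed_10_6_to_millidegree(raw_delta)


# Example: a raw delta of -50 C with an assumed Tjmax of 100 C.
raw = (-50 * 64) & 0xFFFF        # 0xF380
print(absolute_die_temp(raw, 100_000))  # prints 50000 (i.e. 50 C)
```

With this kind of conversion, userspace sees 50000 rather than -50000, which
is why negative readings should not appear in the sysfs attributes.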

> 
> > +temp2_max              Provides thermal control temperature of the CPU package
> > +                       which is also known as Tcontrol.
> > +temp2_crit             Provides shutdown temperature of the CPU package which
> > +                       is also known as the maximum processor junction
> > +                       temperature, Tjmax or Tprochot.
> > +temp2_crit_hyst                Provides the hysteresis value from Tcontrol to Tjmax of
> > +                       the CPU package.
> > +
> > +temp3_label            "Tcontrol"
> > +temp3_input            Provides current Tcontrol temperature of the CPU
> > +                       package which is also known as Fan Temperature target.
> > +                       Indicates the relative value from thermal monitor trip
> > +                       temperature at which fans should be engaged.
> > +temp3_crit             Provides Tcontrol critical value of the CPU package
> > +                       which is same to Tjmax.
> > +
> > +temp4_label            "Tthrottle"
> > +temp4_input            Provides current Tthrottle temperature of the CPU
> > +                       package. Used for throttling temperature. If this value
> > +                       is allowed and lower than Tjmax - the throttle will
> > +                       occur and reported at lower than Tjmax.
> > +
> > +temp5_label            "Tjmax"
> > +temp5_input            Provides the maximum junction temperature, Tjmax of the
> > +                       CPU package.
> > +
> > +temp[6-N]_label                Provides string "Core X", where X is resolved core
> > +                       number.
> > +temp[6-N]_input                Provides current temperature of each core.
> > +temp[6-N]_max          Provides thermal control temperature of the core.
> > +temp[6-N]_crit         Provides shutdown temperature of the core.
> > +temp[6-N]_crit_hyst    Provides the hysteresis value from Tcontrol to Tjmax of
> > +                       the core.
> 
> I only see *_label and *_input for the per-core temperature sensors, no
> *_max, *_crit, or *_crit_hyst.

You're right - this should be removed from the documentation.

> 
> > +
> > +======================= =======================================================
> > diff --git a/Documentation/hwmon/peci-dimmtemp.rst b/Documentation/hwmon/peci-dimmtemp.rst
> > new file mode 100644
> > index 000000000000..1778d9317e43
> > --- /dev/null
> > +++ b/Documentation/hwmon/peci-dimmtemp.rst
> > @@ -0,0 +1,58 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +Kernel driver peci-dimmtemp
> > +===========================
> > +
> > +Supported chips:
> > +       One of Intel server CPUs listed below which is connected to a PECI bus.
> > +               * Intel Xeon E5/E7 v3 server processors
> > +                       Intel Xeon E5-14xx v3 family
> > +                       Intel Xeon E5-24xx v3 family
> > +                       Intel Xeon E5-16xx v3 family
> > +                       Intel Xeon E5-26xx v3 family
> > +                       Intel Xeon E5-46xx v3 family
> > +                       Intel Xeon E7-48xx v3 family
> > +                       Intel Xeon E7-88xx v3 family
> > +               * Intel Xeon E5/E7 v4 server processors
> > +                       Intel Xeon E5-16xx v4 family
> > +                       Intel Xeon E5-26xx v4 family
> > +                       Intel Xeon E5-46xx v4 family
> > +                       Intel Xeon E7-48xx v4 family
> > +                       Intel Xeon E7-88xx v4 family
> > +               * Intel Xeon Scalable server processors
> > +                       Intel Xeon D family
> > +                       Intel Xeon Bronze family
> > +                       Intel Xeon Silver family
> > +                       Intel Xeon Gold family
> > +                       Intel Xeon Platinum family
> > +
> > +       Datasheet: Available from http://www.intel.com/design/literature.htm
> > +
> > +Author: Jae Hyun Yoo <jae.hyun.yoo@...ux.intel.com>
> > +
> > +Description
> > +-----------
> > +
> > +This driver implements a generic PECI hwmon feature which provides Digital
> > +Thermal Sensor (DTS) thermal readings of DIMM components that are accessible
> > +via the processor PECI interface.
> 
> I had thought "DTS" referred to a fairly specific sensor in the CPU; is
> the same term also used for DIMM temp sensors or is the mention of it
> here a copy/paste error?

Yeah - it should be "Temperature Sensor on DIMM".

Thanks
-Iwona

> 
> > +
> > +All temperature values are given in millidegree Celsius and will be measurable
> > +only when the target CPU is powered on.
> > +
> > +Sysfs interface
> > +-------------------
> > +
> > +======================= =======================================================
> > +
> > +temp[N]_label          Provides string "DIMM CI", where C is DIMM channel and
> > +                       I is DIMM index of the populated DIMM.
> > +temp[N]_input          Provides current temperature of the populated DIMM.
> > +temp[N]_max            Provides thermal control temperature of the DIMM.
> > +temp[N]_crit           Provides shutdown temperature of the DIMM.
> > +
> > +======================= =======================================================
> > +
> > +Note:
> > +       DIMM temperature attributes will appear when the client CPU's BIOS
> > +       completes memory training and testing.
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 35ba9e3646bd..d16da127bbdc 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -14509,6 +14509,8 @@ M:      Iwona Winiarska <iwona.winiarska@...el.com>
> > R:      Jae Hyun Yoo <jae.hyun.yoo@...ux.intel.com>
> > L:      linux-hwmon@...r.kernel.org
> > S:      Supported
> > +F:     Documentation/hwmon/peci-cputemp.rst
> > +F:     Documentation/hwmon/peci-dimmtemp.rst
> > F:      drivers/hwmon/peci/
> > 
> > PECI SUBSYSTEM
> > -- 
> > 2.31.1
