Message-ID: <127a93b0-647f-bb0c-2bf4-649fc4d1f25e@linux.intel.com>
Date: Wed, 19 Mar 2025 16:01:29 +0200 (EET)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Mario Limonciello <superm1@...nel.org>
cc: Hans de Goede <hdegoede@...hat.com>, 
    Mario Limonciello <mario.limonciello@....com>, 
    Perry Yuan <perry.yuan@....com>, Thomas Gleixner <tglx@...utronix.de>, 
    Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, 
    Dave Hansen <dave.hansen@...ux.intel.com>, 
    "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>, 
    "H . Peter Anvin" <hpa@...or.com>, Jonathan Corbet <corbet@....net>, 
    Huang Rui <ray.huang@....com>, 
    "Gautham R . Shenoy" <gautham.shenoy@....com>, 
    "Rafael J . Wysocki" <rafael@...nel.org>, 
    Viresh Kumar <viresh.kumar@...aro.org>, 
    "open list:AMD HETERO CORE HARDWARE FEEDBACK DRIVER" <platform-driver-x86@...r.kernel.org>, 
    "open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <linux-kernel@...r.kernel.org>, 
    "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>, 
    "open list:AMD PSTATE DRIVER" <linux-pm@...r.kernel.org>, 
    Perry Yuan <Perry.Yuan@....com>, Bagas Sanjaya <bagasdotme@...il.com>
Subject: Re: [PATCH v8 01/13] Documentation: x86: Add AMD Hardware Feedback
 Interface documentation

On Tue, 18 Feb 2025, Mario Limonciello wrote:

> From: Perry Yuan <Perry.Yuan@....com>
> 
> Introduce a new documentation file, `amd_hfi.rst`, which delves into the
> implementation details of the AMD Hardware Feedback Interface and its
> associated driver, `amd_hfi`. This documentation describes how the
> driver provides hints to the OS scheduler based on core performance
> and efficiency ranking data.
> 
> This documentation describes
> * The design of the driver
> * How the driver provides hints to the OS scheduling
> * How the driver interfaces with the kernel for efficiency ranking data.
> 
> Reviewed-by: Bagas Sanjaya <bagasdotme@...il.com>
> Signed-off-by: Perry Yuan <Perry.Yuan@....com>
> Reviewed-by: Mario Limonciello <mario.limonciello@....com>
> Signed-off-by: Mario Limonciello <mario.limonciello@....com>
> ---
>  Documentation/arch/x86/amd-hfi.rst | 127 +++++++++++++++++++++++++++++
>  Documentation/arch/x86/index.rst   |   1 +
>  2 files changed, 128 insertions(+)
>  create mode 100644 Documentation/arch/x86/amd-hfi.rst
> 
> diff --git a/Documentation/arch/x86/amd-hfi.rst b/Documentation/arch/x86/amd-hfi.rst
> new file mode 100644
> index 0000000000000..5d204688470e3
> --- /dev/null
> +++ b/Documentation/arch/x86/amd-hfi.rst
> @@ -0,0 +1,127 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +======================================================================
> +Hardware Feedback Interface For Hetero Core Scheduling On AMD Platform
> +======================================================================
> +
> +:Copyright: 2024 Advanced Micro Devices, Inc. All Rights Reserved.
> +
> +:Author: Perry Yuan <perry.yuan@....com>
> +:Author: Mario Limonciello <mario.limonciello@....com>
> +
> +Overview
> +--------
> +
> +AMD Heterogeneous Core implementations are comprised of more than one
> +architectural class and CPUs are comprised of cores of various efficiency and
> +power capabilities: performance-oriented *classic cores* and power-efficient
> +*dense cores*. As such, power management strategies must be designed to
> +accommodate the complexities introduced by incorporating different core types.
> +Heterogeneous systems can also extend to more than two architectural classes as
> +well. The purpose of the scheduling feedback mechanism is to provide
> +information to the operating system scheduler in real time such that the
> +scheduler can direct threads to the optimal core.
> +
> +The goal of AMD's heterogeneous architecture is to attain power benefit by sending
> +background threads to the dense cores while sending high priority threads to the classic
> +cores. From a performance perspective, sending background threads to dense cores can free
> +up power headroom and allow the classic cores to optimally service demanding threads.
> +Furthermore, the area optimized nature of the dense cores allows for an increasing
> +number of physical cores. This improved core density will have positive multithreaded
> +performance impact.

Hi Mario,

Please fold these paragraphs to 80 characters so that they're easier to 
read as textfiles (the table can obviously exceed that but there should be 
no reason for the text paragraphs to have excessively long lines).

My apologies for taking so long to get to review this series. Most of my 
comments are quite minor but there's also 1-2 things that seem more 
important. It seemed to me that there is some disconnection between the 
promises made in the Kconfig description and what is provided by the patch 
series.

--
 i.

> +
> +AMD Heterogeneous Core Driver
> +-----------------------------
> +
> +The ``amd_hfi`` driver delivers performance and energy efficiency capability data
> +for each CPU in the system to the operating system. The scheduler can use the ranking
> +data from the HFI driver to make task placement decisions.
> +
> +Thread Classification and Ranking Table Interaction
> +----------------------------------------------------
> +
> +The thread classification is used to index into a ranking table that describes
> +an efficiency and performance ranking for each classification.
> +
> +Threads are classified during runtime into enumerated classes. The classes represent
> +thread performance/power characteristics that may benefit from special scheduling behaviors.
> +The below table depicts an example of thread classification and a preference where a given thread
> +should be scheduled based on its thread class. The real time thread classification is consumed
> +by the operating system and is used to inform the scheduler of where the thread should be placed.
> +
> +Thread Classification Example Table
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> ++----------+----------------+-------------------------------+---------------------+---------+
> +| class ID | Classification | Preferred scheduling behavior | Preemption priority | Counter |
> ++----------+----------------+-------------------------------+---------------------+---------+
> +| 0        | Default        | Performant                    | Highest             |         |
> ++----------+----------------+-------------------------------+---------------------+---------+
> +| 1        | Non-scalable   | Efficient                     | Lowest              | PMCx1A1 |
> ++----------+----------------+-------------------------------+---------------------+---------+
> +| 2        | I/O bound      | Efficient                     | Lowest              | PMCx044 |
> ++----------+----------------+-------------------------------+---------------------+---------+
> +
> +Thread classification is performed by the hardware each time that the thread is switched out.
> +Threads that don't meet any hardware specified criteria will be classified as "default".
> +
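
A minimal sketch of how the example table above could be consumed, written
in Python for brevity. The names ThreadClass, PREFERRED_BEHAVIOR and
preferred_behavior() are hypothetical and not part of the driver, and
falling back to the default class for an unknown ID is an assumption made
only for this illustration:

```python
from enum import IntEnum


class ThreadClass(IntEnum):
    """Class IDs taken from the example table in the quoted document."""
    DEFAULT = 0
    NON_SCALABLE = 1
    IO_BOUND = 2


# Preferred scheduling behavior per class, copied from the example table.
PREFERRED_BEHAVIOR = {
    ThreadClass.DEFAULT: "performant",
    ThreadClass.NON_SCALABLE: "efficient",
    ThreadClass.IO_BOUND: "efficient",
}


def preferred_behavior(class_id: int) -> str:
    """Return the preferred behavior for a class ID.

    Assumption: IDs outside the example table are treated as the
    default class.
    """
    try:
        cls = ThreadClass(class_id)
    except ValueError:
        cls = ThreadClass.DEFAULT
    return PREFERRED_BEHAVIOR[cls]
```

With this mapping, class 2 (I/O bound) resolves to "efficient" and class 0
resolves to "performant", matching the table.
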
> +AMD Hardware Feedback Interface
> +--------------------------------
> +
> +The Hardware Feedback Interface provides to the operating system information
> +about the performance and energy efficiency of each CPU in the system. Each
> +capability is given as a unit-less quantity in the range [0-255]. A higher
> +performance value indicates higher performance capability, and a higher
> +efficiency value indicates more efficiency. Energy efficiency and performance
> +are reported in separate capabilities in the shared memory based ranking table.
> +
> +These capabilities may change at runtime as a result of changes in the
> +operating conditions of the system or the action of external factors.
> +Power Management FW is responsible for detecting events that would require
> +a reordering of the performance and efficiency ranking. Table updates would
> +happen relatively infrequently and occur on the time scale of seconds or more.
> +
> +The following events trigger a table update:
> +    * Thermal Stress Events
> +    * Silent Compute
> +    * Extreme Low Battery Scenarios
> +
> +The kernel or a userspace policy daemon can use these capabilities to modify
> +task placement decisions. For instance, if either the performance or energy
> +capabilities of a given logical processor become zero, it is an indication that
> +the hardware recommends to the operating system to not schedule any tasks on
> +that processor for performance or energy efficiency reasons, respectively.
> +
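
To illustrate the zero-capability rule described above, here is a small
sketch in Python. Capability and pick_cpu() are hypothetical names, and
picking the single highest value is only one possible policy, not
something the document or the driver prescribes:

```python
from typing import Dict, NamedTuple, Optional


class Capability(NamedTuple):
    perf: int  # unit-less performance capability, 0..255
    eff: int   # unit-less energy efficiency capability, 0..255


def pick_cpu(caps: Dict[int, Capability], want: str) -> Optional[int]:
    """Pick the CPU with the highest 'perf' or 'eff' capability.

    A CPU whose requested capability is 0 is skipped, since a zero
    value means the hardware recommends not scheduling there for
    that reason. Returns None if no CPU qualifies.
    """
    candidates = {
        cpu: getattr(cap, want)
        for cpu, cap in caps.items()
        if getattr(cap, want) > 0
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)
```

For example, a CPU reporting perf=0 is never returned by
pick_cpu(caps, "perf"), even if it is the only CPU in the map.
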
> +Implementation details for Linux
> +--------------------------------
> +
> +The implementation of thread scheduling consists of the following steps:
> +
> +1. A thread is spawned and scheduled to the ideal core using the default
> +   heterogeneous scheduling policy.
> +2. The processor profiles thread execution and assigns an enumerated classification ID.
> +   This classification is communicated to the OS via a logical-processor-scope MSR.
> +3. During the thread context switch out, the operating system consumes the workload (WL)
> +   classification, which resides in a logical-processor-scope MSR.
> +4. The OS triggers the hardware to clear its history by writing to an MSR,
> +   after consuming the WL classification and before switching in the new thread.
> +5. If due to the classification, ranking table, and processor availability,
> +   the thread is not on its ideal processor, the OS will then consider scheduling
> +   the thread on its ideal processor (if available).
> +
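
Steps 3 to 5 above can be sketched as a toy model in Python. Everything
here is hypothetical: FakeCpu stands in for a logical processor and its
classification MSR, the ranking layout is a plain dictionary rather than
the real shared memory format, resetting the value to 0 on clear is a
modeling assumption, and processor availability is ignored:

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# ranking[cpu][class_id] -> (perf, eff), each in the range 0..255
Ranking = Dict[int, Dict[int, Tuple[int, int]]]

# Classes whose preference in the example table is "Efficient".
EFFICIENT_CLASSES = {1, 2}


@dataclass
class FakeCpu:
    cpu_id: int
    class_msr: int = 0  # stand-in for the per-logical-processor MSR

    def read_classification(self) -> int:
        return self.class_msr

    def clear_history(self) -> None:
        # Assumption: model "clear history" as resetting to class 0.
        self.class_msr = 0


def ideal_cpu(ranking: Ranking, class_id: int) -> int:
    """Return the CPU ranked highest for this class."""
    idx = 1 if class_id in EFFICIENT_CLASSES else 0  # eff or perf column
    return max(ranking, key=lambda cpu: ranking[cpu][class_id][idx])


def on_switch_out(cpu: FakeCpu, ranking: Ranking) -> Optional[int]:
    """Return a migration target, or None if the thread is well placed."""
    class_id = cpu.read_classification()   # step 3: consume classification
    cpu.clear_history()                    # step 4: clear hardware history
    target = ideal_cpu(ranking, class_id)  # step 5: consult ranking table
    return None if target == cpu.cpu_id else target
```

In this model a thread classified as class 1 on a performance-ranked CPU
yields the efficiency-ranked CPU as a candidate target, while a class 0
thread already on the top performance CPU yields None.
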
> +Ranking Table
> +-------------
> +The ranking table is a shared memory region that is used to communicate the
> +performance and energy efficiency capabilities of each CPU in the system.
> +
> +The ranking table design includes rankings for each APIC ID in the system and
> +rankings both for performance and efficiency for each workload classification.
> +
> +.. kernel-doc:: drivers/platform/x86/amd/hfi/hfi.c
> +   :doc: amd_shmem_info
> +
> +Ranking Table update
> +---------------------------
> +The power management firmware issues a platform interrupt after updating the ranking
> +table, once it is ready for the operating system to consume. CPUs receive this interrupt
> +and read the new ranking table from the shared memory region provided by the PCCT table; the
> +``amd_hfi`` driver then parses the new table to provide updated data for scheduling decisions.
> diff --git a/Documentation/arch/x86/index.rst b/Documentation/arch/x86/index.rst
> index 8ac64d7de4dc9..56f2923f52597 100644
> --- a/Documentation/arch/x86/index.rst
> +++ b/Documentation/arch/x86/index.rst
> @@ -43,3 +43,4 @@ x86-specific Documentation
>     features
>     elf_auxvec
>     xstate
> +   amd-hfi
> 

-- 
 i.

