Message-ID: <CAA5xa-kH0eyP16DocXyy2=VdVQCBa6d+PRJbcHANHdpk3QnFJg@mail.gmail.com>
Date: Mon, 14 Mar 2022 08:24:02 +0100
From: Henry Tseng <henrybear327@...il.com>
To: Jui-Tse Huang <juitse.huang@...il.com>
Cc: Jonathan Corbet <corbet@....net>,
Peter Zijlstra <peterz@...radead.org>,
Valentin Schneider <valentin.schneider@....com>,
Mauro Carvalho Chehab <mchehab+huawei@...nel.org>,
Huaixin Chang <changhuaixin@...ux.alibaba.com>,
Beata Michalska <beata.michalska@....com>,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
Ching-Chun Huang <jserv@...s.ncku.edu.tw>,
Yiwei Lin <s921975628@...il.com>
Subject: Re: [PATCH] docs/scheduler: Introduce the doc of load average
I believe there is a typo in the document; see my inline comment below.
On Mon, Mar 14, 2022 at 7:03 AM Jui-Tse Huang <juitse.huang@...il.com> wrote:
>
> Load average is one of the most common and easily observed statistics provided
> by Linux, yet it is not well documented, so the numbers that users see in the
> output of top, htop or other system monitoring applications are hard to
> interpret. This patch documents how Linux calculates the load average and
> what is taken into account in the calculation.
>
> The discussion is divided into several parts:
> 1. The expression used to compute the load average.
> 2. Why Linux chose this averaging method over the alternatives.
> 3. The meaning of each term in the expression.
> 4. The types of tasks covered by the calculation.
> 5. A brief explanation of fixed-point numbers, since the weights defined in
> the Linux kernel are based on them.
>
> Signed-off-by: Jui-Tse Huang <juitse.huang@...il.com>
> Signed-off-by: Yiwei Lin <s921975628@...il.com>
> Co-Developed-by: Yiwei Lin <s921975628@...il.com>
>
> ---
> Documentation/scheduler/index.rst | 1 +
> Documentation/scheduler/load-average.rst | 77 ++++++++++++++++++++++++
> 2 files changed, 78 insertions(+)
> create mode 100644 Documentation/scheduler/load-average.rst
>
> diff --git a/Documentation/scheduler/index.rst b/Documentation/scheduler/index.rst
> index 88900aabdbf7..bdc779b4190f 100644
> --- a/Documentation/scheduler/index.rst
> +++ b/Documentation/scheduler/index.rst
> @@ -17,6 +17,7 @@ Linux Scheduler
> sched-nice-design
> sched-rt-group
> sched-stats
> + load-average
>
> text_files
>
> diff --git a/Documentation/scheduler/load-average.rst b/Documentation/scheduler/load-average.rst
> new file mode 100644
> index 000000000000..1b55f8da4e16
> --- /dev/null
> +++ b/Documentation/scheduler/load-average.rst
> @@ -0,0 +1,77 @@
> +============
> +Load Average
> +============
> +
> +Load average is a basic statistic provided by almost all operating systems that
> +aims to report the usage of system hardware resources. In the Linux kernel, the
> +load average is calculated via the following expression::
> +
> +               / 0                                      , if t = 0
> +    load_{t} = |
> +               \ load_{t - 1} * exp + active * (1 - exp), otherwise
> +
> +The expression represents the exponential moving average of the historical
> +load of the system. The Linux kernel chooses the exponential moving average
> +over similar averages, such as the simple moving average or the cumulative
> +moving average, for several reasons:
> +
> +#. The exponential moving average consumes a fixed amount of memory, while the
> +   simple moving average has O(n) space complexity, where n is the number of
> +   time slices within a given interval.
> +#. The exponential moving average not only applies a higher weight to the most
> +   recent record but also decays the weights of older records exponentially,
> +   so the resulting load average reflects the current state of the system.
> +   Neither the simple moving average nor the cumulative moving average has
> +   this property.
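To make the constant-memory point in the list above concrete, here is a minimal userspace sketch in C of the update rule, using floating point for readability. The names ema_update and ema_after are hypothetical and are not taken from the kernel; the decay value is whatever the caller passes in.

```c
#include <stddef.h>

/* One step of the exponential moving average from the expression above:
 * load_t = load_{t-1} * decay + active * (1 - decay).
 * Hypothetical helper for illustration; not a kernel function. */
double ema_update(double load, double decay, double active)
{
	return load * decay + active * (1.0 - decay);
}

/* Apply `steps` updates with a constant `active`, starting from load_0 = 0.
 * Only one running value is kept, so memory use does not depend on `steps`. */
double ema_after(size_t steps, double decay, double active)
{
	double load = 0.0;

	for (size_t i = 0; i < steps; i++)
		load = ema_update(load, decay, active);
	return load;
}
```

With decay = 1884.0 / 2048.0 (the 1-minute weight from the patch) and one task constantly active, ema_after(12, decay, 1.0), i.e. twelve 5-second updates, gives roughly 0.63, which is close to 1 - 1/e.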
> +
> +In the expression, load_{t} is the load average calculated at time t.
> +active is the most recently recorded system load. In Linux, the system load
> +is the number of tasks in the TASK_RUNNING or TASK_UNINTERRUPTIBLE state
> +across the entire system. Tasks in the TASK_UNINTERRUPTIBLE state are usually
> +waiting for disk I/O or for an uninterruptible lock, which is considered a use
> +of system resources, so the Linux kernel counts them when calculating the load
> +average.
> +exp is the weight applied to the previous load average, while
> +(1 - exp) is the weight applied to the most recently recorded system load.
> +Three different weights are defined in the Linux kernel, in
> +include/linux/sched/loadavg.h, to track the load over different timescales::
> +
> + // include/linux/sched/loadavg.h
> + ...
> + #define EXP_1 1884 /* 1/exp(5sec/1min) as fixed-point */
> + #define EXP_5 2014 /* 1/exp(5sec/5min) */
> + #define EXP_15 2037 /* 1/exp(5sec/15min) */
> + ...
> +
> +According to the expression at the top of this page, the weight (exp)
> +controls how much the previous load load_{t - 1} contributes to the
> +current load, while (1 - exp) is the weight applied to the most
> +recent record of the system load, active.
> +
> +Because kernel code generally avoids floating-point arithmetic, the weights
> +are defined as fixed-point numbers based on unsigned integers rather than as
> +floating-point numbers, which keeps the FPU out of the calculation. Since the
> +fixed-point format used in the Linux kernel has 11 bits of fractional
> +precision, a fixed-point value can be converted to floating point by dividing
> +it by 2048 (2^11), as shown below::
> +
> + EXP_1 = 1884 / 2048 = 0.919922
> + EXP_5 = 2014 / 2048 = 0.983398
> + EXP_15 = 2037 / 2048 = 0.994629
> +
> +Thus the weights applied to active are::
> +
> + (1 - EXP_1) = (1 - 0.919922) = 0.080078
> + (1 - EXP_5) = (1 - 0.983398) = 0.016602
> + (1 - EXP_15) = (1 - 0.994629) = 0.005371
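As a cross-check of the numbers above, here is a rough userspace sketch of the same update done entirely in 11-bit fixed point. FSHIFT and FIXED_1 are names I am assuming for the shift and the scale; fixed_to_double and calc_load_sketch are hypothetical helpers, and the in-kernel code may differ in details such as rounding.

```c
#define FSHIFT   11                 /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)    /* 1.0 in fixed point, i.e. 2048 */
#define EXP_1    1884UL             /* weights quoted from the patch */
#define EXP_5    2014UL
#define EXP_15   2037UL

/* Convert a fixed-point value to floating point by dividing by 2048. */
double fixed_to_double(unsigned long x)
{
	return (double)x / (double)FIXED_1;
}

/* One integer-only update. Both load and active are fixed-point values,
 * so n active tasks are passed as n * FIXED_1. The truncating division
 * is a simplification assumed for this sketch. */
unsigned long calc_load_sketch(unsigned long load, unsigned long weight,
			       unsigned long active)
{
	return (load * weight + active * (FIXED_1 - weight)) / FIXED_1;
}
```

Starting from load = 0 with one active task, calc_load_sketch(0, EXP_1, 1 * FIXED_1) returns 164, and 164 / 2048 = 0.080078, which matches the (1 - EXP_1) weight listed above.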
> +
> +The load average is updated every 5 seconds. Each time scheduler_tick() is
> +called, the function calc_global_load_tick() is also invoked, which computes
> +the active count of each CPU core and merges it into a global value. The
> +load average is then updated with that global active count.
> +
> +Users can observe the load average via top, htop, or other system monitoring
> +applications, or more directly with the following command::
> +
> + $ cat /proc/laodavg
Should be $ cat /proc/loadavg
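For completeness, here is a small sketch of reading the same values from a program rather than with cat. It assumes the file begins with the three load averages (1, 5 and 15 minutes) as decimal numbers separated by spaces; parse_loadavg and read_loadavg are hypothetical helper names.

```c
#include <stdio.h>

/* Parse the three leading load averages from one line of text.
 * Returns 0 on success, -1 if fewer than three numbers were found. */
int parse_loadavg(const char *line, double avg[3])
{
	if (sscanf(line, "%lf %lf %lf", &avg[0], &avg[1], &avg[2]) != 3)
		return -1;
	return 0;
}

/* Read /proc/loadavg and parse its first line. Returns 0 on success,
 * -1 if the file cannot be opened or does not have the assumed format. */
int read_loadavg(double avg[3])
{
	char buf[128];
	FILE *fp = fopen("/proc/loadavg", "r");
	int ret = -1;

	if (!fp)
		return -1;
	if (fgets(buf, sizeof(buf), fp))
		ret = parse_loadavg(buf, avg);
	fclose(fp);
	return ret;
}
```

Keeping the parser separate from the file access makes it easy to check against a sample line such as "0.20 0.18 0.12 1/80 11206" without depending on the machine's current load.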
> +
> --
> 2.25.1
>
--
Best wishes,
Henry