Message-ID: <d38b5679-1f07-7b94-51ca-cb9b60db8b6f@linux.intel.com>
Date: Sun, 24 Feb 2019 09:41:40 +0800
From: "Li, Aubrey" <aubrey.li@...ux.intel.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: mingo@...hat.com, peterz@...radead.org, hpa@...or.com,
ak@...ux.intel.com, tim.c.chen@...ux.intel.com,
dave.hansen@...el.com, arjan@...ux.intel.com, aubrey.li@...el.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v12 3/3] Documentation/filesystems/proc.txt: add AVX512_elapsed_ms

On 2019/2/24 2:16, Thomas Gleixner wrote:
> On Thu, 21 Feb 2019, Aubrey Li wrote:
>> @@ -45,6 +45,7 @@ Table of Contents
>> 3.9 /proc/<pid>/map_files - Information about memory mapped files
>> 3.10 /proc/<pid>/timerslack_ns - Task timerslack value
>> 3.11 /proc/<pid>/patch_state - Livepatch patch operation state
>> + 3.12 /proc/<pid>/AVX512_elapsed_ms - time elapsed since last AVX512 use
>
> So is this a separate file now?
>
>> +3.12 /proc/<pid>/AVX512_elapsed_ms - time elapsed since last AVX512 use
>> +--------------------------------------------------------------------------
>> +If AVX512 is supported on the machine, this file displays time elapsed since
>
> This is not a file and this documentation wants to be where the status file
> is described.
>
>> +last AVX512 usage of the task in millisecond.
>
> Since last usage is misleading. What you want to say is:
>
> The entry shows the milliseconds elapsed since the last time AVX512 usage
> was recorded.
>
>> +The per-task AVX512 usage tracking mechanism is added during context switch.
>> +When the task is scheduled out, the AVX512 timestamp of the task is tagged
>> +by jiffies if AVX512 usage is detected.
>> +
>> +When this interface is queried, AVX512_elapsed_ms is calculated as follows:
>> +
>> + delta = (long)(jiffies_now - AVX512_timestamp);
>> + AVX512_elapsed_ms = jiffies_to_msecs(delta);
>
> This information is not really helpful for someone who wants to use that
> field.
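
For anyone following the thread, the quoted arithmetic can be replayed in
user space. This is only a sketch under stated assumptions: the counter is
modelled as 32 bits wide, HZ is taken as 250 as in the experiment quoted
further down, and the helper name elapsed_ms() is made up for illustration.
It is not the kernel code.

```python
HZ = 250              # assumed tick rate, matching the quoted experiment
MSEC_PER_SEC = 1000
BITS = 32             # assumed counter width, chosen only for this illustration


def elapsed_ms(jiffies_now: int, timestamp: int) -> int:
    """Wraparound-safe difference of two tick counters, converted to ms."""
    mask = (1 << BITS) - 1
    # Unsigned subtraction modulo 2**BITS, as fixed-width C arithmetic would do.
    delta = (jiffies_now - timestamp) & mask
    # Reinterpret the result as signed, mirroring the (long) cast in the patch.
    if delta >= 1 << (BITS - 1):
        delta -= 1 << BITS
    # Simple scaling; only valid when HZ divides 1000 evenly (true for 250).
    return delta * (MSEC_PER_SEC // HZ)


print(elapsed_ms(1000, 0))           # 1000 ticks at 4 ms per tick
print(elapsed_ms(5, 0xFFFFFFFB))     # counter wrapped between the two samples
```

The second call shows why the subtraction is done on the raw counters: the
counter wrapped between the two samples, yet the difference is still the 10
ticks that actually passed.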
>
>> +
>> +Because this tracking mechanism depends on context switch, the number of
>> +AVX512_elapsed_ms could be inaccurate if the AVX512 using task runs alone on
>> +a CPU and not scheduled out for a long time. An extreme experiment shows a
>> +task is spinning on the AVX512 ops on an isolated CPU, but the longest elapsed
>> +time is close to 4 seconds(HZ = 250).
>> +
>> +So 5s or even longer is an appropriate threshold for the job scheduler to poll
>> +and decide if the task should be classified as an AVX512 task and migrated
>> +away from the core on which a Non-AVX512 task is running.
>
> 5 seconds or longer is appropriate? No. It really depends on the workload and
> the scheduling scenarios. What the documentation has to provide is the
> information that this value is a crystal ball estimate and what the reasons
> are why it's inaccurate.
>
> Something like this instead of this conglomerate of useful, irrelevant and
> misleading information:
>
> The AVX512_elapsed_ms entry shows the milliseconds elapsed since the last
> time AVX512 usage was recorded. The recording happens on a best effort
> basis when a task is scheduled out. This means that the value depends on
> two factors:
>
> 1) The time which the task spent on the CPU without being scheduled
> out. With CPU isolation and a single runnable task this can take
> several seconds.
>
> 2) The time since the task was scheduled out last. Depending on the
> reason for being scheduled out (time slice exhausted, syscall ...)
> this can be an arbitrarily long time.
>
> As a consequence the value cannot be considered precise and authoritative
> information. The application which uses this information has to be aware
> of the overall scenario on the system in order to determine whether a
> task is a real AVX512 user or not.
>
> See? No jiffies, no code snippets, no absolute numbers and no magic
> recommendation which might be correct for your test scenario, but
> completely bogus for some other scenario.
>
> Instead it contains the things which an application programmer who wants to
> use that value needs to know. He then has to map it to his scenario and
> build the crystal ball logic which makes it perhaps useful.
Thanks a lot, I'll try to refine it again.
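
To make the "crystal ball logic" point concrete, here is a rough sketch of
how an application might consume the entry. It assumes the value shows up in
/proc/<pid>/status as a line of the form "AVX512_elapsed_ms: <number>". The
exact location and format are not settled in this thread, so treat them as
assumptions, and the helper names are made up. The threshold is deliberately
left to the caller, since no single number fits every workload.

```python
import re
from pathlib import Path
from typing import Optional

FIELD = "AVX512_elapsed_ms"   # entry name taken from the patch subject


def parse_elapsed_ms(status_text: str) -> Optional[int]:
    """Return the AVX512_elapsed_ms value from status-style text, or None."""
    match = re.search(rf"^{FIELD}:\s*(-?\d+)\s*$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_elapsed_ms(pid: int) -> Optional[int]:
    """Read the entry for one task; the file location is an assumption."""
    try:
        text = Path(f"/proc/{pid}/status").read_text()
    except OSError:
        return None               # task exited or file not readable
    return parse_elapsed_ms(text)


def looks_like_avx512_user(elapsed_ms: Optional[int], threshold_ms: int) -> bool:
    """Best-effort guess only: the value is neither precise nor authoritative.

    A missing entry or a negative value is treated as "no recorded use";
    that handling is a choice made for this sketch.
    """
    return elapsed_ms is not None and 0 <= elapsed_ms <= threshold_ms
```

A job scheduler would call read_elapsed_ms() periodically and pick
threshold_ms from what it knows about the system, for example whether tasks
run alone on isolated CPUs or spend long periods scheduled out.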
Regards,
-Aubrey