Message-ID: <alpine.DEB.2.21.1902231849400.1666@nanos.tec.linutronix.de>
Date:   Sat, 23 Feb 2019 19:16:17 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Aubrey Li <aubrey.li@...ux.intel.com>
cc:     mingo@...hat.com, peterz@...radead.org, hpa@...or.com,
        ak@...ux.intel.com, tim.c.chen@...ux.intel.com,
        dave.hansen@...el.com, arjan@...ux.intel.com, aubrey.li@...el.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v12 3/3] Documentation/filesystems/proc.txt: add
 AVX512_elapsed_ms

On Thu, 21 Feb 2019, Aubrey Li wrote:
> @@ -45,6 +45,7 @@ Table of Contents
>    3.9   /proc/<pid>/map_files - Information about memory mapped files
>    3.10  /proc/<pid>/timerslack_ns - Task timerslack value
>    3.11	/proc/<pid>/patch_state - Livepatch patch operation state
> +  3.12	/proc/<pid>/AVX512_elapsed_ms - time elapsed since last AVX512 use

So is this a separate file now?
  
> +3.12	/proc/<pid>/AVX512_elapsed_ms - time elapsed since last AVX512 use
> +--------------------------------------------------------------------------
> +If AVX512 is supported on the machine, this file displays time elapsed since

This is not a file and this documentation wants to be where the status file
is described.

> +last AVX512 usage of the task in millisecond.

"Since last usage" is misleading. What you want to say is:

  The entry shows the milliseconds elapsed since the last time AVX512 usage
  was recorded.

> +The per-task AVX512 usage tracking mechanism is added during context switch.
> +When the task is scheduled out, the AVX512 timestamp of the task is tagged
> +by jiffies if AVX512 usage is detected.
> +
> +When this interface is queried, AVX512_elapsed_ms is calculated as follows:
> +
> +	delta = (long)(jiffies_now - AVX512_timestamp);
> +	AVX512_elpased_ms = jiffies_to_msecs(delta);

This information is not really helpful for someone who wants to use that
field.

> +
> +Because this tracking mechanism depends on context switch, the number of
> +AVX512_elapsed_ms could be inaccurate if the AVX512 using task runs alone on
> +a CPU and not scheduled out for a long time. An extreme experiment shows a
> +task is spinning on the AVX512 ops on an isolated CPU, but the longest elapsed
> +time is close to 4 seconds(HZ = 250).
> +
> +So 5s or even longer is an appropriate threshold for the job scheduler to poll
> +and decide if the task should be classifed as an AVX512 task and migrated
> +away from the core on which a Non-AVX512 task is running.

5 seconds or longer is appropriate? No. It really depends on the workload
and the scheduling scenarios. What the documentation has to provide is the
information that this value is a crystal ball estimate and what the reasons
are why it's inaccurate.

Something like this instead of this conglomerate of useful, irrelevant and
misleading information:

  The AVX512_elapsed_ms entry shows the milliseconds elapsed since the last
  time AVX512 usage was recorded. The recording happens on a best effort
  basis when a task is scheduled out. This means that the value depends on
  two factors:

    1) The time which the task spent on the CPU without being scheduled
       out. With CPU isolation and a single runnable task this can take
       several seconds.

    2) The time since the task was scheduled out last. Depending on the
       reason for being scheduled out (time slice exhausted, syscall ...)
       this can be an arbitrarily long time.

  As a consequence the value cannot be considered precise and authoritative
  information. The application which uses this information has to be aware
  of the overall scenario on the system in order to determine whether a
  task is a real AVX512 user or not.

See? No jiffies, no code snippets, no absolute numbers and no magic
recommendation which might be correct for your test scenario, but
completely bogus for some other scenario.

Instead it contains the things which an application programmer who wants to
use that value needs to know. He then has to map it to his scenario and
build the crystal ball logic which makes it perhaps useful.
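
To illustrate what that mapping could look like on the consumer side, here
is a minimal user space sketch in Python. It is not part of the patch. It
assumes the entry shows up as a "AVX512_elapsed_ms: <number>" line in
/proc/<pid>/status as this series proposes. The helper names, the
threshold parameter and the handling of a missing or negative value as
"no usage recorded" are all hypothetical and exist only for illustration.
The threshold is deliberately left to the caller, because no single number
is right for every workload.

```python
from typing import Optional

# Field name as proposed in the patch under review.
FIELD = "AVX512_elapsed_ms"


def parse_elapsed_ms(status_text: str) -> Optional[int]:
    """Return the AVX512_elapsed_ms value found in status-style text.

    Returns None if the field is absent or its value is not an integer.
    """
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == FIELD:
            try:
                return int(value.strip())
            except ValueError:
                return None
    return None


def read_elapsed_ms(pid: int, filename: str = "status") -> Optional[int]:
    """Read the value for one task; the file name is an assumption."""
    try:
        with open(f"/proc/{pid}/{filename}") as f:
            return parse_elapsed_ms(f.read())
    except OSError:
        # Task exited, no permission, or no such file on this kernel.
        return None


def looks_like_avx512_user(elapsed_ms: Optional[int], threshold_ms: int) -> bool:
    """Best-effort guess only, not an authoritative answer.

    A missing or negative value is treated as "no usage recorded"
    (an assumption of this sketch). The threshold has to come from
    knowledge of the workload and the scheduling scenario.
    """
    if elapsed_ms is None or elapsed_ms < 0:
        return False
    return elapsed_ms <= threshold_ms
```

A caller would poll read_elapsed_ms() for the tasks it cares about and feed
the result into looks_like_avx512_user() with a threshold chosen for its
own system. A small value only says that usage was recorded recently at a
context switch. A large value can equally mean that the task has been
sleeping, or has been running undisturbed on an isolated CPU.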

Thanks,

	tglx

