linux-kernel - Re: [PATCH v2 3/3] mm/vmstat: do not refresh stats for nohz

Message-ID: <ZH4CnJlpBMxEEwPW@tpad>
Date:   Mon, 5 Jun 2023 12:43:24 -0300
From:   Marcelo Tosatti <mtosatti@...hat.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Christoph Lameter <cl@...ux.com>,
        Aaron Tomlin <atomlin@...mlin.com>,
        Frederic Weisbecker <frederic@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH v2 3/3] mm/vmstat: do not refresh stats for nohz_full CPUs

On Mon, Jun 05, 2023 at 09:59:57AM +0200, Michal Hocko wrote:
> On Fri 02-06-23 15:58:00, Marcelo Tosatti wrote:
> > The interruption caused by queueing work on nohz_full CPUs
> > is undesirable for certain applications.
> 
> This is not a proper changelog. I am not going to write a changelog for
> you this time. Please explain why this is really needed and why this
> approach is desired. E.g. why don't you prevent userspace from
> refreshing stats if interference is not desirable?

Michal,

Can you please check whether the following looks better
as a changelog? Thanks.

---

The schedule_work_on API uses the workqueue mechanism to
queue a work item on a workqueue. A kernel thread, which
runs on the target CPU, executes those work items.

Therefore, when using the schedule_work_on API,
the kworker kernel thread must be scheduled in on the
target CPU before the work function can be executed.
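
As an illustration of the point above, here is a minimal userspace
model in C. It is only a sketch: struct model_cpu_queue,
model_schedule_work_on() and model_kworker_run() are hypothetical
names made up for this example, not kernel APIs, and the real
workqueue code is far more involved. It shows that queueing alone
does not run the work function; that happens only once the worker
gets to run.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_PENDING 8

typedef void (*work_fn)(void *data);

struct model_work {
	work_fn fn;
	void *data;
};

/* Hypothetical per-CPU queue standing in for the kernel workqueue. */
struct model_cpu_queue {
	struct model_work pending[MODEL_MAX_PENDING];
	size_t count;
};

/*
 * Analogue of schedule_work_on(): the item is only queued here,
 * the work function is NOT called. Returns 1 if queued, 0 if full.
 */
static int model_schedule_work_on(struct model_cpu_queue *q,
				  work_fn fn, void *data)
{
	assert(q != NULL && fn != NULL);
	if (q->count == MODEL_MAX_PENDING)
		return 0;
	q->pending[q->count].fn = fn;
	q->pending[q->count].data = data;
	q->count++;
	return 1;
}

/*
 * Analogue of the kworker being scheduled in on the target CPU:
 * only now are the queued work functions executed.
 * Returns the number of work items that ran.
 */
static size_t model_kworker_run(struct model_cpu_queue *q)
{
	size_t ran = 0;

	assert(q != NULL);
	for (size_t i = 0; i < q->count; i++) {
		q->pending[i].fn(q->pending[i].data);
		ran++;
	}
	q->count = 0;
	return ran;
}

/* Example work function: increments the int it is handed. */
static void bump(void *data)
{
	(*(int *)data)++;
}
```

In this model, a counter passed to bump() stays unchanged after
model_schedule_work_on() and only changes after model_kworker_run(),
which is the step that preempts whatever else runs on that CPU.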

Time-sensitive applications such as SoftPLCs
(https://tum-esi.github.io/publications-list/PDF/2022-ETFA-How_Real_Time_Are_Virtual_PLCs.pdf)
have their response times affected by such interruptions.

The /proc/sys/vm/stat_refresh file was originally introduced by

commit 52b6f46bc163eef17ecba4cd552beeafe2b24453
Author: Hugh Dickins <hughd@...gle.com>
Date:   Thu May 19 17:12:50 2016 -0700

    mm: /proc/sys/vm/stat_refresh to force vmstat update

    Provide /proc/sys/vm/stat_refresh to force an immediate update of
    per-cpu into global vmstats: useful to avoid a sleep(2) or whatever
    before checking counts when testing.  Originally added to work around a
    bug which left counts stranded indefinitely on a cpu going idle (an
    inaccuracy magnified when small below-batch numbers represent "huge"
    amounts of memory), but I believe that bug is now fixed: nonetheless,
    this is still a useful knob.

Besides the potential interruption to a time-sensitive application,
system hangs can occur if the application uses SCHED_FIFO or
SCHED_RR priority on the isolated CPU:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=978688

To avoid the problems above, do not schedule the work to synchronize
per-CPU mm counters on isolated CPUs. Given the possibility of
breaking existing userspace applications, avoid changing the
behaviour of accesses to /proc/sys/vm/stat_refresh, for example
by returning errors to userspace.

---
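
To make the approach in the changelog above concrete, here is a
rough userspace model in C. It is a sketch under assumptions, not
the patch itself: cpu_mask_t, refresh_non_isolated() and
record_cpu() are hypothetical names for this example, and a plain
64-bit mask stands in for the kernel's cpumask. The loop queues the
refresh on every online CPU except the isolated ones and reports no
error for the skipped CPUs, so the caller sees unchanged behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n. */
typedef uint64_t cpu_mask_t;

#define MODEL_NR_CPUS 64

/* Hypothetical callback standing in for queueing the refresh work. */
typedef void (*queue_refresh_fn)(int cpu, void *ctx);

/*
 * Queue the refresh on every online CPU that is not isolated.
 * Isolated CPUs are skipped silently. Returns the number of CPUs
 * that had work queued.
 */
static int refresh_non_isolated(cpu_mask_t online, cpu_mask_t isolated,
				queue_refresh_fn queue, void *ctx)
{
	int queued = 0;

	assert(queue != NULL);
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		cpu_mask_t bit = (cpu_mask_t)1 << cpu;

		if (!(online & bit))
			continue;	/* not online: nothing to flush */
		if (isolated & bit)
			continue;	/* isolated: do not interrupt it */
		queue(cpu, ctx);
		queued++;
	}
	return queued;
}

/* Example callback: record which CPUs were asked to refresh. */
static void record_cpu(int cpu, void *ctx)
{
	cpu_mask_t *seen = ctx;

	*seen |= (cpu_mask_t)1 << cpu;
}
```

For example, with CPUs 0-7 online (mask 0xFF) and CPUs 2-3 isolated
(mask 0x0C), the callback runs for the remaining six CPUs only.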

> Also would it make some sense to reduce flushing to cpumask 
> of the calling process? (certainly a daring thought but have
> you even considered it?)

I fail to see the point here?