linux-kernel - Re: [PATCH v3 3/3] mm/vmstat: do not refresh stats for isolated CPUs


Message-ID: <ZH49Tx4kbOG1zESS@tpad>
Date:   Mon, 5 Jun 2023 16:53:51 -0300
From:   Marcelo Tosatti <mtosatti@...hat.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Christoph Lameter <cl@...ux.com>,
        Aaron Tomlin <atomlin@...mlin.com>,
        Frederic Weisbecker <frederic@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH v3 3/3] mm/vmstat: do not refresh stats for isolated CPUs

On Mon, Jun 05, 2023 at 09:20:12PM +0200, Michal Hocko wrote:
> On Mon 05-06-23 15:56:30, Marcelo Tosatti wrote:
> > schedule_work_on API uses the workqueue mechanism to
> > queue a work item on a queue. A kernel thread, which
> > runs on the target CPU, executes those work items.
> > 
> > Therefore, when using the schedule_work_on API,
> > it is necessary for the kworker kernel thread to
> > be scheduled in, for the work function to be executed.
> > 
> > Time sensitive applications such as SoftPLCs
> > (https://tum-esi.github.io/publications-list/PDF/2022-ETFA-How_Real_Time_Are_Virtual_PLCs.pdf),
> > have their response times affected by such interruptions.
> > 
> > The /proc/sys/vm/stat_refresh file was originally introduced
> > with the goal to:
> > 
> > "Provide /proc/sys/vm/stat_refresh to force an immediate update of
> >  per-cpu into global vmstats: useful to avoid a sleep(2) or whatever
> >  before checking counts when testing.  Originally added to work around a
> >  bug which left counts stranded indefinitely on a cpu going idle (an
> >  inaccuracy magnified when small below-batch numbers represent "huge"
> >  amounts of memory), but I believe that bug is now fixed: nonetheless,
> >  this is still a useful knob."
> > 
> > Other than the potential interruption to a time sensitive application,
> > if using SCHED_FIFO or SCHED_RR priority on the isolated CPU, then
> > system hangs can occur:
> 
> The same thing can happen without isolated CPUs and this patch doesn't
> help at all.
> 
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=978688
> 
> And this is an example of that...
> 
> > To avoid the problems above, do not schedule the work to synchronize
> > per-CPU mm counters on isolated CPUs. Given the possibility for
> > breaking existing userspace applications, avoid returning
> > errors from access to /proc/sys/vm/stat_refresh.
> > 
> > Signed-off-by: Marcelo Tosatti <mtosatti@...hat.com>
> 
> It would be really helpful to not post new versions while discussion of
> the previous one is still not done.
> 
> Anyway
> Nacked-by: Michal Hocko <mhocko@...e.com>
> 
> This is silently changing semantics and I do not think you have actually
> shown this is a real life problem. 

https://bugzilla.redhat.com/show_bug.cgi?id=1921601

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=978688

> To me it sounds like a theoretical
> issue 

It's not (see the bug reports above).

> at most and it can be worked around by disallowing the use of this
> interface from userspace. stat_refresh is mostly for debugging purposes
> and I strongly doubt it is ever used in environments you refer to in
> this series.

Based on experience, I strongly believe people will run latency-sensitive
apps and end up reading or writing this file.
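
For illustration only, here is a minimal userspace C sketch of the behaviour
the patch description asks for: fold per-CPU deltas into the global counter
for housekeeping CPUs, leave isolated CPUs untouched, and report no error to
the caller. This is a model, not the kernel code: struct cpu_model and
refresh_stats_skip_isolated() are hypothetical names, and the real patch
works by not queueing work via schedule_work_on for isolated CPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one CPU's vmstat state (not a kernel type). */
struct cpu_model {
	bool online;        /* CPU is online */
	bool isolated;      /* CPU is isolated; it must not be interrupted */
	long pending_delta; /* per-CPU delta not yet folded into the global count */
};

/*
 * Fold the pending deltas of online, non-isolated CPUs into *global.
 * Isolated CPUs are skipped, so their deltas stay pending. No error is
 * returned for skipped CPUs; the return value is only the skip count.
 */
static size_t refresh_stats_skip_isolated(struct cpu_model *cpus, size_t n,
					  long *global)
{
	size_t skipped = 0;

	assert(cpus != NULL && global != NULL);

	for (size_t i = 0; i < n; i++) {
		if (!cpus[i].online)
			continue;
		if (cpus[i].isolated) {
			/* Do not disturb the isolated CPU. */
			skipped++;
			continue;
		}
		*global += cpus[i].pending_delta;
		cpus[i].pending_delta = 0;
	}
	return skipped;
}
```

In this model the trade-off is visible directly: the global count is stale by
whatever the isolated CPUs still hold, in exchange for never scheduling
anything on them.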