Message-ID: <74e4afd4-5695-90fb-e66e-25d2bc2e2f53@gentwo.de>
Date:   Tue, 17 Jan 2023 13:52:18 +0100 (CET)
From:   Christoph Lameter <cl@...two.de>
To:     Marcelo Tosatti <mtosatti@...hat.com>
cc:     Frederic Weisbecker <frederic@...nel.org>, atomlin@...mlin.com,
        tglx@...utronix.de, mingo@...nel.org, peterz@...radead.org,
        pauld@...hat.com, neelx@...hat.com, oleksandr@...alenko.name,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v13 2/6] mm/vmstat: Use vmstat_dirty to track CPU-specific
 vmstat discrepancies

On Mon, 16 Jan 2023, Marcelo Tosatti wrote:

> Honestly, to me, there is no dilemma:
>
> * There is a requirement from applications to be uninterrupted
> by operating system activities. Examples include radio access
> network software, software defined PLCs for industrial automation (1).
>
> * There exists vm-statistics counters (which count
> the number of pages on different states, for example, number of
> free pages, locked pages, pages under writeback, pagetable pages,
> file pages, etc).
> To reduce number of accesses to the global counters, each CPU maintains
> its own delta relative to the global VM counters
> (which can be cached in the local processor cache, therefore fast).

The counters are only accurate as a global sum. A counter may be
specific to a zone, in which case it counts uses of that zone from
all processors.
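
To illustrate the point, here is a minimal userspace sketch of a counter
kept as a global value plus per-CPU deltas. It is a toy model, not the
kernel's implementation: the names toy_counter, toy_mod(),
toy_read_accurate() and TOY_CPUS are hypothetical. A page accounted on
one CPU and unaccounted on another leaves a nonzero delta on both, yet
only the sum over all CPUs carries meaning.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CPUS 4              /* hypothetical CPU count for the model */

struct toy_counter {
    long global;                /* value already folded from the deltas */
    long delta[TOY_CPUS];       /* per-CPU changes not yet folded */
};

/* Record a change on the CPU that happened to execute the update. */
static void toy_mod(struct toy_counter *c, size_t cpu, long n)
{
    assert(cpu < TOY_CPUS);
    c->delta[cpu] += n;
}

/* The only accurate reading: global value plus every per-CPU delta. */
static long toy_read_accurate(const struct toy_counter *c)
{
    long sum = c->global;

    for (size_t cpu = 0; cpu < TOY_CPUS; cpu++)
        sum += c->delta[cpu];
    return sum;
}
```

If CPU 0 adds one and CPU 1 subtracts one, CPU 1 holds a delta of -1
although the counter as a whole has not changed.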

> Now you are objecting to this patchset because:
>
> It increases the number of cycles to execute the function to modify
> the counters by 6. Can you mention any benchmark where this
> increase is significant?

I am objecting because of a fundamental misunderstanding of how these
counters work and because the patchset is incorrect in the way it handles
these counters. Come up with a correct approach and then we can consider
regressions and/or improvements in performance.

> Alternatives:
> 	1) Disable periodic synchronization for nohz_full CPUs.
> 	2) Processor instructions which can modify more than
> 	   one address in memory.
> 	3) Synchronize the per-CPU stats remotely (which
> 	   increases per-CPU and per-node accesses).

Or remove the assumptions that may exist in current code that a delta on a
specific cpu counter means that something occurred on that cpu?

If there is a delta then that *does not* mean that there is something to
do on that processor. The delta could be folded into the global counter
by another processor if the processor holding the delta is idle or not
entering the kernel, and stays that way throughout the operation.

So I guess that would be #3. The function cpu_vm_stats_fold() already does
this for offline cpus. Can something similar be made to work for idle cpus
or those continually running in user space?
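
A rough userspace sketch of what remote folding could look like, under
stated assumptions: the deltas are modelled as C11 atomics so that
another CPU can take them over with an exchange. This is not a claim
about how the kernel stores its per-CPU diffs or how cpu_vm_stats_fold()
is written; toy_vmstat, toy_local_add(), toy_fold_remote() and
TOY_NR_CPUS are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_NR_CPUS 4               /* hypothetical CPU count */

struct toy_vmstat {
    atomic_long global;             /* folded global counter */
    atomic_long delta[TOY_NR_CPUS]; /* per-CPU deltas not yet folded */
};

/* Update made by the CPU that owns the delta. */
static void toy_local_add(struct toy_vmstat *s, size_t cpu, long n)
{
    assert(cpu < TOY_NR_CPUS);
    atomic_fetch_add(&s->delta[cpu], n);
}

/*
 * Fold the delta of `cpu` into the global counter from any other CPU.
 * The exchange takes the pending delta and resets it to zero in one
 * step, so an update racing with the fold is either taken now or left
 * for the next fold, never lost. Returns the amount folded.
 */
static long toy_fold_remote(struct toy_vmstat *s, size_t cpu)
{
    assert(cpu < TOY_NR_CPUS);

    long d = atomic_exchange(&s->delta[cpu], 0);

    if (d)
        atomic_fetch_add(&s->global, d);
    return d;
}
```

In this model the idle or user-space-bound CPU never has to run
anything: a housekeeping CPU calls toy_fold_remote() on its behalf. The
cost is that every local update becomes an atomic operation, which is
the kind of overhead that would have to be measured.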

