Date: Thu, 27 Jan 2022 14:23:29 -0300
From: Marcelo Tosatti <mtosatti@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: Nitesh Lal <nilal@...hat.com>,
 Nicolas Saenz Julienne <nsaenzju@...hat.com>,
 Frederic Weisbecker <frederic@...nel.org>,
 Christoph Lameter <cl@...ux.com>,
 Juri Lelli <juri.lelli@...hat.com>,
 Peter Zijlstra <peterz@...radead.org>,
 Alex Belits <abelits@...its.com>,
 Peter Xu <peterx@...hat.com>,
 Thomas Gleixner <tglx@...utronix.de>,
 Daniel Bristot de Oliveira <bristot@...hat.com>,
 Marcelo Tosatti <mtosatti@...hat.com>
Subject: [patch v9 10/10] mm: vmstat_refresh: avoid queueing work item if
 cpu stats are clean
References: <20220127172319.428529308@...ler.cnet>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8

It is not necessary to queue a work item to run refresh_vm_stats on a
remote CPU if that CPU has no dirty stats and no per-CPU allocations
for remote nodes.

This fixes a sosreport hang (sosreport uses vmstat_refresh) in the
presence of a spinning SCHED_FIFO process.

Signed-off-by: Marcelo Tosatti <mtosatti@...hat.com>

---
 mm/vmstat.c |   49 ++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 44 insertions(+), 5 deletions(-)

Index: linux-2.6/mm/vmstat.c
===================================================================
--- linux-2.6.orig/mm/vmstat.c
+++ linux-2.6/mm/vmstat.c
@@ -1910,6 +1910,31 @@ static bool need_update(int cpu)
 }
 
 #ifdef CONFIG_PROC_FS
+static bool need_drain_remote_zones(int cpu)
+{
+#ifdef CONFIG_NUMA
+        struct zone *zone;
+
+        for_each_populated_zone(zone) {
+                struct per_cpu_pages *pcp;
+
+                pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
+                if (!pcp->count)
+                        continue;
+
+                if (!pcp->expire)
+                        continue;
+
+                if (zone_to_nid(zone) == cpu_to_node(cpu))
+                        continue;
+
+                return true;
+        }
+#endif
+
+        return false;
+}
+
 static void refresh_vm_stats(struct work_struct *work)
 {
         refresh_cpu_vm_stats(true);
@@ -1919,8 +1944,12 @@ int vmstat_refresh(struct ctl_table *tab
                    void *buffer, size_t *lenp, loff_t *ppos)
 {
         long val;
-        int err;
-        int i;
+        int i, cpu;
+        struct work_struct __percpu *works;
+
+        works = alloc_percpu(struct work_struct);
+        if (!works)
+                return -ENOMEM;
 
         /*
          * The regular update, every sysctl_stat_interval, may come later
@@ -1934,9 +1963,19 @@ int vmstat_refresh(struct ctl_table *tab
          * transiently negative values, report an error here if any of
          * the stats is negative, so we know to go looking for imbalance.
          */
-        err = schedule_on_each_cpu(refresh_vm_stats);
-        if (err)
-                return err;
+        cpus_read_lock();
+        for_each_online_cpu(cpu) {
+                struct work_struct *work = per_cpu_ptr(works, cpu);
+
+                INIT_WORK(work, refresh_vm_stats);
+                if (need_update(cpu) || need_drain_remote_zones(cpu))
+                        schedule_work_on(cpu, work);
+        }
+        for_each_online_cpu(cpu)
+                flush_work(per_cpu_ptr(works, cpu));
+        cpus_read_unlock();
+        free_percpu(works);
+
         for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
                 /*
                  * Skip checking stats known to go negative occasionally.
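For context: schedule_on_each_cpu() unconditionally queues a work item on
every online CPU and flushes all of them, so a single CPU whose kworker
cannot run (for example, because a SCHED_FIFO task never yields) blocks the
caller indefinitely. The open-coded loop above instead calls
schedule_work_on() only for CPUs that actually have something to do:
need_update() reports dirty per-CPU stat deltas, and
need_drain_remote_zones() reports pages held on a CPU's per-CPU lists
(pcp->count) with the drain countdown armed (pcp->expire) for a zone on a
NUMA node other than the CPU's own.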
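The hang is reproducible without sosreport. The sketch below is a minimal
illustration, not part of the patch, and makes two assumptions: CPU 1 stands
in for an isolated/nohz_full CPU, and RT throttling is disabled so the
spinner really starves that CPU's kworker. With the spinner running, writing
to /proc/sys/vm/stat_refresh from another CPU blocks in flush_work() on a
pre-patch kernel; a patched kernel completes as long as the spinning CPU's
stats are clean:

/* repro.c: pin a SCHED_FIFO busy loop to one CPU (sketch). */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
        cpu_set_t set;
        struct sched_param param = { .sched_priority = 1 };

        CPU_ZERO(&set);
        CPU_SET(1, &set);               /* assumed isolated CPU */
        if (sched_setaffinity(0, sizeof(set), &set)) {
                perror("sched_setaffinity");
                return 1;
        }
        /* SCHED_FIFO requires root (or CAP_SYS_NICE / RLIMIT_RTPRIO). */
        if (sched_setscheduler(0, SCHED_FIFO, &param)) {
                perror("sched_setscheduler");
                return 1;
        }
        for (;;)
                ;                       /* never yield */
}

Then, as root, from a CPU outside the spinner's affinity mask:

        echo 1 > /proc/sys/vm/stat_refresh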