lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170525150353.GE23493@htj.duckdns.org>
Date:   Thu, 25 May 2017 11:03:53 -0400
From:   Tejun Heo <tj@...nel.org>
To:     Michael Bringmann <mwb@...ux.vnet.ibm.com>
Cc:     Lai Jiangshan <jiangshanlai@...il.com>,
        linux-kernel@...r.kernel.org,
        Nathan Fontenot <nfont@...ux.vnet.ibm.com>
Subject: Re: [PATCH] workqueue: Ensure that cpumask set for pools created
 after boot

Hello, Michael.

On Wed, May 24, 2017 at 06:39:49PM -0500, Michael Bringmann wrote:
> [  321.310961] ------------[ cut here ]------------
> [  321.310997] WARNING: CPU: 184 PID: 13201 at kernel/workqueue.c:3375 alloc_unbound_pwq+0x5c0/0x5e0
> [  321.311005] Modules linked in: rpadlpar_io rpaphp dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag sg pseries_rng ghash_generic gf128mul xts vmx_crypto binfmt_misc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
> [  321.311097] CPU: 184 PID: 13201 Comm: cpuhp/184 Not tainted 4.12.0-rc1.wi91275_debug_03.ppc64le+ #8
> [  321.311106] task: c000000408961080 task.stack: c000000406394000
> [  321.311113] NIP: c000000000116c80 LR: c000000000116c7c CTR: 0000000000000000
> [  321.311121] REGS: c0000004063977b0 TRAP: 0700   Not tainted  (4.12.0-rc1.wi91275_debug_03.ppc64le+)
> [  321.311128] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
> [  321.311150]   CR: 28000082  XER: 00000000
> [  321.311159] CFAR: c000000000a2dc80 SOFTE: 1 
> [  321.311159] GPR00: c000000000116c7c c000000406397a30 c0000000013ae900 000000000000003b 
> [  321.311159] GPR04: c000000408961a38 0000000000000006 00000000a49e41e5 ffffffffa4a5a483 
> [  321.311159] GPR08: 00000000000062cc 0000000000000000 0000000000000000 c000000408961a38 
> [  321.311159] GPR12: 0000000000000000 c00000000fb38c00 c00000000011e858 c00000040a902ac0 
> [  321.311159] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [  321.311159] GPR20: c000000406394000 0000000000000002 c000000406394000 0000000000000000 
> [  321.311159] GPR24: c000000405075400 c000000404fc0000 0000000000000110 c0000000015a4c88 
> [  321.311159] GPR28: 0000000000000000 c0000004fe256000 c0000004fe256008 c0000004fe052800 
> [  321.311290] NIP [c000000000116c80] alloc_unbound_pwq+0x5c0/0x5e0
> [  321.311298] LR [c000000000116c7c] alloc_unbound_pwq+0x5bc/0x5e0
> [  321.311305] Call Trace:
> [  321.311310] [c000000406397a30] [c000000000116c7c] alloc_unbound_pwq+0x5bc/0x5e0 (unreliable)
> [  321.311323] [c000000406397ad0] [c000000000116e30] wq_update_unbound_numa+0x190/0x270
> [  321.311334] [c000000406397b60] [c000000000118eb0] workqueue_offline_cpu+0xe0/0x130
> [  321.311345] [c000000406397bf0] [c0000000000e9f20] cpuhp_invoke_callback+0x240/0xcd0
> [  321.311355] [c000000406397cb0] [c0000000000eab28] cpuhp_down_callbacks+0x78/0xf0
> [  321.311365] [c000000406397d00] [c0000000000eae6c] cpuhp_thread_fun+0x18c/0x1a0
> [  321.311376] [c000000406397d30] [c0000000001251cc] smpboot_thread_fn+0x2fc/0x3b0
> [  321.311386] [c000000406397dc0] [c00000000011e9c0] kthread+0x170/0x1b0
> [  321.311397] [c000000406397e30] [c00000000000b4f4] ret_from_kernel_thread+0x5c/0x68

wq_update_unbound_numa() should have never called into
alloc_unbound_pwq() w/ empty node cpu mask.  It should have fallen
back to the dfl_pwq.  It looks like I just messed up the logic there
from the initial commit of the feature.  Can you please see whether
the following fixes the problem?

Thanks.

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c74bf39ef764..e2c248d5223a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3559,29 +3559,21 @@ static struct pool_workqueue *alloc_unbound_pwq(struct workqueue_struct *wq,
  * stable.
  *
  * Return: %true if the resulting @cpumask is different from @attrs->cpumask,
- * %false if equal.
+ * %false if equal.  On %false return, the content of @cpumask is undefined.
  */
 static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
 				 int cpu_going_down, cpumask_t *cpumask)
 {
 	if (!wq_numa_enabled || attrs->no_numa)
-		goto use_dfl;
+		return false;
 
 	/* does @node have any online CPUs @attrs wants? */
 	cpumask_and(cpumask, cpumask_of_node(node), attrs->cpumask);
 	if (cpu_going_down >= 0)
 		cpumask_clear_cpu(cpu_going_down, cpumask);
 
-	if (cpumask_empty(cpumask))
-		goto use_dfl;
-
-	/* yeap, return possible CPUs in @node that @attrs wants */
-	cpumask_and(cpumask, attrs->cpumask, wq_numa_possible_cpumask[node]);
-	return !cpumask_equal(cpumask, attrs->cpumask);
-
-use_dfl:
-	cpumask_copy(cpumask, attrs->cpumask);
-	return false;
+	return !cpumask_empty(cpumask) &&
+		!cpumask_equal(cpumask, attrs->cpumask);
 }
 
 /* install @pwq into @wq's numa_pwq_tbl[] for @node and return the old pwq */

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ