Message-Id: <20160727071400.GA3912@osiris>
Date: Wed, 27 Jul 2016 09:14:00 +0200
From: Heiko Carstens <heiko.carstens@...ibm.com>
To: "Theodore Ts'o" <tytso@....edu>
Cc: linux-next@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Martin Schwidefsky <schwidefsky@...ibm.com>
Subject: [BUG -next] "random: make /dev/urandom scalable for silly userspace
programs" causes crash
Hi Ted,
it looks like your patch "random: make /dev/urandom scalable for silly
userspace programs" in linux-next is a bit broken.
It causes this allocation failure and subsequent crash on s390 with fake
NUMA enabled:
[ 0.533195] SLUB: Unable to allocate memory on node 1, gfp=0x24008c0(GFP_KERNEL|__GFP_NOFAIL)
[ 0.533198] cache: kmalloc-192, object size: 192, buffer size: 528, default order: 3, min order: 0
[ 0.533202] node 0: slabs: 2, objs: 124, free: 17
[ 0.533208] Unable to handle kernel pointer dereference in virtual kernel address space
[ 0.533211] Failing address: 0000000000000000 TEID: 0000000000000483
...
[ 0.533276] Krnl PSW : 0704e00180000000 00000000001a853e (lockdep_init_map+0x1e/0x220)
[ 0.533281] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000a23400 00000000370c8008 0000000000000060 0000000000bedc90
[ 0.533285] 0000000002070800 0000000000000000 0000000000000001 0000000000000000
[ 0.533287] 000000003743d3f8 000000003743d408 0000000002070800 0000000000bedc90
[ 0.533289] 0000000000000048 00000000009c2030 00000000370cfd00 00000000370cfcc0
[ 0.533295] Krnl Code: 00000000001a852e: a7840001 brc 8,1a8530
00000000001a8532: e3f0ffc0ff71 lay %r15,-64(%r15)
#00000000001a8538: e3e0f0980024 stg %r14,152(%r15)
>00000000001a853e: e54820080000 mvghi 8(%r2),0
00000000001a8544: e54820100000 mvghi 16(%r2),0
00000000001a854a: 58100370 l %r1,880
00000000001a854e: 50102020 st %r1,32(%r2)
00000000001a8552: b90400c2 lgr %r12,%r2
[ 0.533313] Call Trace:
[ 0.533315] ([<0000000000000001>] 0x1)
[ 0.533318] ([<00000000001b4220>] __raw_spin_lock_init+0x50/0x80)
[ 0.533320] ([<0000000000759e7a>] rand_initialize+0xc2/0xf0)
[ 0.533322] ([<00000000001002cc>] do_one_initcall+0xb4/0x140)
[ 0.533325] ([<0000000000ef2cc0>] kernel_init_freeable+0x140/0x2d8)
[ 0.533328] ([<00000000009b07ea>] kernel_init+0x2a/0x150)
[ 0.533330] ([<00000000009bd782>] kernel_thread_starter+0x6/0xc)
[ 0.533332] ([<00000000009bd77c>] kernel_thread_starter+0x0/0xc)
To me it looks like rand_initialize() is broken with CONFIG_NUMA:
static int rand_initialize(void)
{
#ifdef CONFIG_NUMA
	int i;
	int num_nodes = num_possible_nodes();
	struct crng_state *crng;
	struct crng_state **pool;
#endif

	init_std_data(&input_pool);
	init_std_data(&blocking_pool);
	crng_initialize(&primary_crng);

#ifdef CONFIG_NUMA
	pool = kmalloc(num_nodes * sizeof(void *),
		       GFP_KERNEL|__GFP_NOFAIL|__GFP_ZERO);
	for (i=0; i < num_nodes; i++) {
		crng = kmalloc_node(sizeof(struct crng_state),
				    GFP_KERNEL | __GFP_NOFAIL, i);
		spin_lock_init(&crng->lock);
		crng_initialize(crng);
		pool[i] = crng;
	}
	mb();
	crng_node_pool = pool;
#endif
	return 0;
}
early_initcall(rand_initialize);
First, the for loop should use for_each_node() to skip nodes that aren't
possible, no?
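I.e. something like this (untested sketch; note that with sparse node
numbers the pool would then also have to be sized with nr_node_ids
instead of num_possible_nodes(), otherwise pool[i] can index past the
end of the array):

	for_each_node(i) {
		crng = kmalloc_node(sizeof(struct crng_state),
				    GFP_KERNEL | __GFP_NOFAIL, i);
		spin_lock_init(&crng->lock);
		crng_initialize(crng);
		pool[i] = crng;
	}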
However, that wouldn't be enough: in this case it crashed because node 1
is in the possible map, but it isn't online and doesn't have any memory,
which explains the allocation failure and the subsequent crash when
spin_lock_init() gets called.
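That is, annotated, these are the two lines from the snippet above that
blow up on such a node:

	/* returns NULL for the possible, but memoryless node 1 */
	crng = kmalloc_node(sizeof(struct crng_state),
			    GFP_KERNEL | __GFP_NOFAIL, i);
	/* NULL pointer dereference -> the lockdep_init_map crash above */
	spin_lock_init(&crng->lock);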
I think the proper fix would be to simply use for_each_online_node(); at
least that fixes the crash on s390.
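A minimal sketch of what I have in mind (again sizing the pool with
nr_node_ids to be safe with sparse node numbering; kzalloc() instead of
kmalloc() with __GFP_ZERO is just cosmetic):

#ifdef CONFIG_NUMA
	pool = kzalloc(nr_node_ids * sizeof(void *),
		       GFP_KERNEL | __GFP_NOFAIL);
	for_each_online_node(i) {
		crng = kmalloc_node(sizeof(struct crng_state),
				    GFP_KERNEL | __GFP_NOFAIL, i);
		spin_lock_init(&crng->lock);
		crng_initialize(crng);
		pool[i] = crng;
	}
	mb();
	crng_node_pool = pool;
#endif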