Message-ID: <201109091623.29000.cratiu@ixiacom.com>
Date:	Fri, 9 Sep 2011 16:23:28 +0300
From:	Cosmin Ratiu <cratiu@...acom.com>
To:	<linux-mips@...ux-mips.org>
CC:	<netdev@...r.kernel.org>
Subject: Octeon crash in virt_to_page(&core0_stack_variable)

Hello,

I've been investigating a strange crash and I wanted to ask for your help.
The crash happens when virt_to_page is called with an address from the softirq 
stack of core 0 on Cavium Octeon. It may happen on other MIPS processors as 
well, but I'm not sure.

I've attached a simple kernel module that demonstrates the problem, along with
the dmesg output including the crash. The kernel should crash two seconds
after the module is inserted.

From what I've dug up in the kernel sources, it seems the stack for the first
idle task resides in the data segment (mapped in kseg2), while the stacks of
the other idle tasks are allocated with kmalloc in __cpu_up() and reside in a
different area (CAC_BASE upwards). It seems virt_to_phys produces bogus
results for kseg2 addresses, and virt_to_page then crashes trying to access
invalid memory.
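
To make the arithmetic concrete, here is a minimal user-space sketch of why a
linear-map translation could go wrong for a kseg2 address. It is not the
kernel's code: the model_* helpers are hypothetical names, and PAGE_OFFSET,
the page size, the RAM size and the example addresses are all assumptions
chosen for illustration. The only idea it relies on is that virt_to_phys is a
plain offset subtraction, which is meaningful only inside the linear window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All constants are illustrative assumptions, not values from the report. */
#define PAGE_SHIFT  12                              /* assume 4 KiB pages */
#define PAGE_OFFSET UINT64_C(0x8000000000000000)    /* assumed linear (CAC_BASE) window */
#define KSEG2_BASE  UINT64_C(0xffffffffc0000000)    /* start of the mapped kseg2 window */
#define RAM_BYTES   (UINT64_C(2) << 30)             /* pretend the board has 2 GiB */
#define MAX_PFN     (RAM_BYTES >> PAGE_SHIFT)       /* pages covered by the page array */

/* Hypothetical model: linear-map translation is a plain subtraction. */
static uint64_t model_virt_to_phys(uint64_t vaddr)
{
    return vaddr - PAGE_OFFSET;
}

/* Hypothetical model: page frame number derived from the translation. */
static uint64_t model_virt_to_pfn(uint64_t vaddr)
{
    return model_virt_to_phys(vaddr) >> PAGE_SHIFT;
}

/* A pfn is usable as a page-array index only if it lies inside RAM. */
static bool model_pfn_valid(uint64_t pfn)
{
    return pfn < MAX_PFN;
}

/* Compare a stack in the linear window with one mapped in kseg2. */
static void self_check(void)
{
    /* Made-up stack addresses, one per region. */
    uint64_t linear_stack = PAGE_OFFSET + UINT64_C(0x04001f40);
    uint64_t kseg2_stack  = KSEG2_BASE  + UINT64_C(0x00401f40);

    /* Linear window: the subtraction yields a real physical address. */
    assert(model_virt_to_phys(linear_stack) == UINT64_C(0x04001f40));
    assert(model_pfn_valid(model_virt_to_pfn(linear_stack)));

    /* kseg2: the same subtraction yields a pfn far beyond the end of RAM,
     * so indexing a page array with it would touch invalid memory. */
    assert(!model_pfn_valid(model_virt_to_pfn(kseg2_stack)));
}
```

Under these assumptions the kseg2 address turns into a page frame number many
orders of magnitude past the last real page, which would match the symptom of
virt_to_page dereferencing invalid memory.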

This problem was discovered while running BGP traffic with the TCP MD5 option
enabled, where the following call chain caused a crash:

 * tcp_v4_rcv
 *  tcp_v4_timewait_ack
 *   tcp_v4_send_ack -> follow stack variable rep.th
 *    tcp_v4_md5_hash_hdr
 *     tcp_md5_hash_header
 *      sg_init_one
 *       sg_set_buf
 *        virt_to_page

I noticed that tcp_v4_send_reset uses a similar stack variable and also calls 
tcp_v4_md5_hash_hdr, so it has the same problem.

I don't fully understand the Octeon mm details, so I wanted to bring up this
issue in order to find a proper fix. To avoid the problem, I've implemented a
quick hack that declares those variables per-CPU instead of on the stack, so
that they also reside in CAC_BASE upwards. I've attached a patch against
2.6.32 for reference.

Cosmin.

View attachment "dmesg.log" of type "text/x-log" (20013 bytes)

View attachment "vcrash.c" of type "text/x-csrc" (1001 bytes)

View attachment "tcp-md5-crash.diff" of type "text/x-patch" (4763 bytes)
