Message-ID: <19f34abd0807231545u5bc8b55fm768527a02268f111@mail.gmail.com>
Date:	Thu, 24 Jul 2008 00:45:36 +0200
From:	"Vegard Nossum" <vegard.nossum@...il.com>
To:	"Dmitry Adamushko" <dmitry.adamushko@...il.com>
Cc:	"Suresh Siddha" <suresh.b.siddha@...el.com>,
	LKML <linux-kernel@...r.kernel.org>,
	"the arch/x86 maintainers" <x86@...nel.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	"Ingo Molnar" <mingo@...e.hu>,
	"Peter Zijlstra" <a.p.zijlstra@...llo.nl>, netdev@...r.kernel.org
Subject: Re: recent -git: BUG in free_thread_xstate

On Thu, Jul 24, 2008 at 12:05 AM, Vegard Nossum <vegard.nossum@...il.com> wrote:
> On Thu, Jul 24, 2008 at 12:01 AM, Dmitry Adamushko
> <dmitry.adamushko@...il.com> wrote:
>> So I guess, 'cpu' value is slightly, well, out of reality. Check the
>> address of "runqueues" in your kernel image...
>> I guess, it should be quite close to the "fault" address... then we
>> can even calculate 'cpu' :-)
>
> Yup, that's right.
>
> $ nm vmlinux | grep runqueues
> c0803f00 d per_cpu__runqueues

Hey, with this patch applied:

diff --git a/include/asm-x86/string_32.h b/include/asm-x86/string_32.h
index b49369a..7bef7ea 100644
--- a/include/asm-x86/string_32.h
+++ b/include/asm-x86/string_32.h
@@ -29,9 +29,14 @@ extern char *strchr(const char *s, int c);
 #define __HAVE_ARCH_STRLEN
 extern size_t strlen(const char *s);

+extern void warn_on_slowpath(const char *file, int line);
+
 static __always_inline void * __memcpy(void * to, const void * from, size_t n)
 {
 int d0, d1, d2;
+       if (n == 0x6b)
+               warn_on_slowpath(__FILE__, __LINE__);
+
 __asm__ __volatile__(
        "rep ; movsl\n\t"
        "movl %4,%%ecx\n\t"

I have found an important clue; it seems to be my network driver's fault:

------------[ cut here ]------------
WARNING: at include2/asm/string_32.h:38 skb_copy_and_csum_dev+0xee/0x100()
Pid: 3989, comm: bash Tainted: G        W 2.6.26-dirty #3
 [<c013496f>] warn_on_slowpath+0x4f/0x70
 [<c0198041>] ? check_bytes_and_report+0x21/0xc0
 [<c04a8544>] ? __kfree_skb+0x34/0x80
 [<c0198041>] ? check_bytes_and_report+0x21/0xc0
 [<c01983ef>] ? check_object+0xdf/0x1f0
 [<c0198041>] ? check_bytes_and_report+0x21/0xc0
 [<c04a8544>] ? __kfree_skb+0x34/0x80
 [<c01983ef>] ? check_object+0xdf/0x1f0
 [<c04bbafc>] ? find_skb+0x3c/0x80
 [<c04a9f7e>] skb_copy_and_csum_dev+0xee/0x100
 [<c03539d7>] rtl8139_start_xmit+0x57/0x130
 [<c019a84b>] ? __kmalloc_track_caller+0x8b/0x120
 [<c04bba6e>] netpoll_send_skb+0x14e/0x1a0
 [<c04bbf54>] netpoll_send_udp+0x1e4/0x210
 [<c0374b0c>] write_msg+0x8c/0xc0
 [<c0135053>] __call_console_drivers+0x53/0x60
 [<c01350ab>] _call_console_drivers+0x4b/0x90
 [<c01351f5>] release_console_sem+0xc5/0x1f0
 [<c01357fe>] vprintk+0x2ce/0x420
 [<c0107e7d>] ? do_IRQ+0x4d/0xa0
 [<c0104de5>] ? restore_nocheck+0x12/0x15
 [<c0286ae1>] ? delay_tsc+0x61/0xb8
 [<c0286b06>] ? delay_tsc+0x86/0xb8
 [<c013596b>] printk+0x1b/0x20
 [<c0580d5d>] native_cpu_up+0x7cd/0x880
 [<c01df741>] ? internal_create_group+0xd1/0x180
 [<c0580470>] ? do_fork_idle+0x0/0x20
 [<c014d7c9>] ? __raw_notifier_call_chain+0x19/0x20
 [<c05826f3>] _cpu_up+0x83/0x100
 [<c05827b9>] cpu_up+0x49/0x70
 [<c05635d8>] store_online+0x58/0x80
 [<c0563580>] ? store_online+0x0/0x80
 [<c02fda2b>] sysdev_store+0x2b/0x40
 [<c01dd7b2>] sysfs_write_file+0xa2/0x100
 [<c019f156>] vfs_write+0x96/0x130
 [<c01dd710>] ? sysfs_write_file+0x0/0x100
 [<c019f81d>] sys_write+0x3d/0x70
 [<c0104cdb>] sysenter_past_esp+0x78/0xd1
 =======================
---[ end trace a7919e7f17c0a725 ]---

In particular, these are interesting:

 [<c04a9f7e>] skb_copy_and_csum_dev+0xee/0x100

This is net/core/skbuff.c:1731:
        skb_copy_from_linear_data(skb, to, csstart);

 [<c03539d7>] rtl8139_start_xmit+0x57/0x130

This is drivers/net/8139too.c:1711:
                dev_kfree_skb(skb);

(The line numbers are still from v2.6.26, but this reproduces on
current -git as well.)

Is this enough information to fix it? :-)


Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036