Message-ID: <1311946421.2843.16.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
Date:	Fri, 29 Jul 2011 15:33:41 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	synapse <synapse@...py.csoma.elte.hu>
Cc:	netdev@...r.kernel.org
Subject: Re: PROBLEM: BUG (NULL ptr dereference in ipv4_dst_check)

On Friday, 29 July 2011 at 15:18 +0200, synapse wrote:
> Hello guys,
> 
> I have a problem that I hope you can help me resolve. This is my first
> real bug report, so please be patient :)
> 
> ### Description:
> 3.0.0-rc4 routinely locks up with "BUG: unable to handle kernel NULL
> pointer dereference at 000000000000002c".
> I have an Intel SR2600 machine with a 10Gbit interface; it periodically
> locks up after a few days.
> It serves a lot of traffic. The trace is at the end of the mail.
> ###
> 
> ### My efforts:
> I've traced the error back from atomic_dec_and_test() to:
> 
> ipv4_dst_check()
> check_peer_redir()
> neigh_release()
> atomic_dec_and_test()
> 
> The parameter to atomic_dec_and_test() is NULL (&neigh->refcnt in
> neigh_release), so atomic_dec_and_test() from
> arch/x86/include/asm/atomic.h dies at address 0xffffffff8140f56f.
> 
> ffffffff8140f560:       48 8b 15 19 47 2f 00    mov    0x2f4719(%rip),%rdx        # 0xffffffff81703c80
> ffffffff8140f567:       48 89 50 18             mov    %rdx,0x18(%rax)
> ffffffff8140f56b:       48 8b 7b 40             mov    0x40(%rbx),%rdi
> ffffffff8140f56f:       f0 ff 4f 2c             lock decl 0x2c(%rdi)
> ffffffff8140f573:       0f 94 c0                sete   %al
> ffffffff8140f576:       84 c0                   test   %al,%al
> ffffffff8140f578:       0f 85 ab 00 00 00       jne    0xffffffff8140f629
> 
> From what I've seen, this code is responsible for PMTU-related
> things. The refcount member of struct neighbour is NULL and the neigh
> pointer (struct neighbour *) in neigh_release() is not. I have no clue
> how this might happen, though I suspect somebody releases the data
> structure somehow. Note that this code is invoked when
> redirect_learned.a4 is set and is different from rt_gateway in
> ipv4_dst_check().
> 
> Is it possible that two packets go to two different cores for
> processing and one core invalidates the rt entry the other is currently
> working on (meaning the second will try to dereference a NULL ptr)?
> ###
> 
> 
> This is just my clumsy attempt at tracking this down; I'm not a kernel
> expert, unfortunately. I'm happy to provide further info on the matter.
> If I'm completely on the wrong track please let me know.
> 
> Thank you for any help,
> Gergely Kalman
> 

This bug was probably already fixed.

Please try the current Linux tree.


