netdev - RE: IPV6 ndisc:: Bad NIC causing IPV6 NDP to stop working


Message-ID: <D8C50530D6022F40A817A35C40CC06A70B34DBF05D@DUBX7MCDUB01.EMEA.DELL.COM>
Date:	Thu, 21 Jun 2012 09:43:33 +0100
From:	<Menny_Hamburger@...l.com>
To:	<eric.dumazet@...il.com>
CC:	<netdev@...r.kernel.org>
Subject: RE: IPV6 ndisc:: Bad NIC causing IPV6 NDP to stop working

For high-availability reasons, the machines in question run with several NICs per subnet, and our own proprietary service fixes up routing when a NIC misbehaves.
We schedule a fix in the field, but our goal is to eliminate as many single points of failure as we can, so that our systems keep running properly when something goes wrong.
We encountered this issue on some proprietary NICs and also with bnx2, where we get "chip not in correct endian mode" errors (this is a separate problem that may need its own discussion).
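
To tell a single bad interface apart from neighbour discovery failing everywhere, one option is to compare neighbour cache states per device. The following is only a minimal sketch, not something taken from our service: it assumes the usual text output of `ip -6 neigh show` (the state as the last word on each line), and the helper names `summarize_neigh` and `suspect_devices`, as well as the "all entries FAILED or INCOMPLETE" heuristic, are hypothetical.

```python
from collections import Counter, defaultdict

# Neighbour cache (NUD) states as printed by `ip -6 neigh show`.
NUD_STATES = {"PERMANENT", "NOARP", "REACHABLE", "STALE", "NONE",
              "INCOMPLETE", "DELAY", "PROBE", "FAILED"}


def summarize_neigh(output):
    """Count neighbour cache states per device.

    `output` is the text printed by `ip -6 neigh show`; the line layout
    ("<addr> dev <ifname> [lladdr <mac>] [router] <STATE>") is assumed.
    """
    per_dev = defaultdict(Counter)
    for line in output.splitlines():
        fields = line.split()
        if "dev" not in fields:
            continue  # blank or unexpected line
        idx = fields.index("dev")
        if idx + 1 >= len(fields):
            continue  # "dev" with no interface name after it
        dev = fields[idx + 1]
        state = fields[-1] if fields[-1] in NUD_STATES else "UNKNOWN"
        per_dev[dev][state] += 1
    return per_dev


def suspect_devices(per_dev):
    """Return devices on which every entry is FAILED or INCOMPLETE.

    Hypothetical heuristic: if this lists every NIC rather than one,
    the problem is probably not confined to a single interface.
    """
    bad = {"FAILED", "INCOMPLETE"}
    return sorted(dev for dev, counts in per_dev.items()
                  if counts and all(state in bad for state in counts))


if __name__ == "__main__":
    # Made-up sample in the assumed format, for illustration only.
    sample = (
        "fe80::1 dev eth0 lladdr 00:11:22:33:44:55 router REACHABLE\n"
        "fe80::2 dev eth1  FAILED\n"
        "fe80::3 dev eth1  INCOMPLETE\n"
    )
    print(suspect_devices(summarize_neigh(sample)))
```

On a live machine the text could be fed in from `subprocess.check_output(["ip", "-6", "neigh", "show"])`, decoded to a string first.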

-----Original Message-----
From: Eric Dumazet [mailto:eric.dumazet@...il.com] 
Sent: 21 June, 2012 11:23
To: Hamburger, Menny
Cc: netdev@...r.kernel.org
Subject: Re: IPV6 ndisc:: Bad NIC causing IPV6 NDP to stop working

On Thu, 2012-06-21 at 08:59 +0100, Menny_Hamburger@...l.com wrote:
> Hi,
> 
> Our machines run EL5.8 x86_64.
> We have witnessed several cases where we suspect that a bad NIC on the machine caused IPV6 neighbour discovery to stop working on all the other NICs - when this happens, ping6 fails on every NIC we try.
> From looking into the code, I see that there is only a single socket assigned for NDP. Would it make sense to allocate a socket per interface instead of a single global socket?
> I have found the following thread in LKML: https://lkml.org/lkml/2006/11/29/335, and it seems that this allocation issue still exists in EL5-based kernels - could this cause the above problem?
> 

What is a bad NIC, and why not fix it?