linux-kernel - Re: Linux-2.6.21 hangs during post boot initialization phase

Message-ID: <4632088C.8000509@bigpond.net.au>
Date:	Sat, 28 Apr 2007 00:28:28 +1000
From:	Peter Williams <pwil3058@...pond.net.au>
To:	Neil Horman <nhorman@...driver.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	jgarzik@...ox.com
CC:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux-2.6.21 hangs during post boot initialization phase

Neil Horman wrote:
> On Fri, Apr 27, 2007 at 04:05:11PM +1000, Peter Williams wrote:
>> Linus Torvalds wrote:
>>> On Fri, 27 Apr 2007, Peter Williams wrote:
>>>> The 2.6.21 kernel is hanging during the post boot phase where various
>>>> daemons are being started (not always the same daemon unfortunately).
>>>>
>>>> This problem was not present in 2.6.21-rc7 and there is no oops or other
>>>> unusual output in the system log at the time the hang occurs.
>>> Can you use "git bisect" to narrow it down a bit more? It's only 125 
>>> commits, so bisecting even just three or four kernels will narrow it down 
>>> to a handful.
>> As the changes became smaller, the builds became quicker :-) and after 7 
>> iterations we have:
>>
>>
>> author	Neil Horman <nhorman@...driver.com>
>> 	Fri, 20 Apr 2007 13:54:58 +0000 (09:54 -0400)
>> committer	Jeff Garzik <jeff@...zik.org>
>> 	Tue, 24 Apr 2007 16:43:07 +0000 (12:43 -0400)
>> commit	b748d9e3b80dc7e6ce6bf7399f57964b99a4104c
>> tree	887909e1f735bb444ef0e3e370f34401fa6eee02
>> parent	d91c088b39e3c66d309938de858775bb90fd1ead
>> sis900: Allocate rx replacement buffer before rx operation
>>
>> The sis900 driver appears to have a bug in which the receive routine
>> passes the skbuff holding the received frame to the network stack before
>> refilling the buffer in the rx ring.  If a new skbuff cannot be allocated,
>> the driver simply leaves a hole in the rx ring, which causes the driver to
>> stop receiving frames and become non-recoverable without an rmmod/insmod
>> according to reporters.  This patch reverses that order, attempting to
>> allocate a replacement buffer first, and receiving the new frame only if
>> one can be allocated.  If no skbuff can be allocated, the current skbuff
>> in the rx ring is recycled, dropping the current frame, but keeping the
>> NIC operational.
>>
>> Signed-off-by: Neil Horman <nhorman@...driver.com>
>> Signed-off-by: Jeff Garzik <jeff@...zik.org>
>>
>> Peter
>> -- 
>> Peter Williams                                   pwil3058@...pond.net.au
>>
>> "Learning, n. The kind of ignorance distinguishing the studious."
>>  -- Ambrose Bierce
> 
> This was reported to me last night, and I've posted a patch to fix it. It's
> available here:
> http://marc.info/?l=linux-netdev&m=117761259222165&w=2
> 
> It applies on top of the previous patch, and should fix your problem.
> 
> Here's a copy of the patch
> 
> Thanks & Regards
> Neil
> 
> 
> diff --git a/drivers/net/sis900.c b/drivers/net/sis900.c
> index a6a0f09..7e44939 100644
> --- a/drivers/net/sis900.c
> +++ b/drivers/net/sis900.c
> @@ -1754,6 +1754,7 @@ static int sis900_rx(struct net_device *net_dev)
>  			sis_priv->rx_ring[entry].cmdsts = RX_BUF_SIZE;
>  		} else {
>  			struct sk_buff * skb;
> +			struct sk_buff * rx_skb;
>  
>  			pci_unmap_single(sis_priv->pci_dev,
>  				sis_priv->rx_ring[entry].bufptr, RX_BUF_SIZE,
> @@ -1787,10 +1788,10 @@ static int sis900_rx(struct net_device *net_dev)
>  			}
>  
>  			/* give the socket buffer to upper layers */
> -			skb = sis_priv->rx_skbuff[entry];
> -			skb_put(skb, rx_size);
> -			skb->protocol = eth_type_trans(skb, net_dev);
> -			netif_rx(skb);
> +			rx_skb = sis_priv->rx_skbuff[entry];
> +			skb_put(rx_skb, rx_size);
> +			rx_skb->protocol = eth_type_trans(rx_skb, net_dev);
> +			netif_rx(rx_skb);
>  
>  			/* some network statistics */
>  			if ((rx_status & BCAST) == MCAST)

This patch fixes the problem for me.
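
For anyone following the thread, the ordering described in the quoted commit
message (allocate the replacement buffer first, hand the received buffer up
only if that succeeds, otherwise recycle the old one) can be sketched in plain
userspace C. This is only an illustration under stated assumptions, not the
sis900 driver code: struct buf, struct rx_ring, rx_one(), alloc_ok(),
alloc_fail(), RING_SIZE and BUF_SIZE are all hypothetical names standing in
for sk_buff, the driver's rx ring and its allocator. What it tries to show is
that the received buffer and its replacement need separate pointers, so that
the ring slot never ends up holding a buffer that was already passed up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE 4   /* hypothetical ring size, not the driver's value */
#define BUF_SIZE  64  /* hypothetical stand-in for RX_BUF_SIZE */

/* Stand-in for struct sk_buff; this is not the kernel type. */
struct buf {
	size_t len;
	unsigned char data[BUF_SIZE];
};

struct rx_ring {
	struct buf *slot[RING_SIZE];
};

/* Allocation hook so a caller can simulate memory pressure. */
typedef struct buf *(*alloc_fn)(void);

struct buf *alloc_ok(void)
{
	return calloc(1, sizeof(struct buf));
}

struct buf *alloc_fail(void)
{
	return NULL;
}

/*
 * Receive one frame of rx_size bytes from ring slot `entry`.
 * Returns the buffer handed to the upper layer, or NULL if no replacement
 * could be allocated, in which case the frame is dropped and the old
 * buffer stays in the ring.
 */
struct buf *rx_one(struct rx_ring *ring, size_t entry, size_t rx_size,
		   alloc_fn alloc)
{
	struct buf *replacement;
	struct buf *rx_buf;

	assert(entry < RING_SIZE);
	assert(rx_size <= BUF_SIZE);

	/* Step 1: try to get the replacement before touching the frame. */
	replacement = alloc();
	if (replacement == NULL) {
		/* Recycle: drop the frame, keep the slot populated. */
		ring->slot[entry]->len = 0;
		return NULL;
	}

	/* Step 2: take the received buffer out through its own pointer. */
	rx_buf = ring->slot[entry];
	rx_buf->len = rx_size;

	/* Step 3: refill the slot with the new buffer, never with rx_buf. */
	ring->slot[entry] = replacement;
	return rx_buf;
}
```

A caller would fill every slot with alloc_ok() once, then call rx_one() per
received frame; passing alloc_fail exercises the recycle path.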

Peter
-- 
Peter Williams                                   pwil3058@...pond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce