Date:	Mon, 22 Nov 2010 11:17:55 -0500
From:	Bernie Innocenti <bernie@...ewiz.org>
To:	Krzysztof Halasa <khc@...waw.pl>
Cc:	Ward Vandewege <ward@....org>, lkml <linux-kernel@...r.kernel.org>,
	Jan Seiffert <kaffeemonster@...glemail.com>
Subject: Re: pc300too on a modern kernel?

On Fri, 2010-11-19 at 22:56 +0100, Krzysztof Halasa wrote:

> It seems it happens this way:
> - sca_xmit() fills the whole ring (leaving one descriptor empty as
>   designed - for EDA to work)
> - the chip transmits something and signals IRQ->sca_tx_done()
> - sca_tx_done can't see any descriptor processed and only wakes the
>   queue. Perhaps we should only wake the queue if at least one
>   descriptor has been processed - though sca_tx_done() should never be
>   called otherwise.
> - sca_xmit is called again with full ring, thus BUG().
> 
> I wonder if the following helps (untested):
> 
> --- a/drivers/net/wan/hd64572.c
> +++ b/drivers/net/wan/hd64572.c
> @@ -293,6 +293,7 @@ static inline void sca_tx_done(port_t *port)
>  	struct net_device *dev = port->netdev;
>  	card_t* card = port->card;
>  	u8 stat;
> +	int wake = 0;
>  
>  	spin_lock(&port->lock);
>  
> @@ -316,10 +317,12 @@ static inline void sca_tx_done(port_t *port)
>  			dev->stats.tx_bytes += readw(&desc->len);
>  		}
>  		writeb(0, &desc->stat);	/* Free descriptor */
> +		wake = 1;
>  		port->txlast = (port->txlast + 1) % card->tx_ring_buffers;
>  	}
>  
> -	netif_wake_queue(dev);
> +	if (wake)
> +		netif_wake_queue(dev);
>  	spin_unlock(&port->lock);
>  }

Last Friday I applied a patch very similar to this one, with a printk on
the no-wake case.

As you predicted, this made the BUG_ON() disappear. My printk fired at
approximately the same frequency as the debug statements I had in
sca_xmit(), thus confirming your hypothesis.
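
To make the sequence concrete, here is a minimal user-space sketch of the
failure mode as I understand it. It is a toy model, not the driver code:
struct tx_ring, RING_SIZE, the model_*() helpers, and the "guarded" flag are
all hypothetical names, and a per-descriptor done[] flag stands in for the
ownership bit the real chip sets in the descriptor status.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define RING_SIZE 4	/* hypothetical; stands in for card->tx_ring_buffers */

struct tx_ring {
	unsigned txin;		/* next descriptor the xmit path fills */
	unsigned txlast;	/* oldest descriptor not yet reclaimed */
	unsigned txhw;		/* next descriptor the "chip" will send */
	bool done[RING_SIZE];	/* set once the "chip" has sent a descriptor */
	bool queue_stopped;	/* models netif_stop_queue()/netif_wake_queue() */
	unsigned bug_hits;	/* times xmit ran on a full ring (the BUG case) */
};

static void ring_init(struct tx_ring *r)
{
	assert(RING_SIZE >= 2);	/* one descriptor is always left empty */
	memset(r, 0, sizeof(*r));
}

/* Full means only the one deliberately empty descriptor is left. */
static bool ring_full(const struct tx_ring *r)
{
	return (r->txin + 1) % RING_SIZE == r->txlast;
}

/* Models the xmit path: queue one packet, stop the queue when full. */
static bool model_xmit(struct tx_ring *r)
{
	if (ring_full(r)) {
		r->bug_hits++;	/* the real driver would BUG() here */
		return false;
	}
	r->done[r->txin] = false;
	r->txin = (r->txin + 1) % RING_SIZE;
	if (ring_full(r))
		r->queue_stopped = true;
	return true;
}

/* Models the chip finishing one pending descriptor, if there is one. */
static void model_hw_complete(struct tx_ring *r)
{
	if (r->txhw != r->txin) {
		r->done[r->txhw] = true;
		r->txhw = (r->txhw + 1) % RING_SIZE;
	}
}

/*
 * Models the TX-done interrupt. With guarded == false the queue is always
 * woken (old behaviour); with guarded == true it is woken only if at least
 * one descriptor was reclaimed (the patched behaviour).
 * Returns the number of descriptors reclaimed.
 */
static unsigned model_tx_done(struct tx_ring *r, bool guarded)
{
	unsigned reclaimed = 0;

	while (r->txlast != r->txin && r->done[r->txlast]) {
		r->done[r->txlast] = false;	/* free descriptor */
		r->txlast = (r->txlast + 1) % RING_SIZE;
		reclaimed++;
	}
	if (!guarded || reclaimed)
		r->queue_stopped = false;
	return reclaimed;
}
```

Filling the ring and then calling model_tx_done(&r, false) without a prior
model_hw_complete() reclaims nothing but still wakes the queue, so the next
model_xmit() lands on a full ring. With guarded set, the queue stays stopped
until a descriptor has really been freed.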

Now the question is: why do we get so many spurious interrupts?

With this workaround applied, we're still seeing occasional clusters of
packet loss. We're working to graph the ping loss alongside traffic to
see if there's any correlation.

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs       - http://sugarlabs.org/
