netdev - Re: [Bugme-new] [Bug 9386] New: sis190 network driver crash


Message-Id: <20071115.150041.156510275.davem@davemloft.net>
Date:	Thu, 15 Nov 2007 15:00:41 -0800 (PST)
From:	David Miller <davem@...emloft.net>
To:	akpm@...ux-foundation.org
Cc:	netdev@...r.kernel.org, bugme-daemon@...zilla.kernel.org,
	chris@...uxepos.com, romieu@...zoreil.com
Subject: Re: [Bugme-new] [Bug 9386] New: sis190 network driver crash

From: Andrew Morton <akpm@...ux-foundation.org>
Date: Thu, 15 Nov 2007 11:58:41 -0800

> On Thu, 15 Nov 2007 07:30:53 -0800 (PST) bugme-daemon@...zilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=9386
 ...
> > I have a problem where I can lock up a number of machines by 
> > changing the link state on a sis190 Ethernet port. For example, during 
> > a data transfer such as FTP if I unplug the Ethernet cable and plug it 
> > back in, the Ethernet interface will stop responding and the machine 
> > will lock up after a minute or so. This behaviour is repeatable. I have the
> > sis190 driver loaded as a module.
> > 
> > I haven't found a kernel version where this doesn't happen. It happens with
> > kernel 2.6.20.15, for example.

I wonder if somehow sis190_phy_task() is creating some kind
of deadlock when handling the link down and up events.

It takes the RTNL semaphore in sis190_phy_task(), but it doesn't
call anything that could deadlock on it.

It does invoke the link-watch layer indirectly, via the
various netif_carrier_{on,off}() calls it makes, but those
should be OK since they just schedule work on a workqueue.

Perhaps what is contributing to the problem is that
sis190_interrupt() still processes the RX and TX queues
even when a link change event is signalled.  Perhaps
the chip doesn't like that.

Francois, I noticed two issues while reviewing the driver for
this bug:

1) The interrupt handler does no SMP locking. The chip might
   not be happy with one thread (in phy_task) programming
   the MDIO whilst another thread does RX/TX ring processing,
   for example.

2) The timeout limit check in __mdio_cmd() is buggy: the limit
   should be 99 instead of 999.
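
To illustrate point 2, here is a minimal userspace sketch, not the
driver code itself. It assumes the polling loop is bounded at 100
iterations (implied by the suggested limit of 99, but an assumption
here), and every name in it (MDIO_POLL_LIMIT, mdio_poll(),
timed_out_buggy(), timed_out_fixed(), always_busy(), never_busy())
is hypothetical. It shows why a check against 999 can never fire
after such a loop, while a check against 99 does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed loop bound, chosen to match the suggested limit of 99. */
#define MDIO_POLL_LIMIT 100

/* Hypothetical stand-in for reading the chip's "command not done" bit. */
typedef bool (*mdio_busy_fn)(void *ctx);

/*
 * Poll until the command completes or the bound is reached.
 * Returns the loop counter on exit: a value below MDIO_POLL_LIMIT
 * means the command completed, MDIO_POLL_LIMIT means it never did.
 */
static unsigned int mdio_poll(mdio_busy_fn busy, void *ctx)
{
    unsigned int i;

    for (i = 0; i < MDIO_POLL_LIMIT; i++) {
        if (!busy(ctx))
            break;
    }
    return i;
}

/* Check against 999: the counter never exceeds 100, so this is never true. */
static bool timed_out_buggy(unsigned int i)
{
    return i > 999;
}

/* Check against 99: true exactly when the loop ran out of iterations. */
static bool timed_out_fixed(unsigned int i)
{
    return i > 99;
}

/* Test doubles: a chip that never finishes, and one that finishes at once. */
static bool always_busy(void *ctx) { (void)ctx; return true; }
static bool never_busy(void *ctx)  { (void)ctx; return false; }
```

With always_busy() the counter ends at 100, so timed_out_buggy()
stays false and the failure goes unreported, whereas
timed_out_fixed() reports it. With never_busy() the counter is 0
and neither check fires.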