Date:	Sun, 21 Mar 2010 22:54:50 +0100
From:	Stefani Seibold <stefani@...bold.net>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-kernel <linux-kernel@...r.kernel.org>,
	netdev@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
	David Miller <davem@...emloft.net>
Subject: Re: [PATCH] fix PHY polling system blocking

I have now analyzed the PHY handling in most of the network drivers. Most
of the PHY communication is handled in a polling/blocking way: write a
command word and then wait for the result. Due to the nature of the PHY
attachment, this takes some time.

Some of the network drivers also do this polling/blocking in atomic code
paths, such as interrupt handlers or timers. So activity on the PHY can
cause huge latency jitter.

On the other hand, most of the network drivers handle the PHY without
using the phylib, or use it only partially.

The phylib also has a drawback: it polls the PHY regardless of whether
interrupt support is available for it. I can't see a reason for this
behavior.

So the problem of huge latencies caused by polling the PHY occurs in most
of the network drivers. For example, have a look at the e100 network
driver, in the file drivers/net/e100.c, function mdio_ctrl_hw(): this
function will poll for a maximum of 4000 us (4 ms).
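
To make the pattern concrete, here is a minimal userspace sketch of such a
counter-bounded poll loop. It is not the e100 code: the register layout,
the READY bit, the fake device and the 200 x 20 us numbers are all
assumptions chosen only so that the worst case adds up to 4000 us. The
delay is accounted for rather than actually spent, so the sketch runs
instantly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ready bit; not taken from any real MDIO register layout. */
#define FAKE_MDI_READY (1u << 28)

/* Simulated PHY control register that becomes ready after some reads. */
struct fake_mdio {
    uint32_t reg;         /* simulated register contents */
    unsigned ready_after; /* number of reads that still return "not ready" */
};

/* Stand-in for a register read such as ioread32(). */
static uint32_t fake_mdio_read(struct fake_mdio *m)
{
    if (m->ready_after > 0)
        m->ready_after--;
    else
        m->reg |= FAKE_MDI_READY;
    return m->reg;
}

/*
 * Counter-bounded poll: delay, read, repeat up to max_polls times.
 * In a driver the delay would be a busy-wait like udelay(), so the CPU
 * is held for the whole time; here the delay is only added to *spent_us.
 * Returns true if the ready bit was seen before the counter ran out.
 */
static bool poll_ready_blocking(struct fake_mdio *m, unsigned max_polls,
                                unsigned delay_us, unsigned *spent_us)
{
    *spent_us = 0;
    for (unsigned i = 0; i < max_polls; i++) {
        *spent_us += delay_us; /* models udelay(delay_us) */
        if (fake_mdio_read(m) & FAKE_MDI_READY)
            return true;
    }
    return false;
}
```

If such a loop runs in an interrupt handler or timer callback, the whole
accumulated delay shows up directly as latency for everything else on
that CPU.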

To fix this latency jitter problem with the PHY polling, the following
steps are needed:

- disable polling in drivers/net/phy/phy.c if an interrupt for the PHY is
available
- create a dedicated single-threaded or per-CPU workqueue for the phylib,
so that the PHY-specific code can temporarily schedule or block
- prevent all current users of the phylib from accessing the PHY in an
atomic code path
- modify all current users of the phylib to use cond_resched() instead of
cpu_relax(), and replace the loop counters with a check against a timeout
- modify all other network drivers to use the phylib
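
The fourth step could look roughly like the following userspace sketch.
It is only an illustration under stated assumptions: sched_yield() stands
in for cond_resched(), CLOCK_MONOTONIC stands in for jiffies, and
wait_ready_timeout(), fake_phy and fake_phy_ready() are hypothetical names,
not existing phylib functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sched.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Callback that reports whether the (hypothetical) PHY operation is done. */
typedef bool (*ready_fn)(void *ctx);

/* Monotonic time in nanoseconds; userspace stand-in for jiffies. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Poll until ready() returns true or timeout_us has elapsed.
 * Instead of counting iterations and spinning, the loop compares against
 * a deadline and gives up the CPU between polls. This is only legal in a
 * context that may sleep, which is why the atomic callers must go first.
 * Returns 0 on success and -ETIMEDOUT on timeout.
 */
static int wait_ready_timeout(ready_fn ready, void *ctx, uint64_t timeout_us)
{
    uint64_t deadline = now_ns() + timeout_us * 1000ull;

    for (;;) {
        if (ready(ctx))
            return 0;
        if (now_ns() >= deadline)
            /* Check once more: we may have been scheduled out for long. */
            return ready(ctx) ? 0 : -ETIMEDOUT;
        sched_yield(); /* stand-in for cond_resched() */
    }
}

/* Hypothetical device that becomes ready after a number of polls. */
struct fake_phy {
    unsigned reads_until_ready;
};

static bool fake_phy_ready(void *ctx)
{
    struct fake_phy *p = ctx;

    if (p->reads_until_ready == 0)
        return true;
    p->reads_until_ready--;
    return false;
}
```

The final re-check after the deadline matters once the loop can be
scheduled away: without it, a long preemption between the poll and the
time check would be reported as a timeout even though the PHY had already
finished.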

What do you think?

Stefani


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
