lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Sun, 22 Jun 2014 06:38:59 +0000 From: "fugang.duan@...escale.com" <fugang.duan@...escale.com> To: Russell King <rmk+kernel@....linux.org.uk>, "linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org> CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org> Subject: RE: [PATCH RFC 04/30] net: fec: fix interrupt handling races From: Russell King <rmk@....linux.org.uk> Data: Friday, June 20, 2014 8:12 PM >To: linux-arm-kernel@...ts.infradead.org >Cc: Duan Fugang-B38611; netdev@...r.kernel.org >Subject: [PATCH RFC 04/30] net: fec: fix interrupt handling races > >While running: while :; do iperf -c <HOST> -P 4; done, transmit timeouts >are regularly reported. With the tx ring dumping in place, we can see >that all entries are in use, and the hardware has finished transmitting >these packets. However, the driver has not reclaimed these ring entries. > >This can occur if the interrupt handler is invoked at the wrong moment - >eg: > > CPU0 CPU1 > fec_enet_tx() > interrupt, IEVENT = FEC_ENET_TXF > FEC_ENET_TXF cleared > napi_schedule_prep() > napi_complete() > >The result is that we clear the transmit interrupt, but we don't trigger >any cleaning of the transmit ring. Instead, use a different strategy: > >- When receiving a transmit or receive interrupt, disable both tx and rx > interrupts, but do not acknowledge them. Schedule a napi poll. Don't > loop. > >- When we are polled, read IEVENT, acknowledging the pending transmit > and receive interrupts, before then going on to process the > appropriate rings. > >This allows us to avoid the race, and has a number of other advantages: >- we cut down on the number of transmit interrupts we have to process. >- we only look at the rings which have pending events. >- we gain additional throughput: the iperf total bandwidth increases > from about 180Mbps to 240Mbps: > >[ 3] 0.0-10.0 sec 68.1 MBytes 57.0 Mbits/sec [ 5] 0.0-10.0 sec 72.4 >MBytes 60.5 Mbits/sec [ 4] 0.0-10.1 sec 76.1 MBytes 63.5 Mbits/sec >[ 6] 0.0-10.1 sec 71.9 MBytes 59.9 Mbits/sec >[SUM] 0.0-10.1 sec 288 MBytes 241 Mbits/sec > >Signed-off-by: Russell King <rmk+kernel@....linux.org.uk> >--- > drivers/net/ethernet/freescale/fec_main.c | 40 +++++++++++++++++--------- >----- > 1 file changed, 22 insertions(+), 18 deletions(-) > >diff --git a/drivers/net/ethernet/freescale/fec_main.c >b/drivers/net/ethernet/freescale/fec_main.c >index 045ea71f2b59..4e695b742030 100644 >--- a/drivers/net/ethernet/freescale/fec_main.c >+++ b/drivers/net/ethernet/freescale/fec_main.c >@@ -1369,29 +1369,25 @@ fec_enet_interrupt(int irq, void *dev_id) { > struct net_device *ndev = dev_id; > struct fec_enet_private *fep = netdev_priv(ndev); >+ const unsigned napi_mask = FEC_ENET_RXF | FEC_ENET_TXF; > uint int_events; > irqreturn_t ret = IRQ_NONE; > >- do { >- int_events = readl(fep->hwp + FEC_IEVENT); >- writel(int_events, fep->hwp + FEC_IEVENT); >+ int_events = readl(fep->hwp + FEC_IEVENT); >+ writel(int_events & ~napi_mask, fep->hwp + FEC_IEVENT); > >- if (int_events & (FEC_ENET_RXF | FEC_ENET_TXF)) { >- ret = IRQ_HANDLED; >+ if (int_events & napi_mask) { >+ ret = IRQ_HANDLED; > >- /* Disable the RX interrupt */ >- if (napi_schedule_prep(&fep->napi)) { >- writel(FEC_RX_DISABLED_IMASK, >- fep->hwp + FEC_IMASK); >- __napi_schedule(&fep->napi); >- } >- } >+ /* Disable the NAPI interrupts */ >+ writel(FEC_ENET_MII, fep->hwp + FEC_IMASK); >+ napi_schedule(&fep->napi); >+ } > >- if (int_events & FEC_ENET_MII) { >- ret = IRQ_HANDLED; >- complete(&fep->mdio_done); >- } >- } while (int_events); >+ if (int_events & FEC_ENET_MII) { >+ ret = IRQ_HANDLED; >+ complete(&fep->mdio_done); >+ } > > return ret; > } >@@ -1399,8 +1395,16 @@ fec_enet_interrupt(int irq, void *dev_id) static >int fec_enet_rx_napi(struct napi_struct *napi, int budget) { > struct net_device *ndev = napi->dev; >- int pkts = fec_enet_rx(ndev, budget); > struct fec_enet_private *fep = netdev_priv(ndev); >+ int pkts; >+ >+ /* >+ * Clear any pending transmit or receive interrupts before >+ * processing the rings to avoid racing with the hardware. >+ */ >+ writel(FEC_ENET_RXF | FEC_ENET_TXF, fep->hwp + FEC_IEVENT); >+ >+ pkts = fec_enet_rx(ndev, budget); > > fec_enet_tx(ndev); > Can you add the interrupt check like below. If there have no tx interrupt, it is most likely no tx rings reclaim. Events = readl(fep->hwp + FEC_IEVENT); If (events & FEC_ENET_RXF) pkts = fec_enet_rx(ndev, budget); if (events & FEC_ENET_TXF) fec_enet_tx(ndev); Thanks, Andy -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists