netdev - Re: [PATCH net 2/2] 8139cp: reset BQL when ring tx ring cleared

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20130520213728.GA28252@electric-eye.fr.zoreil.com>
Date:	Mon, 20 May 2013 23:37:28 +0200
From:	Francois Romieu <romieu@...zoreil.com>
To:	Stephen Hemminger <stephen@...workplumber.org>
Cc:	David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH net 2/2]  8139cp: reset BQL when ring tx ring cleared

Stephen Hemminger <stephen@...workplumber.org> :
> This patch cures transmit timeout's with DHCP observed
> while running under KVM. When the transmit ring is cleaned out,
> the Byte Queue Limit values need to be reset.
> 
> Signed-off-by: Stephen Hemminger <stephen@...workplumber.org>
> 
> --- a/drivers/net/ethernet/realtek/8139cp.c	2013-05-20 09:29:49.689900862 -0700
> +++ b/drivers/net/ethernet/realtek/8139cp.c	2013-05-20 09:29:50.857880563 -0700
> @@ -1141,6 +1141,7 @@ static void cp_clean_rings (struct cp_pr
>  			cp->dev->stats.tx_dropped++;
>  		}
>  	}
> +	netdev_reset_queue(cp->dev);
>  
>  	memset(cp->rx_ring, 0, sizeof(struct cp_desc) * CP_RX_RING_SIZE);
>  	memset(cp->tx_ring, 0, sizeof(struct cp_desc) * CP_TX_RING_SIZE);

It's a good catch but I believe that you are not telling us the whole
story. :o)

Where does cp_clean_rings get called ?

- error path of cp_refill_rx
  - cp_init_rings
    - cp_alloc_rings
      - cp_open
        - in cp_change_mtu, after cp_close, whence cp_stop_hw
    - cp_tx_timeout
      -> after cp_stop_hw
- cp_free_rings
  - error path of cp_open
    -> after cp_stop_hw
  - cp_close
    -> after cp_stop_hw
- cp_tx_timeout
  -> after cp_stop_hw

cp_stop_hw includes netdev_reset_queue.

You have imho exhibited a start_xmit after cp_stop_hw race - not sure if
it happens in cp_tx_timeout or cp_change_mtu. Reverting the analysis above,
I have not found a place where cp_stop_hw could be called without being
followed by a cp_clean_rings. The netdev_reset_queue in cp_stop_hw, now
useless, should thus be removed.

Does it make sense ?

-- 
Ueimor
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html