netdev - Re: [PATCH] net: nvidia: forcedeth: Fix two possible concurrency use-after-free bugs

Message-ID: <a861dacb-c992-13d7-5458-09445183655b@oracle.com>
Date:   Wed, 9 Jan 2019 10:35:48 +0800
From:   Yanjun Zhu <yanjun.zhu@...cle.com>
To:     Jia-Ju Bai <baijiaju1990@...il.com>, davem@...emloft.net,
        keescook@...omium.org
Cc:     netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net: nvidia: forcedeth: Fix two possible concurrency
 use-after-free bugs


On 2019/1/9 10:03, Jia-Ju Bai wrote:
>
>
> On 2019/1/9 9:24, Yanjun Zhu wrote:
>>
>> On 2019/1/8 20:57, Jia-Ju Bai wrote:
>>>
>>>
>>> On 2019/1/8 20:54, Zhu Yanjun wrote:
>>>>
>>>> On 2019/1/8 20:45, Jia-Ju Bai wrote:
>>>>> In drivers/net/ethernet/nvidia/forcedeth.c, the functions
>>>>> nv_start_xmit() and nv_start_xmit_optimized() can be concurrently
>>>>> executed with nv_poll_controller().
>>>>>
>>>>> nv_start_xmit
>>>>>    line 2321: prev_tx_ctx->skb = skb;
>>>>>
>>>>> nv_start_xmit_optimized
>>>>>    line 2479: prev_tx_ctx->skb = skb;
>>>>>
>>>>> nv_poll_controller
>>>>>    nv_do_nic_poll
>>>>>      line 4134: spin_lock(&np->lock);
>>>>>      nv_drain_rxtx
>>>>>        nv_drain_tx
>>>>>          nv_release_txskb
>>>>>            line 2004: dev_kfree_skb_any(tx_skb->skb);
>>>>>
>>>>> Thus, two possible concurrency use-after-free bugs may occur.
>>>>>
>>>>> To fix these possible bugs,
>>>>
>>>>
>>>> Does this really occur? Can you reproduce it?
>>>
>>> This bug was not found by real execution.
>>> It was found by a static analysis tool that I wrote, and I then
>>> checked it by manual code review.
>>
>> Before "line 2004: dev_kfree_skb_any(tx_skb->skb); ",
>>
>> "
>>
>>                 nv_disable_irq(dev);
>>                 nv_napi_disable(dev);
>>                 netif_tx_lock_bh(dev);
>>                 netif_addr_lock(dev);
>>                 spin_lock(&np->lock);
>>                 /* stop engines */
>>                 nv_stop_rxtx(dev);   <--- this stops rx/tx
>>                 nv_txrx_reset(dev);
>> "
>>
>> In this case, can nv_start_xmit or nv_start_xmit_optimized still
>> run?
>>
>
> nv_stop_rxtx() calls nv_stop_tx(dev).
>
> static void nv_stop_tx(struct net_device *dev)
> {
>     struct fe_priv *np = netdev_priv(dev);
>     u8 __iomem *base = get_hwbase(dev);
>     u32 tx_ctrl = readl(base + NvRegTransmitterControl);
>
>     if (!np->mac_in_use)
>         tx_ctrl &= ~NVREG_XMITCTL_START;
>     else
>         tx_ctrl |= NVREG_XMITCTL_TX_PATH_EN;
>     writel(tx_ctrl, base + NvRegTransmitterControl);
>     if (reg_delay(dev, NvRegTransmitterStatus, NVREG_XMITSTAT_BUSY, 0,
>               NV_TXSTOP_DELAY1, NV_TXSTOP_DELAY1MAX))
>         netdev_info(dev, "%s: TransmitterStatus remained busy\n",
>                 __func__);
>
>     udelay(NV_TXSTOP_DELAY2);
>     if (!np->mac_in_use)
>         writel(readl(base + NvRegTransmitPoll) &
>                NVREG_TRANSMITPOLL_MAC_ADDR_REV,
>                base + NvRegTransmitPoll);
> }
>
> nv_stop_tx() seems only to write registers that stop the hardware
> from transmitting.
> It does not wait until nv_start_xmit() and
> nv_start_xmit_optimized() finish executing.
There are 3 modes in the forcedeth NIC.
In throughput mode (0), every tx and rx packet generates an interrupt.
In CPU mode (1), interrupts are controlled by a timer.
In dynamic mode (2), the mode toggles between throughput and CPU mode
based on network load.
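
For reference, the three modes above can be written down as a small C
enum. This is only a sketch: the identifiers below are illustrative
names chosen for this mail, not copied from forcedeth.c, so please check
them against the driver source before relying on them.

```c
#include <stddef.h>

/* Illustrative names (assumption): values follow the numbering above. */
enum nv_mode_sketch {
    MODE_THROUGHPUT = 0, /* every tx and rx packet generates an interrupt */
    MODE_CPU        = 1, /* interrupts are controlled by a timer */
    MODE_DYNAMIC    = 2, /* toggles between the two based on network load */
};

/* Returns a short description of the mode, or NULL for an unknown value. */
static const char *mode_describe(int mode)
{
    switch (mode) {
    case MODE_THROUGHPUT:
        return "interrupt per tx/rx packet";
    case MODE_CPU:
        return "timer-controlled interrupts";
    case MODE_DYNAMIC:
        return "toggles between throughput and CPU mode";
    default:
        return NULL;
    }
}

/* Returns 1 if the value is one of the three modes listed above. */
static int mode_is_valid(int mode)
{
    return mode_describe(mode) != NULL;
}
```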

From the source code,

"np->recover_error = 1;" is related to CPU mode.

nv_start_xmit and nv_start_xmit_optimized seem to be related to
throughput mode.

In static void nv_do_nic_poll(struct timer_list *t),
line 2004: dev_kfree_skb_any(tx_skb->skb); runs only when
np->recover_error is set.

When "np->recover_error=1", do you think nv_start_xmit or 
nv_start_xmit_optimized will be called?
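
To make the question concrete, here is a minimal user-space sketch of
the locking pattern in the sequence quoted above. It is a model, not
driver code. It assumes that the transmit path runs with the same tx
lock held that the recovery path takes via netif_tx_lock_bh(); that
assumption should be verified against the core networking code. All
names (tx_model, tx_model_xmit, tx_model_recover) are hypothetical, and
a pthread mutex stands in for the kernel lock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for one tx ring slot; not the driver's struct. */
struct tx_model {
    pthread_mutex_t tx_lock; /* models the lock taken by netif_tx_lock_bh() */
    int *skb;                /* models tx_skb->skb */
};

static void tx_model_init(struct tx_model *m)
{
    pthread_mutex_init(&m->tx_lock, NULL);
    m->skb = NULL;
}

/*
 * Models the transmit path storing a buffer in the slot.
 * ASSUMPTION: the store happens with tx_lock held.
 * Returns 1 if the buffer was queued, 0 if the slot was busy
 * or the allocation failed.
 */
static int tx_model_xmit(struct tx_model *m, int payload)
{
    int queued = 0;

    pthread_mutex_lock(&m->tx_lock);
    if (m->skb == NULL) {
        m->skb = malloc(sizeof(*m->skb));
        if (m->skb != NULL) {
            *m->skb = payload;
            queued = 1;
        }
    }
    pthread_mutex_unlock(&m->tx_lock);
    return queued;
}

/*
 * Models the recovery path draining the slot under the same lock.
 * Returns 1 if a buffer was freed, 0 if the slot was already empty.
 */
static int tx_model_recover(struct tx_model *m)
{
    int freed = 0;

    pthread_mutex_lock(&m->tx_lock);
    if (m->skb != NULL) {
        free(m->skb);
        m->skb = NULL; /* no stale pointer is left behind */
        freed = 1;
    }
    pthread_mutex_unlock(&m->tx_lock);
    return freed;
}
```

In this model, tx_model_xmit() and tx_model_recover() can be called from
two threads without the free racing against the store, because both
touch the slot only while holding tx_lock. Whether the same holds for
the driver depends on the assumption stated above.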


> Maybe netif_stop_queue() should be used here to stop transmission
> from the network layer, but this function does not seem to wait either.
> Do you know of any function that waits until ".ndo_start_xmit"
> finishes executing?
>
>
> Best wishes,
> Jia-Ju Bai
>
>