[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <cdd68de3-7803-2921-45ee-6769b4a86265@gmail.com>
Date: Tue, 5 Apr 2022 12:16:01 +0900
From: Taehee Yoo <ap420073@...il.com>
To: davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
netdev@...r.kernel.org, bpf@...r.kernel.org,
ecree.xilinx@...il.com, ast@...nel.org, daniel@...earbox.net,
hawk@...nel.org, john.fastabend@...il.com
Subject: Re: [PATCH net v2] net: sfc: add missing xdp queue reinitialization
On 4/4/22 19:36, Martin Habets wrote:
Hi Martin,
> Hi Taehee,
>
> On Sat, Apr 02, 2022 at 03:48:14AM +0900, Taehee Yoo wrote:
>> On 4/1/22 20:06, Martin Habets wrote:
>>
>> Hi Martin,
>> Thank you so much for your review!
>>
>>> Hi Taehee,
>>>
>>> Thanks for looking into this. Unfortunately efx_realloc_channels()
>>> has turned out to be quite fragile over the years, so I'm
>>> keen to remove it in stead of patching it up all the time.
>>
>> I agree with you.
>> efx_realloc_channels() is too complex.
>>
>>>
>>> Could you try the patch below please?
>>> If it works ok for you as well we'll be able to remove
>>> efx_realloc_channels(). The added advantage of this approach
>>> is that the netdev notifiers get informed of the change.
>>
>> I tested your patch and I found a page reference count problem.
>> How to test:
>> 1. set up XDP_TX
>> 2. traffic on
>> 3. traffic off
>> 4. ring buffer size change
>> 5. loop from 2 to 4.
>>
>> [ 87.836195][ T72] BUG: Bad page state in process kworker/u16:1
>> pfn:125445
>> [ 87.843356][ T72] page:000000003725f642 refcount:-2 mapcount:0
>> mapping:0000000000000000 index:0x0 pfn:0x125445
>> [ 87.853783][ T72] flags: 0x200000000000000(node=0|zone=2)
>>
>> [ 87.859391][ T72] raw: 0200000000000000 dead000000000100
>> dead000000000122 0000000000000000
>> [ 87.867928][ T72] raw: 0000000000000000 0000000000000000
>> fffffffeffffffff 0000000000000000
>> [ 87.876569][ T72] page dumped because: nonzero _refcount
>>
>> [ 87.882125][ T72] Modules linked in: af_packet sfc ixgbe mtd
atlantic
>> coretemp mdio hwmon sch_fq_codel msr bpf_prelx
>> [ 87.895331][ T72] CPU: 0 PID: 72 Comm: kworker/u16:1 Not tainted
>> 5.17.0+ #62 dbf33652f22e5147659e7e2472bb962779c4833
>> [ 87.906350][ T72] Hardware name: ASUS System Product Name/PRIME
Z690-P
>> D4, BIOS 0603 11/01/2021
>> [ 87.915360][ T72] Workqueue: netns cleanup_net
>>
>> [ 87.920087][ T72] Call Trace:
>>
>> [ 87.923311][ T72] <TASK>
>>
>> [ 87.926188][ T72] dump_stack_lvl+0x56/0x7b
>>
>> [ 87.930597][ T72] bad_page.cold.125+0x63/0x93
>>
>> [ 87.935288][ T72] free_pcppages_bulk+0x63c/0x6f0
>>
>> [ 87.940232][ T72] free_unref_page+0x8b/0xf0
>>
>> [ 87.944749][ T72] efx_fini_rx_queue+0x15f/0x210 [sfc
>> 49c5d4f562a40c6a7ed913c25f5bd4e126bcfa4e]
>
> Looks to me like this is in efx_fini_rx_recycle_ring().
> It could be a side effect of the memory leak you report below.
> If this is in efx_fini_rx_recycle_ring() I'll post a patch for
> that soon on a separate thread.
Thanks for that!
>
>> [ 87.953756][ T72] efx_stop_channels+0xef/0x1b0 [sfc
>> 49c5d4f562a40c6a7ed913c25f5bd4e126bcfa4e]
>> [ 87.962699][ T72] efx_net_stop+0x4d/0x60 [sfc
>> 49c5d4f562a40c6a7ed913c25f5bd4e126bcfa4e]
>> [ 87.971029][ T72] __dev_close_many+0x8b/0xf0
>>
>> [ 87.975618][ T72] dev_close_many+0x7d/0x120
>>
>> [ ... ]
>>
>>
>> In addition, I would like to share issues that I'm currently looking
into:
>> 1. TX DMA error
>> when interface down/up or ring buffer size changes, TX DMA error
would occur
>> because tx_queue can be used before initialization.
>> But It will be fixed by the below patch.
>>
>> static void efx_ethtool_get_wol(struct net_device *net_dev,
>> diff --git a/drivers/net/ethernet/sfc/tx.c
b/drivers/net/ethernet/sfc/tx.c
>> index d16e031e95f4..6983799e1c05 100644
>> --- a/drivers/net/ethernet/sfc/tx.c
>> +++ b/drivers/net/ethernet/sfc/tx.c
>> @@ -443,6 +443,9 @@ int efx_xdp_tx_buffers(struct efx_nic *efx, int n,
>> struct xdp_frame **xdpfs,
>> if (unlikely(!tx_queue))
>> return -EINVAL;
>>
>> + if (!tx_queue->initialised)
>> + return -EINVAL;
>> +
>> if (efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED)
>> HARD_TX_LOCK(efx->net_dev, tx_queue->core_txq, cpu);
>>
>> diff --git a/drivers/net/ethernet/sfc/tx_common.c
>> b/drivers/net/ethernet/sfc/tx_common.c
>> index d530cde2b864..9bc8281b7f5b 100644
>> --- a/drivers/net/ethernet/sfc/tx_common.c
>> +++ b/drivers/net/ethernet/sfc/tx_common.c
>> @@ -101,6 +101,8 @@ void efx_fini_tx_queue(struct efx_tx_queue
*tx_queue)
>> netif_dbg(tx_queue->efx, drv, tx_queue->efx->net_dev,
>> "shutting down TX queue %d\n", tx_queue->queue);
>>
>> + tx_queue->initialised = false;
>> +
>> if (!tx_queue->buffer)
>> return;
>
> Looks ok, but xmit_hard should never be called on an interface that
> is down. Makes me wonder if we have a seqence issue in our ndo_stop API.
>
I think it's an XDP tx_queue specific issue.
Sorry that I didn't provide that information.
When a packet is received and it acts XDP_TX, it calls
efx_xdp_tx_buffers() and it uses xdp tx_queue and sends it to hardware
directly.
So, ->ndo_start_xmit is not called.
And that bug occurs when the first interface up.
->ndo_stop has never called, So I think this is not an issue of ->ndo_stop.
>>
>> After your patch, unfortunately, it can't fix ring buffer size
change case.
>> It can fix only interface down/up case.
>> I will look into this more.
>>
>> 2. Memory leak
>> There is a memory leak in ring buffer size change logic.
>> reproducer:
>> while :
>> do
>> ethtool -G <interface name> rx 2048 tx 2048
>> ethtool -G <interface name> rx 1024 tx 1024
>> done
>
> Is this with my patch or only with yours?
> Thanks a lot for testing this.
>
Memory leak still occurs with your patch(do not use efx_realloc_channels())
Thanks a lot,
Taehee Yoo
> Martin
>
>> Thanks a lot,
>> Taehee Yoo
>>
>>>
>>> Regards,
>>> Martin Habets <habetsm.xilinx@...il.com>
>>>
>>> ---
>>> drivers/net/ethernet/sfc/ethtool.c | 13 ++++++++++++-
>>> 1 file changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/sfc/ethtool.c
>> b/drivers/net/ethernet/sfc/ethtool.c
>>> index 48506373721a..8cfbe61737bb 100644
>>> --- a/drivers/net/ethernet/sfc/ethtool.c
>>> +++ b/drivers/net/ethernet/sfc/ethtool.c
>>> @@ -179,6 +179,7 @@ efx_ethtool_set_ringparam(struct net_device
*net_dev,
>>> {
>>> struct efx_nic *efx = netdev_priv(net_dev);
>>> u32 txq_entries;
>>> + int rc = 0;
>>>
>>> if (ring->rx_mini_pending || ring->rx_jumbo_pending ||
>>> ring->rx_pending > EFX_MAX_DMAQ_SIZE ||
>>> @@ -198,7 +199,17 @@ efx_ethtool_set_ringparam(struct net_device
*net_dev,
>>> "increasing TX queue size to minimum of %u\n",
>>> txq_entries);
>>>
>>> - return efx_realloc_channels(efx, ring->rx_pending, txq_entries);
>>> + /* Apply the new settings */
>>> + efx->rxq_entries = ring->rx_pending;
>>> + efx->txq_entries = ring->tx_pending;
>>> +
>>> + /* Update the datapath with the new settings if the interface is
up */
>>> + if (!efx_check_disabled(efx) && netif_running(efx->net_dev)) {
>>> + dev_close(net_dev);
>>> + rc = dev_open(net_dev, NULL);
>>> + }
>>> +
>>> + return rc;
>>> }
>>>
>>> static void efx_ethtool_get_wol(struct net_device *net_dev,
>
Powered by blists - more mailing lists