Message-ID: <02068116-067a-3df9-b0cd-3ad60bcad9ff@gmx.de>
Date: Wed, 30 Nov 2016 20:48:32 +0100
From: Lino Sanfilippo <LinoSanfilippo@....de>
To: Florian Fainelli <f.fainelli@...il.com>, davem@...emloft.net,
charrer@...critech.com, liodot@...il.com,
gregkh@...uxfoundation.org, andrew@...n.ch
Cc: devel@...verdev.osuosl.org, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [PATCH v3 net-next 1/2] net: ethernet: slicoss: add slicoss
gigabit ethernet driver
On 29.11.2016 18:14, Florian Fainelli wrote:
> On 11/28/2016 01:41 PM, Lino Sanfilippo wrote:
>> The problem is that the HW does not provide a tx completion index. Instead we have to
>> iterate the status descriptors until we get an invalid idx, which indicates that there
>> are no further tx descriptors done for now. I am afraid that if we do not limit the
>> number of descriptors processed in the tx completion handler, a continuous transmission
>> of frames could keep the loop in xmit_complete() running endlessly. I don't know whether
>> this can actually happen, but I wanted to make sure that it is avoided.
>
> OK, it might be a good idea to put that comment somewhere around the tx
> completion handler to understand why it is bounded with a specific value.
>
Agreed, I will add such a comment.
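Roughly something like this (a sketch only; the bound constant and the helper
names are placeholders, not final code):

	unsigned int done = 0;
	unsigned int idx;

	/* The HW does not provide a tx completion index, so we iterate the
	 * status descriptors until we hit an invalid idx. Bound the number
	 * of descriptors processed per invocation, otherwise a continuous
	 * stream of completions could keep this loop running endlessly.
	 */
	while (done < SLIC_MAX_TX_COMPLETIONS) {
		idx = slic_next_compl_idx(sdev);
		if (idx == SLIC_INVALID_STAT_DESC_IDX)
			break;

		slic_free_tx_desc(sdev, idx);
		done++;
	}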
>>
>>> [snip]
>>>
>>>> + while (slic_get_free_rx_descs(rxq) > SLIC_MAX_REQ_RX_DESCS) {
>>>> + skb = alloc_skb(maplen + ALIGN_MASK, gfp);
>>>> + if (!skb)
>>>> + break;
>>>> +
>>>> + paddr = dma_map_single(&sdev->pdev->dev, skb->data, maplen,
>>>> + DMA_FROM_DEVICE);
>>>> + if (dma_mapping_error(&sdev->pdev->dev, paddr)) {
>>>> + netdev_err(dev, "mapping rx packet failed\n");
>>>> + /* drop skb */
>>>> + dev_kfree_skb_any(skb);
>>>> + break;
>>>> + }
>>>> + /* ensure head buffer descriptors are 256 byte aligned */
>>>> + offset = 0;
>>>> + misalign = paddr & ALIGN_MASK;
>>>> + if (misalign) {
>>>> + offset = SLIC_RX_BUFF_ALIGN - misalign;
>>>> + skb_reserve(skb, offset);
>>>> + }
>>>> + /* the HW expects dma chunks for descriptor + frame data */
>>>> + desc = (struct slic_rx_desc *)skb->data;
>>>> + memset(desc, 0, sizeof(*desc));
>>>
>>> Do you really need to zero out the prepended RX descriptor? And aren't you
>>> missing a write barrier here?
>>
>> Indeed, it should be sufficient to make sure that the bit SLIC_IRHDDR_SVALID is not set.
>> I will adjust it.
>> Concerning the write barrier: you mean a wmb() before slic_write() to ensure that the
>> clearing of the status desc is done before the descriptor is passed to the HW, right?
>
> Correct, that's what I meant here.
>
Ok, will fix this. Good catch BTW!
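I.e. clear the valid bit and put a wmb() before the doorbell write, roughly
like this (the register name is just a placeholder):

	desc = (struct slic_rx_desc *)skb->data;
	/* make sure the HW does not see a stale valid bit */
	desc->status &= ~cpu_to_le32(SLIC_IRHDDR_SVALID);

	/* ensure the descriptor update is visible before the HW is
	 * told about the new buffer
	 */
	wmb();
	slic_write(sdev, SLIC_REG_RBAR, lower_32_bits(paddr)); /* placeholder reg */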
>>
>>> [snip]
>>>
>>>> +
>>>> + dma_sync_single_for_cpu(&sdev->pdev->dev,
>>>> + dma_unmap_addr(buff, map_addr),
>>>> + buff->addr_offset + sizeof(*desc),
>>>> + DMA_FROM_DEVICE);
>>>> +
>>>> + status = le32_to_cpu(desc->status);
>>>> + if (!(status & SLIC_IRHDDR_SVALID))
>>>> + break;
>>>> +
>>>> + buff->skb = NULL;
>>>> +
>>>> + dma_unmap_single(&sdev->pdev->dev,
>>>> + dma_unmap_addr(buff, map_addr),
>>>> + dma_unmap_len(buff, map_len),
>>>> + DMA_FROM_DEVICE);
>>>
>>> This is potentially inefficient: you already did a cache invalidation
>>> for the RX descriptor here, so you could be more efficient by just
>>> invalidating the packet length minus the descriptor length.
>>>
>>
>> I am not sure I understand: We have to unmap the complete dma area, regardless of whether
>> we synced part of it before, don't we? AFAIK a dma sync is different from unmapping dma,
>> or am I missing something?
>
> Sorry, I was not very clear. What I meant is that you can allocate and
> do the initial dma_map_single() of your RX skbs during ndo_open(), and
> then, in your RX path, you only do dma_sync_single_for_cpu() twice
> (once for the RX descriptor status, a second time for the actual packet
> contents), and when you return the SKB to the HW, do a
> dma_sync_single_for_device(). The advantage of doing that is that if
> your cache operations are slow, you only do them on exactly the packet
> length, and not the actual RX buffer size (e.g.: 2KB).
Um. In the rx path the SKB will be consumed (by napi_gro_receive()).
AFAIK we _have_ to unmap it before this call. Doing only a dma_sync_single_for_cpu()
for the packet contents IMHO only makes sense if the corresponding SKB is
reused somehow. But this is not the case: the rx buffers are refilled with newly
allocated SKBs each time, and thus we need to create a new dma mapping for each of them.
Or do I still misunderstand when to call the dma sync functions?
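For reference, this is the scheme I understand you are suggesting (a rough
sketch with made-up names; it assumes the buffer stays mapped for the lifetime
of the ring and is reused, e.g. via copy-break, which is exactly the part that
does not match the current code):

	/* rx path, buffer mapped once at ndo_open() time: */
	dma_sync_single_for_cpu(dev, buff->map_addr,
				sizeof(struct slic_rx_desc), DMA_FROM_DEVICE);
	status = le32_to_cpu(desc->status);
	if (!(status & SLIC_IRHDDR_SVALID))
		break;

	len = status & SLIC_IRHDDR_LEN_MASK;	/* made-up length extraction */
	dma_sync_single_for_cpu(dev, buff->map_addr + sizeof(*desc),
				len, DMA_FROM_DEVICE);

	/* consume the data (e.g. copy into a fresh SKB), then hand the
	 * same buffer back to the HW:
	 */
	dma_sync_single_for_device(dev, buff->map_addr,
				   sizeof(*desc) + len, DMA_FROM_DEVICE);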
BTW: I just realized that if the descriptor has not been used by the HW yet, see:
+ dma_sync_single_for_cpu(&sdev->pdev->dev,
+ dma_unmap_addr(buff, map_addr),
+ buff->addr_offset + sizeof(*desc),
+ DMA_FROM_DEVICE);
+
+ status = le32_to_cpu(desc->status);
+ if (!(status & SLIC_IRHDDR_SVALID))
+ break;
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^ dma_sync_single_for_device missing
there has to be a dma_sync_single_for_device() to undo the sync for cpu (since the
HW will write to this descriptor when the next rx packet arrives), right? But this is
racy: what if the HW writes to that descriptor after we synced it for cpu, but before
we synced it for the HW again? Any ideas?
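I.e. the missing sync would have to go right before the break, like this
(sketch; this only marks the spot, it does not solve the race):

	status = le32_to_cpu(desc->status);
	if (!(status & SLIC_IRHDDR_SVALID)) {
		/* descriptor not (yet) written by the HW: hand it back,
		 * but note the race window between the two sync calls
		 */
		dma_sync_single_for_device(&sdev->pdev->dev,
					   dma_unmap_addr(buff, map_addr),
					   buff->addr_offset + sizeof(*desc),
					   DMA_FROM_DEVICE);
		break;
	}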
Regards,
Lino