lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ea9f2737-6639-b9ce-9472-bb3c04581734@huawei.com>
Date:   Thu, 24 Jun 2021 22:15:50 +0800
From:   "huangguangbin (A)" <huangguangbin2@...wei.com>
To:     Will Deacon <will@...nel.org>
CC:     <davem@...emloft.net>, <kuba@...nel.org>,
        <catalin.marinas@....com>, <maz@...nel.org>,
        <mark.rutland@....com>, <dbrazdil@...gle.com>,
        <qperret@...gle.com>, <netdev@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>, <lipeng321@...wei.com>,
        <peterz@...radead.org>
Subject: Re: [PATCH net-next 2/3] net: hns3: add support for TX push mode



On 2021/6/22 20:16, Will Deacon wrote:
> On Tue, Jun 22, 2021 at 07:11:10PM +0800, Guangbin Huang wrote:
>> From: Huazhong Tan <tanhuazhong@...wei.com>
>>
>> For the device that supports the TX push capability, the BD can
>> be directly copied to the device memory. However, due to hardware
>> restrictions, the push mode can be used only when there are no
>> more than two BDs, otherwise, the doorbell mode based on device
>> memory is used.
>>
>> Signed-off-by: Huazhong Tan <tanhuazhong@...wei.com>
>> Signed-off-by: Yufeng Mo <moyufeng@...wei.com>
>> ---
>>   drivers/net/ethernet/hisilicon/hns3/hnae3.h        |  1 +
>>   drivers/net/ethernet/hisilicon/hns3/hns3_enet.c    | 83 ++++++++++++++++++++--
>>   drivers/net/ethernet/hisilicon/hns3/hns3_enet.h    |  6 ++
>>   drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c |  2 +
>>   .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c |  2 +
>>   .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    | 11 ++-
>>   .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h    |  8 +++
>>   .../ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c   |  2 +
>>   .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c  | 11 ++-
>>   .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h  |  8 +++
>>   10 files changed, 126 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
>> index 0b202f4def83..3979d5d2e842 100644
>> --- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
>> @@ -163,6 +163,7 @@ struct hnae3_handle;
>>   
>>   struct hnae3_queue {
>>   	void __iomem *io_base;
>> +	void __iomem *mem_base;
>>   	struct hnae3_ae_algo *ae_algo;
>>   	struct hnae3_handle *handle;
>>   	int tqp_index;		/* index in a handle */
>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>> index cdb5f14fb6bc..8649bd8e1b57 100644
>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>> @@ -2002,9 +2002,77 @@ static int hns3_fill_skb_to_desc(struct hns3_enet_ring *ring,
>>   	return bd_num;
>>   }
>>   
>> +static void hns3_tx_push_bd(struct hns3_enet_ring *ring, int num)
>> +{
>> +#define HNS3_BYTES_PER_64BIT		8
>> +
>> +	struct hns3_desc desc[HNS3_MAX_PUSH_BD_NUM] = {};
>> +	int offset = 0;
>> +
>> +	/* make sure everything is visible to device before
>> +	 * excuting tx push or updating doorbell
>> +	 */
>> +	dma_wmb();
>> +
>> +	do {
>> +		int idx = (ring->next_to_use - num + ring->desc_num) %
>> +			  ring->desc_num;
>> +
>> +		u64_stats_update_begin(&ring->syncp);
>> +		ring->stats.tx_push++;
>> +		u64_stats_update_end(&ring->syncp);
>> +		memcpy(&desc[offset], &ring->desc[idx],
>> +		       sizeof(struct hns3_desc));
>> +		offset++;
>> +	} while (--num);
>> +
>> +	__iowrite64_copy(ring->tqp->mem_base, desc,
>> +			 (sizeof(struct hns3_desc) * HNS3_MAX_PUSH_BD_NUM) /
>> +			 HNS3_BYTES_PER_64BIT);
>> +
>> +#if defined(CONFIG_ARM64)
>> +	dgh();
>> +#endif
> 
> It looks a bit weird putting this at the end of the function, given that
> it's supposed to do something to a pair of accesses. Please can you explain
> what it's doing, and also provide some numbers to show that it's worthwhile
> (given that it's a performance hint not a correctness thing afaict).
> 
When the driver writes the device space mapped to the WriteCombine,
CPU combines into the cacheline unit by using the merge window mechanism
and delivers the cacheline to the device. However, even if the cacheline
is full, the device space is delivered only after the merge window
ends. (There is about 10ns delay at 3G frequency). To reduce the delay,
the WriteCombine needs to be flushed explicitly. This is why the DGH
needs to be invoked here.

>> +}
>> +
>> +static void hns3_tx_mem_doorbell(struct hns3_enet_ring *ring)
>> +{
>> +#define HNS3_MEM_DOORBELL_OFFSET	64
>> +
>> +	__le64 bd_num = cpu_to_le64((u64)ring->pending_buf);
>> +
>> +	/* make sure everything is visible to device before
>> +	 * excuting tx push or updating doorbell
>> +	 */
>> +	dma_wmb();
>> +
>> +	__iowrite64_copy(ring->tqp->mem_base + HNS3_MEM_DOORBELL_OFFSET,
>> +			 &bd_num, 1);
>> +	u64_stats_update_begin(&ring->syncp);
>> +	ring->stats.tx_mem_doorbell += ring->pending_buf;
>> +	u64_stats_update_end(&ring->syncp);
>> +
>> +#if defined(CONFIG_ARM64)
>> +	dgh();
>> +#endif
> 
> Same here.
> 
> Thanks,
> 
> Will
> .
> 
Thanks,

Guangbin
.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ