Date:   Wed, 23 May 2018 06:52:33 -0700
From:   John Fastabend <john.fastabend@...il.com>
To:     Jiri Pirko <jiri@...nulli.us>,
        Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc:     Saeed Mahameed <saeedm@...lanox.com>,
        "David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
        Huy Nguyen <huyn@...lanox.com>,
        Or Gerlitz <gerlitz.or@...il.com>
Subject: Re: [net-next 1/6] net/dcb: Add dcbnl buffer attribute

On 05/23/2018 02:43 AM, Jiri Pirko wrote:
> Tue, May 22, 2018 at 07:20:26AM CEST, jakub.kicinski@...ronome.com wrote:
>> On Mon, 21 May 2018 14:04:57 -0700, Saeed Mahameed wrote:
>>> From: Huy Nguyen <huyn@...lanox.com>
>>>
>>> This patch adds a dcbnl buffer attribute that lets the user
>>> change the NIC's buffer configuration, such as the priority
>>> to buffer mapping and the size of individual buffers.
>>>
>>> This attribute, combined with the pfc attribute, allows an advanced user to
>>> fine-tune the qos settings for a specific priority queue. For example,
>>> the user can dedicate a buffer to one or more priorities, or
>>> give a larger buffer to certain priorities.
>>>
>>> We present a use case in which the dcbnl buffer attribute, configured
>>> by an advanced user, helps reduce the latency of messages of different sizes.
>>>
>>> Scenario description:
>>> On ConnectX-5, we run latency-sensitive traffic with
>>> small/medium message sizes ranging from 64B to 256KB and bandwidth-sensitive
>>> traffic with large message sizes of 512KB and 1MB. We group small, medium,
>>> and large message sizes into their own pfc-enabled priorities as follows.
>>>   Priorities 1 & 2 (64B, 256B and 1KB)
>>>   Priorities 3 & 4 (4KB, 8KB, 16KB, 64KB, 128KB and 256KB)
>>>   Priorities 5 & 6 (512KB and 1MB)
>>>
>>> By default, ConnectX-5 maps all pfc-enabled priorities to a single
>>> lossless buffer with a fixed size of 50% of the total available buffer
>>> space. The other 50% is assigned to the lossy buffer. Using the dcbnl buffer
>>> attribute, we create three equal-size lossless buffers, each with 25% of the
>>> total available buffer space. Thus, the lossy buffer shrinks to 25%. The
>>> priority to lossless buffer mappings are set as follows.
>>>   Priorities 1 & 2 on lossless buffer #1
>>>   Priorities 3 & 4 on lossless buffer #2
>>>   Priorities 5 & 6 on lossless buffer #3
>>>
>>> We observe latency improvements for small and medium message sizes
>>> as follows. Note that bandwidth for the large message sizes is
>>> reduced, but the total bandwidth remains the same.
>>>   256B message size (42% latency reduction)
>>>   4K message size (21% latency reduction)
>>>   64K message size (16% latency reduction)
>>>
>>> Signed-off-by: Huy Nguyen <huyn@...lanox.com>
>>> Signed-off-by: Saeed Mahameed <saeedm@...lanox.com>
>>
>> On a cursory look this bears a lot of resemblance to the devlink shared
>> buffer configuration ABI.  Did you look into using that?
>>
>> Just to be clear devlink shared buffer ABIs don't require representors
>> and "switchdev mode".
> 
> If the CX5 buffer they are trying to utilize here is per port and not a
> shared one, it would seem ok for me to not have it in "devlink sb".
> 

+1 I think it's probably reasonable to let devlink manage the global
(device layer) buffers and then have dcbnl partition the buffer up
further per netdev. Notice there is already a partitioning of the
buffers happening when DCB is enabled and/or parameters are changed.
So giving explicit control over this seems OK to me.
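
To make the per-netdev partitioning concrete, here is a rough C sketch of the
layout from the quoted commit message: three lossless buffers at 25% each,
with the remainder left lossy. The struct, the names, the use of buffer
index 0 as the lossy buffer, and the mapping of the unmentioned priorities
0 and 7 to it are all assumptions for illustration, not the structure or
code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EX_MAX_PRIO    8  /* 802.1Q priorities 0..7 */
#define EX_MAX_BUFFERS 8  /* assumed upper bound on per-port buffers */

/* Hypothetical per-netdev buffer layout (illustration only). */
struct ex_port_buffer_cfg {
	uint8_t  prio2buffer[EX_MAX_PRIO];     /* priority -> buffer index */
	uint32_t buffer_size[EX_MAX_BUFFERS];  /* size of each buffer, bytes */
	uint32_t total_size;                   /* bytes available to the port */
};

/*
 * Fill cfg with the scenario layout: buffers 1..3 are lossless and get
 * a quarter of the total each; buffer 0 (assumed lossy) gets the rest.
 */
static void ex_scenario_layout(struct ex_port_buffer_cfg *cfg, uint32_t total)
{
	uint32_t quarter = total / 4;
	int b;

	assert(cfg);
	memset(cfg, 0, sizeof(*cfg));
	cfg->total_size = total;

	for (b = 1; b <= 3; b++)
		cfg->buffer_size[b] = quarter;
	/* Remainder goes to the lossy buffer, so the sizes sum to total. */
	cfg->buffer_size[0] = total - 3 * quarter;

	/* Priorities 0 and 7 stay on buffer 0 via the memset above. */
	cfg->prio2buffer[1] = cfg->prio2buffer[2] = 1;  /* 64B..1KB */
	cfg->prio2buffer[3] = cfg->prio2buffer[4] = 2;  /* 4KB..256KB */
	cfg->prio2buffer[5] = cfg->prio2buffer[6] = 3;  /* 512KB, 1MB */
}
```

Calling ex_scenario_layout(&cfg, total) with the port's share of the device
buffer as total gives the 25/25/25/25 split; where that total comes from
(for example, a devlink-managed pool) is outside this sketch.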

It would be nice though if the API gave us some hint on max/min/stride
of allowed values. Could the get API return these along with the current
value? Presumably the allowed max size could also change when devlink
changes how the global buffer is divided up.
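
As a sketch of what such a hint could look like from the caller's side,
something along these lines would let userspace validate or round a
requested size before sending it down. The struct and helper names here are
hypothetical, and treating a stride of 0 as "no granularity constraint" is
an assumption of this sketch, not anything the patch defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits a get operation could report with the current value. */
struct ex_buffer_limits {
	uint32_t min;     /* smallest allowed size, bytes */
	uint32_t max;     /* largest allowed size, bytes */
	uint32_t stride;  /* allowed sizes are min + k * stride; 0 = any */
};

/* Return true if size is within [min, max] and on the stride grid. */
static bool ex_buffer_size_valid(const struct ex_buffer_limits *lim,
				 uint32_t size)
{
	assert(lim && lim->min <= lim->max);

	if (size < lim->min || size > lim->max)
		return false;
	if (lim->stride == 0)
		return true;
	return (size - lim->min) % lim->stride == 0;
}

/* Clamp size into [min, max], then round down to the stride grid. */
static uint32_t ex_buffer_size_round_down(const struct ex_buffer_limits *lim,
					  uint32_t size)
{
	assert(lim && lim->min <= lim->max);

	if (size < lim->min)
		return lim->min;
	if (size > lim->max)
		size = lim->max;
	if (lim->stride == 0)
		return size;
	return size - (size - lim->min) % lim->stride;
}
```

If the max depends on how devlink has split the global buffer, the caller
would need to re-read the limits after any devlink change rather than
cache them.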

The argument against allowing this API is it doesn't have anything to
do with the 802.1Q standard, but that is fine IMO.

.John

