lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <42eca8e9-6e45-4728-bc1f-18330b7814c1@nalramli.com>
Date: Thu, 24 Aug 2023 20:26:21 -0400
From: "Nabil S. Alramli" <dev@...ramli.com>
To: Saeed Mahameed <saeed@...nel.org>
Cc: netdev@...r.kernel.org, kuba@...nel.org, davem@...emloft.net,
 saeedm@...dia.com, tariqt@...dia.com, linux-kernel@...r.kernel.org,
 leon@...nel.org, jdamato@...tly.com, nalramli@...tly.com
Subject: Re: [net-next RFC 0/1] mlx5: support per queue coalesce settings

Hi Saeed,

Thank you for your review and feedback. Please see below for a couple of 
additional questions.

On 8/24/23 14:51, Saeed Mahameed wrote:
> On 23 Aug 18:31, Nabil S. Alramli wrote:
>> Hello,
>>
>> I am Submitting this as an RFC to get feedback and to find out if the 
>> folks
>> at Mellanox would accept this change.
>>
>> Currently, only gobal coalescing configuration queries or changes are
>> supported in the `mlx5` driver. However, per-queue operations are not, 
>> and
>> result in `EOPNOTSUPP` errors when attempted with `ethtool`. This patch
>> adds support for per-queue coalesce operations, with a caveat described
>> below.
>>
>> Here's an example use case:
>>
>> - An mlx5 NIC is configured with 8 queues, each queue has its IRQ pinned
>>  to a unique CPU.
>> - Two custom RSS contexts are created: context 1 and context 2. Each
>>  context has a different set of queues where flows are distributed. For
>>  example, context 1 may distribute flows to queues 0-3, and context 2 may
>>  distribute flows to queues 4-7.
>> - A series of ntuple filters are installed which direct matching flows to
>>  RSS contexts. For example, perhaps port 80 is directed to context 1 and
>>  port 443 to context 2.
>> - Applications which receive network data associated with either context
>>  are pinned to the CPUs where the queues in the matching context have
>>  their IRQs pinned to maximize locality.
>>
>> The apps themselves, however, may have different requirements on 
>> latency vs
>> CPU usage and so setting the per queue IRQ coalesce values would be very
>> helpful.
>>
>> This patch would support this, with the caveat that DIM mode changes per
>> queue are not supported. DIM mode can only be changed NIC-wide. This is
>> because in the mlx5 driver, global operations that change the state of
>> adaptive-ex or adaptive-tx require a reset. So in the case of 
>> per-queue, we
>> reject such requests. This was done in the interest of simplicity for 
>> this
>> RFC as setting the DIM mode per queue appears to require significant
>> changes to mlx5 to be able to preserve the state of the indvidual 
>> channels
>> through a reset.
>>
>> IMO, if a user is going to set per-queue coalesce settings it might be
>> reasonable to assume that they will disable adaptive rx/tx NIC wide first
>> and then go through and apply their desired per-queue settings.
>>
> 
> DIM is already per channel, so I don't think it's that difficult to have
> mix support of DIM and static config per channel. Yes the code will need
> some refactoring, but with a quick glance at the code provided here, such
> refactoring is already required IMO.
> 
That sounds good. I'll go ahead and add support for DIM per-channel 
coalescing updates. I'll send another RFC when that is ready for initial 
feedback.

Can you please provide some clarification as to what you mean that some 
refactoring is needed? Are you referring to the existing code, or is 
there something specifically in this patch?

For example, I think we'll need to make some additional changes to 
mlx5e_safe_switch_params(), mlx5e_open_channels() and probably more. Is 
this what you have in mind?

>> Here's an example:
>>
>> $ ethtool --per-queue eth0 queue_mask 0x4 --show-coalesce
>> Queue: 2
>> Adaptive RX: on  TX: on
>> stats-block-usecs: 0
>> sample-interval: 0
>> pkt-rate-low: 0
>> pkt-rate-high: 0
>>
>> rx-usecs: 8
>> rx-frames: 128
>> ...
>>
>> Now, let's try to set adaptive-rx off rx-usecs 16 for queue 2:
>>
>> $ sudo ethtool --per-queue eth0 queue_mask 0x4 --coalesce adaptive-rx 
>> off \
>>  rx-usecs 16
>> Cannot set device per queue parameters: Operation not supported
>>
>> This is not supported; adaptive-rx must be disabled NIC wide first:
>>
>> $ sudo ethtool -C eth0 adaptive-rx off
>>
>> And now, queue_mask 0x4 can be applied to set rx-usecs:
>>
>> $ sudo ethtool --per-queue eth0 queue_mask 0x4 --coalesce rx-usecs 16
>> $ ethtool --per-queue eth0 queue_mask 0x4 --show-coalesce
>> Queue: 2
>> Adaptive RX: off  TX: on
>> stats-block-usecs: 0
>> sample-interval: 0
>> pkt-rate-low: 0
>> pkt-rate-high: 0
>>
>> rx-usecs: 16
>> rx-frames: 32
>> ...
>>
>> Previously a global `struct mlx5e_params` stored the options in
>> `struct mlx5e_priv.channels.params`. That was preserved, but a channel-
>> specific instance was added as well, in `struct mlx5e_channel.params`.
>>
>> Note that setting global coalescing options will set the individual
>> channel settings to the same values as well.
>>
>> Is Mellanox open to this change? What would be needed to get something 
>> like
>> this accepted?
>>
> 
> Sure, we just need to pass review and few testing cycles.
> 
> Thanks,
> Saeed.
> 

Best Regards,

-- Nabil S. Alramli

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ