Message-ID: <05ae066b-873d-159b-4ac2-ab39120c949b@mellanox.com>
Date:   Wed, 27 Jun 2018 15:11:00 +0300
From:   Tariq Toukan <tariqt@...lanox.com>
To:     Eric Dumazet <eric.dumazet@...il.com>,
        David Miller <davem@...emloft.net>
Cc:     netdev <netdev@...r.kernel.org>,
        Tariq Toukan <tariqt@...lanox.com>,
        Shawn Bohrer <sbohrer@...advisors.com>,
        Shay Agroskin <shayag@...lanox.com>,
        Eran Ben Elisha <eranbe@...lanox.com>
Subject: Re: [PATCH net-next] mlx4: do not use rwlock in fast path



On 09/02/2017 7:10 PM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@...gle.com>
> 
> Using a reader-writer lock in fast path is silly, when we can
> instead use RCU or a seqlock.
> 
> For mlx4 hwstamp clock, a seqlock is the way to go, removing
> two atomic operations and false sharing.
> 
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Cc: Tariq Toukan <tariqt@...lanox.com>
> ---
>   drivers/net/ethernet/mellanox/mlx4/en_clock.c |   35 ++++++++--------
>   drivers/net/ethernet/mellanox/mlx4/mlx4_en.h  |    2
>   2 files changed, 19 insertions(+), 18 deletions(-)
> 

Hi Eric,

When my colleague Shay modified mlx5 to adopt this same locking scheme,
he noticed a degradation in packet rate.
He then went back to test mlx4 and saw that this patch introduced a
degradation there as well.

Perf numbers (single ring):

mlx4:
with rw-lock:  ~8.54M pps
with seq-lock: ~8.51M pps

mlx5:
with rw-lock:  ~14.94M pps
with seq-lock: ~14.48M pps
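
For scale, the relative drop implied by these numbers can be worked out
with a small helper like the one below. The name drop_percent is
hypothetical and only serves this illustration.

```c
/* Relative drop, in percent, going from `before` to `after`.
 * Hypothetical helper, used only to put the pps figures in proportion. */
static inline double drop_percent(double before, double after)
{
    return (before - after) / before * 100.0;
}
```

With the figures above, drop_percent(8.54, 8.51) is roughly 0.35 (mlx4)
and drop_percent(14.94, 14.48) is roughly 3.1 (mlx5).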

This can be explained by the analysis below.
In short, readers far outnumber writers, so optimizing the reader flow
should give better numbers. The catch is that a read/write lock might
starve writers. Maybe RCU fits best here?

Degradation analysis:
The patch changes the type of lock that protects reads and updates of
the clock variable in struct mlx4_en_dev.
This variable is used to convert the hw timestamp into skb->hwtstamps.
It is read for every transmitted/received packet, and updated only via
the ptp module and by a periodic overflow work we have (at most 10
times per second).
This means there are many more readers than writers, and it is best to
optimize the reader flow.
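
To make the per-packet reader cost concrete, here is a rough userspace
sketch of a seqlock-style read and update, written with C11 atomics
rather than the kernel seqlock API. Everything in it is an assumption
for illustration: struct seq_clock, its fields, and the simplified
"1 cycle = 1 ns" conversion are not the mlx4 code, and a single writer
(or externally serialized writers) is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the clock state read on every packet. */
struct seq_clock {
    atomic_uint seq;              /* even: stable, odd: update in progress */
    _Atomic uint64_t cycle_last;  /* hw cycle count at the last update */
    _Atomic uint64_t nsec;        /* nanoseconds at the last update */
};

/* Writer side: rare (ptp adjustments, periodic overflow work). */
static inline void seq_clock_write(struct seq_clock *c,
                                   uint64_t cycle_last, uint64_t nsec)
{
    unsigned int s = atomic_load_explicit(&c->seq, memory_order_relaxed);

    assert((s & 1) == 0);         /* single-writer assumption */
    atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&c->cycle_last, cycle_last, memory_order_relaxed);
    atomic_store_explicit(&c->nsec, nsec, memory_order_relaxed);
    atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Reader side: runs per packet, never writes shared state,
 * and retries if it raced with an update. */
static inline uint64_t seq_clock_read(struct seq_clock *c, uint64_t cycles)
{
    unsigned int s0, s1;
    uint64_t last, nsec;

    do {
        s0 = atomic_load_explicit(&c->seq, memory_order_acquire);
        last = atomic_load_explicit(&c->cycle_last, memory_order_relaxed);
        nsec = atomic_load_explicit(&c->nsec, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&c->seq, memory_order_relaxed);
    } while ((s0 & 1) || s0 != s1);

    /* Simplified conversion, assuming one cycle per nanosecond. */
    return nsec + (cycles - last);
}
```

In this sketch the reader only loads shared data and loops again when
the sequence counter changed underneath it, while the writer never
waits for readers. How that compares to the rwlock path in the numbers
above is exactly what the measurements are about; the sketch does not
settle it.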

Best,
Tariq
