Date:   Fri, 28 Oct 2022 16:48:53 -0700
From:   Rohit Nair <rohit.sajan.kumar@...cle.com>
To:     Leon Romanovsky <leon@...nel.org>
Cc:     jgg@...pe.ca, saeedm@...dia.com, davem@...emloft.net,
        edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
        linux-rdma@...r.kernel.org, linux-kernel@...r.kernel.org,
        netdev@...r.kernel.org, manjunath.b.patil@...cle.com,
        rama.nichanamatlu@...cle.com,
        Michael Guralnik <michaelgur@...dia.com>
Subject: Re: [External] : Re: [PATCH 1/1] IB/mlx5: Add a signature check to
 received EQEs and CQEs

On 10/27/22 5:23 AM, Leon Romanovsky wrote:
> On Tue, Oct 25, 2022 at 10:44:12AM -0700, Rohit Nair wrote:
>> Hey Leon,
>>
>> Please find my replies to your comments here below:
> 
> <...>
> 
>>>
>>>> This patch does not introduce any significant performance degradations
>>>> and has been tested using qperf.
>>> What does it mean? You made changes in kernel verbs flow, they are not
>>> executed through qperf.
>> We also conducted several extensive performance tests using our test-suite
>> which utilizes rds-stress and also saw no significant performance
>> degradations in those results.
> 
> What does it mean "also"? Your change is applicable ONLY for kernel path.
> 
> Anyway, I'm not keen adding rare debug code to performance critical path.
> 
> Thanks

rds-stress exercises the code path we are modifying here. It did not
show a significant performance degradation when we ran it internally. We
also asked our DB team to run performance regression tests, and this
change passed their test suite. That motivated us to submit it upstream.

If there is another test that is better suited to this change, I am
willing to run it. Please let me know if you have something in mind. We
could revisit this patch after such a test.

I agree that this was a rare debug scenario, but it took a lot more than
it should have to narrow down (we engaged the vendor in live sessions).
We are adding this in the hope of finding the cause sooner, or at least
getting a pointer to which direction to look in. We also asked the
vendor (mlx) to include some diagnostics (a HW counter) that can help us
narrow it down faster next time. This is our attempt to add the kernel
side of the diagnostics.

Feel free to share your suggestions.

Thanks
