Date: Tue, 27 Feb 2024 17:39:14 +0000
From: Cosmin Ratiu <cratiu@...dia.com>
To: "taoliu828@....com" <taoliu828@....com>
CC: Roi Dayan <roid@...dia.com>, Paul Blakey <paulb@...dia.com>, Saeed
 Mahameed <saeedm@...dia.com>, Vlad Buslov <vladbu@...dia.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>, Dima Chumak
	<dchumak@...dia.com>
Subject: Re: Report mlx5_core crash

On Thu, 2024-02-08 at 11:12 +0800, Tao Liu wrote:
> Hi Cosmin,
> 
> Thanks for your reply.
> 
> It's hard to reproduce the crash directly.  In our case the rule forwards IP
> broadcast traffic to 5 vxlan remotes, and the driver creates 6 mlx5_flow_rule
> entries, which include 5 mlx5_pkt_reformat and 1 counter.
> It triggers only when two *dr_action pointers in struct mlx5_pkt_reformat have
> the same lower 32 bits, which is determined by memory allocation.
> 
> Would it be possible to do some fault injection in a unit test to reproduce it?
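
The trigger condition quoted above (two pointers that differ but share
their lower 32 bits) can be illustrated with a minimal sketch. This is
not driver code: the helper names and the addresses are hypothetical,
and it only assumes that somewhere a 64-bit pointer ends up being
compared through a 32-bit value.

```python
MASK32 = 0xFFFFFFFF  # keeps only the lower 32 bits of a 64-bit value


def low32(addr: int) -> int:
    """Return the lower 32 bits of a 64-bit address."""
    return addr & MASK32


def collides(a: int, b: int) -> bool:
    """True when two distinct addresses share the same lower 32 bits."""
    return a != b and low32(a) == low32(b)


# Hypothetical allocation addresses, chosen only for illustration.
first = 0xFFFF888012345678
second = 0xFFFF888112345678  # differs from `first` only in the upper 32 bits
third = 0xFFFF888012345700   # differs from `first` in the lower 32 bits

print(collides(first, second))
print(collides(first, third))
```

Whether two live allocations ever line up this way depends on where
the allocator happens to place them, which would explain why the crash
is hard to reproduce on demand.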

In the end, no complicated fault injection was needed. I just had to
pay proper attention to your awesome initial analysis, and I've managed
to understand the problems.

I've also prepared fixes for both of them. The patches are under review
in our internal tree and should hopefully soon be on their way
upstream.

From the stack traces you reported, though, I noticed you are running
with OFED. I will talk to my colleagues and let you know as soon as a
new build that includes the fixes is available for testing.

Cosmin.
