Message-ID: <ZcRGl758ek_at4Ha@liutao02-mac.local>
Date: Thu, 8 Feb 2024 11:12:23 +0800
From: Tao Liu <taoliu828@....com>
To: Cosmin Ratiu <cratiu@...dia.com>
Cc: roid@...dia.com, paulb@...dia.com, vladbu@...dia.com,
	dchumak@...dia.com, saeedm@...dia.com, taoliu828@....com,
	netdev@...r.kernel.org
Subject: Re: Report mlx5_core crash

On 02/07  , Cosmin Ratiu wrote:
> On Tue, 2024-02-06 at 15:01 +0800, Tao Liu wrote:
> > On 01/31  , Tao Liu wrote:
> > > Hi Mellanox team,
> > > 
> > >    We hit a crash in mlx5_core which is similar with commit
> > >    de31854ece17 ("net/mlx5e: Fix nullptr on deleting mirroring rule").
> > >    But they are different cases, our case is:
> > >    in_port(...),eth(...) \
> > > actions:set(tunnel(...)),vxlan_sys_4789,set(tunnel(...)),vxlan_sys_4789,...
> > > 
> > >      BUG: kernel NULL pointer dereference, address: 0000000000000270
> > >      RIP: 0010:del_sw_hw_rule+0x29/0x190 [mlx5_core]
> 
> Hello,
> 
> I'll help you find and fix the problem.
> Your core dump analysis was very useful, but not sufficient to find the
> cause of the crash. Would you mind sharing a set of reproduction steps
> so we can debug this further?
> 
> Thank you,
> Cosmin.

Hi Cosmin,

Thanks for your reply.

It's hard to reproduce the crash directly.  In our case the rule forwards IP
broadcast traffic to 5 vxlan remotes, and the driver creates 6 mlx5_flow_rule
objects, which include 5 mlx5_pkt_reformat and 1 counter.
The crash triggers only when two *dr_action pointers in struct
mlx5_pkt_reformat have the same lower 32 bits, which is determined by memory
allocation.
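
To illustrate the trigger condition described above, here is a minimal
user-space sketch, not driver code. It assumes that somewhere a 64-bit pointer
ends up being compared through its lower 32 bits only. The helper names
(low32_collides, find_by_low32, example) and the addresses are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when two distinct 64-bit addresses
 * become equal once truncated to their lower 32 bits. */
static bool low32_collides(uint64_t a, uint64_t b)
{
	return a != b && (uint32_t)a == (uint32_t)b;
}

/* Hypothetical lookup that compares only the lower 32 bits.
 * Returns the index of the first match, or -1 if none. */
static int find_by_low32(const uint64_t *addrs, int n, uint64_t target)
{
	for (int i = 0; i < n; i++)
		if ((uint32_t)addrs[i] == (uint32_t)target)
			return i;
	return -1;
}

/* Two made-up addresses that differ only in the upper 32 bits. */
static void example(void)
{
	uint64_t a = 0xffff888100001000ULL;
	uint64_t b = 0xffff888200001000ULL;
	uint64_t addrs[] = { b, a };

	assert(low32_collides(a, b));
	/* Searching for 'a' stops at 'b' (index 0), i.e. the wrong entry. */
	assert(find_by_low32(addrs, 2, a) == 0);
}
```

Since the allocator decides the real addresses, a match like this is rare, but
a test could feed in such address pairs on purpose.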

Would it be possible to use fault injection in a unit test to reproduce it?

Best regards,
Tao

