Message-ID: <CAM0EoMnM-s4M4HFpK1MVr+ey6PkU=uzwYsUipc1zBA5RPhzt-A@mail.gmail.com>
Date:   Mon, 24 Apr 2023 13:59:15 -0400
From:   Jamal Hadi Salim <jhs@...atatu.com>
To:     Stephen Hemminger <stephen@...workplumber.org>
Cc:     Leon Romanovsky <leon@...nel.org>,
        Victor Nogueira <victor@...atatu.com>, davem@...emloft.net,
        edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
        netdev@...r.kernel.org, xiyou.wangcong@...il.com, jiri@...nulli.us,
        kernel@...atatu.com
Subject: Re: [PATCH net v2] net/sched: act_mirred: Add carrier check

On Mon, Apr 24, 2023 at 1:44 PM Stephen Hemminger
<stephen@...workplumber.org> wrote:
>
> On Mon, 24 Apr 2023 20:36:02 +0300
> Leon Romanovsky <leon@...nel.org> wrote:
>
> > > There are cases where the device is administratively UP, but operationally
> > > down. For example, we have a physical device (Nvidia ConnectX-6 Dx, 25Gbps)
> > > whose cable was pulled out, here is its ip link output:
> > >
> > > 5: ens2f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
> > >     link/ether b8:ce:f6:4b:68:35 brd ff:ff:ff:ff:ff:ff
> > >     altname enp179s0f1np1
> > >
> > > As you can see, it's administratively UP but operationally down.
> > > In this case, sending a packet to this port caused a nasty kernel hang (so
> > > nasty that we were unable to capture it). Aborting a transmit based on
> > > operational status (in addition to administrative status) fixes the issue.
> > >
>
> Then fix the driver. It shouldn't hang.
> Other drivers just drop packets if link is down.


We didn't do extensive testing of drivers, but consider this a safeguard
against buggy drivers (it's a huge process upgrading drivers in some
environments). It may even make sense to move this to dev_queue_xmit(),
i.e. the argument is: why is the core sending a packet to hardware
that has link down to begin with? BTW, I believe the bridge behaves
this way ...
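
To illustrate the administrative vs. operational distinction discussed
above, here is a rough userspace sketch in Python. It is not the patch
and not kernel code: parse_link_flags() and should_transmit() are
made-up helper names, and the only input is the ip link summary line
quoted earlier in the thread. The idea is simply "transmit only if the
device is UP and has carrier".

```python
import re


def parse_link_flags(line):
    """Return the flags listed between '<' and '>' in an `ip link` summary line."""
    match = re.search(r"<([^>]*)>", line)
    return set(match.group(1).split(",")) if match else set()


def should_transmit(flags):
    # Hypothetical policy: require administrative UP *and* carrier present.
    # `ip link` reports a missing carrier with the NO-CARRIER flag.
    return "UP" in flags and "NO-CARRIER" not in flags


# The summary line for ens2f1 quoted in the thread (cable pulled out).
line = ("5: ens2f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq "
        "state DOWN mode DEFAULT group default qlen 1000")

flags = parse_link_flags(line)
print(sorted(flags), should_transmit(flags))
```

For the ens2f1 line, the UP flag is present but so is NO-CARRIER, so
this sketch would refuse to transmit, which is the behaviour the
carrier check is after.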

cheers,
jamal
