Message-ID: <d8e5e57392fec5aff2b65beceda161ea@walle.cc>
Date:   Sat, 27 Feb 2021 14:19:37 +0100
From:   Michael Walle <michael@...le.cc>
To:     Vladimir Oltean <olteanv@...il.com>
Cc:     "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
        Claudiu Manoil <claudiu.manoil@....com>,
        Alexandru Marginean <alexandru.marginean@....com>,
        Vladimir Oltean <vladimir.oltean@....com>,
        Jesse Brandeburg <jesse.brandeburg@...el.com>
Subject: Re: [PATCH v2 net 2/6] net: enetc: initialize RFS/RSS memories for
 unused ports too

On 2021-02-25 13:18, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@....com>
> 
> Michael reports that since linux-next-20210211, the AER messages for ECC
> errors have started reappearing, and this time they can be reliably
> reproduced with the first ping on one of his LS1028A boards.
> 
> $ ping 1[   33.258069] pcieport 0000:00:1f.0: AER: Multiple Corrected
> error received: 0000:00:00.0
> 72.16.0.1
> PING [   33.267050] pcieport 0000:00:1f.0: AER: can't find device of ID0000
> 172.16.0.1 (172.16.0.1): 56 data bytes
> 64 bytes from 172.16.0.1: seq=0 ttl=64 time=17.124 ms
> 64 bytes from 172.16.0.1: seq=1 ttl=64 time=0.273 ms
> 
> $ devmem 0x1f8010e10 32
> 0xC0000006
> 
> It isn't clear why this is necessary, but it seems that for the errors
> to go away, we must clear the entire RFS and RSS memory, not just for
> the ports in use.
> 
> Sadly the code is structured in such a way that we can't have unified
> logic for the used and unused ports. For the minimal initialization of
> an unused port, we need just to enable and ioremap the PF memory space,
> and a control buffer descriptor ring. Unused ports must then free the
> CBDR because the driver will exit, but used ports can not pick up from
> where that code path left, since the CBDR API does not reinitialize a
> ring when setting it up, so its producer and consumer indices are out of
> sync between the software and hardware state. So a separate
> enetc_init_unused_port function was created, and it gets called right
> after the PF memory space is enabled.
> 
> Note that we need access from enetc_pf.c to the CBDR creation and
> deletion methods, which were for some reason put in enetc.c. While
> changing their definitions to be non-static, also move them to
> enetc_cbdr.c which seems like a better place to hold these.
> 
> Fixes: 07bf34a50e32 ("net: enetc: initialize the RFS and RSS memories")
> Reported-by: Michael Walle <michael@...le.cc>
> Cc: Jesse Brandeburg <jesse.brandeburg@...el.com>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>

I have had this patch in my tree for a while now. As we've learned,
whether the error occurs really depends on a particular power-up state.
So take this with a grain of salt: I haven't seen the error since,
despite multiple power cycles. Thus:

Tested-by: Michael Walle <michael@...le.cc>
