Date:   Sun, 18 Nov 2018 23:27:27 +0100
From:   Heiko Stübner <heiko@...ech.de>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     Otavio Salvador <otavio.salvador@...ystems.com.br>,
        netdev@...r.kernel.org, david.choi@...rel.com,
        Andy Yan <andy.yan@...k-chips.com>
Subject: Re: Linux kernel hangs if using RV1108 with MSZ8863 switch with two ports connected

Hi all,

On Sunday, 18 November 2018, 19:12:30 CET, Andrew Lunn wrote:
> > The kernel starts booting normally and then hangs like this when two
> > Ethernet cables are connected to the KSZ8863 switch:
> > http://dark-code.bulix.org/3xexu5-507563
>
> > This has the lock detection, inside kernel hacking, enabled.
> 
> Maybe turn on all the hung-task debug and magic key support.  With
> magic key, you might be able to get a backtrace of where it is
> spinning.
> 
> Maybe also add #define DEBUG at the top of drivers/net/phy/phy.c.
> Does it hang during a PHY state transition?
> 
> Maybe both PHYs are interrupting at the same time, and the interrupt
> code is broken?
> 
> Maybe look at the switch driver and see if there is any code which is
> executed on link up. Put some printk() in there.
> 
> If your PHYs are using interrupt mode, maybe disable that and use
> polling.
> 
> Do you know if this ever worked properly before? If you know when it
> did work, you can do a git bisect to narrow it down to the one patch
> which broke it.
> 
> Basically, at the moment, you just need to try lots of things, to
> narrow it down.
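
For the hung-task and magic-key suggestions, a rough sketch of the
kernel config options I would start with. These are the option names as
I remember them, so please check them against your tree's Kconfig
before relying on them:

```
# Kernel hacking: report tasks stuck in uninterruptible sleep
CONFIG_DETECT_HUNG_TASK=y
# Kernel hacking: report CPUs that spin without scheduling
CONFIG_SOFTLOCKUP_DETECTOR=y
# Magic SysRq key, also reachable via BREAK on the serial console
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_SERIAL=y
```

With that in place, sending a BREAK followed by 'l' on the serial
console should ask for a backtrace of the active CPUs, provided the
console interrupt is still being serviced.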

Your hang also seems to happen around the time the kernel disables
unused clocks and regulators.

So you might also try booting with that cleanup disabled, as the hang
may be caused by a clock or regulator that is handled wrongly there.
For clocks I think the kernel command line option is called
clk_ignore_unused, but please double-check; you'll need to look up the
regulator equivalent yourself.
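
As a sketch, the kernel command line would then end in something like
the line below. The placeholder stands for whatever bootargs your board
already uses, and regulator_ignore_unused is only my recollection of the
regulator-side option, so verify both names in
Documentation/admin-guide/kernel-parameters.txt:

```
<your existing bootargs> clk_ignore_unused regulator_ignore_unused
```

If the hang goes away with one of them set, the next step would be to
find which clock or regulator is missing a consumer in the devicetree.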

And as I think it might be some sort of driver-related issue, you could
also enable debugging in the driver core [drivers/base/dd.c], either by
adding #define DEBUG at the top of the file or by just redefining
dev_dbg temporarily ;-)
	#define dev_dbg dev_warn
or so.
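
To show what that redefinition does, here is a small userspace mock,
not kernel code. dev_warn(), log_buf and probe_device() are made-up
stand-ins for illustration only. In drivers/base/dd.c itself the
override would go after the #include lines (so the header's definition
of dev_dbg already exists), while a #define DEBUG would have to go
before them.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel log, so the output can be inspected. */
static char log_buf[256];

/* Hypothetical stand-in for the kernel's dev_warn(): appends "<dev>: <msg>". */
static void dev_warn(const char *dev, const char *fmt, ...)
{
	size_t used = strlen(log_buf);
	va_list ap;
	int n;

	n = snprintf(log_buf + used, sizeof(log_buf) - used, "%s: ", dev);
	if (n < 0 || (size_t)n >= sizeof(log_buf) - used)
		return;
	used += (size_t)n;

	va_start(ap, fmt);
	vsnprintf(log_buf + used, sizeof(log_buf) - used, fmt, ap);
	va_end(ap);
}

/* Default behaviour without DEBUG: debug messages compile away. */
#define dev_dbg(dev, fmt, ...) do { } while (0)

/*
 * The temporary override. The #undef avoids a macro-redefinition
 * warning, since dev_dbg is already defined at this point.
 */
#undef dev_dbg
#define dev_dbg dev_warn

/* Hypothetical caller: its debug message now goes out at warning level. */
static void probe_device(const char *dev)
{
	dev_dbg(dev, "probing with driver %s\n", "demo");
}
```

Calling probe_device("spi0.0") here leaves "spi0.0: probing with driver
demo" in log_buf; with the two override lines removed, the buffer would
stay empty.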

That may help find the culprit of your hang.


Heiko

