Date:   Fri, 17 Sep 2021 11:34:43 +0000
From:   "Vaittinen, Matti" <Matti.Vaittinen@...rohmeurope.com>
To:     Grygorii Strashko <grygorii.strashko@...com>,
        Tony Lindgren <tony@...mide.com>
CC:     "linux-omap@...r.kernel.org" <linux-omap@...r.kernel.org>,
        Suman Anna <s-anna@...com>,
        Paul Barker <paul.barker@...cloud.com>,
        Peter Ujfalusi <peter.ujfalusi@...il.com>,
        Benoît Cousson <bcousson@...libre.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: beaglebone black boot failure Linux v5.15.rc1

Thanks a lot guys!

On 9/17/21 14:01, Grygorii Strashko wrote:
> 
> 
> On 17/09/2021 13:57, Grygorii Strashko wrote:
>>
>>
>> On 17/09/2021 13:28, Vaittinen, Matti wrote:
>>> Hi deeee Ho Tony & All,
>>>
>>> On 9/17/21 09:14, Tony Lindgren wrote:
>>>> Hi,
>>>>
>>>> * Vaittinen, Matti <Matti.Vaittinen@...rohmeurope.com> [210916 09:15]:
>>>
>>>>> My beaglebone black (rev c) based test environment fails to boot with
>>>>> v5.15-rc1. Boot succeeds with the v5.14.
>>>>>
>>>>> Bisecting the Linus' tree pointed out the commit:
>>>>> [1c7ba565e70365763ea780666a3eee679344b962] ARM: dts: am335x-baltos:
>>>>> switch to new cpsw switch drv
>>>>>
>>>>> I don't see this exact commit touching the BBB device-tree. In Linus'
>>>>> tree it is a part of a merge commit. Reverting the whole merge on top of
>>>>> the v5.15-rc1
>>>>>
>>>>> This reverts commit 81b6a285737700c2e04ef0893617b80481b6b4b7, reversing
>>>>> changes made to f73979109bc11a0ed26b6deeb403fb5d05676ffc.
>>>>>
>>>>> makes my beaglebone black to boot again.
>>>>>
>>>>> Yesterday I tried adding this patch:
>>>>> https://lore.kernel.org/linux-omap/20210915065032.45013-1-tony@atomide.com/T/#u 
>>>>>
>>>>> pointed by Tom on top of the v5.15-rc1 - no avail. I also did #define
>>>>> DEBUG at ti-sys.c as was suggested by Tom - but I don't see any more
>>>>> output.
>>>>
>>>> Correction, that was me, not Tom :)
>>>
>>> Oh.. Sorry! I don't know where I picked Tom from... My bad!
>>>
>>>> For me, adding any kind of delay fixed the issue. Also adding some printk
>>>> statements fixed it for me.
>>>>
>>>>> Any suggestions what to check next?
>>>>
>>>> Maybe try the attached patch? If it helps, just try with the
>>>> ti,sysc-delay-us = <2> added as few modules need that after enable.
>>>>
>>>> It's also possible there is an issue with some other device that is now
>>>> getting enabled other than pruss. The last XXX printk output should show
>>>> the last device being probed.
>>>>
>>>> Looks like you need to also enable CONFIG_SERIAL_EARLYCON=y, and pass
>>>> console=ttyS0,115200 debug earlycon in the kernel command line.
>>>
>>> Ah. Thanks again. I indeed lacked the "debug earlycon" parameters. Now
>>> we're more likely to see what went wrong :) I pasted the serial log from
>>> failing boot with v5.15-rc1 which has both the patch you linked me above
>>> and the patch you suggested me to test in previous email.
>>>
>>

This really feels like a timing/synchronization issue. Adding prints to
omap_reset_deassert() made the Oops go away. I suspect the prints changed
the timing by just the needed bit. Later the boot hung on a failing NFS
mount, though that may also be a problem on the NFS server side. (I have
a new laptop and I am still trying to set up my development environment
there.)


>> [...]
>>
>>> [    2.786181] ti-sysc 48311fe0.target-module: XXX sysc_probe
>>> [    2.791994] ti-sysc 48311fe0.target-module:
>>> 48310000:2000:1fe0:1fe4:NA:00000020:rng
>>> [    2.800820] omap_rng 48310000.rng: Random Number Generator ver. 20
>>> [    2.807315] random: crng init done
>>> [    2.814207] ti-sysc 4a101200.target-module: XXX sysc_probe
>>> [    2.820080] ti-sysc 4a101200.target-module:
>>> 4a100000:8000:1200:1208:1204:4edb0100:cpgmac
>>
>> This one cpsw
>>
>>> [    2.830347] ti-sysc 4a326000.target-module: XXX sysc_probe
>>
>> This one pruss and it still shows sysc_probe
>>
>> Not sure what are the dependency here :( if any.
>>
>> Additional option to try - cmdline param "initcall_debug" and maybe
>> increase print level in really_probe_debug()
>>
> 
> Just to be clear - idea is to see *all* probes - not only sysc.
> 
> [...]
> 

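For anyone repeating this with a longer log: the check done by eye on the
quoted ti-sysc excerpt (which target-module printed "XXX sysc_probe" but never
printed its module details) can be scripted. The sketch below is only an
illustration in Python. The function name and the regular expressions are my
own assumptions, the "XXX sysc_probe" marker exists only with the debug patch
applied, and the sample lines are the ones quoted above.

```python
import re

# "XXX sysc_probe" is printed by the debug patch under test (not mainline).
PROBE_RE = re.compile(r"ti-sysc (\S+): XXX sysc_probe")
# Any other ti-sysc line for the same device is treated as "probe got further".
SYSC_RE = re.compile(r"ti-sysc (\S+):")


def unfinished_sysc_probes(lines):
    """Return devices that logged 'XXX sysc_probe' with no later ti-sysc line."""
    pending = []
    for line in lines:
        m = PROBE_RE.search(line)
        if m:
            pending.append(m.group(1))
            continue
        m = SYSC_RE.search(line)
        if m and m.group(1) in pending:
            pending.remove(m.group(1))
    return pending


# Lines copied from the quoted serial log (wrapped as in the mail).
SAMPLE = """\
[    2.786181] ti-sysc 48311fe0.target-module: XXX sysc_probe
[    2.791994] ti-sysc 48311fe0.target-module:
48310000:2000:1fe0:1fe4:NA:00000020:rng
[    2.800820] omap_rng 48310000.rng: Random Number Generator ver. 20
[    2.807315] random: crng init done
[    2.814207] ti-sysc 4a101200.target-module: XXX sysc_probe
[    2.820080] ti-sysc 4a101200.target-module:
4a100000:8000:1200:1208:1204:4edb0100:cpgmac
[    2.830347] ti-sysc 4a326000.target-module: XXX sysc_probe
"""

if __name__ == "__main__":
    for dev in unfinished_sysc_probes(SAMPLE.splitlines()):
        print("probe started but no details seen:", dev)
```

On the excerpt above this leaves only 4a326000.target-module (pruss) pending,
which matches what Grygorii pointed out.
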
I added initcall_debug and changed the pr_debug() to pr_err() in
really_probe_debug(). The log from that run is attached.
omap_reset_deassert() was not instrumented to print/delay for this run.
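
In case it helps when reading an initcall_debug log, here is a rough Python
sketch that lists initcalls which were started but never reported a return.
It assumes the "calling  <fn> @ <pid>" and "initcall <fn> returned <ret>
after <n> usecs" message formats as I remember them from init/main.c, so
please check them against your tree. The function name is made up, and the
sample lines are invented for illustration; they are not taken from the
attached log.

```python
import re

# Assumed initcall_debug message formats; verify against init/main.c.
CALL_RE = re.compile(r"calling\s+(\S+) @ \d+")
DONE_RE = re.compile(r"initcall (\S+) returned (-?\d+) after \d+ usecs")


def pending_initcalls(lines):
    """Return initcalls that logged 'calling' but no matching 'returned' line."""
    started = []
    for line in lines:
        m = CALL_RE.search(line)
        if m:
            started.append(m.group(1))
            continue
        m = DONE_RE.search(line)
        if m and m.group(1) in started:
            started.remove(m.group(1))
    return started


# Hypothetical log lines, invented for this example only.
EXAMPLE = [
    "[    1.000000] calling  foo_init+0x0/0x10 @ 1",
    "[    1.000100] initcall foo_init+0x0/0x10 returned 0 after 97 usecs",
    "[    1.000200] calling  bar_init+0x0/0x20 @ 1",
]

if __name__ == "__main__":
    for fn in pending_initcalls(EXAMPLE):
        print("no return seen for:", fn)
```

Note that a probe running from the deferred-probe workqueue would not show up
as a pending initcall, so an empty result does not rule out a stuck probe.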

Best Regards
	Matti Vaittinen

Download attachment "bbb_boot_pruss_minicom_2.cap" of type "application/vnd.tcpdump.pcap" (152009 bytes)
