Message-ID: <20121014202431.GL21164@n2100.arm.linux.org.uk>
Date:	Sun, 14 Oct 2012 21:24:31 +0100
From:	Russell King - ARM Linux <linux@....linux.org.uk>
To:	Daniel Mack <zonque@...il.com>
Cc:	Al Viro <viro@...IV.linux.org.uk>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [git pull] signals pile 3

On Sun, Oct 14, 2012 at 05:35:23PM +0200, Daniel Mack wrote:
> I rebased my ARM development branch and figured that your patch 9fff2fa
> ("arm: switch to saner kernel_execve() semantics") breaks the boot on my
> board right after init is invoked via NFS:

Ok, I'm not going to assign blame to Al's commits (I never reviewed his
stuff before they hit mainline - patches never posted to the ARM mailing
list, and the development actually happened within the merge window,
all things we tell people not to do...)  I _still_ haven't reviewed that
stuff yet.

But... nevertheless...

> [    4.682072] VFS: Mounted root (nfs filesystem) on device 0:12.
> [    4.690744] devtmpfs: mounted
> [    4.694395] Freeing init memory: 172K
> [    5.291417] Internal error: Oops - undefined instruction: 0 [#1] SMP THUMB2

Ok, so this tells us the kernel was built using Thumb2 ISA.

> [    5.298734] Modules linked in:
> [    5.301952] CPU: 0    Not tainted  (3.6.0-11053-g56c8535 #128)
> [    5.308071] PC is at cpsw_probe+0x422/0x9ac

PC is not word aligned, so it can't be running in the ARM ISA.

> [    5.312459] LR is at trace_hardirqs_on_caller+0x8f/0xfc
> [    5.317934] pc : [<c03493de>]    lr : [<c005e81f>]    psr: 60000113

Note that this reconfirms the above (well, it should do, it's the same
value.)
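
The alignment argument can be checked directly against the pc value in the quoted register line. The following is only a minimal sketch, not anything from the kernel; the helper names are made up for illustration. ARM-state instructions are 4 bytes and word aligned, while Thumb instructions only need halfword alignment.

```python
# Hypothetical helpers for illustration; not kernel code.

def is_word_aligned(addr: int) -> bool:
    """True if addr could be the address of an ARM-state (4-byte) instruction."""
    return addr & 0x3 == 0

def is_halfword_aligned(addr: int) -> bool:
    """True if addr could be the address of a Thumb instruction."""
    return addr & 0x1 == 0

pc = 0xc03493de  # pc value from the oops quoted above

print(hex(pc & 0x3))            # low two bits of pc
print(is_word_aligned(pc))      # can this be an ARM-state instruction address?
print(is_halfword_aligned(pc))  # can this be a Thumb instruction address?
```

The low two bits of this pc are 0x2, so the address is halfword aligned but not word aligned.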

> [    5.317934] sp : cf055fb0  ip : 00000000  fp : 00000000
> [    5.329944] r10: 00000000  r9 : 00000000  r8 : 00000000
> [    5.335413] r7 : 00000000  r6 : 00000000  r5 : c034458d  r4 : 00000000
> [    5.342244] r3 : cf057a40  r2 : 00000000  r1 : 00000001  r0 : 00000000
> [    5.349078] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user

And this tells us that we're running in ARM mode, not Thumb mode.
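
The same conclusion can be read straight out of the psr value (60000113) in the quoted register line. Below is a rough sketch of that decode, assuming the usual ARMv7 CPSR layout (N/Z/C/V in bits 31..28, I in bit 7, F in bit 6, T in bit 5, mode in bits 4..0). The function name and the returned dictionary are invented for this example, and the J bit is ignored for simplicity.

```python
# Illustrative decoder, not the kernel's own code. Bit positions follow the
# ARMv7 CPSR layout; the J (Jazelle) bit is deliberately ignored here.

MODES = {
    0x10: "USR", 0x11: "FIQ", 0x12: "IRQ", 0x13: "SVC",
    0x17: "ABT", 0x1b: "UND", 0x1f: "SYS",
}

def decode_psr(psr: int) -> dict:
    # Upper-case letter means the flag is set, lower-case means clear,
    # matching the "nZCv" style used in the oops output.
    flags = "".join(
        name.upper() if psr & (1 << bit) else name.lower()
        for name, bit in zip("NZCV", (31, 30, 29, 28))
    )
    return {
        "flags": flags,
        "irqs_on": not psr & 0x80,   # I bit clear means IRQs enabled
        "fiqs_on": not psr & 0x40,   # F bit clear means FIQs enabled
        "thumb": bool(psr & 0x20),   # T bit set means Thumb state
        "mode": MODES.get(psr & 0x1f, "unknown"),
    }

print(decode_psr(0x60000113))
```

For 60000113 the T bit is clear, which is why the oops reports "ISA ARM" even though the pc is only halfword aligned.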

> [    5.356546] Control: 50c5387d  Table: 8f434019  DAC: 00000015
> [    5.362562] Process init (pid: 1, stack limit = 0xcf054240)
> [    5.368395] Stack: (0xcf055fb0 to 0xcf056000)
> [    5.372961] 5fa0:                                     00000001
> 00000000 00000000 00000000
> [    5.381525] 5fc0: cf055fb0 c000d1a8 00000000 00000000 00000000
> 00000000 00000000 00000000
> [    5.390091] 5fe0: 00000000 bee83f10 00000000 b6fdedd0 00000010
> 00000000 aaaabfaf a8babbaa

No stack backtrace (and it's silent about why that is).

The other strange thing here is that the stack dump above is showing that
the stack is completely empty - which shouldn't be the case if we're in a
driver probe function - driver probe functions are called via the driver
model layers...

> [    5.398664] Code: 2206a010 718ef508 0184f8da f8b1f65d (3070f8d8)

And now we come to the Code: line, which makes no sense when decoded as ARM ISA:

   0:   2206a010        andcs   sl, r6, #16
   4:   718ef508        orrvc   pc, lr, r8, lsl #10
   8:   0184f8da        ldrdeq  pc, [r4, sl]
   c:   f8b1f65d                        ; <UNDEFINED> instruction: 0xf8b1f65d
  10:   3070f8d8        ldrsbtcc        pc, [r0], #-136 ; 0xffffff78    ; <UNPREDICTABLE>

But as Thumb, it looks more reasonable:

   0:   a010            add     r0, pc, #64     ; (adr r0, 44 <foo+0x44>)
   2:   2206            movs    r2, #6
   4:   f508 718e       add.w   r1, r8, #284    ; 0x11c
   8:   f8da 0184       ldr.w   r0, [sl, #388]  ; 0x184
   c:   f65d f8b1       bl      ffe5d172 <foo+0xffe5d172>
  10:   f8d8 3070       ldr.w   r3, [r8, #112]  ; 0x70
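
To get from the Code: line to the Thumb listing above, each dumped 32-bit word has to be split back into two 16-bit halfwords in memory order. Here is a small sketch of that step, assuming the words were printed as little-endian 32-bit values (so the low halfword sits first in memory); the function name is hypothetical.

```python
import re

def thumb_halfwords(code_line: str) -> list:
    """Split the 32-bit words of an oops 'Code:' line into 16-bit halfwords
    in memory order, assuming little-endian words (an assumption here)."""
    halfwords = []
    # Pick out every 8-digit hex word; the parentheses that mark the
    # faulting word are simply skipped by the pattern.
    for word in re.findall(r"[0-9a-fA-F]{8}", code_line):
        value = int(word, 16)
        halfwords.append(f"{value & 0xffff:04x}")  # low halfword comes first
        halfwords.append(f"{value >> 16:04x}")     # then the high halfword
    return halfwords

code = "2206a010 718ef508 0184f8da f8b1f65d (3070f8d8)"
print(" ".join(thumb_halfwords(code)))
# prints: a010 2206 f508 718e f8da 0184 f65d f8b1 f8d8 3070
```

The resulting halfword stream matches the encodings shown in the Thumb disassembly, with the parenthesised word giving the f8d8 3070 pair at offset 0x10.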

I don't have any further comments to make on this yet, as I've no idea
what state stuff is in, but the above oops dump to me suggests that
we've randomly jumped into some part of the kernel which just happens
to be cpsw_probe().

Please send me (in private mail) your vmlinux file and a corresponding
oops dump from that same kernel, and I'll dig and try and work out
what's going on...

This kind of investigation reminds me of those I did back in the 1990s
when stuff was rather unstable and ARM was a young architecture.  Now
all we need is for an ARM platform to dump its entire memory out the
ethernet port, bringing a university department network to a halt (I
did that once - back in the 1990s - sorry Tim!)