Message-Id: <200712202047.56745.m.kozlowski@tuxland.pl>
Date:	Thu, 20 Dec 2007 20:47:55 +0100
From:	Mariusz Kozlowski <m.kozlowski@...land.pl>
To:	Matt Mackall <mpm@...enic.com>
Cc:	David Miller <davem@...emloft.net>, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org, sparclinux@...r.kernel.org
Subject: Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

Hello, 

> > > Actually, you may only need these two:
> > > 
> > > > maps4-add-proc-kpagecount-interface.patch
> > > > maps4-add-proc-kpageflags-interface.patch
> > 
> > Yes these two were enough, and exporting fs/proc/base.c's
> > mem_lseek().
> > 
> > As hard as I try, I can't reproduce this at all.  I tried
> > both on my workstation and my niagara boxes.
> 
> That's good to know, I was having a very hard time imagining how the
> kpagecount code could be going south.
>  
> > It must be other needle in the 30MB+ -mm haystack. :-(

I'm afraid you are wrong. Earlier kernels are affected as well. After reading your mail I
thought of applying those two patches to 2.6.24-rc5 and bisecting the rest of the -mm series.
Unfortunately, a clean 2.6.24-rc5 with these two patches is affected as well (new processes
stuck in D state, etc.). So I tried vanilla 2.6.23 with these two patches applied (plus the
mem_lseek export from fs/proc/base.c). Now at least I got a trace, produced by 'cat /proc/kpagecount',
which you can find below. Also, in spite of the oops, the box doesn't lock up (as it does with -mm)
and is still usable.
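
For anyone trying to confirm the "stuck in D state" symptom on their own box, here is a
minimal sketch that lists tasks in uninterruptible sleep by reading /proc/<pid>/stat.
It is not part of the original report: the function names parse_stat_state and
d_state_tasks are made up for illustration, and the only thing assumed is the stat
layout described in proc(5) (pid, then comm in parentheses, then the state letter).

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    comm may itself contain spaces or parentheses, so the state letter is
    taken as the first field after the LAST closing parenthesis.
    """
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen].strip())
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def d_state_tasks(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for tasks whose state is 'D'."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited while we were scanning, or unreadable entry
        if state == "D":
            tasks.append((pid, comm))
    return sorted(tasks)


if __name__ == "__main__":
    for pid, comm in d_state_tasks():
        print(pid, comm)
```

Running it in a second terminal before and after the 'cat' should show whether newly
started processes pile up in D state.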

[  126.060976] TSTATE: 0000009980009603 TPC: 0000000000428a84 TNPC: 0000000000428a88 Y: 00000000    Not tainted
[  126.063486] TPC: <cpu_idle+0x2c/0xe0>
[  126.065986] g0: 0000000000000009 g1: 0000048000004000 g2: 000000000000000f g3: 00000000007204c0
[  126.068636] g4: 00000000007244c0 g5: fffff8007f878000 g6: 00000000007204c0 g7: 0000000000724958
[  126.071232] o0: 0000000000000001 o1: 00000000007204c8 o2: 0000000000000001 o3: 0000000000000000
[  126.073924] o4: 6000000000000000 o5: 000000000078f140 sp: 00000000007239b1 ret_pc: 0000000000428a78
[  126.076569] RPC: <cpu_idle+0x20/0xe0>
[  126.079185] l0: 0000000000720000 l1: 0000000000000002 l2: 0000000000000001 l3: 000000000075d400
[  126.081934] l4: 000000000075d400 l5: fffff80080015b10 l6: fffff80080005b08 l7: 0000000000000001
[  126.084637] i0: 0000000000000001 i1: 0000000000720094 i2: 0000000000000000 i3: 0000000000000000
[  126.087375] i4: 00000000007204c0 i5: 0000000000000002 i6: 0000000000723a71 i7: 0000000000665a24
[  126.090135] I7: <rest_init+0x6c/0x80>
[  145.121228] Unable to handle kernel NULL pointer dereference
[  145.124515] tsk->{mm,active_mm}->context = 0000000000000d41
[  145.127778] tsk->{mm,active_mm}->pgd = fffff800bd8d2000
[  145.127801]               \|/ ____ \|/
[  145.127808]               "@'/ .. \`@"
[  145.127815]               /_| \__/ |_\
[  145.127821]                  \__U_/
[  145.127831] cat(3111): Oops [#1]
[  145.127849] 
[  145.127853] =================================
[  145.127861] [ INFO: inconsistent lock state ]
[  145.127873] 2.6.23 #1
[  145.127880] ---------------------------------
[  145.127891] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
[  145.127906] cat/3111 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  145.127918]  (regdump_lock){+...}, at: [<00000000004281d0>] __show_regs+0x18/0x320
[  145.127951] {in-hardirq-W} state was registered at:
[  145.127960]   [<0000000000669780>] _spin_lock+0x28/0x40
[  145.127983]   [<00000000004281d0>] __show_regs+0x18/0x320
[  145.128000]   [<00000000004284e4>] show_regs+0xc/0x20
[  145.128016]   [<00000000005ac9d8>] sysrq_handle_showregs+0x20/0x40
[  145.128041]   [<00000000005ac7fc>] __handle_sysrq+0x84/0x160
[  145.128060]   [<00000000005ac8f8>] handle_sysrq+0x20/0x40
[  145.128078]   [<00000000005a4f08>] kbd_event+0x670/0xb60
[  145.128110]   [<00000000005ea0c0>] input_event+0x1e8/0x560
[  145.128140]   [<00000000005efa2c>] sunkbd_interrupt+0x114/0x140
[  145.128167]   [<00000000005e6270>] serio_interrupt+0x38/0xa0
[  145.128186]   [<00000000005b2e58>] sunsu_kbd_ms_interrupt+0xa0/0x140
[  145.128212]   [<000000000049f6f8>] handle_IRQ_event+0x20/0x80
[  145.128251]   [<000000000049f808>] __do_IRQ+0xb0/0x140
[  145.128268]   [<000000000042f48c>] handler_irq+0x94/0xc0
[  145.128306]   [<0000000000426f30>] sunos_sys_table+0x560/0x728
[  145.128324]   [<0000000000428a78>] cpu_idle+0x20/0xe0
[  145.128341]   [<0000000000665a24>] rest_init+0x6c/0x80
[  145.128375]   [<000000000076ec24>] start_kernel+0x2ec/0x340
[  145.128405]   [<000000000066599c>] tlb_fixup_done+0xa0/0xbc
[  145.128425]   [<0000000000000000>] 0x8
[  145.128443] irq event stamp: 1209
[  145.128451] hardirqs last  enabled at (1209): [<0000000000404b74>] __handle_softirq_continue+0x20/0x24
[  145.128480] hardirqs last disabled at (1207): [<0000000000474494>] __do_softirq+0xbc/0x140
[  145.128506] softirqs last  enabled at (1208): [<00000000004744dc>] __do_softirq+0x104/0x140
[  145.128526] softirqs last disabled at (1203): [<00000000004745a0>] do_softirq+0x88/0xa0
[  145.128546] 
[  145.128551] other info that might help us debug this:
[  145.128562] no locks held by cat/3111.
[  145.128570] 
[  145.128574] stack backtrace:
[  145.128582] Call Trace:
[  145.128590]  [00000000004907a0] print_usage_bug+0x148/0x160
[  145.128624]  [00000000004917f4] mark_lock+0x6dc/0x780
[  145.128641]  [000000000049286c] __lock_acquire+0x734/0x12a0
[  145.128659]  [0000000000493430] lock_acquire+0x58/0x80
[  145.128676]  [0000000000669780] _spin_lock+0x28/0x40
[  145.128691]  [00000000004281d0] __show_regs+0x18/0x320
[  145.128706]  [0000000000429ba0] die_if_kernel+0x68/0x2c0
[  145.128722]  [0000000000452ab0] unhandled_fault+0x78/0xe0
[  145.128749]  [0000000000452d14] do_sparc64_fault+0x17c/0x620
[  145.128765]  [000000000040798c] sparc64_realfault_common+0x18/0x20
[  145.128787]  [fffff800bdca3e80] 0xfffff800bdca3e88
[  145.128799]  [000000000050affc] proc_reg_read+0x64/0xa0
[  145.128828]  [00000000004ccb4c] vfs_read+0x74/0x120
[  145.128856]  [00000000004ccf4c] sys_read+0x34/0x60
[  145.128872]  [0000000000406314] linux_sparc_syscall+0x3c/0x44
[  145.128898]  [0000000000012ff4] 0x12ffc
[  145.128915] TSTATE: 0000004411009603 TPC: 00000000005119ac TNPC: 00000000005119b0 Y: 00000000    Not tainted
[  145.128940] TPC: <kpagecount_read+0x94/0xe0>
[  145.128951] g0: 0000000000000000 g1: 0000000000000058 g2: 0000000000000000 g3: 0000000000028008
[  145.128966] g4: fffff800bfc3a460 g5: fffff8007f878000 g6: fffff800bdca0000 g7: 0000000000000000
[  145.128982] o0: 0000000000000001 o1: 0000000000000001 o2: 000000000050afe4 o3: 0000000000000000
[  145.128997] o4: 0000000000000002 o5: 0000000000b80320 sp: fffff800bdca3391 ret_pc: fffff800bdca3e80
[  145.129013] RPC: <0xfffff800bdca3e88>
[  145.129023] l0: fffff800bfc3a460 l1: 0000000000669d3c l2: 0000000000000001 l3: 000000000075d400
[  145.129039] l4: 000000000075d400 l5: fffff80080015b10 l6: fffff80080005b08 l7: 0000000000000001
[  145.129054] i0: 0000000000028010 i1: 0000000000028000 i2: 0000000000001ff8 i3: 0000000000000002
[  145.129070] i4: 0000000000000058 i5: 0000000000000000 i6: fffff800bdca3451 i7: 000000000050affc
[  145.129088] I7: <proc_reg_read+0x64/0xa0>
[  145.129119] Caller[000000000050affc]: proc_reg_read+0x64/0xa0
[  145.129139] Caller[00000000004ccb4c]: vfs_read+0x74/0x120
[  145.129156] Caller[00000000004ccf4c]: sys_read+0x34/0x60
[  145.129173] Caller[0000000000406314]: linux_sparc_syscall+0x3c/0x44
[  145.129193] Caller[0000000000012ff4]: 0x12ffc
[  145.129205] Instruction DUMP: 82070002  02c04003  86063ff8 <ce406008> cef0e000  82100000  8610001b  b406bff8  80a06000 

> Have we seen a config for the broken machine? Perhaps that'll help us
> make a guess..

Please find it attached (version 2.6.23).

The box is a Sun Ultra 60 with 2 CPUs.

# lspci 
0000:00:00.0 Host bridge: Sun Microsystems Computer Corp. Psycho PCI Bus Module
0000:00:01.0 Bridge: Sun Microsystems Computer Corp. EBUS (rev 01)
0000:00:01.1 Ethernet controller: Sun Microsystems Computer Corp. Happy Meal (rev 01)
0000:00:03.0 SCSI storage controller: LSI Logic / Symbios Logic 53c875 (rev 14)
0000:00:03.1 SCSI storage controller: LSI Logic / Symbios Logic 53c875 (rev 14)
0001:00:00.0 Host bridge: Sun Microsystems Computer Corp. Psycho PCI Bus Module

# cat /proc/cpuinfo 
cpu             : TI UltraSparc II  (BlackBird)
fpu             : UltraSparc II integrated FPU
prom            : OBP 3.17.0 1998/10/23 11:26
type            : sun4u
ncpus probed    : 2
ncpus active    : 2
D$ parity tl1   : 0
I$ parity tl1   : 0
Cpu0ClkTck      : 000000001ad1c43b
Cpu2ClkTck      : 000000001ad1c43b
MMU Type        : Spitfire
State:
CPU0:           online
CPU2:           online

# cat /proc/meminfo 
MemTotal:      1015648 kB
MemFree:        961840 kB
Buffers:          5680 kB
Cached:          18096 kB
SwapCached:          0 kB
Active:          22440 kB
Inactive:        10288 kB
SwapTotal:      497992 kB
SwapFree:       497992 kB
Dirty:              32 kB
Writeback:           0 kB
AnonPages:        9168 kB
Mapped:           4288 kB
Slab:            10368 kB
SReclaimable:     4008 kB
SUnreclaim:       6360 kB
PageTables:        424 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   1005816 kB
Committed_AS:    27272 kB
VmallocTotal:  4194304 kB
VmallocUsed:       136 kB
VmallocChunk:  4194168 kB

# cat /proc/interrupts 
           CPU0       CPU2       
  0:      24567      16248     <NULL>  timer
  1:          0          0      sun4u  PSYCHO_PCIERR
  2:          0          0      sun4u  PSYCHO_UE
  3:          0          0      sun4u  PSYCHO_CE
  8:        291          0      sun4u  su(kbd)
  9:          0          0      sun4u  su(mouse)
 14:       1061          0      sun4u  eth0
 15:       2034          0      sun4u  sym53c8xx
 16:          0         30      sun4u  sym53c8xx
 17:          0          0      sun4u  PSYCHO_PCIERR

I'll try earlier kernels and see what happens.

Regards,

	Mariusz

View attachment "config-sparc64-2.6.23" of type "text/plain" (19369 bytes)
