linux-kernel - Re: [PATCH] SYSVIPC - Fix the ipc structures initialization

Message-ID: <491BC4B8.1050406@colorfullife.com>
Date:	Thu, 13 Nov 2008 07:10:00 +0100
From:	Manfred Spraul <manfred@...orfullife.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
CC:	cboulte@...il.com, Nadia.Derbey@...l.net,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>
Subject: Re: [PATCH] SYSVIPC - Fix the ipc structures initialization

Andrew Morton wrote:
> Time is starting to press on this one.  Is there something which we can
> revert which would fix this bug?
>   
My previous analysis was bogus, so let's start from scratch:

1) the initial oops report:
http://bugzilla.kernel.org/show_bug.cgi?id=11796#c0

- lockdep is enabled, the oops is somewhere in __lock_acquire
- the instruction that oopses is

 >>>  lock incl 0x138(%r12)
R12 is 0x0038004000000000

That could be a debug_atomic_inc() in __lock_acquire. The class pointer
in the spinlock_t is not initialized, thus it crashes.
Ingo - is that possible?
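
As a side check on the register value above, here is a minimal sketch that tests whether an address is canonical on x86-64. It assumes 48-bit virtual addresses (4-level paging); the helper names is_canonical() and effective_address() are made up for this illustration and are not kernel functions. Under that assumption, R12 + 0x138 is not a canonical address, which is consistent with R12 holding garbage rather than a valid pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48-bit virtual addresses, i.e. x86-64 with 4-level paging. */
#define VA_BITS 48

/*
 * Hypothetical helper: an address is canonical when bits 63..47 are
 * either all zeros (user half) or all ones (kernel half).
 */
static inline bool is_canonical(uint64_t addr)
{
	uint64_t upper = addr >> (VA_BITS - 1);	/* the 17 topmost bits */

	return upper == 0 || upper == 0x1FFFF;
}

/* Hypothetical helper: effective address of "disp(%reg)". */
static inline uint64_t effective_address(uint64_t base, uint64_t disp)
{
	return base + disp;
}

/* Small self-check using the values quoted in this mail. */
void canonical_demo(void)
{
	uint64_t r12 = 0x0038004000000000ULL;	/* R12 from the oops */

	/* lock incl 0x138(%r12) touches a non-canonical address */
	assert(!is_canonical(effective_address(r12, 0x138)));
	/* a kernel text address from the later trace is canonical */
	assert(is_canonical(0xffffffff802feb07ULL));
}
```

Calling canonical_demo() from a test program should pass silently. It only shows that the value cannot be a valid pointer; it does not show where the garbage came from.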

2) the latest oops was actually a soft lockup:

It starts with:
> [  400.393024] INFO: trying to register non-static key.
> [  400.397005] the code is fine but needs lockdep annotation.
> [  400.397005] turning off the locking correctness validator.
> [  400.397005] Pid: 4207, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #1
> [  400.397005] Call Trace:
> [  400.397005]  [<ffffffff80257055>] static_obj+0x60/0x77
> [  400.397005]  [<ffffffff8025af59>] __lock_acquire+0x1c8/0x779
> [  400.397005]  [<ffffffff8025b59f>] lock_acquire+0x95/0xc2
> [  400.397005]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
> [  400.397005]  [<ffffffff8045117d>] _spin_lock+0x2d/0x5a
> [  400.397005]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
> [  400.397005]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
> [  400.397005]  [<ffffffff802feaa5>] ipc_lock+0x0/0x99
> [  400.397005]  [<ffffffff802feb46>] ipc_lock_check+0x8/0x53
> [  400.397005]  [<ffffffff803002c3>] sys_msgctl+0x188/0x461
> [  400.397005]  [<ffffffff80259ac7>] trace_hardirqs_on_caller+0x100/0x12a
> [  400.397005]  [<ffffffff80450d49>] trace_hardirqs_on_thunk+0x3a/0x3f
> [  400.397005]  [<ffffffff80259ac7>] trace_hardirqs_on_caller+0x100/0x12a
> [  400.397005]  [<ffffffff80212e09>] sched_clock+0x5/0x7
> [  400.397005]  [<ffffffff80450d49>] trace_hardirqs_on_thunk+0x3a/0x3f
> [  400.397005]  [<ffffffff80213021>] native_sched_clock+0x8c/0xa5
> [  400.397005]  [<ffffffff80212e09>] sched_clock+0x5/0x7
> [  400.397005]  [<ffffffff8020bf7a>] system_call_fastpath+0x16/0x1b
> [  400.397005]
> [  464.933003] BUG: soft lockup - CPU#2 stuck for 61s! [sysv_test2:4207]
> [  464.933006] Call Trace:
> [  464.933006]  [<ffffffff8033dc6b>] _raw_spin_lock+0x98/0x100
> [  464.933006]  [<ffffffff8045119e>] _spin_lock+0x4e/0x5a
> [  464.933006]  [<ffffffff802feb07>] ipc_lock+0x62/0x99

To me, this reads like an uninitialized spinlock_t:
The static_obj test in kernel/lockdep.c notices that something is wrong, and lockdep disables itself.
But then _raw_spin_lock() tries to acquire the uninitialized spinlock and loops forever, because no one ever does spin_unlock().
After about 60 seconds, the soft lockup detection notices the problem and oopses.
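
The failure mode can be sketched in user space. This is a toy model only, not the kernel's spinlock implementation: struct toy_spinlock and all toy_* functions are invented for this illustration, the "uninitialized" state is simulated with an arbitrary nonzero pattern, and a spin limit stands in for the soft lockup watchdog.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock: 0 means free, anything else means "held". */
struct toy_spinlock {
	atomic_uint locked;
};

/* Proper initialization: mark the lock as free. */
static inline void toy_spin_lock_init(struct toy_spinlock *l)
{
	assert(l != NULL);
	atomic_init(&l->locked, 0);
}

/* Simulate leftover garbage in a lock that was never initialized. */
static inline void toy_spin_lock_poison(struct toy_spinlock *l)
{
	assert(l != NULL);
	atomic_init(&l->locked, 0xA5A5A5A5u);
}

/*
 * Try to take the lock, giving up after max_spins attempts.
 * Returning false plays the role of the soft lockup report:
 * the lock never became free, so the caller would spin forever.
 */
static inline bool toy_spin_lock_bounded(struct toy_spinlock *l,
					 unsigned long max_spins)
{
	assert(l != NULL);
	for (unsigned long i = 0; i < max_spins; i++) {
		unsigned int expected = 0;

		if (atomic_compare_exchange_strong(&l->locked, &expected, 1))
			return true;
	}
	return false;
}

static inline void toy_spin_unlock(struct toy_spinlock *l)
{
	assert(l != NULL);
	atomic_store(&l->locked, 0);
}
```

With a poisoned lock, toy_spin_lock_bounded() returns false no matter how large max_spins is, because there is no holder that will ever call toy_spin_unlock(). After toy_spin_lock_init() the first attempt succeeds.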