Message-ID: <CA+icZUVuJWpLquLk4kHJzmMKtPw24oxud=u3Na0U0HhSYqwV1w@mail.gmail.com>
Date: Sat, 3 Sep 2011 07:54:47 +0200
From: Sedat Dilek <sedat.dilek@...glemail.com>
To: Valdis.Kletnieks@...edu
Cc: Tim Chen <tim.c.chen@...ux.intel.com>,
Jiri Slaby <jirislaby@...il.com>,
"David S. Miller" <davem@...emloft.net>,
ML netdev <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Stephen Rothwell <sfr@...b.auug.org.au>
Subject: Re: [next] unix stream crashes
On Sat, Sep 3, 2011 at 7:35 AM, <Valdis.Kletnieks@...edu> wrote:
> On Fri, 02 Sep 2011 16:55:03 PDT, Tim Chen said:
>
>> I'd like to isolate the problem to either the send path or receive
>> path. My suspicion is the error handling portion of the send path is not
>> quite right but I haven't yet found any issues after reviewing the
>> patch.
>
> Took a while, because it took a few tries to get netconsole working,
> and then I was seeing odd results, but here we go:
>
> next-20110831 - crashes 100% consistent.
> next-20110831 + revert 0856a30409 - OK.
> revert + scm_recv.patch - OK.
> revert + scm_send.patch - crashes 100% consistent.
>
YES, I can confirm this with next-20110826.
> Now the odd part - although I was seeing crashes 100% of the time, I saw a
> number of different tracebacks (but I never actually saw the same traceback
> that Jiri had). Also, the system died at different points - most of the time it
> would live long enough for GDM to prompt for a userid/password and then die,
> but sometimes it didn't get as far as the GDM screen. Hopefully the variety of
> crashes will tell you something useful.
>
> I'll be able to test patches for go/nogo over the weekend, but probably won't
> have a second machine to catch netconsole until I'm back in the office Monday.
>
> Example 1:
>
> [ 142.316258] Kernel panic - not syncing: CRED: put_cred_rcu() sees ffff88010d1ff300 with usage -41
> [ 142.316260]
> [ 142.316275] Pid: 2264, comm: gdm-simple-slav Tainted: G W 3.1.0-rc4-next-20110831-dirty #17
> [ 142.316279] Call Trace:
> [ 142.316283] <IRQ> [<ffffffff81577a6c>] panic+0x96/0x1a2
> [ 142.316300] [<ffffffff8105cb54>] put_cred_rcu+0x32/0x91
> [ 142.316306] [<ffffffff8157a44f>] rcu_do_batch+0xcb/0x1e4
> [ 142.316313] [<ffffffff81092967>] invoke_rcu_callbacks+0x6c/0xc7
> [ 142.316319] [<ffffffff810932f8>] __rcu_process_callbacks+0x118/0x124
> [ 142.316325] [<ffffffff810934f0>] rcu_process_callbacks+0x64/0x72
> [ 142.316331] [<ffffffff8103f8c4>] __do_softirq+0x110/0x278
> [ 142.316338] [<ffffffff815a23ac>] call_softirq+0x1c/0x30
> [ 142.316342] <EOI> [<ffffffff81003647>] do_softirq+0x44/0xf1
> [ 142.316352] [<ffffffff8103f485>] _local_bh_enable_ip+0x12a/0x178
> [ 142.316358] [<ffffffff8103f4dc>] local_bh_enable_ip+0x9/0xb
> [ 142.316364] [<ffffffff8159a2f3>] _raw_write_unlock_bh+0x36/0x3a
> [ 142.316372] [<ffffffff814c1ac3>] unix_release_sock+0x86/0x1ff
> [ 142.316378] [<ffffffff8105b548>] ? up_read+0x1b/0x32
> [ 142.316383] [<ffffffff814c1c5d>] unix_release+0x21/0x23
> [ 142.316390] [<ffffffff81423d02>] sock_release+0x1a/0x6f
> [ 142.316395] [<ffffffff81424a30>] sock_close+0x22/0x26
> [ 142.316401] [<ffffffff810fcacb>] __fput+0x140/0x1fe
> [ 142.316407] [<ffffffff810f97cb>] ? sys_close+0xe6/0x158
> [ 142.316412] [<ffffffff810fcb9e>] fput+0x15/0x17
> [ 142.316417] [<ffffffff810f8ef2>] filp_close+0x87/0x93
> [ 142.316422] [<ffffffff810f97d6>] sys_close+0xf1/0x158
> [ 142.316429] [<ffffffff815a0ffb>] system_call_fastpath+0x16/0x1b
>
I saw similar call traces involving put_cred_rcu(), and also some
involving kmem_cache_alloc_trace().
My post-it says:
Kernel panic - not syncing: CRED: put_cred_rcu sees f67ac0c0 with usage -43
BTW, systemd (which makes heavy use of D-Bus/sockets) is more sensitive
to this than Debian's standard sysvinit.
- Sedat -
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/