[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGXu5jLabRWRdknf_CSBJ316vXYrEUCPTUCsm50WEwdpVNfE3g@mail.gmail.com>
Date: Tue, 16 Apr 2019 22:38:13 -0500
From: Kees Cook <keescook@...omium.org>
To: Jitendra Sharma <shajit@...eaurora.org>
Cc: "Luis R. Rodriguez" <mcgrof@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
linux-arm-msm@...r.kernel.org, Oleg Nesterov <oleg@...hat.com>
Subject: Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
On Mon, Apr 15, 2019 at 7:58 AM Jitendra Sharma <shajit@...eaurora.org> wrote:
>
> Hi Kees Cook/Luis,
>
> We are observing one kernel crash in next_tgid function through
> getdents64 path. Call stack is as shown below:
>
> -000|has_group_leader_pid(inline)
> -000|next_tgid(
> | [X20] ns = 0xFFFFFF87CABB1AC0,
> | [locdesc] iter = (
> | [locdesc] tgid = 424,
> | [locdesc] task = ?))
> | [X21] p = 0xFFFFFFD0FFFFF948
> | [X21] task = 0xFFFFFFD0FFFFF948
> -001|proc_pid_readdir(
> | [X20] file = 0xFFFFFFD1AC60FC40,
> | [X19] ctx = 0xFFFFFF8027363E40)
> | [X21] ns = 0xFFFFFF87CABB1AC0
> -002|proc_root_readdir(
> | [X20] file = 0xFFFFFFD1AC60FC40,
> | [X19] ctx = 0xFFFFFF8027363E40)
> -003|iterate_dir(
> | [X19] file = 0xFFFFFFD1AC60FC40,
> | [X22] ctx = 0xFFFFFF8027363E40)
> | [X23] inode = 0xFFFFFFD1F20246D0
> -004|SYSC_getdents64(inline)
> -004|sys_getdents64(
> | ?,
> | ?,
> | [X19] count = 4200)
> | [X19] count = 4200
> | [X20] f = ([X20] file = 0xAC60FC43AC60FC40, [X20] flags = 1207898624)
> | [X0] error = -1720
> -005|el0_svc_naked(asm)
> -->|exception
> -006|NUX:0x78C5AD7D38(asm)
> ---|end of frame
>
>
> From this call stack,task: 0xFFFFFFD0FFFFF948, seems to be invalid.
> As(from ramdumps) it doesn't have any valid fields. And while trying to
> access the fields of this task struct in has_group_leader_pid, abort is
> happening.
>
> From the dumps, its not clear why the task struct is coming to be some
> invalid (Possibly task has already exited). This issue is observed
> during normal monkey testing for long hours.
>
> Could you please provide some pointers which could help in debugging
> this issue further.
Do you have any hints on how to reproduce this? I assume something is
missing proper locking or RCU handling, but I don't see anything
obvious in the surrounding code yet...
--
Kees Cook
Powered by blists - more mailing lists