linux-kernel - RE: crash in filesytem during reboot . (and proposed patch)

Message-ID: <c0c9b98df2e80bfb63be37ee9f1be7b4@mail.gmail.com>
Date:	Fri, 22 Jun 2012 17:53:14 -0700
From:	Sadasivan Shaiju <sshaiju@...sta.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-kernel@...r.kernel.org
Subject: RE: crash in filesytem during reboot . (and proposed patch)

Hi Andrew,

Please see my replies inline.
-----Original Message-----
From: Andrew Morton [mailto:akpm@...ux-foundation.org]
Sent: Friday, June 22, 2012 2:30 PM
To: Sadasivan Shaiju
Cc: linux-kernel@...r.kernel.org
Subject: Re: crash in filesytem during reboot . (and proposed patch)

On Fri, 15 Jun 2012 11:12:09 -0700
Sadasivan Shaiju <sshaiju@...sta.com> wrote:

> Hi
>
>
>

Your email is quadruple-spaced.  Please, fix that.

Sure, I will fix this.

> I am getting the following crashes during a reboot of the system.
> It looks like a race condition during unmount.
>
> <4>Call Trace:
> <4>[] clear_inode+0x28/0xe8
> <4>[] generic_drop_inode+0x3c/0xa8
> <4>[] d_kill+0x4c/0x78
> <4>[] __shrink_dcache_sb+0x258/0x360
> <4>[] shrink_dcache_parent+0x140/0x190
> <4>[] proc_flush_task+0xac/0x2e8
> <4>[] release_task+0x80/0x4c0
> <4>[] wait_consider_task+0x608/0xa80
> <4>[] do_wait+0x10c/0x2b8
> <4>[] SyS_wait4+0x88/0x120
> <4>[] compat_sys_wait4+0xc8/0xd0
> <4>[] handle_sysn32+0x44/0x84
>
> Call Trace:
> [] file_ra_state_init+0x0/0x20
> [] __dentry_open+0x26c/0x3d0
> [] do_filp_open+0x70c/0xbc8
> [] do_sys_open+0x78/0x1e0
> [] handle_sysn32+0x44/0x84
>
> Call Trace:
> [<ffffffff812ae3e4>] iput+0x3c/0x88
> [<ffffffff812aaa84>] d_kill+0x4c/0x78
> [<ffffffff812aad08>] __shrink_dcache_sb+0x258/0x360
> [<ffffffff812ab300>] shrink_dcache_parent+0x140/0x190
> [<ffffffff812eea14>] proc_flush_task+0xac/0x2e8
> [<ffffffff811e6538>] release_task+0x80/0x4c0
> [<ffffffff811e80c8>] do_exit+0x6f8/0x908
> [<ffffffff8121dee8>] unregister_module_notifier+0x0/0x10
>
> Call Trace:
> [<ffffffff812ae3e4>] iput+0x3c/0x88
> [<ffffffff812aaa84>] d_kill+0x4c/0x78
> [<ffffffff812ab6b8>] dput+0x120/0x220
> [<ffffffff812a0f1c>] do_lookup+0xdc/0x210
> [<ffffffff812a33e8>] __link_path_walk+0x910/0x1408
> [<ffffffff812a4194>] path_walk+0x64/0x108
> [<ffffffff812a4350>] do_path_lookup+0x60/0x68
> [<ffffffff812a519c>] do_filp_open+0xdc/0xbc8
> [<ffffffff81293768>] do_sys_open+0x78/0x1e0
> [<ffffffff81103844>] handle_sysn32+0x44/0x84
>
> ...
>
> I am thinking of putting the following fix in shrink_dcache_parent().
> Please let me know if there is any problem with this fix.
>
> ...
>
> --- linux-2.6.32.orig/fs/dcache.c       2012-05-30 15:59:18.000000000 -0700
> +++ linux-2.6.32/fs/dcache.c    2012-06-11 17:10:33.000000000 -0700
> @@ -881,8 +881,14 @@
>         struct super_block *sb = parent->d_sb;
>         int found;
>
> -       while ((found = select_parent(parent)) != 0)
> -               __shrink_dcache_sb(sb, &found, 0);
> +       while ((found = select_parent(parent)) != 0) {
> +               if (down_read_trylock(&sb->s_umount)) {
> +                       if (sb->s_root != NULL) {
> +                               __shrink_dcache_sb(sb, &found, 0);
> +                       }
> +                       up_read(&sb->s_umount);
> +               }
> +       }
>  }

Please fully describe the race which you believe you have found.  What
races against what?

The race is between generic_shutdown_super() and __shrink_dcache_sb().
Under high memory pressure, one of our user processes crashed, and the
parent was trying to clean up with the following stack flow:

<4>[] clear_inode+0x28/0xe8
<4>[] generic_drop_inode+0x3c/0xa8
<4>[] d_kill+0x4c/0x78
<4>[] __shrink_dcache_sb+0x258/0x360
<4>[] shrink_dcache_parent+0x140/0x190
<4>[] proc_flush_task+0xac/0x2e8
<4>[] release_task+0x80/0x4c0
<4>[] wait_consider_task+0x608/0xa80
<4>[] do_wait+0x10c/0x2b8
<4>[] SyS_wait4+0x88/0x120
<4>[] compat_sys_wait4+0xc8/0xd0
<4>[] handle_sysn32+0x44/0x84

During that time the system gets rebooted and unmounting starts.
Meanwhile, the parent process is still trying to clean up the child's
dentries, so clear_inode() ends up referencing a stale inode and
crashes. So I try to grab the s_umount lock so that __shrink_dcache_sb()
won't be called during unmounts. This prevents accessing the stale
inode in clear_inode().

A similar race condition (between generic_shutdown_super() and
__shrink_dcache_sb()) is already prevented in prune_dcache().

Please also confirm that the bug is still present in current kernels -
2.6.32 is rather old.

I am not sure whether the bug is still present in current kernels,
but I do see some RCU locking in this area in the current kernel.

We are moving to the 3.4 kernel, but the current product is still based
on 2.6.32, so we need to fix this issue in 2.6.32.

Regards,
Shaiju.

Thanks.