lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACGdZYLjD=g0Ye=L-iVNQ7y5+2_8uw1SLy2et2z70+wBQz4_2w@mail.gmail.com>
Date:   Fri, 2 Jun 2017 22:22:39 -0700
From:   Khazhismel Kumykov <khazhy@...gle.com>
To:     Al Viro <viro@...iv.linux.org.uk>
Cc:     linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        Miklos Szeredi <mszeredi@...hat.com>
Subject: Re: Hang/soft lockup in d_invalidate with simultaneous calls

On Fri, Jun 2, 2017 at 6:12 PM, Al Viro <viro@...iv.linux.org.uk> wrote:
> Part of that could be relieved if we turned check_and_drop() into
> static void check_and_drop(void *_data)
> {
>         struct detach_data *data = _data;
>
>         if (!data->mountpoint && list_empty(&data->select.dispose))
>                 __d_drop(data->select.start);
> }

So with this change, d_invalidate will drop the starting dentry before
all it's children are dropped? Would it make sense to just drop it
right off the bat, and let one task handle shrinking all it's
children?
>
> It doesn't solve the entire problem, though - we still could get too
> many threads into that thing before any of them gets to that __d_drop().

Yes, the change isn't sufficient for my repro, as many threads get to
the loop before the drop, although new tasks don't get stuck in the
same loop after the dentry is dropped.

> I wonder if we should try and collect more victims; that could lead
> to contention of its own, but...

>From my understanding, the contention is the worst when one task is
shrinking everything, and several other tasks are busily looping
walking the dentry until everything is done. In this case, the tasks
busily looping d_walk hold the d_lock for a dentry while walking over
all it's children, then soon after it finishes the d_walk, it queues
again to walk again, while shrink_dentry_list releases and re-grabs
for each entry.

If we limit the number of items we pass to shrink_dentry_list at one
time things actually look quite a bit better. e.g., in testing
arbitrarily limiting select.dispose to 1000 elements does fix my
repro.

e.g.
diff --git a/fs/dcache.c b/fs/dcache.c
index 22af360ceca3..3892e0eb7ec2 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1367,6 +1367,7 @@ struct select_data {
        struct dentry *start;
        struct list_head dispose;
        int found;
+       int actually_found;
 };

 static enum d_walk_ret select_collect(void *_data, struct dentry *dentry)
@@ -1388,6 +1389,7 @@ static enum d_walk_ret select_collect(void
*_data, struct dentry *dentry)
                if (!dentry->d_lockref.count) {
                        d_shrink_add(dentry, &data->dispose);
                        data->found++;
+                       data->actually_found++;
                }
        }
        /*
@@ -1397,6 +1399,9 @@ static enum d_walk_ret select_collect(void
*_data, struct dentry *dentry)
         */
        if (!list_empty(&data->dispose))
                ret = need_resched() ? D_WALK_QUIT : D_WALK_NORETRY;
+
+       if (data->actually_found > 1000)
+               ret = D_WALK_QUIT;
 out:
        return ret;
 }
@@ -1415,6 +1420,7 @@ void shrink_dcache_parent(struct dentry *parent)
                INIT_LIST_HEAD(&data.dispose);
                data.start = parent;
                data.found = 0;
+               data.actually_found = 0;

                d_walk(parent, &data, select_collect, NULL);
                if (!data.found)
@@ -1536,6 +1542,7 @@ void d_invalidate(struct dentry *dentry)
                INIT_LIST_HEAD(&data.select.dispose);
                data.select.start = dentry;
                data.select.found = 0;
+               data.select.actually_found = 0;

                d_walk(dentry, &data, detach_and_collect, check_and_drop);

Download attachment "smime.p7s" of type "application/pkcs7-signature" (4843 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ