Date:   Thu, 25 Nov 2021 12:07:31 +0800
From:   Muchun Song <songmuchun@...edance.com>
To:     Gang Li <ligang.bdlg@...edance.com>
Cc:     Hugh Dickins <hughd@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        linux- stable <stable@...r.kernel.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4] shmem: fix a race between shmem_unused_huge_shrink and shmem_evict_inode

On Thu, Nov 25, 2021 at 11:12 AM Gang Li <ligang.bdlg@...edance.com> wrote:
>
> This patch fixes a data race in commit 779750d20b93 ("shmem: split huge pages
> beyond i_size under memory pressure").
>
> Here are the call traces that cause the race:
>
>    Call Trace 1:
>      shmem_unused_huge_shrink+0x3ae/0x410
>      ? __list_lru_walk_one.isra.5+0x33/0x160
>      super_cache_scan+0x17c/0x190
>      shrink_slab.part.55+0x1ef/0x3f0
>      shrink_node+0x10e/0x330
>      kswapd+0x380/0x740
>      kthread+0xfc/0x130
>      ? mem_cgroup_shrink_node+0x170/0x170
>      ? kthread_create_on_node+0x70/0x70
>      ret_from_fork+0x1f/0x30
>
>    Call Trace 2:
>      shmem_evict_inode+0xd8/0x190
>      evict+0xbe/0x1c0
>      do_unlinkat+0x137/0x330
>      do_syscall_64+0x76/0x120
>      entry_SYSCALL_64_after_hwframe+0x3d/0xa2
>
> A simple explanation:
>
> Imagine there are 3 items in the local list (@list).
> In the first traversal, A is not deleted from @list.
>
>   1)    A->B->C
>         ^
>         |
>         pos (leave)
>
> In the second traversal, B is deleted from @list. Concurrently, A is
> deleted from @list through shmem_evict_inode() since the last reference to
> the inode is dropped by another thread. Then the @list is corrupted.
>
>   2)    A->B->C
>         ^  ^
>         |  |
>      evict pos (drop)
>
> We should make sure the inode is either on the global list or deleted from
> any local list before iput().
>
> Fix this by moving inodes back to the global list before we put them.
>
> Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
> Signed-off-by: Gang Li <ligang.bdlg@...edance.com>

You have forgotten my Reviewed-by and Kirill A. Shutemov's Acked-by,
as well as Cc: stable@...r.kernel.org.

> ---
>  mm/shmem.c | 34 +++++++++++++++++++---------------
>  1 file changed, 19 insertions(+), 15 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 9023103ee7d8..e6ccb2a076ff 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -569,7 +569,6 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
>                 /* inode is about to be evicted */
>                 if (!inode) {
>                         list_del_init(&info->shrinklist);
> -                       removed++;

I believe the compiler will now warn about @removed, since it is still
declared but no longer used.

>                         goto next;
>                 }
>
> @@ -577,12 +576,12 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
>                 if (round_up(inode->i_size, PAGE_SIZE) ==
>                                 round_up(inode->i_size, HPAGE_PMD_SIZE)) {
>                         list_move(&info->shrinklist, &to_remove);
> -                       removed++;
>                         goto next;
>                 }
>
>                 list_move(&info->shrinklist, &list);
>  next:
> +               sbinfo->shrinklist_len--;
>                 if (!--batch)
>                         break;
>         }
> @@ -602,7 +601,7 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
>                 inode = &info->vfs_inode;
>
>                 if (nr_to_split && split >= nr_to_split)
> -                       goto leave;
> +                       goto move_back;
>
>                 page = find_get_page(inode->i_mapping,
>                                 (inode->i_size & HPAGE_PMD_MASK) >> PAGE_SHIFT);
> @@ -616,38 +615,43 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
>                 }
>
>                 /*
> -                * Leave the inode on the list if we failed to lock
> -                * the page at this time.
> +                * Move the inode on the list back to shrinklist if we failed
> +                * to lock the page at this time.
>                  *
>                  * Waiting for the lock may lead to deadlock in the
>                  * reclaim path.
>                  */
>                 if (!trylock_page(page)) {
>                         put_page(page);
> -                       goto leave;
> +                       goto move_back;
>                 }
>
>                 ret = split_huge_page(page);
>                 unlock_page(page);
>                 put_page(page);
>
> -               /* If split failed leave the inode on the list */
> +               /* If split failed move the inode on the list back to shrinklist */
>                 if (ret)
> -                       goto leave;
> +                       goto move_back;
>
>                 split++;
>  drop:
>                 list_del_init(&info->shrinklist);
> -               removed++;
> -leave:
> +               goto put;
> +move_back:
> +               /*
> +               * Make sure the inode is either on the global list or deleted from
> +               * any local list before iput() since it could be deleted in another
> +               * thread once we put the inode (then the local list is corrupted).
> +               */
> +               spin_lock(&sbinfo->shrinklist_lock);
> +               list_move(&info->shrinklist, &sbinfo->shrinklist);
> +               sbinfo->shrinklist_len++;
> +               spin_unlock(&sbinfo->shrinklist_lock);
> +put:
>                 iput(inode);
>         }
>
> -       spin_lock(&sbinfo->shrinklist_lock);
> -       list_splice_tail(&list, &sbinfo->shrinklist);
> -       sbinfo->shrinklist_len -= removed;
> -       spin_unlock(&sbinfo->shrinklist_lock);
> -
>         return split;
>  }
>
> --
> 2.20.1
>
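
To make the invariant from the changelog concrete, here is a minimal
userspace sketch of the "move back to the global list, then put" ordering.
It is not the kernel code: struct model_sb, struct model_info,
model_move_back_and_put() and model_evict() are hypothetical names made up
for illustration, the list helpers are simplified stand-ins for the ones in
<linux/list.h>, and the locking that the patch does with
sbinfo->shrinklist_lock is only mentioned in comments.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head; illustrative only. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Unlink an entry and leave it pointing at itself, like list_del_init(). */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	list_init(e);
}

/* Append an entry before head, like list_add_tail(). */
static void list_add_tail(struct list_head *e, struct list_head *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Unlink an entry and re-add it right after head, like list_move(). */
static void list_move(struct list_head *e, struct list_head *head)
{
	list_del_init(e);
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

static size_t list_count(const struct list_head *head)
{
	size_t n = 0;
	const struct list_head *p;

	for (p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/* Hypothetical model of sbinfo: the global list plus its length. */
struct model_sb {
	struct list_head shrinklist;
	size_t shrinklist_len;
};

/* Hypothetical model of the per-inode shmem info. */
struct model_info {
	struct list_head shrinklist;
	int refcount;
};

/*
 * Models the move_back + put steps: relink onto the global list first
 * (the patch does this under shrinklist_lock), then drop the reference.
 */
static void model_move_back_and_put(struct model_sb *sb,
				    struct model_info *info)
{
	assert(info->refcount > 0);
	list_move(&info->shrinklist, &sb->shrinklist);
	sb->shrinklist_len++;
	info->refcount--;
}

/* Models eviction: unlink the entry from whatever list it is on. */
static void model_evict(struct model_sb *sb, struct model_info *info)
{
	if (info->shrinklist.next != &info->shrinklist) {
		list_del_init(&info->shrinklist);
		sb->shrinklist_len--;
	}
}
```

With a local list A->B->C, calling model_move_back_and_put() on A and then
model_evict() on A only ever touches the global list, so B->C on the local
list stays intact. In the old "leave" path, A would still be linked into the
local list when the eviction unlinks it.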
