[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALYGNiMUQ2=OjF4G3jQYZMXja5mGinXB8M1YRCe8kctCzQoWHw@mail.gmail.com>
Date: Mon, 6 Jan 2020 23:35:50 +0300
From: Konstantin Khlebnikov <koct9i@...il.com>
To: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Wei Yang <richardw.yang@...ux.intel.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Li Xinhai <lixinhai.lxh@...il.com>
Subject: Re: [PATCH] mm/rmap: fix reusing mergeable anon_vma as parent when fork
On Mon, Jan 6, 2020 at 1:42 PM Konstantin Khlebnikov
<khlebnikov@...dex-team.ru> wrote:
>
> This fixes couple misconceptions in commit 4e4a9eb92133 ("mm/rmap.c: reuse
> mergeable anon_vma as parent when fork").
>
> First problem caused by initialization order in dup_mmap(): vma->vm_prev
> is set after calling anon_vma_fork(). Thus in anon_vma_fork() it points to
> previous VMA in parent mm. This is fixed by rearrangement in dup_mmap().
>
> If in parent VMAs: SRC1 SRC2 .. SRCn share anon-vma ANON0, then after fork
> before all patches in child process related VMAs: DST1 DST2 .. DSTn will
> use different anon-vmas: ANON1 ANON2 .. ANONn. Before this patch only DST1
> will fork new ANON1 and following DST2 .. DSTn will share parent's ANON0.
> With this patch DST1 will create new ANON1 and DST2 .. DSTn will share it.
>
> Also this patch moves sharing logic out of anon_vma_clone() into more
> specific anon_vma_fork() because this supposed to work only at fork().
> Function anon_vma_clone() is more generic is also used at splitting VMAs.
>
> Second problem is hidden behind first one: assumption "Parent has vm_prev,
> which implies we have vm_prev" is wrong if first VMA in parent mm has set
> flag VM_DONTCOPY. Luckily prev->anon_vma doesn't dereference NULL pointer
> because in current code 'prev' actually is same as 'pprev'. To avoid that
> this patch just checks pointer and compares vm_start to verify relation
> between previous VMAs in parent and child.
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
> Fixes: 4e4a9eb92133 ("mm/rmap.c: reuse mergeable anon_vma as parent when fork")
Oops, I've forgot to mention that Li Xinhai <lixinhai.lxh@...il.com>
found and reported this suspicious code. Sorry.
Reported-by: Li Xinhai <lixinhai.lxh@...il.com>
Link: https://lore.kernel.org/linux-mm/CALYGNiNzz+dxHX0g5-gNypUQc3B=8_Scp53-NTOh=zWsdUuHAw@mail.gmail.com/T/#t
> ---
> kernel/fork.c | 4 ++--
> mm/rmap.c | 25 ++++++++++++-------------
> 2 files changed, 14 insertions(+), 15 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 2508a4f238a3..04ee5e243f65 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -548,6 +548,8 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
> if (retval)
> goto fail_nomem_policy;
> tmp->vm_mm = mm;
> + tmp->vm_prev = prev; /* anon_vma_fork use this */
> + tmp->vm_next = NULL;
> retval = dup_userfaultfd(tmp, &uf);
> if (retval)
> goto fail_nomem_anon_vma_fork;
> @@ -559,7 +561,6 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
> } else if (anon_vma_fork(tmp, mpnt))
> goto fail_nomem_anon_vma_fork;
> tmp->vm_flags &= ~(VM_LOCKED | VM_LOCKONFAULT);
> - tmp->vm_next = tmp->vm_prev = NULL;
> file = tmp->vm_file;
> if (file) {
> struct inode *inode = file_inode(file);
> @@ -592,7 +593,6 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
> */
> *pprev = tmp;
> pprev = &tmp->vm_next;
> - tmp->vm_prev = prev;
> prev = tmp;
>
> __vma_link_rb(mm, tmp, rb_link, rb_parent);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index b3e381919835..77b3aa38d5c2 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -269,19 +269,6 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> {
> struct anon_vma_chain *avc, *pavc;
> struct anon_vma *root = NULL;
> - struct vm_area_struct *prev = dst->vm_prev, *pprev = src->vm_prev;
> -
> - /*
> - * If parent share anon_vma with its vm_prev, keep this sharing in in
> - * child.
> - *
> - * 1. Parent has vm_prev, which implies we have vm_prev.
> - * 2. Parent and its vm_prev have the same anon_vma.
> - */
> - if (!dst->anon_vma && src->anon_vma &&
> - pprev && pprev->anon_vma == src->anon_vma)
> - dst->anon_vma = prev->anon_vma;
> -
>
> list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
> struct anon_vma *anon_vma;
> @@ -334,6 +321,7 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> */
> int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
> {
> + struct vm_area_struct *prev = vma->vm_prev, *pprev = pvma->vm_prev;
> struct anon_vma_chain *avc;
> struct anon_vma *anon_vma;
> int error;
> @@ -345,6 +333,17 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
> /* Drop inherited anon_vma, we'll reuse existing or allocate new. */
> vma->anon_vma = NULL;
>
> + /*
> + * If parent shares anon_vma with its vm_prev, keep this sharing.
> + *
> + * Previous VMA could be missing or not match previuos in parent
> + * if VM_DONTCOPY is set: compare vm_start to avoid this case.
> + */
> + if (pvma->anon_vma && pprev && prev &&
> + pprev->anon_vma == pvma->anon_vma &&
> + pprev->vm_start == prev->vm_start)
> + vma->anon_vma = prev->anon_vma;
> +
> /*
> * First, attach the new VMA to the parent VMA's anon_vmas,
> * so rmap can find non-COWed pages in child processes.
>
>
Powered by blists - more mailing lists