Message-ID: <ZteMZEsMCMig8zOQ@google.com>
Date: Tue, 3 Sep 2024 22:23:32 +0000
From: Carlos Llamas <cmllamas@...gle.com>
To: Barry Song <21cnbao@...il.com>
Cc: Hillf Danton <hdanton@...a.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Barry Song <v-songbaohua@...o.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Michal Hocko <mhocko@...e.com>,
Tangquan Zheng <zhengtangquan@...o.com>
Subject: Re: [PATCH] binder_alloc: Move alloc_page() out of mmap_rwsem to
reduce the lock duration
On Wed, Sep 04, 2024 at 09:10:20AM +1200, Barry Song wrote:
> However, I'm not entirely convinced that this is a problem :-) Concurrent
> allocations like this can occur in many places, especially in PFs. Reclamation
> is not useless because it helps free up memory for others; it's not
> without value.
Yeah, for binder though there is a high chance of multiple callers
trying to allocate the _same_ page concurrently. What I observed is
that pages reclaimed "in vain" are often in the same vma, which means
subsequent callers have to allocate those same pages all over again.
But even if this weren't an issue, the correct solution would still be
to support concurrent faults. So in reality it doesn't matter and we
still need to refactor the call to be non-exclusive.
> I also don't believe binder is one of the largest users executing concurrent
> allocations.
Oh, I have no statistics on this.
> Awesome! I’m eager to see your patch, and we’re ready to help with testing.
> I strongly recommend dropping the write lock entirely. Using
> `mmap_write_lock()` isn’t just a binder-specific concern; it has the
> potential to affect the entire Android system.
>
> In patch 3, I experimented with using `per_vma_lock` as well. I’m _not_
> proposing it for merging since you’re already working on it, but I wanted
> to share the idea. (Just like patch 2, it has only passed build testing.)
Yeap, I'm using the per-vma-lock too per Suren's suggestion.
>
> [PATCH] binder_alloc: Further move to per_vma_lock from mmap_read_lock
>
> To further reduce the read lock duration, let's try using per_vma_lock
> first. If that fails, we can take the read lock, similar to how page
> fault handlers operate.
>
> Signed-off-by: Barry Song <v-songbaohua@...o.com>
> ---
> drivers/android/binder_alloc.c | 18 ++++++++++++++----
> 1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> index a2281dfacbbc..b40a5dd650c8 100644
> --- a/drivers/android/binder_alloc.c
> +++ b/drivers/android/binder_alloc.c
> @@ -221,6 +221,8 @@ static int binder_install_single_page(struct binder_alloc *alloc,
> struct binder_lru_page *lru_page,
> unsigned long addr)
> {
> + struct vm_area_struct *vma;
> + bool per_vma_lock = true;
> struct page *page;
> int ret = 0;
>
> @@ -235,10 +237,15 @@ static int binder_install_single_page(struct binder_alloc *alloc,
> */
> page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO);
>
> - mmap_read_lock(alloc->mm);
> + vma = lock_vma_under_rcu(alloc->mm, addr);
> + if (!vma) {
> + per_vma_lock = false;
> + mmap_read_lock(alloc->mm);
> + vma = find_vma(alloc->mm, addr);
> + }
>
> - /* vma might have been dropped or deattached */
> - if (!alloc->vma || !find_vma(alloc->mm, addr)) {
> + /* vma might have been dropped, detached or changed to a new one */
> + if (!alloc->vma || !vma || vma != alloc->vma) {
> pr_err("%d: %s failed, no vma\n", alloc->pid, __func__);
> ret = -ESRCH;
> goto out;
> @@ -270,7 +277,10 @@ static int binder_install_single_page(struct binder_alloc *alloc,
> binder_set_installed_page(lru_page, page);
> spin_unlock(&alloc->lock);
> out:
> - mmap_read_unlock(alloc->mm);
> + if (per_vma_lock)
> + vma_end_read(vma);
> + else
> + mmap_read_unlock(alloc->mm);
> mmput_async(alloc->mm);
> if (ret && page)
> __free_page(page);
> --
> 2.39.3 (Apple Git-146)
This looks fairly similar to my patch. However, you would need to handle
the case where vm_insert_page() returns -EBUSY (a success that raced) and
also synchronize with the shrinker callback in binder_alloc_free_page(),
which is the biggest concern.
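
Handling the -EBUSY part could look roughly like this on top of your
diff (just a sketch, untested, reusing the names from the code above;
it deliberately leaves out the shrinker synchronization, which is the
hard part). vm_insert_page() returns -EBUSY when a pte is already
present at that address, i.e. a concurrent installer won the race:

	ret = vm_insert_page(alloc->vma, addr, page);
	if (ret == -EBUSY) {
		/*
		 * "Success that raced": another installer already
		 * populated the pte for this address, so our freshly
		 * allocated page is redundant. Drop it and report
		 * success without touching the installed-page state,
		 * since the winner records its own page.
		 */
		__free_page(page);
		page = NULL;
		ret = 0;
		goto out;
	}

Clearing 'page' is only there so the 'if (ret && page)' cleanup at the
out label can never touch the page we just freed.
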
Let's not duplicate efforts though. Can we please wait for my patch?
I'll add you as Co-developed-by, since you've posted this already.
Regards,
Carlos Llamas