lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20231107090843.261410-1-aliceryhl@google.com>
Date:   Tue,  7 Nov 2023 09:08:43 +0000
From:   Alice Ryhl <aliceryhl@...gle.com>
To:     Carlos Llamas <cmllamas@...gle.com>
Cc:     "Arve Hjønnevåg" <arve@...roid.com>,
        Christian Brauner <brauner@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Joel Fernandes <joel@...lfernandes.org>,
        kernel-team@...roid.com, linux-kernel@...r.kernel.org,
        Martijn Coenen <maco@...roid.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Todd Kjos <tkjos@...roid.com>
Subject: Re: [PATCH 19/21] binder: perform page allocation outside of locks

Carlos Llamas <cmllamas@...gle.com> writes:
> Split out the insertion of pages to be done outside of the alloc->mutex
> in a separate binder_get_pages_range() routine. Since this is no longer
> serialized with other requests we need to make sure we look at the full
> range of pages for this buffer, including those shared with neighboring
> buffers. The insertion of pages into the vma is still serialized with
> the mmap write lock.
> 
> Besides avoiding unnecessary nested locking this helps in preparation of
> switching the alloc->mutex into a spinlock_t in subsequent patches.
> 
> Signed-off-by: Carlos Llamas <cmllamas@...gle.com>

This is rather complex, so it could use more comments. However, as far
as I can tell, the change is correct.

> +/* The range of pages should include those shared with other buffers */
> +static int binder_get_page_range(struct binder_alloc *alloc,
> +				 unsigned long start, unsigned long end)
> [snip]
> +/* The range of pages should exclude those shared with other buffers */
> +static void binder_allocate_page_range(struct binder_alloc *alloc,
> +				       unsigned long start, unsigned long end)

I would really like a comment on each function explaining that:

 * The binder_allocate_page_range function ensures that existing pages
   will not be reclaimed by the shrinker.
 * The binder_get_page_range function ensures that missing pages are
   allocated and inserted.

I also don't think the names are great. The function with "allocate" in
its name is the one that doesn't perform any allocations.

>  	mmap_write_lock(alloc->mm);
> +	if (lru_page->page_ptr)
> +		goto out;

Another comment that I'd like to see somewhere is one that says
something along these lines:

    Multiple processes may call `binder_get_user_page_remote` on the
    same page in parallel. When this happens, one of them will allocate
    the page and insert it, and the other process will use the mmap
    write lock to wait for the insertion to complete. This means that we
    can't use a mmap read lock here.

> +	/* mark page insertion complete and safe to acquire */
> +	smp_store_release(&lru_page->page_ptr, page);
> [snip]
> +		/* check if page insertion is marked complete by release */
> +		if (smp_load_acquire(&page->page_ptr))
> +			continue;

We already discussed this when I asked you to make this an acquire /
release operation so that it isn't racy, but it could use a comment
explaining its purpose.

>  	mmap_write_lock(alloc->mm);
> +	if (lru_page->page_ptr)
> +		goto out;
> +
>  	if (!alloc->vma) {
>  		pr_err("%d: %s failed, no vma\n", alloc->pid, __func__);
>  		ret = -ESRCH;
>  		goto out;
>  	}
>  
>  	page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO);
>  	if (!page) {
>  		pr_err("%d: failed to allocate page\n", alloc->pid);
>  		ret = -ENOMEM;
>  		goto out;
>  	}

Maybe it would be worth to allocate the page before taking the mmap
write lock? It has the disadvantage that you may have to immediately
deallocate it if we trigger the `if (lru_page->page_ptr) goto out`
branch, but that shouldn't happen that often, and it would reduce the
amount of time we spend holding the mmap write lock.

Alice

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ