[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7a6a962e-dda4-cfbd-4daf-52fad2b902f4@amd.com>
Date: Fri, 5 Mar 2021 11:43:44 +0100
From: Christian König <christian.koenig@....com>
To: John Stultz <john.stultz@...aro.org>,
lkml <linux-kernel@...r.kernel.org>
Cc: Daniel Vetter <daniel@...ll.ch>,
Sumit Semwal <sumit.semwal@...aro.org>,
Liam Mark <lmark@...eaurora.org>,
Chris Goldsworthy <cgoldswo@...eaurora.org>,
Laura Abbott <labbott@...nel.org>,
Brian Starkey <Brian.Starkey@....com>,
Hridya Valsaraju <hridya@...gle.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Sandeep Patil <sspatil@...gle.com>,
Daniel Mentz <danielmentz@...gle.com>,
Ørjan Eide <orjan.eide@....com>,
Robin Murphy <robin.murphy@....com>,
Ezequiel Garcia <ezequiel@...labora.com>,
Simon Ser <contact@...rsion.fr>,
James Jones <jajones@...dia.com>, linux-media@...r.kernel.org,
dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH v8 2/5] drm: ttm_pool: Rework ttm_pool to use
drm_page_pool
Am 05.03.21 um 00:20 schrieb John Stultz:
> This patch reworks the ttm_pool logic to utilize the recently
> added drm_page_pool code.
>
> This adds drm_page_pool structures to the ttm_pool_type
> structures, and then removes all the ttm_pool_type shrinker
> logic (as it's handled in the drm_page_pool shrinker).
>
> NOTE: There is one mismatch in the interfaces I'm not totally
> happy with. The ttm_pool tracks all of its pooled pages across
> a number of different pools, and tries to keep this size under
> the specified page_pool_size value. With the drm_page_pool,
> there may be other users, however there is still one global
> shrinker list of pools. So we can't easily reduce the ttm
> pool under the ttm specified size without potentially doing
> a lot of shrinking to other non-ttm pools. So either we can:
> 1) Try to split it so each user of drm_page_pools manages its
> own pool shrinking.
> 2) Push the max value into the drm_page_pool, and have it
> manage shrinking to fit under that global max. Then share
> those size/max values out so the ttm_pool debug output
> can have more context.
>
> I've taken the second path in this patch set, but wanted to call
> it out so folks could look closely.
That's perfectly fine with me. A global approach for the different page
pool types is desired anyway as far as I can see.
>
> Thoughts would be greatly appreciated here!
>
> Cc: Daniel Vetter <daniel@...ll.ch>
> Cc: Christian Koenig <christian.koenig@....com>
> Cc: Sumit Semwal <sumit.semwal@...aro.org>
> Cc: Liam Mark <lmark@...eaurora.org>
> Cc: Chris Goldsworthy <cgoldswo@...eaurora.org>
> Cc: Laura Abbott <labbott@...nel.org>
> Cc: Brian Starkey <Brian.Starkey@....com>
> Cc: Hridya Valsaraju <hridya@...gle.com>
> Cc: Suren Baghdasaryan <surenb@...gle.com>
> Cc: Sandeep Patil <sspatil@...gle.com>
> Cc: Daniel Mentz <danielmentz@...gle.com>
> Cc: Ørjan Eide <orjan.eide@....com>
> Cc: Robin Murphy <robin.murphy@....com>
> Cc: Ezequiel Garcia <ezequiel@...labora.com>
> Cc: Simon Ser <contact@...rsion.fr>
> Cc: James Jones <jajones@...dia.com>
> Cc: linux-media@...r.kernel.org
> Cc: dri-devel@...ts.freedesktop.org
> Signed-off-by: John Stultz <john.stultz@...aro.org>
> ---
> v7:
> * Major refactoring to use drm_page_pools inside the
> ttm_pool_type structure. This allows us to use container_of to
> get the needed context to free a page. This also means less
> code is changed overall.
> v8:
> * Reworked to use the new cleanly rewritten drm_page_pool logic
> ---
> drivers/gpu/drm/Kconfig | 1 +
> drivers/gpu/drm/ttm/ttm_pool.c | 156 ++++++---------------------------
> include/drm/ttm/ttm_pool.h | 6 +-
> 3 files changed, 31 insertions(+), 132 deletions(-)
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 7cbcecb8f7df..a6cbdb63f6c7 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -184,6 +184,7 @@ config DRM_PAGE_POOL
> config DRM_TTM
> tristate
> depends on DRM && MMU
> + select DRM_PAGE_POOL
> help
> GPU memory management subsystem for devices with multiple
> GPU memory types. Will be enabled automatically if a device driver
> diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> index 6e27cb1bf48b..f74ea801d7ab 100644
> --- a/drivers/gpu/drm/ttm/ttm_pool.c
> +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> @@ -39,6 +39,7 @@
> #include <asm/set_memory.h>
> #endif
>
> +#include <drm/page_pool.h>
> #include <drm/ttm/ttm_pool.h>
> #include <drm/ttm/ttm_bo_driver.h>
> #include <drm/ttm/ttm_tt.h>
> @@ -68,8 +69,6 @@ static struct ttm_pool_type global_dma32_write_combined[MAX_ORDER];
> static struct ttm_pool_type global_dma32_uncached[MAX_ORDER];
>
> static struct mutex shrinker_lock;
> -static struct list_head shrinker_list;
> -static struct shrinker mm_shrinker;
>
> /* Allocate pages of size 1 << order with the given gfp_flags */
> static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> @@ -125,8 +124,9 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> }
>
> /* Reset the caching and pages of size 1 << order */
> -static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
> - unsigned int order, struct page *p)
> +static unsigned long ttm_pool_free_page(struct ttm_pool *pool,
> + enum ttm_caching caching,
> + unsigned int order, struct page *p)
> {
> unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
> struct ttm_pool_dma *dma;
> @@ -142,7 +142,7 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
>
> if (!pool || !pool->use_dma_alloc) {
> __free_pages(p, order);
> - return;
> + return 1UL << order;
> }
>
> if (order)
> @@ -153,6 +153,16 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
> dma_free_attrs(pool->dev, (1UL << order) * PAGE_SIZE, vaddr, dma->addr,
> attr);
> kfree(dma);
> + return 1UL << order;
The returned value is always the same. So your wrapper can do this and we
don't really need to change the function here.
> +}
> +
> +static unsigned long ttm_subpool_free_page(struct drm_page_pool *subpool,
> + struct page *p)
Better call this ttm_pool_free_callback.
> +{
> + struct ttm_pool_type *pt;
> +
> + pt = container_of(subpool, struct ttm_pool_type, subpool);
> + return ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
> }
>
> /* Apply a new caching to an array of pages */
> @@ -216,40 +226,6 @@ static void ttm_pool_unmap(struct ttm_pool *pool, dma_addr_t dma_addr,
> DMA_BIDIRECTIONAL);
> }
>
> -/* Give pages into a specific pool_type */
> -static void ttm_pool_type_give(struct ttm_pool_type *pt, struct page *p)
> -{
> - unsigned int i, num_pages = 1 << pt->order;
> -
> - for (i = 0; i < num_pages; ++i) {
> - if (PageHighMem(p))
> - clear_highpage(p + i);
> - else
> - clear_page(page_address(p + i));
> - }
> -
> - spin_lock(&pt->lock);
> - list_add(&p->lru, &pt->pages);
> - spin_unlock(&pt->lock);
> - atomic_long_add(1 << pt->order, &allocated_pages);
> -}
> -
> -/* Take pages from a specific pool_type, return NULL when nothing available */
> -static struct page *ttm_pool_type_take(struct ttm_pool_type *pt)
> -{
> - struct page *p;
> -
> - spin_lock(&pt->lock);
> - p = list_first_entry_or_null(&pt->pages, typeof(*p), lru);
> - if (p) {
> - atomic_long_sub(1 << pt->order, &allocated_pages);
> - list_del(&p->lru);
> - }
> - spin_unlock(&pt->lock);
> -
> - return p;
> -}
> -
> /* Initialize and add a pool type to the global shrinker list */
> static void ttm_pool_type_init(struct ttm_pool_type *pt, struct ttm_pool *pool,
> enum ttm_caching caching, unsigned int order)
> @@ -257,25 +233,14 @@ static void ttm_pool_type_init(struct ttm_pool_type *pt, struct ttm_pool *pool,
> pt->pool = pool;
> pt->caching = caching;
> pt->order = order;
The order is now duplicated and can probably be dropped from the TTM pool.
> - spin_lock_init(&pt->lock);
> - INIT_LIST_HEAD(&pt->pages);
>
> - mutex_lock(&shrinker_lock);
> - list_add_tail(&pt->shrinker_list, &shrinker_list);
> - mutex_unlock(&shrinker_lock);
> + drm_page_pool_init(&pt->subpool, order, ttm_subpool_free_page);
> }
>
> /* Remove a pool_type from the global shrinker list and free all pages */
> static void ttm_pool_type_fini(struct ttm_pool_type *pt)
> {
> - struct page *p, *tmp;
> -
> - mutex_lock(&shrinker_lock);
> - list_del(&pt->shrinker_list);
> - mutex_unlock(&shrinker_lock);
> -
> - list_for_each_entry_safe(p, tmp, &pt->pages, lru)
> - ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
> + drm_page_pool_fini(&pt->subpool);
> }
>
> /* Return the pool_type to use for the given caching and order */
> @@ -306,30 +271,6 @@ static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool,
> return NULL;
> }
>
> -/* Free pages using the global shrinker list */
> -static unsigned int ttm_pool_shrink(void)
> -{
> - struct ttm_pool_type *pt;
> - unsigned int num_freed;
> - struct page *p;
> -
> - mutex_lock(&shrinker_lock);
> - pt = list_first_entry(&shrinker_list, typeof(*pt), shrinker_list);
> -
> - p = ttm_pool_type_take(pt);
> - if (p) {
> - ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
> - num_freed = 1 << pt->order;
> - } else {
> - num_freed = 0;
> - }
> -
> - list_move_tail(&pt->shrinker_list, &shrinker_list);
> - mutex_unlock(&shrinker_lock);
> -
> - return num_freed;
> -}
> -
> /* Return the allocation order based for a page */
> static unsigned int ttm_pool_page_order(struct ttm_pool *pool, struct page *p)
> {
> @@ -386,7 +327,7 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
> struct ttm_pool_type *pt;
>
> pt = ttm_pool_select_type(pool, tt->caching, order);
> - p = pt ? ttm_pool_type_take(pt) : NULL;
> + p = pt ? drm_page_pool_remove(&pt->subpool) : NULL;
> if (p) {
> apply_caching = true;
> } else {
> @@ -479,16 +420,13 @@ void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt)
>
> pt = ttm_pool_select_type(pool, tt->caching, order);
> if (pt)
> - ttm_pool_type_give(pt, tt->pages[i]);
> + drm_page_pool_add(&pt->subpool, tt->pages[i]);
> else
> ttm_pool_free_page(pool, tt->caching, order,
> tt->pages[i]);
>
> i += num_pages;
> }
> -
> - while (atomic_long_read(&allocated_pages) > page_pool_size)
> - ttm_pool_shrink();
That won't work. You still need to make sure that we shrink the pool to
be under the maximum.
> }
> EXPORT_SYMBOL(ttm_pool_free);
>
> @@ -537,21 +475,6 @@ void ttm_pool_fini(struct ttm_pool *pool)
> }
>
> #ifdef CONFIG_DEBUG_FS
> -/* Count the number of pages available in a pool_type */
> -static unsigned int ttm_pool_type_count(struct ttm_pool_type *pt)
> -{
> - unsigned int count = 0;
> - struct page *p;
> -
> - spin_lock(&pt->lock);
> - /* Only used for debugfs, the overhead doesn't matter */
> - list_for_each_entry(p, &pt->pages, lru)
> - ++count;
> - spin_unlock(&pt->lock);
> -
> - return count;
> -}
> -
> /* Dump information about the different pool types */
> static void ttm_pool_debugfs_orders(struct ttm_pool_type *pt,
> struct seq_file *m)
> @@ -559,7 +482,8 @@ static void ttm_pool_debugfs_orders(struct ttm_pool_type *pt,
> unsigned int i;
>
> for (i = 0; i < MAX_ORDER; ++i)
> - seq_printf(m, " %8u", ttm_pool_type_count(&pt[i]));
> + seq_printf(m, " %8lu",
> + drm_page_pool_get_size(&pt[i].subpool));
> seq_puts(m, "\n");
> }
>
> @@ -609,7 +533,10 @@ int ttm_pool_debugfs(struct ttm_pool *pool, struct seq_file *m)
> }
>
> seq_printf(m, "\ntotal\t: %8lu of %8lu\n",
> - atomic_long_read(&allocated_pages), page_pool_size);
> + atomic_long_read(&allocated_pages),
> + drm_page_pool_get_max());
> + seq_printf(m, "(%8lu in non-ttm pools)\n", drm_page_pool_get_total() -
> + atomic_long_read(&allocated_pages));
>
> mutex_unlock(&shrinker_lock);
That won't work. You need to move the debugfs functions into the DRM
pool as well or otherwise you have two separate shrinker_lock instances
and the lock protection is not correct any more.
Regards,
Christian.
>
> @@ -619,28 +546,6 @@ EXPORT_SYMBOL(ttm_pool_debugfs);
>
> #endif
>
> -/* As long as pages are available make sure to release at least one */
> -static unsigned long ttm_pool_shrinker_scan(struct shrinker *shrink,
> - struct shrink_control *sc)
> -{
> - unsigned long num_freed = 0;
> -
> - do
> - num_freed += ttm_pool_shrink();
> - while (!num_freed && atomic_long_read(&allocated_pages));
> -
> - return num_freed;
> -}
> -
> -/* Return the number of pages available or SHRINK_EMPTY if we have none */
> -static unsigned long ttm_pool_shrinker_count(struct shrinker *shrink,
> - struct shrink_control *sc)
> -{
> - unsigned long num_pages = atomic_long_read(&allocated_pages);
> -
> - return num_pages ? num_pages : SHRINK_EMPTY;
> -}
> -
> /**
> * ttm_pool_mgr_init - Initialize globals
> *
> @@ -655,8 +560,9 @@ int ttm_pool_mgr_init(unsigned long num_pages)
> if (!page_pool_size)
> page_pool_size = num_pages;
>
> + drm_page_pool_set_max(page_pool_size);
> +
> mutex_init(&shrinker_lock);
> - INIT_LIST_HEAD(&shrinker_list);
>
> for (i = 0; i < MAX_ORDER; ++i) {
> ttm_pool_type_init(&global_write_combined[i], NULL,
> @@ -669,10 +575,7 @@ int ttm_pool_mgr_init(unsigned long num_pages)
> ttm_uncached, i);
> }
>
> - mm_shrinker.count_objects = ttm_pool_shrinker_count;
> - mm_shrinker.scan_objects = ttm_pool_shrinker_scan;
> - mm_shrinker.seeks = 1;
> - return register_shrinker(&mm_shrinker);
> + return 0;
> }
>
> /**
> @@ -691,7 +594,4 @@ void ttm_pool_mgr_fini(void)
> ttm_pool_type_fini(&global_dma32_write_combined[i]);
> ttm_pool_type_fini(&global_dma32_uncached[i]);
> }
> -
> - unregister_shrinker(&mm_shrinker);
> - WARN_ON(!list_empty(&shrinker_list));
> }
> diff --git a/include/drm/ttm/ttm_pool.h b/include/drm/ttm/ttm_pool.h
> index 4321728bdd11..3d975888ce47 100644
> --- a/include/drm/ttm/ttm_pool.h
> +++ b/include/drm/ttm/ttm_pool.h
> @@ -30,6 +30,7 @@
> #include <linux/llist.h>
> #include <linux/spinlock.h>
> #include <drm/ttm/ttm_caching.h>
> +#include <drm/page_pool.h>
>
> struct device;
> struct ttm_tt;
> @@ -51,10 +52,7 @@ struct ttm_pool_type {
> unsigned int order;
> enum ttm_caching caching;
>
> - struct list_head shrinker_list;
> -
> - spinlock_t lock;
> - struct list_head pages;
> + struct drm_page_pool subpool;
> };
>
> /**
Powered by blists - more mailing lists