Message-ID: <0588a0ed-f6b5-6365-8721-edb69b935a6c@suse.cz>
Date:   Tue, 17 Oct 2023 11:57:52 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Roman Gushchin <roman.gushchin@...ux.dev>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>, shakeelb@...gle.com,
        Muchun Song <muchun.song@...ux.dev>,
        Dennis Zhou <dennis@...nel.org>,
        David Rientjes <rientjes@...gle.com>,
        Naresh Kamboju <naresh.kamboju@...aro.org>
Subject: Re: [PATCH v3 1/5] mm: kmem: optimize get_obj_cgroup_from_current()

On 10/17/23 00:18, Roman Gushchin wrote:
> Manually inline memcg_kmem_bypass() and active_memcg() to speed up
> get_obj_cgroup_from_current() by avoiding duplicate in_task() checks
> and active_memcg() readings.

I guess the compiler could figure out most of this on its own, except maybe
the active_memcg() read after rcu_read_lock(), as I note below.

But anyway, memcg_kmem_bypass() has a single caller, so this makes sense.

> Also add a likely() macro to __get_obj_cgroup_from_memcg():
> obj_cgroup_tryget() should succeed at almost all times except a very
> unlikely race with the memcg deletion path.
> 
> Signed-off-by: Roman Gushchin (Cruise) <roman.gushchin@...ux.dev>
> Tested-by: Naresh Kamboju <naresh.kamboju@...aro.org>
> Acked-by: Shakeel Butt <shakeelb@...gle.com>
> Acked-by: Johannes Weiner <hannes@...xchg.org>

Reviewed-by: Vlastimil Babka <vbabka@...e.cz>

> ---
>  mm/memcontrol.c | 34 ++++++++++++++--------------------
>  1 file changed, 14 insertions(+), 20 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 9741d62d0424..16ac2a5838fb 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1068,19 +1068,6 @@ struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm)
>  }
>  EXPORT_SYMBOL(get_mem_cgroup_from_mm);
>  
> -static __always_inline bool memcg_kmem_bypass(void)
> -{
> -	/* Allow remote memcg charging from any context. */
> -	if (unlikely(active_memcg()))
> -		return false;
> -
> -	/* Memcg to charge can't be determined. */
> -	if (!in_task() || !current->mm || (current->flags & PF_KTHREAD))
> -		return true;
> -
> -	return false;
> -}
> -
>  /**
>   * mem_cgroup_iter - iterate over memory cgroup hierarchy
>   * @root: hierarchy root
> @@ -3007,7 +2994,7 @@ static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
>  
>  	for (; !mem_cgroup_is_root(memcg); memcg = parent_mem_cgroup(memcg)) {
>  		objcg = rcu_dereference(memcg->objcg);
> -		if (objcg && obj_cgroup_tryget(objcg))
> +		if (likely(objcg && obj_cgroup_tryget(objcg)))
>  			break;
>  		objcg = NULL;
>  	}
> @@ -3016,16 +3003,23 @@ static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
>  
>  __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
>  {
> -	struct obj_cgroup *objcg = NULL;
>  	struct mem_cgroup *memcg;
> +	struct obj_cgroup *objcg;
>  
> -	if (memcg_kmem_bypass())
> -		return NULL;
> +	if (in_task()) {
> +		memcg = current->active_memcg;
> +
> +		/* Memcg to charge can't be determined. */
> +		if (likely(!memcg) && (!current->mm || (current->flags & PF_KTHREAD)))
> +			return NULL;
> +	} else {
> +		memcg = this_cpu_read(int_active_memcg);
> +		if (likely(!memcg))
> +			return NULL;
> +	}
>  
>  	rcu_read_lock();
> -	if (unlikely(active_memcg()))
> -		memcg = active_memcg();

I guess the rcu_read_lock() has in theory prevented us from being migrated
to a different cpu after the !in_task() this_cpu_read(int_active_memcg), and
now we trust a value read outside the rcu lock. But that's OK because
a) nothing prevented us from migrating cpus after obtaining the objcg
anyway, so we could have ended up with a wrong one regardless, and b) if
we're not in_task() and thus read int_active_memcg, our context is some
interrupt handler that can't be migrated to another cpu anyway, right?

> -	else
> +	if (!memcg)
>  		memcg = mem_cgroup_from_task(current);
>  	objcg = __get_obj_cgroup_from_memcg(memcg);
>  	rcu_read_unlock();
