Message-ID: <20190108145942.GZ31793@dhcp22.suse.cz>
Date:   Tue, 8 Jan 2019 15:59:42 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] memcg: schedule high reclaim for remote memcgs on
 high_work

On Wed 02-01-19 17:56:38, Shakeel Butt wrote:
> If a memcg is over high limit, memory reclaim is scheduled to run on
> return-to-userland. However it is assumed that the memcg is the current
> process's memcg. With remote memcg charging for kmem or swapping in a
> page charged to remote memcg, current process can trigger reclaim on
> remote memcg. So, scheduling reclaim on return-to-userland for remote
> memcgs will ignore the high reclaim altogether. So, punt the high
> reclaim of remote memcgs to high_work.

Have you seen this happening in real-life workloads? And is this
offloading what we really want to do? I mean it is clearly the current
task that has triggered the remote charge so why should we offload that
work to a system worker? Is there any reason we cannot reclaim on the remote
memcg from the return-to-userland path?

> Signed-off-by: Shakeel Butt <shakeelb@...gle.com>
> ---
>  mm/memcontrol.c | 20 ++++++++++++--------
>  1 file changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e9db1160ccbc..47439c84667a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2302,19 +2302,23 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	 * reclaim on returning to userland.  We can perform reclaim here
>  	 * if __GFP_RECLAIM but let's always punt for simplicity and so that
>  	 * GFP_KERNEL can consistently be used during reclaim.  @memcg is
> -	 * not recorded as it most likely matches current's and won't
> -	 * change in the meantime.  As high limit is checked again before
> -	 * reclaim, the cost of mismatch is negligible.
> +	 * not recorded as the return-to-userland high reclaim will only reclaim
> +	 * from current's memcg (or its ancestor). For other memcgs we punt them
> +	 * to work queue.
>  	 */
>  	do {
>  		if (page_counter_read(&memcg->memory) > memcg->high) {
> -			/* Don't bother a random interrupted task */
> -			if (in_interrupt()) {
> +			/*
> +			 * Don't bother a random interrupted task or if the
> +			 * memcg is not current's memcg's ancestor.
> +			 */
> +			if (in_interrupt() ||
> +			    !mm_match_cgroup(current->mm, memcg)) {
>  				schedule_work(&memcg->high_work);
> -				break;
> +			} else {
> +				current->memcg_nr_pages_over_high += batch;
> +				set_notify_resume(current);
>  			}
> -			current->memcg_nr_pages_over_high += batch;
> -			set_notify_resume(current);
>  			break;
>  		}
>  	} while ((memcg = parent_mem_cgroup(memcg)));
> -- 
> 2.20.1.415.g653613c723-goog
> 
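
For readers following the thread, the decision the patched loop makes can be
modelled in user space. This is only a sketch under stated assumptions, not
kernel code: struct memcg_model, struct task_model, task_in_memcg_subtree()
and check_high() are hypothetical names invented for illustration. They stand
in for struct mem_cgroup, the relevant task state, mm_match_cgroup() and the
tail of try_charge() respectively, and a counter stands in for
schedule_work(&memcg->high_work).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct mem_cgroup. */
struct memcg_model {
	struct memcg_model *parent;        /* NULL at the root */
	unsigned long usage;               /* models page_counter_read() */
	unsigned long high;                /* high limit, in pages */
	unsigned int high_work_scheduled;  /* counts modelled schedule_work() calls */
};

/* Hypothetical stand-in for the task state the patch touches. */
struct task_model {
	struct memcg_model *memcg;         /* the task's own memcg */
	unsigned long nr_pages_over_high;  /* models memcg_nr_pages_over_high */
	bool notify_resume;                /* models set_notify_resume() */
	bool in_interrupt;                 /* models in_interrupt() */
};

/*
 * Models mm_match_cgroup(): true if @target is the task's memcg or one of
 * its ancestors.
 */
static bool task_in_memcg_subtree(const struct task_model *t,
				  const struct memcg_model *target)
{
	const struct memcg_model *m;

	for (m = t->memcg; m; m = m->parent)
		if (m == target)
			return true;
	return false;
}

/* Models the high-limit walk at the end of try_charge() after the patch. */
static void check_high(struct task_model *t, struct memcg_model *memcg,
		       unsigned long batch)
{
	assert(t && memcg);

	do {
		if (memcg->usage > memcg->high) {
			if (t->in_interrupt ||
			    !task_in_memcg_subtree(t, memcg)) {
				/* Remote memcg or interrupt: punt to the worker. */
				memcg->high_work_scheduled++;
			} else {
				/* Own hierarchy: reclaim on return-to-userland. */
				t->nr_pages_over_high += batch;
				t->notify_resume = true;
			}
			break;
		}
	} while ((memcg = memcg->parent));
}
```

In this model, a task sitting in one memcg that charges a sibling memcg over
its high limit only bumps the sibling's high_work_scheduled counter, while a
charge against its own memcg sets notify_resume and accumulates the batch.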

-- 
Michal Hocko
SUSE Labs
