Message-ID: <4b668145-a81d-6f46-0569-b0adb76788d8@alibaba-inc.com>
Date:   Thu, 05 Oct 2017 02:08:48 +0800
From:   "Yang Shi" <yang.s@...baba-inc.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     cl@...ux.com, penberg@...nel.org, rientjes@...gle.com,
        iamjoonsoo.kim@....com, akpm@...ux-foundation.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when
 unreclaimable slabs > user memory



On 10/4/17 7:27 AM, Michal Hocko wrote:
> On Wed 04-10-17 02:06:17, Yang Shi wrote:
>> +static bool is_dump_unreclaim_slabs(void)
>> +{
>> +	unsigned long nr_lru;
>> +
>> +	nr_lru = global_node_page_state(NR_ACTIVE_ANON) +
>> +		 global_node_page_state(NR_INACTIVE_ANON) +
>> +		 global_node_page_state(NR_ACTIVE_FILE) +
>> +		 global_node_page_state(NR_INACTIVE_FILE) +
>> +		 global_node_page_state(NR_ISOLATED_ANON) +
>> +		 global_node_page_state(NR_ISOLATED_FILE) +
>> +		 global_node_page_state(NR_UNEVICTABLE);
>> +
>> +	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
>> +}
> 
> I am sorry I haven't pointed this earlier (I was following only half
> way) but this should really be memcg aware. You are checking only global
> counters. I do not think it is an absolute must to provide per-memcg
> data but you should at least check !is_memcg_oom(oc).

BTW, I see there is already such a check in dump_header(), which looks
like this:

         if (oc->memcg)
                 mem_cgroup_print_oom_info(oc->memcg, p);
         else
                 show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);

I suppose it would be better to replace "oc->memcg" with "is_memcg_oom(oc)",
since they perform the same check and the "is_memcg_oom" interface is preferable.

Then I'm going to move the unreclaimable slab dump into the "else" block.
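
To make the resulting control flow concrete, here is a rough userspace mock. It is only a sketch, not the actual patch: dump_header_action(), the dump_action enum and the two counters are hypothetical stand-ins for dump_header() and global_node_page_state(), and is_dump_unreclaim_slabs() is reduced to a single comparison.

```c
/* Userspace mock, NOT kernel code: types and counters are stand-ins. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mem_cgroup { int id; };

struct oom_control {
	struct mem_cgroup *memcg;	/* non-NULL only for a memcg OOM */
};

/* Same test as the open-coded "if (oc->memcg)" in dump_header(). */
static bool is_memcg_oom(const struct oom_control *oc)
{
	return oc->memcg != NULL;
}

/* Hypothetical counters standing in for global_node_page_state(). */
static unsigned long nr_lru_pages;
static unsigned long nr_slab_unreclaimable;

static bool is_dump_unreclaim_slabs(void)
{
	return nr_slab_unreclaimable > nr_lru_pages;
}

/* Hypothetical: names the reports the header dump would emit. */
enum dump_action {
	DUMP_MEMCG_INFO,		/* mem_cgroup_print_oom_info() */
	DUMP_SHOW_MEM,			/* show_mem() only */
	DUMP_SHOW_MEM_AND_SLABS,	/* show_mem() + unreclaimable slabs */
};

static enum dump_action dump_header_action(const struct oom_control *oc)
{
	assert(oc != NULL);

	if (is_memcg_oom(oc))
		return DUMP_MEMCG_INFO;

	/* Global OOM: the slab dump lives only in this "else" branch. */
	if (is_dump_unreclaim_slabs())
		return DUMP_SHOW_MEM_AND_SLABS;

	return DUMP_SHOW_MEM;
}
```

In this mock a memcg OOM never reaches the global counter check, so the global slab statistics are only considered when the OOM itself is global.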

Yang

> 
> [...]
>> +void dump_unreclaimable_slab(void)
>> +{
>> +	struct kmem_cache *s, *s2;
>> +	struct slabinfo sinfo;
>> +
>> +	pr_info("Unreclaimable slab info:\n");
>> +	pr_info("Name                      Used          Total\n");
>> +
>> +	/*
>> +	 * Here acquiring slab_mutex is risky since we don't prefer to get
>> +	 * sleep in oom path. But, without mutex hold, it may introduce a
>> +	 * risk of crash.
>> +	 * Use mutex_trylock to protect the list traverse, dump nothing
>> +	 * without acquiring the mutex.
>> +	 */
>> +	if (!mutex_trylock(&slab_mutex))
>> +		return;
> 
> I would move the trylock up so that we do not get empty and confusing
> Unreclaimable slab info: and add a note that we are not dumping anything
> due to lock contention
> 	pr_warn("excessive unreclaimable slab memory but cannot dump stats to give you more details\n");
> 
> Other than that this looks sensible to me.
> 
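
The ordering suggested above (take the trylock first, print the header only once the lock is held, and warn on contention) can be sketched in userspace as follows. This is an illustration under stated assumptions, not the revised patch: the C11 atomic_flag stands in for slab_mutex, and mock_log, mock_trylock(), mock_unlock(), mock_log_append() and mock_dump_unreclaimable_slab() are all hypothetical names.

```c
/* Userspace mock, NOT kernel code: lock and log buffer are stand-ins. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

static atomic_flag mock_slab_mutex = ATOMIC_FLAG_INIT;	/* stand-in for slab_mutex */
static char mock_log[256];				/* stand-in for the kernel log */

/* Returns true if the lock was free and is now held by the caller. */
static bool mock_trylock(atomic_flag *lock)
{
	return !atomic_flag_test_and_set(lock);
}

static void mock_unlock(atomic_flag *lock)
{
	atomic_flag_clear(lock);
}

/* Appends to the log buffer, truncating if it would overflow. */
static void mock_log_append(const char *msg)
{
	assert(msg != NULL);
	strncat(mock_log, msg, sizeof(mock_log) - strlen(mock_log) - 1);
}

static void mock_dump_unreclaimable_slab(void)
{
	/* Try the lock before printing anything, so no empty header appears. */
	if (!mock_trylock(&mock_slab_mutex)) {
		mock_log_append("excessive unreclaimable slab memory but cannot dump stats to give you more details\n");
		return;
	}

	mock_log_append("Unreclaimable slab info:\n");
	/* The walk over the cache list would go here. */

	mock_unlock(&mock_slab_mutex);
}
```

When the lock is already held elsewhere, the mock logs only the warning and never the "Unreclaimable slab info:" header.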
>> +	list_for_each_entry_safe(s, s2, &slab_caches, list) {
>> +		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
>> +			continue;
>> +
>> +		memset(&sinfo, 0, sizeof(sinfo));
>> +		get_slabinfo(s, &sinfo);
>> +
>> +		if (sinfo.num_objs > 0)
>> +			pr_info("%-17s %10luKB %10luKB\n", cache_name(s),
>> +				(sinfo.active_objs * s->size) / 1024,
>> +				(sinfo.num_objs * s->size) / 1024);
>> +	}
>> +	mutex_unlock(&slab_mutex);
>> +}
>> +
>>   #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
>>   void *memcg_slab_start(struct seq_file *m, loff_t *pos)
>>   {
>> -- 
>> 1.8.3.1
> 
