Message-ID: <11acb4bd-a1bf-cc32-a124-d98bf746f201@alibaba-inc.com>
Date: Wed, 20 Sep 2017 07:03:14 +0800
From: "Yang Shi" <yang.s@...baba-inc.com>
To: David Rientjes <rientjes@...gle.com>
Cc: cl@...ux.com, penberg@...nel.org, iamjoonsoo.kim@....com,
akpm@...ux-foundation.org, mhocko@...nel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] mm: oom: show unreclaimable slab info when kernel panic
On 9/19/17 3:41 PM, David Rientjes wrote:
> On Wed, 20 Sep 2017, Yang Shi wrote:
>
>>>> --- a/mm/slab_common.c
>>>> +++ b/mm/slab_common.c
>>>> @@ -35,6 +35,8 @@
>>>> static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
>>>> slab_caches_to_rcu_destroy_workfn);
>>>> +#define K(x) ((x)/1024)
>>>> +
>>>> /*
>>>> * Set of flags that will prevent slab merging
>>>> */
>>>> @@ -1272,6 +1274,34 @@ static int slab_show(struct seq_file *m, void *p)
>>>> return 0;
>>>> }
>>>> +void show_unreclaimable_slab()
>>>> +{
>>>> + struct kmem_cache *s = NULL;
>>>> + struct slabinfo sinfo;
>>>> +
>>>> + memset(&sinfo, 0, sizeof(sinfo));
>>>> +
>>>> + printk("Unreclaimable slabs:\n");
>>>> +
>>>> + /*
>>>> + * Here acquiring slab_mutex is unnecessary since we don't prefer to
>>>> + * get sleep in oom path right before kernel panic, and avoid race condition.
>>>> + * Since it is already oom, so there should be not any big allocation
>>>> + * which could change the statistics significantly.
>>>> + */
>>>> + list_for_each_entry(s, &slab_caches, list) {
>>>> + if (!is_root_cache(s))
>>>> + continue;
>>>> +
>>>> + get_slabinfo(s, &sinfo);
>>>> +
>>>> + if (!is_reclaimable(s) && sinfo.num_objs > 0)
>>>> + printk("%-17s %luKB\n", cache_name(s), K(sinfo.num_objs * s->size));
>>>> + }
>>>
>>> I like this, but could we be even more helpful by giving the user more
>>> information from sinfo beyond just the total size of objects allocated?
>>
>> Sure, we definitely can. But, the question is what info is helpful to users to
>> diagnose oom other than the size.
>>
>> I think of the below:
>> - the number of active objs, the number of total objs, the percentage
>> of active objs per cache
>> - the number of active slabs, the number of total slabs, the
>> percentage of active slabs per cache
>>
>> Anything else?
>>
>
> Right now it's a useful tool to find out what unreclaimable slab is
> sitting around that is causing the system to run out of memory. If we
> knew how much of this slab is actually in use vs free, it can determine if
> its stranding or if there's a bug in the slab allocator itself.
I see. You would prefer a report that looks like:

Cache             Used size   Free size
mm_struct         100K        50K

Alternatively, it could show the total size (used + free) instead of
the free size, and perhaps also the number of active objs and the
number of total objs.
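
A rough sketch of how the used and free sizes in such a report could be
derived. This is a standalone userspace snippet, not kernel code and not
part of the patch. struct slab_usage, used_kb(), free_kb() and
format_slab_usage() are hypothetical names made up for illustration;
only active_objs, num_objs and the K() macro mirror what struct slabinfo
and the patch already have, and obj_size stands in for s->size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define K(x) ((x) / 1024)

/* Hypothetical stand-in for the slabinfo fields the report would need. */
struct slab_usage {
	const char *name;		/* cache name */
	unsigned long active_objs;	/* objects currently in use */
	unsigned long num_objs;		/* objects allocated in total */
	unsigned long obj_size;		/* object size in bytes (assumed s->size) */
};

/* Size in KB of the objects that are actually in use. */
static unsigned long used_kb(const struct slab_usage *u)
{
	return K(u->active_objs * u->obj_size);
}

/* Size in KB of the objects that are allocated but not in use. */
static unsigned long free_kb(const struct slab_usage *u)
{
	/* The stats are read without a lock, so clamp instead of underflowing. */
	unsigned long free_objs = u->num_objs > u->active_objs ?
				  u->num_objs - u->active_objs : 0;

	return K(free_objs * u->obj_size);
}

/* Format one report line: cache name, used size, free size. */
static int format_slab_usage(char *buf, size_t len, const struct slab_usage *u)
{
	assert(buf != NULL && u != NULL);
	return snprintf(buf, len, "%-17s %10luKB %10luKB",
			u->name, used_kb(u), free_kb(u));
}
```

With 150 objects of 1024 bytes, 100 of them active, this gives 100KB
used and 50KB free, which matches the mm_struct line in the example
report. The total size is just the sum of the two, so either column
layout can be produced from the same data.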
Thanks,
Yang
>
> We wouldn't need percentages, we can calculate that directly from the
> data if necessary.
>