Message-ID: <alpine.DEB.2.10.1710161709460.140151@chino.kir.corp.google.com>
Date:   Mon, 16 Oct 2017 17:15:31 -0700 (PDT)
From:   David Rientjes <rientjes@...gle.com>
To:     Yang Shi <yang.s@...baba-inc.com>
cc:     cl@...ux.com, penberg@...nel.org, iamjoonsoo.kim@....com,
        akpm@...ux-foundation.org, mhocko@...nel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable
 slabs > user memory

On Wed, 11 Oct 2017, Yang Shi wrote:

> @@ -161,6 +162,25 @@ static bool oom_unkillable_task(struct task_struct *p,
>  	return false;
>  }
>  
> +/*
> + * Print out unreclaimable slabs info when unreclaimable slabs amount is greater
> + * than all user memory (LRU pages)
> + */
> +static bool is_dump_unreclaim_slabs(void)
> +{
> +	unsigned long nr_lru;
> +
> +	nr_lru = global_node_page_state(NR_ACTIVE_ANON) +
> +		 global_node_page_state(NR_INACTIVE_ANON) +
> +		 global_node_page_state(NR_ACTIVE_FILE) +
> +		 global_node_page_state(NR_INACTIVE_FILE) +
> +		 global_node_page_state(NR_ISOLATED_ANON) +
> +		 global_node_page_state(NR_ISOLATED_FILE) +
> +		 global_node_page_state(NR_UNEVICTABLE);
> +
> +	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
> +}

I think this is too high a bar to clear before dumping potentially very 
helpful information to the kernel log.  On my 256GB system, this would 
probably require >128GB of unreclaimable slab to trigger.  If a single 
leaking slab cache were to blame for the excessive usage, it would suffice 
to print a single line showing the slab cache with the greatest memory 
footprint.
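
To illustrate the "single line for the largest cache" idea, here is a 
minimal user-space sketch in C.  It is not kernel code and not part of the 
patch: struct cache_usage and largest_cache() are hypothetical names, and 
the per-cache page counts are assumed to have been collected already.

```c
#include <stddef.h>

/* Hypothetical per-cache record; the kernel's struct kmem_cache differs. */
struct cache_usage {
	const char *name;	/* cache name, e.g. "dentry" */
	unsigned long pages;	/* pages currently backing this cache */
};

/* Return the entry with the most pages, or NULL if the array is empty. */
static const struct cache_usage *
largest_cache(const struct cache_usage *caches, size_t n)
{
	const struct cache_usage *top = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		/* On ties the earliest entry wins. */
		if (!top || caches[i].pages > top->pages)
			top = &caches[i];
	}
	return top;
}
```

A caller would print the name and page count of the returned entry, and 
print nothing when the result is NULL.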

It also prevents us from diagnosing issues where reclaimable slab isn't 
actually reclaimed as expected, so the scope is too narrow.

Previous iterations of this patchset were actually better because they 
presented useful data without restricting it to such a narrow case behind 
such a demanding threshold.

Please simply dump statistics for all slab caches where the memory 
footprint is greater than 5% of system memory.
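
The 5% rule can be sketched as follows.  This is a user-space 
illustration under stated assumptions, not the kernel implementation: 
struct slab_usage, exceeds_percent() and count_large_caches() are 
hypothetical names, and both the per-cache footprint and total system 
memory are assumed to be available as page counts.

```c
#include <stddef.h>

/* Hypothetical per-cache record; not the kernel's own data structure. */
struct slab_usage {
	const char *name;
	unsigned long pages;
};

/* True if pages is strictly more than percent% of total_pages. */
static int exceeds_percent(unsigned long pages, unsigned long total_pages,
			   unsigned int percent)
{
	/* Cross-multiply in 64 bits so no precision is lost to division. */
	return (unsigned long long)pages * 100 >
	       (unsigned long long)total_pages * percent;
}

/* Count the caches whose footprint is above the threshold. */
static size_t count_large_caches(const struct slab_usage *caches, size_t n,
				 unsigned long total_pages,
				 unsigned int percent)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (exceeds_percent(caches[i].pages, total_pages, percent))
			count++;
	return count;
}
```

Assuming 4KiB pages, a 256GB system has 67108864 pages, so a cache would 
be reported once it holds more than about 3355443 pages (roughly 12.8GB), 
far below the >128GB needed by the check in the quoted patch.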
