Message-ID: <54062F32.5070504@sr71.net>
Date: Tue, 02 Sep 2014 13:57:22 -0700
From: Dave Hansen <dave@...1.net>
To: Dave Hansen <dave.hansen@...el.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...e.com>,
Hugh Dickins <hughd@...gle.com>, Tejun Heo <tj@...nel.org>,
Vladimir Davydov <vdavydov@...allels.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>
Subject: Re: regression caused by cgroups optimization in 3.17-rc2
I, of course, forgot to include the most important detail. This appears
to be pretty run-of-the-mill spinlock contention in the resource counter
code. About 80% of CPU time is spent in _raw_spin_lock, apparently
spinning on res_counter->lock in both the charge and uncharge paths.
The uncharge code already does _some_ batching on the free side, but
that apparently breaks down after ~40 threads.
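
To illustrate why batching on the free side reduces lock traffic, here is a minimal sketch of a lock-protected usage counter with a batched uncharge helper. It is a toy model, not the kernel's res_counter: the names `Counter`, `uncharge_one_by_one`, `uncharge_batched`, and the batch size of 32 are all assumptions made for illustration.

```python
import threading


class Counter:
    """Toy stand-in for a lock-protected usage counter (hypothetical, not res_counter)."""

    def __init__(self, limit):
        self.limit = limit
        self.usage = 0
        self.lock = threading.Lock()
        self.lock_acquisitions = 0  # instrumentation only, updated under the lock

    def charge(self, n):
        # Every charge takes the shared lock, which is where contention builds up.
        with self.lock:
            self.lock_acquisitions += 1
            if self.usage + n > self.limit:
                return False
            self.usage += n
            return True

    def uncharge(self, n):
        with self.lock:
            self.lock_acquisitions += 1
            self.usage -= n


def uncharge_one_by_one(counter, pages):
    # One lock acquisition per freed page.
    for _ in range(pages):
        counter.uncharge(1)


def uncharge_batched(counter, pages, batch=32):
    # Accumulate freed pages locally and take the lock once per batch.
    pending = 0
    for _ in range(pages):
        pending += 1
        if pending == batch:
            counter.uncharge(pending)
            pending = 0
    if pending:
        counter.uncharge(pending)
```

In this model, freeing 1000 pages one at a time takes the lock 1000 times, while the batched helper takes it 32 times. Batching lowers the per-thread acquisition rate, but every thread still serializes on the same lock, so with enough threads the lock can saturate again.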
The cause is a no-brainer: the patch in question removed an optimization
that skipped the charging, and now we're seeing overhead from the charging.
Here's the first entry from perf top:
80.18%    80.18%  [kernel]  [k] _raw_spin_lock
              |
              --- _raw_spin_lock
                 |
                 |--66.59%-- res_counter_uncharge_until
                 |          res_counter_uncharge
                 |          uncharge_batch
                 |          uncharge_list
                 |          mem_cgroup_uncharge_list
                 |          release_pages
                 |          free_pages_and_swap_cache
                 |          tlb_flush_mmu_free
                 |          |
                 |          |--90.12%-- unmap_single_vma
                 |          |          unmap_vmas
                 |          |          unmap_region
                 |          |          do_munmap
                 |          |          vm_munmap
                 |          |          sys_munmap
                 |          |          system_call_fastpath
                 |          |          __GI___munmap
                 |          |
                 |           --9.88%-- tlb_flush_mmu
                 |                     tlb_finish_mmu
                 |                     unmap_region
                 |                     do_munmap
                 |                     vm_munmap
                 |                     sys_munmap
                 |                     system_call_fastpath
                 |                     __GI___munmap
                 |
                 |--46.13%-- __res_counter_charge
                 |          res_counter_charge
                 |          try_charge
                 |          mem_cgroup_try_charge
                 |          |
                 |          |--99.89%-- do_cow_fault
                 |          |          handle_mm_fault
                 |          |          __do_page_fault
                 |          |          do_page_fault
                 |          |          page_fault
                 |          |          testcase
                 |           --0.11%-- [...]
                 |
                 |--1.14%-- do_cow_fault
                 |          handle_mm_fault
                 |          __do_page_fault
                 |          do_page_fault
                 |          page_fault
                 |          testcase
                  --8217937613.29%-- [...]
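
The call chains above suggest a workload that write-faults pages of a private mapping (the do_cow_fault path under page_fault) and then unmaps them (the munmap path that ends in the uncharge). The following is a rough sketch of one iteration of a loop with that shape. It is inferred from the trace only and is not the actual testcase; the function name `fault_and_unmap` and the use of a temporary file as backing store are assumptions.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE


def fault_and_unmap(num_pages):
    """Map a file privately, write one byte per page, then unmap (Unix only)."""
    length = num_pages * PAGE
    with tempfile.TemporaryFile() as f:
        f.truncate(length)  # sparse backing file full of zeros
        m = mmap.mmap(
            f.fileno(),
            length,
            flags=mmap.MAP_PRIVATE,
            prot=mmap.PROT_READ | mmap.PROT_WRITE,
        )
        touched = 0
        for off in range(0, length, PAGE):
            # First write to each page of a private file mapping takes a
            # write fault, which is where a new page would be charged.
            m[off] = 1
            touched += 1
        # Closing the mapping unmaps it, which is where the pages are
        # freed and would be uncharged.
        m.close()
    return touched
```

Contention like that in the profile would only appear when many tasks charged to the same counter run such a loop in parallel; a single call just shows the fault-then-unmap pattern.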