Message-ID: <33710E6CAA200E4583255F4FB666C4E20AC96A42@G01JPEXMBYT03>
Date: Mon, 18 Feb 2013 00:16:15 +0000
From: "Hatayama, Daisuke" <d.hatayama@...fujitsu.com>
To: Atsushi Kumagai <kumagai-atsushi@....nes.nec.co.jp>
CC: "kexec@...ts.infradead.org" <kexec@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"lisa.mitchell@...com" <lisa.mitchell@...com>,
"ebiederm@...ssion.com" <ebiederm@...ssion.com>,
"cpw@....com" <cpw@....com>,
"vgoyal@...hat.com" <vgoyal@...hat.com>
Subject: RE: [PATCH 00/13] kdump, vmcore: support mmap() on /proc/vmcore
From: Atsushi Kumagai <kumagai-atsushi@....nes.nec.co.jp>
Subject: Re: [PATCH 00/13] kdump, vmcore: support mmap() on /proc/vmcore
Date: Fri, 15 Feb 2013 12:57:01 +0900
> On Thu, 14 Feb 2013 19:11:43 +0900
> HATAYAMA Daisuke <d.hatayama@...fujitsu.com> wrote:
<cut>
>> TODO
>> ====
>>
>> - fix makedumpfile to use mmap() on /proc/vmcore and benchmark it to
>> confirm whether we can see enough performance improvement.
>
> As a first step, I'll make a prototype patch for benchmarking unless you
> have already done it.
>
I have an idea, but I've not started developing it yet.
I think there are two points we should optimize. One is
write_kdump_pages(), which reads target page frames, compresses them if
necessary, and writes each page frame's data in order; the other is
__exclude_unnecessary_pages(), which reads the mem_map array into
page_cache and processes it for filtering.
Optimising the former with mmap() seems trivial, but the latter needs
more consideration: mem_map is mapped in the virtual memory map region,
so it is virtually contiguous but not guaranteed to be physically
contiguous. Hence, the current implementation reads the mem_map array
one 4KB page at a time, with a virtual-to-physical translation for
each. This is critical for performance and not suitable for
optimization by mmap(). We should fix this anyway.
My idea here is to focus on the fact that the virtual memory map region
is actually mapped using PMD-level page entries, i.e. 4MB pages, if the
processor in use supports large pages. Because of this, the page
entries obtained by each address translation are guaranteed to be
physically contiguous for at least 4MB. Looking at the benchmark, the
performance improvement is already saturated in the 4MB case. So I
guess we can see enough performance improvement by mmap()ing the
mem_map array in these 4MB units.
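To make the 4MB-unit idea concrete, here is a rough sketch in C. It is
not makedumpfile code: chunk_window_for(), read_chunked() and CHUNK_SIZE
are hypothetical names made up for illustration, and "offset" is assumed
to be a file offset into the dump that has already been derived from the
physical address (that translation is not shown). The helper rounds an
offset down to a 4MB boundary, and the reader maps one such window at a
time and copies out of it.

```c
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/mman.h>

/* Assumed mapping unit: the 4MB large-page size discussed above. */
#define CHUNK_SIZE ((uint64_t)4 << 20)

struct chunk_window {
	uint64_t base;   /* chunk-aligned file offset to hand to mmap() */
	uint64_t delta;  /* position of the wanted byte inside the chunk */
	uint64_t avail;  /* bytes from delta up to the end of the chunk */
};

/* Split a file offset into an aligned base and an in-chunk delta. */
static struct chunk_window chunk_window_for(uint64_t offset, uint64_t chunk)
{
	struct chunk_window w;

	/* The mask trick below only works for power-of-two chunk sizes. */
	assert(chunk != 0 && (chunk & (chunk - 1)) == 0);
	w.base = offset & ~(chunk - 1);
	w.delta = offset - w.base;
	w.avail = chunk - w.delta;
	return w;
}

/*
 * Copy len bytes starting at offset into buf, mapping one chunk at a
 * time. Returns 0 on success, -1 on a bad range or mmap() failure.
 * A real implementation would keep the last mapping cached instead of
 * remapping on every call.
 */
static int read_chunked(int fd, uint64_t file_size, uint64_t offset,
			void *buf, size_t len)
{
	unsigned char *out = buf;

	if (offset > file_size || (uint64_t)len > file_size - offset)
		return -1;

	while (len > 0) {
		struct chunk_window w = chunk_window_for(offset, CHUNK_SIZE);
		/* Do not map past the end of the file. */
		uint64_t map_len = file_size - w.base < CHUNK_SIZE ?
				   file_size - w.base : CHUNK_SIZE;
		size_t n = (uint64_t)len < map_len - w.delta ?
			   len : (size_t)(map_len - w.delta);
		void *p = mmap(NULL, (size_t)map_len, PROT_READ, MAP_PRIVATE,
			       fd, (off_t)w.base);

		if (p == MAP_FAILED)
			return -1;
		memcpy(out, (unsigned char *)p + w.delta, n);
		munmap(p, (size_t)map_len);

		out += n;
		offset += n;
		len -= n;
	}
	return 0;
}
```

With this layout, one address translation per 4MB window would be
enough for the mem_map case, instead of one per 4KB page, provided the
window really is physically contiguous as assumed above.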
Thanks.
HATAYAMA, Daisuke