Message-Id: <D03E346D-8DDF-4134-84C9-07AB66493A58@thehive.com>
Date: Tue, 9 Jun 2009 10:16:30 -0400
From: Matthew Von Maszewski <matthew@...hive.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc: linux-kernel@...r.kernel.org,
"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: huge mem mmap eats all CPU when multiple processes
My apologies for the lack of clarity in the original email. I am working
on a test program to send out later today. Here are my responses to
the questions asked:
On Jun 8, 2009, at 8:41 PM, KAMEZAWA Hiroyuki wrote:
> On Mon, 8 Jun 2009 10:27:49 -0400
> Matthew Von Maszewski <matthew@...hive.com> wrote:
>
>> [note: not on kernel mailing list, please cc author]
>>
>> Symptom: 9 processes mmap the same 2 GB memory section for a shared C
>> heap (lots of random access). All processes show extreme CPU load in
>> top.
>>
>> - Same code works well when only a single process accesses the huge mem.
> Does this "huge mem" means HugeTLB(2M/4Mbytes) pages ?
Yes. My Debian x86_64 kernel build uses 2 MB pages. The test with one
process is really fast. The test with multiple processes against the
same mmap()'d file is really slow.
>
>
>> - Code works well with a standard VM-backed mmap'd file and 9 processes.
>>
>
> What is the sys/user ratio in top? Are almost all CPUs used by "sys"?
Tasks: 94 total, 3 running, 91 sleeping, 0 stopped, 0 zombie
Cpu0 :  5.6%us, 86.4%sy, 0.0%ni,  1.3%id, 5.3%wa, 0.0%hi, 1.3%si, 0.0%st
Cpu1 :  1.0%us, 92.4%sy, 0.0%ni,  0.0%id, 5.6%wa, 0.0%hi, 1.0%si, 0.0%st
Cpu2 :  1.7%us, 90.4%sy, 0.0%ni,  0.0%id, 7.3%wa, 0.0%hi, 0.7%si, 0.0%st
Cpu3 :  0.0%us, 70.4%sy, 0.0%ni, 25.1%id, 4.0%wa, 0.0%hi, 0.5%si, 0.0%st
Mem:  6103960k total, 2650044k used, 3453916k free,  6068k buffers
Swap: 5871716k total,       0k used, 5871716k free, 84504k cached
 PID  USER  PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
3681  proxy 20   0 2638m 1596 1312 S   43  0.0 0:07.87 tentacle.e.prof
3687  proxy 20   0 2656m 1592 1312 S   43  0.0 0:07.69 tentacle.e.prof
3689  proxy 20   0 2662m 1600 1312 S   42  0.0 0:07.82 tentacle.e.prof
3683  proxy 20   0 2652m 1596 1312 S   41  0.0 0:07.75 tentacle.e.prof
3684  proxy 20   0 2650m 1596 1312 S   41  0.0 0:07.89 tentacle.e.prof
3686  proxy 20   0 2644m 1596 1312 S   40  0.0 0:07.80 tentacle.e.prof
3685  proxy 20   0 2664m 1592 1312 S   40  0.0 0:07.82 tentacle.e.prof
3682  proxy 20   0 2646m 1616 1328 S   38  0.0 0:07.73 tentacle.e.prof
3664  proxy 20   0 2620m 1320  988 R   36  0.0 0:01.08 tentacle.e
3678  proxy 20   0 72352  35m 1684 R   11  0.6 0:01.79 squid
tentacle.e and tentacle.e.prof are copies of the same executable file,
started with different command line options. tentacle.e is started by
an init.d script. tentacle.e.prof processes are started by squid.
I am creating a simplified program to duplicate the scenario. Will
send it along later today.
>
>
>> Environment:
>>
>> - Intel x86_64: Dual core Xeon with hyperthreading (4 logical
>> processors)
>> - 6 GB RAM, 2.5 GB allocated to huge mem
> by boot option ?
Huge mem initialization:
1. sysctl.conf allocates the desired number of 2 MB pages:
system:/mnt$ tail -n 3 /etc/sysctl.conf
#huge
vm.nr_hugepages=1200
2. The init.d script that starts tentacle.e mounts the file system and
preallocates space. From that init.d file:
umount /mnt/hugefs
mount -t hugetlbfs -o uid=proxy,size=2300M none /mnt/hugefs
system:/mnt$ df -kP
Filesystem   1024-blocks     Used  Available Capacity Mounted on
/dev/sda1      135601864 32634960   96078636      26% /
tmpfs            3051980        0    3051980       0% /lib/init/rw
udev               10240       68      10172       1% /dev
tmpfs            3051980        0    3051980       0% /dev/shm
none             2355200  2117632     237568      90% /mnt/hugefs
>
>
>> - tried with kernels 2.6.29.4 and 2.6.30-rc8
>> - the following mmap() call uses NULL as the base address in the first
>> process; the returned address is then passed to the subsequent
>> processes (not threads, processes)
>>
>> m_MemSize = ((m_MemSize / (2048*1024)) + 1) * 2048*1024;
>> m_BaseAddr = mmap(m_File->GetFixedBase(), m_MemSize,
>>                   (PROT_READ | PROT_WRITE),
>>                   MAP_SHARED, m_File->GetFileId(),
>>                   m_Offset);
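
For reference, here is a minimal stand-alone sketch of the mapping
pattern above (the file name /mnt/hugefs/shared_heap and the function
name map_shared_heap are placeholders, not the real names from
tentacle.e). The first process calls it with fixed_base == NULL and
hands the returned address to the other processes, which pass it back
in as the hint:

    /* Sketch only: hypothetical names, same mapping pattern as above. */
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define HUGE_PAGE (2048 * 1024)              /* 2 MB huge page */

    static void *map_shared_heap(void *fixed_base, size_t size, off_t offset)
    {
        /* hugetlbfs file backing the shared C heap (placeholder path) */
        int fd = open("/mnt/hugefs/shared_heap", O_RDWR | O_CREAT, 0660);
        if (fd < 0) { perror("open"); exit(1); }

        /* round up to a whole number of 2 MB huge pages, as above */
        size = ((size / HUGE_PAGE) + 1) * HUGE_PAGE;

        void *base = mmap(fixed_base, size, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, offset);
        if (base == MAP_FAILED) { perror("mmap"); exit(1); }

        return base;   /* the first process passes this to the others */
    }
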
>>
>>
>> I am not a kernel hacker so I have not attempted to debug. Will be
>> able to spend time on a sample program for sharing later today or
>> tomorrow. Sending this note now in case this is already known.
>>
>
> IIUC, all page faults on hugetlb pages are serialized by a global
> mutex, so touching pages from several processes in parallel does not
> go fast.
> So I wonder whether having one thread touch all of the necessary
> mappings first would be better, in general.
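
A sketch of that single-process pre-touch, using the placeholder names
from the sketch above (a read of one byte per 2 MB page should be
enough to fault in a MAP_SHARED hugetlb page):

    /* Sketch only: one process touches each 2 MB huge page once, so the
     * hugetlb faults are taken serially up front instead of from nine
     * processes at the same time. */
    #include <stddef.h>

    #define HUGE_PAGE (2048 * 1024)

    static void prefault_heap(const volatile char *base, size_t size)
    {
        for (size_t off = 0; off < size; off += HUGE_PAGE)
            (void)base[off];    /* reading one byte faults the page in */
    }

    /* e.g. prefault_heap((const volatile char *)m_BaseAddr, m_MemSize);
     * in the first process, before the tentacle.e.prof processes start
     * their random access. */
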
>
>
>
>> Don't suppose this is as simple as a Copy-On-Write flag being set
>> wrong?
>>
> I don't think so.
>
>> Please send notes as to things I need to capture to better describe
>> this bug. Happy to do the work.
>>
> Add cc to linux-mm.
>
> Thanks,
> -Kame
>
>
>> Thanks,
>> Matthew
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/