Message-ID: <20100812134338.14856uchty1ra39c@guarana.org>
Date: Thu, 12 Aug 2010 13:43:38 +1000
From: Kevin Easton <kevin@...rana.org>
To: Micha Nelissen <micha@...i.hopto.org>
Cc: linux-kernel@...r.kernel.org
Subject: Re: Why is get_user_pages so slow?
Quoting Micha Nelissen <micha@...i.hopto.org>:
> Hi all,
>
> Why is get_user_pages much slower than taking the faults? (I would
> expect it to be faster).
>
> Attached example program first mallocs a piece of memory (64MB in
> this case) then reads it "to take the faults". Afterwards, it uses
> mmap with MAP_POPULATE to "speed up" and not to have to take the
> faults, but have everything mapped in one go. I think mmap is using
> get_user_pages in this case.
>
> $ ./memspeed
> malloc took 0 msecs
> read took 14 msecs
> write took 0 msecs
> free took 1 msecs
> mmap took 45 msecs
> munmap took 5 msecs
>
> Using MAP_POPULATE is 3 times as slow as the 'stupid'
> implementation! I'm running a Core 2 duo e6300 system with linux
> 2.6.28.4.
>
> Am I doing something wrong? MAP_POPULATE seems a bit of a joke to me.
Hi Micha,
Yep, you are. Because your pointer 'p' is a pointer to int, incrementing
it by 0x1000 in your loops actually advances it by 0x1000 * sizeof(int)
bytes - so (with 4-byte ints and 4 KiB pages) you're only touching one
page in four.
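To make the stride concrete, here is a minimal sketch of the pointer arithmetic involved. The names PAGE_STEP, int_stride_bytes and char_stride_bytes are made up for illustration and are not taken from the attached program; 0x1000 is assumed to be the page size.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_STEP 0x1000   /* assumed 4 KiB page size, as in the thread */

/* Bytes actually skipped when an int pointer is advanced by PAGE_STEP. */
static size_t int_stride_bytes(void)
{
    static int buf[2 * PAGE_STEP];
    int *p = buf;
    int *q = p + PAGE_STEP;            /* advances PAGE_STEP elements, not bytes */

    assert(q > p);
    return (size_t)((char *)q - (char *)p);
}

/* Bytes skipped when a char pointer is advanced by PAGE_STEP. */
static size_t char_stride_bytes(void)
{
    static char buf[2 * PAGE_STEP];
    char *p = buf;
    char *q = p + PAGE_STEP;           /* sizeof(char) is 1, so this is one page */

    assert(q > p);
    return (size_t)(q - p);
}
```

On a system where sizeof(int) is 4, int_stride_bytes() returns 0x4000 while char_stride_bytes() returns 0x1000, which is where the "one page in four" comes from.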
If you change the types of 'buf', 'p' and 'e' to 'char *' then it
touches every page - and (at least on my test box) the MAP_POPULATE
case pulls ahead.
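As a rough illustration of the difference, the sketch below counts how many locations a read loop visits with each pointer type. It is a reconstruction under assumptions, not the attached program: the helper names touch_as_int and touch_as_char and the STEP constant are hypothetical, and the loops index from the base pointer rather than incrementing 'p' up to 'e'.

```c
#include <stddef.h>

#define STEP 0x1000   /* assumed 4 KiB page size */

/* Walk the buffer as ints, stepping STEP elements (STEP * sizeof(int) bytes). */
static size_t touch_as_int(void *mem, size_t len)
{
    volatile int *p = mem;
    size_t n = len / sizeof(int);
    size_t count = 0;

    for (size_t i = 0; i < n; i += STEP) {
        (void)p[i];                    /* read to take the fault */
        count++;
    }
    return count;
}

/* Walk the buffer as chars, stepping STEP bytes, i.e. one read per page. */
static size_t touch_as_char(void *mem, size_t len)
{
    volatile char *p = mem;
    size_t count = 0;

    for (size_t i = 0; i < len; i += STEP) {
        (void)p[i];                    /* read to take the fault */
        count++;
    }
    return count;
}
```

For a zero-initialised buffer of 64 pages (for example from calloc), the char version makes 64 reads, one per page, while the int version makes only 64 / sizeof(int) of them, so the "read" timing in the original test covers only a fraction of the pages that MAP_POPULATE has to map.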
- Kevin