[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.GSO.4.64.0703121247210.7679@cpu102.cs.uwaterloo.ca>
Date: Mon, 12 Mar 2007 13:05:17 -0400 (EDT)
From: Ashif Harji <asharji@...uwaterloo.ca>
To: Jan Kara <jack@...e.cz>
cc: Nick Piggin <npiggin@...e.de>, linux-kernel@...r.kernel.org
Subject: Re: do_generic_mapping_read performance issue
On Mon, 12 Mar 2007, Jan Kara wrote:
> On Mon 12-03-07 15:39:00, Nick Piggin wrote:
>> On Mon, Mar 12, 2007 at 03:20:12PM +0100, Jan Kara wrote:
>>> Hi,
>>>
>>>> Hi, I am encountering a performance problem, which I have tracked into the
>>>> Linux kernel. The problem occurs with my experimental web server that uses
>>>> sendfile to repeatedly transmit files. The files are based on the static
>>>> portion of the SPECweb99 fileset and range in size to model a reasonable
>>>> workload. With this workload, a significant number of the requests are
>>>> for files of size 4 KB or less.
>>>>
>>>> I have determined that the performance problems occurs in the function
>>>> do_generic_mapping_read in file mm/filemap.c for kernel version 2.6.20.1.
>>>> Here is the specific code fragment:
>>>>
>>>> /*
>>>> * When (part of) the same page is read multiple times
>>>> * in succession, only mark it as accessed the first time.
>>>> */
>>>> if (prev_index != index)
>>>> mark_page_accessed(page);
>>> Actually, the code is like that certainly for two years :).
>>
>> Did it always use ra->prev_page? ISTR it using pos%PAGE_SIZE == 0 at some
>> stage (ie. read from the start of a page -- obviously that also has holes).
> Yes, at least in 2.6.12-rc5 which is the first one in git :).
>
>>>> I was wondering if anyone could explain why the call to mark_page_accessed
>>>> is conditional? That is, what problem it is trying to solve. It would seem
>>>> that in many scenarios, if the same page is accessed repeatedly, then it
>>>> would be appropriate to keep that page cached.
>>> I also don't know why the condition is there but it's there at least
>>> for two years so I'm not sure anybody remembers ;). Nick, do you have
>>> an idea?
>>
>> Yeah it is there because that is basically how our "use once" detection
>> handles the case where an app does not read in chunks that are PAGE_SIZE
>> multiples and PAGE_SIZE aligned.
> OK, I see. Then I'm not sure the check does more good than bad. Because
> if we happen to reread the same chunk several times, then the check does a
> wrong thing...
Thanks for providing me with additional information.
I would like to submit a patch to fix the performance problem. The
simplest solution is to remove the check. Even in the situation where an
application does not read in PAGE_SIZE multiples as described above, if
the page is accessed frequently it should remain in the cache. However, I
am open to suggestions for a more sophisticated scheme.
ashif.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists