lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070312143900.GB6016@wotan.suse.de>
Date:	Mon, 12 Mar 2007 15:39:00 +0100
From:	Nick Piggin <npiggin@...e.de>
To:	Jan Kara <jack@...e.cz>
Cc:	Ashif Harji <asharji@...uwaterloo.ca>, linux-kernel@...r.kernel.org
Subject: Re: do_generic_mapping_read performance issue

On Mon, Mar 12, 2007 at 03:20:12PM +0100, Jan Kara wrote:
>   Hi,
> 
> > Hi, I am encountering a performance problem, which I have tracked into the 
> > Linux kernel. The problem occurs with my experimental web server that uses 
> > sendfile to repeatedly transmit files.  The files are based on the static 
> > portion of the SPECweb99 fileset and range in size to model a reasonable 
> > workload.  With this workload, a significant number of the requests are 
> > for files of size 4 KB or less.
> > 
> > I have determined that the performance problems occurs in the function
> > do_generic_mapping_read in file mm/filemap.c for kernel version 2.6.20.1.
> > Here is the specific code fragment:
> > 
> >         /*
> >          * When (part of) the same page is read multiple times
> >          * in succession, only mark it as accessed the first time.
> >          */
> >         if (prev_index != index)
> >                 mark_page_accessed(page);
>   Actually, the code is like that certainly for two years :).

Did it always use ra->prev_page? ISTR it using pos%PAGE_SIZE == 0 at some
stage (ie. read from the start of a page -- obviously that also has holes).

> > The implication of this code is that for files of size less than or equal 
> > to a single page, the page associated with such a file is likely to get 
> > evicted from the cache regardless of how frequently it is accessed.  The 
> > reason is that after the first access, prev_index is always zero and index 
> > can only be zero. Hence, mark_page_accessed is never called after the 
> > first time the file is requested.  As a result, the page is evicted from 
> > the cache no matter how frequently it is used.  By changing the kernel to 
> > always call mark_page_accessed for these files, the server throughput is 
> > increased by as much as 20%.
>   Your analysis seems to be right. But to observe this behaviour you have
> to have the file open and just always reread it using the same file
> descriptor, don't you? That's probably not too common...
> 
> > I was wondering if anyone could explain why the call to mark_page_accessed 
> > is conditional? That is, what problem it is trying to solve. It would seem 
> > that in many scenarios, if the same page is accessed repeatedly, then it 
> > would be appropriate to keep that page cached.
>   I also don't know why the condition is there but it's there at least
> for two years so I'm not sure anybody remembers ;). Nick, do you have
> an idea?

Yeah it is there because that is basically how our "use once" detection
handles the case where an app does not read in chunks that are PAGE_SIZE
multiples and PAGE_SIZE aligned.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ