Message-ID: <84c89ac10910180933p3ddb9947ye464a19ba29e4ccc@mail.gmail.com>
Date: Sun, 18 Oct 2009 22:03:42 +0530
From: Viji V Nair <viji@...oraproject.org>
To: Eric Sandeen <sandeen@...hat.com>
Cc: Theodore Tso <tytso@....edu>, ext3-users@...hat.com,
linux-ext4@...r.kernel.org
Subject: Re: optimising filesystem for many small files

On Sun, Oct 18, 2009 at 9:04 PM, Eric Sandeen <sandeen@...hat.com> wrote:
> Viji V Nair wrote:
>>
>> On Sun, Oct 18, 2009 at 3:56 AM, Theodore Tso <tytso@....edu> wrote:
>>>
>>> On Sat, Oct 17, 2009 at 11:26:04PM +0530, Viji V Nair wrote:
>>>>
>>>> These files are not in a single directory; this is a pyramid
>>>> structure. There are 15 pyramids in total, and going from top to
>>>> bottom the subdirectories and files are multiplied by a factor of 4.
>>>>
>>>> The IO is scattered all over, and this is a single-disk file system.
>>>>
>>>> Since the Python application is creating files, it is creating
>>>> multiple files in multiple subdirectories at a time.
>>>
>>> What is the application trying to do, at a high level? Sometimes it's
>>> not possible to optimize a filesystem against a badly designed
>>> application. :-(
>>
>> The application reads GIS data from a data source and
>> plots the map tiles (256x256 PNG images) for different zoom
>> levels. The tree output of the first zoom level is as follows:
>>
>> /tiles/00
>> `-- 000
>>     `-- 000
>>         |-- 000
>>         |   `-- 000
>>         |       `-- 000
>>         |           |-- 000.png
>>         |           `-- 001.png
>>         |-- 001
>>         |   `-- 000
>>         |       `-- 000
>>         |           |-- 000.png
>>         |           `-- 001.png
>>         `-- 002
>>             `-- 000
>>                 `-- 000
>>                     |-- 000.png
>>                     `-- 001.png
>>
>> In each zoom level the fourth-level directories are multiplied by a
>> factor of four. The number of PNG images is multiplied by the
>> same factor.
>>>
>>> It sounds like it is generating files distributed in subdirectories in
>>> a completely random order. How are the files going to be read
>>> afterwards? In the order they were created, or some other order
>>> different from the order in which they were read?
>>
>> The applications we are using are modified versions of mapnik and
>> tilecache. These are single threaded, so we are running 4 processes at a
>> time. In other words, only four images are being created at any point in
>> time. Sometimes a single image takes around 20 sec to create. I
>> can see that plenty of system resources are free: memory, processors, etc.
>> (these are 4G, 2 x 5420 XEON).
>>
>> I have checked for delay in the backend data source; it is on a 12Gbps
>> LAN and there is no delay at all.
>
> The delays are almost certainly due to the drive heads seeking like mad as
> they attempt to write data all over the disk; most filesystems are designed
> so that files in subdirectories are kept together, and new subdirectories
> are placed at relatively distant locations to make room for the files they
> will contain.
>
> In the past I've seen similar applications also slow down due to new inode
> searching heuristics in the inode allocator, but that was on ext3 and ext4
> is significantly different in that regard...
>
>> These images are also read in the same manner.
>>
>>> With sufficiently bad access patterns, there may not be a lot you
>>> can do, other than (a) throw hardware at the problem, or (b) fix or
>>> redesign the application to be more intelligent (if possible).
>>>
>>> - Ted
>>>
>>
>> The file system was created with "-i 1024 -b 1024" to get a larger
>> number of inodes; 50% of the total images are less than 10KB. I have
>> disabled access time updates and given a large value to the commit
>> interval as well. Do you have any other recommendations for the file
>> system creation?
>
> I think you'd do better to change, if possible, how the application behaves.
>
> I probably don't know enough about the app but rather than:
>
> /tiles/00
> `-- 000
>     `-- 000
>         |-- 000
>         |   `-- 000
>         |       `-- 000
>         |           |-- 000.png
>         |           `-- 001.png
>
> could it do:
>
> /tiles/00/000000000000000000.png
> /tiles/00/000000000000000001.png
>
> ...
>
> for example? (or something similar)
>
> -Eric

The tilecache application creates this directory structure, so we
would need to change both it and our application to use a new directory tree.
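
As a rough sketch of the flatter layout suggested above, the nested path
could be collapsed into a single file name per zoom directory. The function
name flatten_tile_path is hypothetical and is not part of tilecache; it
assumes every tile path ends in a zoom directory, five three-digit
directories and a three-digit file name, as in the tree above.

```python
from pathlib import PurePosixPath


def flatten_tile_path(nested: str) -> str:
    """Collapse .../zz/aaa/bbb/ccc/ddd/eee/fff.png into .../zz/aaabbbcccdddeeefff.png.

    Hypothetical helper: the layout is assumed from the tree output in
    this thread, not taken from the tilecache source.
    """
    parts = PurePosixPath(nested).parts
    # Need at least: zoom dir + five numeric dirs + file name.
    if len(parts) < 7:
        raise ValueError(f"path too short to be a nested tile path: {nested!r}")

    *prefix, zoom, d1, d2, d3, d4, d5, filename = parts
    name = PurePosixPath(filename)

    # Join the five directory components and the file stem into one name,
    # keeping the original extension.
    flat_name = "".join([d1, d2, d3, d4, d5, name.stem]) + name.suffix
    return str(PurePosixPath(*prefix, zoom, flat_name))


if __name__ == "__main__":
    print(flatten_tile_path("/tiles/00/000/000/000/000/000/001.png"))
```

This only rewrites path strings; whether one large directory per zoom level
actually performs better on this disk would still have to be measured.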
>
>> Viji
>
>