Date:   Sat, 22 Jul 2017 18:40:40 -0700
From:   Saurabh Kadekodi <saukad@...cmu.edu>
To:     Andreas Dilger <adilger@...ger.ca>
Cc:     linux-ext4@...r.kernel.org
Subject: Re: Collecting aged Ext4 profiles

fsstats captures most of what I want (age and size distributions). It does not capture the directory depth distribution (i.e. what fraction of the files sits at each depth in the fs hierarchy), which can be important in an aging study because Ext4 places high-level directories in different block groups, resulting in some fragmentation. fsstats also does not capture free space fragmentation or the fragmentation score, both of which are important for my study.
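To make "directory depth distribution" concrete, here is a minimal Python sketch of the idea. It is not the code from fsagestats; the function name `depth_distribution` and the convention that files directly under the walked root have depth 0 are assumptions made for illustration.

```python
import os
from collections import Counter


def depth_distribution(root):
    """Return {depth: fraction of files at that depth} for the tree under root.

    Assumption: files directly inside root have depth 0, files one
    directory below have depth 1, and so on. File names and contents
    are never recorded, only counts per depth.
    """
    root = os.path.abspath(root)
    counts = Counter()
    # os.walk does not follow symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        counts[depth] += len(filenames)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {depth: n / total for depth, n in sorted(counts.items())}
```

Calling `depth_distribution(mount_point)` on a mounted file system would give the per-depth fractions. Note that this sketch does not stop at mount boundaries, so nested mounts under the root would be counted as well.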

If fsstats is more convenient, it would be great if the following commands could also be run in order to capture the fragmentation:

1. e2freefrag ext4_dev

2. e4defrag -c mount_point
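For anyone who prefers to script this, the two commands above could be wrapped roughly as follows. This is only a sketch: the helper names, the output file names, and the `run` parameter are hypothetical, and both tools typically need root privileges on the real device and mount point.

```python
import subprocess
from pathlib import Path


def fragmentation_commands(ext4_dev, mount_point):
    """Build the two command lines listed above (no other flags assumed)."""
    return {
        "e2freefrag": ["e2freefrag", ext4_dev],
        "e4defrag": ["e4defrag", "-c", mount_point],
    }


def capture_fragmentation(ext4_dev, mount_point, results_dir, run=subprocess.run):
    """Run both commands and save their output under results_dir.

    The output file names (<tool>.txt) are a hypothetical choice. The
    `run` parameter exists only so the function can be exercised without
    touching a real device.
    """
    out = Path(results_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, cmd in fragmentation_commands(ext4_dev, mount_point).items():
        proc = run(cmd, capture_output=True, text=True)
        # Keep stderr too, since permission errors are reported there.
        (out / f"{name}.txt").write_text(proc.stdout + proc.stderr)
```

The resulting directory can then be tarred up and sent along with the fsstats output.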

Thanks,
Saurabh

> On Jul 17, 2017, at 1:12 PM, Saurabh Kadekodi <saukad@...cmu.edu> wrote:
> 
> Thanks Andreas. Yes, it would be great if you could share the archive. I will go through fsstats and check the exact difference. If it captures what I need, I agree that using fsstats would be more apt.
> 
> Thanks,
> Saurabh
> 
>> On Jul 17, 2017, at 1:08 PM, Andreas Dilger <adilger@...ger.ca> wrote:
>> 
>> On Jul 16, 2017, at 12:34 AM, Andreas Dilger <adilger@...ger.ca> wrote:
>>> On Jul 15, 2017, at 18:14, Saurabh Kadekodi <saukad@...cmu.edu> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I am a PhD student studying file and storage systems and I am currently conducting research on local file system aging. My research aims at understanding realistic aging patterns and analyzing the effects of aging on file system data structures and performance. For this purpose, I would like to capture characteristics of naturally aged file systems (i.e. not aged via synthetic workload generators).
>>>> 
>>>> In order to facilitate this profile capture, I have written a shell / python based profiling tool (fsagestats - https://github.com/saurabhkadekodi/fsagestats)  that does a file system tree walk and captures different characteristics (file age, file size and directory depth) of files and directories and produces distributions. I do not care about file names or data within each file. It also runs e2freefrag in order to understand the level of free space fragmentation, e4defrag in order to capture the fragmentation score, and copies a large file (~ 2GB) and runs filefrag in order to understand the file fragmentation, all of which are directly correlated with the file system performance. It dumps the results in the results dir, which is to be specified when you run fsagestats. You can send me the aging profile by tarring up the results directory and sending it via email.
>>>> 
>>>> Since I do not have access to Ext4 systems that see a lot of churn, I am reaching out to the Ext4 community in order to find volunteers willing to run my script and capture their Ext4 aging profile. Please feel free to modify the script as per your installation or as you see fit. Since fsagestats collects no private information, I eventually intend to host these profiles publicly (unless explicitly requested not to) to aid other researchers / enthusiasts.
>>>> 
>>>> In case you have any questions or concerns, please let me know.
>>>> 
>>>> Thanks,
>>>> Saurabh Kadekodi
>>>> 
>>>> PS: cc’ing the response and / or the aging profile to saukad@...cmu.edu is greatly appreciated.
>>> 
>>> How does your fsagestats tool compare to the existing fsstats tool (http://web.cs.dal.ca/~morven/CSCI3120/fsstats)?  If there isn't a significant difference between the two, it would be nice to stick with the existing tool to collect the filesystem information so that the body of data collected continues to grow.
>> 
>> Actually, a slightly better URL is https://github.com/adilger/fsstats which is a
>> proper Git repo and includes the original license.  The original project URL
>> http://www.pdsi-scidac.org/fsstats/ is no longer functional.  I also have a local
>> archive of results from that project if you are interested.
>> 
>> Cheers, Andreas
