linux-ext4 - Re: [PATCH] ext4: critical info format fix in __ext4_grp_locked

Message-ID: <4D8809C5.1000402@tao.ma>
Date:	Tue, 22 Mar 2011 10:30:29 +0800
From:	Tao Ma <tm@....ma>
To:	Ted Ts'o <tytso@....edu>
CC:	Robin Dong <hao.bigrat@...il.com>, linux-ext4@...r.kernel.org
Subject: Re: [PATCH] ext4: critical info format fix in __ext4_grp_locked_error

Hi Ted,
On 03/22/2011 08:47 AM, Ted Ts'o wrote:
> Applied to the ext4 patch queue.
> 
> On Fri, Mar 18, 2011 at 05:58:03PM +0800, Robin Dong wrote:
>> From: Robin Dong <sanbai@...bao.com>
>>
>> When we do performance testing on an ext4 filesystem, we observe a warning like this:
>>
>> "[ 1684.113205] EXT4-fs error (device sda7): ext4_mb_generate_buddy:718: group 259825901 blocks in bitmap, 26057 in gd"
>>
>> indeed, it should be
>>
>> "group 2598, 25901 blocks in bitmap, 26057 in gd"
> 
> Note that the next two paragraphs don't really belong in a commit
> description.  It's best if you put this kind of stuff after the
> signed-off-by lines, with a "---" separating the commit description, like this:
> 
> Signed-off-by: Ty Coon <tycoon@...il.com>
> ---
> Stuff that explains the context of the patch
> 
> diff --git ....
> 
>> This bug is found on upstream 2.6.36 kernel. We ran a 2.6.36 kernel
>> on the online system with 8 Ext4 file systems. 2 of them are mounted
>> with delayed allocation feature. This warning is only observed on
>> delayed allocation enabled Ext4 file systems.
>>
>> This issue is not easy to reproduce. On two servers with 2.6.36
>> kernel + ext4, after running 110+ days, the error started to appear
>> in the kernel log. When we checked the error log, we found the info
>> format should be fixed; that is how this patch came about.
> 
> Can you send more information about what sort of workloads your
> servers are under, and any other information about how to reproduce
> it?
OK, so let me try to describe the situation here.
This is a web cache server, and we use squid to cache some data. This bug
was found while we were testing the 2.6.36 vanilla kernel. We don't know
for sure how to reproduce it, since it showed up only after the test
server had run for about 100 days. And the bad thing is that the volume
has since been reformatted for another test. :( But we have several
machines here and we are continuing our tests, so if any error happens
again, we promise to report what we find immediately.

btw, when testing the 2.6.32 kernel, we found another error: a dir inode
was corrupted, with an error message like

Mar 16 11:15:28 cache161 kernel: [484403.699588] EXT4-fs error (device
sda5): ext4_lookup: deleted inode referenced: 21496065

This volume is also mounted with delayed allocation.

Do you know of any bug related to this? We haven't checked whether
upstream has a similar bug.

Thanks.

Regards,
Tao
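
For readers following the thread, the formatting problem quoted above
("group 259825901 blocks" instead of "group 2598, 25901 blocks") can be
illustrated in user space. The sketch below is only an assumption about the
shape of the bug, not the kernel source: format_buggy and format_fixed are
hypothetical helpers, and the real __ext4_grp_locked_error builds its message
differently. It shows how two numbers printed without a separator run
together, and how adding ", " restores the intended message.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not kernel code): prints the group number and
 * the bitmap count back to back, so the two values merge into one
 * misleading number in the output.
 */
int format_buggy(char *buf, size_t len, unsigned int group,
                 unsigned int in_bitmap, unsigned int in_gd)
{
    return snprintf(buf, len, "group %u%u blocks in bitmap, %u in gd",
                    group, in_bitmap, in_gd);
}

/*
 * Hypothetical helper (not kernel code): same message, but with a
 * ", " separator after the group number.
 */
int format_fixed(char *buf, size_t len, unsigned int group,
                 unsigned int in_bitmap, unsigned int in_gd)
{
    return snprintf(buf, len, "group %u, %u blocks in bitmap, %u in gd",
                    group, in_bitmap, in_gd);
}
```

Calling both helpers with group 2598, 25901 blocks in the bitmap, and 26057 in
the group descriptor gives the two strings quoted in the patch description.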
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html