Message-ID: <4D8809C5.1000402@tao.ma>
Date:	Tue, 22 Mar 2011 10:30:29 +0800
From:	Tao Ma <tm@....ma>
To:	Ted Ts'o <tytso@....edu>
CC:	Robin Dong <hao.bigrat@...il.com>, linux-ext4@...r.kernel.org
Subject: Re: [PATCH] ext4: critical info format fix in __ext4_grp_locked_error

Hi Ted,
On 03/22/2011 08:47 AM, Ted Ts'o wrote:
> Applied to the ext4 patch queue.
> 
> On Fri, Mar 18, 2011 at 05:58:03PM +0800, Robin Dong wrote:
>> From: Robin Dong <sanbai@...bao.com>
>>
>> When we do performance testing on an ext4 filesystem, we observe a warning like this:
>>
>> "[ 1684.113205] EXT4-fs error (device sda7): ext4_mb_generate_buddy:718: group 259825901 blocks in bitmap, 26057 in gd"
>>
>> It should instead read:
>>
>> "group 2598, 25901 blocks in bitmap, 26057 in gd"
> 
> Note that the next two paragraphs don't really belong in a commit
> description.  It's best if you put this kind of stuff after the
> signed-off-by lines, with a "---" separating the commit description, like this:
> 
> Signed-off-by: Ty Coon <tycoon@...il.com>
> ---
> Stuff that explains the context of the patch
> 
> diff --git ....
> 
>> This bug was found on the upstream 2.6.36 kernel. We ran a 2.6.36
>> kernel on the online system with 8 ext4 file systems. 2 of them are
>> mounted with the delayed allocation feature. This warning is only
>> observed on ext4 file systems with delayed allocation enabled.
>>
>> This issue is not easy to reproduce. On two servers with a 2.6.36
>> kernel + ext4, the error started to appear in the kernel log after
>> running for 110+ days. When we checked the error log, we found that
>> the message format needed fixing, which is how this patch came about.
> 
> Can you send more information about what sort of workloads your
> servers are under, and any other information about how to reproduce
> it?
OK, let me try to describe the situation here.
This is a web cache server, and we use squid to cache some data. The bug
was found while we were testing the 2.6.36 vanilla kernel. We don't know
for sure how to reproduce it, since it showed up only after the test
server had been running for about 100 days. And the bad news is that the
volume has since been reformatted for another test. :( But we have
several machines here and we are continuing our tests, so if any error
happens again, we promise to report what we find immediately.

By the way, when testing the 2.6.32 kernel, we found another error: a
directory inode was corrupted, and the log contained a message like

Mar 16 11:15:28 cache161 kernel: [484403.699588] EXT4-fs error (device
sda5): ext4_lookup: deleted inode referenced: 21496065

This volume is also mounted with delayed allocation.

Do you know of any bug related to this? We haven't checked whether
upstream has a similar bug.

Thanks.

Regards,
Tao