[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1221745514.3550.83.camel@frecb007923.frec.bull.fr>
Date: Thu, 18 Sep 2008 15:45:14 +0200
From: Frédéric Bohé <frederic.bohe@...l.net>
To: "linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH v2] ext4: fix initialization of UNINIT bitmap blocks
I am not very confident with buffer's behavior, but I think I have
understood what happens. Correct me if I am wrong.
Let's see the second test case, which was :
dd if=/dev/urandom of=/dev/md0 bs=1M count=20
mkfs.ext4 -t ext4dev /dev/md0 10M
mount -t ext4dev /dev/md0 /mnt/test
resize2fs /dev/md0 20M
for i in $(seq 1 3800); do touch /mnt/test/file${i} 2>&1; done
touch: cannot touch `/mnt/test/file3188': No space left on device
touch: cannot touch `/mnt/test/file3189': No space left on device
...
...
The issue here is that you can't use all inode of the second group of
the fs.
This happens because resize2fs make a call to ext2fs_read_bitmaps. This
function reads all bitmaps while paying attention not to read the
uninited bitmap. This works well as long as the fs block size is equal
to the page size. But in the above test case, the fs use 1k blocks and
we have an issue.
That's because the "read" function issued by ext2fs_read_bitmaps is a
call to kernel's block_read_full_page function. So when a single bitmap
block is asked for, 4 blocks (for 1k blocks fs on x86) are actually read
(including the uninited ones) and their respective buffer set to
uptodate.
As we rely on the buffer's uptodate flags to initialize or not this
buffer, it may happen that certain bitmap blocks are not initialized at
all. So their buffer contains the random garbage that was present on the
disk prior to the mkfs ( In the above test case, the inode bitmap of the
second group is full a random bits so I can't use all of its inodes ).
If the bitmap block corresponding to this buffer is later changed, its
UNINIT flag will be cleared and the content of the buffer written to the
disk, including the garbage.
I am a bit lost on how to fix this. Aneesh was right, I think it's an
ext2fs_read_bitmaps bug, not a kernel bug. I guess we need a userland
function to read a single block whatever the block size and page size
are. I've made a try using O_DIRECT flag but I was unsuccessful. Any
ideas/suggestions ?
Fred
Le lundi 15 septembre 2008 à 16:30 +0200, Frédéric Bohé a écrit :
> Le lundi 15 septembre 2008 à 19:06 +0530, Aneesh Kumar K.V a écrit :
> > On Mon, Sep 15, 2008 at 02:16:47PM +0200, Frédéric Bohé wrote:
> > > From: Frederic Bohe <frederic.bohe@...l.net>
> > >
> > > Do not rely on buffer head's uptodate flag to initialize
> > > uninitialized bitmap blocks.
> > >
> > > Signed-off-by: Frederic Bohe <frederic.bohe@...l.net>
> > > ---
> > > Sorry there was a copy/paste error in the previous mail !
> > >
> > > This patch makes sure to initialize uninited bitmap blocks.
> > > These are two test cases where bugs appear because of uninited blocks :
> > >
> > > 1- This test case lead to uninited block bitmap and an error message
> > > from the mballocator during the second dd.
> > >
> > > dd if=/dev/urandom of=/dev/md0 bs=1M count=300
> > > mkfs.ext4 -t ext4dev /dev/md0 1G
> > > mount -t ext4dev /dev/md0 /mnt/test
> > > resize2fs /dev/md0 2G
> > > dd if=/dev/zero of=/mnt/test/dummy bs=1M count=1500
> > >
> > > Note that the first dd is to make sure we have random garbage in the
> > > uninited blocks. If not, you could miss the issue depending what was in
> > > those blocks before running mkfs.
> > >
> > > 2- This test case lead to uninited inode bitmap blocks, making it
> > > impossible to use all the inodes of the fs.
> > >
> > > dd if=/dev/urandom of=/dev/md0 bs=1M count=20
> > > mkfs.ext4 -t ext4dev /dev/md0 10M
> > > mount -t ext4dev /dev/md0 /mnt/test
> > > resize2fs /dev/md0 20M
> > > for i in $(seq 1 3800); do touch /mnt/test/file${i} 2>&1; done
> > >
> > > balloc.c | 4 +++-
> > > ialloc.c | 4 +++-
> > > mballoc.c | 4 +++-
> > > 3 files changed, 9 insertions(+), 3 deletions(-)
> > >
> > > Index: linux-2.6.27-rc5+patch_queue/fs/ext4/balloc.c
> > > ===================================================================
> > > --- linux-2.6.27-rc5+patch_queue.orig/fs/ext4/balloc.c 2008-09-15 10:59:27.000000000 +0200
> > > +++ linux-2.6.27-rc5+patch_queue/fs/ext4/balloc.c 2008-09-15 14:03:04.000000000 +0200
> > > @@ -318,9 +318,11 @@ ext4_read_block_bitmap(struct super_bloc
> > > block_group, bitmap_blk);
> > > return NULL;
> > > }
> > > - if (bh_uptodate_or_lock(bh))
> > > + if (buffer_uptodate(bh) &&
> > > + !(desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)))
> > > return bh;
> > >
> > > + lock_buffer(bh);
> > > spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
> > > if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
> > > ext4_init_block_bitmap(sb, bh, block_group, desc);
> >
> > Why ? I guess resize should mark those buffer_heads as not uptodate so
> > that we do a reinit of block bitmap again later. The above change will
> > result in calling ext4_init_block_bitmap everytime we do a
> > read_block_bitmap on an uninit group
>
> Thanks for your comment Aneesh. I thought ext4_init_block_bitmap was
> setting the EXT4_BG_BLOCK_UNINIT flags, but it seems it is not true.
> I will try to fix it on the resize side.
>
>
> >
> >
> >
> >
> > > Index: linux-2.6.27-rc5+patch_queue/fs/ext4/ialloc.c
> > > ===================================================================
> > > --- linux-2.6.27-rc5+patch_queue.orig/fs/ext4/ialloc.c 2008-09-15 10:59:27.000000000 +0200
> > > +++ linux-2.6.27-rc5+patch_queue/fs/ext4/ialloc.c 2008-09-15 11:12:16.000000000 +0200
> > > @@ -115,9 +115,11 @@ ext4_read_inode_bitmap(struct super_bloc
> > > block_group, bitmap_blk);
> > > return NULL;
> > > }
> > > - if (bh_uptodate_or_lock(bh))
> > > + if (buffer_uptodate(bh) &&
> > > + !(desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)))
> > > return bh;
> > >
> > > + lock_buffer(bh);
> > > spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
> > > if (desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)) {
> > > ext4_init_inode_bitmap(sb, bh, block_group, desc);
> > > Index: linux-2.6.27-rc5+patch_queue/fs/ext4/mballoc.c
> > > ===================================================================
> > > --- linux-2.6.27-rc5+patch_queue.orig/fs/ext4/mballoc.c 2008-09-15 10:59:27.000000000 +0200
> > > +++ linux-2.6.27-rc5+patch_queue/fs/ext4/mballoc.c 2008-09-15 14:02:44.000000000 +0200
> > > @@ -785,9 +785,11 @@ static int ext4_mb_init_cache(struct pag
> > > if (bh[i] == NULL)
> > > goto out;
> > >
> > > - if (bh_uptodate_or_lock(bh[i]))
> > > + if (buffer_uptodate(bh[i]) &&
> > > + !(desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)))
> > > continue;
> > >
> > > + lock_buffer(bh[i]);
> > > spin_lock(sb_bgl_lock(EXT4_SB(sb), first_group + i));
> > > if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
> > > ext4_init_block_bitmap(sb, bh[i],
> > >
> >
> > -aneesh
> >
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists