Message-ID: <5154F67A.8020401@tao.ma>
Date: Fri, 29 Mar 2013 10:03:38 +0800
From: Tao Ma <tm@....ma>
To: Zach Brown <zab@...hat.com>
CC: linux-ext4@...r.kernel.org
Subject: Re: [PATCH 2/2] ext4: Handle readdir when a file is converted from
inline to block based.
On 03/29/2013 02:44 AM, Zach Brown wrote:
>> Zach reported that if a dir is inlined, the offset is within the inode, while
>> if we have done the conversion, the dir now will have a block offset or even
>> a hashed pos. The good thing is that ext4 is also prepared to handle some
>> situation that the dir is changed during many calls of getdents.
>
> This doesn't fix the problem. The problem isn't using the right code
> path within ext4 for either inline or normal block directories.
>
> The problem is that offsets for existing files change. Yeah, ext4 also
> has this problem when it converts from classic linear dirents to hashed
> dirents, but I bet that basically doesn't happen any more. Inline dirs
> are making the problem happen for every single directory as it grows.
Thanks for the explanation. I just looked deeper into the problem and yes,
the code is really tricky for an old linear dir. It now also goes through
ext4_dx_readdir, so the situation you described doesn't happen there.
Maybe I will also need to treat an inline dir as if it were hashed, like the
normal linear dir, and return the hash value as the pos.
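To make the idea concrete, here is a toy model, not ext4 code. Every name
in it (toy_dir, toy_hash, toy_readdir_linear, toy_readdir_hashed,
toy_count_repeats) is made up for illustration, FNV-1a merely stands in for
the real directory hash, and hash collisions are ignored. It reads a few
entries from one layout, swaps in a reordered layout of the same directory
(roughly what a conversion does to the offsets), and counts how many names
come back twice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy directory: one "layout" is just an ordered array of names. */
struct toy_dir {
	const char **names;
	size_t count;
};

/* FNV-1a, an assumed stand-in for whatever hash the filesystem uses. */
uint32_t toy_hash(const char *s)
{
	uint32_t h = 2166136261u;

	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return h;
}

/* Offset cursor: pos is an index into whatever layout is current. */
const char *toy_readdir_linear(const struct toy_dir *d, uint64_t *pos)
{
	if (*pos >= d->count)
		return NULL;
	return d->names[(*pos)++];
}

/*
 * Hash cursor: return the entry with the smallest hash >= *pos and
 * move *pos just past it. The result depends only on the names, not
 * on where they sit in the layout. Collisions are not handled here.
 */
const char *toy_readdir_hashed(const struct toy_dir *d, uint64_t *pos)
{
	const char *best = NULL;
	uint64_t best_hash = 0;

	for (size_t i = 0; i < d->count; i++) {
		uint64_t h = toy_hash(d->names[i]);

		if (h >= *pos && (best == NULL || h < best_hash)) {
			best = d->names[i];
			best_hash = h;
		}
	}
	if (best != NULL)
		*pos = best_hash + 1;
	return best;
}

typedef const char *(*toy_readdir_fn)(const struct toy_dir *, uint64_t *);

/*
 * Read up to `before` entries from layout a, keep the same pos, drain
 * layout b, and return how many times an already seen name came back.
 */
int toy_count_repeats(toy_readdir_fn next, const struct toy_dir *a,
		      const struct toy_dir *b, size_t before)
{
	const char *seen[16];
	size_t nseen = 0;
	uint64_t pos = 0;
	int repeats = 0;
	const char *name;

	assert(before <= 16);
	while (nseen < before && (name = next(a, &pos)) != NULL)
		seen[nseen++] = name;
	while (nseen < 16 && (name = next(b, &pos)) != NULL) {
		for (size_t i = 0; i < nseen; i++)
			if (strcmp(seen[i], name) == 0)
				repeats++;
		seen[nseen++] = name;
	}
	return repeats;
}
```

With layouts {a, b, c} and {c, b, a, d} and two entries read before the
switch, the linear cursor returns "a" twice and never returns "c", while
the hashed cursor returns no name twice.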
Thanks,
Tao
>
> There's two ways to experience the bug:
>
> 1) nfs clients getting the wrong entry because the offset has changed
> from the time that they got it from the server
>
> 2) more worryingly: a concurrent readdir() can see duplicate entries
> from simply advancing f_pos as it does normally
>
> Here's a quick little demonstration of the second case:
>
> d_off: 2 d_name: ., f_pos 2
> d_off: 4 d_name: .., f_pos 4
> d_off: 16 d_name: a, f_pos 16
> d_off: 28 d_name: b, f_pos 28
> d_off: 40 d_name: c, f_pos 40
> d_off: 371778706554281332 d_name: .., f_pos 18446744071750344052
> d_off: 1068979911240654558 d_name: b, f_pos 18446744072795659998
> d_off: 6187216788877381273 d_name: c, f_pos 1633586841
> d_off: 6280769109141524706 d_name: e, f_pos 1386254562
>
> Run the following in a newly created empty dir with inline_data:
>
> #include <stdlib.h>
> #include <dirent.h>
> #include <stdio.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <errno.h>
> #include <string.h>
> #include <sys/syscall.h>
>
> struct linux_dirent {
> 	long d_ino;
> 	off_t d_off;
> 	unsigned short d_reclen;
> 	char d_name[];
> };
>
> int main(int argc, char **argv)
> {
> 	struct linux_dirent dent;
> 	char name[2] = {0,};
> 	int i;
> 	int ret;
> 	int fd;
>
> 	fd = open(".", O_RDONLY | O_DIRECTORY);
> 	if (fd < 0) {
> 		printf("open(\".\", O_RDONLY|O_DIRECTORY) failed: %u (%s)\n",
> 		       errno, strerror(errno));
> 		exit(1);
> 	}
>
> 	for (i = 0; i < 26; i++) {
> 		name[0] = 'a' + i;
> 		mknod(name, S_IFREG|0755, 0);
> 		ret = syscall(SYS_getdents, fd, &dent, sizeof(dent));
> 		if (ret < 1)
> 			break;
> 		printf("d_off: %llu d_name: %s, f_pos %llu\n",
> 		       (unsigned long long)dent.d_off,
> 		       dent.d_name,
> 		       (unsigned long long)lseek(fd, 0, SEEK_CUR));
> 	}
>
> 	return 0;
> }
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>