Message-ID: <87txv2cog1.fsf@devron.myhome.or.jp>
Date: Thu, 13 Sep 2012 17:33:02 +0900
From: OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
To: Namjae Jeon <linkinjeon@...il.com>
Cc: "J. Bruce Fields" <bfields@...ldses.org>,
"Steven J. Magnani" <steve@...idescorp.com>,
Al Viro <viro@...iv.linux.org.uk>, akpm@...ux-foundation.org,
linux-kernel@...r.kernel.org,
Namjae Jeon <namjae.jeon@...sung.com>,
Ravishankar N <ravi.n1@...sung.com>,
Amit Sahrawat <a.sahrawat@...sung.com>
Subject: Re: [PATCH v2 1/5] fat: allocate persistent inode numbers
Namjae Jeon <linkinjeon@...il.com> writes:
>> I see. So the client can't recover from ESTALE once the inode cache
>> has been evicted, right? (without application changes)
>
> There can be situations where we get not only ESTALE but also EIO.
>
> For example,
> -------------------------------
> fd = open("foo.txt");
> while (1) {
>         sleep(1);
>         write(fd..);
> }
> --------------------------------
>
> Here "write" may fail when the inode number of "foo.txt" changes on
> the server due to cache eviction under memory pressure.
> When we tried a similar test, we found that "write" returns "EIO"
> instead of "ESTALE".
>
> ---------------------------------------------------------------------------------------------------------
> #> ./write_test_dbg bbb 1000 0
> FILE : bbb, SIZE : 1048576000 , FSYNC : OFF , RECORD_SIZE = 4096
> 106264 -rwxr-xr-x 1 root 0 0 Jan 1 00:14 bbb
> write failed after 60080128 bytes:, errno = 5: Input/output error
> ---------------------------------------------------------------------------------------------------------
>
> As we get EIO instead of ESTALE, it may be difficult to decide when to
> "restart from LOOKUP" in such a situation.
> Also, as per Bruce's opinion, we cannot avoid ESTALE caused by an inode
> number change in the rebooted-server case.
> The reboot case is the worst, as the client may write to a different
> file if the NFS handle held by the NFS client matches the inode number
> of some other file on the NFS server.
I see.
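
To make the EIO vs. ESTALE problem concrete, here is a minimal sketch of
how a client-side test program might decide what to do after a failed
write. The names write_fully(), classify_write_errno() and the recovery
enum are hypothetical and are not taken from write_test_dbg or from any
patch in this thread. The only point is that errno alone tells the
application to redo the lookup when it is ESTALE, and tells it nothing
useful when it is EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical recovery actions, for illustration only. */
enum recovery {
	RECOVERY_NONE,		/* write succeeded, nothing to do */
	RECOVERY_REOPEN,	/* ESTALE: reopen by path, i.e. a new LOOKUP */
	RECOVERY_AMBIGUOUS	/* EIO etc.: cause is not visible from errno */
};

/* Map an errno value from a failed write to a recovery action. */
static enum recovery classify_write_errno(int err)
{
	if (err == 0)
		return RECOVERY_NONE;
	if (err == ESTALE)
		return RECOVERY_REOPEN;
	return RECOVERY_AMBIGUOUS;
}

/*
 * Write exactly len bytes to fd, retrying short writes and EINTR.
 * Returns 0 on success, or the errno value of the failing write().
 */
static int write_fully(int fd, const char *buf, size_t len)
{
	assert(buf != NULL);

	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return errno;
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}
```

A caller would do something like
classify_write_errno(write_fully(fd, buf, len)) and reopen the file by
path only for RECOVERY_REOPEN. With the behaviour reported above the
result would be RECOVERY_AMBIGUOUS, so the application could not tell
that a new lookup might help.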
>> Grepping around... Documentation/sysctl/vm.txt mentions a
>> vfs_cache_pressure parameter.
>> Yeah. And a dirty hack of adjusting sb->s_shrink.batch would be possible.
> I am worried that it could lead to an OOM condition on an embedded
> system (little DRAM, but supporting a large 3TB HDD).
>
> Please let me know if you have any issues or queries.
So, now I think stable inode numbers may be useful if there are users of
them. And I guess that functionality does not collide with -mm. And I
suppose we can add two modes for the "nfs" option (e.g. nfs=1 and nfs=2).
If nfs=1, it works like current -mm, without limiting any operations.
If nfs=2, it tries to make the FH stable and limits some operations.
(The option names don't matter here.)
Does this work fine?
--
OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>