Message-ID: <20120913112024.GA24684@fieldses.org>
Date: Thu, 13 Sep 2012 07:20:25 -0400
From: "J. Bruce Fields" <bfields@...ldses.org>
To: OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
Cc: Namjae Jeon <linkinjeon@...il.com>,
"Steven J. Magnani" <steve@...idescorp.com>,
Al Viro <viro@...iv.linux.org.uk>, akpm@...ux-foundation.org,
linux-kernel@...r.kernel.org,
Namjae Jeon <namjae.jeon@...sung.com>,
Ravishankar N <ravi.n1@...sung.com>,
Amit Sahrawat <a.sahrawat@...sung.com>
Subject: Re: [PATCH v2 1/5] fat: allocate persistent inode numbers
On Thu, Sep 13, 2012 at 05:33:02PM +0900, OGAWA Hirofumi wrote:
> Namjae Jeon <linkinjeon@...il.com> writes:
>
> >> I see. So, client can't solve the ESTALE if inode cache was evicted,
> >> right? (without application changes)
> >
> > There can be situation where we may get not only ESTALE but EIO also.
> >
> > For example,
> > -------------------------------
> > fd = open("foo.txt");
> > while (1) {
> >     sleep(1);
> >     write(fd..);
> > }
> > --------------------------------
> >
> > Here "write" may fail when the inode number of "foo.txt" is changed at
> > the server due to cache eviction under memory pressure.
> > When we tried a similar test, we found that "write" is returning "EIO"
> > instead of "ESTALE".
> >
> > ---------------------------------------------------------------------------------------------------------
> > #> ./write_test_dbg bbb 1000 0
> > FILE : bbb, SIZE : 1048576000 , FSYNC : OFF , RECORD_SIZE = 4096
> > 106264 -rwxr-xr-x 1 root 0 0 Jan 1 00:14 bbb
> > write failed after 60080128 bytes:, errno = 5: Input/output error
> > ---------------------------------------------------------------------------------------------------------
> >
> > As we get EIO instead of ESTALE, it may be difficult to decide when to
> > "restart from LOOKUP" in such a situation.
> > Also, as per Bruce's opinion, we cannot avoid ESTALE from an inode
> > number change in the rebooted-server case.
> > The reboot case is worse, as the client may attempt to write to a
> > different file if the NFS handle at the NFS client matches the inode
> > number of some other file at the NFS server.
>
> I see.
>
> >> Grepping around... Documentation/sysctl/vm.txt mentions a
> >> vfs_cache_pressure parameter.
> >> Yeah. And dirty hack will be possible to adjust sb->s_shrink.batch.
> > I am worried that it could lead to an OOM condition on an embedded
> > system (little DRAM, but supporting a large disk such as a 3TB HDD).
> >
> > Please let me know if any issues or queries.
>
> So, now I think stable inode numbers may be useful if there are users of
> them. And I guess that functionality does not collide with -mm. And I
> suppose we can add two modes for the "nfs" option (e.g. nfs=1 and nfs=2).
>
> If nfs=1, works like current -mm, without limiting any operations.
Apologies, I haven't been following the conversation carefully: remind
me what "works like current -mm" means?
--b.
> If nfs=2, try to make stable FH and limit some operations
>
> (option name doesn't matter here.)
>
> Does this work fine?
> --
> OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
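
For reference, the write loop quoted above can be sketched in C roughly as
follows. This is only an illustration, not the write_test_dbg program from
the log: write_records(), classify_write_error() and the write_verdict enum
are hypothetical names, and the 4096-byte record size is taken from the
quoted output. It shows why EIO is awkward for the client: ESTALE says
"redo the LOOKUP", while EIO cannot be told apart from a real I/O error.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

#define RECORD_SIZE 4096	/* matches RECORD_SIZE in the quoted log */

/* Hypothetical verdicts for a failed write() on an NFS-backed file. */
enum write_verdict {
	WRITE_RETRY_LOOKUP,	/* ESTALE: handle is stale, reopen by path */
	WRITE_AMBIGUOUS,	/* EIO: stale handle or genuine I/O error */
	WRITE_OTHER		/* any other errno */
};

enum write_verdict classify_write_error(int err)
{
	switch (err) {
	case ESTALE:
		return WRITE_RETRY_LOOKUP;
	case EIO:
		return WRITE_AMBIGUOUS;
	default:
		return WRITE_OTHER;
	}
}

/*
 * Write `count` records of RECORD_SIZE bytes to fd.
 * Returns the number of bytes written. On failure, stores errno in
 * *err_out and stops; on success *err_out is 0.
 */
long long write_records(int fd, long count, int *err_out)
{
	char buf[RECORD_SIZE];
	long long total = 0;

	memset(buf, 'x', sizeof(buf));
	*err_out = 0;

	for (long i = 0; i < count; i++) {
		size_t done = 0;

		while (done < sizeof(buf)) {
			ssize_t n = write(fd, buf + done, sizeof(buf) - done);

			if (n < 0) {
				if (errno == EINTR)
					continue;	/* interrupted, retry */
				*err_out = errno;
				return total;
			}
			if (n == 0)
				return total;	/* no progress, give up */
			done += (size_t)n;
			total += n;
		}
	}
	return total;
}
```

A caller would check *err_out after write_records() returns and, only when
classify_write_error() yields WRITE_RETRY_LOOKUP, reopen the file by path
and continue. With EIO there is no safe automatic recovery.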