Message-ID: <499938F2.7050301@panasas.com>
Date: Mon, 16 Feb 2009 11:59:14 +0200
From: Boaz Harrosh <bharrosh@...asas.com>
To: Evgeniy Polyakov <zbr@...emap.net>
CC: Avishay Traeger <avishay@...il.com>, Jeff Garzik <jeff@...zik.org>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
open-osd <osd-dev@...n-osd.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
James Bottomley <James.Bottomley@...senPartnership.com>
Subject: Re: [PATCH 6/8] exofs: super_operations and file_system_type
Evgeniy Polyakov wrote:
> Hi.
>
Hi
> On Mon, Feb 09, 2009 at 03:25:53PM +0200, Boaz Harrosh (bharrosh@...asas.com) wrote:
>> + case Opt_to:
>> + if (match_int(&args[0], &option))
>> + return -EINVAL;
>> + if (option <= 0) {
>> + EXOFS_ERR("Timeout must be > 0");
>> + return -EINVAL;
>> + }
>> + opts->timeout = option * HZ;
>
> Is it intentional to have a different timeout on systems with different HZ
> but the same mount option?
>
Doesn't "option * HZ" mean that "option" is in seconds?
Anyway, I would not bother; it is an undocumented option for debugging only.
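To illustrate the arithmetic, here is a minimal userspace sketch. It assumes
hypothetical helpers secs_to_ticks() and ticks_to_secs(), and passes the tick
rate as a parameter (in the kernel HZ is a compile-time constant). It is not
exofs code; it only shows that a value multiplied by the tick rate represents
the same number of seconds whatever that rate is.

```c
#include <assert.h>

/* Hypothetical helper: convert a timeout in seconds to ticks ("jiffies")
 * for a given tick rate. Mirrors the "option * HZ" expression. */
static long secs_to_ticks(long secs, long hz)
{
    assert(secs > 0 && hz > 0);
    return secs * hz;
}

/* Hypothetical inverse: wall-clock seconds represented by a tick count. */
static long ticks_to_secs(long ticks, long hz)
{
    assert(hz > 0);
    return ticks / hz;
}
```

The tick count differs between a 100 Hz and a 1000 Hz build, but converting
it back with the same rate gives the mount option's value in seconds again.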
>> +static struct inode *exofs_alloc_inode(struct super_block *sb)
>> +{
>> + struct exofs_i_info *oi;
>> +
>> + oi = kmem_cache_alloc(exofs_inode_cachep, GFP_KERNEL);
>
> I'm curious if this should be GFP_NOFS or not?
>
Currently none of the OSD transports (Shhhh ... including the osd initiator
library) are SWAP safe.
I will revisit all that far in the future, when I need SWAPyness.
>> + if (!oi)
>> + return NULL;
>> +
>> + oi->vfs_inode.i_version = 1;
>> + return &oi->vfs_inode;
>> +}
>
>> +static void exofs_put_super(struct super_block *sb)
>> +{
>> + int num_pend;
>> + struct exofs_sb_info *sbi = sb->s_fs_info;
>> +
>> + /* make sure there are no pending commands */
>> + for (num_pend = atomic_read(&sbi->s_curr_pending); num_pend > 0;
>> + num_pend = atomic_read(&sbi->s_curr_pending)) {
>
> This raises a question. Let's check exofs_new_inode() for example (it is
> a bad example, since an inode cannot be created when we are already in the
> put_super() callback, but still there are others), it increments
> s_curr_pending way after inode was created, so is it possible that
> some in-flight callback is about to be executed and its subsequent
> s_curr_pending manipulation will not be detected by this loop?
>
> Should s_curr_pending increment be audited all over the code to be
> increased before the potential postponing command starts (which is not
> the case in exofs_new_inode() above)?
>
I have experimented with this a bit in the past and did not find any
problems. To the best of my understanding, I'm somehow protected by the VFS.
If the FS is busy, i.e. file handles are open, it will refuse an unmount. Once
all handles are closed, it will remove visibility from the FS and only then
unmount. So the loop above will only wait for old commands. Actually I put
prints in there and I never once got a nonzero count of pending commands.
It is hard to test because it is hard to hit the time slot after all file
handles are closed, but with heavy pending IO, and before the umount kicks
in.
Also note that I've decided to fsync on file close, so I think most/all IO
should be finished by the time a file is closed.
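The ordering asked about in the question can be sketched in userspace C,
assuming C11 atomics stand in for the kernel's atomic_t. The names sb_info,
cmd_submit(), cmd_done() and can_teardown() are hypothetical and are not part
of exofs. The only point is that the counter is raised before the command is
handed off, so a drain check cannot observe zero while a command is in flight.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-superblock info. */
struct sb_info {
    atomic_int curr_pending;   /* commands submitted but not yet completed */
};

/* Account for the command first, then hand it to the transport. */
static void cmd_submit(struct sb_info *sbi)
{
    atomic_fetch_add(&sbi->curr_pending, 1);
    /* ... an asynchronous submit would go here (omitted) ... */
}

/* Completion callback: drop the count taken in cmd_submit(). */
static void cmd_done(struct sb_info *sbi)
{
    int prev = atomic_fetch_sub(&sbi->curr_pending, 1);
    assert(prev > 0);          /* a completion without a submit is a bug */
}

/* Teardown may proceed only when nothing is pending. */
static bool can_teardown(struct sb_info *sbi)
{
    return atomic_load(&sbi->curr_pending) == 0;
}
```

If the increment happened only after the hand-off, a check made in between
would see zero even though a completion was still due.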
>> + wait_queue_head_t wq;
>> + init_waitqueue_head(&wq);
>> + wait_event_timeout(wq,
>> + (atomic_read(&sbi->s_curr_pending) == 0),
>> + msecs_to_jiffies(100));
>> + }
>> +
>> + osduld_put_device(sbi->s_dev);
>> + kfree(sb->s_fs_info);
>> + sb->s_fs_info = NULL;
>> +}
>
Thanks, I appreciate your comments; they make me think.
Boaz