Message-ID: <alpine.DEB.1.10.0811252148540.32523@alien.or.mcafeemobile.com>
Date: Tue, 25 Nov 2008 22:27:38 -0800 (PST)
From: Davide Libenzi <davidel@...ilserver.org>
To: Tejun Heo <htejun@...il.com>
cc: Oleg Nesterov <oleg@...hat.com>,
Eric Van Hensbergen <ericvh@...il.com>,
Ron Minnich <rminnich@...dia.gov>, Ingo Molnar <mingo@...e.hu>,
Christoph Hellwig <hch@...radead.org>,
Miklos Szeredi <mszeredi@...e.cz>,
Brad Boyer <flar@...andria.com>,
Al Viro <viro@...iv.linux.org.uk>,
Roland McGrath <roland@...hat.com>,
Mauro Carvalho Chehab <mchehab@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] poll: allow f_op->poll to sleep, take#5
On Wed, 26 Nov 2008, Tejun Heo wrote:
> +static int pollwake(wait_queue_t *wait, unsigned mode, int sync, void *key)
> +{
> + struct poll_wqueues *pwq = wait->private;
> + DECLARE_WAITQUEUE(dummy_wait, pwq->polling_task);
> +
> + /*
> + * Wake up functions have full barrier semantics, no need for
> + * barrier here.
> + */
> + pwq->triggered = 1;
> +
> + /*
> + * Perform the default wake up operation using a dummy
> + * waitqueue.
> + *
> + * TODO: This is hacky but there currently is no interface to
> + * pass in @sync. @sync is scheduled to be removed and once
> + * that happens, wake_up_process() can be used directly.
> + */
> + return default_wake_function(&dummy_wait, mode, sync, key);
> +}
> +int poll_schedule_timeout(struct poll_wqueues *pwq, int state,
> + ktime_t *expires, unsigned long slack)
> +{
> + int rc = -EINTR;
> +
> + set_current_state(state);
> + if (!pwq->triggered)
> + rc = schedule_hrtimeout_range(expires, slack, HRTIMER_MODE_ABS);
> + __set_current_state(TASK_RUNNING);
> +
> + /*
> + * Prepare for the next iteration. ->poll() might not have
> + * enough barrier semantics from the second round as waits are
> + * registered only during the first one. Use set_mb().
> + */
> + set_mb(pwq->triggered, 0);
> +
> + return rc;
> +}
> +EXPORT_SYMBOL(poll_schedule_timeout);
Look, pollwake() does:
w1) WR triggered (1)
w2) WMB
w3) WR task->state (RUNNING)
While poll_schedule_timeout() does:
s1) WR task->state (TASK_INTERRUPTIBLE)
s2) MB
s3) RD triggered
s4) IF0 => RD task->state (if !RUNNING -> sleep)
The only risk is that w3 precedes s1, so that we go to sleep even though a
wakeup has been issued. But if w3 is visible, w1 is visible too (there's a
WMB in w2), which means that 'triggered' is visible in s3 (there's an MB in
s2). So we skip the schedule_hrtimeout_range(). So IMO you need no barriers
on 'triggered'.
If you feel you need barriers, do you mind explaining a sequence of events
that makes a barrier-free version break?
- Davide
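
The ordering argument above can be checked mechanically with a small
userspace model. What follows is only a sketch under stated assumptions,
not kernel code: every name in it (struct model, waker_step(),
sleeper_step(), count_lost_wakeups()) is hypothetical, memory is treated as
sequentially consistent, and a missing MB at s2 is approximated by letting
the read in s3 take effect before the write in s1. It enumerates every
interleaving of the waker steps (w1, w3) with the sleeper steps (s1, s3, s4)
and counts the runs that end with the sleeper blocked after the wakeup has
already been delivered.

```c
#include <assert.h>
#include <stdbool.h>

enum { RUNNING, INTERRUPTIBLE };

/* Hypothetical model of the shared state; not a kernel structure. */
struct model {
	int triggered;	/* stands in for pwq->triggered */
	int state;	/* stands in for task->state */
	int seen;	/* value of 'triggered' read at s3 */
	bool asleep;	/* sleeper is blocked in the scheduler */
};

static void waker_step(struct model *m, int i)
{
	if (i == 0) {
		m->triggered = 1;	/* w1 */
	} else {
		m->state = RUNNING;	/* w3; the WMB in w2 keeps w1 first */
		m->asleep = false;	/* a blocked sleeper gets woken up */
	}
}

static void sleeper_step(struct model *m, int i, bool barrier)
{
	/* Assumption: without the MB in s2, s3 may take effect before s1. */
	static const int ordered[3] = { 1, 3, 4 };
	static const int reordered[3] = { 3, 1, 4 };
	int step = barrier ? ordered[i] : reordered[i];

	switch (step) {
	case 1:		/* s1 */
		m->state = INTERRUPTIBLE;
		break;
	case 3:		/* s3 */
		m->seen = m->triggered;
		break;
	case 4:		/* s4: sleep only if not triggered and not RUNNING */
		if (!m->seen && m->state != RUNNING)
			m->asleep = true;
		break;
	}
}

/* Count the interleavings that end with the sleeper still blocked. */
int count_lost_wakeups(bool barrier)
{
	int lost = 0;

	/* Each 5-bit mask with two bits set picks the waker's two slots. */
	for (unsigned mask = 0; mask < 32; mask++) {
		struct model m = { 0, RUNNING, 0, false };
		int bits = 0, w = 0, s = 0;

		for (int slot = 0; slot < 5; slot++)
			bits += (mask >> slot) & 1;
		if (bits != 2)
			continue;

		for (int slot = 0; slot < 5; slot++) {
			if (mask & (1u << slot))
				waker_step(&m, w++);
			else
				sleeper_step(&m, s++, barrier);
		}
		assert(w == 2 && s == 3);
		lost += m.asleep;
	}
	return lost;
}
```

Under these assumptions, count_lost_wakeups(true) finds no lost wakeup,
which matches the reasoning that the MB in s2 together with the WMB in w2
is sufficient. count_lost_wakeups(false) does find a losing order (s3, w1,
w3, s1, s4), which suggests the barrier that matters is the one between s1
and s3, not an extra one on 'triggered'.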