Message-ID: <20080812201858.GD6819@linux.vnet.ibm.com>
Date: Tue, 12 Aug 2008 13:18:58 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Jarek Poplawski <jarkao2@...il.com>
Cc: David Miller <davem@...emloft.net>, emil.s.tantilov@...el.com,
jeffrey.t.kirsher@...el.com, netdev@...r.kernel.org
Subject: Re: [BUG] NULL pointer dereference in skb_dequeue
On Tue, Aug 12, 2008 at 08:09:27PM +0200, Jarek Poplawski wrote:
> On Tue, Aug 12, 2008 at 06:42:24AM -0700, Paul E. McKenney wrote:
> > On Tue, Aug 12, 2008 at 06:36:22AM +0000, Jarek Poplawski wrote:
> ...
> > > From net/sched/sch_generic.c:
> > >
> > > void __qdisc_run(struct Qdisc *q)
> > > {
> > > unsigned long start_time = jiffies;
> > >
> > > while (qdisc_restart(q)) {
> > > /*
> > > * Postpone processing if
> > > * 1. another process needs the CPU;
> > > * 2. we've been doing it for too long.
> > > */
> > > if (need_resched() || jiffies != start_time) {
> > > __netif_schedule(q);
> > >
> > > This function is run from dev_queue_xmit() (net/core/dev.c) under
> > > rcu_read_lock_bh(), and this "q" pointer is passed here for later use
> > > (reading) by softirq run net_tx_action(). Alas in net/ RCU primitives
> > > are probably omitted in a few places...
> >
> > If I understand this code, one way to handle it would be to increment
> > q->refcnt before passing to netif_schedule(), then decrementing it
> > (within an RCU read-side critical section) in the softirq handler.
> >
> > There are probably other ways to handle this as well.
>
> I understand this similarly (but I'm still trying to find out what's
> wrong with reading this again in a separate read-side section).
The usual problem with re-reading in a separate read-side critical section
is that someone might have removed and freed the underlying object in the
meantime. Consider the following example:
Task 0:

	rcu_read_lock();
	p = rcu_dereference(global_pointer);
	if (p == NULL) {
		rcu_read_unlock();
		goto somewhere_else;
	}
	do_something_with(p);
	rcu_read_unlock();

	do_some_unrelated_stuff();

	rcu_read_lock();
	do_something_else_with(p); /* BUG!!! */
	rcu_read_unlock();

somewhere_else:

Task 1:

	spin_lock(&mylock);
	p = global_pointer;
	global_pointer = NULL;
	spin_unlock(&mylock);
	synchronize_rcu();
	kfree(p);
Suppose Task 0 picks up global_pointer just before Task 1 NULLs it.
Then Task 1's synchronize_rcu() is within its rights to return as soon
as Task 0 executes its first rcu_read_unlock(). This means that Task
1's kfree(p) might happen before Task 0's do_something_else_with(p),
which could cause general death and destruction.
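
As a sketch of one repair to the Task 0 example above: if the second
section does not need the very same object, it can fetch global_pointer
again under its own rcu_read_lock() rather than carrying p across. The
code below is a userspace mock-up, not kernel code. The rcu_*()
definitions are trivial stand-ins assumed here so that it compiles on
its own, and struct foo, uses, and task0_fixed() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace stand-ins (assumptions for this sketch only). In the kernel
 * these would be the real RCU primitives.
 */
static int rcu_nesting;
static void rcu_read_lock(void) { rcu_nesting++; }
static void rcu_read_unlock(void) { rcu_nesting--; }
#define rcu_dereference(p) (p)

struct foo { int value; };		/* hypothetical payload */
static struct foo *global_pointer;
static int uses;			/* accumulates values seen by readers */

static void do_something_with(struct foo *p)
{
	assert(rcu_nesting > 0);	/* must be inside a read-side section */
	uses += p->value;
}

static void do_something_else_with(struct foo *p)
{
	assert(rcu_nesting > 0);
	uses += p->value;
}

static void do_some_unrelated_stuff(void) { }

/* Returns how many of the two sections found a non-NULL pointer. */
int task0_fixed(void)
{
	struct foo *p;
	int seen = 0;

	rcu_read_lock();
	p = rcu_dereference(global_pointer);
	if (p != NULL) {
		do_something_with(p);
		seen++;
	}
	rcu_read_unlock();
	/* p is dead from here on: a grace period may now complete. */

	do_some_unrelated_stuff();

	rcu_read_lock();
	p = rcu_dereference(global_pointer);	/* fetch again, do not reuse */
	if (p != NULL) {
		do_something_else_with(p);
		seen++;
	}
	rcu_read_unlock();

	return seen;
}
```

This only helps when any currently published object will do. The second
fetch may return NULL or a different object than the first one did.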
> David gave some additional explanations (which BTW don't look to me
> like very "orthodox" RCU) in this thread:
> http://marc.info/?l=linux-netdev&m=121851847805942&w=2
It looks to me like Dave believes that there is in fact a problem:
http://marc.info/?l=linux-netdev&m=121851965707714&w=2
	But if it gets postponed into ksoftirqd... the RCU will pass
	too early.

	I'm still thinking about how to fix this without avoiding RCU
	and without adding new synchronization primitives.
The only change to Dave's comment that I would make is to his first
paragraph:
	But if it gets postponed into ksoftirqd or if the kernel has
	been built with CONFIG_PREEMPT_RCU... the RCU will pass too early.
My thought would be to use a reference count as noted earlier, on the
grounds that postponing to softirq should be relatively rare. But again
I really cannot claim to understand this code.
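
To make the reference-count idea concrete, here is a rough userspace
sketch, assuming C11 atomics in place of the kernel's atomic_t. None of
these names come from the qdisc code: struct obj, obj_get(), obj_put(),
hand_off(), run_deferred(), and the single "pending" slot are all
hypothetical, and the slot is not thread-safe. The point is only the
ordering: take the reference in the first read-side section, drop it in
the deferred handler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct obj {
	atomic_int refcnt;	/* 1 for the published pointer, +1 per hand-off */
	int value;
};

static int freed;		/* number of objects released so far */
static struct obj *pending;	/* stand-in for the deferred handler's queue */

struct obj *obj_new(int value)
{
	struct obj *o = malloc(sizeof(*o));

	if (o == NULL)
		return NULL;
	atomic_init(&o->refcnt, 1);
	o->value = value;
	return o;
}

/* Take a reference unless the count has already dropped to zero. */
static bool obj_get(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the last one out frees the object. */
static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcnt, 1) == 1) {
		free(o);
		freed++;
	}
}

/*
 * Called where the first read-side section would be: pin the object
 * before passing it to deferred context.
 */
bool hand_off(struct obj *o)
{
	if (!obj_get(o))
		return false;
	pending = o;
	return true;
}

/* Called later from deferred context: use the object, then unpin it. */
int run_deferred(void)
{
	struct obj *o = pending;
	int v;

	if (o == NULL)
		return -1;
	pending = NULL;
	v = o->value;		/* safe: our reference keeps o alive */
	obj_put(o);
	return v;
}
```

With this shape the updater would unpublish the pointer, wait for a
grace period, and then call obj_put() instead of freeing directly, so
the object survives until the deferred handler has dropped its
reference.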
Or am I missing something here?
Thanx, Paul
> Thanks,
> Jarek P.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html