Message-ID: <ygfy7f1g1l0.fsf@janus.isnogud.escape.de>
Date: 20 Sep 2007 10:53:15 +0200
From: Urs Thuermann <urs@...ogud.escape.de>
To: Patrick McHardy <kaber@...sh.net>
Cc: netdev@...r.kernel.org, David Miller <davem@...emloft.net>,
Thomas Gleixner <tglx@...utronix.de>,
Oliver Hartkopp <oliver@...tkopp.net>,
Oliver Hartkopp <oliver.hartkopp@...kswagen.de>,
Urs Thuermann <urs@...ogud.escape.de>
Subject: Re: [PATCH 2/7] CAN: Add PF_CAN core module
Hi Patrick,
I have made almost all of the changes to the code that you suggested.
The changes to use the return value of can_rx_register() also fixed a
minor flaw with failing bind() and setsockopt() on raw sockets.
But there are two things left I would like to ask/understand:
Patrick McHardy <kaber@...sh.net> writes:
> > When the module is unloaded it calls can_proto_unregister() which
> > clears the pointer. Do you see a race condition here?
>
> Yes, you do request_module, load the module, get the cp pointer
> from proto_tab, the module is unloaded again. cp points to
> stale memory. Using module references would fix this.
How would I use the module reference counter? Somehow with
try_module_get()? I was thinking of something like
	cp = proto_tab[protocol];
	if (!cp ...)
		return ...;
	if (!try_module_get(cp->prot->owner))
		return ...;
	sk = sk_alloc(...)
	module_put(...);
	return ret;
But here I see two problems:
1. Between the check !cp... and the dereference of cp->prot->owner the
   module could get unloaded and the reference would be invalid. Is
   there some lock I can hold that prevents module unloading? I haven't
   found anything like this in include/linux/module.h.
2. If the module gets unloaded after the first check and
   request_module() but before the call to try_module_get(), the
   socket() syscall will return an error, although module auto
   loading would normally be successful. How can I prevent that?
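
The window in point 1 could go away if the table lookup and the
reference grab happen under one lock that unregistration also takes.
What follows is only a user-space model of that idea, not kernel code:
a pthread mutex stands in for whatever lock would protect proto_tab, a
plain counter stands in for the module use count, and the names
proto_tab_lock, proto_register(), proto_get(), proto_put() and
proto_unregister() are made up for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NPROTO 4	/* hypothetical table size */

struct proto_entry {
	int refcnt;	/* stands in for the module use count */
};

static struct proto_entry *proto_tab[NPROTO];
static pthread_mutex_t proto_tab_lock = PTHREAD_MUTEX_INITIALIZER;

/* Put an entry into the table; fails if the slot is taken. */
int proto_register(int protocol, struct proto_entry *p)
{
	int ret = -1;

	if (protocol < 0 || protocol >= NPROTO)
		return -1;
	pthread_mutex_lock(&proto_tab_lock);
	if (!proto_tab[protocol]) {
		p->refcnt = 0;
		proto_tab[protocol] = p;
		ret = 0;
	}
	pthread_mutex_unlock(&proto_tab_lock);
	return ret;
}

/* Lookup and reference grab are one step under the lock, so the
 * entry cannot be unregistered between the two. */
struct proto_entry *proto_get(int protocol)
{
	struct proto_entry *p;

	if (protocol < 0 || protocol >= NPROTO)
		return NULL;
	pthread_mutex_lock(&proto_tab_lock);
	p = proto_tab[protocol];
	if (p)
		p->refcnt++;
	pthread_mutex_unlock(&proto_tab_lock);
	return p;
}

/* Drop a reference obtained from proto_get(). */
void proto_put(struct proto_entry *p)
{
	pthread_mutex_lock(&proto_tab_lock);
	assert(p->refcnt > 0);
	p->refcnt--;
	pthread_mutex_unlock(&proto_tab_lock);
}

/* Unregistering is refused while references exist; this models
 * an unload that cannot proceed while the module is in use. */
int proto_unregister(int protocol)
{
	int ret = -1;

	if (protocol < 0 || protocol >= NPROTO)
		return -1;
	pthread_mutex_lock(&proto_tab_lock);
	if (proto_tab[protocol] && proto_tab[protocol]->refcnt == 0) {
		proto_tab[protocol] = NULL;
		ret = 0;
	}
	pthread_mutex_unlock(&proto_tab_lock);
	return ret;
}
```

For point 2, a caller that gets NULL back from the lookup could call
request_module() and repeat the lookup once; whether that is sufficient
in the kernel is exactly the open question here.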
> > find_dev_rcv_lists() is called in one place from can_rcv() with RCU
> > lock held, as you write. The other two calls to find_dev_rcv_lists()
> > are from can_rx_register/unregister() functions which change the
> > receive lists. Therefore, we can't only use RCU but need protection
> > against simultaneous writes. We do this with the spin_lock_bh(). The
> > _bh variant, because can_rcv() runs in interrupt and we need to block
> > that. I thought this is pretty standard.
> >
> > I'll check this again tomorrow, but I have put much time in these
> > locking issues already, changed it quite a few times and hoped to have
> > got it right finally.
>
>
> I'm not saying you should use *only* RCU, you need the lock
> for additions/removal of course, but since the receive path
> doesn't take that lock and relies on RCU, you need to use
> the _rcu list walking variant to avoid races with concurrent
> list changes.
I have no objection to adding the _rcu suffix in the code that changes
the receive lists, but I don't see why it's necessary. When I do a
spin_lock_bh() before writing, can't I be sure that there is no
interrupt routine running in parallel while I hold this spinlock? If
so, there is no reader in parallel because the can_rcv() function runs
in a softirq. I'd really like to understand why you think the writers
should also use the _rcu variant. I'm sorry if I am missing something
obvious here, but could you try to explain it to me?
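
One possible reading of Patrick's point: spin_lock_bh() only keeps
softirqs off the CPU that takes the lock, so on SMP a reader in
can_rcv() may still walk the list on another CPU while a writer
inserts. The sketch below is a user-space model of what a publishing
insert has to guarantee in that case, assuming C11 atomics as a
stand-in for the kernel's _rcu list primitives. struct receiver,
rx_list, rx_add_head() and rx_count_matches() are invented names, the
writers' lock is not modeled, and removal (which would also need a
grace period) is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct receiver {
	int can_id;
	struct receiver *_Atomic next;
};

/* Head of the receive list; readers never take a lock. */
static struct receiver *_Atomic rx_list;

/* Writer side. The caller is assumed to hold the lock that
 * serializes writers against each other (not modeled here). */
void rx_add_head(struct receiver *r)
{
	struct receiver *first;

	assert(r != NULL);
	first = atomic_load_explicit(&rx_list, memory_order_relaxed);
	/* Finish initializing the new node before it becomes visible. */
	atomic_store_explicit(&r->next, first, memory_order_relaxed);
	/* Release store: a reader that sees the new head also sees the
	 * node's can_id and next pointer fully written. */
	atomic_store_explicit(&rx_list, r, memory_order_release);
}

/* Reader side: lock-free walk that may run concurrently with
 * rx_add_head() on another CPU. */
int rx_count_matches(int can_id)
{
	int n = 0;
	struct receiver *r;

	r = atomic_load_explicit(&rx_list, memory_order_acquire);
	while (r) {
		if (r->can_id == can_id)
			n++;
		r = atomic_load_explicit(&r->next, memory_order_acquire);
	}
	return n;
}
```

Without the ordering on the final store, a concurrent reader on another
CPU could follow the new head pointer and find a node whose fields are
not yet written.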
urs