netdev - Re: [PATCH] dropmon: add ability to detect when hardware dropsrxpackets

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4A0D1BB8.10204@cosmosbay.com>
Date:	Fri, 15 May 2009 09:37:28 +0200
From:	Eric Dumazet <dada1@...mosbay.com>
To:	Neil Horman <nhorman@...driver.com>
CC:	Jiri Pirko <jpirko@...hat.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	netdev@...r.kernel.org, davem@...emloft.net
Subject: Re: [PATCH] dropmon: add ability to detect when hardware	dropsrxpackets

Jiri Pirko a écrit :
> Thu, May 14, 2009 at 07:29:54PM CEST, nhorman@...driver.com wrote:
>> On Thu, May 14, 2009 at 02:44:08PM +0200, Jiri Pirko wrote:
>>>> +
>>>> +		/*
>>>> +		 * Clean the device list
>>>> +		 */
>>>> +		list_for_each_entry_rcu(new_stat, &hw_stats_list, list) {
>>> 		^^^^^^^^^^^^^^^^^^^^^^^
>>> This is meaningless here. Use list_for_each_entry_rcu only under rcu_read_lock.
>>> Also it would be good to use list_for_each_entry_safe here since you're
>>> modifying the list.
>>>
>> The definition of list_for_each_entry_rcu specifically says its safe against
>> list-mutation primitives, so its fine.  Although you are correct, in that its
>> safety is dependent on the protection of rcu_read_lock(), so I'll add that in.
> 
> You are right that list_for_each_entry_rcu is safe against list-mutation
> primitives. But there's no need for this on the update side when you hold a writer
> spinlock. Here I think it's better (and also less confusing) to use ordinary
> list_for_each_entry which in this case must be list_for_each_entry_safe.

Absolutely.

RCU is tricky, and must be well understood. Using the right verbs is really needed
to help everybody read the code and be able to understand it quickly and maintain it
if needed.

In this particular case, the list_for_each_entry_rcu() is a litle bit more
expensive than regular list_for_each_entry(), as it includes additionnal barriers.

Just reading code like this :

+	spin_lock(&trace_state_lock);
+		rcu_read_lock();
+		list_for_each_entry_rcu(new_stat, &hw_stats_list, list) {
+			if (new_stat->dev == NULL) {
+				list_del_rcu(&new_stat->list);
+				call_rcu(&new_stat->rcu, free_dm_hw_stat);
+			}
+		}
+		rcu_read_unlock();
+	spin_unlock(&trace_state_lock);

We *know* something is wrong, as rcu_read_lock() is supposed to guard a readside,
so having both rcu_read_lock() and list_del_rcu() is certainly wrong, we dont have 
to actually understand what is really done by the algorithm.

So following is much better : (but maybe not correct... I let you 
check if list_for_each_entry_safe() would not be better here, since
you delete elements in a list while iterating in it)

Maybe you can add a break; after call_rcu() if you know only one element
can match the "if (new_stat->dev == NULL) " condition, and use normal list_for_each_entry()

+	spin_lock(&trace_state_lock);
+		list_for_each_entry(new_stat, &hw_stats_list, list) {
+			if (new_stat->dev == NULL) {
+				list_del_rcu(&new_stat->list);
+				call_rcu(&new_stat->rcu, free_dm_hw_stat);
+			}
+		}
+	spin_unlock(&trace_state_lock);

Even if the 'wrong' version was not buggy (no crashes or corruption), the 'right' one
is a commonly used construct in kernel that most dev are able to read without asking 
themselves "is is correct or not ? Dont we have something strange here ?"

Neil, you need to read more code playing with RCU to get familiar with it, we all did same
errors in the past :)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html