Message-Id: <20241113062147.1444772-1-gnaaman@drivenets.com>
Date: Wed, 13 Nov 2024 06:21:47 +0000
From: Gilad Naaman <gnaaman@...venets.com>
To: pabeni@...hat.com
Cc: davem@...emloft.net,
	dsahern@...nel.org,
	edumazet@...gle.com,
	gnaaman@...venets.com,
	horms@...nel.org,
	kuba@...nel.org,
	kuniyu@...zon.com,
	netdev@...r.kernel.org,
	vadim.fedorenko@...ux.dev
Subject: Re: [PATCH net-next v2] Avoid traversing addrconf hash on ifdown

>On 11/11/24 13:07, Vadim Fedorenko wrote:
>> On 11/11/2024 05:21, Gilad Naaman wrote:
>>>> On 10/11/2024 06:53, Gilad Naaman wrote:
>>>>>>> -           spin_unlock_bh(&net->ipv6.addrconf_hash_lock);
>>>>>>> +   list_for_each_entry(ifa, &idev->addr_list, if_list) {
>>>>>>> +           addrconf_del_dad_work(ifa);
>>>>>>> +
>>>>>>> +           /* combined flag + permanent flag decide if
>>>>>>> +            * address is retained on a down event
>>>>>>> +            */
>>>>>>> +           if (!keep_addr ||
>>>>>>> +               !(ifa->flags & IFA_F_PERMANENT) ||
>>>>>>> +               addr_is_local(&ifa->addr))
>>>>>>> +                   hlist_del_init_rcu(&ifa->addr_lst);
>>>>>>>      }
>>>>>>>
>>>>>>> +   spin_unlock(&net->ipv6.addrconf_hash_lock);
>>>>>>> +   read_unlock_bh(&idev->lock);
>>>>>>
>>>>>> Why is this read lock needed here? spinlock addrconf_hash_lock will
>>>>>> block any RCU grace period to happen, so we can safely traverse
>>>>>> idev->addr_list with list_for_each_entry_rcu()...
>>>>>
>>>>> Oh, sorry, I didn't realize the hash lock encompasses this one;
>>>>> although it seems obvious in retrospect.
>>>>>
>>>>>>> +
>>>>>>>      write_lock_bh(&idev->lock);
>>>>>>
>>>>>> if we are trying to protect idev->addr_list against addition, then we
>>>>>> have to extend write_lock scope. Otherwise it may happen that another
>>>>>> thread will grab write lock between read_unlock and write_lock.
>>>>>>
>>>>>> Am I missing something?
>>>>>
>>>>> I wanted to ensure that access to `idev->addr_list` is performed under lock,
>>>>> the same way it is done immediately afterwards;
>>>>> No particular reason not to extend the existing lock, I just didn't think
>>>>> about it.
>>>>>
>>>>> For what it's worth, the original code didn't have this protection either,
>>>>> since the another thread could have grabbed the lock between
>>>>> `spin_unlock_bh(&net->ipv6.addrconf_hash_lock);` of the last loop iteration,
>>>>> and the `write_lock`.
>>>>>
>>>>> Should I extend the write_lock upwards, or just leave it off?
>>>>
>>>> Well, you are doing write manipulation with the list, which is protected
>>>> by read-write lock. I would expect this lock to be held in write mode.
>>>> And you have to protect hash map at the same time. So yes, write_lock
>>>> and spin_lock altogether, I believe.
>>>>
>>>
>>> Note that within the changed lines, the list itself is only iterated-on,
>>> not manipulated.
>>> The changes are to the `addr_lst` list, which is the hashtable, not the
>>> list this lock protects.
>>>
>>> I'll send v3 with the write-lock extended.
>>> Thank you!
>> 
>> Reading it one more time, I'm not quite sure that locking hashmap
>> spinlock under idev->lock in write mode is a good idea... We have to
>> think more about it, maybe ask for another opinion. Looks like RTNL
>> should protect idev->addr_list from modification while idev->lock is
>> more about changes to idev, not only about addr_list.
>> 
>> @Eric could you please shed some light on the locking schema here?
>
>AFAICS idev->addr_list is (write) protected by write_lock(idev->lock),
>while net->ipv6.inet6_addr_lst is protected by
>spin_lock_bh(&net->ipv6.addrconf_hash_lock).
>
>Extending the write_lock() scope will create a lock dependency between
>the hashtable lock and the list lock, which in turn could cause more
>problem in the future.
>
>Note that idev->addr_list locking looks a bit fuzzy, as is traversed in
>several places under the RCU lock only. I suggest finish the conversion
>of idev->addr_list to RCU and do this additional traversal under RCU, too.

Sure, no problem.

I've looked over the usage of ->addr_list in this file and there are about four
places where I'm certain I can replace idev->lock with RCU:

 - dev_forward_change
 - inet6_addr_del
 - addrconf_dad_run
 - addrconf_disable_policy_idev

As for the rest, I'd like to run them by you before submitting a patch, if
that's okay:

 - ipv6_link_dev_addr:
   Modifies list directly under write-lock.

 - __ipv6_get_lladdr & ipv6_inherit_eui64 & ipv6_lonely_lladdr: Traverse in
   reverse. According to my (admittedly limited) understanding, reverse
   traversal is not possible under RCU.

 - addrconf_permanent_addr: Not sure if this can be RCU'd, as there's no
   variant that is both _rcu and _safe.
   If it were safe to keep iterating with just `_rcu`, I'm not sure why
   `_safe` was needed in the first place.

 - addrconf_ifdown & inet6_set_iftoken:
   Seems like write-lock is taken anyway and regardless of the iteration,
   so I'm not sure it would benefit from introducing RCU.

 - check_cleanup_prefix_route:
   I'm conflicted about this one.
   When called from ipv6_del_addr(), the write lock is taken anyway.
   When called from inet6_addr_modify(), the write lock is also taken,
   although a read lock would have been enough.

   Should this be RCU'd as well?
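
To make the reverse-traversal concern above concrete, here is a small
userspace model (not kernel code) of what I understand list_del_rcu() to do:
it leaves ->next alone so that a concurrent reader can keep walking forward,
but poisons ->prev. All names below (struct node, the *_model helpers,
POISON_PREV) are made up for illustration, and there is no real concurrency
or memory ordering here, only the pointer bookkeeping.

```c
#include <stddef.h>

/* Userspace model only: imitates the pointer updates of an RCU-style
 * delete, where ->next stays valid for readers but ->prev is poisoned.
 */
struct node {
	struct node *next, *prev;
	int val;
};

/* Hypothetical stand-in for the kernel's LIST_POISON2 marker. */
#define POISON_PREV ((struct node *)0x122)

static void list_init_model(struct node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail_model(struct node *n, struct node *head)
{
	n->next = head;
	n->prev = head->prev;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink 'n' from its neighbours, keep n->next, poison n->prev. */
static void list_del_rcu_model(struct node *n)
{
	n->next->prev = n->prev;
	n->prev->next = n->next;
	n->prev = POISON_PREV;
}

/* A reader parked on 'pos' (possibly just unlinked) walks forward and
 * counts the entries that follow it, up to the list head.
 */
static int count_forward_from(const struct node *pos, const struct node *head)
{
	int n = 0;

	for (pos = pos->next; pos != head; pos = pos->next)
		n++;
	return n;
}

/* A reverse walker would need pos->prev, which is gone after delete. */
static int prev_is_usable(const struct node *pos)
{
	return pos->prev != POISON_PREV;
}
```

With entries a, b, c on a list and b deleted, a reader still holding b can
reach c through b.next, but b.prev is no longer something it may follow. That
is why I believe the three reverse iterators cannot simply be switched to RCU.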

>Cheers,
>
>Paolo

Cheers

