netdev - Re: [PATCH net-next 06/14] net/tls: Add generic NIC offload infrastructure

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <618e927c-c015-8ec1-d5c7-72298dd9f522@virtuozzo.com>
Date:   Thu, 22 Mar 2018 16:03:46 +0300
From:   Kirill Tkhai <ktkhai@...tuozzo.com>
To:     Boris Pismenny <borisp@...lanox.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        "David S. Miller" <davem@...emloft.net>
Cc:     netdev@...r.kernel.org, Dave Watson <davejwatson@...com>,
        Ilya Lesokhin <ilyal@...lanox.com>,
        Aviad Yehezkel <aviadye@...lanox.com>
Subject: Re: [PATCH net-next 06/14] net/tls: Add generic NIC offload
 infrastructure

On 22.03.2018 15:38, Boris Pismenny wrote:
> ...
>>>>
>>>> Can't we move this check in tls_dev_event() and use it for all types of events?
>>>> Then we avoid duplicate code.
>>>>
>>>
>>> No. Not all events require this check. Also, the result is different for different events.
>>
>> No. You always return NOTIFY_DONE, in case of !(netdev->features & NETIF_F_HW_TLS_TX).
>> See below:
>>
>> static int tls_check_dev_ops(struct net_device *dev)
>> {
>>     if (!dev->tlsdev_ops)
>>         return NOTIFY_BAD;
>>
>>     return NOTIFY_DONE;
>> }
>>
>> static int tls_device_down(struct net_device *netdev)
>> {
>>     struct tls_context *ctx, *tmp;
>>     struct list_head list;
>>     unsigned long flags;
>>
>>     ...
>>     return NOTIFY_DONE;
>> }
>>
>> static int tls_dev_event(struct notifier_block *this, unsigned long event,
>>                   void *ptr)
>> {
>>          struct net_device *dev = netdev_notifier_info_to_dev(ptr);
>>
>>     if (!(netdev->features & NETIF_F_HW_TLS_TX))
>>         return NOTIFY_DONE;
>>            switch (event) {
>>          case NETDEV_REGISTER:
>>          case NETDEV_FEAT_CHANGE:
>>              return tls_check_dev_ops(dev);
>>            case NETDEV_DOWN:
>>              return tls_device_down(dev);
>>          }
>>          return NOTIFY_DONE;
>> }
>>  
> 
> Sure, will fix in V3.
> 
>>>>> +
>>>>> +    /* Request a write lock to block new offload attempts
>>>>> +     */
>>>>> +    percpu_down_write(&device_offload_lock);
>>>>
>>>> What is the reason percpu_rwsem is chosen here? It looks like this primitive
>>>> gives more advantages readers, then plain rwsem does. But it also gives
>>>> disadvantages to writers. It would be good, unless tls_device_down() is called
>>>> with rtnl_lock() held from netdevice notifier. But since netdevice notifier
>>>> are called with rtnl_lock() held, percpu_rwsem will increase the time rtnl_lock()
>>>> is locked.
>>> We use the a rwsem to allow multiple (readers) invocations of tls_set_device_offload, which is triggered by the user (persumably) during the TLS handshake. This might be considered a fast-path.
>>>
>>> However, we must block all calls to tls_set_device_offload while we are processing NETDEV_DOWN events (writer).
>>>
>>> As you've mentioned, the percpu rwsem is more efficient for readers, especially on NUMA systems, where cache-line bouncing occurs during reader acquire and reduces performance.
>>
>> Hm, and who are the readers? It's used from do_tls_setsockopt_tx(), but it doesn't
>> seem to be performance critical. Who else?
>>
> 
> It depends on whether you consider the TLS handshake code as critical.
> The readers are TCP connections processing the CCS message of the TLS handshake. They are providing key material to the kernel to start using Kernel TLS.

The thing is rtnl_lock() is critical for the rest of the system,
while TLS handshake is small subset of actions the system makes.

rtnl_lock() is used just almost everywhere, from netlink messages
to netdev ioctls.

Currently, you even just can't close raw socket without rtnl lock.
So, all of this is big reason to avoid doing rcu waitings under it.

Kirill

>>>>
>>>> Can't we use plain rwsem here instead?
>>>>
>>>
>>> Its a performance tradeoff. I'm not certain that the percpu rwsem write side acquire is significantly worse than using the global rwsem.
>>>
>>> For now, while all of this is experimental, can we agree to focus on the performance of readers? We can change it later if it becomes a problem.
>>
>> Same as above.
>>   
> 
> Replaced with rwsem from V2.