netdev - Re: [PATCH RFC net-next 0/4] bridge: improve cache utilization


Message-ID: <74c4d638-e957-9c54-2e7e-9103cd430a6c@cumulusnetworks.com>
Date:   Tue, 31 Jan 2017 19:51:45 +0100
From:   Nikolay Aleksandrov <nikolay@...ulusnetworks.com>
To:     Stephen Hemminger <stephen@...workplumber.org>
Cc:     netdev@...r.kernel.org, roopa@...ulusnetworks.com,
        davem@...emloft.net
Subject: Re: [PATCH RFC net-next 0/4] bridge: improve cache utilization

On 31/01/17 19:45, Nikolay Aleksandrov wrote:
> On 31/01/17 19:21, Stephen Hemminger wrote:
>> On Tue, 31 Jan 2017 19:09:09 +0100
>> Nikolay Aleksandrov <nikolay@...ulusnetworks.com> wrote:
>>
>>> On 31/01/17 17:41, Nikolay Aleksandrov wrote:
>>>>>
>>>>> I agree with the first 3 patches, but not the last one.
>>>>> Changing the API just for a performance hack is not necessary. Instead make
>>>>> the algorithm smarter and use per-cpu values.
>>>>>  
>>>>
>>>> Thanks for the feedback, I would very much prefer any of the other two approaches
>>>> I tried (per-cpu pool and per-cpu for each fdb), from the two the second one -
>>>> per-cpu for each fdb is much simpler, so would it be acceptable to do per-cpu allocation
>>>> for each fdb ?
>>>>
>>>>
>>>>   
>>>
>>> Okay, after some more testing the version with per-cpu per-fdb allocations, at 300 000 fdb entries
>>> I got 120 failed per-cpu allocs which seems okay. I'll wait a little more and will repost the series
>>> with per-cpu allocations and without the RFC tag.
>>>
>>> Thanks,
>>>  Nik
>>>
>>
>> You could also use a mark/sweep algorithm (rather than recording updated).
>> It turns out that clearing is fast (can be unlocked).
>> The timer workqueue can mark all fdb entries (during scan), then in forward
>> function clear the bit if it is set. This would turn writes into reads.
> 
> The wq doesn't have a strict next call, it is floating depending on the soonest
> expire, this can cause issues as we don't know when last we've reset the bit and
> using the scan interval resolution will result in big offsets when purging entries.
> 
>>
>> To keep the API for last used, just change the resolution to be scan interval.
>>
> 
> With default 300 second resolution ? People will be angry. :-)
> Also this has to happen for both "updated" and "used", they're both causing trouble.
> In fact "used" is much worse than "updated", because it's written to by all who transmit
> to that fdb.
> 
> Actually to start we can do something much simpler - just always update "used" at most
> once per 1/10 of ageing_time for example. The default case would give us an update every
> 30 seconds if the fdb is actually used or we can cap it at 10 seconds.
> The "updated" we move to its own cache line and with proper config (bind ports to CPUs)
> it will be fine.
> 

Actually this is a no-go: there are already users out there who depend on the high resolution
of the "used" field, so we cannot break them. We're back to either an option or per-cpu.
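
To make the resolution problem concrete, here is a minimal user-space sketch of the
"update used at most once per interval" idea quoted above. The struct and function
names (fdb_sketch, fdb_touch_used) are hypothetical and are not the bridge code; time
is a plain tick counter rather than jiffies.

```c
/* Hypothetical stand-in for an fdb entry; NOT the kernel's struct. */
struct fdb_sketch {
    unsigned long used;   /* last recorded use, in ticks */
};

/*
 * Record a use at time "now", but only write to the entry if at least
 * min_interval ticks have passed since the last recorded value.
 * Returns 1 if the field was written, 0 if the write was skipped.
 */
static int fdb_touch_used(struct fdb_sketch *f, unsigned long now,
                          unsigned long min_interval)
{
    if (now - f->used < min_interval)
        return 0;       /* read-only path: cache line is not dirtied */
    f->used = now;
    return 1;
}
```

With an ageing_time of 300 and an interval of ageing_time / 10 = 30, an entry that is
hit on every tick from 1 to 100 is written only at ticks 30, 60 and 90. A reader at
tick 100 then sees used == 90, which is exactly the loss of resolution that existing
users of the field would notice.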

> What do you think ?
> 
>
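
For reference, the mark/sweep suggestion quoted earlier in the thread could look
roughly like the sketch below. This is a plain user-space illustration under
assumptions: the names (fdb_ms, idle_mark, fdb_forward_touch, fdb_scan) are
hypothetical, and a real implementation would presumably use atomic bit operations
instead of a plain bool.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an fdb entry; NOT the kernel's struct. */
struct fdb_ms {
    bool idle_mark;   /* set by the periodic scan, cleared on use */
    bool expired;     /* set once the scan decides the entry is stale */
};

/*
 * Forward path: read the mark and write only if it is set, so an entry
 * that is used many times between scans is written at most once.
 * Returns true if a write happened.
 */
static bool fdb_forward_touch(struct fdb_ms *f)
{
    if (!f->idle_mark)
        return false;
    f->idle_mark = false;
    return true;
}

/*
 * Scan: an entry still marked since the previous scan saw no traffic
 * and is expired; every other live entry is marked for the next round.
 * Returns the number of entries expired by this scan.
 */
static size_t fdb_scan(struct fdb_ms *tbl, size_t n)
{
    size_t expired = 0;

    for (size_t i = 0; i < n; i++) {
        if (tbl[i].expired)
            continue;
        if (tbl[i].idle_mark) {
            tbl[i].expired = true;
            expired++;
        } else {
            tbl[i].idle_mark = true;
        }
    }
    return expired;
}
```

Note that an entry's age is only known to within one scan period here, which is the
concern raised in the reply above: the scan does not run at a fixed interval, so the
purge time can drift by a large offset.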