Date:   Tue, 31 Jan 2017 19:51:45 +0100
From:   Nikolay Aleksandrov <nikolay@...ulusnetworks.com>
To:     Stephen Hemminger <stephen@...workplumber.org>
Cc:     netdev@...r.kernel.org, roopa@...ulusnetworks.com,
        davem@...emloft.net
Subject: Re: [PATCH RFC net-next 0/4] bridge: improve cache utilization

On 31/01/17 19:45, Nikolay Aleksandrov wrote:
> On 31/01/17 19:21, Stephen Hemminger wrote:
>> On Tue, 31 Jan 2017 19:09:09 +0100
>> Nikolay Aleksandrov <nikolay@...ulusnetworks.com> wrote:
>>
>>> On 31/01/17 17:41, Nikolay Aleksandrov wrote:
>>>>>
>>>>> I agree with the first 3 patches, but not the last one.
>>>>> Changing the API just for a performance hack is not necessary. Instead make
>>>>> the algorithm smarter and use per-cpu values.
>>>>>  
>>>>
>>>> Thanks for the feedback, I would very much prefer any of the other two approaches
>>>> I tried (per-cpu pool and per-cpu for each fdb), from the two the second one -
>>>> per-cpu for each fdb is much simpler, so would it be acceptable to do per-cpu allocation
>>>> for each fdb ?
>>>>
>>>>
>>>>   
>>>
>>> Okay, after some more testing of the version with per-cpu per-fdb allocations, at 300 000 fdb entries
>>> I got 120 failed per-cpu allocs, which seems okay. I'll wait a little longer and then repost the series
>>> with per-cpu allocations and without the RFC tag.
>>>
>>> Thanks,
>>>  Nik
>>>
>>
>> You could also use a mark/sweep algorithm (rather than recording updated).
>> It turns out that clearing is fast (can be unlocked).
>> The timer workqueue can mark all fdb entries (during scan), then in forward
>> function clear the bit if it is set. This would turn writes into reads.
> 
> The wq doesn't have a strict next call, it floats depending on the soonest
> expiry. This can cause issues as we don't know when we last reset the bit, and
> using the scan interval resolution will result in big offsets when purging entries.
> 
>>
>> To keep the API for last used, just change the resolution to be scan interval.
>>
> 
> With the default 300 second resolution? People will be angry. :-)
> Also this has to happen for both "updated" and "used", they're both causing trouble.
> In fact "used" is much worse than "updated", because it's written to by all who transmit
> to that fdb.
> 
> Actually to start we can do something much simpler - just always update "used" at most
> once per 1/10 of ageing_time for example. The default case would give us an update every
> 30 seconds if the fdb is actually used or we can cap it at 10 seconds.
> The "updated" we move to its own cache line and with proper config (bind ports to CPUs)
> it will be fine.
> 

Actually this is a no-go: there are already users out there who depend on the high resolution
of the "used" field, so we cannot break them. We're back to either an option or per-cpu.
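
For illustration only, the capped "used" update that is being ruled out here could look
roughly like the userspace sketch below. The names struct fdb_sketch, fdb_touch_used and
min_gap are made up for this example and are not the bridge code; time is a plain tick
counter standing in for jiffies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an fdb entry; not the kernel's struct. */
struct fdb_sketch {
	uint64_t used;	/* last recorded use, in ticks */
};

/*
 * Hypothetical helper: store 'now' only if at least 'min_gap' ticks have
 * passed since the stored value. Returns true when a write happened.
 */
static bool fdb_touch_used(struct fdb_sketch *f, uint64_t now, uint64_t min_gap)
{
	if (now - f->used < min_gap)
		return false;	/* read-only path, no store to the shared line */

	f->used = now;
	return true;
}
```

With min_gap set to 30 seconds' worth of ticks, most transmits only read the field, but a
reader of "used" can then see a value that is almost 30 seconds stale, and that is the
resolution loss existing users would notice.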

> What do you think ?
> 
> 
