lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Wed, 16 Dec 2020 00:23:07 +0100 From: Daniel Borkmann <daniel@...earbox.net> To: Andrii Nakryiko <andrii.nakryiko@...il.com>, Cong Wang <xiyou.wangcong@...il.com> Cc: Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>, Cong Wang <cong.wang@...edance.com>, Alexei Starovoitov <ast@...nel.org>, Dongdong Wang <wangdongdong.6@...edance.com> Subject: Re: [Patch bpf-next v2 2/5] bpf: introduce timeout map On 12/15/20 11:03 PM, Andrii Nakryiko wrote: > On Tue, Dec 15, 2020 at 12:06 PM Cong Wang <xiyou.wangcong@...il.com> wrote: >> >> On Tue, Dec 15, 2020 at 11:27 AM Andrii Nakryiko >> <andrii.nakryiko@...il.com> wrote: >>> >>> On Mon, Dec 14, 2020 at 12:17 PM Cong Wang <xiyou.wangcong@...il.com> wrote: >>>> >>>> From: Cong Wang <cong.wang@...edance.com> >>>> >>>> This borrows the idea from conntrack and will be used for conntrack in >>>> bpf too. Each element in a timeout map has a user-specified timeout >>>> in secs, after it expires it will be automatically removed from the map. [...] >>>> char key[] __aligned(8); >>>> }; >>>> >>>> @@ -143,6 +151,7 @@ static void htab_init_buckets(struct bpf_htab *htab) >>>> >>>> for (i = 0; i < htab->n_buckets; i++) { >>>> INIT_HLIST_NULLS_HEAD(&htab->buckets[i].head, i); >>>> + atomic_set(&htab->buckets[i].pending, 0); >>>> if (htab_use_raw_lock(htab)) { >>>> raw_spin_lock_init(&htab->buckets[i].raw_lock); >>>> lockdep_set_class(&htab->buckets[i].raw_lock, >>>> @@ -431,6 +440,14 @@ static int htab_map_alloc_check(union bpf_attr *attr) >>>> return 0; >>>> } >>>> >>>> +static void htab_sched_gc(struct bpf_htab *htab, struct bucket *b) >>>> +{ >>>> + if (atomic_fetch_or(1, &b->pending)) >>>> + return; >>>> + llist_add(&b->gc_node, &htab->gc_list); >>>> + queue_work(system_unbound_wq, &htab->gc_work); >>>> +} >>> >>> I'm concerned about each bucket being scheduled individually... And >>> similarly concerned that each instance of TIMEOUT_HASH will do its own >>> scheduling independently. Can you think about the way to have a >>> "global" gc/purging logic, and just make sure that buckets that need >>> processing would be just internally chained together. So the purging >>> routing would iterate all the scheduled hashmaps, and within each it >>> will have a linked list of buckets that need processing? And all that >>> is done just once each GC period. Not N times for N maps or N*M times >>> for N maps with M buckets in each. >> >> Our internal discussion went to the opposite actually, people here argued >> one work is not sufficient for a hashtable because there would be millions >> of entries (max_entries, which is also number of buckets). ;) > > I was hoping that it's possible to expire elements without iterating > the entire hash table every single time, only items that need to be > processed. Hashed timing wheel is one way to do something like this, > kernel has to solve similar problems with timeouts as well, why not > taking inspiration there? Couldn't this map be coupled with LRU map for example through flag on map creation so that the different LRU map flavors can be used with it? For BPF CT use case we do rely on LRU map to purge 'inactive' entries once full. I wonder if for that case you then still need to schedule a GC at all.. e.g. if you hit the condition time_after_eq64(now, entry->expires) you'd just re-link the expired element from the public htab to e.g. the LRU's local CPU's free/pending-list instead. Thanks, Daniel
Powered by blists - more mailing lists