Message-ID: <4e43078f-a41e-4953-9ee9-de579bd92914@redhat.com>
Date: Mon, 27 Jan 2025 10:41:13 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: John Ousterhout <ouster@...stanford.edu>
Cc: netdev@...r.kernel.org, edumazet@...gle.com, horms@...nel.org,
 kuba@...nel.org
Subject: Re: [PATCH net-next v6 04/12] net: homa: create homa_pool.h and
 homa_pool.c

On 1/25/25 12:53 AM, John Ousterhout wrote:
> On Thu, Jan 23, 2025 at 4:06 AM Paolo Abeni <pabeni@...hat.com> wrote:
> ...
>>> +     pool->descriptors = kmalloc_array(pool->num_bpages,
>>> +                                       sizeof(struct homa_bpage),
>>> +                                       GFP_ATOMIC);
>>
>> Possibly worth adding '| __GFP_ZERO' to avoid zeroing some fields later.
> 
> I prefer to do all the initialization explicitly (this makes it
> totally clear that a zero value is intended, as opposed to accidental
> omission of an initializer). If you still think I should use
> __GFP_ZERO, let me know and I'll add it.

Indeed the __GFP_ZERO flag is preferred for such allocations, as it at
the very least reduces the generated code size.
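
For illustration, a minimal sketch of the allocation with the flag
folded in (field names taken from the quoted patch, error handling only
as a hypothetical example):

	pool->descriptors = kmalloc_array(pool->num_bpages,
					  sizeof(struct homa_bpage),
					  GFP_ATOMIC | __GFP_ZERO);
	if (!pool->descriptors)
		return -ENOMEM;
	/* All homa_bpage fields start zeroed; only fields with
	 * non-zero initial values need explicit assignment below.
	 */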

>>> +int homa_pool_get_pages(struct homa_pool *pool, int num_pages, __u32 *pages,
>>> +                     int set_owner)
>>> +{
>>> +     int core_num = raw_smp_processor_id();
>>
>> Why the 'raw' variant? If this code is pre-emptible it means another
>> process could be scheduled on the same core...
> 
> My understanding is that raw_smp_processor_id is faster.
> homa_pool_get_pages is invoked with a spinlock held, so there is no
> risk of a core switch while it is executing. Is there some other
> problem I have missed?

raw_* variants, like __* ones, fall under the 'use at your own risk'
category.

In this specific case raw_smp_processor_id() is meant to be used when
you don't care about the process being moved to another core while the
'id' value is still in use.

With raw_smp_processor_id(), a build with the CONFIG_DEBUG_PREEMPT knob
enabled will lack the run-time check that preemption is actually
disabled at invocation time. That check is emitted when
smp_processor_id() is used, with no performance cost for non-debug
builds.
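
As a rough sketch of the calling pattern (the lock name is only
illustrative, not taken from the posted patch), the checked variant is
safe here precisely because the caller already holds a spinlock:

	/* Hypothetical caller sketch. */
	spin_lock_bh(&pool->lock);	/* preemption disabled while held */
	core = smp_processor_id();	/* with CONFIG_DEBUG_PREEMPT this
					 * verifies preemption is really off;
					 * otherwise it compiles to the same
					 * code as raw_smp_processor_id()
					 */
	/* ... use 'core' to pick a candidate bpage ... */
	spin_unlock_bh(&pool->lock);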

>>> +struct homa_bpage {
>>> +     union {
>>> +             /**
>>> +              * @cache_line: Ensures that each homa_bpage object
>>> +              * is exactly one cache line long.
>>> +              */
>>> +             char cache_line[L1_CACHE_BYTES];
>>> +             struct {
>>> +                     /** @lock: to synchronize shared access. */
>>> +                     spinlock_t lock;
>>> +
>>> +                     /**
>>> +                      * @refs: Counts number of distinct uses of this
>>> +                      * bpage (1 tick for each message that is using
>>> +                      * this page, plus an additional tick if the @owner
>>> +                      * field is set).
>>> +                      */
>>> +                     atomic_t refs;
>>> +
>>> +                     /**
>>> +                      * @owner: kernel core that currently owns this page
>>> +                      * (< 0 if none).
>>> +                      */
>>> +                     int owner;
>>> +
>>> +                     /**
>>> +                      * @expiration: time (in sched_clock() units) after
>>> +                      * which it's OK to steal this page from its current
>>> +                      * owner (if @refs is 1).
>>> +                      */
>>> +                     __u64 expiration;
>>> +             };
>>
>> ____cacheline_aligned instead of inserting the struct into a union
>> should suffice.
> 
> Done (but now that alloc_percpu_gfp is being used I'm not sure this is
> needed to ensure alignment?).

Yep, cacheline alignment should not be needed for percpu data.
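
For reference, a rough sketch of what the simplified layout might look
like without the union wrapper (field semantics copied from the quoted
kernel-doc; whether the alignment attribute is still wanted depends on
how the objects end up being allocated):

	struct homa_bpage {
		spinlock_t lock;	/* synchronizes shared access */
		atomic_t refs;		/* distinct uses of this bpage */
		int owner;		/* owning core, < 0 if none */
		__u64 expiration;	/* sched_clock() time after which the
					 * page may be stolen from its owner
					 * (if @refs is 1)
					 */
	} ____cacheline_aligned;	/* could be dropped for percpu-allocated
					 * data, as noted above
					 */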

/P
