Message-ID: <CAL+tcoApBVuyra+K4UkL5euzpKbUZkvBToeQNvmaspRu1ywOSA@mail.gmail.com>
Date: Tue, 13 May 2025 11:25:10 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: axboe@...nel.dk, rostedt@...dmis.org, mhiramat@...nel.org,
mathieu.desnoyers@...icios.com, linux-kernel@...r.kernel.org,
linux-block@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
Jason Xing <kernelxing@...cent.com>, Yushan Zhou <katrinzhou@...cent.com>
Subject: Re: [PATCH v1 5/5] relayfs: uniformally use possible cpu iteration
On Tue, May 13, 2025 at 11:21 AM Andrew Morton
<akpm@...ux-foundation.org> wrote:
>
> On Tue, 13 May 2025 10:03:01 +0800 Jason Xing <kerneljasonxing@...il.com> wrote:
>
> > On Tue, May 13, 2025 at 8:52 AM Andrew Morton <akpm@...ux-foundation.org> wrote:
> > >
> > > On Mon, 12 May 2025 10:49:35 +0800 Jason Xing <kerneljasonxing@...il.com> wrote:
> > >
> > > > From: Jason Xing <kernelxing@...cent.com>
> > > >
> > > > Use for_each_possible_cpu to create the per-cpu relayfs files so
> > > > that a cpu hotplugged later is not left without its own file.
> > >
> > > I don't understand this. Exactly what problem are we trying to solve?
> >
> > The reason behind this change is that we can allocate for each
> > possible cpu directly at the initialization phase. After that, even
> > if some cpu comes online later, we don't need to care about it.
> >
> > The idea is directly borrowed from the networking area where people
> > use possible cpu iteration for most cases.
>
> I'd rephrase that as "I know this is wrong but I'm lazy so I'll do it
> this way for now and I'll fix it up later but I never actually do so".
>
> Yes, handling hotplug is a hassle and for_each_possible_cpu() saves a
> couple of hours at development time, but its cost is forever.
>
> > >
> > > > Reviewed-by: Yushan Zhou <katrinzhou@...cent.com>
> > > > Signed-off-by: Jason Xing <kernelxing@...cent.com>
> > > > ---
> > > > kernel/relay.c | 4 ++--
> > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/relay.c b/kernel/relay.c
> > > > index 27f7e701724f..dcb099859e83 100644
> > > > --- a/kernel/relay.c
> > > > +++ b/kernel/relay.c
> > > > @@ -519,7 +519,7 @@ struct rchan *relay_open(const char *base_filename,
> > > > kref_init(&chan->kref);
> > > >
> > > > mutex_lock(&relay_channels_mutex);
> > > > - for_each_online_cpu(i) {
> > > > + for_each_possible_cpu(i) {
> > >
> > > num_possible_cpus() can sometimes greatly exceed num_online_cpus(), so
> > > this is an unfortunate change.
> >
> > Are you worried about wasting too much extra memory in this case?
>
> That, and the cost of iterating over 1024 CPUs instead of 4 CPUs!
>
> > One relevant thing I'd like to share here:
> > To keep the transfer of data between kernel space and user space
> > fast, a per-cpu mechanism is needed, which is how relay works at
> > the moment. It allocates many unnecessarily large memory chunks,
> > especially when the cpu count is large, say, 256. I'm still working
> > on this to see if we can find a good approach that balances
> > performance and memory.
> >
> > > It would be better to implement the
> > > hotplug notifier?
> >
> > Right, but sorry, I'm hesitant to do so because it involves much
> > more work and the corresponding tests.
>
> I did this conversion a few times, a million years ago. It's
> surprisingly easy.
I see. Let me try this :)
Thanks,
Jason
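
P.S. For anyone following along, the conversion Andrew describes
typically uses the cpuhp_setup_state() pattern: keep for_each_online_cpu()
in relay_open() and let a hotplug callback create the buffer when a CPU
comes up later. A rough, non-compilable sketch (callback body elided,
state/name choices illustrative, not the actual patch):

```c
/*
 * Sketch only: create a relay buffer/file for a CPU as it comes online,
 * instead of pre-allocating for every possible CPU.
 */
static int relay_prepare_cpu(unsigned int cpu)
{
	/* create the per-cpu buffer/file for @cpu if it does not exist yet */
	...
	return 0;
}

static int __init relay_hotplug_init(void)
{
	/* CPUHP_BP_PREPARE_DYN requests a dynamically allocated state slot */
	return cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "relay:prepare",
				 relay_prepare_cpu, NULL);
}
```

The teardown callback (the NULL argument above) could free the buffer on
offline, or be left NULL to keep already-created buffers around.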