Message-ID: <CAGsJ_4wymvTimJrKoq1=PRmX6BMwKp9pRH62cQ_a06Avms-0XQ@mail.gmail.com>
Date: Tue, 10 Feb 2026 11:06:12 +0800
From: Barry Song <21cnbao@...il.com>
To: Viacheslav Dubeyko <Slava.Dubeyko@....com>
Cc: "linux-mm@...ck.org" <linux-mm@...ck.org>, Pavan Rallabhandi <Pavan.Rallabhandi@....com>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"lsf-pc@...ts.linux-foundation.org" <lsf-pc@...ts.linux-foundation.org>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>
Subject: Re: [LSF/MM/BPF TOPIC] Machine Learning (ML) library in Linux kernel
On Tue, Feb 10, 2026 at 6:07 AM Viacheslav Dubeyko
<Slava.Dubeyko@....com> wrote:
>
> Hi Barry,
>
> On Mon, 2026-02-09 at 18:25 +0800, Barry Song wrote:
> > On Sat, Feb 7, 2026 at 3:40 AM Viacheslav Dubeyko <Slava.Dubeyko@....com> wrote:
> > >
> > > Hello,
> > >
> > [...]
> > >
> > > The continuous learning model can be adopted during training phase.
> > > It implies that kernel subsystem can receive ML model recommendations
> > > even during training phase. ML model proxy on kernel side can estimate
> > > > the current kernel subsystem state, try to apply the ML model
> > > recommendations, and estimate the efficiency of applied recommendations.
> > > Generally speaking, ML model proxy on kernel side can consider several
> > > modes of interaction with ML model recommendations: (1) emergency mode,
> > > (2) learning mode, (3) collaboration mode, (4) recommendation mode.
> > > > The emergency mode is the mode in which the kernel subsystem is in a
> > > > critical state and must work as efficiently as possible without
> > > > involving the ML model recommendations (for example, when the ML model
> > > > recommendations are completely inadequate or the load is very high).
> > > > The learning mode implies that the kernel subsystem can try to apply
> > > > the ML model recommendations for some operations with the goal of
> > > > estimating the maturity of the ML model. Also, the ML model proxy can degrade
> > > > the mode to the learning state if ML model recommendations become inefficient.
> > > > The collaboration mode uses ML recommendations in 50% of operations
> > > > with the goal of bringing the ML model to a mature state.
> > > > And, finally, the ML model proxy can switch the kernel subsystem to
> > > > recommendation mode if the ML model is mature enough and the efficiency of
> > > > applying the ML recommendations is higher than that of human-made algorithms.
> >
> > Hi Slava,
> >
> > Do we have any concrete examples where an ML-based proxy,
> > together with its userspace ML agent, has demonstrated
> > measurable performance improvements over well-designed,
> > human-crafted kernel algorithms?
> >
> > Such examples could be in scheduling, filesystem I/O, or memory
> > reclamation and readahead. I think having a real, data-backed
> > example would be much more helpful for this discussion than
> > reasoning about an abstract framework without a concrete use
> > case.
> >
>
> This patchset [1] is the first step of declaring the ML library API with the
> goal of discussing it. As the next step, I am considering using the ML library
> API to implement two real-life use cases: (1) the GC subsystem of LFS file
> systems (NILFS2, F2FS, SSDFS), and (2) an ML-based DAMON approach. I see multiple
> potential real-life use cases for the ML library. But let me start with these
> two and then we will be able to extend the approach to other use cases. The
> goal of this talk is to hear the opinion of the community and to elaborate the
> proper vision of the ML library architecture.
I’m very interested in your real-world use case.

If you have any early-stage prototype code that demonstrates the full
flow from user space to kernel space, including both the kernel ML proxy
and the user-space ML agent (for example, for filesystem garbage
collection), I’d be glad to take a look if you’re able to share it.

Thanks
Barry
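
For readers following the thread, the four interaction modes described in the quoted proposal (emergency, learning, collaboration, recommendation) can be pictured as a small state selector. The sketch below is not taken from the patchset: every identifier, the statistics structure, and all thresholds are hypothetical, and only the 50% share in collaboration mode comes from the quoted text. The 10% share for learning mode and the 100% share for recommendation mode are assumptions made purely for illustration.

```c
/* Illustrative sketch only: all names and thresholds are hypothetical. */
#include <stdbool.h>

enum ml_proxy_mode {
	ML_MODE_EMERGENCY,      /* no ML recommendations are applied */
	ML_MODE_LEARNING,       /* ML applied to some operations only */
	ML_MODE_COLLABORATION,  /* ML applied to 50% of operations */
	ML_MODE_RECOMMENDATION, /* ML preferred over the baseline */
};

struct ml_proxy_stats {
	unsigned int load_pct;     /* current subsystem load, 0..100 */
	unsigned int maturity_pct; /* estimated model maturity, 0..100 */
	int gain_pct;              /* measured gain of ML over the
				    * human-made algorithm; may be negative */
};

/* Assumed thresholds; the quoted proposal does not specify any. */
#define ML_LOAD_CRITICAL_PCT	90
#define ML_GAIN_INADEQUATE_PCT	(-50)
#define ML_MATURITY_COLLAB_PCT	50
#define ML_MATURITY_FULL_PCT	90

/* Pick the interaction mode from the current statistics. */
static enum ml_proxy_mode ml_proxy_next_mode(const struct ml_proxy_stats *s)
{
	/* Very high load or completely inadequate recommendations. */
	if (s->load_pct >= ML_LOAD_CRITICAL_PCT ||
	    s->gain_pct <= ML_GAIN_INADEQUATE_PCT)
		return ML_MODE_EMERGENCY;

	/* Recommendations became inefficient: degrade to learning. */
	if (s->gain_pct <= 0)
		return ML_MODE_LEARNING;

	if (s->maturity_pct >= ML_MATURITY_FULL_PCT)
		return ML_MODE_RECOMMENDATION;
	if (s->maturity_pct >= ML_MATURITY_COLLAB_PCT)
		return ML_MODE_COLLABORATION;
	return ML_MODE_LEARNING;
}

/* Percentage of operations that follow the ML recommendation. */
static unsigned int ml_proxy_ml_share_pct(enum ml_proxy_mode mode)
{
	switch (mode) {
	case ML_MODE_LEARNING:
		return 10;  /* assumption: "some operations" */
	case ML_MODE_COLLABORATION:
		return 50;  /* from the quoted proposal */
	case ML_MODE_RECOMMENDATION:
		return 100; /* assumption */
	case ML_MODE_EMERGENCY:
	default:
		return 0;
	}
}

/* Deterministically decide whether operation number op_seq uses ML. */
static bool ml_proxy_use_ml(enum ml_proxy_mode mode, unsigned long op_seq)
{
	return (op_seq % 100) < ml_proxy_ml_share_pct(mode);
}
```

A caller would re-evaluate ml_proxy_next_mode() periodically and consult ml_proxy_use_ml() per operation. How maturity and gain would actually be measured is exactly the open question raised in this thread, so the sketch leaves those values as plain inputs.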