lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7ia4wm23y9vl.fsf@castle.c.googlers.com>
Date: Tue, 30 Dec 2025 20:47:58 +0000
From: Roman Gushchin <roman.gushchin@...ux.dev>
To: Shakeel Butt <shakeel.butt@...ux.dev>
Cc: Qi Zheng <qi.zheng@...ux.dev>,  hannes@...xchg.org,  hughd@...gle.com,
  mhocko@...e.com,  muchun.song@...ux.dev,  david@...nel.org,
  lorenzo.stoakes@...cle.com,  ziy@...dia.com,  harry.yoo@...cle.com,
  imran.f.khan@...cle.com,  kamalesh.babulal@...cle.com,
  axelrasmussen@...gle.com,  yuanchu@...gle.com,  weixugc@...gle.com,
  chenridong@...weicloud.com,  mkoutny@...e.com,
  akpm@...ux-foundation.org,  hamzamahfooz@...ux.microsoft.com,
  apais@...ux.microsoft.com,  lance.yang@...ux.dev,  linux-mm@...ck.org,
  linux-kernel@...r.kernel.org,  cgroups@...r.kernel.org,  Qi Zheng
 <zhengqi.arch@...edance.com>
Subject: Re: [PATCH v2 00/28] Eliminate Dying Memory Cgroup

Shakeel Butt <shakeel.butt@...ux.dev> writes:

> On Mon, Dec 29, 2025 at 08:11:22PM -0800, Roman Gushchin wrote:
>> Shakeel Butt <shakeel.butt@...ux.dev> writes:
>> 
>> > On Tue, Dec 30, 2025 at 01:36:12AM +0000, Roman Gushchin wrote:
>> >> Qi Zheng <qi.zheng@...ux.dev> writes:
>> >> 
>> >> Hey!
>> >> 
>> >> I ran this patchset through AI review and it found few regression (which
>> >> can of course be false positives). When you'll have time, can you,
>> >> please, take a look and comment on which are real and which are not?
>> >> 
>> >> Thank you!
>> >
>> > Hi Roman, this is really good. I assume this is Gemini model. I see BPF
>> > and networking folks have automated the AI review process which is
>> > really good. I think MM should also adopt that model. Are you looking
>> > into automating MM review bot?
>> 
>> Yes, absolutely. We're working on it, hopefully in January/February we
>> can have something reasonably solid.
>
> Can you share a bit more about the plan? Are you working more on the
> infra side of things or also iterating on the prompts? (I assume you are
> using Chris Mason's review prompts)

Mostly on the infra side. I want to look closer into mm and cgroups
patches, but I need a bit more data and infra to do this.

As of now we're building some basic infra to run it on scale.

> On the infra side, one thing I would love to have is to get early
> feedback/review on my patches before posting on the list. Will it be
> possible to support such a scenario?

Absolutely.
You can do it already using a local setup, assuming you have an access
to some decent LLM. In my experience, a standard entry-level pro
subscription which is like $20/month these days is good enough to review
a number of patchsets per day. This works well for reviewing personal
patches and/or some targeted upstream reviews (like this one), but for
covering entire subsystems we need more infra and more token budgets
(and this is what we're building).

Btw, I mastered a pre-configured environment which saves time on setting
things up: https://github.com/rgushchin/kengp .
I've been testing it with Gemini CLI, but it should work with other
similar tools with some minimal changes (and I'd love to accept patches
adding this support).

Actually I think there will be several rounds of ai-reviews:
1) Developers can review their patches while working on the code and
before sending anything upstream
2) AI bots can send feedback to proposed patchset, as it works for bpf
currently
3) Maintainers and/or ci systems will run ai reviews to ensure the
quality of changes (e.g. eventually we can re-review mm-next nightly
to make sure it's not creating new regressions with other changes in
linux-next).

> On the prompt side, what kind of experiments are you doing to reduce the
> false positives? I wonder if we can comeup with some recommendations
> which help maintainers to describe relevant prompts for their sub-area
> and be helpful for AI reviews.

I like Chris's approach here: let's look into specific examples of
false positives and missed bugs and tailor our prompt systems to handle
these cases correctly. The smarter LLMs are, the fewer tricks we really
need, so it's better to keep prompts minimal. We don't really need to
write AI-specific long texts about subsystems, but sometimes codify some
non-obvious design principles and limitations.

> On the automation side, I assume we will start with some manual process.
> I definitely see back-and-forth to improve prompts for MM and someone
> needs to manually review the results generated by AI and may have to
> update prompts for better results. Also we can start with opt-in
> approach i.e. someone adds a tag to the subject for AI to review the
> series.

Yeah, there are some things to figure out on how to make sure we're not
creating a lot of noise and not repeating the same feedback if it's not
useful.

Thanks!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ