Message-ID: <CACzwLxga0scmSg-MQZ2Joo4Z39he27r0UGpUCvKC4iOk12CkaA@mail.gmail.com>
Date: Mon, 28 Jul 2025 21:43:01 +0500
From: Sabyrzhan Tasbolatov <snovitoll@...il.com>
To: Kees Cook <kees@...nel.org>
Cc: "Dr. David Alan Gilbert" <linux@...blig.org>, Sasha Levin <sashal@...nel.org>, workflows@...r.kernel.org,
linux-doc@...r.kernel.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, rostedt@...dmis.org,
konstantin@...uxfoundation.org, corbet@....net, josh@...htriplett.org
Subject: Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
On Mon, Jul 28, 2025 at 11:40 AM Kees Cook <kees@...nel.org> wrote:
>
> On Sun, Jul 27, 2025 at 03:45:42PM +0000, Dr. David Alan Gilbert wrote:
> > When doing qemu dev, I frequently run it in a tmux, and start it with
> > '-nographic' which gets you a single stream with both serial and monitor in it;
> > alternatively you can get one pane with the serial output and one with the
> > monitor, that takes a little more setup;
>
> Yeah, I haven't played with it yet, but I expect I'll need to try several
> approaches and see which the agent can best deal with. It's better with
> non-interactive stuff, so I'm thinking that giving it tooling that will
> run a script at boot or have the image bring up ssh for the agent to run
> individual commands via ssh... it all depends on what the agent can wrap
> its logic around.
FWIW,
If we ask an LLM to produce code, it replies with some prose description
and the code section embedded in the paragraph, so in that pipeline we need
to post-process the LLM output to extract the code. But I believe there's
another way.
We give the MCP agent its role via an instruction and tell it to save the
code output to a designated directory. This should be possible using an MCP
filesystem server with read-write access to that directory, so the generated
git diffs or C code are ready for testing.
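A minimal sketch of that sandbox setup, assuming the reference
@modelcontextprotocol/server-filesystem package is used as the MCP
filesystem server and the output path is arbitrary:

```shell
#!/bin/sh
# Sketch: prepare a sandbox directory for LLM-generated patches. The
# directory path is an assumption; pick whatever fits your tree layout.
CODE_OUT="${CODE_OUT:-$HOME/llm-kernel-out}"
mkdir -p "$CODE_OUT"

# Expose only this directory to the agent, read-write, nothing else
# (reference MCP filesystem server, launched out-of-band):
#   npx -y @modelcontextprotocol/server-filesystem "$CODE_OUT"
echo "MCP sandbox ready at $CODE_OUT"
```

Scoping the server to a single directory keeps the agent from touching the
rest of the tree while still letting it drop diffs where the test side can
pick them up.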
Testing can also be orchestrated by a separate MCP agent instructed to take
the code from the output directory and run QEMU with a specific arch,
config, etc. The code-generator and testing agents can then iterate on the
output between themselves.
There's an MCP agent framework with an "Evaluator-Optimizer" workflow [1]
that iterates on the output until it reaches an EXCELLENT quality rating,
though that criterion seems vague to me.
[1] https://github.com/lastmile-ai/mcp-agent/blob/main/examples/workflows/workflow_evaluator_optimizer/main.py#L57
The downside is that all of this works via LLM APIs, which are not free.
But it is at least an orchestrated way of verifying LLM code generation.
In local development, we could instead extract the LLM's git diff ourselves,
run QEMU via a test script, and evaluate the correctness of the code by
hand. The only cost then is the LLM model itself, if it comes from a
vendor. If the Linux kernel had its own trained, freely downloadable
Ollama-like models, that would be even better.
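The local loop above could be sketched roughly as follows. The file names,
the fenced-diff reply format, and the /sbin/smoke-test init marker are all
assumptions for illustration, not an existing tool:

```shell
#!/bin/sh
# Sketch: extract the unified diff from an LLM reply that wraps code in a
# fenced "diff" block, then apply, build, and boot it under QEMU.

extract_diff() {
    # Keep only the lines between the opening "diff" fence and the
    # closing fence of the reply file given as $1.
    awk '/^```diff$/{d=1; next} /^```$/{d=0} d' "$1"
}

# Assumed usage in the loop (illustrative file names and init script):
#   extract_diff llm-reply.txt > llm.patch
#   git apply --check llm.patch && git apply llm.patch
#   make -j"$(nproc)" bzImage
#   qemu-system-x86_64 -nographic -no-reboot \
#       -kernel arch/x86/boot/bzImage \
#       -append "console=ttyS0 panic=-1 init=/sbin/smoke-test" \
#       | tee boot.log
#   grep -q "SMOKE-TEST: PASS" boot.log
```

-nographic plus a marker line printed by a tiny init script keeps the whole
run non-interactive, which matches what the agent handles best per Kees'
point above.
<imports>
</imports>
<test>
cat > /tmp/llm-reply.txt <<'EOF'
Here is the fix:
```diff
--- a/f.c
+++ b/f.c
@@ -1 +1 @@
-old
+new
```
Done.
EOF
extract_diff /tmp/llm-reply.txt > /tmp/llm.patch
grep -q -- '--- a/f.c' /tmp/llm.patch
grep -q -- '+new' /tmp/llm.patch
! grep -q 'Here is the fix' /tmp/llm.patch
! grep -q '```' /tmp/llm.patch
</test>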
>
> --
> Kees Cook