linux-kernel - Re: [RFC PATCH] docs: submitting-patches: (AI?) Tool disclosure tag

Message-ID: <aINvLgwaKZsKOibE@gallifrey>
Date: Fri, 25 Jul 2025 11:49:02 +0000
From: "Dr. David Alan Gilbert" <linux@...blig.org>
To: Laurent Pinchart <laurent.pinchart@...asonboard.com>
Cc: Sasha Levin <sashal@...nel.org>, Kees Cook <kees@...nel.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Konstantin Ryabitsev <konstantin@...uxfoundation.org>,
	corbet@....net, workflows@...r.kernel.org, josh@...htriplett.org,
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] docs: submitting-patches: (AI?) Tool disclosure tag

* Laurent Pinchart (laurent.pinchart@...asonboard.com) wrote:
> On Fri, Jul 25, 2025 at 11:29:17AM +0000, Dr. David Alan Gilbert wrote:
> > * Sasha Levin (sashal@...nel.org) wrote:
> > > On Fri, Jul 25, 2025 at 01:20:59AM +0000, Dr. David Alan Gilbert wrote:
> > > > * Sasha Levin (sashal@...nel.org) wrote:
> > > > > On Thu, Jul 24, 2025 at 04:54:11PM -0700, Kees Cook wrote:
> > > > > > On Thu, Jul 24, 2025 at 07:45:56PM -0400, Steven Rostedt wrote:
> > > > > > > My thought is to treat AI as another developer. If a developer helps you
> > > > > > > like the AI is helping you, would you give that developer credit for that
> > > > > > > work? If so, then you should also give credit to the tooling that's helping
> > > > > > > you.
> > > > > > >
> > > > > > > I suggested adding a new tag to note any tool that has done non-trivial
> > > > > > > work to produce the patch where you give it credit if it has helped you as
> > > > > > > much as another developer that you would give credit to.
> > > > > >
> > > > > > We've got tags to choose from already in that case:
> > > > > >
> > > > > > Suggested-by: LLM
> > > > > >
> > > > > > or
> > > > > >
> > > > > > Co-developed-by: LLM <not@...an.with.legal.standing>
> > > > > > Signed-off-by: LLM <not@...an.with.legal.standing>
> > > > > >
> > > > > > The latter seems ... not good, as it implies DCO SoB from a thing that
> > > > > > can't and hasn't acknowledged the DCO.
> > > > > 
> > > > > In my mind, "any tool" would also be something like gcc giving you a
> > > > > "non-trivial" error (think something like a buffer overflow warning that
> > > > > could have been a security issue).
> > > > > 
> > > > > In that case, should we encode the entire toolchain used for developing
> > > > > a patch?
> > > > > 
> > > > > Maybe...
> > > > > 
> > > > > Some sort of semi-standardized shorthand notation of the tooling used to
> > > > > develop a patch could be interesting not just for plain disclosure, but
> > > > > also to be able to trace back issues with patches ("oh! the author
> > > > > didn't see a warning because they use gcc 13 while the warning was added
> > > > > in gcc 14!").
> > > > > 
> > > > > Signed-off-by: John Doe <jd@...mple.com> # gcc:14.1;ccache:1.2;sparse:4.7;claude-code:0.5
> > > > > 
> > > > > This way some of it could be automated via git hooks and we can recommend
> > > > > a relevant string to add with checkpatch.
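
As a rough illustration of the shorthand proposed above: a hook or script could
split such a suffix into tool/version pairs. This is only a sketch. The
"# name:version;name:version" layout is taken from the example line, while the
function name parse_tool_suffix and the choice to ignore malformed entries are
assumptions, not an agreed format.

```python
def parse_tool_suffix(trailer: str) -> dict[str, str]:
    """Split 'Signed-off-by: ... # name:ver;name:ver' into {name: ver}.

    Hypothetical helper: the suffix format is only a proposal in this thread.
    """
    # Everything after the first '#' is treated as the tooling shorthand.
    _, sep, suffix = trailer.partition("#")
    if not sep:
        return {}  # no shorthand present on this trailer

    tools = {}
    for item in suffix.strip().split(";"):
        name, colon, version = item.strip().partition(":")
        # Skip entries that do not look like 'name:version'.
        if name and colon:
            tools[name] = version
    return tools


line = ("Signed-off-by: John Doe <jd@...mple.com> "
        "# gcc:14.1;ccache:1.2;sparse:4.7;claude-code:0.5")
tools = parse_tool_suffix(line)
print(tools["gcc"])
```

With something like this, the "author used gcc 13" question becomes a lookup
on the parsed dictionary rather than a guess.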
> > > > 
> > > > For me there are two separate things:
> > > >  a) A tool that found a problem
> > > >  b) A tool that wrote a piece of code.
> > > > 
> > > > > I think the cases you're referring to are all (a), whereas I'm mostly
> > > > thinking here about (b).
> > > > In the case of (a) it's normally _one_ of those tools that found it,
> > > > e.g. I see some:
> > > >   Found by gcc -fanalyzer
> > > 
> > > I think that the line between (a) and (b) gets very blurry very fast, so
> > > I'd rather stay out of trying to define it.
> > > 
> > > Running "cargo clippy" on some code might generate a warning as follows:
> > > 
> > > warning: variables can be used directly in the `format!` string
> > >   --> dyad/src/kernel/sha_processing.rs:20:13
> > >    |
> > > 20 |             debug!("git sha {} could not be validated, attempting a second way...", git_sha);
> > >    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >    |
> > >    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
> > >    = note: `#[warn(clippy::uninlined_format_args)]` on by default
> > > help: change this to
> > >    |
> > > 20 -             debug!("git sha {} could not be validated, attempting a second way...", git_sha);
> > > 20 +             debug!("git sha {git_sha} could not be validated, attempting a second way...");
> > > 
> > > As you see, it proposes a fix at the bottom. Should I attribute "cargo
> > > clippy" in my commit message as it wrote some code?
> > > 
> > > Would your answer change if I run "cargo clippy --fix" which would
> > > > automatically apply the fix on its own?
> > > 
> > > We'll be hitting these issues all over the place if we try and draw a
> > > > line... For example, with more advanced autocompletion: where would you
> > > draw the line between completing variable names and writing an entire
> > > function based on a comment I've made?
> > 
> > Fuzzy isn't it!
> > 
> > There's at least 3 levels as I see it:
> >   1) Reported-by:
> >     That's a lot of tools, that generate an error or warning.
> >   2) Suggested-by:
> >     That covers your example above (hmm including --fix ????)
> >   3) Co-authored-by:
> >     Where a tool wrote code based on your more abstract instructions
> > 
> > (1) & (2) are taking some existing code and finding errors or light
> > improvements;  I don't think it matters whether the tool is a good
> > old chunk of C or an LLM that's doing it, but how much it's originating.
> 
> Except from a copyright point of view. The situation is quite clear for
> deterministic code generation, it's less so for LLMs.

As long as you acknowledge the use of the LLM in all cases, it seems right to
me to say to what degree you used it (i.e. the 1..3 above).
I think even most people worried about copyright issues would worry
less if an LLM had just told you about a problem (1) and you fixed it.
(Although obviously IANAL)
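
To make the 1..3 split concrete, here is a minimal sketch of how the degree of
tool involvement could map onto the tag names listed above. The names
Involvement, TAG_FOR_LEVEL and tool_trailer are hypothetical, and the mapping
mirrors the three levels in this thread rather than any settled policy.

```python
from enum import Enum


class Involvement(Enum):
    """How much the tool contributed (the three levels from this thread)."""
    REPORTED = 1      # tool generated an error or warning
    SUGGESTED = 2     # tool proposed a light fix to existing code
    CO_AUTHORED = 3   # tool wrote code from more abstract instructions


# Tag names as written in the list above; this is an illustration only.
TAG_FOR_LEVEL = {
    Involvement.REPORTED: "Reported-by",
    Involvement.SUGGESTED: "Suggested-by",
    Involvement.CO_AUTHORED: "Co-authored-by",
}


def tool_trailer(level: Involvement, tool: str) -> str:
    """Build a commit trailer line crediting a tool at the given level."""
    return f"{TAG_FOR_LEVEL[level]}: {tool}"


print(tool_trailer(Involvement.REPORTED, "gcc -fanalyzer"))
print(tool_trailer(Involvement.SUGGESTED, "cargo clippy"))
```

The point is that the same trailer shape works whether the tool is a good old
chunk of C or an LLM; only the level changes.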

Dave

> > (Now I'm leaning more towards Kees's style of using existing tags
> > if we could define a way to do it cleanly).
> 
> -- 
> Regards,
> 
> Laurent Pinchart
> 
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/