linux-kernel - Re: [RFC PATCH] docs: submitting-patches: (AI?) Tool disclosure tag

Message-ID: <aIQTT6Gaf1WhxRX7@gallifrey>
Date: Fri, 25 Jul 2025 23:29:19 +0000
From: "Dr. David Alan Gilbert" <linux@...blig.org>
To: Sasha Levin <sashal@...nel.org>
Cc: Kees Cook <kees@...nel.org>, Steven Rostedt <rostedt@...dmis.org>,
	Konstantin Ryabitsev <konstantin@...uxfoundation.org>,
	corbet@....net, workflows@...r.kernel.org, josh@...htriplett.org,
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] docs: submitting-patches: (AI?) Tool disclosure tag

* Sasha Levin (sashal@...nel.org) wrote:
> On Fri, Jul 25, 2025 at 11:29:17AM +0000, Dr. David Alan Gilbert wrote:
> > * Sasha Levin (sashal@...nel.org) wrote:
> > > On Fri, Jul 25, 2025 at 01:20:59AM +0000, Dr. David Alan Gilbert wrote:
> > > > * Sasha Levin (sashal@...nel.org) wrote:
> > > > > On Thu, Jul 24, 2025 at 04:54:11PM -0700, Kees Cook wrote:
> > > > > > On Thu, Jul 24, 2025 at 07:45:56PM -0400, Steven Rostedt wrote:
> > > > > > > My thought is to treat AI as another developer. If a developer helps you
> > > > > > > like the AI is helping you, would you give that developer credit for that
> > > > > > > work? If so, then you should also give credit to the tooling that's helping
> > > > > > > you.
> > > > > > >
> > > > > > > I suggested adding a new tag to note any tool that has done non-trivial
> > > > > > > work to produce the patch where you give it credit if it has helped you as
> > > > > > > much as another developer that you would give credit to.
> > > > > >
> > > > > > We've got tags to choose from already in that case:
> > > > > >
> > > > > > Suggested-by: LLM
> > > > > >
> > > > > > or
> > > > > >
> > > > > > Co-developed-by: LLM <not@...an.with.legal.standing>
> > > > > > Signed-off-by: LLM <not@...an.with.legal.standing>
> > > > > >
> > > > > > The latter seems ... not good, as it implies DCO SoB from a thing that
> > > > > > can't and hasn't acknowledged the DCO.
> > > > >
> > > > > In my mind, "any tool" would also be something like gcc giving you a
> > > > > "non-trivial" error (think something like a buffer overflow warning that
> > > > > could have been a security issue).
> > > > >
> > > > > In that case, should we encode the entire toolchain used for developing
> > > > > a patch?
> > > > >
> > > > > Maybe...
> > > > >
> > > > > Some sort of semi-standardized shorthand notation of the tooling used to
> > > > > develop a patch could be interesting not just for plain disclosure, but
> > > > > also to be able to trace back issues with patches ("oh! the author
> > > > > didn't see a warning because they use gcc 13 while the warning was added
> > > > > in gcc 14!").
> > > > >
> > > > > Signed-off-by: John Doe <jd@...mple.com> # gcc:14.1;ccache:1.2;sparse:4.7;claude-code:0.5
> > > > >
> > > > > This way some of it could be automated via git hooks and we can recommend
> > > > > a relevant string to add with checkpatch.
> > > >
> > > > For me there are two separate things:
> > > >  a) A tool that found a problem
> > > >  b) A tool that wrote a piece of code.
> > > >
> > > > I think the cases you're referring to are all (a), whereas I'm mostly
> > > > thinking here about (b).
> > > > In the case of (a) it's normally _one_ of those tools that found it,
> > > > e.g. I see some:
> > > >   Found by gcc -fanalyzer
> > > 
> > > I think that the line between (a) and (b) gets very blurry very fast, so
> > > I'd rather stay out of trying to define it.
> > > 
> > > Running "cargo clippy" on some code might generate a warning as follows:
> > > 
> > > warning: variables can be used directly in the `format!` string
> > >   --> dyad/src/kernel/sha_processing.rs:20:13
> > >    |
> > > 20 |             debug!("git sha {} could not be validated, attempting a second way...", git_sha);
> > >    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >    |
> > >    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
> > >    = note: `#[warn(clippy::uninlined_format_args)]` on by default
> > > help: change this to
> > >    |
> > > 20 -             debug!("git sha {} could not be validated, attempting a second way...", git_sha);
> > > 20 +             debug!("git sha {git_sha} could not be validated, attempting a second way...");
> > > 
> > > As you see, it proposes a fix at the bottom. Should I attribute "cargo
> > > clippy" in my commit message as it wrote some code?
> > > 
> > > Would your answer change if I run "cargo clippy --fix" which would
> > > automatically apply the fix on its own?
> > > 
> > > We'll be hitting these issues all over the place if we try and draw a
> > > line... For example, with more advanced autocompletion: where would you
> > > draw the line between completing variable names and writing an entire
> > > function based on a comment I've made?
> > 
> > Fuzzy isn't it!
> > 
> > There are at least 3 levels as I see it:
> >  1) Reported-by:
> >    That's a lot of tools, that generate an error or warning.
> >  2) Suggested-by:
> >    That covers your example above (hmm including --fix ????)
> >  3) Co-authored-by:
> >    Where a tool wrote code based on your more abstract instructions
> > 
> > (1) & (2) are taking some existing code and finding errors or light
> > improvements;  I don't think it matters whether the tool is a good
> > old chunk of C or an LLM that's doing it, but how much it's originating.
> 
> So let's say I'm using github copilot, and I go:
> 
> 	/* Iterate over pointers in KEY_TYPE_extent: */
> 	#define extent_ptr_next(_e, _ptr) <tab> <tab>
> 
> and copilot completes the code with "__bkey_ptr_next(_ptr, extent_entry_last(_e))".
> 
> Was my instruction abstract? Was it within the realm of something we
> consider a trivial change, or should we attribute the agent? :)

Heck, I don't know either!   I mean there are places & projects that ban even
that level of use, but I'd agree that the 'more abstract' doesn't fit there.

> Why tackle any of this to begin with?

It seemed to me appropriate to identify use of AI which some might
object to, or which wouldn't be allowed in their project, or which
might indicate the need to look for different types of errors than
humans normally make.  At the same time it seemed appropriate to
acknowledge things that worked.
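
As an illustration of how the shorthand tooling annotation quoted earlier
in the thread might be consumed by a hook or script, here is a minimal
sketch. It assumes the annotation follows a '#' on the trailer line and is
a ';'-separated list of tool:version pairs, as in the quoted example. The
function name parse_tool_annotation and the sample signer are hypothetical;
no such format is defined by the kernel docs today.

```python
def parse_tool_annotation(trailer: str) -> dict[str, str]:
    """Return a {tool: version} mapping from the text after '#'.

    Assumed (hypothetical) format: "<trailer> # tool:ver;tool:ver;..."
    A trailer without a '#' annotation yields an empty mapping.
    """
    _, sep, annotation = trailer.partition("#")
    if not sep:
        return {}
    tools: dict[str, str] = {}
    for item in annotation.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing ';'
        name, colon, version = item.partition(":")
        name, version = name.strip(), version.strip()
        if not colon or not name or not version:
            raise ValueError(f"malformed tool entry: {item!r}")
        tools[name] = version
    return tools


# Hypothetical signer; tool list taken from the example quoted in the thread.
line = ("Signed-off-by: Jane Dev <jane@example.org> "
        "# gcc:14.1;ccache:1.2;sparse:4.7;claude-code:0.5")
tools = parse_tool_annotation(line)

# The kind of check suggested in the thread: was the author on a gcc
# older than 14, and so could not have seen a warning added in gcc 14?
gcc_major = int(tools["gcc"].split(".")[0])
author_missed_gcc14_warning = gcc_major < 14
```

Something along these lines would let a reviewer or a checkpatch-style
script answer "which compiler did the author build with?" without asking,
though it says nothing about how much of the patch any given tool wrote.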

Dave

> -- 
> Thanks,
> Sasha
> 
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/