linux-kernel - Re: [RFC PATCH] docs: submitting-patches: (AI?) Tool disclosure tag

Message-ID: <aIQHzWOkWYCGX4Xg@lappy>
Date: Fri, 25 Jul 2025 18:40:13 -0400
From: Sasha Levin <sashal@...nel.org>
To: "Dr. David Alan Gilbert" <linux@...blig.org>
Cc: Kees Cook <kees@...nel.org>, Steven Rostedt <rostedt@...dmis.org>,
	Konstantin Ryabitsev <konstantin@...uxfoundation.org>,
	corbet@....net, workflows@...r.kernel.org, josh@...htriplett.org,
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] docs: submitting-patches: (AI?) Tool disclosure tag

On Fri, Jul 25, 2025 at 11:29:17AM +0000, Dr. David Alan Gilbert wrote:
>* Sasha Levin (sashal@...nel.org) wrote:
>> On Fri, Jul 25, 2025 at 01:20:59AM +0000, Dr. David Alan Gilbert wrote:
>> > * Sasha Levin (sashal@...nel.org) wrote:
>> > > On Thu, Jul 24, 2025 at 04:54:11PM -0700, Kees Cook wrote:
>> > > > On Thu, Jul 24, 2025 at 07:45:56PM -0400, Steven Rostedt wrote:
>> > > > > My thought is to treat AI as another developer. If a developer helps you
>> > > > > like the AI is helping you, would you give that developer credit for that
>> > > > > work? If so, then you should also give credit to the tooling that's helping
>> > > > > you.
>> > > > >
>> > > > > I suggested adding a new tag to note any tool that has done non-trivial
>> > > > > work to produce the patch where you give it credit if it has helped you as
>> > > > > much as another developer that you would give credit to.
>> > > >
>> > > > We've got tags to choose from already in that case:
>> > > >
>> > > > Suggested-by: LLM
>> > > >
>> > > > or
>> > > >
>> > > > Co-developed-by: LLM <not@...an.with.legal.standing>
>> > > > Signed-off-by: LLM <not@...an.with.legal.standing>
>> > > >
>> > > > The latter seems ... not good, as it implies DCO SoB from a thing that
>> > > > can't and hasn't acknowledged the DCO.
>> > >
>> > > In my mind, "any tool" would also be something like gcc giving you a
>> > > "non-trivial" error (think something like a buffer overflow warning that
>> > > could have been a security issue).
>> > >
>> > > In that case, should we encode the entire toolchain used for developing
>> > > a patch?
>> > >
>> > > Maybe...
>> > >
>> > > Some sort of semi-standardized shorthand notation of the tooling used to
>> > > develop a patch could be interesting not just for plain disclosure, but
>> > > also to be able to trace back issues with patches ("oh! the author
>> > > didn't see a warning because they use gcc 13 while the warning was added
>> > > in gcc 14!").
>> > >
>> > > Signed-off-by: John Doe <jd@...mple.com> # gcc:14.1;ccache:1.2;sparse:4.7;claude-code:0.5
>> > >
>> > > This way some of it could be automated via git hooks and we can recommend
>> > > a relevant string to add with checkpatch.
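
A minimal sketch of how a git hook or checkpatch-style helper might read the shorthand above, assuming the `# tool:version;tool:version` suffix on a Signed-off-by line. That suffix is only the notation floated in this thread, not an accepted kernel convention, and the names `parse_tool_trailer` and `version_tuple` are hypothetical:

```python
import re

# Hypothetical format: "Signed-off-by: Name <email> # tool:version;tool:version"
# The "# ..." suffix is only the proposal from this thread, not a kernel rule.
TRAILER_RE = re.compile(
    r"^Signed-off-by:\s*(?P<name>.+?)\s*<(?P<email>[^>]+)>\s*(?:#\s*(?P<tools>.*))?$"
)


def parse_tool_trailer(line):
    """Return (name, email, {tool: version}), or None for other lines."""
    m = TRAILER_RE.match(line.strip())
    if m is None:
        return None
    tools = {}
    for item in (m.group("tools") or "").split(";"):
        item = item.strip()
        if not item:
            continue
        tool, sep, version = item.partition(":")
        # A tool listed without ":version" is recorded with version None.
        tools[tool.strip()] = version.strip() if sep else None
    return m.group("name"), m.group("email"), tools


def version_tuple(version):
    """Turn "14.1" into (14, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


if __name__ == "__main__":
    line = ("Signed-off-by: John Doe <jd@...mple.com> "
            "# gcc:14.1;ccache:1.2;sparse:4.7;claude-code:0.5")
    name, email, tools = parse_tool_trailer(line)
    # Flag authors whose compiler predates a warning added in gcc 14.
    if version_tuple(tools["gcc"]) < (14,):
        print(f"{name} built with gcc {tools['gcc']}, older than 14")
```

With the tool list in a dictionary, the "gcc 13 versus gcc 14" question becomes a one-line comparison, which is the kind of check that could be automated.
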
>> >
>> > For me there are two separate things:
>> >  a) A tool that found a problem
>> >  b) A tool that wrote a piece of code.
>> >
>> > I think the cases you're referring to are all (a), whereas I'm mostly
>> > thinking here about (b).
>> > In the case of (a) it's normally _one_ of those tools that found it,
>> > e.g. I see some:
>> >   Found by gcc -fanalyzer
>>
>> I think that the line between (a) and (b) gets very blurry very fast, so
>> I'd rather stay out of trying to define it.
>>
>> Running "cargo clippy" on some code might generate a warning as follows:
>>
>> warning: variables can be used directly in the `format!` string
>>   --> dyad/src/kernel/sha_processing.rs:20:13
>>    |
>> 20 |             debug!("git sha {} could not be validated, attempting a second way...", git_sha);
>>    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>    |
>>    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
>>    = note: `#[warn(clippy::uninlined_format_args)]` on by default
>> help: change this to
>>    |
>> 20 -             debug!("git sha {} could not be validated, attempting a second way...", git_sha);
>> 20 +             debug!("git sha {git_sha} could not be validated, attempting a second way...");
>>
>> As you see, it proposes a fix at the bottom. Should I attribute "cargo
>> clippy" in my commit message as it wrote some code?
>>
>> Would your answer change if I run "cargo clippy --fix" which would
>> automatically apply the fix on its own?
>>
>> We'll be hitting these issues all over the place if we try and draw a
>> line... For example, with more advanced autocompletion: where would you
>> draw the line between completing variable names and writing an entire
>> function based on a comment I've made?
>
>Fuzzy isn't it!
>
>There's at least 3 levels as I see it:
>  1) Reported-by:
>    That's a lot of tools, that generate an error or warning.
>  2) Suggested-by:
>    That covers your example above (hmm including --fix ????)
>  3) Co-authored-by:
>    Where a tool wrote code based on your more abstract instructions
>
>(1) & (2) are taking some existing code and finding errors or light
>improvements;  I don't think it matters whether the tool is a good
>old chunk of C or an LLM that's doing it, but how much it's originating.

So let's say I'm using GitHub Copilot, and I go:

	/* Iterate over pointers in KEY_TYPE_extent: */
	#define extent_ptr_next(_e, _ptr) <tab> <tab>

and Copilot completes the code with "__bkey_ptr_next(_ptr, extent_entry_last(_e))".

Was my instruction abstract? Was it within the realm of something we
consider a trivial change, or should we attribute the agent? :)

Why tackle any of this to begin with?

-- 
Thanks,
Sasha