linux-kernel - Re: [PATCH 00/15] Implement MODVERSIONS for Rust

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CABCJKueScjymx3TR-DCVynfgSg93-17cKC0s0b3A5KpiFEOiEA@mail.gmail.com>
Date: Thu, 1 Aug 2024 19:38:27 +0000
From: Sami Tolvanen <samitolvanen@...gle.com>
To: Petr Pavlu <petr.pavlu@...e.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Masahiro Yamada <masahiroy@...nel.org>, 
	Luis Chamberlain <mcgrof@...nel.org>, Miguel Ojeda <ojeda@...nel.org>, 
	Matthew Maurer <mmaurer@...gle.com>, Alex Gaynor <alex.gaynor@...il.com>, 
	Wedson Almeida Filho <wedsonaf@...il.com>, Gary Guo <gary@...yguo.net>, linux-kbuild@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-modules@...r.kernel.org, 
	rust-for-linux@...r.kernel.org
Subject: Re: [PATCH 00/15] Implement MODVERSIONS for Rust

Hi Petr,

On Thu, Aug 1, 2024 at 4:22 AM Petr Pavlu <petr.pavlu@...e.com> wrote:
>
> STG is an interesting tool. I've played with it a bit last year. To be
> frank, I was surprised to see a new tool being proposed by Google to
> generate modversion CRCs from DWARF instead of potentially extending
> your STG project for this purpose. I'm not sure if it is something that
> you folks have considered and evaluated.

Yes. STG is fairly large and written in C++, so it didn't seem like a
great candidate for inclusion in the kernel tree, and it's presumably
not shipped by most distributions, so it's not really ideal as a
dependency either.

> Sure, it might be necessary to extend the symtypes format a bit, for
> example, by allowing spaces in type names. What other problems do you
> see?
>
> The example I showed preserves the DWARF tag names in type descriptions.
> Cross-references and the target type names use the s# prefix as they
> they need to be distinguished from other tokens.

What I meant is that genksyms output is basically preprocessed C with
references scattered in, so if you want something that's fully
backwards compatible, that might not be possible. However, if you're
happy with just the same database structure and don't care about the
exact type description format beyond that, that should be doable, but
like you said, still requires extending the format to support more
complex type names.

> What I think is needed is the ability to compare an updated kernel with
> some previous reference and have an output that clearly and accurately
> shows why CRCs of some symbols changed. The previous reference should be
> possible to store in Git together with the kernel source. It means it
> should be ideally some text format and limited in size. This is what
> distributions that care about stable kABI do in some form currently.
>
> This functionality would be needed if some distribution wants to
> maintain stable Rust kABI (not sure if it is actually feasible), or if
> the idea is for gendwarfksyms to be a general tool that could replace
> genksyms. I assume for the sake of argument that this is the case.
>
> Gendwarfksyms could implement this functionality on its own, or as
> discussed, I believe it could provide a symtypes-like dump and a second
> tool could be used to work with this format and for comparing it.
>
> From my point of view, the current --debug format is not suitable for
> this purpose because its expanded and unstructured form means it is
> bloated and hard to compare with a previous reference.

I do see your point, there's a ton of duplication in the --debug
output as each symbol expands all the types it uses.

> I'm also not quite yet sold on using separate DWARF tooling, such as
> libabigail or STG, to actually understand why gendwarfksyms produced
> a different CRC for some symbol. Using these tools makes sense in the
> genksyms world, where genksyms operates on the source code level and
> this additional tooling can only work on debug data.
>
> With gendwarfksyms working directly with DWARF data, it doesn't seem
> appealing to me to first run gendwarfksyms to produce CRCs, compare them
> with their reference, and if they are different, use a second tool to
> process the same DWARF data again and with some luck hopefully get an
> actual answer why the CRCs changed. I'm worried that users might
> encounter inaccurate answers if the two tools interpret the input data
> differently.

I agree, there might be other DWARF changes that we don't care about,
but that nevertheless make it harder to figure out what caused the
actual CRC to change.

My initial thought was that simply running a diff between the --debug
output would be sufficient, as it would directly tell you what
changed, but I didn't consider that the size of the output dump would
be of concern as well. I'll look into adding a symtypes-style output
format in the next version. Thanks again for the feedback!

Sami