linux-kernel - Re: [PATCH v6 0/5] Hwmon PMUs

Message-ID: <CAP-5=fX+m3KOR_c8v5=tzSbrsfE-C7_5B83ORjRoH2SQuJNA-g@mail.gmail.com>
Date: Fri, 25 Oct 2024 16:07:47 -0700
From: Ian Rogers <irogers@...gle.com>
To: Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: Namhyung Kim <namhyung@...nel.org>, Peter Zijlstra <peterz@...radead.org>, 
	Ingo Molnar <mingo@...hat.com>, Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>, 
	Ravi Bangoria <ravi.bangoria@....com>, Weilin Wang <weilin.wang@...el.com>, 
	Yoshihiro Furudera <fj5100bi@...itsu.com>, James Clark <james.clark@...aro.org>, 
	Athira Jajeev <atrajeev@...ux.vnet.ibm.com>, Howard Chu <howardchu95@...il.com>, 
	Oliver Upton <oliver.upton@...ux.dev>, Changbin Du <changbin.du@...wei.com>, 
	Ze Gao <zegao2021@...il.com>, Junhao He <hejunhao3@...wei.com>, linux-kernel@...r.kernel.org, 
	linux-perf-users@...r.kernel.org
Subject: Re: [PATCH v6 0/5] Hwmon PMUs

On Fri, Oct 25, 2024 at 2:01 PM Arnaldo Carvalho de Melo
<acme@...nel.org> wrote:
>
> On Fri, Oct 25, 2024 at 11:26:26AM -0700, Ian Rogers wrote:
> > On Fri, Oct 25, 2024 at 10:30 AM Namhyung Kim <namhyung@...nel.org> wrote:
> > > On Thu, Oct 24, 2024 at 06:33:27PM -0700, Ian Rogers wrote:
> > > > So I think moving the enum declarations into one patch is okay. But as
> > > > the enum values have no bearing on hardware constants, or something
> > > > outside of the code that uses them it smells strange to me. Ultimately
> > > > this is going to do little to the lines of code count but damage
> > > > readability. I'm not sure why we're doing this given the kernel model
> > > > for adding a driver is to add it as a large chunk. For example, here
> > > > is adding the intel PT driver:
> > > > https://lore.kernel.org/all/1422614392-114498-1-git-send-email-alexander.shishkin@linux.intel.com/T/#u
>
> > > Maybe others can understand a big patch easily, but I'm not.
>
> > My understanding is that we make small patches so that the codebase is
> > more bisectable. When there is something new, like a driver or here a
>
> That is super important, having patches being super small and doing just
> one thing helps in bisecting problems.
>
> If two things are done in one patch, and one of them causes a problem,
> then bisection is a very effective way of finding out what exactly
> caused a problem.
>
> But bisection is not the only benefit from breaking down larger patches
> into smaller ones.
>
> We want to have more people joining our ranks, doing low level tooling
> and kernel work.
>
> Writing new functionality in a series of patches, growing in complexity
> is a way to reduce the cognitive load of understanding how something
> works.
>
> As much as trying to emulate how the kernel community works is a good
> model as that community has been producing a lot of good code in a
> frantic, athletic pace, and as much as I can agree with you that adding
> a new piece of code will not affect bisectability as it's new code, I
> think having it broken down in multiple patches benefits reviewing.

Can you explain, as asked, how separating the declaration of a
function from its definition aids reviewing? As a reviewer, I want
to know the scope of a function and its documentation. Placing them in
two separate patches doesn't benefit my reviewing.

> Reviewing is something we should do more, but it's very taxing.
>
> One would rather try to write as much code as possible, leaving to
> others the reviewing part.
>
> But it's a balancing act.
>
> Whatever we can do to help reviewers, like taking into account what they
> say they would prefer as a way to submit our work, even if it isn't
> exactly of our liking, is one such thing.
>
> So if Namhyung says that it would be best for you to try to break down
> your patches into smaller ones, like I did say to you in the past, even
> taking the trouble to do it myself, in the process introducing problems,
> later fixed, I think you should try to do what he says.
>
> He is the maintainer, try to address his comments.

I think I've written long emails addressing the comments. Just saying
"too big" (1) doesn't match how existing drivers are added (although
I've split the code many times so the addition is the smallest it can
be), and (2) as I've pointed out, makes the code harder to bisect, work
with compilers and understand.

I think there is far too much developer push back going on, and it feels
capricious. I'm lucky, as I'll just go push into Google's tree. I'm
only persisting here for upstream's benefit and ultimately my benefit
when I pull from upstream. Perfect shouldn't be the enemy of good, but
frequently (more often than not for me) reviewer comments aren't
improving the code, they are significantly stalling it:

1) parallel testing
https://lore.kernel.org/lkml/20241025192109.132482-1-irogers@google.com/
1.1) pushed back because it used an #ifdef __linux__ to maintain some
POSIX library code (a now dropped complaint)
1.2) pushed back for improvements in test numbering, addressed in:
https://lore.kernel.org/lkml/20241025192109.132482-11-irogers@google.com/
not an unreasonable thing to do, but feature creep: "Hey, we'll only
take your work helping us if you also add feature xyz."

2) libdw clean up
https://lore.kernel.org/lkml/20241017002520.59124-1-irogers@google.com/
Pushed back as more cross architecture output would make the commit
messages better. Doesn't sound crazily unreasonable until you realize
the function that is being called and needing cross platform testing
is 6 lines long and only applies when you do analysis of x86 perf.data
files on non-x86 platforms. We heavily test the code on x86 and the
chance that cross platform testing will show anything is very small.

On the other hand, I can point at unreviewed maintainer code going into
the tree, and at code that I've pointed out is broken from a
fundamental CS perspective also being taken into the tree.

RISC-V has been damaged and now in the driver they are trying to
work around the perf tool. There were already comments to this effect
in the ARM breakpoint driver's code.

On Intel we now have TPEBS (which took far, far too long to land)
behind a flag, which means we've made accurate top-down analysis
require an additional flag on all newer Intel models, something I
pushed against.

So the reviewing is inconsistent, damages the code (a maintainer may
disagree with the reviewer and developers saying otherwise but the
maintainer has to be followed to land) and is constantly stalling
development. Fixing reference counting took years to land because of
endless stalling; any reasonable developer would have just given up.
It is hard to imagine the state the code base would be in without it.

Of the patches I've mentioned, how many are code health and how many
are a feature I can say working on is part of my day job? I see a
deliberate lack of understanding of what a developer needs. As for the
claim that I've not tried to address comments, I'd say 90% of the noise
on linux-perf-users is me resending patches, mine and others', to
address comments. Here I've made the patches a size that makes sense. I
can move the enums, which feels like a compiler error along the lines of
"static function defined but not used". Beyond this, changing
evsel's name meaning to make it part of the event encoding is imo
wrong. As for having separate patches for a function declaration and
then one for its definition, can you imagine taking this to its extreme
and what the patches would look like if you did this? In making things
smaller, as has happened already in this series, it is never clear you
will hit a magical maintainer happy threshold. Knowing how to make a
"right" patch is even harder when it is inconsistent with the rest of
Linux development.

Thanks,
Ian