linux-kernel - Re: comments style: Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190822173801.GA2218@andrea>
Date:   Thu, 22 Aug 2019 19:38:01 +0200
From:   Andrea Parri <parri.andrea@...il.com>
To:     Petr Mladek <pmladek@...e.com>
Cc:     John Ogness <john.ogness@...utronix.de>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Andrea Parri <andrea.parri@...rulasolutions.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Brendan Higgins <brendanhiggins@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-kernel@...r.kernel.org
Subject: Re: comments style: Re: [RFC PATCH v4 1/9] printk-rb: add a new
 printk ringbuffer implementation

On Thu, Aug 22, 2019 at 03:50:52PM +0200, Petr Mladek wrote:
> On Wed 2019-08-21 07:46:28, John Ogness wrote:
> > On 2019-08-20, Sergey Senozhatsky <sergey.senozhatsky.work@...il.com> wrote:
> > > [..]
> > >> > +	 *
> > >> > +	 * Memory barrier involvement:
> > >> > +	 *
> > >> > +	 * If dB reads from gA, then dC reads from fG.
> > >> > +	 * If dB reads from gA, then dD reads from fH.
> > >> > +	 * If dB reads from gA, then dE reads from fE.
> > >> > +	 *
> > >> > +	 * Note that if dB reads from gA, then dC cannot read from fC.
> > >> > +	 * Note that if dB reads from gA, then dD cannot read from fD.
> > >> > +	 *
> > >> > +	 * Relies on:
> > >> > +	 *
> > >> > +	 * RELEASE from fG to gA
> > >> > +	 *    matching
> > >> > +	 * ADDRESS DEP. from dB to dC
> > >> > +	 *
> > >> > +	 * RELEASE from fH to gA
> > >> > +	 *    matching
> > >> > +	 * ADDRESS DEP. from dB to dD
> > >> > +	 *
> > >> > +	 * RELEASE from fE to gA
> > >> > +	 *    matching
> > >> > +	 * ACQUIRE from dB to dE
> > >> > +	 */
> > >> 
> > >> But I am not sure how much this is useful. It would take ages to decrypt
> > >> all these shortcuts (signs) and translate them into something
> > >> human readable. Also it might get outdated easily.
> > >> 
> > The labels are necessary for the technical documentation of the
> > barriers. And, after spending much time in this, I find them very
> > useful. But I agree that there needs to be a better way to assign label
> > names.
> 
> I could understand that you spend a lot of time on creating the
> labels and that they are somehow useful for you.
> 
> But I am not using them and I hope that I will not have to:
> 
>   + Grepping takes a lot of time, especially over several files.
> 
>   + Grepping is actually not enough. It is required to read
>     the following comment or code to realize what the label is for.
> 
>   + Several barriers have multiple dependencies. Grepping one
>     label helps to check that one connection makes sense.
>     But it is hard to keep all relations in head to confirm
>     that they are complete and make sense overall.
> 
>   + There are about 50 labels in the code. "Entry Lifecycle"
>     section in dataring.c talks about 8 step. One would
>     expect that it would require 8 read and 8 write barriers.
> 
>     Even coordination of 16 barriers might be complicated to check.
>     Where 50 is just scary.
> 
> 
>   + It seems to be a newly invented format and it is not documented.
>     I personally do not understand it completely, for example,
>     the meaning of "RELEASE from jA->cD->hA to jB".

IIUC, something like "hA is the interested access, happening within
cD (should have been cC?), which in turn happens within jA".  But I
should defer to John (FWIW, I found that notation quite helpful).


> 
> 
> I hope that we could do better. I believe that human readable
> comments all less error prone because they describe the intention.
> Pseudo code based on labels just describes the code but it
> does not explain why it was done this way.
> 
> From my POV, the labels do more harm than good. The code gets
> too scattered and is harder to follow.
> 
> 
> > I hope that we can agree that the labels are important.
> 
> It would be great to hear from others.

I agree with you that reviewing these comments might be "scary" and
not suitable for a bed-reading  ;-) (I didn't have time to complete
such review yet).  OTOH, from my POV, removing such comments/labels
could only make such (and future) reviews scarier, because then the
(memory-ordering) "intention" would then be _hidden in the code.


> 
> > And that a formal documentation of the barriers is also important.
> 
> It might be helpful if it can be somehow feed to a tool that would
> prove correctness. Is this the case?

>From what I've read so far, it _should be relatively straighforward
to write down a litmus test from any such comment (and give this to
the LKMM simulator).


> 
> In each case, it should follow some "widely" used format.
> We should not invent a new one that nobody else would use
> and understand.

Agreed.  Well, litmus tests (or the comments here in question, that
are intended to convey the same information) have been successfully
adopted by memory model and concurrency people for as long as I can
remember, current architecture reference manuals use these tools to
describe the semantics of fence or atomic instructions, discussions
about memory barriers on LKML, gcc MLs often reduce to a discussion
around one or more litmus tests...

[trimming]


> > Andrea suggested that the documentation should be within the code, which
> > I think is a good idea. Even if it means we have more comments than
> > code.
> 
> It depends on the type of the information. I would describe:
> 
>   + The overall design on top of the source file or in
>     Documentation/...
> 
>   + The behavior of externally used API and non-obvious functions
>     above the function definition.
> 
>   + Implementation details, non-obvious effects, side effects,
>     relations, meaning of tricky calculation, meaning of
>     a block of code inside the code. But each function should
>     ideally fit on the screen.
> 
> I personally tend to write more documentation but it is sometimes
> too much. I am trying to become more effective and to the point.

Unfortunately, I don't know of more concise ways to convey the same
information that these comments are intended to provide.  Thoughts?

Please don't get me wrong: I'm all for overall design, external API,
etc., if some improvements can be achieved here.

  Andrea