linux-kernel - Re: comments style: Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190823055445.GA7195@jagdpanzerIV>
Date:   Fri, 23 Aug 2019 14:54:45 +0900
From:   Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Petr Mladek <pmladek@...e.com>, linux-kernel@...r.kernel.org,
        Andrea Parri <andrea.parri@...rulasolutions.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Brendan Higgins <brendanhiggins@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: comments style: Re: [RFC PATCH v4 1/9] printk-rb: add a new
 printk ringbuffer implementation

On (08/21/19 07:46), John Ogness wrote:
[..]
> The labels are necessary for the technical documentation of the
> barriers. And, after spending much time in this, I find them very
> useful. But I agree that there needs to be a better way to assign label
> names.
[..]
> > Where dp stands for descriptor push. For dataring we can add a 'dr'
> > prefix, to avoid confusion with desc barriers, which have 'd' prefix.
> > And so on. Dunno.
> 
> Yeah, I spent a lot of time going in circles on this one.
[..]
> I hope that we can agree that the labels are important. And that a
> formal documentation of the barriers is also important. Yes, they are a
> lot of work, but I find it makes it a lot easier to go back to the code
> after I've been away for a while. Even now, as I go through your
> feedback on code that I wrote over a month ago, I find the formal
> comments critical to quickly understand _exactly_ why the memory
> barriers exist.

Yeah. I like those tagsi/labels, and appreciate your efforts.

Speaking about it in general, not necessarily related to printk patch set.
With or without labels/tags we still have to grep. But grep-ing is much
easier when we have labels/tags. Otherwise it's sometimes hard to understand
what to grep for - _acquire, _relaxed, smp barrier, write_once, or
anything else.

> Perhaps we should choose labels that are more clear, like:
> 
>     dataring_push:A
>     dataring_push:B
> 
> Then we would see comments like:
> 
>     Memory barrier involvement:
> 
>     If _dataring_pop:B reads from dataring_datablock_setid:A, then
>     _dataring_pop:C reads from dataring_push:G.
[..]
>     RELEASE from dataring_push:E to dataring_datablock_setid:A
>        matching
>     ACQUIRE from _dataring_pop:B to _dataring_pop:E

I thought about it. That's very informative, albeit pretty hard to maintain.
The same applies to drA or prA and any other context dependent prefix.

> But then how should the labels in the code look? Just the letter looks
> simple in code, but cannot be grepped.

Yes, good point.

>     dataring_push()
>     {
>         ...
>         /* E */
>         ...
>     }

If only there was something as cool as grep-ing, but cooler. Something
that "just sucks less". Something that even folks like myself could use.

Bare with me.

Apologies. This email is rather long; but it's pretty easy to read.
Let's see if this can fly.

So what I did.

I changed several LMM tags/labels definitions, so they have common format:

			LMM_TAG(name)

I don't insist on this particular naming scheme, it can be improved.

======================================================================
diff --git a/kernel/printk/dataring.c b/kernel/printk/dataring.c
index e48069dc27bc..54eb28d47d30 100644
--- a/kernel/printk/dataring.c
+++ b/kernel/printk/dataring.c
@@ -577,11 +577,11 @@ char *dataring_push(struct dataring *dr, unsigned long size,
 	to_db_size(&size);
 
 	do {
-		/* fA: */
+		/* LMM_TAG(fA) */
 		ret = get_new_lpos(dr, size, &begin_lpos, &next_lpos);
 
 		/*
-		 * fB:
+		 * LMM_TAG(fB)
 		 *
 		 * The data ringbuffer tail may have been pushed (by this or
 		 * any other task). The updated @tail_lpos must be visible to
@@ -621,7 +621,7 @@ char *dataring_push(struct dataring *dr, unsigned long size,
 
 		if (!ret) {
 			/*
-			 * fC:
+			 * LMM_TAG(fC)
 			 *
 			 * Force @desc permanently invalid to minimize risk
 			 * of the descriptor later unexpectedly being
@@ -631,14 +631,14 @@ char *dataring_push(struct dataring *dr, unsigned long size,
 			dataring_desc_init(desc);
 			return NULL;
 		}
-	/* fE: */
+	/* LMM_TAG(fE) */
 	} while (atomic_long_cmpxchg_relaxed(&dr->head_lpos, begin_lpos,
 					     next_lpos) != begin_lpos);
 
 	db = to_datablock(dr, begin_lpos);
 
 	/*
-	 * fF:
+	 * LMM_TAG(fF)
 	 *
 	 * @db->id is a garbage value and could possibly match the @id. This
 	 * would be a problem because the data block would be considered
@@ -648,7 +648,7 @@ char *dataring_push(struct dataring *dr, unsigned long size,
 	WRITE_ONCE(db->id, id - 1);
 
 	/*
-	 * fG:
+	 * LMM_TAG(fG)
 	 *
 	 * Ensure that @db->id is initialized to a wrong ID value before
 	 * setting @begin_lpos so that there is no risk of accidentally
@@ -668,7 +668,7 @@ char *dataring_push(struct dataring *dr, unsigned long size,
 	 */
 	smp_store_release(&desc->begin_lpos, begin_lpos);
 
-	/* fH: */
+	/* LMM_TAG(fH) */
 	WRITE_ONCE(desc->next_lpos, next_lpos);
 
 	/* If this data block wraps, use @data from the content data block. */
diff --git a/kernel/printk/numlist.c b/kernel/printk/numlist.c
index 16c6ffa74b01..285e0431dbf8 100644
--- a/kernel/printk/numlist.c
+++ b/kernel/printk/numlist.c
@@ -338,7 +338,7 @@ struct nl_node *numlist_pop(struct numlist *nl)
 	tail_id = atomic_long_read(&nl->tail_id);
 
 	for (;;) {
-		/* cB */
+		/* LMM_TAG(cB) */
 		while (!numlist_read(nl, tail_id, NULL, &next_id)) {
 			/*
 			 * @tail_id is invalid. Try again with an
@@ -357,6 +357,7 @@ struct nl_node *numlist_pop(struct numlist *nl)
 
 		/*
 		 * cC:
+		 * LMM_TAG(cD)
 		 *
 		 * Make sure the node is not busy.
 		 */
@@ -368,7 +369,7 @@ struct nl_node *numlist_pop(struct numlist *nl)
 		if (r == tail_id)
 			break;
 
-		/* cA: #3 */
+		/* LMM_TAG(cA) #3 */
 		tail_id = r;
 	}
 
======================================================================



Okay.
Next, I added the following simple quick-n-dirty perl script:

======================================================================
Subject: [PATCH] add LMM_TAG parser

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
---
 scripts/ctags-parse-lmm.pl | 45 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)
 create mode 100755 scripts/ctags-parse-lmm.pl

diff --git a/scripts/ctags-parse-lmm.pl b/scripts/ctags-parse-lmm.pl
new file mode 100755
index 000000000000..785f6945c936
--- /dev/null
+++ b/scripts/ctags-parse-lmm.pl
@@ -0,0 +1,45 @@
+#!/usr/bin/perl
+#
+# SPDX-License-Identifier: GPL-2.0
+#
+# Parse Linux Memory Model tags and add corresponding entries to the ctags file
+#
+# LMM Über Alles!
+
+use strict;
+
+sub parse($$)
+{
+	my ($t, $f) = @_;
+	my $ctags;
+	my $file;
+
+	if (!open($ctags, '>>', $t)) {
+		print "Could not open $t: $!\n";
+		exit 1;
+	}
+
+	if (!open($file, '<', $f)) {
+		print "Could not open $f: $1\n";
+		exit 1;
+	}
+
+	while (my $row = <$file>) {
+		chomp $row;
+
+		if ($row =~ m/LMM_TAG\((.+)\)/) {
+			# yup...
+			print $ctags "$1\t$f\t/LMM_TAG($1)/;\"\td\n";
+		}
+	}
+	close($file);
+	close($ctags);
+}
+
+if ($#ARGV != 1) {
+	print "Usage:\n\tscripts/ctags-parse-lmm.pl tags C-file-to-parse\n";
+	exit 1;
+}
+
+parse($ARGV[0], $ARGV[1]);
+exit 0;
-- 
2.23.0
======================================================================




The next thing I did was

	./scripts/ctags-parse-lmm.pl ./tags kernel/printk/dataring.c
	./scripts/ctags-parse-lmm.pl ./tags kernel/printk/numlist.c
	./scripts/ctags-parse-lmm.pl ./tags kernel/printk/ringbuffer.c

These 3 commands added the following entries to the tags file
(I'm using ctags and vim)

======================================================================
$ tail tags
fA	kernel/printk/dataring.c	/LMM_TAG(fA)/;"	d
fB	kernel/printk/dataring.c	/LMM_TAG(fB)/;"	d
fC	kernel/printk/dataring.c	/LMM_TAG(fC)/;"	d
fE	kernel/printk/dataring.c	/LMM_TAG(fE)/;"	d
fF	kernel/printk/dataring.c	/LMM_TAG(fF)/;"	d
fG	kernel/printk/dataring.c	/LMM_TAG(fG)/;"	d
fH	kernel/printk/dataring.c	/LMM_TAG(fH)/;"	d
cB	kernel/printk/numlist.c	/LMM_TAG(cB)/;"	d
cD	kernel/printk/numlist.c	/LMM_TAG(cD)/;"	d
cA	kernel/printk/numlist.c	/LMM_TAG(cA)/;"	d
======================================================================


So now when I perform LMM tag search or jump to a tag definition, vim
goes exactly to the line where the corresponding LMM_TAG was defined.

Example:

kernel/printk/ringbuffer.c

        RELEASE from jA->cD->hA to jB	
                         ^
                         C-]                    // jump to tag under cursor

vim goes to kernel/printk/numlist.c

360			* LMM_TAG(cD)
				  ^

Exactly where cD was defined.

Welcome to the future!


> Andrea suggested that the documentation should be within the code, which
> I think is a good idea. Even if it means we have more comments than
> code.

I agree that such documentation is handy. It, probably, would be even better
if we could use some tooling to make it easier to use.

	-ss