Message-ID: <4e505c35-8428-89bb-7f9b-bc819382c3cd@infradead.org>
Date: Sun, 26 Jul 2020 12:08:08 -0700
From: Randy Dunlap <rdunlap@...radead.org>
To: Joe Perches <joe@...ches.com>,
Christophe Leroy <christophe.leroy@...roup.eu>
Cc: linuxppc-dev@...ts.ozlabs.org, Paul Mackerras <paulus@...ba.org>,
linux-kernel@...r.kernel.org, Michael Ellerman <mpe@...erman.id.au>
Subject: Re: [PATCH 0/9] powerpc: delete duplicated words
On 7/26/20 10:49 AM, Joe Perches wrote:
> On Sun, 2020-07-26 at 10:23 -0700, Randy Dunlap wrote:
>> On 7/26/20 7:29 AM, Christophe Leroy wrote:
>>> Randy Dunlap <rdunlap@...radead.org> wrote:
>>>
>>>> Drop duplicated words in arch/powerpc/ header files.
>>>
>>> How did you detect them? Do you have some script for that, or did you just read all the comments?
>>
>> Yes, it's a script that finds lots of false positives, so I have to check
>> each and every one of them for validity.
>
> And it's a lot of work too. (thanks Randy)
>
> It could be something like:
>
> $ grep-2.5.4 -nrP --include=*.[ch] '\b([A-Z]?[a-z]{2,}\b)[ \t]*(?:\n[ \t]*\*[ \t]*|)\1\b' * | \
> grep -vP '\b(?:struct|enum|union)\s+([A-Z]?[a-z]{2,})\s+\*?\s*\1\b' | \
> grep -vP '\blong\s+long\b' | \
> grep -vP '\b([A-Z]?[a-z]{2,})(?:\t+| {2,})\1\b'

Hi Joe,

(What is grep-2.5.4?)

It looks like you tried a few iterations of this -- since it drops things
like "long long". There are lots of data types that are repeated and valid.
And many struct names, like "struct kref kref", "struct completion completion",
and "struct mutex mutex". I handle (ignore) those manually, although that
could be added to the Perl script.

v0.1 of this script also found lots of repeated numbers and strings of
special characters (ASCII art etc.), so now it ignores duplicated numbers
or special characters -- since it is really looking for duplicate words.

Anyway, I might as well attach it. It's no big deal.
And if someone else wants to tackle using it, go for it.
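
For anyone who wants to experiment without the attachment, here is a minimal
sketch of the same idea in Python. It is not the attached Perl script: the
names find_dup_words, DUP_WORD, TYPE_KEYWORD and ALLOWED are made up for
illustration, it only looks within a single line, and it is case-sensitive.

```python
import re

# Assumption: a "word" is two or more ASCII letters, so repeated numbers
# and runs of special characters (ASCII art etc.) never match.
DUP_WORD = re.compile(r"\b([A-Za-z]{2,})\s+\1\b")

# Assumption: a repeat that directly follows struct/enum/union is a valid
# declaration such as "struct kref kref", not a typo.
TYPE_KEYWORD = re.compile(r"\b(?:struct|enum|union)\s+$")

# Hypothetical allow-list of words that may legitimately repeat in C.
ALLOWED = {"long"}


def find_dup_words(line):
    """Return the words that appear twice in a row on one line."""
    hits = []
    for m in DUP_WORD.finditer(line):
        word = m.group(1)
        if word in ALLOWED:
            continue  # e.g. "long long"
        if TYPE_KEYWORD.search(line[:m.start()]):
            continue  # e.g. "struct mutex mutex"
        hits.append(word)
    return hits


if __name__ == "__main__":
    for text in ("the the quick", "long long x;", "struct kref kref;"):
        print(repr(text), "->", find_dup_words(text))
```

Under these assumptions "the the quick" reports ['the'], while "long long x;",
"struct kref kref;" and "0x00 0x00 ---- ----" report nothing. Every hit still
needs a manual check, as described above.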
--
~Randy
Download attachment "find_dup_words.pl" of type "application/x-perl" (2959 bytes)