Message-Id: <20220628122741.93641-2-daniel.thompson@linaro.org>
Date:   Tue, 28 Jun 2022 13:27:39 +0100
From:   Daniel Thompson <daniel.thompson@...aro.org>
To:     Nathan Chancellor <nathan@...nel.org>, Tom Rix <trix@...hat.com>
Cc:     Daniel Thompson <daniel.thompson@...aro.org>,
        Masahiro Yamada <masahiroy@...nel.org>,
        Michal Marek <michal.lkml@...kovi.net>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        linux-kbuild@...r.kernel.org, llvm@...ts.linux.dev,
        linux-kernel@...r.kernel.org
Subject: [PATCH 1/2] clang-tools: Generate clang compatible output even with gcc builds

Currently `make compile_commands.json` cannot produce useful output for
kernels built with gcc. That is because kbuild will opportunistically
enable gcc-specific command line options from recent versions of gcc.
Options that are not compatible with clang cause trouble because most of
the tools that consume compile_commands.json only understand the clang
argument set. This is to be expected since it was the clang folks who wrote
the spec to help make those tools come alive (and AFAIK all the tools
that consume the compilation database are closely linked to the clang
tools):
https://clang.llvm.org/docs/JSONCompilationDatabase.html

Let's fix this by adding code to gen_compile_commands.py that will
automatically strip not-supported-by-clang command line options from
the compilation database. This allows the common consumers of the
compilation database (clang-tidy, clangd code completion engine,
CodeChecker, etc) to work without requiring the developer to build the
kernel using a different C compiler.

In theory this could cause problems if/when a not-based-on-clang tool
emerges that reuses the clang compilation database format. This is not
expected to be a problem in practice since the heuristics added to
gen_compile_commands.py are pretty conservative. They should only ever
disable some rather esoteric compiler options ("they must be esoteric
otherwise clang would have implemented them..."). It is hard to reason
about what will/won't break tools that are not yet written but we can
hope that removing esoteric options will be benign!

Signed-off-by: Daniel Thompson <daniel.thompson@...aro.org>
---
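For reviewers who want to experiment with the two heuristics outside of
kbuild, here is a minimal standalone sketch. It is not part of the patch:
derive_target(), filter_command(), the is_supported callback and fake_probe()
are hypothetical names used only for illustration, and the probe is a stub
instead of a real clang invocation so the sketch runs without clang installed.
It also swaps str.split() for shlex.split()/shlex.join() (Python 3.8+), which
is one possible way to cope with quoted arguments, not what the patch does.

```python
import os
import shlex


def derive_target(compiler):
    """Return a clang --target option for a prefixed gcc such as
    'aarch64-linux-gnu-gcc', or '' when there is no '-gcc' suffix."""
    if compiler.endswith('-gcc'):
        return '--target=' + os.path.basename(compiler[:-4])
    return ''


def filter_command(command, is_supported):
    """Drop -f/-m options that is_supported(target, flag) rejects.

    is_supported is a hypothetical stand-in for the clang probe. Each flag
    is probed at most once per call and the answer is cached.
    """
    atoms = shlex.split(command)
    target = derive_target(atoms[0])
    cache = {}
    kept = [atoms[0]]
    if target:
        kept.append(target)
    for atom in atoms[1:]:
        if atom.startswith(('-f', '-m')):
            if atom not in cache:
                cache[atom] = is_supported(target, atom)
            if not cache[atom]:
                continue
        kept.append(atom)
    return shlex.join(kept)


# Assumption for illustration only: pretend the probe rejects these flags.
PRETEND_UNSUPPORTED = {'-fconserve-stack', '-fno-allow-store-data-races'}


def fake_probe(target, flag):
    """Stub probe; a real one would run clang with the target and flag."""
    return flag not in PRETEND_UNSUPPORTED


if __name__ == '__main__':
    cmd = ('aarch64-linux-gnu-gcc -O2 -fconserve-stack '
           '-mgeneral-regs-only -c -o foo.o foo.c')
    print(filter_command(cmd, fake_probe))
```

Passing the probe in as a callback keeps the filtering logic testable in
isolation. The patch itself keeps the cache in a module-level dict so that
results are shared across all entries in the compilation database.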
 Makefile                                    |  5 +-
 scripts/clang-tools/gen_compile_commands.py | 71 ++++++++++++++++++++-
 2 files changed, 74 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 513c1fbf7888..9ea6867aaf9c 100644
--- a/Makefile
+++ b/Makefile
@@ -1886,8 +1886,11 @@ nsdeps: modules
 # Clang Tooling
 # ---------------------------------------------------------------------------
 
+ifdef CONFIG_CC_IS_GCC
+gen_compile_commands-flags += --gcc
+endif
 quiet_cmd_gen_compile_commands = GEN     $@
-      cmd_gen_compile_commands = $(PYTHON3) $< -a $(AR) -o $@ $(filter-out $<, $(real-prereqs))
+      cmd_gen_compile_commands = $(PYTHON3) $< $(gen_compile_commands-flags) -a $(AR) -o $@ $(filter-out $<, $(real-prereqs))
 
 $(extmod_prefix)compile_commands.json: scripts/clang-tools/gen_compile_commands.py \
 	$(if $(KBUILD_EXTMOD),,$(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS)) \
diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
index 1d1bde1fd45e..02f6a1408968 100755
--- a/scripts/clang-tools/gen_compile_commands.py
+++ b/scripts/clang-tools/gen_compile_commands.py
@@ -56,6 +56,9 @@ def parse_arguments():
     ar_help = 'command used for parsing .a archives'
     parser.add_argument('-a', '--ar', type=str, default='llvm-ar', help=ar_help)
 
+    gcc_help = 'tidy up gcc invocations to work with clang'
+    parser.add_argument('-g', '--gcc', action='store_true', help=gcc_help)
+
     paths_help = ('directories to search or files to parse '
                   '(files should be *.o, *.a, or modules.order). '
                   'If nothing is specified, the current directory is searched')
@@ -67,6 +70,7 @@ def parse_arguments():
             os.path.abspath(args.directory),
             args.output,
             args.ar,
+            args.gcc,
             args.paths if len(args.paths) > 0 else [args.directory])
 
 
@@ -196,10 +200,73 @@ def process_line(root_directory, command_prefix, file_path):
         'command': prefix + file_path,
     }
 
+clang_options = {}
+
+def check_clang_compatibility(target, flag):
+    """Check that the supplied flag does not cause clang to return an error.
+
+    The results of the check, which is expensive if repeated many times, is
+    cached in the clang_options variable and reused in subsequent calls.
+    """
+    global clang_options
+    if flag in clang_options:
+        return clang_options[flag]
+
+    c = 'echo "int f;"| clang {} {} - -E > /dev/null 2>&1'.format(target, flag)
+    rc = os.system(c)
+    compatible = rc == 0
+    clang_options[flag] = compatible
+    if not compatible:
+        logging.info('Not supported by clang: %s', flag)
+
+    return compatible
+
+def make_clang_compatible(entry):
+    """Scans and transforms the command line options to make the invocation
+    compatible with clang.
+
+    There are two main heuristics:
+
+    1. Use the gcc compiler prefix to populate the clang --target variable
+       (which is needed for cross-compiles to work correctly)
+
+    2. Scan for any -f or -m options that are not supported by clang and
+       discard them.
+
+    This allows us to use clang tools on our kernel builds even if we built the
+    kernel using gcc.
+    """
+    newcmd = []
+    target = ''
+
+    # Splitting the command line like this isn't going to handle quoted
+    # strings transparently. However assuming the quoted string does not
+    # contain tabs, double spaces or words commencing with '-f' or '-m'
+    # (which is fairly reasonable) then this simple approach will be
+    # sufficient.
+    atoms = entry['command'].split()
+
+    # Use the compiler prefix as the clang --target variable
+    if atoms[0].endswith('-gcc'):
+        target = '--target=' + os.path.basename(atoms[0][:-4])
+        newcmd.append(atoms[0])
+        newcmd.append(target)
+        del atoms[0]
+
+    # Drop incompatible flags that provoke fatal errors for clang. Note that
+    # unsupported -Wenable-warning flags are not fatal so we don't have to
+    # worry about those.
+    for atom in atoms:
+        if atom.startswith('-f') or atom.startswith('-m'):
+            if not check_clang_compatibility(target, atom):
+                continue
+        newcmd.append(atom)
+
+    entry['command'] = ' '.join(newcmd)
 
 def main():
     """Walks through the directory and finds and parses .cmd files."""
-    log_level, directory, output, ar, paths = parse_arguments()
+    log_level, directory, output, ar, gcc, paths = parse_arguments()
 
     level = getattr(logging, log_level)
     logging.basicConfig(format='%(levelname)s: %(message)s', level=level)
@@ -232,6 +299,8 @@ def main():
                     try:
                         entry = process_line(directory, result.group(1),
                                              result.group(2))
+                        if gcc:
+                            make_clang_compatible(entry)
                         compile_commands.append(entry)
                     except ValueError as err:
                         logging.info('Could not add line from %s: %s',
-- 
2.35.1
