Message-Id: <20250412122911.327134-1-sashal@kernel.org>
Date: Sat, 12 Apr 2025 08:29:11 -0400
From: Sasha Levin <sashal@...nel.org>
To: workflows@...r.kernel.org
Cc: linux-kernel@...r.kernel.org,
	Sasha Levin <sashal@...nel.org>
Subject: [PATCH] verify_pull_requests: initial pull request sanitizer

I'm extending the work I'm doing on the linus-next integration branch,
and this seemed like another useful tool.

Verify that either the sender of the pull request or one of the signers
of its commits is listed as a maintainer for the subsystem the patches
are destined for. This gives us two things:

1. Audit the correctness of the MAINTAINERS file, and provide an
   opportunity to correct and add missing "tribal knowledge" (folks who
   are the de-facto maintainers, but are not listed in MAINTAINERS).

2. Verify that inadvertent changes are not included in a pull request.
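
As a rough sketch of the per-commit decision (not the script itself: the
helper name classify_commit and its three arguments are made up for
illustration, and the maintainer list is hand-written rather than real
get_maintainer.pl output), the check comes down to something like:

```shell
#!/bin/bash
# Hypothetical helper, for illustration only.
#   $1: newline-separated maintainer list
#   $2: e-mail address of the pull request sender
#   $3: newline-separated e-mail addresses of the commit's signers
# Prints "sender", "signer" or "none".
classify_commit() {
    maintainers=$1
    sender=$2
    signers=$3

    # Fixed-string match (-F) so dots in an address are not regex wildcards.
    if [ -n "$sender" ] && printf '%s\n' "$maintainers" | grep -qF -- "$sender"; then
        echo "sender"
        return 0
    fi

    # Fall back to the signers, one address per line.
    while IFS= read -r signer; do
        [ -z "$signer" ] && continue
        if printf '%s\n' "$maintainers" | grep -qF -- "$signer"; then
            echo "signer"
            return 0
        fi
    done <<EOF
$signers
EOF

    echo "none"
}

# Made-up data; example.org addresses are placeholders.
maintainers='Alice Example <alice@example.org> (maintainer:FOO DRIVER)
Bob Example <bob@example.org> (maintainer:FOO DRIVER)'

classify_commit "$maintainers" "alice@example.org" "carol@example.org"
classify_commit "$maintainers" "carol@example.org" "bob@example.org"
classify_commit "$maintainers" "carol@example.org" "dave@example.org"
```

The three calls print "sender", "signer" and "none". In terms of the
tool's report, those would correspond to a clean pass, a pass with the
"sender is NOT listed" warning, and a failed verification.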

Below is an example output of the tool. Take note that for pull request
#3 we see a warning because Jens isn't listed as a maintainer for
drivers/nvme/ even though he is sending pull requests for it.

$ ./scripts/verify_pull_requests.sh --days 1
Number of pull requests in the last 1 day(s): 5
Processing pull requests...
Pull request #1: http://lore.kernel.org/all/CAH2r5mt3CCXVEwdsrqPe1VE+xebPSh2k4Wg5Zqqp_OCm+m7cPQ@mail.gmail.com/
  Sender: Steve French <smfrench@...il.com>
  Repository: git://git.samba.org/sfrench/cifs-2.6.git
  Branch/Tag: tags/v6.15-rc1-smb3-client-fixes
  Fetching: git fetch "git://git.samba.org/sfrench/cifs-2.6.git" "tags/v6.15-rc1-smb3-client-fixes"
  Fetch: ✅ Successfully fetched
  Checking maintainer status for 10 commit(s)...
  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits
------------------------
Pull request #2: http://lore.kernel.org/all/20250411181650.GA372618@bhelgaas/
  Sender: Bjorn Helgaas <helgaas@...nel.org>
  Repository: git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git
  Branch/Tag: tags/pci-v6.15-fixes-1
  Fetching: git fetch "git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git" "tags/pci-v6.15-fixes-1"
  Fetch: ✅ Successfully fetched
  Checking maintainer status for 1 commit(s)...
  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits
------------------------
Pull request #3: http://lore.kernel.org/all/8d3e5d98-09b1-4274-af25-124c91342b7a@kernel.dk/
  Sender: Jens Axboe <axboe@...nel.dk>
  Repository: git://git.kernel.dk/linux.git
  Branch/Tag: tags/block-6.15-20250411
  Fetching: git fetch "git://git.kernel.dk/linux.git" "tags/block-6.15-20250411"
  Fetch: ✅ Successfully fetched
  Checking maintainer status for 13 commit(s)...
  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits
  ⚠️  Warning: Sender is NOT listed as maintainer for these commits (but a signer is):
    - 70289ae5cac4d nvmet-fc: put ref when assoc->del_work is already scheduled
    - b0b26ad0e1943 nvmet-fc: take tgtport reference only once
    - 1a909565733ed nvmet-fc: update tgtport ref per assoc
    - 88517565b5929 nvmet-fc: inline nvmet_fc_free_hostport
    - aeaa0913a6994 nvmet-fc: inline nvmet_fc_delete_assoc
    - 72511b1dc4147 nvmet-fcloop: add ref counting to lport
    - f22c458f9495f nvmet-fcloop: replace kref with refcount
    - 2b5f0c5bc819a nvmet-fcloop: swap list_add_tail arguments
------------------------
Pull request #4: http://lore.kernel.org/all/Z_kntkZxksOfGwpt@8bytes.org/
  Sender: Joerg Roedel <joro@...tes.org>
  Repository: git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux.git
  Branch/Tag: tags/iommu-fixes-v6.15-rc1
  Fetching: git fetch "git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux.git" "tags/iommu-fixes-v6.15-rc1"
  Fetch: ✅ Successfully fetched
  Checking maintainer status for 9 commit(s)...
  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits
------------------------
Pull request #5: http://lore.kernel.org/all/CAJZ5v0iEn-Lyic6zxDehxF1HHfNfg11_S7COMsHnZeQ+TzZAsA@mail.gmail.com/
  Sender: "Rafael J. Wysocki" <rafael@...nel.org>
  Repository: git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
  Branch/Tag: acpi-6.15-rc2
  Fetching: git fetch "git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git" "tags/acpi-6.15-rc2"
  Fetch: ✅ Successfully fetched
  Checking maintainer status for 3 commit(s)...
  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits
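
Pull request #5 above also shows the ref handling: the mail names
"acpi-6.15-rc2", and the script first tries it as "tags/acpi-6.15-rc2"
before falling back to the bare name. A minimal sketch of that ordering,
using a hypothetical helper name (ref_candidates) that does not appear
in the script:

```shell
#!/bin/bash
# Hypothetical helper, for illustration only.
# Prints the refs to try, one per line, in order of preference.
ref_candidates() {
    ref=$1
    case "$ref" in
        refs/*|tags/*|remotes/*)
            # Already qualified: try it as given.
            printf '%s\n' "$ref"
            ;;
        *)
            # Bare name: try it as a tag first, then as given.
            printf 'tags/%s\n' "$ref"
            printf '%s\n' "$ref"
            ;;
    esac
}

ref_candidates "acpi-6.15-rc2"
ref_candidates "tags/pci-v6.15-fixes-1"
```

This prints "tags/acpi-6.15-rc2" and "acpi-6.15-rc2" for the first call
and only "tags/pci-v6.15-fixes-1" for the second. A caller could loop
over the candidates and stop at the first successful git fetch, which
might allow the verification code to exist once instead of once per
fetch attempt.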

Signed-off-by: Sasha Levin <sashal@...nel.org>
---
 scripts/verify_pull_requests.sh | 393 ++++++++++++++++++++++++++++++++
 1 file changed, 393 insertions(+)
 create mode 100755 scripts/verify_pull_requests.sh

diff --git a/scripts/verify_pull_requests.sh b/scripts/verify_pull_requests.sh
new file mode 100755
index 0000000000000..3dd6492a71d2f
--- /dev/null
+++ b/scripts/verify_pull_requests.sh
@@ -0,0 +1,393 @@
+#!/bin/bash
+#set -x
+
+# Default number of days to search
+days=1
+
+# Parse command line arguments
+while [ "$#" -gt 0 ]; do
+    case "$1" in
+        --days)
+            shift
+            if [[ "$1" =~ ^[0-9]+$ ]]; then
+                days="$1"
+            else
+                echo "Error: --days requires a numeric argument"
+                exit 1
+            fi
+            ;;
+        *)
+            echo "Unknown option: $1"
+            echo "Usage: $0 [--days N]"
+            exit 1
+            ;;
+    esac
+    shift
+done
+
+URL="https://lore.kernel.org/all/?q=s:%22GIT+PULL%22+AND+t:torvalds+AND+rt:${days}.day.ago...+AND+NOT+s:re:&x=A"
+
+temp_file=$(mktemp)
+curl -s "$URL" > "$temp_file"
+
+count=$(grep -c "<entry>" "$temp_file")
+echo "Number of pull requests in the last ${days} day(s): $count"
+
+# Extract message URLs and filter out query parameters and #related links
+message_urls=$(grep -o "http://lore.kernel.org/all/[^\"]*" "$temp_file" | grep -v "\\?" | grep -v "#related")
+
+echo "Processing pull requests..."
+
+count=0
+while read -r message_url; do
+    count=$((count + 1))
+    echo "Pull request #$count: $message_url"
+
+    message_content=$(mktemp)
+    curl -s -L "$message_url" > "$message_content"
+
+    email_content=$(cat "$message_content")
+
+    # Extract and clean sender information
+    from_line=$(echo "$email_content" | grep -o "From:.*" | head -1)
+    from_line=$(echo "$from_line" | sed 's/&lt;/</g' | sed 's/&gt;/>/g' | sed 's/&#34;/"/g' | sed 's/&quot;/"/g')
+
+    if [[ "$from_line" =~ From:[[:space:]]+(.*)[[:space:]]+\<([^>]+)\> ]]; then
+        sender_name="${BASH_REMATCH[1]}"
+        sender_email="${BASH_REMATCH[2]}"
+        sender_name=$(echo "$sender_name" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
+        sender_email=$(echo "$sender_email" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
+        echo "  Sender: $sender_name <$sender_email>"
+    else
+        echo "  Sender: $(echo "$from_line" | sed 's/From: //')"
+    fi
+
+    found_repo=false
+    repo="" branch=""
+    repo_no_branch="" line_no_branch=""  # reset so a previous pull request's values are not reused
+
+    # Try extraction methods in order of preference
+
+    # 1. Extract repo from HTML links
+    html_href_lines=$(echo "$email_content" | grep -n '<a[[:space:]]*href=".*git.*"')
+
+    if [ -n "$html_href_lines" ]; then
+        while read -r numbered_line; do
+            line_num=$(echo "$numbered_line" | cut -d: -f1)
+            line=$(echo "$numbered_line" | cut -d: -f2-)
+
+            if [[ $line =~ href=\"([^\"]*gitlab[^\"]*|[^\"]*git[^\"]*|[^\"]*kernel\.org[^\"]*)\" ]]; then
+                repo="${BASH_REMATCH[1]}"
+
+                # Check for branch on same line or next line
+                if [[ $line =~ \</a\>([[:space:]]*([[:alnum:]/_.-]+)) ]]; then
+                    branch="${BASH_REMATCH[2]}"
+                    echo "  Repository: $repo"
+                    echo "  Branch/Tag: $branch"
+                    found_repo=true
+                    break
+                else
+                    next_line_num=$((line_num + 1))
+                    next_line=$(echo "$email_content" | sed -n "${next_line_num}p")
+                    next_line=$(echo "$next_line" | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$//')
+
+                    if [[ $next_line =~ ^[[:alnum:]/_.-]+$ ]]; then
+                        branch="$next_line"
+                        echo "  Repository: $repo"
+                        echo "  Branch/Tag: $branch"
+                        found_repo=true
+                        break
+                    elif [ "$found_repo" = false ]; then
+                        repo_no_branch=$repo
+                        line_no_branch=$line
+                    fi
+                fi
+            fi
+        done <<< "$html_href_lines"
+    fi
+
+    # 2. Extract repo from plain text if not found in HTML
+    if [ "$found_repo" = false ]; then
+        repo_lines=$(echo "$email_content" | grep -n -i "git://\|https://git\|git@" | grep -v "href=")
+
+        if [ -n "$repo_lines" ]; then
+            while read -r numbered_line; do
+                line_num=$(echo "$numbered_line" | cut -d: -f1)
+                line=$(echo "$numbered_line" | cut -d: -f2-)
+
+                if [[ $line =~ (git://|ssh://git|https://git|git@)[^[:space:]]+(/[^[:space:]]+)+ ]]; then
+                    repo="${BASH_REMATCH[0]}"
+                    repo=$(echo "$repo" | sed 's/[,.\\]$//' | sed 's/[[:space:]]*$//')
+
+                    if [[ $line =~ $repo[[:space:]]+([[:alnum:]/_.-]+) ]]; then
+                        branch="${BASH_REMATCH[1]}"
+                        echo "  Repository: $repo"
+                        echo "  Branch/Tag: $branch"
+                        found_repo=true
+                        break
+                    else
+                        next_line_num=$((line_num + 1))
+                        next_line=$(echo "$email_content" | sed -n "${next_line_num}p")
+                        next_line=$(echo "$next_line" | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$//')
+
+                        if [[ $next_line =~ ^[[:alnum:]/_.-]+$ ]]; then
+                            branch="$next_line"
+                            echo "  Repository: $repo"
+                            echo "  Branch/Tag: $branch"
+                            found_repo=true
+                            break
+                        elif [ "$found_repo" = false ]; then
+                            repo_no_branch=$repo
+                            line_no_branch=$line
+                        fi
+                    fi
+                fi
+            done <<< "$repo_lines"
+        fi
+    fi
+
+    # 3. Try "available in the Git repository at:" section
+    if [ "$found_repo" = false ]; then
+        main_repo_section=$(echo "$email_content" | grep -A 10 "available in the Git repository at")
+
+        if [ -n "$main_repo_section" ]; then
+            if [[ $main_repo_section =~ href=\"([^\"]*gitlab[^\"]*|[^\"]*git[^\"]*|[^\"]*kernel\.org[^\"]*) ]]; then
+                repo="${BASH_REMATCH[1]}"
+                echo "  Repository: $repo"
+                found_repo=true
+
+                tags_line=$(echo "$main_repo_section" | grep -o "tags/[[:alnum:]/_.-]*" | head -1)
+                if [ -n "$tags_line" ]; then
+                    branch="$tags_line"
+                    echo "  Branch/Tag: $branch"
+                fi
+            fi
+        fi
+    fi
+
+    # 4. Use repo without branch if that's all we found
+    if [ "$found_repo" = false ] && [ -n "${repo_no_branch:-}" ]; then
+        repo="$repo_no_branch"
+        echo "  Repository: $repo"
+        echo "  Context: $line_no_branch"
+        found_repo=true
+    fi
+
+    if [ "$found_repo" = false ]; then
+        echo "  No repository URL found in this pull request."
+    else
+        # Convert ssh URLs to git URLs for verification
+        verification_repo="$repo"
+
+        # Handle different git URL formats for kernel.org
+        if [[ "$verification_repo" =~ ^ssh://git@...olite\.kernel\.org(.*) ]]; then
+            verification_repo="git://git.kernel.org${BASH_REMATCH[1]}"
+            echo "  Using git URL for verification: $verification_repo"
+        fi
+
+        if [[ "$verification_repo" =~ ^git@...olite\.kernel\.org:(.*) ]]; then
+            verification_repo="git://git.kernel.org/${BASH_REMATCH[1]}"
+            echo "  Using git URL for verification: $verification_repo"
+        fi
+
+        if [ -n "$verification_repo" ] && [ -n "$branch" ]; then
+            # Try fetching, first with tags/ prefix if needed
+            fetch_ref="$branch"
+            if [[ ! "$branch" =~ ^(refs/|tags/) ]] && [[ ! "$branch" =~ ^remotes/ ]]; then
+                fetch_ref="tags/$branch"
+            fi
+
+            echo "  Fetching: git fetch \"$verification_repo\" \"$fetch_ref\""
+            if git fetch "$verification_repo" "$fetch_ref" 2>/dev/null; then
+                echo "  Fetch: ✅ Successfully fetched"
+
+                # Check if there are any commits to verify
+                commit_hashes=$(git rev-list --no-merges origin/master..FETCH_HEAD 2>/dev/null)
+
+                if [ -z "$commit_hashes" ]; then
+                    echo "  ℹ️ No new commits found. Pull request likely already merged."
+                else
+                    total_commits=$(echo "$commit_hashes" | wc -l)
+                    echo "  Checking maintainer status for $total_commits commit(s)..."
+
+                    # Array to store problematic commits
+                    problematic_commits=()
+                    # Array to store commits where sender is not maintainer but a signer is
+                    sender_not_maintainer_commits=()
+
+                    # Check each commit silently
+                    while read -r commit_hash; do
+                        [ -z "$commit_hash" ] && continue
+
+                        commit_msg=$(git log -1 --pretty=format:"%h %s" "$commit_hash")
+
+                        if [ -f "scripts/get_maintainer.pl" ]; then
+                            maintainers=$(git show "$commit_hash" | ./scripts/get_maintainer.pl)
+                            signoffs=$(git show -s --format=%b "$commit_hash" | grep -i "Signed-off-by:" | sed 's/^[[:space:]]*Signed-off-by:[[:space:]]*//')
+
+                            valid_maintainer=false
+                            sender_is_maintainer=false
+
+                            # Check if sender is a maintainer
+                            if echo "$maintainers" | grep -qF -- "$sender_email" || echo "$maintainers" | grep -qF -- "$sender_name"; then
+                                valid_maintainer=true
+                                sender_is_maintainer=true
+                            else
+                                # Check if any signoff person is a maintainer
+                                while read -r signoff; do
+                                    [ -z "$signoff" ] && continue
+
+                                    # Extract name and email from signoff
+                                    if [[ "$signoff" =~ (.*)[[:space:]]+\<([^>]+)\> ]]; then
+                                        signer_name="${BASH_REMATCH[1]}"
+                                        signer_email="${BASH_REMATCH[2]}"
+                                        signer_name=$(echo "$signer_name" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
+                                        signer_email=$(echo "$signer_email" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
+
+                                        if echo "$maintainers" | grep -qF -- "$signer_email" || echo "$maintainers" | grep -qF -- "$signer_name"; then
+                                            valid_maintainer=true
+                                            break
+                                        fi
+                                    fi
+                                done <<< "$signoffs"
+                            fi
+
+                            # Add to problematic commits if no valid maintainer found
+                            if [ "$valid_maintainer" = false ]; then
+                                problematic_commits+=("$commit_msg")
+                            # Track commits where sender is not a maintainer but a signer is
+                            elif [ "$sender_is_maintainer" = false ]; then
+                                sender_not_maintainer_commits+=("$commit_msg")
+                            fi
+                        fi
+                    done <<< "$commit_hashes"
+
+                    # Display results based on problematic commits
+                    if [ ${#problematic_commits[@]} -eq 0 ]; then
+                        echo "  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits"
+
+                        # Add warning if we found commits where sender is not a maintainer
+                        if [ ${#sender_not_maintainer_commits[@]} -gt 0 ]; then
+                            echo "  ⚠️  Warning: Sender is NOT listed as maintainer for these commits (but a signer is):"
+                            for commit in "${sender_not_maintainer_commits[@]}"; do
+                                echo "    - $commit"
+                            done
+                        fi
+                    else
+                        echo "  ❌ Maintainer verification: Neither sender nor any signers are listed as maintainers for these commits:"
+                        for commit in "${problematic_commits[@]}"; do
+                            echo "    - $commit"
+                        done
+                    fi
+                fi
+            else
+                # Try without tags/ prefix if the first attempt failed
+                if [[ "$fetch_ref" == tags/* ]]; then
+                    fetch_ref="${branch}"
+                    echo "  Fetching: git fetch \"$verification_repo\" \"$fetch_ref\""
+                    if git fetch "$verification_repo" "$fetch_ref" 2>/dev/null; then
+                        echo "  Fetch: ✅ Successfully fetched"
+
+                        # Check if there are any commits to verify
+                        commit_hashes=$(git rev-list --no-merges origin/master..FETCH_HEAD 2>/dev/null)
+
+                        if [ -z "$commit_hashes" ]; then
+                            echo "  ℹ️ No new commits found. Pull request likely already merged."
+                        else
+                            total_commits=$(echo "$commit_hashes" | wc -l)
+                            echo "  Checking maintainer status for $total_commits commit(s)..."
+
+                            # Array to store problematic commits
+                            problematic_commits=()
+                            # Array to store commits where sender is not maintainer but a signer is
+                            sender_not_maintainer_commits=()
+
+                            # Check each commit silently
+                            while read -r commit_hash; do
+                                [ -z "$commit_hash" ] && continue
+
+                                commit_msg=$(git log -1 --pretty=format:"%h %s" "$commit_hash")
+
+                                if [ -f "scripts/get_maintainer.pl" ]; then
+                                    maintainers=$(git show "$commit_hash" | ./scripts/get_maintainer.pl)
+                                    signoffs=$(git show -s --format=%b "$commit_hash" | grep -i "Signed-off-by:" | sed 's/^[[:space:]]*Signed-off-by:[[:space:]]*//')
+
+                                    valid_maintainer=false
+                                    sender_is_maintainer=false
+
+                                    # Check if sender is a maintainer
+                                    if echo "$maintainers" | grep -qF -- "$sender_email" || echo "$maintainers" | grep -qF -- "$sender_name"; then
+                                        valid_maintainer=true
+                                        sender_is_maintainer=true
+                                    else
+                                        # Check if any signoff person is a maintainer
+                                        while read -r signoff; do
+                                            [ -z "$signoff" ] && continue
+
+                                            # Extract name and email from signoff
+                                            if [[ "$signoff" =~ (.*)[[:space:]]+\<([^>]+)\> ]]; then
+                                                signer_name="${BASH_REMATCH[1]}"
+                                                signer_email="${BASH_REMATCH[2]}"
+                                                signer_name=$(echo "$signer_name" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
+                                                signer_email=$(echo "$signer_email" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
+
+                                                if echo "$maintainers" | grep -qF -- "$signer_email" || echo "$maintainers" | grep -qF -- "$signer_name"; then
+                                                    valid_maintainer=true
+                                                    break
+                                                fi
+                                            fi
+                                        done <<< "$signoffs"
+                                    fi
+
+                                    # Add to problematic commits if no valid maintainer found
+                                    if [ "$valid_maintainer" = false ]; then
+                                        problematic_commits+=("$commit_msg")
+                                    # Track commits where sender is not a maintainer but a signer is
+                                    elif [ "$sender_is_maintainer" = false ]; then
+                                        sender_not_maintainer_commits+=("$commit_msg")
+                                    fi
+                                fi
+                            done <<< "$commit_hashes"
+
+                            # Display results based on problematic commits
+                            if [ ${#problematic_commits[@]} -eq 0 ]; then
+                                echo "  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits"
+
+                                # Add warning if we found commits where sender is not a maintainer
+                                if [ ${#sender_not_maintainer_commits[@]} -gt 0 ]; then
+                                    echo "  ⚠️  Warning: Sender is NOT listed as maintainer for these commits (but a signer is):"
+                                    for commit in "${sender_not_maintainer_commits[@]}"; do
+                                        echo "    - $commit"
+                                    done
+                                fi
+                            else
+                                echo "  ❌ Maintainer verification: Neither sender nor any signers are listed as maintainers for these commits:"
+                                for commit in "${problematic_commits[@]}"; do
+                                    echo "    - $commit"
+                                done
+                            fi
+                        fi
+                    else
+                        echo "  Fetch: ❌ Failed to fetch"
+                    fi
+                else
+                    echo "  Fetch: ❌ Failed to fetch"
+                fi
+            fi
+        elif [ -n "$verification_repo" ]; then
+            # If we only have the repository but no branch/tag, just verify the repository exists
+            echo "  Verifying: git ls-remote --exit-code \"$verification_repo\""
+            if git ls-remote --exit-code "$verification_repo" > /dev/null 2>&1; then
+                echo "  Verification: ✅ Repository exists"
+            else
+                echo "  Verification: ❌ Could not access repository"
+            fi
+        fi
+    fi
+
+    rm "$message_content"
+
+    echo "------------------------"
+done <<< "$message_urls"
+
+rm "$temp_file"
-- 
2.39.5

