lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 25 Jul 2018 15:12:11 +0200
From:   Arnd Bergmann <arnd@...db.de>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Joe Perches <joe@...ches.com>,
        Samuel Ortiz <sameo@...ux.intel.com>,
        "David S. Miller" <davem@...emloft.net>,
        Rob Herring <robh+dt@...nel.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        Jonathan Cameron <jic23@...nel.org>,
        linux-wireless <linux-wireless@...r.kernel.org>,
        Networking <netdev@...r.kernel.org>,
        DTML <devicetree@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        "open list:HARDWARE RANDOM NUMBER GENERATOR CORE" 
        <linux-crypto@...r.kernel.org>,
        linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
        linux-iio@...r.kernel.org,
        Linux PM list <linux-pm@...r.kernel.org>,
        lvs-devel@...r.kernel.org, netfilter-devel@...r.kernel.org,
        coreteam@...filter.org
Subject: Re: [PATCH 1/4] treewide: convert ISO_8859-1 text comments to utf-8

tools/perf/tests/.gitignore:
                            LLVM byte-codes, uncompressed
On Wed, Jul 25, 2018 at 2:55 AM, Andrew Morton
<akpm@...ux-foundation.org> wrote:
> On Tue, 24 Jul 2018 17:13:20 -0700 Joe Perches <joe@...ches.com> wrote:
>
>> On Tue, 2018-07-24 at 14:00 -0700, Andrew Morton wrote:
>> > On Tue, 24 Jul 2018 13:13:25 +0200 Arnd Bergmann <arnd@...db.de> wrote:
>> > > Almost all files in the kernel are either plain text or UTF-8
>> > > encoded. A couple however are ISO_8859-1, usually just a few
>> > > characters in a C comments, for historic reasons.
>> > > This converts them all to UTF-8 for consistency.
>> []
>> > Will we be getting a checkpatch rule to keep things this way?
>>
>> How would that be done?
>
> I'm using this, seems to work.
>
>         if ! file $p | grep -q -P ", ASCII text|, UTF-8 Unicode text"
>         then
>                 echo $p: weird charset
>         fi

There are a couple of files that my version of 'find' incorrectly identified as
something completely different, like:

Documentation/devicetree/bindings/pinctrl/pinctrl-sx150x.txt:
            SemOne archive data
Documentation/devicetree/bindings/rtc/epson,rtc7301.txt:
            Microsoft Document Imaging Format
Documentation/filesystems/nfs/pnfs-block-server.txt:
            PPMN archive data
arch/arm/boot/dts/bcm283x-rpi-usb-host.dtsi:
        Sendmail frozen configuration  - version = "host";
Documentation/networking/segmentation-offloads.txt:
        StuffIt Deluxe Segment (data) : gmentation Offloads in the
Linux Networking Stack
arch/sparc/include/asm/visasm.h:                              SAS 7+
arch/xtensa/kernel/setup.c:                                         ,
init=0x454c, stat=0x090a, dev=0x2009, bas=0x2020
drivers/cpufreq/powernow-k8.c:
TI-XX Graphing Calculator (FLASH)
tools/testing/selftests/net/forwarding/tc_shblocks.sh:
                            Minix filesystem, V2 (big endian)
tools/perf/tests/.gitignore:
                            LLVM byte-codes, uncompressed

All of the above seem to be valid ASCII or UTF-8 files, so the check
above will lead
to false-positives, but it may be good enough as they are the
exception, and may be
bugs in 'file'.

Not sure if we need to worry about 'file' not being installed.

       Arnd

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ