linux-kernel - Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel

Message-ID: <20180403070412.GH3313@nanopsycho>
Date:   Tue, 3 Apr 2018 09:04:12 +0200
From:   Jiri Pirko <jiri@...nulli.us>
To:     Rahul Lakkireddy <rahul.lakkireddy@...lsio.com>
Cc:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        "kexec@...ts.infradead.org" <kexec@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
        "stephen@...workplumber.org" <stephen@...workplumber.org>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>,
        Ganesh GR <ganeshgr@...lsio.com>,
        Nirranjan Kirubaharan <nirranjan@...lsio.com>,
        Indranil Choudhury <indranil@...lsio.com>
Subject: Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware
 dump in second kernel

Mon, Apr 02, 2018 at 02:30:45PM CEST, rahul.lakkireddy@...lsio.com wrote:
>On Monday, April 04/02/18, 2018 at 14:41:43 +0530, Jiri Pirko wrote:
>> Fri, Mar 30, 2018 at 08:42:00PM CEST, ebiederm@...ssion.com wrote:
>> >Rahul Lakkireddy <rahul.lakkireddy@...lsio.com> writes:
>> >
>> >> On Friday, March 03/30/18, 2018 at 16:09:07 +0530, Jiri Pirko wrote:
>> >>> Sat, Mar 24, 2018 at 11:56:33AM CET, rahul.lakkireddy@...lsio.com wrote:
>> >>> >Add a new module crashdd that exports the /sys/kernel/crashdd/
>> >>> >directory in second kernel, containing collected hardware/firmware
>> >>> >dumps.
>> >>> >
>> >>> >The sequence of actions done by device drivers to append their device
>> >>> >specific hardware/firmware logs to /sys/kernel/crashdd/ directory is
>> >>> >as follows:
>> >>> >
>> >>> >1. During probe (before hardware is initialized), device drivers
>> >>> >register to the crashdd module (via crashdd_add_dump()), with
>> >>> >callback function, along with buffer size and log name needed for
>> >>> >firmware/hardware log collection.
>> >>> >
>> >>> >2. Crashdd creates a driver's directory under
>> >>> >/sys/kernel/crashdd/<driver>. Then, it allocates the buffer with
>> >>> 
>> >>> This smells. I need to identify the exact ASIC instance that produced
>> >>> the dump. To identify by driver name does not help me if I have multiple
>> >>> instances of the same driver. This looks wrong to me. This looks like
>> >>> a job for devlink where you have 1 devlink instance per 1 ASIC instance.
>> >>> 
>> >>> Please see:
>> >>> http://patchwork.ozlabs.org/project/netdev/list/?series=36524
>> >>> 
>> >>> I believe that the solution in the patchset could be used for
>> >>> your use case too.
>> >>> 
>> >>> 
>> >>
>> >> The sysfs approach proposed here has been dropped in favour of exporting
>> >> the dumps as ELF notes in /proc/vmcore.
>> >>
>> >> Will be posting the new patches soon.
>> >
>> >The concern was actually how you identify which device the dump came
>> >from. Where you read the identifier changes, but whether it is sysfs or
>> >/proc/vmcore, the concern remains valid.
>> 
>> Yeah. I still don't see how you link the dump and the device.
>
>In our case, the dump and the device are being identified by the
>driver’s name followed by its corresponding pci bus id.  I’ve posted an
>example in my v3 series:
>
>https://www.spinics.net/lists/netdev/msg493781.html
>
>Here’s an extract from the link above:
>
># readelf -n /proc/vmcore
>
>Displaying notes found at file offset 0x00001000 with length 0x04003288:
>Owner                 Data size     Description
>VMCOREDD_cxgb4_0000:02:00.4 0x02000fd8      Unknown note type:(0x00000700)
>VMCOREDD_cxgb4_0000:04:00.4 0x02000fd8      Unknown note type:(0x00000700)
>CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
>CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
>CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
>CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
>CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
>CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
>CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
>CORE                 0x00000150     NT_PRSTATUS (prstatus structure)
>VMCOREINFO           0x0000074f     Unknown note type: (0x00000000)
>
>Here, for my two devices, the dump’s names are
>VMCOREDD_cxgb4_0000:02:00.4 and VMCOREDD_cxgb4_0000:04:00.4.
>
>It’s really up to the callers to write their own unique name for the
>dump.  The name is appended to “VMCOREDD_” string.
>
>> Rahul, did you look at the patchset I pointed out?
>
>For devlink, I think the dump name would be identified by
>bus_type/device_name; i.e. “pci/0000:02:00.4” for my example.
>Is my understanding correct?

Yes.


>
>Thanks,
>Rahul
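
For illustration only, the naming convention discussed above (the
"VMCOREDD_" prefix, then the driver name, then the PCI bus id) and its
mapping to a devlink-style bus_type/device_name handle can be sketched as
follows. This is a minimal sketch, not code from either patchset: the
function names dump_name() and devlink_handle() are hypothetical, and it
assumes the caller-chosen name always ends in a PCI address of the form
domain:bus:device.function, which the thread says is up to each driver.

```python
import re

# Prefix that, per the thread, is prepended to every caller-supplied dump name.
VMCOREDD_PREFIX = "VMCOREDD_"

# Assumption: the caller-chosen name is "<driver>_<pci address>", where the
# address looks like 0000:02:00.4 (domain:bus:device.function).
NAME_RE = re.compile(
    r"^(?P<driver>.+)_"
    r"(?P<addr>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])$"
)


def dump_name(driver, pci_addr):
    """Build a note owner name the way the cxgb4 example does (hypothetical helper)."""
    return f"{VMCOREDD_PREFIX}{driver}_{pci_addr}"


def devlink_handle(note_owner, bus="pci"):
    """Map a VMCOREDD note owner to a bus_type/device_name handle.

    Returns None for notes that are not device dumps (CORE, VMCOREINFO)
    or whose name does not follow the assumed convention.
    """
    if not note_owner.startswith(VMCOREDD_PREFIX):
        return None
    match = NAME_RE.match(note_owner[len(VMCOREDD_PREFIX):])
    if match is None:
        return None
    return f"{bus}/{match.group('addr')}"


if __name__ == "__main__":
    for owner in ("VMCOREDD_cxgb4_0000:02:00.4", "CORE", "VMCOREINFO"):
        print(owner, "->", devlink_handle(owner))
```

Because the name is free-form and chosen by each driver, a parser like
this only works for drivers that follow the same convention; a devlink
handle carries the bus type and device name explicitly instead.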