Date:   Tue, 20 Feb 2018 17:04:20 -0800
From:   Jakub Kicinski <kubakici@...pl>
To:     Florian Fainelli <f.fainelli@...il.com>
Cc:     Rahul Lakkireddy <rahul.lakkireddy@...lsio.com>,
        David Miller <davem@...emloft.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Ganesh GR <ganeshgr@...lsio.com>,
        Nirranjan Kirubaharan <nirranjan@...lsio.com>,
        Indranil Choudhury <indranil@...lsio.com>
Subject: Re: [PATCH net-next] cxgb4: append firmware dump to vmcore in
 kernel panic

On Tue, 20 Feb 2018 16:51:03 -0800, Florian Fainelli wrote:
> On 02/20/2018 04:43 PM, Jakub Kicinski wrote:
> > On Mon, 19 Feb 2018 18:04:17 +0530, Rahul Lakkireddy wrote:  
> >> Our requirement is to analyze the state of the firmware/hardware at
> >> the time of a kernel panic.
> > 
> > I was wondering about this since you posted the patch, and I can't
> > come up with any specific scenario where a kernel crash would
> > correlate clearly with device state in a non-trivial way.
> > 
> > Perhaps there is something about cxgb4 HW/FW that makes this useful.
> > Could you explain?  Could you give a real-life example of a bug?
> > Is it related to the TOE-looking TLS offload Atul is posting?
> > 
> > Is the panic you're targeting here real, or manually triggered from
> > user space to get a full dump of the kernel and FW?
> > 
> > That's me trying to guess what you're doing.. :)
> 
> One case where this might be helpful is if you are chasing down DMA
> corruption and you would like to get a nearly instant capture of both
> the kernel's memory and the adapter which may be responsible for that.
> This is probably not 100% foolproof because there is a timing window
> during which the dumps of both contexts are going to happen, and that
> alone might be influencing the captured memory view. Just guessing of
> course.
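
To make that scenario concrete, here is a minimal sketch of a
panic-time capture: hook the kernel's panic notifier chain so the
driver can snapshot adapter state before the vmcore is written.  The
mydrv_* names are hypothetical; this is not the actual cxgb4 patch.

	#include <linux/kernel.h>
	#include <linux/notifier.h>

	/* Runs in panic context: no sleeping, no allocations, keep it
	 * short.  Reads adapter register/FW state into a buffer that
	 * was preallocated at probe time, so kdump picks it up along
	 * with the rest of memory. */
	static void mydrv_collect_fw_dump(void)
	{
		/* ... device-specific register/FW reads ... */
	}

	static int mydrv_panic_notify(struct notifier_block *nb,
				      unsigned long action, void *data)
	{
		mydrv_collect_fw_dump();
		return NOTIFY_DONE;
	}

	static struct notifier_block mydrv_panic_nb = {
		.notifier_call = mydrv_panic_notify,
	};

	static int __init mydrv_init(void)
	{
		atomic_notifier_chain_register(&panic_notifier_list,
					       &mydrv_panic_nb);
		return 0;
	}

The window Florian mentions is the gap between the notifier running
and the memory dump itself being taken.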

Perhaps this is what you mean with the timing window - but with random
corruptions, by the time the kernel hits the corrupted memory, the
40/100Gb adapter has likely forgotten all about those DMAs..  And
IOMMUs are pretty good at catching corruptions on big iron CPUs (i.e.
it's easy to catch them in testing, even if the production environment
runs iommu=pt).  At least that's my gut feeling/experience ;)
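
As a sketch of the checking a translating IOMMU gives you, and what
iommu=pt gives up: once a buffer is unmapped, any late or stray device
write to it faults in the IOMMU and gets logged, instead of silently
corrupting memory.  mydrv_rx_one() below is a made-up example, not
taken from any real driver.

	#include <linux/dma-mapping.h>
	#include <linux/errno.h>

	static int mydrv_rx_one(struct device *dev, void *buf,
				size_t len)
	{
		dma_addr_t dma = dma_map_single(dev, buf, len,
						DMA_FROM_DEVICE);

		if (dma_mapping_error(dev, dma))
			return -ENOMEM;

		/* ... post 'dma' to the NIC, wait for completion ... */

		dma_unmap_single(dev, dma, len, DMA_FROM_DEVICE);
		/* With the IOMMU translating, a device write to 'dma'
		 * after this unmap faults and is logged; with iommu=pt
		 * the same stray write silently corrupts 'buf'. */
		return 0;
	}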
