Message-ID: <92468e50-409f-c54f-8bf8-87587061d98e@rempel-privat.de>
Date:   Wed, 7 Jun 2017 09:07:33 +0200
From:   Oleksij Rempel <linux@...pel-privat.de>
To:     Tobias Diedrich <ranma+ath9k_htc_fw@...edrich.de>,
        Nathan Royce <nroycea+kernel@...il.com>,
        QCA ath9k Development <ath9k-devel@....qualcomm.com>,
        Kalle Valo <kvalo@...eaurora.org>,
        linux-wireless@...r.kernel.org, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        ath9k_htc_fw <ath9k_htc_fw@...ts.infradead.org>
Subject: Re: ath9k_htc - Division by zero in kernel (as well as firmware
 panic)

On 07.06.2017 at 02:12, Tobias Diedrich wrote:
> Oleksij Rempel wrote:
>> Yes, this is a "normal" problem. The firmware has no error handler for PCI
>> bus related exceptions. So if we fail to read the PCI bus the first time, we
>> have the choice to Oops and stall or Oops and reboot ASAP. So we reboot
>> and provide a kernel "firmware panic!" message.
>> Anyone who can or wants to fix this is welcome.
>>
>>> *****
>>> Jun 02 14:55:30 computer kernel: usb 1-1.1: ath: firmware panic!
>>> exccause: 0x0000000d; pc: 0x0090ae81; badvaddr: 0x10ff4038.
> [...]
> 
>> memdmp 50ae78 50ae88
> 
> 50ae78: 6c10 0412 6aa2 0c02 0088 20c0 2008 1940  l...j..........@
> 
> [...copy to bin...]
> $ bin/objdump -b binary -m xtensa  -D /tmp/memdump.bin 
> [..]
>    0:   6c1004          entry   a1, 32
>    3:   126aa2          l32r    a2, 0xfffdaa8c
>    6:   0c0200          memw
>    9:   8820            l32i.n  a8, a2, 0      <----------Exception cause PC still points at load
>    b:   c020            movi.n  a2, 0
>    d:   081940          extui   a9, a8, 1, 1
> 
> Judging from that, it should be fairly simple to at least implement
> some sort of retry, possibly after triggering a PCIe link retrain?

I assume, yes.
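
To make the retry idea concrete, here is a rough model of the control flow. It
is only a sketch: the real firmware is C running on the Xtensa core, and
nothing like this exists in the tree today. PCIE_APP (0x40000) and INIT_RST
(bit 3) are the values quoted further down in this mail; BusError,
read_with_retrain, read32 and write32 are hypothetical names made up for the
illustration. It also assumes that reading PCIE_APP itself never faults and
that no extra wait for link-up is needed, neither of which has been checked
against hardware.

```python
PCIE_APP = 0x40000   # root complex register address quoted in this thread
INIT_RST = 1 << 3    # bit 3: "Application request to initiate a training reset"


class BusError(Exception):
    """Hypothetical stand-in for the load exception the firmware panics on."""


def read_with_retrain(read32, write32, addr, retries=3):
    """Read addr; on a bus error request a link retrain and try again.

    read32/write32 are caller-supplied register accessors (hypothetical).
    After `retries` failed retries the error is re-raised, so the caller
    can still fall back to the existing panic-and-reboot path.
    """
    for attempt in range(retries + 1):
        try:
            return read32(addr)
        except BusError:
            if attempt == retries:
                raise
            # Assumption: setting INIT_RST is enough to start a retrain.
            write32(PCIE_APP, read32(PCIE_APP) | INIT_RST)
```

Whether a bounded retry is acceptable from inside the exception path is a
separate question; the sketch only shows that the fallback to "firmware
panic!" can stay in place when the retries run out.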

> There are some related PCIe root complex registers that may point to
> what exactly failed if they were dumped.
> 
> The root complex registers live at 0x00040000 and I think match the
> registers described for the root complex in the AR9344 datasheet.

Unfortunately I don't have the AR7010 docs to tell.

> PCIE_INT_MASK would map to 0x40050 and has a bit for SYS_ERR:
> "A system error. The RC Core asserts CFG_SYS_ERR_RC if any device in
> the hierarchy reports any of the following errors and the associated
> enable bit is set in the Root Control register: ERR_COR, ERR_FATAL,
> ERR_NONFATAL."
> 
> AFAICS link retrain can be done by setting bit3 (INIT_RST,
> "Application request to initiate a training reset") in
> PCIE_APP (0x40000).
> 
> See sboot/magpie_1_1/sboot/cmnos/eeprom/src/cmnos_eeprom.c (which
> flips some bits in the RC to enable the PCIe bus for reading the
> EEPROM).
> 
> The root complex pci configuration space is at 0x20000 which could
> have further error details:
>> memdmp 20000 20200
> 
> 020000: a02a 168c 0010 0006 0000 0001 0001 0000  .*..............
> 020010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 020020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 020030: 0000 0000 0000 0040 0000 0000 0000 01ff  .......@........
> 020040: 5bc3 5001 0000 0000 0000 0000 0000 0000  [.P.............
> 020050: 0080 7005 0000 0000 0000 0000 0000 0000  ..p.............
> 020060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 020070: 0042 0010 0000 8701 0000 2010 0013 4411  .B............D.
> 020080: 3011 0000 0000 0000 00c0 03c0 0000 0000  0...............
> 020090: 0000 0000 0000 0010 0000 0000 0000 0000  ................
> 0200a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0200b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0200c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0200d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0200e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0200f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 020100: 1401 0001 0000 0000 0000 0000 0006 2030  ...............0
> 020110: 0000 0000 0000 2000 0000 00a0 0000 0000  ................
> 020120: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 020130: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 020140: 0001 0002 0000 0000 0000 0000 0000 0000  ................
> 020150: 0000 0000 8000 00ff 0000 0000 0000 0000  ................
> 020160: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 020170: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 020180: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 020190: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0201a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0201b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0201c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0201d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0201e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0201f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 
> Transformed into something suitable for feeding into lspci -F:
> 
> 00:00.0 Description filled in by lspci
> 00: 8c 16 2a a0 06 00 10 00 01 00 00 00 00 00 01 00
> 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 30: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 00 00
> 40: 01 50 c3 5b 00 00 00 00 00 00 00 00 00 00 00 00
> 50: 05 70 80 00 00 00 00 00 00 00 00 00 00 00 00 00
> 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 70: 10 00 42 00 01 87 00 00 10 20 00 00 11 44 13 00
> 80: 00 00 11 30 00 00 00 00 c0 03 c0 00 00 00 00 00
> 90: 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 
> $ lspci -F /tmp/hexdump -vvv
> 00:00.0 Non-VGA unclassified device: Qualcomm Atheros Device a02a (rev 01)
>         !!! Invalid class 0000 for header type 01
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0
>         Interrupt: pin A routed to IRQ 255
>         Bus: primary=00, secondary=00, subordinate=00, sec-latency=0
>         I/O behind bridge: 00000000-00000fff
>         Memory behind bridge: 00000000-000fffff
>         Prefetchable memory behind bridge: 00000000-000fffff
>         Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
>         BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
>                 PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>         Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
>                 DevCap: MaxPayload 256 bytes, PhantFunc 0
>                         ExtTag- RBE+
>                 DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
>                         MaxPayload 128 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
>                 LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <1us, L1 <64us
>                         ClockPM- Surprise- LLActRep+ BwNot- ASPMOptComp-
>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
>                 RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
>                 RootCap: CRSVisible-
>                 RootSta: PME ReqID 0000, PMEStatus- PMEPending-
>                 DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
>                 LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>                          Compliance De-emphasis: -6dB
>                 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> 

Looks promising :)
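
For anyone who wants to repeat the conversion step, a small sketch of how the
memdmp output can be turned into the lspci -F input format. It assumes, based
only on comparing the two dumps quoted above, that memdmp prints each 32-bit
word big-endian as two 16-bit groups, while lspci wants the bytes in
little-endian (address) order. The function name memdmp_to_lspci and its
parameters are made up for this example, and like the quoted dump it keeps
only the first 256 bytes by default.

```python
import re

# One dump row: optional "> " quoting, hex address, then eight 16-bit groups.
ROW = re.compile(r"^[>\s]*([0-9a-fA-F]+):((?:\s+[0-9a-fA-F]{4}){8})")


def memdmp_to_lspci(dump, base=0x20000, length=0x100, device="00:00.0"):
    """Convert memdmp rows starting at `base` into lspci -F hex-dump text."""
    out = [f"{device} Description filled in by lspci"]
    for line in dump.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # not a dump row (prompt, blank line, ...)
        offset = int(m.group(1), 16) - base
        if not 0 <= offset < length:
            continue  # outside the requested config-space window
        groups = m.group(2).split()
        data = []
        for i in range(0, 8, 2):
            # Two 16-bit groups form one big-endian 32-bit word (assumption).
            word = bytes.fromhex(groups[i] + groups[i + 1])
            # Emit the word's bytes lowest address first, as lspci expects.
            data.extend(f"{b:02x}" for b in reversed(word))
        out.append(f"{offset:02x}: " + " ".join(data))
    return "\n".join(out) + "\n"
```

Writing the result to a file and running `lspci -F <file> -vvv` on it
corresponds to the step shown in the quoted mail.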

>>> Jun 02 14:55:30 computer kernel: usb 1-1.1: ath9k_htc: Transferred FW:
>>> ath9k_htc/htc_7010-1.4.0.fw, size: 72812
> 
> $ ls -l /lib/firmware/ath9k_htc/htc_7010-1.4.0.fw
> -rw-r--r-- 1 root root 72812 Dec 14 04:59 /lib/firmware/ath9k_htc/htc_7010-1.4.0.fw
> $ sha1sum /lib/firmware/ath9k_htc/htc_7010-1.4.0.fw
> 959cb6550930de2882e12b9a549c3cf0c9bf51ac /lib/firmware/ath9k_htc/htc_7010-1.4.0.fw

-- 
Regards,
Oleksij


