Message-ID: <20211227080350.GA469126@ogabbay-vm-u20.habana-labs.com>
Date:   Mon, 27 Dec 2021 10:03:50 +0200
From:   Oded Gabbay <ogabbay@...nel.org>
To:     gregkh@...uxfoundation.org
Cc:     linux-kernel@...r.kernel.org
Subject: [git pull] habanalabs pull request for kernel 5.17

Hi Greg,

This is the habanalabs pull request for the merge window of kernel 5.17.
It mainly enhances the driver to deal with extreme cases, such as
reset-during-reset and events during reset, and allows monitoring
applications to continue running during reset.

Full details are in the tag.

Thanks,
Oded

The following changes since commit 1bb866dcb8cf5054de88f592fc0ec1f275ad9d63:

  Merge tag 'iio-for-5.17a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next (2021-12-22 12:33:01 +0100)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux.git misc-habanalabs-next-2021-12-27

for you to fetch changes up to ce80098db2439ee44403ec6fccd3a10be21c7aff:

  habanalabs: support hard-reset scheduling during soft-reset (2021-12-26 14:42:31 +0200)

----------------------------------------------------------------
This tag contains habanalabs driver changes for v5.17:

- Support reset-during-reset. If the f/w notifies the driver
  that the f/w is going to reset the device, the driver should
  support that even if it is in the middle of doing another
  reset.

- Support events from f/w that arrive during device resets.
  Until now these events were ignored, which is bad because
  critical errors were not reported or handled by the driver.

- Don't kill processes that hold the control device open during
  hard-reset of the device. Control device operations can't
  cause a crash if done during hard-reset. And usually, only
  monitoring applications use the control device, so killing
  them defeats their purpose.

- Fix handling of hwmon nodes when working with legacy f/w

- Change the compute context pointer to be boolean. This pointer
  was abused by multiple code paths that wanted fast access to
  the compute context structure.

- Add uapi to fetch historical errors. This is necessary because
  errors sometimes result in a hard-reset, during which the user
  application is terminated.

- Optimize GAUDI's MMU cache invalidation.

- Add support for loading the latest f/w.

- Add uapi to fetch HBM replacement and pending rows information.

- Multiple bug fixes to the reset code.

- Multiple bug fixes for Multi-CS ioctl code.

- Multiple bug fixes for wait-for-interrupt ioctl code.

- Many small bug fixes and cleanups.
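
As an illustration of the reset-during-reset item above, here is a minimal
userspace sketch of how a hard-reset request that arrives during a soft-reset
could be recorded under a lock and then run once the soft-reset finishes. It is
not taken from the driver: every name in it (reset_state, reset_request,
reset_done, the RESET_* values) is hypothetical, and a pthread mutex stands in
for whatever locking the driver actually uses.

```c
#include <pthread.h>
#include <stdbool.h>

/* All names below are hypothetical; this is not the habanalabs code. */
enum reset_kind { RESET_NONE, RESET_SOFT, RESET_HARD };

struct reset_state {
	pthread_mutex_t lock;        /* protects the two fields below */
	enum reset_kind in_progress; /* reset currently being performed */
	bool hard_reset_pending;     /* hard-reset asked for during a soft-reset */
};

void reset_state_init(struct reset_state *rs)
{
	pthread_mutex_init(&rs->lock, NULL);
	rs->in_progress = RESET_NONE;
	rs->hard_reset_pending = false;
}

/*
 * Ask for a reset. Returns true if the caller should start the reset now,
 * false if another reset is already running (a hard-reset request made
 * during a soft-reset is remembered for later) or the request is invalid.
 */
bool reset_request(struct reset_state *rs, enum reset_kind kind)
{
	bool start = false;

	if (kind == RESET_NONE)
		return false;

	pthread_mutex_lock(&rs->lock);
	if (rs->in_progress == RESET_NONE) {
		rs->in_progress = kind;
		start = true;
	} else if (kind == RESET_HARD && rs->in_progress == RESET_SOFT) {
		rs->hard_reset_pending = true;
	}
	/* Otherwise an equal or stronger reset is already running. */
	pthread_mutex_unlock(&rs->lock);

	return start;
}

/*
 * Mark the current reset as finished. Returns the reset the caller must
 * run next: RESET_HARD if one was scheduled meanwhile, else RESET_NONE.
 */
enum reset_kind reset_done(struct reset_state *rs)
{
	enum reset_kind next = RESET_NONE;

	pthread_mutex_lock(&rs->lock);
	if (rs->hard_reset_pending) {
		rs->hard_reset_pending = false;
		rs->in_progress = RESET_HARD;
		next = RESET_HARD;
	} else {
		rs->in_progress = RESET_NONE;
	}
	pthread_mutex_unlock(&rs->lock);

	return next;
}
```

In this sketch, a caller that gets true from reset_request() performs the
reset and then calls reset_done(); if that returns RESET_HARD, it goes on to
perform the scheduled hard-reset and calls reset_done() again.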

----------------------------------------------------------------
Bharat Jauhari (3):
      habanalabs: handle abort scenario for user interrupt
      habanalabs: rename reset flags
      habanalabs: refactor wait-for-user-interrupt function

Dani Liberman (6):
      habanalabs: change wait for interrupt timeout to 64 bit
      habanalabs: add support for fetching historic errors
      habanalabs: fix race condition in multi CS completion
      habanalabs: add SOB information to signal submission uAPI
      habanalabs: enable access to info ioctl during hard reset
      habanalabs: keep control device alive during hard reset

Guy Zadicario (1):
      habanalabs/gaudi: fix debugfs dma channel selection

Oded Gabbay (16):
      habanalabs/gaudi: recover from CPU WD event
      habanalabs: make hdev creation code more readable
      habanalabs: prevent false heartbeat message
      habanalabs: abort reset on invalid request
      habanalabs: fix soft reset accounting
      habanalabs: rename late init after reset function
      habanalabs/gaudi: return EPERM on non hard-reset
      habanalabs: free signal handle on failure
      habanalabs: remove redundant check on ctx_fini
      habanalabs: save ctx inside encaps signal
      habanalabs: fix etr asid configuration
      habanalabs: add helper to get compute context
      habanalabs: remove compute context pointer
      habanalabs: remove in_debug check in device open
      habanalabs: fix hwmon handling for legacy f/w
      habanalabs: replace some -ENOTTY with -EINVAL

Ofir Bitton (18):
      habanalabs: expand clock throttling information uAPI
      habanalabs: debugfs support for larger I2C transactions
      habanalabs: handle device TPM boot error as warning
      habanalabs: fix possible deadlock in cache invl failure
      habanalabs: move device boot warnings to the correct location
      habanalabs: add more info ioctls support during reset
      habanalabs: change misleading IRQ warning during reset
      habanalabs: handle events during soft-reset
      habanalabs: return correct clock throttling period
      habanalabs: add current PI value to cpu packets
      habanalabs: sysfs support for two infineon versions
      habanalabs: expose soft reset sysfs nodes for inference ASIC
      habanalabs: modify cpu boot status error print
      habanalabs: fix endianness when reading cpld version
      habanalabs: fix comments according to kernel-doc
      habanalabs: refactor reset information variables
      habanalabs: add a lock to protect multiple reset variables
      habanalabs: support hard-reset scheduling during soft-reset

Ohad Sharabi (11):
      habanalabs: modify wait for boot fit in dynamic FW load
      habanalabs: revise and document use of boot status flags
      habanalabs: adding indication of boot fit loaded
      habanalabs: use variable poll interval for fw loading
      habanalabs: don't clear previous f/w indications
      habanalabs: skip PLL freq fetch
      habanalabs: skip read fw errors if dynamic descriptor invalid
      habanalabs: wait again for multi-CS if no CS completed
      habanalabs: clean MMU headers definitions
      habanalabs: prevent wait if CS in multi-CS list completed
      habanalabs: handle skip multi-CS if handling not done

Rajaravi Krishna Katta (2):
      habanalabs: add dedicated message towards f/w to set power
      habanalabs: Move frequency change thread to goya_late_init

Tomer Tayar (5):
      habanalabs: align debugfs documentation to alphabetical order
      habanalabs: add power information type to POWER_GET packet
      habanalabs: pass reset flags to reset thread
      habanalabs: add missing kernel-doc comments for hl_device fields
      habanalabs: add CPU-CP packet for engine core ASID cfg

Yuri Nudelman (5):
      habanalabs: print va_range in vm node debugfs
      habanalabs: wrong VA size calculation
      habanalabs: make last_mask an MMU property
      habanalabs: add enum mmu_op_flags
      habanalabs: partly skip cache flush when in PMMU map flow

farah kassabri (3):
      habanalabs/gaudi: Fix collective wait bug
      habanalabs: add new opcodes for INFO IOCTL
      habanalabs: change wait_for_interrupt implementation

 .../ABI/testing/debugfs-driver-habanalabs          |  23 +-
 drivers/misc/habanalabs/common/command_buffer.c    |  46 ++-
 .../misc/habanalabs/common/command_submission.c    | 389 +++++++++++++++------
 drivers/misc/habanalabs/common/context.c           |  39 ++-
 drivers/misc/habanalabs/common/debugfs.c           |  97 +++--
 drivers/misc/habanalabs/common/device.c            | 387 ++++++++++----------
 drivers/misc/habanalabs/common/firmware_if.c       | 253 ++++++++++----
 drivers/misc/habanalabs/common/habanalabs.h        | 301 +++++++++++-----
 drivers/misc/habanalabs/common/habanalabs_drv.c    | 150 ++++----
 drivers/misc/habanalabs/common/habanalabs_ioctl.c  | 195 +++++++++--
 drivers/misc/habanalabs/common/hw_queue.c          |   5 +-
 drivers/misc/habanalabs/common/hwmon.c             | 209 +++++++++--
 drivers/misc/habanalabs/common/irq.c               |  14 +-
 drivers/misc/habanalabs/common/memory.c            |  78 +++--
 drivers/misc/habanalabs/common/mmu/mmu.c           |  25 ++
 drivers/misc/habanalabs/common/mmu/mmu_v1.c        |  18 +-
 drivers/misc/habanalabs/common/sysfs.c             |  56 ++-
 drivers/misc/habanalabs/gaudi/gaudi.c              | 313 ++++++++++++-----
 drivers/misc/habanalabs/gaudi/gaudiP.h             |   4 +-
 drivers/misc/habanalabs/gaudi/gaudi_coresight.c    |   4 +-
 drivers/misc/habanalabs/goya/goya.c                | 165 +++++++--
 drivers/misc/habanalabs/goya/goyaP.h               |  14 +-
 drivers/misc/habanalabs/goya/goya_coresight.c      |   4 +-
 drivers/misc/habanalabs/goya/goya_hwmgr.c          |  31 +-
 drivers/misc/habanalabs/include/common/cpucp_if.h  |  62 +++-
 .../misc/habanalabs/include/common/hl_boot_if.h    |   4 +
 .../habanalabs/include/hw_ip/mmu/mmu_general.h     |  19 +-
 .../misc/habanalabs/include/hw_ip/mmu/mmu_v1_0.h   |  18 +-
 .../misc/habanalabs/include/hw_ip/mmu/mmu_v1_1.h   |  20 +-
 include/uapi/misc/habanalabs.h                     | 166 +++++++--
 30 files changed, 2185 insertions(+), 924 deletions(-)
