Date: Fri, 25 Mar 2022 14:02:37 +0000
From: Duke Abbaddon <duke.abbaddon@...il.com>
To: mobile@...udflare.com
Subject: New GPU/CPU & Motherboard BIOS strategy for ASUS unique RX6700XTC-FlareEdition2021

https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.18-x86-Platform-Drivers

New GPU/CPU & Motherboard BIOS strategy for ASUS unique RX6700XTC-FlareEdition2021

Important Business : RS
Date: Sun, Jan 3, 2021 at 11:12 AM
To: Kr*****, L** <l**.kr****@....com>
To: <Med**@...inx.com>

FPGA BitFile & Code Optimisation (c)RS 2021-01

Priority of operating process for streamlining dynamic FPGA units on CPU & GPU
By Rupert S

Factors common in FPGAs are:

100,000 to 750,000 gates (ideal for complex tasks)
Programmable processor command implementation & reprogram speed: 3 ns to 15 seconds

2 million gates
Processor core usage to reprogram: ~15% of a 200 MHz processor = 200 ns programming time
Processor core usage to reprogram: 20% to 25% of a 200 MHz processor = 30 ns programming time

250 to 2,900 gates: 1 µs to 2 ns (ideal for small complex instructions)
Processor usage (while programming): 2% to 5% of a CPU @ 200 MHz

2,000 to 12,500 to 25,000 gates (ideal for very complex functions): 30 µs to 8 ns (ideal for small complex instructions & RISC)
Processor usage (while programming): 2% to 9% of a CPU @ 200 MHz

Plans to load a bitfile rely on constant use & not on-the-fly loading; however, small gate arrays permit microsecond coding.
I do state that a parameter for operating order is specified &, for most users, automatic.

Operating system functions, for example AUDIO, are a priority & will stay consistent, so we will have specific common instructions that are specific to OS & BIOS firmware.

Commons will take 20% of a large FPGA (relative), with the aim of having at least 4 common & hard-to-match functions as a core large array. The aim is not to reprogram every second, for example during the boot process, with:

Bitfile preorder profile: 1 µs to 2 ns (ideal for small complex instructions)

During the operation of the computer or array, the FPGA may contain specific antivirus & firewall functions that we map to ML.

The small unit groups of fast reprogrammables will be ideal for applications that we use for more than 30 minutes, & may be clustered (see the placement sketch further below).

Optimus (Prime) bitfile : RS

Obviously handheld devices require a uniquely optimal feature set & a tiny processor size; create the boundary and push that limit.

We will obviously prefer to enable hard-coded pre-trained models such as:

SiMD tessellation & maths objective: for gaming & science
Dynamic DMA clusters (OS, security, root)
Maths unit
Hard-drive accelerators
Compressors
Compiler optimisers
CPU/GPU core prefetch/ML optimiser (on die)
Combined shader & function for DirectX, Metal & Vulkan utility
GPU & CPU synergy network & cache
Direct audio & video, haptic processing dynamic; element 3D extrapolation
Dynamic metadata processing & conversion (very important, because not all metadata is understood directly in the used process)

(c)Rupert S

https://science.n-helix.com

"processor programs a reprogrammable execution unit with the bitfile so that the reprogrammable execution unit is capable of executing specialized instructions associated with the program."
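Below is a minimal C sketch of the bitfile placement policy described above: OS/BIOS commons stay pinned, small bitfiles go to the fast-reprogrammable clusters, and large bitfiles are only worth programming for sustained use. The region classes, gate thresholds, the reuse of the 30-minute figure as a "constant use" cutoff, and every identifier are illustrative assumptions, not a real FPGA driver API.

/* Hypothetical sketch of the bitfile placement policy outlined above.
 * Region classes, thresholds and names are illustrative assumptions,
 * not a real driver API. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef enum {
    REGION_NONE,         /* not worth programming; run on the host CPU instead */
    REGION_COMMONS,      /* ~20% of the large array: OS/BIOS functions, pinned */
    REGION_SMALL_FAST,   /* 250-2,900 gates: ns/us-scale reprogram, small instructions */
    REGION_MEDIUM,       /* 2,000-25,000 gates: very complex functions, RISC helpers */
    REGION_LARGE_ARRAY   /* 100,000-750,000 gates: complex, long-lived tasks */
} region_class;

typedef struct {
    uint32_t gates_needed;      /* gate count the bitfile requires */
    uint32_t expected_use_s;    /* how long the workload is expected to run */
    bool     os_or_bios_common; /* AUDIO, security, firmware-level function? */
} bitfile_request;

/* Follow the priority order in the mail: commons are pinned; small bitfiles
 * may be swapped on the fly; large bitfiles rely on constant use (the mail's
 * 30-minute figure is reused here as the sustained-use threshold). */
static region_class place_bitfile(const bitfile_request *req)
{
    if (req->os_or_bios_common)
        return REGION_COMMONS;
    if (req->gates_needed <= 2900)
        return REGION_SMALL_FAST;
    if (req->gates_needed <= 25000)
        return REGION_MEDIUM;
    return req->expected_use_s >= 30 * 60 ? REGION_LARGE_ARRAY : REGION_NONE;
}

int main(void)
{
    bitfile_request audio = { 1800, 0, true };        /* OS audio path: pinned common */
    bitfile_request game  = { 600000, 7200, false };  /* SiMD tessellation for gaming */
    bitfile_request av    = { 2500, 1800, false };    /* antivirus/firewall ML helper */

    printf("audio -> %d, game -> %d, av -> %d\n",
           place_bitfile(&audio), place_bitfile(&game), place_bitfile(&av));
    return 0;
}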
https://hothardware.com/news/amd-patent-hybrid-cpu-fpga-design-xilinx

"AMD Patent Reveals Hybrid CPU-FPGA Design That Could Be Enabled By Xilinx Tech

While they often aren't as great as CPUs on their own, FPGAs can do a wonderful job accelerating specific tasks. Whether it's acting as a fabric for wide-scale datacenter services or boosting AI performance, an FPGA in the hands of a capable engineer can offload a wide variety of tasks from a CPU and speed processes along. Intel has talked a big game about integrating Xeons with FPGAs over the last six years, but it hasn't resulted in a single product hitting its lineup. A new patent by AMD, though, could mean that the FPGA newcomer might be ready to make one of its own.

In October, AMD announced plans to acquire Xilinx as part of a big push into the datacenter. On Thursday, the United States Patent and Trademark Office (USPTO) published an AMD patent for integrating programmable execution units with a CPU. AMD made 20 claims in its patent application, but the gist is that a processor can include one or more execution units that can be programmed to handle different types of custom instruction sets. That's exactly what an FPGA does. It might be a little while until we see products based on this design, as it seems a little too soon to be part of the CPUs included in recent EPYC leaks.

While AMD has made waves with its chiplet designs for Zen 2 and Zen 3 processors, that doesn't seem to be what's happening here. The programmable unit in AMD's FPGA patent actually shares registers with the processor's floating-point and integer execution units, which would be difficult, or at least very slow, if they were not on the same package. This kind of integration should make it easy for developers to weave these custom instructions into applications, and the CPU would just know to pass those on to the on-processor FPGA. Those programmable units can handle atypical data types, specifically FP16 (or half-precision) values used to speed up AI training and inference.

In the case of multiple programmable units, each unit could be programmed with a different set of specialized instructions, so the processor could accelerate multiple instruction sets, and these programmable EUs can be reprogrammed on the fly. The idea is that when a processor loads a program, it also loads a bitfile that configures the programmable execution unit to speed up certain tasks. The CPU's own decode and dispatch unit could address the programmable unit, passing those custom instructions to it to be processed.

AMD has been working on different ways to speed up AI calculations for years. First the company announced and released the Radeon Instinct series of AI accelerators, which were just big headless Radeon graphics processors with custom drivers. The company doubled down on that with the release of the MI60, its first 7-nm GPU, ahead of the Radeon RX 5000 series launch in 2018. A shift to focusing on AI via FPGAs after the Xilinx acquisition makes sense, and we're excited to see what the company comes up with."
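As a rough illustration of the dispatch flow the article describes (loading a program also loads a bitfile, and the decode/dispatch unit routes the resulting custom opcodes to the programmable execution unit), here is a toy C model. The opcode values, structure names, and the idea of "registering" opcodes are hypothetical; the patent does not expose any such software API.

/* Toy model of the dispatch path the patent describes: on program load a
 * bitfile configures a programmable execution unit (PEU) with extra opcodes;
 * decode/dispatch then routes those opcodes to the PEU instead of the fixed
 * integer/FP pipelines. All structures and opcode values are hypothetical. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_PEU_OPS 16

typedef struct {
    uint16_t opcodes[MAX_PEU_OPS]; /* custom opcodes the loaded bitfile implements */
    int      n_opcodes;
    const char *bitfile_name;      /* e.g. an FP16 accelerator bitfile */
} programmable_eu;

/* "Loading" a bitfile here just records which opcodes the PEU now accepts. */
static void load_bitfile(programmable_eu *peu, const char *name,
                         const uint16_t *ops, int n)
{
    peu->bitfile_name = name;
    peu->n_opcodes = n;
    for (int i = 0; i < n; i++)
        peu->opcodes[i] = ops[i];
}

static bool peu_accepts(const programmable_eu *peu, uint16_t opcode)
{
    for (int i = 0; i < peu->n_opcodes; i++)
        if (peu->opcodes[i] == opcode)
            return true;
    return false;
}

/* Dispatch: custom opcodes go to the PEU, everything else to the fixed units. */
static void dispatch(const programmable_eu *peu, uint16_t opcode)
{
    if (peu_accepts(peu, opcode))
        printf("opcode 0x%04x -> PEU (%s)\n", opcode, peu->bitfile_name);
    else
        printf("opcode 0x%04x -> fixed integer/FP pipeline\n", opcode);
}

int main(void)
{
    programmable_eu peu = {0};
    const uint16_t fp16_ops[] = { 0x0F10, 0x0F11 };   /* hypothetical FP16 MAC ops */

    load_bitfile(&peu, "fp16-inference.bit", fp16_ops, 2);
    dispatch(&peu, 0x0F10);   /* accelerated by the reprogrammable unit */
    dispatch(&peu, 0x0001);   /* ordinary instruction, normal pipeline */
    return 0;
}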
*****

https://science.n-helix.com/2018/12/rng.html
https://science.n-helix.com/2022/02/rdseed.html
https://science.n-helix.com/2017/04/rng-and-random-web.html
https://science.n-helix.com/2022/02/interrupt-entropy.html
https://science.n-helix.com/2021/11/monticarlo-workload-selector.html
https://science.n-helix.com/2022/03/security-aspect-leaf-hash-identifiers.html

Audio, Visual & Bluetooth & Headset & mobile developments only go so far:

https://science.n-helix.com/2022/02/visual-acuity-of-eye-replacements.html
https://science.n-helix.com/2022/03/ice-ssrtp.html
https://science.n-helix.com/2021/11/ihmtes.html
https://science.n-helix.com/2021/10/eccd-vr-3datmos-enhanced-codec.html
https://science.n-helix.com/2021/11/wave-focus-anc.html
https://science.n-helix.com/2021/12/3d-audio-plugin.html

Integral to Telecoms Security TRNG *RAND OP Ubuntu:

https://manpages.ubuntu.com/manpages/trusty/man1/pollinate.1.html
https://pollinate.n-helix.com
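For the TRNG/entropy side referenced above, here is a minimal C sketch of consuming kernel randomness on Linux once the pool has been seeded (pollinate, linked above, is one userspace seeder used on Ubuntu cloud images). It assumes a Linux system with glibc 2.25 or newer for getrandom() and is illustrative only, not part of any of the linked projects.

/* Minimal sketch of reading kernel randomness on Linux after the entropy
 * pool has been seeded (e.g. by a boot-time seeder such as pollinate).
 * Assumes glibc >= 2.25 for getrandom(). */
#include <stdio.h>
#include <sys/random.h>

int main(void)
{
    unsigned char key[32];

    /* Blocks until the kernel entropy pool is initialised, then fills key[]. */
    if (getrandom(key, sizeof key, 0) != (ssize_t)sizeof key) {
        perror("getrandom");
        return 1;
    }
    for (size_t i = 0; i < sizeof key; i++)
        printf("%02x", key[i]);
    putchar('\n');
    return 0;
}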