Artificial intelligence (AI) engines have been integrated into a myriad of applications, whether they run in data centers or on edge/end-node devices. Most contemporary AI applications rely on the cloud to execute computationally intensive and power-demanding deep learning algorithms. Moving the processing from the cloud to edge devices reduces data transfers and latency, improves security, and enables scalability. The huge computational requirements of deep neural networks (DNNs) make it necessary to achieve challenging tradeoffs among energy, latency, and accuracy at every level of an application. This project will exploit alternative number systems and fused arithmetic primitives to create DNN architectures with low latency, low energy, and high accuracy for AI implementations. Optimization at the algorithm and architecture levels will enable the proposed architectures to attain target power and performance. Moreover, optimized digital circuit primitives for functions that accelerate AI engines in FPGAs and ASICs will be developed, verified, and demonstrated in FPGA prototypes.
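The abstract does not name a specific number system or primitive; as one illustrative sketch (not the project's actual design), the snippet below uses 8-bit Q4.4 fixed-point quantization with a full-precision accumulator to show how an alternative number representation and a fused (single-rounding) multiply-accumulate can trade accuracy for cheaper arithmetic. All names and the Q4.4 format choice are assumptions for illustration.

```python
# Illustrative only: Q4.4 fixed point (4 integer bits, 4 fractional bits)
# stands in for the "alternative number systems" mentioned in the abstract.
FRAC_BITS = 4

def to_fixed(x: float) -> int:
    """Quantize a real value to a Q4.4 fixed-point integer."""
    return round(x * (1 << FRAC_BITS))

def fixed_mac(weights, activations) -> float:
    """Fused multiply-accumulate in fixed point, scaled back to float.

    Each product of two Q4.4 values carries 2*FRAC_BITS fractional bits.
    The integer accumulator keeps full precision and is rounded/scaled
    only once at the end, mimicking a fused hardware primitive that
    avoids per-operation rounding error.
    """
    acc = 0
    for w, a in zip(weights, activations):
        acc += to_fixed(w) * to_fixed(a)
    return acc / (1 << (2 * FRAC_BITS))

w = [0.5, -1.25, 0.75]
a = [1.0, 0.5, -2.0]
exact = sum(x * y for x, y in zip(w, a))
approx = fixed_mac(w, a)
```

Because every value above is exactly representable in Q4.4, `approx` matches `exact`; with arbitrary weights the gap would be bounded by the quantization step, which is the accuracy-versus-cost tradeoff the project targets.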