Memory- and System-Aware Architectures for Real-Time Machine Learning