Abstract: The Transformer and its large variants are widely used in Natural Language Processing and Computer Vision, but they incur high computational cost and latency in the inference phase, which become the ...