On The Statistical Complexity Of Offline Policy Evaluation For Tabular Reinforcement Learning