Learning Emergent Gaits with Decentralized Phase Oscillators: On the Role of Observations, Rewards, and Feedback