Internal model from observations for reward shaping. Daiki Kimura, Subhajit Chaudhury, et al. (2018). ALA 2018. Conference paper.