Meta Reinforcement Learning through Memory