The topic of this thesis is stochastic optimal control and reinforcement learning. Our aim is to unify the theory and language used in the two fields. The thesis presents both frameworks and discusses their similarities and differences, as well as how the reinforcement learning framework can be extended to include elements from the Hamilton-Jacobi-Bellman (HJB) equations. In the second part of the thesis, this theory is used to price exotic options in energy markets. We also use the HJB equations and the Q-learner as an update rule to study problems in portfolio optimization.