Written by
Nitin Prakash

on

on

# Dynamic Programming

Dynamic programming is a technique for solving a complex problem by breaking it down into a collection of simpler subproblems. Smaller subproblems are solved only once and stored in a data structure like array, map etc. The fundamental idea being to optimize a given recursive problem and improve upon the complexity.

Stating the same as an RL paradigm:

```
Dymanic programming refers to solving a given problem using a collection
of algorithms that computes optimal policies given a perfect model of the
environment as a finite Markov decision process.
```

From the above reference, it should be clear that when we talk about DP we are just formulating a structure for the search of an optimal policy. In any paradigm of solving problems, DP requires two conditions to be satisfied:

- Optimal substructure
- Ovarlapping subproblems

Fortunately, MDPs satisfy both the properties.