Dynamic Programming amazing tricks! — Part II — DP patterns

Mariano Crosetti
10 min readOct 27, 2022

Common DP patterns so that you gain the intuition to solve Dynamic Programming Problems!


This series of guides will be helpful for you if you are:

  • preparing coding interviews.
  • training for competitive programming.
  • just want to learn how to come up with ideas to solve dynamic programming problems.

I will assume that you read the last parts.

My goal here is to explain some “non-trivial” DP patterns and to contribute to develop a “dynamic programming intuition”. I also recommend reading about classic DP problems. But I will not write about them as there is enough good material on the internet with that perspective.

Again, remember you will only truly understand it once you practice the ideas. For this Part II, all the topics will have a related problem, available to be submitted to an online judge (check homework after each section). It’s important you code a solution to really understand the idea. Also, it’s important you submit the code and get accepted by the judge. This will teach you to implement bugless solutions.

DP on ranges

Motivational problem: Given an even length string X formed by the characters ‘{’, ‘}’, ‘[’, ‘]’, ‘(’ and ‘)’, calculate the minimal number of “character changes” that must be applied to obtain a “good parenthesized” string. A “character change” consists of overwriting a character from the string for another one. A string T that is “good parenthesized” is inductively defined as follows:

The inductive definition of a “good parenthesized” string T, assuming S, S₁, and S₂ are “good parenthesized”.

Observe that the inductive definition suggests a recursive implementation of transforming the input string X to a good “parenthesized string” T:

  • T is empty will be the base case, which is only possible when X is empty.
  • T is {S}, [S] or (S) will imply matching the first and last characters of X (possibly by changing one or both of them) and then solving the problem for the rest of the characters (without the first and last).
  • T is S₁S₂ then the cost of transforming X to T is transforming certain X₁ to S₁ and X₂ to S₂ (X = X₁X₂ with |X₁| = |S₁| and |X₂| = |S₂|). This is clear because the “character changes” will not affect the length of the string. So we can try every possible X₁X₂ split of the input X, do a recursive call and take the minimum of those costs.

Our implementation represents every substring as a pair of indices a, b representing the string X[a,b) (from character X[a] to X[b-1]):

Usually working with [a,b) (closed-open interval convention) lead to tidier implementations.

Notice that every time, we are calling the recursive function in a substring of the input X. And we only have O(N²) substrings! From Part I, we know that if we have a recursive function we can pass it to a “Top-Down” DP implementation by memorization of the states and obtaining the complexity:

This gives us a resultant complexity of O(N³).

Also, as we‘ve seen in Part I, we can translate our “Top-Down” implementation to “Bottom-Up” if we do a calculation in a group of nested for loops that visit all the states respecting the dependency order of the calculations:

diff = b — a is the length of the interval we are calculating.

In this case, every state call recursively other states that are always shorter (in length). To ensure that we already have the shorter results already calculated, we iterate in the outermost loop through the length of the state. In that way, we calculate the solutions for every subrange in increasing order of length.

Homework: https://www.spoj.com/problems/MIXTURES/

Bitmask DP

Motivational problem: Given a list of 2N students (N ≤ 8) and their locations in the 2D plane, calculate the min cost of student pairing. The cost of a pairing is the sum of the costs of each pair. The cost of a pair (s₁,s₂) is the euclidean distance between s₁ and s₂.

We can calculate the solution for every subset S of the 2N students:

S ⊆ {1,2 .. 2N}.

The number of states of this recursion is small: 2¹⁶ = 65536 (see the cardinality of the set of subsets). We can apply memorization!

The implementation trick is to use an integer msk to represent a subset S: the element i belongs to S (i ∈ S) iff the i-th bit of msk is 1 (read here more about the bijection between binary numbers and elements of the power set):

We use (mks>>i)&1 to check if the i-th bit is on. Also, we use msk ^ (1<<p) to flip (turn off) the p-th bit.

Also, notice that in our implementation we are matching the element p with i. We don’t need to do two nested iterations (that would be O(N²)). The key idea is that for any element p, eventually, we will need to match it, so we can take an arbitrary p and then iterate only over possibles i.

For turning on/off a bit, checking if a bit is on/off, or choosing a bit from a mask, it’s useful to know how C++ bit operators work:

The last two are built-in functions from GCC.

This can be trivially translated to a “Bottom-Up” implementation with a simple for loop for visiting the single variable we have in the state. Notice that for recursive calls we are always turning off bits so the recursive calls are applied to “smaller” masks (seeing them as numbers). So the “for” can be just iterating all masks in order (from 0 to 2ⁿ).

Homework: https://www.spoj.com/problems/TRSTAGE/

DP on digits

Motivational problem: Given X ≤ 10¹⁵, calculate the number of numbers Y such that 0 ≤ Y ≤ X, and Y doesn’t contain consecutive numbers in its base 10 representation.

For example, for X = 112 we should consider all the numbers in the range [0,112] except {11, 22, 33, 44, 55, 66, 77, 88, 99, 100, 110, 111, 112}.

The idea for solving this problem as a “Digit DP” problem is to think of the number Y as a string consisting of digits, and build the number Y. This technique is usually useful in many counting problems that have constraints over the digits (for some base representations).

As we said, we are going to construct a string that will be the decimal (base 10) representation of Y. For that, we will be considering all possible Y with the same length. The maximum length will be the length of X (because Y ≤ X) so we will be considering this as the length of Y, by adding leading (non-significant) zeros to the shorter numbers. For example, when X = 112 we will represent Y = 2 as 002 (notice that 002 must be considered valid, even when it repeats the 0 because they are leading zeros).

We can formulate the construction of Y as a recursive function that has the following arguments:

  • The position p that we are currently filling. We are going to complete from left (most significant digit) to right.
  • We don’t really need “all the past” (the digits already filled). By knowing the last digit we filled u, we can ensure to not repeat consecutive digits.
  • We will also need some extra information about the past that will be codified in a state b. It will be useful to know if we filled with a non-zero digit (because that determines if we can or can’t repeat zeros as they would be leading zeroes). And also it’s necessary to know if the prefix we completed is matching (or not) with the respective prefix in X, that’s important to fill the position with a digit and respect that Y ≤ X.

Note: The first and the last items are typical of any “DP on Digits” problem. The second one is particular to this problem: it’s the necessary information to satisfy the particular constraint over the digits.

The code of the recursive implementation of the construction of Y with DP memorization already applied.

The possible values for the state b are:

  • 0: if the already filled prefix matches the respective one from X.
  • 1: if the already completed prefix is lexicographically less than the respective one from X and there is some filled digit in the past different from zero.
  • 2: if the already filled prefix is all 0's.

Then, the recursive function f(pos, last, state) calculates the number of possible filling from [post, |X|), considering that the last digit was last and the past (already filled) prefix is any from the state indicated by state. f will do that by completing pos with some digit and calling itself accordingly (in pos+1, and with updated last and state):

  • Y[pos] can be d ∈ [0 .. 9] if state = 1,2 or d ∈ [0 .. X[pos]] if state = 0.
  • If pos > 0 then we will have to check that d last, except that state = 2 and last = 0 (because it will be a non-significant 0).
  • We must update the state considering the current state and d.

As the function computation is O(1) and the number of states is |X| x 10 x 3, the total complexity of the DP is O(|X|), so O(log X). Wow!

Note: many of these problems are solvable by number theory, combinatorics, or a greedy algorithm but solving them using “DP on digits” is really easy to implement (once we have practice) and it’s very easy to think that the solution is correct. Also, we can apply the techniques already known from DP and (for example) find the K-th solution that satisfies the constraints!


Layer DP

Motivational problem: Calculate the number of 2D boards of size N ≤ 16, with 0 or 1 on their cells such that doesn’t exist adjacents 1 (that have an edge in common).

A brute-force solution that constructs all the combinations would be very exponential O(2ᴺ ˣ ᴺ). A backtracking implementation will be filling the board, for example from cell (0,0) to (N-1, N-1), trying 1 and 0 and calling recursively (probably prunning checking there are no adjacent ones):

But notice again: we don’t need so much information about the past! If we are completing row by row from (0,0) to (N-1, N-1), as the backtracking does, we only need the information about the last N completed cells: this constitute the “last layer” and are the only cells that can be conflicting with the cells we will be completing in the future.

The last observation allows us to only have a state with the current position (i,j) and the state of the last N-filled cells. This state can be codified in a bitmask (it’s only 2ᴺ = 2¹⁶ = 65536 possibilities at most). The overall state has a size equal to N²2ᴺ. This is very small (16.777.216), we can apply memorization!

As each single function computation is O(1), then the total complexity of the DP is O(N²2ᴺ). Wow moment, again!

Homework: https://www.spoj.com/problems/GNY07H/

See also (it has a problem section with many exercises): https://cp-algorithms.com/dynamic_programming/profile-dynamics.html

Subset Bitmask Trick

Motivational problem: N friends want to share a piece of chocolate of size w x h blocks (w, h ≤ 100). Each of them wants one rectangular piece of exactly v₁, v₂, … vₙ blocks. We can split a piece of chocolate across an entire row or column and create two pieces. We can successively apply this operation. We want to know if we can satisfy the requirements of all the friends without remaining chocolate.

Example: if we have a 3 x 4 chocolate and 4 friends that want respectively pieces of sizes 6, 3, 2, 1. We can split it as follows:

The first split consist on splitting the 3x4 chocolate by column in two pieces of 3x3 and 3x1. The second split takes the 3x3 piece and split it by row in two pieces of 2x3 and 1x3. The third split takes the 1x3 piece and split it by column in pieces of 1x2 and 1x1. Finally, we got 4 pieces: 3x1, 2x3, 1x2 and 1x1 of respectively sizes 3, 6, 2, and 1 as required.

Another example: if we have a 2 x 3 chocolate and only 2 friends that want respectively pieces of 5 and 1 the answer is that is not possible: we only can make 1 split (as we will always have N-1 splits), and it’s impossible to obtain rectangle from size 1 as the height and width are both greater.

We can solve the problem for any size w x h and subset of friends S with a function f(w, h, S) by just iterating all the subsets T ⊂ S and splitting the chocolate (by column or by row) to satisfy consistently the demand of the subset T and S-T (recursive calls will call f on f(w’, h, T) and f(w-w’, h, S-T) or f(w, h’, T) and f(w, h-h’, S-T)). Remember that for codifying a subset S in a function we can use a bitmask.

Notice that:

  • It’s only necessary to iterate across all the subsets T ⊂ S (and not necessary to iterate through possible splits h’, h-h’) because for given h and w and T we have only one possible h’ (as the T friends want exactly some amount of chocolate that must be equal to w x h’).
  • The number of states is w x h x 2ᴺ. The cost of iteration (all subsets of S) is O(2ᴺ). That would make the complexity O(N²4ᴺ). But the reality is that if we iterate only the subsets of S (and not all the bitmasks T, and then check if they are subsets of S) the complexity is much less. That’s can be shown by a subtler analysis: for each iteration of the loop (on a different function call) we will have a different 4-uple (w, h, S, T). The complexity of the whole DP is the number of different 4-uples. Now, as T ⊂ S, the number of (S,T) is not 4ᴺ but 3ᴺ: that’s clear if we consider there is a bijection between each (S,T) with T ⊂ S [1,N] and the number of way of “coloring” with 3 colors N integers (the colors would be different for elements belonging to T, T-S and [1,N]-S).
  • For a given w and S the value of h is determined. As we explained in Part I (recovering a parameter section) we can drop the parameter from the state.

So finally, the complexity is O(w x 3ᴺ)

Code for iterating only the subsets of “set” bitmask. The complexity of executing this code for each of 2ᴺ masks is 3ᴺ. Be careful if you need to consider the empty subset as this “for” loop breaks when “subset” is empty.

Homework: https://onlinejudge.org/index.php?option=onlinejudge&Itemid=8&page=show_problem&problem=3540

End of Part II

This is the end of Part II. I will publish a series of articles about “Dynamic Programming” teaching the thinking intuition behind the technique (in contrast to most of tutorials that focus on classic DP problems).

About Me

If you liked this post / my style follow me.

I will try to keep on posting regularly about SE, Web, and ML always with these objectives in mind:

  • keep it short
  • being meaningful
  • explain with examples



Mariano Crosetti

NLP & Computer Vision Engineer 🧠 Only posts that add VALUE to GROWTH as SWE AI / ML 📈 ICPC LATAM Champion 🏆 (defeated Berkeley, Stanford & ETH) Ex-@Google