Dynamic Programming amazing tricks! — Part III.a — All you need about DP optimizations — Knuth Magic!

Mariano Crosetti
7 min readDec 12, 2022


In Part III, I will present some popular DP optimizations. I will explain them with a story as they are hard-to-understand topics. And give an intuition to recognize scenarios where it might be applicable.

Part III will consist of three sub-articles: (a) Knuth optimization (this one), (b) Knapsack on trees, and (c) D&C optimization.


This series of guides will be helpful for you if you are:

  • preparing coding interviews.
  • training for competitive programming.
  • just want to learn how to come up with ideas to solve dynamic programming problems.

I will assume that you read the last parts.

My goal in this article is to explain popular “DP optimizations tricks”: advanced artifices that reduce time complexity. I will focus on providing insights so you to recognize DP’s formulas that can be optimized with them.

As I said in past articles, you will only truly understand it once you practice the ideas. So again, I put a homework section with exercises.

Finally, remember the content of this article is more advanced compared to the previous ones, so don’t demotivate if you don’t understand the concepts on the first read.

Knuth’s optimization

Motivational problem: Having N different integers Vᵢ with their frequencies Fᵢ, calculate the cost of the binary search tree with the minimum cost where the cost of a binary tree is calculated by the frequencies with the formula:

Lᵢ is the height of each node: the distance to the root plus one.

For example, if Vᵢ = [2, 1, 3] and the respective Fᵢ = [1, 100, 1] the answer is 105. The best solution will use 1 as the root, even when the resulting trees are imbalanced because it has a huge frequency compared with the others.

Observe that for any solution (any search tree) the Inorder traversal of the Vᵢ will be always the same: the Vᵢ sorted. We can construct the tree from the Inorder of the nodes: as all the solutions have the same in-order we are considering all of them. So we will sort the input by Vᵢ.

The first step then is to choose which node will be the root. Notice that once we sort, if Vᵣ is the root, then Vᵢ < Vᵣ < Vⱼ for all i < r < j so the Vᵢ ‘s will be placed on the left subtree and the Vⱼ ‘s will be placed on the right subtree. Then we can apply the algorithm recursively to both remaining subsets of nodes obtaining respective costs C₀ and C₁. The optimal tree that has Vᵣ as the root will have these optimal subtrees as respective left and right subtrees. For calculating the cost of this optimal tree we have to take into account the cost added by the root (Fᵣ) and the fact that the nodes from the subtrees will have in the resulting tree +1 of depth (so will have to add the sum of Fᵢ and Fⱼ to the cost). In conclusion, the cost will be:

We note sum(i,j) the sum of the Fᵢ in the subrange [i,j). It can be calculated in O(1) with a pre-processing of O(N).

We can notice that the successive recursive calls can be computed by following the same logic of iterating through all the possible choices of the root, recursively calculating the cost of the left and right tree costs, using them to calculate the cost of the optimal subtree that use the chosen root and minimizing the cost for all the choices of the root. Essentially:

Notice that the invariant of the recursion is that we always calculate an optimal subtree for a range of the original (sorted by Vᵢ) array. A subrange of Vₖ with i ≤ k < j can be represented by the endpoints (i,j).

As we explained in the DP Part I the total complexity is the product of the number of states and the complexity of the function (assuming the recursive calls are already calculated). Here we have O(N²) states and f does a loop of O(N). So the total complexity of the DP is O(N³).

We will see an optimization that easily converts it to O(N²). First, let’s do some observations. Let’s define K(i,j) as the minimum position k for which the minimization from f(i,j) finds an optimal value. Notice that if we magically know K(i,j) in O(1), we could reduce the complexity to O(N²) so it’s a pretty interesting function. It can be proven that for this particular problem the following condition holds:

  • K(i,j) is monotonic in both parameters.
  • In terms of our range DP, this means that if we observe the optimal root k for a subrange (i,j) and compare it with the optimal choice for a subrange (i-1, j) -the one that includes an element more to the left- the optimal root in this last subrange is the same or it’s on the left of optimal root from the (i,j) case: if we add elements to the left, the optimal “break” choice for the DP doesn’t move to the right (move left or stays equal).
  • Similarly, the first inequality expresses that if we add an element to the right the optimal choice doesn’t move to the left (move right or stay equal).

Even when we can’t calculate magically K in O(1), if we delimit A ≤ K(i,j) ≤ B we can loop k from A to B (instead that doing it from i to j-1). Notice that we delimited K(i,j) by K(i, j-1) and K(i+1, j). Also notice that [i, j-1) and [i+1, j) are smaller subranges (in length) than (i, j]. So we can compute K(i,j) while calculating f and be sure that K(i, j-1) and K(i+1, j) are calculated (for example in a bottom-up implementation) when we calculate K(i,j). So we can use these bounds to delimit the iteration.

Cost Analysis

This magically reduces the complexity to O(N²). Wow! Why? You can find a formal demonstration here (see “Proof of correctness” subsection). But let’s explain with images:

  • The iteration cost of f is now: K(i+1, j) — K(i, j-1) + 1. We represent it in a grid and painted K(i+1, j), and K(i, j-1) in green and red respectively (for some arbitrary choice of i,j):
Source of the image: https://www.pc-arg.com/media/attachment/dpavanzada.pdf (ICPC Argentinian Training Camp 2019) Author: Agustin Guitierrez (youtube.com/@agustin.elsantodel90).
  • Painting similarly for every (i,j):
Source of the image: https://www.pc-arg.com/media/attachment/dpavanzada.pdf (ICPC Argentinian Training Camp 2019) Author: Agustin Guitierrez (youtube.com/@agustin.elsantodel90).
  • Notice that all the internal costs get canceled contributing (each) with only a cost of 1. In total, here we get O(N²).
  • The costs from the right border didn’t cancel with anything, contributing (each) with O(N). In total, we get O(N²) from there.

So, the total cost is O(N²). This is an example of amortized cost analysis.

General Case

Notice that the exposed argument applied for any DP of the form:

The generic form of the DP where we can apply the “Knuth Optimization Trick” is a range DP where we are optimizing (minimizing or maximizing) something. We are not assuming anything about the cost.

If the optimal position K(i,j) of the minimization for a range (i,j) satisfy:

This condition is essential for the optimization so it’s called the “Knuth Condition”.

For deciding if it holds is good to think: How do behave the optimal “break” of a range (i,j) if we add an element to the right or left? Remember that the monotonic condition is the same as the optimal stay equal or moves left when we add elements to the left (and respectively the same for the right).

Sometimes is easier to check that the cost function C(i,j) satisfies the conditions (for a ≤ b ≤ c ≤ d):

The first condition can be understood as the cost increase (or at least doesn’t decrease) when the interval grows. The second condition (also known as quadrangle inequality) generally is a consequence as the cost grows at least linearly, or faster (so longer intervals have much longer costs than the sum of subintervals).

These conditions guarantee that Knuth’s condition holds.

Recognizing when it might apply

Always we get a DP on the range with minimization (or maximization) in O(N³) but we need a better complexity of O(N²) we must consider an instance of the problem for the subrange (i,j) and ask ourselves how the optimal “break” behaves regarding we add an element to the left or right. Does it always stay or move accordingly? If our intuition says “Yes” maybe it deserves to add the optimization and send it to the judge! (you don’t need to demonstrate the correctness for competitive programming!).

Generic implementation

Implementing it is an easy and tiny modification to the DP.

Based on the code of CP Algorithm, but respecting the bottom-up form of ranges DP defined in Part II.

If we want, we can easily implement Knuth Optimization also for a Top-Down range DP:

The only things we added to the classical range DP are the bounds to the for loop and the previous dummy calls (lines 12–14).


About Me

If you liked this post / my style follow me.

I will try to keep on posting regularly about SE, Web, and ML always with these objectives in mind:

  • keep it short
  • being meaningful
  • explain with examples



Mariano Crosetti

NLP & Computer Vision Engineer 🧠 Only posts that add VALUE to GROWTH as SWE AI / ML 📈 ICPC LATAM Champion 🏆 (defeated Berkeley, Stanford & ETH) Ex-@Google