📖 Chapters

Data Structures and Algorithms in C (DSA)

Category: IT Fundamentals & Programming

This course is designed to help learners master the core concepts of Data Structures and Algorithms (DSA) using the C programming language. Starting from the basics and advancing to complex topics like graphs, dynamic programming, and memory optimization, this course …

Advanced DP (LIS, Knapsack, DP on Trees/Grids)

📘 Chapter: Advanced DP (LIS, Knapsack, DP on Trees/Grids)

In the previous chapter, we covered the fundamental principles of Dynamic Programming: Optimal Substructure and Overlapping Subproblems, along with the Memoization (top-down) and Tabulation (bottom-up) approaches. Now, we'll dive into some classic and more advanced DP problems. These problems will solidify your understanding and equip you with patterns that are applicable to a wide range of challenges, including those on sequences, items with weights, and structured data like trees and grids.

📈 1. Longest Increasing Subsequence (LIS)

The LIS problem is a fundamental sequence problem solved with DP.

Problem Statement:

Given an unsorted array of integers, find the length of the longest subsequence such that all elements of the subsequence are sorted in increasing order. A subsequence does not require contiguous elements.
Example: For `[10, 22, 9, 33, 21, 50, 41, 60]`, the LIS is `[10, 22, 33, 41, 60]` or `[10, 22, 33, 50, 60]`, both of length 5.

Approach 1: $O(N^2)$ DP

Define `dp[i]` as the length of the longest increasing subsequence ending at index `i`.

Base Case: `dp[i] = 1` for all `i` (each element itself forms an LIS of length 1).
Recurrence: To compute `dp[i]`, iterate through all `j` from `0` to `i-1`. If `arr[i] > arr[j]`, it means `arr[i]` can extend the LIS ending at `arr[j]`. So, `dp[i] = max(dp[i], 1 + dp[j])`.
The final answer is the maximum value in the `dp` array.

// C++ for LIS (O(N^2) DP)
int lengthOfLIS_N2(const vector<int>& nums) {
    if (nums.empty()) return 0;
    int n = nums.size();
    vector<int> dp(n, 1); // dp[i] = LIS ending at index i

    for (int i = 1; i < n; ++i) {
        for (int j = 0; j < i; ++j) {
            if (nums[i] > nums[j]) {
                dp[i] = max(dp[i], 1 + dp[j]);
            }
        }
    }
    return *max_element(dp.begin(), dp.end()); // Find max in dp array
}

Time Complexity: $O(N^2)$ due to nested loops.
Space Complexity: $O(N)$ for the `dp` array.

Approach 2: $O(N \log N)$ DP with Binary Search

This optimized approach maintains an array, say `tails`, where `tails[k]` stores the smallest ending element of an increasing subsequence of length `k+1`.

Initialize `tails` as an empty list/vector.
Iterate through each number `num` in the input array:
- Use binary search (specifically, `lower_bound` in C++ STL) to find the first element in `tails` that is greater than or equal to `num`.
- If such an element is found, it means `num` can replace that element to potentially form a shorter-ending LIS of the same length, or an LIS ending with a smaller number, which is always better. Update `tails[index]` with `num`.
- If no such element is found (meaning `num` is greater than all elements in `tails`), it implies `num` can extend the longest increasing subsequence found so far. Append `num` to `tails`.
The length of the `tails` array at the end will be the length of the LIS.

// C++ for LIS (O(N log N) DP)
int lengthOfLIS_NlogN(const vector<int>& nums) {
    if (nums.empty()) return 0;
    vector<int> tails; // stores the smallest ending element of an increasing subsequence of length i+1

    for (int num : nums) {
        // Find iterator to the first element in tails >= num
        auto it = lower_bound(tails.begin(), tails.end(), num);

        if (it == tails.end()) { // If num is greater than all elements in tails
            tails.push_back(num); // Extend LIS by 1
        } else {
            *it = num; // Replace existing element to form a smaller-ending LIS of same length
        }
    }
    return tails.size(); // The size of tails is the length of LIS
}

Time Complexity: $O(N \log N)$ due to $N$ iterations, each performing a binary search on `tails` (which can grow up to size $N$).
Space Complexity: $O(N)$ for the `tails` array.

🎒 2. Knapsack Problem (0/1 Knapsack)

A classic optimization problem where you must decide whether to include an item or not.

Problem Statement:

Given $N$ items, each with a `weight[i]` and a `value[i]`, and a knapsack with maximum `capacity W`. Determine the maximum total `value` that can be obtained by selecting a subset of items such that their total `weight` does not exceed `W`. Each item can either be entirely included or entirely excluded (hence "0/1").

DP State and Recurrence:

Define `dp[i][w]` as the maximum value that can be obtained using the first `i` items with a knapsack capacity of `w`.

Base Cases: `dp[0][w] = 0` (no items, no value). `dp[i][0] = 0` (zero capacity, no value).
Recurrence: For each item `i` (from 1 to N) and each capacity `w` (from 1 to W):
- If `weight[i-1]` (current item's weight) is greater than `w` (current capacity):
  - `dp[i][w] = dp[i-1][w]` (Cannot include item `i`, so value is same as with `i-1` items).
- Else (if `weight[i-1]` is less than or equal to `w`):
  - `dp[i][w] = max(dp[i-1][w], value[i-1] + dp[i-1][w - weight[i-1]])`
  - This means: either don't include item `i` (take value from `dp[i-1][w]`), OR include item `i` (add its value to the maximum value obtainable from `i-1` items with remaining capacity `w - weight[i-1]`).
The final answer is `dp[N][W]`.

// C++ for 0/1 Knapsack Problem
int knapsack(int W, const vector<int>& weights, const vector<int>& values, int N) {
    // dp[i][w] stores max value using first 'i' items with capacity 'w'
    vector<vector<int>> dp(N + 1, vector<int>(W + 1, 0));

    for (int i = 1; i <= N; ++i) { // Iterate through items
        for (int w = 1; w <= W; ++w) { // Iterate through capacities
            int current_weight = weights[i-1]; // weights and values are 0-indexed
            int current_value = values[i-1];

            if (current_weight > w) {
                // Cannot include current item, take value from previous items
                dp[i][w] = dp[i-1][w];
            } else {
                // Option 1: Don't include current item
                // Option 2: Include current item + max value from remaining capacity with previous items
                dp[i][w] = max(dp[i-1][w], current_value + dp[i-1][w - current_weight]);
            }
        }
    }
    return dp[N][W];
}

Time Complexity: $O(N \cdot W)$ due to nested loops.
Space Complexity: $O(N \cdot W)$ for the 2D DP table. This can be optimized to $O(W)$ using only two rows or even one row, as `dp[i][w]` only depends on `dp[i-1]` values.

// C++ for 0/1 Knapsack (O(W) Space Optimized)
int knapsack_optimized(int W, const vector<int>& weights, const vector<int>& values, int N) {
    vector<int> dp(W + 1, 0); // dp[w] stores max value for current capacity w

    for (int i = 0; i < N; ++i) { // Iterate through items
        // Iterate weights from W down to current_weight
        // This ensures dp[w - current_weight] is from previous iteration (i-1)
        for (int w = W; w >= weights[i]; --w) {
            dp[w] = max(dp[w], values[i] + dp[w - weights[i]]);
        }
    }
    return dp[W];
}

🗺️ 3. DP on Grids (Unique Paths)

Many pathfinding or counting problems on grids can be solved with DP.

Problem Statement:

A robot is located at the top-left corner of an `m x n` grid. The robot can only move either down or right at any point in time. How many unique paths are there from the top-left corner to the bottom-right corner?

DP State and Recurrence:

Define `dp[i][j]` as the number of unique paths to reach cell `(i, j)`.

Base Cases:
- `dp[0][j] = 1` for all `j` (only one way to reach any cell in the first row: keep moving right).
- `dp[i][0] = 1` for all `i` (only one way to reach any cell in the first column: keep moving down).
Recurrence: For any other cell `(i, j)` (where `i > 0` and `j > 0`):
- `dp[i][j] = dp[i-1][j] + dp[i][j-1]` (The robot can reach `(i, j)` either by coming down from `(i-1, j)` or by coming right from `(i, j-1)`).
The final answer is `dp[m-1][n-1]`.

// C++ for Unique Paths on a Grid
int uniquePaths(int m, int n) {
    // dp[i][j] stores number of unique paths to reach (i, j)
    vector<vector<int>> dp(m, vector<int>(n, 0));

    // Base cases: first row and first column have only 1 way to reach
    for (int i = 0; i < m; ++i) dp[i][0] = 1;
    for (int j = 0; j < n; ++j) dp[0][j] = 1;

    // Fill the rest of the table
    for (int i = 1; i < m; ++i) {
        for (int j = 1; j < n; ++j) {
            dp[i][j] = dp[i-1][j] + dp[i][j-1];
        }
    }
    return dp[m-1][n-1];
}

Time Complexity: $O(m \cdot n)$ to fill the grid.
Space Complexity: $O(m \cdot n)$ for the 2D DP table. Can be optimized to $O(min(m, n))$ by only keeping track of the previous row/column.

🌳 4. DP on Trees (Conceptual)

DP on trees is a common pattern where the state of a node's solution depends on the solutions of its children (or ancestors). These problems are typically solved using a DFS traversal.

Concept: For a rooted tree, the subproblems are typically defined over subtrees. A DFS traversal naturally explores subtrees before processing their parent, allowing the propagation of DP values upwards.
Pattern:
- Define `dp[u]` (or `dp[u][state]`) representing some property for the subtree rooted at `u`.
- Perform a DFS. When visiting node `u`, recursively call DFS for all its children `v`.
- After the recursive calls return, use `dp[v]` (results from children) to compute `dp[u]`.
Examples:
- Maximum sum of nodes in a binary tree such that no two adjacent nodes are selected.
- Diameter of a tree (can also be solved with two BFS/DFS, but DP approach exists).
- Minimum cost to convert a tree to satisfy certain properties.

📌 Real-Life Example (UdaanPath)

These advanced DP patterns are applicable in various complex optimization and data analysis scenarios:

LIS for Trend Analysis: If UdaanPath collects data on user engagement or content popularity over time, LIS could identify the longest periods of sustained growth or increasing user activity.
Knapsack for Resource Optimization:
- Course Bundle Creation: When creating course bundles, each course has a "value" (e.g., student interest, revenue potential) and "weight" (e.g., required instructor hours, server resources). A 0/1 Knapsack approach could help select the optimal set of courses for a bundle given resource constraints.
- Feature Prioritization: If a product team has a budget (capacity) and features have estimated development costs (weights) and impact scores (values), Knapsack can help select the most impactful features within the budget.
Grid DP for Pathfinding/Logistics: In a complex warehouse or campus navigation system (like guiding students to different labs in a multi-floor building), grid DP can calculate the number of unique valid routes or the minimum cost path (if weights are involved), accounting for obstacles.
DP on Trees for Content Hierarchy/Knowledge Graphs: If UdaanPath organizes its educational content or internal knowledge base as a tree or graph hierarchy, DP on trees could be used to:
- Calculate the total "knowledge points" or "impact" of a subtree of topics.
- Find the most efficient way to prune or restructure parts of the content tree while maintaining core dependencies.

These advanced DP techniques provide powerful tools for solving sophisticated optimization, counting, and structural analysis problems encountered in large-scale systems.

✅ Summary: Mastering DP Patterns

Advanced DP applies the core principles (optimal substructure, overlapping subproblems) to more complex scenarios.
Longest Increasing Subsequence (LIS):
- $O(N^2)$ DP: `dp[i]` = LIS ending at `i`.
- $O(N \log N)$ DP: Uses `tails` array and binary search.
0/1 Knapsack Problem:
- Maximize value within capacity, each item taken entirely or not.
- `dp[i][w]` = max value using first `i` items with capacity `w`.
- Time: $O(N \cdot W)$, Space: $O(N \cdot W)$ (can be $O(W)$).
DP on Grids (e.g., Unique Paths):
- Counting paths or finding min/max paths on a grid.
- `dp[i][j]` depends on `dp[i-1][j]` and `dp[i][j-1]`.
- Time: $O(rows \cdot cols)$, Space: $O(rows \cdot cols)$.
DP on Trees: Solved via DFS, where parent's DP state depends on children's computed states.

📤 Coming Next: Bit Manipulation and XOR Tricks

From the algorithmic heavy lifting of Dynamic Programming, we'll shift gears to a more "bit-level" understanding of data. Our next chapter will delve into Bit Manipulation and XOR Tricks, exploring how direct operations on binary representations of numbers can lead to incredibly efficient and elegant solutions for certain problems, often crucial in competitive programming and low-level optimization.

← Dynamic Programming (Intro, Memoization, Tabulation) Bit Manipulation and XOR Tricks →