📘 Chapter 15: Beyond Comparisons: Counting Sort, Radix Sort, Bucket Sort
So far, we've explored comparison-based sorting algorithms like Bubble, Selection, Insertion, Merge, and Quick Sort. These algorithms sort elements by comparing them to each other. However, there's a powerful class of sorting algorithms known as non-comparison sorts that can achieve linear time complexity, O(n), under specific conditions. They do this by leveraging information about the distribution or structure of the input numbers themselves, rather than just their relative order. In this chapter, we'll dive into three prominent non-comparison sorts: Counting Sort, Radix Sort, and Bucket Sort.
📊 Counting Sort: Ideal for Small Ranges
Counting Sort is an efficient sorting algorithm for integers when the range of input numbers (the difference between the maximum and minimum values) is not significantly larger than the number of items to be sorted. It works by counting the frequency of each distinct element in the input array, and then using that information to place the elements in their correct sorted positions.
```c
// Counting Sort in C
// Assumes non-negative integers: values 0..max are used directly as indices.
void countingSort(int arr[], int n) {
    // Find the maximum element to determine the range
    int max = arr[0];
    for (int i = 1; i < n; i++) {
        if (arr[i] > max)
            max = arr[i];
    }

    int count[max + 1]; // Count array: one slot per value 0..max
    int output[n];      // Output array

    // Initialize count array with all zeros
    for (int i = 0; i <= max; ++i) {
        count[i] = 0;
    }

    // Store the count of each element
    for (int i = 0; i < n; i++) {
        count[arr[i]]++;
    }

    // Store the cumulative counts (prefix sums).
    // This gives the final position of each element in the output array.
    for (int i = 1; i <= max; i++) {
        count[i] += count[i - 1];
    }

    // Place the elements into the output array.
    // Iterate from right to left to ensure stability.
    for (int i = n - 1; i >= 0; i--) {
        output[count[arr[i]] - 1] = arr[i];
        count[arr[i]]--;
    }

    // Copy the output array back into arr[], which now contains the sorted elements
    for (int i = 0; i < n; i++) {
        arr[i] = output[i];
    }
}
```
Time Complexity: O(n + k), where 'n' is the number of elements and 'k' is the range of input (max_value - min_value + 1). If 'k' is proportional to 'n' (i.e., k = O(n)), then the time complexity becomes O(n), which is linear.
Space Complexity: O(n + k) for the `count` and `output` arrays.
Counting Sort Example Visualization: `[1, 4, 1, 2, 7, 5, 2]` (Max value = 7)
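As a quick end-to-end check, here is a minimal driver (a hypothetical `main`, assuming the `countingSort` function above is in the same file) that sorts the example array:

```c
#include <stdio.h>

void countingSort(int arr[], int n); // defined above

int main(void) {
    int arr[] = {1, 4, 1, 2, 7, 5, 2};
    int n = sizeof(arr) / sizeof(arr[0]);

    countingSort(arr, n);

    // Prints: 1 1 2 2 4 5 7
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
    return 0;
}
```

Note how the two 1s and the two 2s keep their original relative order — that is exactly the stability the right-to-left placement guarantees.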
🔢 Radix Sort: Digit by Digit Efficiency
Radix Sort is a non-comparison sorting algorithm that sorts numbers by processing individual digits (or bits) from least significant to most significant (LSD Radix Sort) or vice versa (MSD Radix Sort). It requires a stable sorting algorithm (like Counting Sort) as a subroutine for sorting digits. This makes it particularly effective for integers or strings with a fixed length.
```c
// Radix Sort in C (using Counting Sort as a subroutine)

// A utility function to get the maximum value in arr[]
int getMax(int arr[], int n) {
    int max = arr[0];
    for (int i = 1; i < n; i++)
        if (arr[i] > max)
            max = arr[i];
    return max;
}

// Counting sort of arr[] according to the digit represented by 'exp'
void countSortForRadix(int arr[], int n, int exp) {
    int output[n];          // Output array
    int i, count[10] = {0}; // One counter per digit 0-9

    // Store the count of occurrences of each digit
    for (i = 0; i < n; i++)
        count[(arr[i] / exp) % 10]++;

    // Change count[i] so that it contains the actual
    // position of this digit in output[]
    for (i = 1; i < 10; i++)
        count[i] += count[i - 1];

    // Build the output array (right to left, to keep the sort stable)
    for (i = n - 1; i >= 0; i--) {
        output[count[(arr[i] / exp) % 10] - 1] = arr[i];
        count[(arr[i] / exp) % 10]--;
    }

    // Copy the output array back into arr[], now sorted by the current digit
    for (i = 0; i < n; i++)
        arr[i] = output[i];
}

// The main function that sorts arr[] of size n using Radix Sort
void radixSort(int arr[], int n) {
    // Find the maximum number to know the number of digits
    int m = getMax(arr, n);

    // Do counting sort for every digit. Instead of passing a digit number,
    // exp is passed: exp is 10^i, where i is the current digit position.
    for (int exp = 1; m / exp > 0; exp *= 10)
        countSortForRadix(arr, n, exp);
}
```
Time Complexity: O(d * (n + b)), where 'd' is the number of digits (or passes), 'n' is the number of elements, and 'b' is the base (radix) of the numbers (usually 10 for decimal numbers). If 'd' and 'b' are constants, this simplifies to O(n), making it linear.
Space Complexity: O(n + b) due to the auxiliary space used by Counting Sort (output array and count array).
Radix Sort Example Visualization: `[170, 45, 75, 90, 802, 24, 2, 66]`
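To see the digit-by-digit passes concretely, here is a minimal driver (hypothetical, assuming the `radixSort` function above is in the same file); the comments trace the array after each Counting Sort pass:

```c
#include <stdio.h>

void radixSort(int arr[], int n); // defined above

int main(void) {
    int arr[] = {170, 45, 75, 90, 802, 24, 2, 66};
    int n = sizeof(arr) / sizeof(arr[0]);

    // After pass 1 (exp = 1,   units digit):    170 90 802 2 24 45 75 66
    // After pass 2 (exp = 10,  tens digit):     802 2 24 45 66 170 75 90
    // After pass 3 (exp = 100, hundreds digit): 2 24 45 66 75 90 170 802
    radixSort(arr, n);

    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
    return 0;
}
```

Each pass orders by only one digit, yet because every pass is stable, the ordering established by less significant digits is preserved among elements whose current digit ties.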
🗑️ Bucket Sort: Distributing into Bins
Bucket Sort, also known as bin sort, is a distribution sort algorithm that distributes elements of an array into a number of buckets. Each bucket is then sorted individually, either using a different sorting algorithm (like Insertion Sort, which is efficient for small inputs) or by recursively applying the bucket sort algorithm. Finally, the elements are gathered from the buckets in order. Bucket Sort works best when the input is uniformly distributed over a range.
```c
// Bucket Sort in C (illustrative version for non-negative integers;
// real-world Bucket Sort is most often shown with floating-point data)
#include <string.h> // for memset

// Insertion Sort, used to sort each individual bucket
// (efficient for the small inputs we expect per bucket)
void insertionSortForBucket(int arr[], int n) {
    for (int i = 1; i < n; i++) {
        int key = arr[i];
        int j = i - 1;
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = key;
    }
}

void bucketSort(int arr[], int n) {
    if (n <= 0)
        return;

    // 1. Find the maximum value, used to map each element to a bucket.
    int max_val = arr[0];
    for (int i = 1; i < n; i++) {
        if (arr[i] > max_val)
            max_val = arr[i];
    }

    // 2. Create the buckets. A robust C implementation would use linked
    //    lists or dynamically resizing arrays for the buckets; for
    //    illustration we use a 2D VLA with room for n elements per bucket
    //    (the worst case, when every element lands in the same bucket).
    int num_buckets = 10;
    int buckets[num_buckets][n];
    int bucket_sizes[num_buckets];
    memset(bucket_sizes, 0, sizeof(bucket_sizes));

    // 3. Distribute the elements into buckets: a value v maps to bucket
    //    (v * num_buckets) / (max_val + 1), so indices stay in range.
    for (int i = 0; i < n; i++) {
        int bucket_idx = (int)(((long long)arr[i] * num_buckets)
                               / ((long long)max_val + 1));
        buckets[bucket_idx][bucket_sizes[bucket_idx]++] = arr[i];
    }

    // 4. Sort each bucket individually.
    for (int i = 0; i < num_buckets; i++) {
        insertionSortForBucket(buckets[i], bucket_sizes[i]);
    }

    // 5. Concatenate the sorted buckets back into arr[].
    int index = 0;
    for (int i = 0; i < num_buckets; i++) {
        for (int j = 0; j < bucket_sizes[i]; j++) {
            arr[index++] = buckets[i][j];
        }
    }
}
```
Time Complexity: O(n + k) on average, where 'n' is the number of elements and 'k' is the number of buckets, assuming the elements are uniformly distributed so that each bucket stays small. In the worst case, when all elements land in the same bucket, performance degrades to that of the per-bucket sort (e.g., O(n²) with Insertion Sort).
Space Complexity: O(n + k) for the buckets.
Bucket Sort Example Visualization: `[0.89, 0.56, 0.65, 0.12, 0.69, 0.45]` (Numbers between 0.0 and 1.0, 10 buckets)
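The visualization above uses floating-point values in [0.0, 1.0), which is the classic textbook setting for Bucket Sort. Here is a minimal sketch of that variant; the function name `bucketSortFloats` and the fixed per-bucket capacity are illustrative assumptions, not part of the course code:

```c
#include <stdio.h>
#include <string.h> // for memset

#define NUM_BUCKETS 10
#define MAX_PER_BUCKET 64 // illustrative fixed capacity per bucket

// Sorts floats in [0.0, 1.0): distribute into 10 buckets,
// insertion-sort each bucket, then concatenate.
void bucketSortFloats(float arr[], int n) {
    float buckets[NUM_BUCKETS][MAX_PER_BUCKET];
    int sizes[NUM_BUCKETS];
    memset(sizes, 0, sizeof(sizes));

    // A value x in [0, 1) lands in bucket floor(x * 10)
    for (int i = 0; i < n; i++) {
        int idx = (int)(arr[i] * NUM_BUCKETS);
        buckets[idx][sizes[idx]++] = arr[i];
    }

    int k = 0;
    for (int b = 0; b < NUM_BUCKETS; b++) {
        // Insertion-sort this bucket
        for (int i = 1; i < sizes[b]; i++) {
            float key = buckets[b][i];
            int j = i - 1;
            while (j >= 0 && buckets[b][j] > key) {
                buckets[b][j + 1] = buckets[b][j];
                j--;
            }
            buckets[b][j + 1] = key;
        }
        // Gather the sorted bucket back into arr[]
        for (int i = 0; i < sizes[b]; i++)
            arr[k++] = buckets[b][i];
    }
}

int main(void) {
    float arr[] = {0.89f, 0.56f, 0.65f, 0.12f, 0.69f, 0.45f};
    int n = sizeof(arr) / sizeof(arr[0]);

    bucketSortFloats(arr, n);

    // Prints: 0.12 0.45 0.56 0.65 0.69 0.89
    for (int i = 0; i < n; i++)
        printf("%.2f ", arr[i]);
    printf("\n");
    return 0;
}
```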
⚡ Advantages and Limitations of Non-Comparison Sorts
These algorithms offer a compelling alternative to comparison sorts:
- Potential for O(n) Complexity: Their biggest advantage is the ability to sort in linear time under specific conditions, sidestepping the Ω(n log n) lower bound that applies to all comparison sorts.
- Counting Sort: Excellent for integers within a small, well-defined range. It's stable.
- Radix Sort: Efficient for integers, especially when they have a fixed or predictable maximum number of digits. The overall sort is stable as long as the digit-sorting subroutine (like Counting Sort) is stable.
- Bucket Sort: Works well for uniformly distributed data, especially floating-point numbers. Its performance degrades if data is clustered. It can be stable depending on the sorting algorithm used for buckets.
However, they come with crucial limitations:
- Data Type Restrictions: Primarily designed for integers or data that can be mapped to integers (like characters in Radix sort). They are not general-purpose for arbitrary data types.
- Range/Distribution Dependence:
- Counting Sort's efficiency degrades if the range 'k' is very large relative to 'n' (e.g., sorting just 3 numbers between 1 and 1 billion would still require a count array with a billion entries).
- Radix Sort's efficiency depends on the number of digits/passes.
- Bucket Sort's efficiency relies heavily on uniform data distribution. If all elements fall into one bucket, it degenerates to the performance of the bucket's internal sorting algorithm.
- Higher Space Complexity: They often require significant auxiliary space for count arrays, buckets, or output arrays (O(n+k) or O(n+b)).
📌 Real-Life Example (UdaanPath)
These non-comparison sorts, given their specialized nature, find practical applications in specific niches within a platform like UdaanPath:
- Counting Sort for Quiz Scores: If UdaanPath hosts quizzes where scores are integers within a fixed, relatively small range (e.g., 0-100 points), Counting Sort could be incredibly fast for ranking students or tallying score distributions. For example, if 10,000 students take a quiz scored out of 100, Counting Sort processes their scores in roughly 10,000 + 100 = 10,100 operations, which is practically instantaneous (a small sketch of such a tally appears after this list).
- Radix Sort for User IDs/Course Codes: If UdaanPath assigns fixed-length alphanumeric user IDs or course codes (e.g., "UD12345", "DS98765"), Radix Sort could be used for extremely fast sorting of these identifiers, especially when dealing with millions of records. It's also efficient for IP addresses, which are essentially numbers.
- Bucket Sort for Performance Metrics: For analyzing student performance metrics that are normalized to a scale (e.g., overall skill level from 0.0 to 1.0, or percentile ranks), Bucket Sort could efficiently group students into performance tiers. If student skill levels are evenly distributed, this provides a quick way to segment the user base for targeted educational content or interventions.
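As a minimal sketch of the quiz-score idea above (the function name `scoreDistribution` and the `MAX_SCORE` constant are illustrative, not part of any UdaanPath code), tallying a distribution is just the counting phase of Counting Sort:

```c
#include <stdio.h>

#define MAX_SCORE 100 // scores assumed to lie in 0..100

// Tally how many students got each score, in O(n + k) time:
// one pass over the n scores, one pass over the k = 101 possible values.
void scoreDistribution(const int scores[], int n, int freq[MAX_SCORE + 1]) {
    for (int s = 0; s <= MAX_SCORE; s++)
        freq[s] = 0;
    for (int i = 0; i < n; i++)
        freq[scores[i]]++;
}

int main(void) {
    int scores[] = {72, 95, 72, 100, 58, 95};
    int freq[MAX_SCORE + 1];

    scoreDistribution(scores, 6, freq);

    for (int s = 0; s <= MAX_SCORE; s++)
        if (freq[s] > 0)
            printf("score %3d: %d student(s)\n", s, freq[s]);
    return 0;
}
```

Reading the frequencies back in index order immediately yields the sorted score list, which is exactly how Counting Sort would rank the students.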
✅ Summary: Specialized Sorting Powerhouses
- Non-comparison sorts leverage specific properties of data to sort, potentially achieving O(n) time complexity.
- Counting Sort is excellent for integers within a small, known range.
- Radix Sort sorts by processing digits/bits, ideal for integers and fixed-length strings, often using Counting Sort as a stable subroutine.
- Bucket Sort distributes elements into bins and sorts each bin, performing best with uniformly distributed data.
- While highly efficient under their specific conditions, their applicability is limited by data type, range, or distribution constraints, and they generally require more auxiliary space.
📤 Coming Next: Heap Sort - An In-Place O(n log n) Solution
We've seen the power of both comparison-based and non-comparison-based sorts. In the next chapter, we'll explore Heap Sort, another efficient comparison-based sorting algorithm that leverages a specialized data structure called a "Heap" to achieve O(n log n) time complexity with the significant advantage of being an in-place sort.