Find the repeating and missing numbers
You are given an integer array nums of size n containing numbers in the range [1, n]. Every number appears exactly once except one value A which appears twice and another value B which is missing. Return [A, B] (A first - the repeating number, B second - the missing number). You must not modify the original array.
Examples:
nums = [3, 5, 4, 1, 1]
output [1, 2] (1 repeats, 2 missing)
nums = [1, 2, 3, 6, 7, 5, 7]
output [7, 4] (7 repeats, 4 missing)
Instead of jumping straight to shortcuts, optimal solution or throwing formulas at you, this article takes the time to explain why each method works. We begin with the simplest brute-force idea, then slowly improve it step by step, until we reach the clean and efficient final solution. The explanations are simple, natural and written in a way that feels like someone personally guiding you through the problem.
1) Brute Force (Nested loops)
The most straightforward way is to compare every element with every other element. If two positions hold the same value, that number is our repeating value A. Once we know A, we simply check which number from 1 to n is missing to find B.
Steps
-
Start with the first element and compare it with every element that comes after it.
-
Continue this for all positions. The moment you find two equal values, you've found the repeating number A.
-
Next, look for the missing number. Go through the numbers 1 to n and check if each one exists in the array. The number that does not appear is your missing value B
Code
#include <vector>
using namespace std;
class SolutionBrute {
public:
vector<int> findErrorNums(vector<int>& nums) {
int n = nums.size();
int A = -1, B = -1;
// Find repeating number A
for (int i = 0; i < n; i++) {
for (int j = i + 1; j < n; j++) {
if (nums[i] == nums[j]) {
A = nums[i];
break;
}
}
if (A != -1) break;
}
// Find missing number B
for (int k = 1; k <= n; k++) {
bool found = false;
for (int x = 0; x < n; x++) {
if (nums[x] == k) {
found = true;
break;
}
}
if (!found) {
B = k;
break;
}
}
return {A, B};
}
};
public class SolutionBrute {
public int[] findErrorNums(int[] nums) {
int n = nums.length;
int A = -1, B = -1;
// Find repeating A using nested loops
for (int i = 0; i < n; i++) {
for (int j = i + 1; j < n; j++) {
if (nums[i] == nums[j]) {
A = nums[i];
break;
}
}
if (A != -1) break;
}
// Find missing B by checking every number from 1..n
for (int k = 1; k <= n; k++) {
boolean found = false;
for (int x = 0; x < n; x++) {
if (nums[x] == k) {
found = true;
break;
}
}
if (!found) {
B = k;
break;
}
}
return new int[]{A, B};
}
}
public class SolutionBrute {
public int[] FindErrorNums(int[] nums) {
int n = nums.Length;
int A = -1, B = -1;
// Find repeating number A
for (int i = 0; i < n; i++) {
for (int j = i + 1; j < n; j++) {
if (nums[i] == nums[j]) {
A = nums[i];
break;
}
}
if (A != -1) break;
}
// Find missing number B
for (int k = 1; k <= n; k++) {
bool found = false;
for (int x = 0; x < n; x++) {
if (nums[x] == k) {
found = true;
break;
}
}
if (!found) {
B = k;
break;
}
}
return new int[] { A, B };
}
}
class SolutionBrute:
def findErrorNums(self, nums):
n = len(nums)
A = -1
B = -1
# Find repeating number A
for i in range(n):
for j in range(i + 1, n):
if nums[i] == nums[j]:
A = nums[i]
break
if A != -1:
break
# Find missing number B
for k in range(1, n + 1):
found = False
for x in nums:
if x == k:
found = True
break
if not found:
B = k
break
return [A, B]
Complexity
-
Time: O(n²) because we compare pairs and then scan again to find the missing number.
-
Space: O(1) since we don't use extra data structures (aside from a few basic variables).
This method is slow but easy to understand , a pure brute-force approach.
2) Brute Force : Counting by Scanning Each Number (Another Simple Approach)
After the first brute-force method, another straightforward idea is to check each number from 1 to n and count how many times it appears in the array.
Understanding the thought process
Think of it this way: pick the number 1, scan the whole array and see whether it appears once, twice or not at all. Then repeat the same check for 2, 3 and every other number. This gives us both the repeating and the missing values, but it still involves a lot of repeated scanning.
Steps
-
For every number between 1 and n, count how many times it appears in the array.
-
The number that appears twice is your repeating number.
-
The number that never appears is your missing number.
Code
#include <vector>
using namespace std;
vector<int> findErrorNumsBrute(vector<int>& nums) {
int n = nums.size();
int repeating = -1, missing = -1;
for (int i = 1; i <= n; i++) {
int count = 0;
for (int num : nums) {
if (num == i) count++;
}
if (count == 2) repeating = i;
if (count == 0) missing = i;
}
return {repeating, missing};
}
public int[] findErrorNumsBrute(int[] nums)
{
int n = nums.length;
int repeating = -1, missing = -1;
for (int i = 1; i <= n; i++)
{
int count = 0;
for (int num : nums)
{
if (num == i) count++;
}
if (count == 2) repeating = i;
if (count == 0) missing = i;
}
return new int[] { repeating, missing };
}
public int[] FindErrorNumsBrute(int[] nums)
{
int n = nums.Length;
int repeating = -1, missing = -1;
for (int i = 1; i <= n; i++)
{
int count = 0;
foreach (int num in nums)
{
if (num == i) count++;
}
if (count == 2) repeating = i;
if (count == 0) missing = i;
}
return new int[] { repeating, missing };
}
def findErrorNumsBrute(nums):
n = len(nums)
repeating = -1
missing = -1
for i in range(1, n + 1):
count = 0
for num in nums:
if num == i:
count += 1
if count == 2:
repeating = i
if count == 0:
missing = i
return [repeating, missing]
Complexity
-
Time: O(n²) because we scan the array again for each number.
-
Space: O(1) since we don't use additional data structures.
3) Better Approach: Using a Frequency Array / Hash Map (Simple and Beginner-Friendly)
A much cleaner approach is to count how many times each number appears. Instead of scanning the array repeatedly, we just record the frequency of every number from 1 to n. The number that appears twice becomes our repeating value A and the number that appears zero times is the missing value B. The best part: we do all this without touching or modifying the original array.
Steps
-
Create a frequency array/hashmap of size n + 1.
-
Go through the input array and increase the count for each number.
-
After that, scan through the frequency array:
-
The number with frequency 2 is the repeating number.
-
The number with frequency 0 is the missing number.
-
Code
#include <vector>
using namespace std;
class SolutionFreq {
public:
vector<int> findErrorNums(vector<int>& nums) {
int n = nums.size();
vector<int> freq(n + 1, 0);
// Count frequency
for (int num : nums)
freq[num]++;
int A = -1, B = -1;
for (int i = 1; i <= n; i++) {
if (freq[i] == 2) A = i;
if (freq[i] == 0) B = i;
}
return {A, B};
}
};
public class SolutionFreq {
public int[] findErrorNums(int[] nums) {
int n = nums.length;
int[] freq = new int[n + 1];
// Count frequency of each value
for (int num : nums) {
freq[num]++;
}
int A = -1, B = -1;
// Find repeating and missing
for (int i = 1; i <= n; i++) {
if (freq[i] == 2) A = i; // repeating
if (freq[i] == 0) B = i; // missing
}
return new int[]{A, B};
}
}
public class SolutionFreq
{
public int[] FindErrorNums(int[] nums)
{
int n = nums.Length;
int[] freq = new int[n + 1];
// Count frequency
foreach (int num in nums)
freq[num]++;
int A = -1, B = -1;
// Find repeating and missing
for (int i = 1; i <= n; i++)
{
if (freq[i] == 2) A = i;
if (freq[i] == 0) B = i;
}
return new int[] { A, B };
}
}
def findErrorNums(nums):
n = len(nums)
freq = [0] * (n + 1)
# Count frequency
for num in nums:
freq[num] += 1
A = -1
B = -1
# Find repeating and missing
for i in range(1, n + 1):
if freq[i] == 2:
A = i
elif freq[i] == 0:
B = i
return [A, B]
Complexity
-
Time: O(n) : one pass to count frequencies and another to check them.
-
Space: O(n) : because we use an extra frequency array of size n + 1.
4) Optimal Approach 1: Math Formula Method (Sum & Sum of Squares) - O(1) Extra Space
This approach uses basic algebra to find the repeating and missing numbers without any extra memory. If all values from 1 to n were present correctly, their total sum and their sum of squares would match known formulas. But because one number A appears twice and another number B is missing, these sums become unbalanced and we can use that imbalance to solve the puzzle.
Understanding the idea
-
The difference between the actual sum and the expected sum gives us A − B.
-
The difference between the actual sum of squares and the expected sum of squares gives us A² − B², which can be rewritten as (A − B)(A + B).
-
With these two equations, we can easily calculate both A and B.
-
This method is clean, fast and uses constant space.
Math Formula
As we know the summation of the first N numbers is:
(N*(N+1))/2
and the summation of squares of the first N numbers is:
(N*(N+1)*(2N+1))/6
Steps
-
Calculate the sum of all elements and the sum of their squares. Use long to avoid overflow.
-
Compute the expected sum and expected sum of squares for numbers 1 to n.
-
Find the differences:
-
diff = S − S_true -> this equals A − B ......... (i)
-
sqDiff = P − P_true -> this equals A² − B² .............(ii)
-
Compute sumAB = sqDiff / diff -> (A − B)(A + B) / (A − B) -> this gives A + B. ......... (iii)
-
-
Solve the two equations to get:
-
A => 2A/2 => (A + A )/2 => (A - B + A + B)/2 => (diff + sumAB) / 2 (in other words adding eq (i) and (iii) diff + sumAB = A - B + A + B => A = (diff + sumAB) / 2)
-
B = A − diff (from eq (i))
-
-
Return the values.
Code
#include <vector>
using namespace std;
class SolutionMath {
public:
vector<int> findErrorNums(vector<int>& nums) {
int n = nums.size();
long long S = 0; // sum(nums)
long long P = 0; // sum(nums^2)
for (int num : nums) {
S += num;
P += 1LL * num * num;
}
long long S_true = 1LL * n * (n + 1) / 2;
long long P_true = 1LL * n * (n + 1) * (2LL * n + 1) / 6;
long long diff = S - S_true; // A - B
long long sqDiff = P - P_true; // A^2 - B^2
long long sumAB = sqDiff / diff; // A + B
long long A = (diff + sumAB) / 2;
long long B = A - diff;
return {(int)A, (int)B};
}
};
public class SolutionMath {
public int[] findErrorNums(int[] nums) {
int n = nums.length;
long S = 0; // sum(nums)
long P = 0; // sum(nums^2)
for (int num : nums) {
S += num;
P += 1L * num * num;
}
long S_true = 1L * n * (n + 1) / 2;
long P_true = 1L * n * (n + 1) * (2L * n + 1) / 6;
long diff = S - S_true; // A - B
long sqDiff = P - P_true; // A^2 - B^2
long sumAB = sqDiff / diff; // A + B
long A = (diff + sumAB) / 2;
long B = A - diff;
return new int[]{(int) A, (int) B};
}
}
public class SolutionMath
{
public int[] FindErrorNums(int[] nums)
{
int n = nums.Length;
long S = 0; // sum(nums)
long P = 0; // sum(nums^2)
foreach (int num in nums)
{
S += num;
P += (long)num * num;
}
long S_true = (long)n * (n + 1) / 2;
long P_true = (long)n * (n + 1) * (2L * n + 1) / 6;
long diff = S - S_true; // A - B
long sqDiff = P - P_true; // A^2 - B^2
long sumAB = sqDiff / diff; // A + B
long A = (diff + sumAB) / 2;
long B = A - diff;
return new int[] { (int)A, (int)B };
}
}
def findErrorNumsMath(nums):
n = len(nums)
S = 0 # sum(nums)
P = 0 # sum(nums^2)
for num in nums:
S += num
P += num * num
S_true = n * (n + 1) // 2
P_true = n * (n + 1) * (2 * n + 1) // 6
diff = S - S_true # A - B
sqDiff = P - P_true # A^2 - B^2
sumAB = sqDiff // diff # A + B
A = (diff + sumAB) // 2
B = A - diff
return [A, B]
Complexity
-
Time: O(n) : just one pass to compute sums.
-
Space: O(1) : only a few variables are used.
5) Optimal Approach 2: XOR Partitioning Method (Fast and Uses O(1) Space)
This approach uses a classic XOR trick often applied to problems where two numbers need to be identified. XOR has a useful property: if you XOR two identical numbers, they cancel each other out. We take advantage of this to isolate the repeating and missing values.
Understanding the idea
-
First, XOR all the numbers in the array and all numbers from 1 to n.
-
Since the repeating number A appears twice, one of its copies cancels out.
-
The missing number B never appears, so it remains in the final XOR.
-
After all cancellations, what we are left with is: xor = A ^ B
How do we separate A and B from this XOR?
If xor is not zero, it means A and B differ in at least one bit.
For example, if xor = 001000
this 1-bit tells us: At this position, A has one value and B has the opposite value
That single differing bit becomes our tool.
Why this bit matters
We use this bit to divide all numbers into two groups:
-
Group 1 : the bit is 0
-
Group 2 : the bit is 1
Since A and B differ at this bit : one of them must fall into Group 1 and the other must fall into Group 2
Why does grouping reveal A and B?
Once we split all numbers into two groups based on the differing bit, we XOR the numbers inside each group separately. Every normal number appears once in the array and once in the range 1 to n and both copies fall into the same group so they cancel each other out. After this cancellation, each group is left with only one unmatched value: one group holds A and the other holds B. These two results, which we can call x and y are exactly the numbers we're trying to find.
How do we know which is the repeating number?
To identify which of x or y is the repeating value, we simply check the original array. The number that appears twice is A and the other one is the missing number B. This final scan tells us their correct order.
Steps
-
XOR all values in the array and all numbers from 1 to n -> result is A ^ B.
-
Find any set bit in this XOR result (commonly the rightmost set bit).
-
Split numbers from both the array and the range 1..n into two buckets based on this bit.
-
XOR within each bucket separately this gives two numbers, say x and y.
-
One of them is the repeating number A and the other is the missing number B.
-
Scan the array once:
- If x appears twice, then A = x and B = y. Otherwise, A = y and B = x.
#include <vector>
using namespace std;
class SolutionXOR {
public:
vector<int> findErrorNums(vector<int>& nums) {
int n = nums.size();
int xorVal = 0;
for (int i = 0; i < n; i++) {
xorVal ^= nums[i];
xorVal ^= (i + 1);
}
int rightmostSetBit = xorVal & -xorVal;
int x = 0, y = 0;
for (int i = 0; i < n; i++) {
if (nums[i] & rightmostSetBit) x ^= nums[i];
else y ^= nums[i];
int val = i + 1;
if (val & rightmostSetBit) x ^= val;
else y ^= val;
}
int countX = 0;
for (int num : nums) {
if (num == x) countX++;
}
return countX == 2 ? vector<int>{x, y} : vector<int>{y, x};
}
};
public class SolutionXOR {
public int[] findErrorNums(int[] nums) {
int n = nums.length;
int xor = 0;
// XOR all elements in nums and all numbers from 1..n
for (int i = 0; i < n; i++) {
xor ^= nums[i];
xor ^= (i + 1);
}
// xor = A ^ B (missing ^ repeating)
int rightmostSetBit = xor & -xor;
int x = 0, y = 0;
for (int i = 0; i < n; i++) {
if ((nums[i] & rightmostSetBit) != 0) x ^= nums[i];
else y ^= nums[i];
int val = i + 1;
if ((val & rightmostSetBit) != 0) x ^= val;
else y ^= val;
}
// Check which one is repeating
int countX = 0;
for (int num : nums) {
if (num == x) countX++;
}
return countX == 2 ? new int[]{x, y} : new int[]{y, x};
}
}
public class SolutionXOR
{
public int[] FindErrorNums(int[] nums)
{
int n = nums.Length;
int xor = 0;
// XOR nums[i] and (i+1)
for (int i = 0; i < n; i++)
{
xor ^= nums[i];
xor ^= (i + 1);
}
int rightmostSetBit = xor & -xor;
int x = 0, y = 0;
for (int i = 0; i < n; i++)
{
if ((nums[i] & rightmostSetBit) != 0)
x ^= nums[i];
else
y ^= nums[i];
int val = i + 1;
if ((val & rightmostSetBit) != 0)
x ^= val;
else
y ^= val;
}
int countX = 0;
foreach (int num in nums)
{
if (num == x) countX++;
}
return countX == 2 ? new int[] { x, y } : new int[] { y, x };
}
}
def findErrorNumsXOR(nums):
n = len(nums)
xor = 0
# XOR nums[i] and (i+1)
for i in range(n):
xor ^= nums[i]
xor ^= (i + 1)
rightmostSetBit = xor & -xor
x = 0
y = 0
for i in range(n):
if nums[i] & rightmostSetBit:
x ^= nums[i]
else:
y ^= nums[i]
val = i + 1
if val & rightmostSetBit:
x ^= val
else:
y ^= val
countX = nums.count(x)
return [x, y] if countX == 2 else [y, x]
Complexity
-
Time: O(n) , only a few simple passes.
-
Space: O(1) , no extra storage needed.
Complexity Comparison Table
| Approach | Time Complexity | Extra Space | Stability / Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(1) | Simple but slow for large n |
| Frequency (array / HashMap) | O(n) | O(n) | Clear and safe, easy to implement |
| Math (sum & sum of squares) | O(n) | O(1) | Fast and memory-efficient, requires long to avoid overflow |
| XOR partitioning | O(n) | O(1) | Fast, memory-efficient, bitwise trick, needs one additional pass to determine which value is the duplicate |
Real-World Use Cases of the "Repeating and Missing Number" Problem
These techniques for finding the repeating and missing number are very useful in real engineering workflows. Here are some of the most common situations where they're used
1. Data Integrity Checks
When large datasets are imported or merged, it's easy for a process to accidentally duplicate one row and skip another. These methods help pinpoint exactly which ID was repeated and which one went missing.
2. Audit & Reconciliation
Systems that generate sequential identifiers like invoices, order IDs, ticket numbers or batch IDs may occasionally issue the same number twice or leave one unused. Detecting this mismatch is essential for clean audits and reliable reconciliation.
3. ETL and Data Pipeline Operations
During transformations, joins or shuffling operations, duplicate keys or missing keys can silently appear. Since many pipelines avoid modifying source data, the O(1 space math/XOR approach offers a fast way to detect such issues without extra memory overhead.
4. Telemetry and Event Streaming Systems
Event logs often rely on sequential event IDs. Network retries can create duplicates, while dropped packets can leave gaps. These algorithms verify the continuity of event streams and catch missing or repeated event numbers.
Overall, these approaches provide lightweight, high-performance diagnostics especially the XOR-based solutions, which are perfect when memory is tight or when working with large arrays that you don't want to copy or modify.
Final Notes
-
All the solutions discussed above work without modifying the original array, which is important when the input must stay unchanged.
-
When using the Math based approach, make sure to use
longfor calculating sums and sum-of-squares. With large n (like 10⁵ or more), int can overflow and lead to incorrect results. -
The XOR technique offers a great balance of high performance and constant space usage, making it ideal when memory is limited.
-
The frequency based method is the simplest to understand and maintain, which is why it's often preferred in production codebases where readability and debugging matter.
