I have written a solution for this question before: http://www.lifeincode.net/programming/leetcode-median-of-two-sorted-arrays-java/
That solution seems easy to understand. However, it turns out to be very difficult to implement. The hardest part is how we come up with a correct way to handle borders.
After reading some post, I found there is another way to achieve findKth number from two sorted arrays, and the running time complexity is $O(log(min(m, n)))$ instead of $O(log(m + n))$, even better than the requirement.
Let’s go through the question again.
There are two sorted arrays A and B of size m and n respectively. Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).
Analysis
Assuming that m is always smaller or equal than n, to find out the Kth number in these two arrays, we can divide array A and array B to two parts like the following:
A[0], A[1], …, A[i – 1] / A[i], A[i + 1], …, A[m – 1]
B[0], B[1], …, B[j – 1] / B[j], B[j + 1], …, B[n – 1]
The length of first part of A is i, and the length of first part of B is j. In this case, we can choose any i from 0 to m. If i is 0, it means that there is no value in left part. If i is m, it means that there is no value in right part. It’s the same for j.
Assuming $i + j == k$, in the same time if we can make sure that the maximum value of the left parts is smaller than or equal to the minimum value of right parts, the Kth value must be A[i – 1] or B[j – 1] (depending on which is larger).
But how can we make sure that all values in left parts are always smaller than any value of right parts?
We’ve already known that the left part of A is smaller than right part of A, and left part of B is smaller than right part of B. The only thing we need to make sure is that the left part of A is smaller than right part of B, and the left part of B is smaller than right part of A. Since array A and array B are sorted, we can come up with the following condition:
- A[i – 1] <= B[j]
- B[j – 1] <= A[i]
We noticed that when i == 0 or j == 0, index i – 1 and j – 1 are invalid. However, if i == 0, it means that there is no value in the left part of A. We don’t even need to worry about the first condition. This is the same when j == 0.
We are going to find the Kth number. So if i is decided, j must equal to k – i, which can make the length of left parts to k. Binary search can be used here to find out the correct value of i.
- If the first condition doesn’t hold, which means A[i – 1] > B[j], we must decrease i. Why? Since when we try to decrease i, A[i – 1] is getting smaller. In the same time, j = k – i is getting larger, which makes B[j] getting larger. So decreasing i makes it possible to achieve the first condition.
- If the second condition doesn’t hold, which means B[j – 1] > A[i], we must increase i. The reason is similar.
Code
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 |
public class Solution { public double findMedianSortedArrays(int[] nums1, int[] nums2) { int m = nums1.length; int n = nums2.length; if (m > n) { return findMedianSortedArrays(nums2, nums1); } if ((m + n) % 2 != 0) { return (double)findKth(nums1, nums2, (m + n + 1) / 2); } else { return ((double)findKth(nums1, nums2, (m + n) / 2) + findKth(nums1, nums2, (m + n) / 2 + 1)) / 2; } } private int findKth(int[] nums1, int[] nums2, int k) { int iMin = Math.max(0, k - nums2.length); int iMax = Math.min(nums1.length, k); while (iMin <= iMax) { int i = (iMin + iMax) / 2; int j = k - i; if (i > 0 && j < nums2.length && nums1[i - 1] > nums2[j]) { iMax = i - 1; } else if (j > 0 && i < nums1.length && nums2[j - 1] > nums1[i]) { iMin = i + 1; } else { if (i == 0) { return nums2[k - 1]; } if (j == 0) { return nums1[k - 1]; } return Math.max(nums1[i - 1], nums2[k - i - 1]); } } return 0; } } |
There are some optimizations here. In the analysis part, I mentioned that we can choose i from 0 to m. However, if k is quite big, we should have the minimum length of left part of array A. Otherwise even if we are using all values from array B, the length of left parts cannot reach k. If you change it back to 0 and m for iMin and iMax, you can try with test case A=[1], B[1].
Complexity
As mentioned above, the time complexity is $O(log(min(m, n)))$.