Data Structures Algorithms 简明教程

Binary Search Algorithm

二分查找是一种具有 O(log n) 运行时复杂度的快速搜索算法。此搜索算法基于分而治之的原理，因为它在搜索之前将数组分成两半。为了使此算法正常工作，数据收集应采用已排序的形式。

Binary search is a fast search algorithm with run-time complexity of Ο(log n). This search algorithm works on the principle of divide and conquer, since it divides the array into half before searching. For this algorithm to work properly, the data collection should be in the sorted form.

二分查找通过比较集合中最中间的项来查找特定的键值。如果发生匹配，则返回项的索引。但是，如果中间项的值大于键值，则搜索中间项的右子数组。否则，搜索左子数组。此过程会递归地继续，直到子数组的大小减小为零。

Binary search looks for a particular key value by comparing the middle most item of the collection. If a match occurs, then the index of item is returned. But if the middle item has a value greater than the key value, the right sub-array of the middle item is searched. Otherwise, the left sub-array is searched. This process continues recursively until the size of a subarray reduces to zero.

Binary Search Algorithm

二分查找算法是一种仅在间隔内执行查找的间隔查找方法。二分查找算法接受的输入必须始终是一个经过排序的数组，因为它根据较大的值或较小的值将数组划分为子数组。该算法遵循以下过程：

Binary Search algorithm is an interval searching method that performs the searching in intervals only. The input taken by the binary search algorithm must always be in a sorted array since it divides the array into subarrays based on the greater or lower values. The algorithm follows the procedure below −

Step 1 − 选择数组中的中间项并将其与要搜索的键值进行比较。如果匹配，则返回中值的索引。

Step 1 − Select the middle item in the array and compare it with the key value to be searched. If it is matched, return the position of the median.

Step 2 − 如果它不匹配键值，请检查键值是否大于或小于中值。

Step 2 − If it does not match the key value, check if the key value is either greater than or less than the median value.

Step 3 − 如果键更大，则在右子数组中执行搜索；但如果键低于中值，则在左子数组中执行搜索。

Step 3 − If the key is greater, perform the search in the right sub-array; but if the key is lower than the median value, perform the search in the left sub-array.

Step 4 − 重复步骤 1、2 和 3，直到子数组的大小变为 1。

Step 4 − Repeat Steps 1, 2 and 3 iteratively, until the size of sub-array becomes 1.

Step 5 − 如果键值在数组中不存在，则算法将返回不成功的搜索。

Step 5 − If the key value does not exist in the array, then the algorithm returns an unsuccessful search.

Pseudocode

二分搜索算法的伪代码看起来是这样的：

The pseudocode of binary search algorithms should look like this −

Procedure binary_search
   A ← sorted array
   n ← size of array
   x ← value to be searched

   Set lowerBound = 1
   Set upperBound = n

   while x not found
      if upperBound < lowerBound
         EXIT: x does not exists.

      set midPoint = lowerBound + ( upperBound - lowerBound ) / 2

      if A[midPoint] < x
         set lowerBound = midPoint + 1

      if A[midPoint] > x
         set upperBound = midPoint - 1

      if A[midPoint] = x
         EXIT: x found at location midPoint
   end while
end procedure

Analysis

由于二进制搜索算法以迭代方式执行搜索，因此计算时间复杂度并不像线性搜索算法那样容易。

Since the binary search algorithm performs searching iteratively, calculating the time complexity is not as easy as the linear search algorithm.

每次不成功的迭代之后，通过划分为多个子数组来迭代搜索输入数组。因此，形成的递归关系将是一个除法函数。

The input array is searched iteratively by dividing into multiple sub-arrays after every unsuccessful iteration. Therefore, the recurrence relation formed would be of a dividing function.

用更简单的术语解释，

To explain it in simpler terms,

During the first iteration, the element is searched in the entire array. Therefore, length of the array = n.
In the second iteration, only half of the original array is searched. Hence, length of the array = n/2.
In the third iteration, half of the previous sub-array is searched. Here, length of the array will be = n/4.
Similarly, in the ith iteration, the length of the array will become n/2i

为了实现成功的搜索，在最后一次迭代之后，数组的长度必须为 1。因此，

To achieve a successful search, after the last iteration the length of array must be 1. Hence,

n/2i = 1

这给我们 −

That gives us −

n = 2i

对两侧应用 log，

Applying log on both sides,

log n = log 2i
log n = i. log 2
i = log n

二分搜索算法的时间复杂度为 O(log n)

The time complexity of the binary search algorithm is O(log n)

Example

为了进行二分查找，目标数组必须已排序。我们通过一个图片示例学习二分查找的过程。以下是我们已排序的数组，让我们假设我们需要使用二分查找搜索值 31 的位置。

For a binary search to work, it is mandatory for the target array to be sorted. We shall learn the process of binary search with a pictorial example. The following is our sorted array and let us assume that we need to search the location of value 31 using binary search.

首先，我们将使用以下公式确定数组的一半 −

First, we shall determine half of the array by using this formula −

mid = low + (high - low) / 2

即为 0 + (9 - 0) / 2 = 4（4.5 的整数部分）。因此，4 是该数组的中间位置。

Here it is, 0 + (9 - 0) / 2 = 4 (integer value of 4.5). So, 4 is the mid of the array.

现在我们比较存储在位置 4 处的该值与要搜索的值（即 31）。我们发现位置 4 处的该值为 27，不匹配。由于该值大于 27 且我们有一个已排序的数组，因此我们还知道目标值肯定位于数组的上半部分。

Now we compare the value stored at location 4, with the value being searched, i.e. 31. We find that the value at location 4 is 27, which is not a match. As the value is greater than 27 and we have a sorted array, so we also know that the target value must be in the upper portion of the array.

我们将 low 更改为 mid + 1，并再次找到新的 mid 值。

We change our low to mid + 1 and find the new mid value again.

low = mid + 1
mid = low + (high - low) / 2

我们的新 mid 值现在为 7。将存储在位置 7 处的该值与我们的目标值 31 进行比较。

Our new mid is 7 now. We compare the value stored at location 7 with our target value 31.

存储在位置 7 的值不是匹配项，而是我们正在寻找的值那个更小。因此，该值必须在这个位置的较低部分。

The value stored at location 7 is not a match, rather it is less than what we are looking for. So, the value must be in the lower part from this location.

因此，我们再次计算中间值。这次是 5。

Hence, we calculate the mid again. This time it is 5.

我们将存储在位置 5 的值与我们的目标值进行比较。我们发现这是一个匹配项。

We compare the value stored at location 5 with our target value. We find that it is a match.

我们得出结论，目标值 31 存储在位置 5。

We conclude that the target value 31 is stored at location 5.

二分查找将可搜索项减半，从而将要进行的比较次数减少到极小数量。

Binary search halves the searchable items and thus reduces the count of comparisons to be made to very less numbers.

Implementation

二分查找是一种快速搜索算法，运行时间复杂度为 Ο(log n)。该搜索算法遵循分治原则。为了让此算法正常工作，数据集合应为排序形式。

Binary search is a fast search algorithm with run-time complexity of Ο(log n). This search algorithm works on the principle of divide and conquer. For this algorithm to work properly, the data collection should be in a sorted form.