leafee98-blog/content/essays/cpp-sort-implement.md
leafee98 c08ae834a8
All checks were successful
ci/woodpecker/push/deploy Pipeline was successful
new essays: cpp-sort-implement
2023-07-29 16:01:21 +08:00

4.9 KiB
Raw Permalink Blame History

title date tags categories weight show_comments draft
C++ 的 sort 实现 2023-07-29T15:09:46+08:00
50 true false

目前 C++ 中 std::sort 的实现使用的是 内省排序 Introsort、和插入排序共用的方式。当总元素数量小于一个阈值时(通常是 16那么 sort 将只会使用插入排序。在内省排序中,当到达一定深度时(深度为元素数量的对数值),排序将转为堆排序。

  /// This is a helper function...
  template<typename _RandomAccessIterator, typename _Compare>
    _GLIBCXX20_CONSTEXPR
    inline _RandomAccessIterator
    __unguarded_partition_pivot(_RandomAccessIterator __first,
                                _RandomAccessIterator __last, _Compare __comp)
    {
      _RandomAccessIterator __mid = __first + (__last - __first) / 2;
      std::__move_median_to_first(__first, __first + 1, __mid, __last - 1,
                                  __comp);
      return std::__unguarded_partition(__first + 1, __last, __first, __comp);
    }

  template<typename _RandomAccessIterator, typename _Compare>
    _GLIBCXX20_CONSTEXPR
    inline void
    __partial_sort(_RandomAccessIterator __first,
                   _RandomAccessIterator __middle,
                   _RandomAccessIterator __last,
                   _Compare __comp)
    {
      std::__heap_select(__first, __middle, __last, __comp);
      std::__sort_heap(__first, __middle, __comp);
    }

  /// This is a helper function for the sort routine.
  template<typename _RandomAccessIterator, typename _Size, typename _Compare>
    _GLIBCXX20_CONSTEXPR
    void
    __introsort_loop(_RandomAccessIterator __first,
                     _RandomAccessIterator __last,
                     _Size __depth_limit, _Compare __comp)
    {
      while (__last - __first > int(_S_threshold))
        {
          if (__depth_limit == 0)
            {
              std::__partial_sort(__first, __last, __last, __comp);
              return;
            }
          --__depth_limit;
          _RandomAccessIterator __cut =
            std::__unguarded_partition_pivot(__first, __last, __comp);
          std::__introsort_loop(__cut, __last, __depth_limit, __comp);
          __last = __cut;
        }
    }

  template<typename _RandomAccessIterator, typename _Compare>
    _GLIBCXX20_CONSTEXPR
    inline void
    __sort(_RandomAccessIterator __first, _RandomAccessIterator __last,
           _Compare __comp)
    {
      if (__first != __last)
        {
          std::__introsort_loop(__first, __last,
                                std::__lg(__last - __first) * 2,
                                __comp);
          std::__final_insertion_sort(__first, __last, __comp);
        }
    }

  /**
   *  @brief Sort the elements of a sequence.
   *  @ingroup sorting_algorithms
   *  @param  __first   An iterator.
   *  @param  __last    Another iterator.
   *  @return  Nothing.
   *
   *  Sorts the elements in the range `[__first, __last)` in ascending order,
   *  such that for each iterator `i` in the range `[__first, __last - 1)`,
   *  `*(i+1) < *i` is false.
   *
   *  The relative ordering of equivalent elements is not preserved, use
   *  `stable_sort()` if this is needed.
  */
  template<typename _RandomAccessIterator>
    _GLIBCXX20_CONSTEXPR
    inline void
    sort(_RandomAccessIterator __first, _RandomAccessIterator __last)
    {
      // concept requirements
      __glibcxx_function_requires(_Mutable_RandomAccessIteratorConcept<
            _RandomAccessIterator>)
      __glibcxx_function_requires(_LessThanComparableConcept<
            typename iterator_traits<_RandomAccessIterator>::value_type>)
      __glibcxx_requires_valid_range(__first, __last);
      __glibcxx_requires_irreflexive(__first, __last);

      std::__sort(__first, __last, __gnu_cxx::__ops::__iter_less_iter());
    }

这里的函数从下往上看,外部调用的 std::sort 是下面第一个函数,它经过一些断言以后直接使用了 std::__sort,也就是倒数第二个函数。在 std::__sort 中,当容器内部不为空的时候,固定先使用内省排序再使用插入排序。在内省排序(std::__introsort_loop)中,若容器元素数量小于阈值就什么也不做,就相当于在数量较小时只使用插入排序,当达到深度限制以后,将转为堆排序(即 std::__partial_sort 的实现),否则(std::__unguarded_partition_pivot)将选取处于容器中间位置的元素作为基准,将小于基准的移动到左侧,大于基准的移动到右侧,并递归调用,即快速排序的思路。

关于深度限制为 2 倍的容器元素个数的对数,在理想情况下容器每次被二等分时是永远到达不了这个限制的,但需要注意基准的选择是直接选取位于中间位置的元素,也因此极大概率是无法二等分的,此时就能够达到容器深度的限制了。