leafee98-blog/content/essays/cpp-sort-implement.md
leafee98 c08ae834a8
All checks were successful
ci/woodpecker/push/deploy Pipeline was successful
new essays: cpp-sort-implement
2023-07-29 16:01:21 +08:00

115 lines
4.9 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
title: "C++ 的 sort 实现"
date: 2023-07-29T15:09:46+08:00
tags: []
categories: []
weight: 50
show_comments: true
draft: false
---
目前 C++ 中 std::sort 的实现使用的是 [内省排序 Introsort][1]、和插入排序共用的方式。当总元素数量小于一个阈值时(通常是 16那么 sort 将只会使用插入排序。在内省排序中,当到达一定深度时(深度为元素数量的对数值),排序将转为堆排序。
<!--more-->
```cpp
/// This is a helper function...
template<typename _RandomAccessIterator, typename _Compare>
_GLIBCXX20_CONSTEXPR
inline _RandomAccessIterator
__unguarded_partition_pivot(_RandomAccessIterator __first,
_RandomAccessIterator __last, _Compare __comp)
{
_RandomAccessIterator __mid = __first + (__last - __first) / 2;
std::__move_median_to_first(__first, __first + 1, __mid, __last - 1,
__comp);
return std::__unguarded_partition(__first + 1, __last, __first, __comp);
}
template<typename _RandomAccessIterator, typename _Compare>
_GLIBCXX20_CONSTEXPR
inline void
__partial_sort(_RandomAccessIterator __first,
_RandomAccessIterator __middle,
_RandomAccessIterator __last,
_Compare __comp)
{
std::__heap_select(__first, __middle, __last, __comp);
std::__sort_heap(__first, __middle, __comp);
}
/// This is a helper function for the sort routine.
template<typename _RandomAccessIterator, typename _Size, typename _Compare>
_GLIBCXX20_CONSTEXPR
void
__introsort_loop(_RandomAccessIterator __first,
_RandomAccessIterator __last,
_Size __depth_limit, _Compare __comp)
{
while (__last - __first > int(_S_threshold))
{
if (__depth_limit == 0)
{
std::__partial_sort(__first, __last, __last, __comp);
return;
}
--__depth_limit;
_RandomAccessIterator __cut =
std::__unguarded_partition_pivot(__first, __last, __comp);
std::__introsort_loop(__cut, __last, __depth_limit, __comp);
__last = __cut;
}
}
template<typename _RandomAccessIterator, typename _Compare>
_GLIBCXX20_CONSTEXPR
inline void
__sort(_RandomAccessIterator __first, _RandomAccessIterator __last,
_Compare __comp)
{
if (__first != __last)
{
std::__introsort_loop(__first, __last,
std::__lg(__last - __first) * 2,
__comp);
std::__final_insertion_sort(__first, __last, __comp);
}
}
/**
* @brief Sort the elements of a sequence.
* @ingroup sorting_algorithms
* @param __first An iterator.
* @param __last Another iterator.
* @return Nothing.
*
* Sorts the elements in the range `[__first, __last)` in ascending order,
* such that for each iterator `i` in the range `[__first, __last - 1)`,
* `*(i+1) < *i` is false.
*
* The relative ordering of equivalent elements is not preserved, use
* `stable_sort()` if this is needed.
*/
template<typename _RandomAccessIterator>
_GLIBCXX20_CONSTEXPR
inline void
sort(_RandomAccessIterator __first, _RandomAccessIterator __last)
{
// concept requirements
__glibcxx_function_requires(_Mutable_RandomAccessIteratorConcept<
_RandomAccessIterator>)
__glibcxx_function_requires(_LessThanComparableConcept<
typename iterator_traits<_RandomAccessIterator>::value_type>)
__glibcxx_requires_valid_range(__first, __last);
__glibcxx_requires_irreflexive(__first, __last);
std::__sort(__first, __last, __gnu_cxx::__ops::__iter_less_iter());
}
```
这里的函数**从下往上**看,外部调用的 `std::sort` 是下面第一个函数,它经过一些断言以后直接使用了 `std::__sort`,也就是倒数第二个函数。在 `std::__sort` 中,当容器内部不为空的时候,固定先使用内省排序再使用插入排序。在内省排序(`std::__introsort_loop`)中,若容器元素数量小于阈值就什么也不做,就相当于在数量较小时只使用插入排序,当达到深度限制以后,将转为堆排序(即 `std::__partial_sort` 的实现),否则(`std::__unguarded_partition_pivot`)将选取处于容器中间位置的元素作为基准,将小于基准的移动到左侧,大于基准的移动到右侧,并递归调用,即快速排序的思路。
关于深度限制为 2 倍的容器元素个数的对数,在理想情况下容器每次被二等分时是永远到达不了这个限制的,但需要注意基准的选择是直接选取位于中间位置的元素,也因此极大概率是无法二等分的,此时就能够达到容器深度的限制了。
[1]: https://zh.wikipedia.org/wiki/%E5%86%85%E7%9C%81%E6%8E%92%E5%BA%8F