福特-约翰逊合并插入排序

Ford-Johnson算法，也称为合并插入排序（名称可能由Knuth给出），是一种就地排序算法，旨在执行尽可能少的比较以对集合进行排序。不幸的是，该算法需要一些特定的数据结构来跟踪元素的顺序，并且通常太慢而无法实用。无论如何，从计算机科学的角度来看，这是一个有趣的算法。虽然没有执行最佳数量的比较，但在减少比较数量方面，它仍然是参考，也是最著名的比较类型之一。
该算法并不是世界上最简单的算法，我找不到可以在网上找到适当的解释，因此，我将根据Donald Knuth撰写的《计算机编程艺术》第3卷中的描述，尽我所能进行解释。
充分利用二进制搜索的方法
进行最少的搜索比较次数，我们需要考虑以下有关二进制搜索的观察结果：在元素数量为\ $ 2 ^ n \ $时，对排序后的序列执行二进制搜索所需的最大比较次数是相同的是\ $ 2 ^ {n + 1} -1 \ $。例如，以\ $ 8 \ $或\ $ 15 \ $元素的排序顺序查找元素需要进行相同数量的比较。
许多基于插入的排序算法执行二进制搜索来查找要插入元素的位置，但是其中大多数都没有考虑二进制搜索的属性。
合并插入排序
福特-约翰逊合并插入排序是一个三步算法，令\ $ n \ $为要排序的元素数：

将集合分成\ $ n / 2 \ $对\ $ 2 \ $元素并成对排列。如果元素数为奇数，则表示集合中的最后一个元素未与任何元素配对。

将元素对按其最高值递归排序。如果元素的数量为奇数，则最后一个元素不被视为最高值，而是保留在集合的末尾。考虑到最高值构成了一个排序列表，我们将其称为主链，而其余元素称为pend元素。使用名称\ $ a_1，a_2，a_3，...，a_ {n / 2} \ $标记主链的元素，然后使用名称\ $ b_1，b_2，b_3，...， b_ {n / 2} \ $。对于每个\ $ k \ $，我们都有一个\ $ b_k \ le a_k \ $关系。

将pend元素插入主链。我们知道第一个Pend元素\ $ b_1 \ $小于\ $ a_1 \ $；我们认为它是主链的一部分，然后成为\ $ \ {b_1，a_1，a_2，a_3，...，a_ {n / 2} \} \ $。现在，我们需要将其他pend元素插入主链，以确保插入区域的大小尽可能多地为\ $ 2 \ $减去\ $ 1 \ $。基本上，我们将\ $ b_3 \ $插入\ $ \ {b_1，a_1，a_2 \} \ $（我们知道它小于\ $ a_3 \ $），然后将\ $ b_2 \ $插入\ $ \ {b_1，a_1，b_3 \} \ $无论插入\ $ b_3 \ $的位置如何。请注意，在这些插入期间，插入区域的大小始终最大为3。
要插入到主链中的下一个pend元素\ $ b_k \ $的值，同时最大程度地减少了与之对应的比较次数下一个Jacobsthal编号。我们首先插入了\ $ 3 \ $元素，所以下一个将是\ $ 5 \ $然后是\ $ 11 \ $然后是\ $ 21 \ $，依此类推...
总而言之，第一个pend元素的插入顺序进入主链的情况如下：\ $ b_1，b_3，b_2，b_5，b_4，b_ {11}，b_ {10}，b_9，b_8，b_7，b_6，... \ $。

说实话，这种解释可能还不够清楚，我真的建议您从可用的任何资源中寻找算法的其他描述；它们通常带有图片，以使发生的事情更加明显。我至少可以为您提供指向5个元素展开算法的分步说明的链接；《计算机编程艺术》第3卷第5.3.1节包含对5和21个元素的算法的描述，以及对该算法的一般描述（如果可以访问本书的副本）。
Gimme Codez
我做了我能描述的算法。现在，让我们给您代码。在相关部分附近的注释中描述了大多数优化。另外，某些部分很难看，例如最后将集合从高速缓存中完全移入和移出完整两次，但很难摆脱这一特定部分。为了平稳地执行递归排序，该算法使用group_iterator类，该类从给定大小的集合中“查看”连续的元素组，但是即使能够交换也仅使用该组中的最后一个元素

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <list>
#include <iterator>
#include <type_traits>
#include <vector>

////////////////////////////////////////////////////////////
// Iterator used to sort views of the collection

template<typename Iterator>
class group_iterator
{
    private:

        Iterator _it;
        std::size_t _size;

    public:

        ////////////////////////////////////////////////////////////
        // Public types

        using iterator_category = std::random_access_iterator_tag;
        using iterator_type     = Iterator;
        using value_type        = typename std::iterator_traits<Iterator>::value_type;
        using difference_type   = typename std::iterator_traits<Iterator>::difference_type;
        using pointer           = typename std::iterator_traits<Iterator>::pointer;
        using reference         = typename std::iterator_traits<Iterator>::reference;

        ////////////////////////////////////////////////////////////
        // Constructors

        group_iterator() = default;

        group_iterator(Iterator it, std::size_t size):
            _it(it),
            _size(size)
        {}

        ////////////////////////////////////////////////////////////
        // Members access

        auto base() const
            -> iterator_type
        {
            return _it;
        }

        auto size() const
            -> std::size_t
        {
            return _size;
        }

        ////////////////////////////////////////////////////////////
        // Element access

        auto operator*() const
            -> reference
        {
            return _it[_size - 1];
        }

        auto operator->() const
            -> pointer
        {
            return &(operator*());
        }

        ////////////////////////////////////////////////////////////
        // Increment/decrement operators

        auto operator++()
            -> group_iterator&
        {
            _it += _size;
            return *this;
        }

        auto operator++(int)
            -> group_iterator
        {
            auto tmp = *this;
            operator++();
            return tmp;
        }

        auto operator--()
            -> group_iterator&
        {
            _it -= _size;
            return *this;
        }

        auto operator--(int)
            -> group_iterator
        {
            auto tmp = *this;
            operator--();
            return tmp;
        }

        auto operator+=(std::size_t increment)
            -> group_iterator&
        {
            _it += _size * increment;
            return *this;
        }

        auto operator-=(std::size_t increment)
            -> group_iterator&
        {
            _it -= _size * increment;
            return *this;
        }

        ////////////////////////////////////////////////////////////
        // Elements access operators

        auto operator[](std::size_t pos)
            -> decltype(_it[pos * _size + _size - 1])
        {
            return _it[pos * _size + _size - 1];
        }

        auto operator[](std::size_t pos) const
            -> decltype(_it[pos * _size + _size - 1])
        {
            return _it[pos * _size + _size - 1];
        }
};

template<typename Iterator1, typename Iterator2>
auto iter_swap(group_iterator<Iterator1> lhs, group_iterator<Iterator2> rhs)
    -> void
{
    std::swap_ranges(lhs.base(), lhs.base() + lhs.size(), rhs.base());
}

////////////////////////////////////////////////////////////
// Comparison operators

template<typename Iterator1, typename Iterator2>
auto operator==(const group_iterator<Iterator1>& lhs,
                const group_iterator<Iterator2>& rhs)
    -> bool
{
    return lhs.base() == rhs.base();
}

template<typename Iterator1, typename Iterator2>
auto operator!=(const group_iterator<Iterator1>& lhs,
                const group_iterator<Iterator2>& rhs)
    -> bool
{
    return lhs.base() != rhs.base();
}

////////////////////////////////////////////////////////////
// Relational operators

template<typename Iterator1, typename Iterator2>
auto operator<(const group_iterator<Iterator1>& lhs,
               const group_iterator<Iterator2>& rhs)
    -> bool
{
    return lhs.base() < rhs.base();
}

template<typename Iterator1, typename Iterator2>
auto operator<=(const group_iterator<Iterator1>& lhs,
                const group_iterator<Iterator2>& rhs)
    -> bool
{
    return lhs.base() <= rhs.base();
}

template<typename Iterator1, typename Iterator2>
auto operator>(const group_iterator<Iterator1>& lhs,
               const group_iterator<Iterator2>& rhs)
    -> bool
{
    return lhs.base() > rhs.base();
}

template<typename Iterator1, typename Iterator2>
auto operator>=(const group_iterator<Iterator1>& lhs,
                const group_iterator<Iterator2>& rhs)
    -> bool
{
    return lhs.base >= rhs.base();
}

////////////////////////////////////////////////////////////
// Arithmetic operators

template<typename Iterator>
auto operator+(group_iterator<Iterator> it, std::size_t size)
    -> group_iterator<Iterator>
{
    return it += size;
}

template<typename Iterator>
auto operator+(std::size_t size, group_iterator<Iterator> it)
    -> group_iterator<Iterator>
{
    return it += size;
}

template<typename Iterator>
auto operator-(group_iterator<Iterator> it, std::size_t size)
    -> group_iterator<Iterator>
{
    return it -= size;
}

template<typename Iterator>
auto operator-(const group_iterator<Iterator>& lhs, const group_iterator<Iterator>& rhs)
    -> typename group_iterator<Iterator>::difference_type
{
    return (lhs.base() - rhs.base()) / lhs.size();
}

////////////////////////////////////////////////////////////
// Construction function

template<typename Iterator>
auto make_group_iterator(Iterator it, std::size_t size)
    -> group_iterator<Iterator>
{
    return { it, size };
}

template<typename Iterator>
auto make_group_iterator(group_iterator<Iterator> it, std::size_t size)
    -> group_iterator<Iterator>
{
    return { it.base(), size * it.size() };
}

////////////////////////////////////////////////////////////
// Merge-insertion sort

template<
    typename RandomAccessIterator,
    typename Compare
>
auto merge_insertion_sort_impl(RandomAccessIterator first, RandomAccessIterator last,
                               Compare compare)
{
    // Cache all the differences between a Jacobsthal number and its
    // predecessor that fit in 64 bits, starting with the difference
    // between the Jacobsthal numbers 4 and 3 (the previous ones are
    // unneeded)
    static constexpr std::uint_least64_t jacobsthal_diff[] = {
        2u, 2u, 6u, 10u, 22u, 42u, 86u, 170u, 342u, 682u, 1366u,
        2730u, 5462u, 10922u, 21846u, 43690u, 87382u, 174762u, 349526u, 699050u,
        1398102u, 2796202u, 5592406u, 11184810u, 22369622u, 44739242u, 89478486u,
        178956970u, 357913942u, 715827882u, 1431655766u, 2863311530u, 5726623062u,
        11453246122u, 22906492246u, 45812984490u, 91625968982u, 183251937962u,
        366503875926u, 733007751850u, 1466015503702u, 2932031007402u, 5864062014806u,
        11728124029610u, 23456248059222u, 46912496118442u, 93824992236886u, 187649984473770u,
        375299968947542u, 750599937895082u, 1501199875790165u, 3002399751580331u,
        6004799503160661u, 12009599006321322u, 24019198012642644u, 48038396025285288u,
        96076792050570576u, 192153584101141152u, 384307168202282304u, 768614336404564608u,
        1537228672809129216u, 3074457345618258432u, 6148914691236516864u
    };

    using std::iter_swap;

    auto size = std::distance(first, last);
    if (size < 2) return;

    // Whether there is a stray element not in a pair
    // at the end of the chain
    bool has_stray = (size % 2 != 0);

    ////////////////////////////////////////////////////////////
    // Group elements by pairs

    auto end = has_stray ? std::prev(last) : last;
    for (auto it = first ; it != end ; it += 2)
    {
        if (compare(it[1], it[0]))
        {
            iter_swap(it, it + 1);
        }
    }

    ////////////////////////////////////////////////////////////
    // Recursively sort the pairs by max

    merge_insertion_sort(
        make_group_iterator(first, 2),
        make_group_iterator(end, 2),
        compare
    );

    ////////////////////////////////////////////////////////////
    // Separate main chain and pend elements

    // Small node struct for pend elements
    struct node
    {
        RandomAccessIterator it;
        typename std::list<RandomAccessIterator>::iterator next;
    };

    // The first pend element is always part of the main chain,
    // so we can safely initialize the list with the first two
    // elements of the sequence
    std::list<RandomAccessIterator> chain = { first, std::next(first) };
    std::list<node> pend;

    for (auto it = first + 2 ; it != end ; it += 2)
    {
        auto tmp = chain.insert(chain.end(), std::next(it));
        pend.push_back({it, tmp});
    }

    // Add the last element to pend if it exists, when it
    // exists, it always has to be inserted in the full chain,
    // so giving it chain.end() as end insertion point is ok
    if (has_stray)
    {
        pend.push_back({end, chain.end()});
    }

    ////////////////////////////////////////////////////////////
    // Binary insertion into the main chain

    for (int k = 0 ; ; ++k)
    {
        // Find next index
        auto dist = jacobsthal_diff[k];
        if (dist >= pend.size()) break;
        auto it = pend.begin();
        std::advance(it, dist);

        while (true)
        {
            auto insertion_point = std::upper_bound(
                chain.begin(), it->next, it->it,
                [=](auto lhs, auto rhs) {
                    return compare(*lhs, *rhs);
                }
            );
            chain.insert(insertion_point, it->it);

            it = pend.erase(it);
            if (it == pend.begin()) break;
            --it;
        }
    }

    // If there are elements left, insert them too
    while (not pend.empty())
    {
        auto it = std::prev(pend.end());
        auto insertion_point = std::upper_bound(
            chain.begin(), it->next, it->it,
            [=](auto lhs, auto rhs) {
                return compare(*lhs, *rhs);
            }
        );
        chain.insert(insertion_point, it->it);
        pend.pop_back();
    }

    ////////////////////////////////////////////////////////////
    // Move values in order to a cache then back to origin

    std::vector<typename std::iterator_traits<RandomAccessIterator>::value_type> cache;
    cache.reserve(size);

    for (auto&& it: chain)
    {
        auto begin = it.base();
        auto end = begin + it.size();
        std::move(begin, end, std::back_inserter(cache));
    }
    std::move(cache.begin(), cache.end(), first.base());
}

template<
    typename RandomAccessIterator,
    typename Compare = std::less<>
>
auto merge_insertion_sort(RandomAccessIterator first, RandomAccessIterator last,
                          Compare compare={})
    -> void
{
    merge_insertion_sort_impl(
        make_group_iterator(first, 1),
        make_group_iterator(last, 1),
        compare
    );
}

该算法速度较慢，但从未被期望很快，仅比大多数排序算法执行的比较少。现在，是否有可能从优雅或性能（当然还有正确性）的角度进行改进？

作为补充，我仍然在库中维护该算法的实现我的即使大多数错误已在我的自我回答中突出显示，我也可能忘记了其中一些。如果您认为它仍然可以改进，请不要犹豫，或者如果某些链接中断，请ping我。

您可能也对合并交换排序感兴趣。应该在第三卷中。

@coderodde已经在我的一个项目中：p

压倒。（我在TAOCP中总是跳过“一个”排序算法）自动运算符中有一个c＆p错误> =（const group_iterator ＆lhs，.....

@greybeard哦，对了，如果调用该函数肯定是错误的（证明它只是为了保持一致性）。我会在任何人回答之前解决它，谢谢:)

如果有人感兴趣，我会以稍微不同的方式在python中实现该算法，并希望获得反馈。这篇文章在此过程中非常有用，并在自述文件中得到引用。 github.com/PunkChameleon/ford-johnson-merge-insertion-sort

#1 楼

首先，有一个错误

虽然不是致命错误，但由于该算法仍然可以正确地对集合进行排序，但这实际上意味着该算法不会执行应有的较少比较。行std::advance(it, dist);将迭代器向前推进了一步，因此二进制插入有时会在主链中完成，而主链与应有的链相比太大了（大于\ $ 2 ^ n-1 \ $个元素）。显而易见的解决方案是将迭代器提高dist - 1而不是dist；但是，从1中的每个元素中删除jacobsthal_diff也是一种解决方案。

我们不需要从pend中删除元素

而不是从pend中删除元素将它们插入到chain中之后，我们可以改为跟踪pend中与Jacobsthal diff对应的第一个使用的迭代器，添加下一个Jacobsthal diff以找到下一个这样的迭代器，并减小该迭代器，直到遇到先前记住的Jacobsthal diff迭代器为止。不必从pend中删除元素意味着我们不需要将node s存储在支持从头开始快速删除的容器中。基本上，由于我们什么都没有删除，因此我们可以将std::vector<node>切换为pend。

使用原始迭代器

因为我们只需要向其添加Jacobsthal diff数字找到pend的下一个元素，这意味着我们可以对原始集合执行相同的操作，以找到要插入到主链中的迭代器（[first, last)中每个具有偶数索引的元素都是一个固定的迭代器）。这意味着我们可以从pend删除此信息，而仅存储std::vector<typename std::list<RandomAccessIterator>::value_type>而不是std::vector<node>。向量的最大大小应为(size + 1) / 2 - 1，以便我们可以直接保留该数量的元素。

我们可以按任何顺序插入其余元素

最初，我认为其余的pend元素必须以相反的顺序插入（当插入索引对应于Jacobsthal数的最远的pend元素时剩下的元素）。但是，由于二进制搜索的属性，看来我们可以按任何顺序插入它们。因此，以升序插入它们应该可以减轻CPU的工作。

更小巧的东西

iter_swap是不正确的：iter_swap的过载group_iterator的设计目的是交换几种不同类型的group_iterator，这不太正确。它看起来不仅会引起问题，而且显然还会引起ADL问题：在更复杂的情况下，编译器发现对iter_swap的不合格调用含糊不清。解决方案是使iter_swap仅与以下相同类型的group_iterator一起使用：

template<typename Iterator>
auto iter_swap(group_iterator<Iterator> lhs, group_iterator<Iterator> rhs)
    -> void
{
    std::swap_ranges(lhs.base(), lhs.base() + lhs.size(), rhs.base());
}

递归不必要地很复杂：merge_insertion_sort_impl调用merge_insertion_sort，而调用... merge_insertion_sort_impl而不添加任何重要的东西都会带来另一种间接。尽管可能会被编译器忽略，但直接递归调用merge_insertion_sort_impl会使每个人都很容易。
operator>=的group_iterator仍然存在一个小错误：lhs.base之后的括号缺失了，这很可能会如果曾经调用该函数，则会导致编译错误。

将它们放在一起

一旦我们将所有这些备注放在一起，merge_insertion_sort_impl看起来像这样：

template<
    typename RandomAccessIterator,
    typename Compare
>
auto merge_insertion_sort_impl(RandomAccessIterator first, RandomAccessIterator last,
                               Compare compare)
{
    // Cache all the differences between a Jacobsthal number and its
    // predecessor that fit in 64 bits, starting with the difference
    // between the Jacobsthal numbers 4 and 3 (the previous ones are
    // unneeded)
    static constexpr std::uint_fast64_t jacobsthal_diff[] = {
        2u, 2u, 6u, 10u, 22u, 42u, 86u, 170u, 342u, 682u, 1366u,
        2730u, 5462u, 10922u, 21846u, 43690u, 87382u, 174762u, 349526u, 699050u,
        1398102u, 2796202u, 5592406u, 11184810u, 22369622u, 44739242u, 89478486u,
        178956970u, 357913942u, 715827882u, 1431655766u, 2863311530u, 5726623062u,
        11453246122u, 22906492246u, 45812984490u, 91625968982u, 183251937962u,
        366503875926u, 733007751850u, 1466015503702u, 2932031007402u, 5864062014806u,
        11728124029610u, 23456248059222u, 46912496118442u, 93824992236886u, 187649984473770u,
        375299968947542u, 750599937895082u, 1501199875790165u, 3002399751580331u,
        6004799503160661u, 12009599006321322u, 24019198012642644u, 48038396025285288u,
        96076792050570576u, 192153584101141152u, 384307168202282304u, 768614336404564608u,
        1537228672809129216u, 3074457345618258432u, 6148914691236516864u
    };

    using std::iter_swap;

    auto size = std::distance(first, last);
    if (size < 2) return;

    // Whether there is a stray element not in a pair
    // at the end of the chain
    bool has_stray = (size % 2 != 0);

    ////////////////////////////////////////////////////////////
    // Group elements by pairs

    auto end = has_stray ? std::prev(last) : last;
    for (auto it = first ; it != end ; it += 2)
    {
        if (compare(it[1], it[0]))
        {
            iter_swap(it, it + 1);
        }
    }

    ////////////////////////////////////////////////////////////
    // Recursively sort the pairs by max

    merge_insertion_sort_impl(
        make_group_iterator(first, 2),
        make_group_iterator(end, 2),
        compare
    );

    ////////////////////////////////////////////////////////////
    // Separate main chain and pend elements

    // The first pend element is always part of the main chain,
    // so we can safely initialize the list with the first two
    // elements of the sequence
    std::list<RandomAccessIterator> chain = { first, std::next(first) };

    // Upper bounds for the insertion of pend elements
    std::vector<typename std::list<RandomAccessIterator>::iterator> pend;
    pend.reserve((size + 1) / 2 - 1);

    for (auto it = first + 2 ; it != end ; it += 2)
    {
        auto tmp = chain.insert(std::end(chain), std::next(it));
        pend.push_back(tmp);
    }

    // Add the last element to pend if it exists; when it
    // exists, it always has to be inserted in the full chain,
    // so giving it chain.end() as end insertion point is ok
    if (has_stray)
    {
        pend.push_back(std::end(chain));
    }

    ////////////////////////////////////////////////////////////
    // Binary insertion into the main chain

    auto current_it = first + 2;
    auto current_pend = std::begin(pend);

    for (int k = 0 ; ; ++k)
    {
        // Should be safe: in this code, std::distance should always return
        // a positive number, so there is no risk of comparing funny values
        using size_type = std::common_type_t<
            std::uint_fast64_t,
            typename std::list<RandomAccessIterator>::difference_type
        >;

        // Find next index
        auto dist = jacobsthal_diff[k];
        if (dist > static_cast<size_type>(std::distance(current_pend, std::end(pend)))) break;

        auto it = std::next(current_it, dist * 2);
        auto pe = std::next(current_pend, dist);

        do
        {
            --pe;
            it -= 2;

            auto insertion_point = std::upper_bound(
                std::begin(chain), *pe, *it,
                [=](const auto& lhs, const auto& rhs) {
                    return compare(lhs, *rhs);
                }
            );
            chain.insert(insertion_point, it);
        } while (pe != current_pend);

        std::advance(current_it, dist * 2);
        std::advance(current_pend, dist);
    }

    // If there are pend elements left, insert them into
    // the main chain, the order of insertion does not
    // matter so forward traversal is ok
    while (current_pend != std::end(pend))
    {
        auto insertion_point = std::upper_bound(
            std::begin(chain), *current_pend, *current_it,
            [=](const auto& lhs, const auto& rhs) {
                return compare(lhs, *rhs);
            }
        );
        chain.insert(insertion_point, current_it);
        current_it += 2;
        ++current_pend;
    }

    ////////////////////////////////////////////////////////////
    // Move values in order to a cache then back to origin

    std::vector<typename std::iterator_traits<RandomAccessIterator>::value_type> cache;
    cache.reserve(size);

    for (auto&& it: chain)
    {
        auto begin = it.base();
        auto end = begin + it.size();
        std::move(begin, end, std::back_inserter(cache));
    }
    std::move(std::begin(cache), std::end(cache), first.base());
}

该算法仍比大多数常见的排序算法慢几个数量级，但比问题中的原始版本更正确，也更快。

编程黑洞网

福特-约翰逊合并插入排序

评论

#1 楼