动态编程背包解决方案

我使用自底向上的动态编程算法编写了Python中背包问题的解决方案。给定具有值和权重的项目列表以及允许的最大权重，它可以正确地计算最佳值。

对代码风格，注释风格，可读性和最佳实践的任何批评将不胜感激。。我不确定代码的Pythonic程度；填充矩阵似乎是实现动态编程的一种自然方法，但它不会“感觉” Pythonic（在这种情况下，这是一件坏事吗？）。

请注意，注释是有点冗长；在编写算法的过程中，由于要尽最大可能地理解它，所以我尝试对每个步骤都尽可能明确。如果它们过多（或完全不正确），请随时对其进行评论（哈！）。

import sys

def knapsack(items, maxweight):
    # Create an (N+1) by (W+1) 2-d list to contain the running values
    # which are to be filled by the dynamic programming routine.
    #
    # There are N+1 rows because we need to account for the possibility
    # of choosing from 0 up to and including N possible items.
    # There are W+1 columns because we need to account for possible
    # "running capacities" from 0 up to and including the maximum weight W.
    bestvalues = [[0] * (maxweight + 1)
                  for i in xrange(len(items) + 1)]

    # Enumerate through the items and fill in the best-value table
    for i, (value, weight) in enumerate(items):
        # Increment i, because the first row (0) is the case where no items
        # are chosen, and is already initialized as 0, so we're skipping it
        i += 1
        for capacity in xrange(maxweight + 1):
            # Handle the case where the weight of the current item is greater
            # than the "running capacity" - we can't add it to the knapsack
            if weight > capacity:
                bestvalues[i][capacity] = bestvalues[i - 1][capacity]
            else:
                # Otherwise, we must choose between two possible candidate values:
                # 1) the value of "running capacity" as it stands with the last item
                #    that was computed; if this is larger, then we skip the current item
                # 2) the value of the current item plus the value of a previously computed
                #    set of items, constrained by the amount of capacity that would be left
                #    in the knapsack (running capacity - item's weight)
                candidate1 = bestvalues[i - 1][capacity]
                candidate2 = bestvalues[i - 1][capacity - weight] + value

                # Just take the maximum of the two candidates; by doing this, we are
                # in effect "setting in stone" the best value so far for a particular
                # prefix of the items, and for a particular "prefix" of knapsack capacities
                bestvalues[i][capacity] = max(candidate1, candidate2)

    # Reconstruction
    # Iterate through the values table, and check
    # to see which of the two candidates were chosen. We can do this by simply
    # checking if the value is the same as the value of the previous row. If so, then
    # we say that the item was not included in the knapsack (this is how we arbitrarily
    # break ties) and simply move the pointer to the previous row. Otherwise, we add
    # the item to the reconstruction list and subtract the item's weight from the
    # remaining capacity of the knapsack. Once we reach row 0, we're done
    reconstruction = []
    i = len(items)
    j = maxweight
    while i > 0:
        if bestvalues[i][j] != bestvalues[i - 1][j]:
            reconstruction.append(items[i - 1])
            j -= items[i - 1][1]
        i -= 1

    # Reverse the reconstruction list, so that it is presented
    # in the order that it was given
    reconstruction.reverse()

    # Return the best value, and the reconstruction list
    return bestvalues[len(items)][maxweight], reconstruction


if __name__ == '__main__':
    if len(sys.argv) != 2:
        print('usage: knapsack.py [file]')
        sys.exit(1)

    filename = sys.argv[1]
    with open(filename) as f:
        lines = f.readlines()

    maxweight = int(lines[0])
    items = [map(int, line.split()) for line in lines[1:]]

    bestvalue, reconstruction = knapsack(items, maxweight)

    print('Best possible value: {0}'.format(bestvalue))
    print('Items:')
    for value, weight in reconstruction:
        print('V: {0}, W: {1}'.format(value, weight))

期望的输入文件如下：

第一行包含允许的最大权重，后续行包含由值-权重对表示的项目。

我知道一个老帖子，但是如果您还没有碰到它，那么已经枚举允许您指定起始索引（例如，对于i，enumerate（items，1）中的item：应该让我从1开始。

@SimonT：太棒了！我已经用Python编程了2年多了，但我从未见过枚举以这种方式使用。我一定会记住的。

#1 楼

1.回顾

函数knapsack缺少一个文档字符串，该字符串解释了该函数采用的参数（items中是什么东西？items必须是序列，还是可以迭代）？以及返回的内容。
这种功能非常适合doctests。

注释中的内容类似于“用（W + 1）2-d创建（N + 1）清单”。但是N是什么，W是什么？大概N是len(items)，W是maxweight，因此在注释和代码中使用相同的名称将是一个好主意：

N = len(items)
W = maxweight

bestvalues上方的注释无法解释该表中的值的含义。我会这样写：

# bestvalues[i][j] is the best sum of values for any
# subsequence of the first i items, whose weights sum
# to no more than j.

这很清楚为什么\ $ 0≤i≤N \ $，为什么\ $ 0≤j≤W \ $。
在如下循环中：

bestvalues = [[0] * (maxweight + 1)
              for i in xrange(len(items) + 1)]

如果未使用循环变量（此处i），则通常将其命名为_。

这些行可以省略：

# Increment i, because the first row (0) is the case where no items
# are chosen, and is already initialized as 0, so we're skipping it
i += 1

然后在其余代码中，使用i + 1代替i和i而不是i - 1。

重建循环：

i = N
while i > 0:
    # code
    i -= 1

可以这样写：

for i in reversed(range(1, N + 1)):
    # code

代码将显示如下错误消息：

print('usage: knapsack.py [file]')

错误消息应转为标准错误（非标准输出）。最好不要对程序名称进行硬编码，因为重命名程序很容易，但是忘记更新代码。因此改写：

sys.stderr.write("usage: {0} [file]\n".format(sys.argv[0]))

读取问题描述并打印结果的代码块仅在__name__ == '__main__'时运行。这使得很难进行测试，例如从交互式解释器进行测试。通常最好将这种代码放在自己的函数中，例如：

def main(filename):
    with open(filename) as f:
        # etc.

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print('usage: knapsack.py [file]')
        sys.exit(1)
    main(sys.argv[1])

现在您可以从解释器运行main('problem.txt')对其进行测试。

代码将整个文件作为行列表读取到内存中：

lines = f.readlines()

这在这里是无害的，因为文件很小，但是通常最好处理一次归档一行，如下所示：

with open(filename) as f:
    maxweight = int(next(f))
    items = [[int(word) for word in line.split()] for line in f]

2。递归方法

任何动态编程算法都可以通过两种方式实现：通过自下而上构建部分结果表（如后文代码），或通过递归计算结果的方法。自上而下，使用备忘录来避免多次计算任何部分结果。

自上而下的方法通常会导致代码更简单，更清晰，并且仅计算特定对象所需的部分结果。问题实例（而采用自下而上的方法通常很难避免计算所有部分结果，即使其中一些未使用也是如此）。 this：

from functools import lru_cache

def knapsack(items, maxweight):
    """Solve the knapsack problem by finding the most valuable subsequence
    of items that weighs no more than maxweight.

    items must be a sequence of pairs (value, weight), where value is a
    number and weight is a non-negative integer.

    maxweight is a non-negative integer.

    Return a pair whose first element is the sum of values in the most
    valuable subsequence, and whose second element is the subsequence.

    >>> items = [(4, 12), (2, 1), (6, 4), (1, 1), (2, 2)]
    >>> knapsack(items, 15)
    (11, [(2, 1), (6, 4), (1, 1), (2, 2)])

    """
    @lru_cache(maxsize=None)
    def bestvalue(i, j):
        # Return the value of the most valuable subsequence of the first
        # i elements in items whose weights sum to no more than j.
        if j < 0:
            return float('-inf')
        if i == 0:
            return 0
        value, weight = items[i - 1]
        return max(bestvalue(i - 1, j), bestvalue(i - 1, j - weight) + value)

    j = maxweight
    result = []
    for i in reversed(range(len(items))):
        if bestvalue(i + 1, j) != bestvalue(i, j):
            result.append(items[i])
            j -= items[i][1]
    result.reverse()
    return bestvalue(len(items), maxweight), result

要查看需要计算多少局部解决方案，请在返回结果之前打印@functools.lru_cache。解决文档字符串中的示例问题时，输出：

CacheInfo(hits=17, misses=42, maxsize=None, currsize=42)

缓存中的42个条目少于自下而上方法计算出的96个部分解决方案。

\ $ \ begingroup \ $
感谢您的答复（尤其是第3点和第9点）！我不敢相信我忘记考虑记忆化的递归方法。希望今天晚些时候我会应用您的一些建议。再次感谢。
\ $ \ endgroup \ $
–voithos
13年1月16日在22:19

\ $ \ begingroup \ $
注意此提议解决方案的递归深度限制问题。
\ $ \ endgroup \ $
–弗兰克·史密斯
2014年8月8日15:13

\ $ \ begingroup \ $
@Frank：是的，此解决方案使用len（item）个级别的堆栈。
\ $ \ endgroup \ $
–加雷斯·里斯（Gareth Rees）
2014年3月13日在8:26

\ $ \ begingroup \ $
@GarethRees是否需要在knapsack（）中定义bestvalue（）？我倾向于避免在函数内部定义函数以使事情更清楚。
\ $ \ endgroup \ $
–绿色Diod
17年1月17日在13:43

\ $ \ begingroup \ $
@greendiod：我在背包内定义了bestvalue，因为（i）仅在此处使用过；（ii）它使用了外部作用域中的项（必须通过所有递归调用将其传递下来很痛苦）；（iii）已被记住，因此它具有关联的缓存，当背包返回时我们不想保留它。
\ $ \ endgroup \ $
–加雷斯·里斯（Gareth Rees）
17年1月17日在18:06

#2 楼

对于涉及嵌套循环的代码的第一部分，如果应该选择
-使用curr而不是i，
-分配prev = curr - 1和
-使用enumerate(items, 1)，

该代码将按字面意义记录自身。

for curr, (value, weight) in enumerate(items, 1):
    prev = curr - 1
    for capacity in range(maxweight + 1):
        if weight > capacity:
            bestvalues[curr][capacity] = bestvalues[prev][capacity]
        else:
            candidate1 = bestvalues[prev][capacity]
            candidate2 = bestvalues[prev][capacity - weight] + value

            bestvalues[curr][capacity] = max(candidate1, candidate2)

#3 楼

在代码行和执行计算的最快时间方面，可以使用动态编程以一种非常有效的方式解决以下问题。最重要的是，以下代码执行备忘录以缓存先前计算的结果。

try:
    from functools import lru_cache
except ImportError:
    # For Python2
    # pip install backports.functools_lru_cache
    from backports.functools_lru_cache import lru_cache
class knapsack:
    """
    Maximize sum of selected weight.
    Sum of selected size is less than capacity.
    Algorithm: Dynamic Programming
    ----
    >>>items = [(4, 12), (2, 1), (6, 4), (1, 1), (2, 2)]
    >>>weight, size   = zip(*items)
    >>>weight = list(weight)
    >>>size = list(size)
    >>>capacity = 15
    >>>knapsack(size, weight).solve(capacity)

    >>>(11, [1, 2, 3, 4])

    """
    def __init__(self, size, weight):
        self.size = size
        self.weight = weight
    @lru_cache()
    def solve(self, cap, i=0):
        if cap < 0: return -sum(self.weight), []
        if i == len(self.size): return 0, []
        res1 = self.solve(cap,  i + 1)
        res2 = self.solve(cap - self.size[i], i + 1)
        res2 = (res2[0] + self.weight[i], [i] + res2[1])
        return res1 if res1[0] >= res2[0] else res2

\ $ \ begingroup \ $
欢迎使用代码审查！您提出了替代解决方案，但尚未检查代码。请编辑以显示问题代码的哪些方面促使您编写此版本，以及对原始版本进行了哪些改进。也许（重新）阅读“如何回答”是值得的。
\ $ \ endgroup \ $
– Toby Speight
18-10-5在7:54

编程黑洞网

动态编程背包解决方案

评论

#1 楼

评论

#2 楼

#3 楼

评论