对代码风格,注释风格,可读性和最佳实践的任何批评将不胜感激。 。我不确定代码的Pythonic程度;填充矩阵似乎是实现动态编程的一种自然方法,但它不会“感觉” Pythonic(在这种情况下,这是一件坏事吗?)。
请注意,注释是有点冗长;在编写算法的过程中,由于要尽最大可能地理解它,所以我尝试对每个步骤都尽可能明确。如果它们过多(或完全不正确),请随时对其进行评论(哈!)。
import sys
def knapsack(items, maxweight):
# Create an (N+1) by (W+1) 2-d list to contain the running values
# which are to be filled by the dynamic programming routine.
#
# There are N+1 rows because we need to account for the possibility
# of choosing from 0 up to and including N possible items.
# There are W+1 columns because we need to account for possible
# "running capacities" from 0 up to and including the maximum weight W.
bestvalues = [[0] * (maxweight + 1)
for i in xrange(len(items) + 1)]
# Enumerate through the items and fill in the best-value table
for i, (value, weight) in enumerate(items):
# Increment i, because the first row (0) is the case where no items
# are chosen, and is already initialized as 0, so we're skipping it
i += 1
for capacity in xrange(maxweight + 1):
# Handle the case where the weight of the current item is greater
# than the "running capacity" - we can't add it to the knapsack
if weight > capacity:
bestvalues[i][capacity] = bestvalues[i - 1][capacity]
else:
# Otherwise, we must choose between two possible candidate values:
# 1) the value of "running capacity" as it stands with the last item
# that was computed; if this is larger, then we skip the current item
# 2) the value of the current item plus the value of a previously computed
# set of items, constrained by the amount of capacity that would be left
# in the knapsack (running capacity - item's weight)
candidate1 = bestvalues[i - 1][capacity]
candidate2 = bestvalues[i - 1][capacity - weight] + value
# Just take the maximum of the two candidates; by doing this, we are
# in effect "setting in stone" the best value so far for a particular
# prefix of the items, and for a particular "prefix" of knapsack capacities
bestvalues[i][capacity] = max(candidate1, candidate2)
# Reconstruction
# Iterate through the values table, and check
# to see which of the two candidates were chosen. We can do this by simply
# checking if the value is the same as the value of the previous row. If so, then
# we say that the item was not included in the knapsack (this is how we arbitrarily
# break ties) and simply move the pointer to the previous row. Otherwise, we add
# the item to the reconstruction list and subtract the item's weight from the
# remaining capacity of the knapsack. Once we reach row 0, we're done
reconstruction = []
i = len(items)
j = maxweight
while i > 0:
if bestvalues[i][j] != bestvalues[i - 1][j]:
reconstruction.append(items[i - 1])
j -= items[i - 1][1]
i -= 1
# Reverse the reconstruction list, so that it is presented
# in the order that it was given
reconstruction.reverse()
# Return the best value, and the reconstruction list
return bestvalues[len(items)][maxweight], reconstruction
if __name__ == '__main__':
if len(sys.argv) != 2:
print('usage: knapsack.py [file]')
sys.exit(1)
filename = sys.argv[1]
with open(filename) as f:
lines = f.readlines()
maxweight = int(lines[0])
items = [map(int, line.split()) for line in lines[1:]]
bestvalue, reconstruction = knapsack(items, maxweight)
print('Best possible value: {0}'.format(bestvalue))
print('Items:')
for value, weight in reconstruction:
print('V: {0}, W: {1}'.format(value, weight))
期望的输入文件如下:
165
92 23
57 31
49 29
68 44
60 53
43 38
67 63
84 85
87 89
72 82
第一行包含允许的最大权重,后续行包含由值-权重对表示的项目。
#1 楼
1.回顾函数
knapsack
缺少一个文档字符串,该字符串解释了该函数采用的参数(items
中是什么东西?items
必须是序列,还是可以迭代) ?以及返回的内容。这种功能非常适合doctests。
注释中的内容类似于“用(W + 1)2-d创建(N + 1)清单”。但是N是什么,W是什么?大概N是
len(items)
,W是maxweight
,因此在注释和代码中使用相同的名称将是一个好主意:N = len(items)
W = maxweight
bestvalues
上方的注释无法解释该表中的值的含义。我会这样写:# bestvalues[i][j] is the best sum of values for any
# subsequence of the first i items, whose weights sum
# to no more than j.
这很清楚为什么\ $ 0≤i≤N \ $,为什么\ $ 0≤j≤W \ $。
在如下循环中:
bestvalues = [[0] * (maxweight + 1)
for i in xrange(len(items) + 1)]
如果未使用循环变量(此处
i
),则通常将其命名为_
。 这些行可以省略:
# Increment i, because the first row (0) is the case where no items
# are chosen, and is already initialized as 0, so we're skipping it
i += 1
然后在其余代码中,使用
i + 1
代替i
和i
而不是i - 1
。重建循环:
i = N
while i > 0:
# code
i -= 1
可以这样写:
for i in reversed(range(1, N + 1)):
# code
代码将显示如下错误消息:
print('usage: knapsack.py [file]')
错误消息应转为标准错误(非标准输出)。最好不要对程序名称进行硬编码,因为重命名程序很容易,但是忘记更新代码。因此改写:
sys.stderr.write("usage: {0} [file]\n".format(sys.argv[0]))
读取问题描述并打印结果的代码块仅在
__name__ == '__main__'
时运行。这使得很难进行测试,例如从交互式解释器进行测试。通常最好将这种代码放在自己的函数中,例如:def main(filename):
with open(filename) as f:
# etc.
if __name__ == '__main__':
if len(sys.argv) != 2:
print('usage: knapsack.py [file]')
sys.exit(1)
main(sys.argv[1])
现在您可以从解释器运行
main('problem.txt')
对其进行测试。代码将整个文件作为行列表读取到内存中:
lines = f.readlines()
这在这里是无害的,因为文件很小,但是通常最好处理一次归档一行,如下所示:
with open(filename) as f:
maxweight = int(next(f))
items = [[int(word) for word in line.split()] for line in f]
2。递归方法
任何动态编程算法都可以通过两种方式实现:通过自下而上构建部分结果表(如后文代码),或通过递归计算结果的方法。自上而下,使用备忘录来避免多次计算任何部分结果。
自上而下的方法通常会导致代码更简单,更清晰,并且仅计算特定对象所需的部分结果。问题实例(而采用自下而上的方法通常很难避免计算所有部分结果,即使其中一些未使用也是如此)。 this:
from functools import lru_cache
def knapsack(items, maxweight):
"""Solve the knapsack problem by finding the most valuable subsequence
of items that weighs no more than maxweight.
items must be a sequence of pairs (value, weight), where value is a
number and weight is a non-negative integer.
maxweight is a non-negative integer.
Return a pair whose first element is the sum of values in the most
valuable subsequence, and whose second element is the subsequence.
>>> items = [(4, 12), (2, 1), (6, 4), (1, 1), (2, 2)]
>>> knapsack(items, 15)
(11, [(2, 1), (6, 4), (1, 1), (2, 2)])
"""
@lru_cache(maxsize=None)
def bestvalue(i, j):
# Return the value of the most valuable subsequence of the first
# i elements in items whose weights sum to no more than j.
if j < 0:
return float('-inf')
if i == 0:
return 0
value, weight = items[i - 1]
return max(bestvalue(i - 1, j), bestvalue(i - 1, j - weight) + value)
j = maxweight
result = []
for i in reversed(range(len(items))):
if bestvalue(i + 1, j) != bestvalue(i, j):
result.append(items[i])
j -= items[i][1]
result.reverse()
return bestvalue(len(items), maxweight), result
要查看需要计算多少局部解决方案,请在返回结果之前打印
@functools.lru_cache
。解决文档字符串中的示例问题时,输出:CacheInfo(hits=17, misses=42, maxsize=None, currsize=42)
缓存中的42个条目少于自下而上方法计算出的96个部分解决方案。
评论
\ $ \ begingroup \ $
感谢您的答复(尤其是第3点和第9点)!我不敢相信我忘记考虑记忆化的递归方法。希望今天晚些时候我会应用您的一些建议。再次感谢。
\ $ \ endgroup \ $
–voithos
13年1月16日在22:19
\ $ \ begingroup \ $
注意此提议解决方案的递归深度限制问题。
\ $ \ endgroup \ $
–弗兰克·史密斯
2014年8月8日15:13
\ $ \ begingroup \ $
@Frank:是的,此解决方案使用len(item)个级别的堆栈。
\ $ \ endgroup \ $
–加雷斯·里斯(Gareth Rees)
2014年3月13日在8:26
\ $ \ begingroup \ $
@GarethRees是否需要在knapsack()中定义bestvalue()?我倾向于避免在函数内部定义函数以使事情更清楚。
\ $ \ endgroup \ $
–绿色Diod
17年1月17日在13:43
\ $ \ begingroup \ $
@greendiod:我在背包内定义了bestvalue,因为(i)仅在此处使用过; (ii)它使用了外部作用域中的项(必须通过所有递归调用将其传递下来很痛苦); (iii)已被记住,因此它具有关联的缓存,当背包返回时我们不想保留它。
\ $ \ endgroup \ $
–加雷斯·里斯(Gareth Rees)
17年1月17日在18:06
#2 楼
对于涉及嵌套循环的代码的第一部分,如果应该选择-使用
curr
而不是i
,-分配
prev = curr - 1
和-使用
enumerate(items, 1)
,该代码将按字面意义记录自身。
for curr, (value, weight) in enumerate(items, 1):
prev = curr - 1
for capacity in range(maxweight + 1):
if weight > capacity:
bestvalues[curr][capacity] = bestvalues[prev][capacity]
else:
candidate1 = bestvalues[prev][capacity]
candidate2 = bestvalues[prev][capacity - weight] + value
bestvalues[curr][capacity] = max(candidate1, candidate2)
#3 楼
在代码行和执行计算的最快时间方面,可以使用动态编程以一种非常有效的方式解决以下问题。最重要的是,以下代码执行备忘录以缓存先前计算的结果。try:
from functools import lru_cache
except ImportError:
# For Python2
# pip install backports.functools_lru_cache
from backports.functools_lru_cache import lru_cache
class knapsack:
"""
Maximize sum of selected weight.
Sum of selected size is less than capacity.
Algorithm: Dynamic Programming
----
>>>items = [(4, 12), (2, 1), (6, 4), (1, 1), (2, 2)]
>>>weight, size = zip(*items)
>>>weight = list(weight)
>>>size = list(size)
>>>capacity = 15
>>>knapsack(size, weight).solve(capacity)
>>>(11, [1, 2, 3, 4])
"""
def __init__(self, size, weight):
self.size = size
self.weight = weight
@lru_cache()
def solve(self, cap, i=0):
if cap < 0: return -sum(self.weight), []
if i == len(self.size): return 0, []
res1 = self.solve(cap, i + 1)
res2 = self.solve(cap - self.size[i], i + 1)
res2 = (res2[0] + self.weight[i], [i] + res2[1])
return res1 if res1[0] >= res2[0] else res2
评论
\ $ \ begingroup \ $
欢迎使用代码审查!您提出了替代解决方案,但尚未检查代码。请编辑以显示问题代码的哪些方面促使您编写此版本,以及对原始版本进行了哪些改进。也许(重新)阅读“如何回答”是值得的。
\ $ \ endgroup \ $
– Toby Speight
18-10-5在7:54
评论
我知道一个老帖子,但是如果您还没有碰到它,那么已经枚举允许您指定起始索引(例如,对于i,enumerate(items,1)中的item:应该让我从1开始。@SimonT:太棒了!我已经用Python编程了2年多了,但我从未见过枚举以这种方式使用。我一定会记住的。