蒙特卡洛pi计算_编程黑洞网

为了学习Monte Carlo的基础知识，我用它计算了pi。
我还写了代码背后的原因的解释。

在这里，您可以看到带有随机点的圆，我在自己的代码中进行了仿真。

"""
        This programme calculates pi with Monte Carlo

Given a square and a circle inside it.
We have

Area_of_the_square = LENGTH ** 2
Area_of_the_circle = radius ** 2 * pi => (LENGTH ** 2) / 4 * pi

The circle is obviously smaller than the square.
We have the equation:

Area_of_the_square * number_less_than_one == Area_of_the_circle

This programme is going to put a big number of points inside the square
(I suggest TIMES_TO_REPEAT = 10**5).

It will then count how many of them are inside the circle,

number_less_than_one = points_inside_the_circle / total_points

After doing some simple math with this formula:

Area_of_the_square * number_less_than_one == Area_of_the_circle

we get that

pi = number_less_than_one * 4

NOTE: This method is deadly slow and quite complicated,
I made this programme just in order to learn.

"""
import random

TIMES_TO_REPEAT = 10**5
LENGTH = 10**5
CENTER = [LENGTH/2,LENGTH/2]

def in_circle(point):
    x = point[0]
    y = point[1]
    center_x = CENTER[0]
    center_y = CENTER[1]
    radius = LENGTH/2
    return (x - center_x)**2 + (y - center_y)**2 < radius**2

count = inside_count = 0
for i in range(TIMES_TO_REPEAT):
    point = random.randint(1,LENGTH),random.randint(1,LENGTH)
    if in_circle(point):
        inside_count += 1
    count += 1

pi = (inside_count / count) * 4

print(pi)

#1 楼

您的代码似乎正在运行。这里有一些改善的方法。

元组拆箱

x = point[0]
y = point[1]

可以这样写：

x, y = point

删除不需要的内容

您维护了count变量，但是您已经知道计数：TIMES_TO_REPEAT。

变量名称

TIMES_TO_REPEAT似乎不是一个好名字。 NB_POINTS可能会更好。

删除计数逻辑

使用迭代器和sum，您可以删除计数点的逻辑：

inside_count = sum(1 for i in range(NB_POINTS) if in_circle((random.randint(1,LENGTH),random.randint(1,LENGTH))))

下划线表示可抛弃的值

在迭代中，您实际上并不关心i的值。称之为_是一件很平常的事情（有关此问题的更多信息）。

将其移动到函数中

它使测试变得更容易

def compute_pi(nb_it):
    inside_count = sum(1 for _ in range(nb_it) if in_circle((random.randint(1,LENGTH),random.randint(1,LENGTH))))
    return (inside_count / nb_it) * 4

添加if main保护器

从python文件定义函数/值/类/所有内容很普遍，但仅在if main保护器后面使用它们例如，通过模块导入更容易重用。

糟糕

当前，更改LENGTH的值似乎正在更改结果。这可能是错误的。

此时，我的代码如下所示：

import random

NB_POINTS = 10**5
LENGTH = 10**5
CENTER = [LENGTH/2,LENGTH/2]

def in_circle(point):
    x,y = point
    center_x, center_y = CENTER
    radius = LENGTH/2
    return (x - center_x)**2 + (y - center_y)**2 < radius**2

def compute_pi(nb_it):
    inside_count = sum(1 for _ in range(nb_it) if in_circle((random.randint(1,LENGTH),random.randint(1,LENGTH))))
    return (inside_count / nb_it) * 4

if __name__ == "__main__":
    print(compute_pi(NB_POINTS))

使用正确的类型和正确的值

为了解决上述问题并使事情变得更容易，您可以稍微更改所使用的参照。

而不是使用半径为R的圆（等于length / 2）），并以中心（R，R）并在平方[1，R] * [1，R]中选择整数值，也许您可以考虑一个半径为R且以中心为0的圆，并在[-R，+ R]中选择浮动值]。更好的是，您可以通过考虑[0，R] * [0，R]中的点来关注圆角。最后，R在这里并不是很重要，您可以选择1。

然后代码变为：

import random

NB_POINTS = 10**5

def in_circle(x, y):
    return x**2 + y**2 < 1

def compute_pi(nb_it):
    inside_count = sum(1 for _ in range(nb_it) if in_circle(random.random(),random.random()))
    return (inside_count / nb_it) * 4

if __name__ == "__main__":
    print(compute_pi(NB_POINTS))

然后可以简化它。

import random

NB_POINTS = 10**5

def compute_pi(nb_it):
    return 4 * sum(1 for _ in range(nb_it) if random.random()**2 + random.random()**2 < 1) / nb_it

if __name__ == "__main__":
    print(compute_pi(NB_POINTS))

\ $ \ begingroup \ $
就随机性而言，random.random（）是否均匀分布？如果不是，则将random.random（）** 2始终小于或等于1，是否将其视为与random.random（）相同？
\ $ \ endgroup \ $
–薄片
2014年11月10日在18:27

\ $ \ begingroup \ $
@Calpratt random（）在0和1之间均匀分布，但random（）** 2不是。
\ $ \ endgroup \ $
– David Z
2014年11月10日20:58

\ $ \ begingroup \ $
另外：更改LENGTH如何更改结果？由于统计上的波动，这在一定程度上是可以预期的。
\ $ \ endgroup \ $
– David Z
2014年11月10日在21:03

\ $ \ begingroup \ $
不是将所有内容移动到一行，而是写def in_circle：random.random（）** 2 + random.random（）** 2 <1？我本人是Python的新手，但我看到很多人都强调一次“做一件事”，而看起来compute_pi中的一行正在做很多工作，但这很有意义（至少对我来说）以生成随机数，并在同一步骤中检查其位置。而且该行超过79个字符（根据PEP8）。
\ $ \ endgroup \ $
–shadowtalker
2014年11月12日4:56

\ $ \ begingroup \ $
@ssdecontrol：您是对的。可能会更好，但是要使此功能更有用/更有意义，它应该以半径（或直径）和中心为参数，这将带回我尝试删除的繁琐细节。至于PEP 8，我很容易违反80个字符的规则：-/。
\ $ \ endgroup \ $
– SylvainD
2014年11月12日7:59

#2 楼

首先，对实现进行评论：请注意，选择1和10**5之间的随机数与选择0和1之间的随机数实际上没有什么不同；您将所有内容按相同的比例缩放，那么为什么要打扰呢？实际上，要求x和y为整数（不包括0）会降低方法的准确性。

删除该因数可以使代码简化如下：

import random

def approximate_pi(repeats=10**5):
    """Monte Carlo simulation to approximate pi.

    ... explanation goes here

    """
    inside = 0
    for _ in range(repeats):
        x, y = random.random(), random.random()
        if (((0.5 - x) ** 2) + ((0.5 - y) ** 2)) <= 0.25:
            inside += 1
    return (4 * inside) / repeats

if __name__ == '__main__':
    print approximate_pi()

请注意：

我将解释移到了函数docstring中，而不是放在外面；
我使用过if __name__ == '__main__'保护程序来运行代码，而不是直接将其放置在模块中；并且
我将repeats的数量作为参数而不是模块级常量。

如果确实要排除inside_circle，我可以将LENGTH用作该函数的参数：

def inside_circle(point, side_length):
    """Return whether or not point is in the circle.

    Note:
      The circle has a radius of side_length / 2 and sits inside a 
      square with sides of length side_length. Assumes that the 
      square's bottom-left corner is at (0, 0), i.e. that the square 
      is entirely within Quadrant I.

    Args:
      point (tuple of int/float): The x, y coordinates of the point.
      side_length (int/float): The length of the sides of the square
        within which the circle sits.

    """
    ...

def approximate_pi(side_length, repeats=10**5):
    """Monte Carlo simulation to approximate pi."""
    ...
    if inside_circle((x, y), side_length):
        ...

这使您可以from whatever import inside_circle并在其他地方使用它，而不必担心CENTER是否已定义。

#3 楼

正如您已经意识到的那样，纯python中的计算循环通常慢得令人无法接受。对于这种代码，通常最好使用numpy。例如：

import numpy as np

def pi_mc(N, seed=1234):
    rndm = np.random.RandomState(seed)
    x = rndm.uniform(size=N)
    y = rndm.uniform(size=N)
    r = x*x + y*y
    return 4. * np.count_nonzero(r < 1) / N


if __name__ == "__main__":
    for N in [100, 1000, 10000, 100000, 1e6]:
        pp = pi_mc(N)
        print N, pp, pp - np.pi

编辑：根据要求，一些计时：

In [9]: for N in [100, 1000, 10000, 100000, int(1e6)]:
    %timeit pi_mc(N)
   ...:     
100000 loops, best of 3: 14.5 µs per loop
10000 loops, best of 3: 45.9 µs per loop
1000 loops, best of 3: 351 µs per loop
100 loops, best of 3: 3.36 ms per loop
10 loops, best of 3: 46.4 ms per loop

In [10]: for N in [100, 1000, 10000, 100000, int(1e6)]:
    %timeit pure_python_pi_mc(N)
   ....:     
10000 loops, best of 3: 61.5 µs per loop
1000 loops, best of 3: 548 µs per loop
100 loops, best of 3: 5.47 ms per loop
10 loops, best of 3: 54.9 ms per loop
1 loops, best of 3: 546 ms per loop

pure_python_pi_mc是代码从@ 200_success的答案中提取出来，并包装到一个函数中。

如您所见，numpy的相对速度随着迭代次数的增加而提高。这是由于以下事实：建立循环需要固定的时间开销。

关于数值方面的另外两点说明：

总是播种您的随机数生成器（可再现性）
更好地不依赖于python2 vs python3师。即使在python 3上也要使用from __future__ import division。

\ $ \ begingroup \ $
欢迎使用代码审查！您介意在您的答案中包含计时结果的比较吗？
\ $ \ endgroup \ $
– 200_success
2014年11月10日上午11:50

\ $ \ begingroup \ $
@ 200_成功完成
\ $ \ endgroup \ $
–ev-br
2014年11月10日13:25

\ $ \ begingroup \ $
始终为您的随机数生成器设置种子（可重复性）：但是，IMO不应将默认种子设置为确定值！默认值至少应该是时间...
\ $ \ endgroup \ $
–乔治·雷涛（Jorge Leitao）
2014年11月10日13:53

\ $ \ begingroup \ $
首先，您正在失去可重复性。这对测试/调试是不利的：-)。在生产中，多播[并行运行多个MC链]不能及时播种
\ $ \ endgroup \ $
–ev-br
2014年11月10日14:54

\ $ \ begingroup \ $
您能对r = x * x + y * y而不是r = x ** 2 + y ** 2进行评论吗？这是一种风格，还是一种比另一种更优化？
\ $ \ endgroup \ $
–shadowtalker
2014年11月12日5:00

#4 楼

如果将圆以原点为中心，将简化计算。同样，通过对称，您只需建模一个象限。

import random

TIMES_TO_REPEAT = 10**5
LENGTH = 10**5

def in_circle(x, y):
    return x**2 + y**2 < LENGTH**2

inside_count = 0
for _ in range(TIMES_TO_REPEAT):
    point = random.randint(0,LENGTH), random.randint(0,LENGTH)
    if in_circle(*point):
        inside_count += 1

pi = (inside_count / TIMES_TO_REPEAT) * 4

print(pi)

其他一些说明：

i未使用；您可以调用迭代变量_。

count最终将是TIMES_TO_REPEAT，因此您不必通过递增来导出它。

*point使您不必拆开point[0]的包装。和point[1]。
如果您要求点严格位于圆内（使用<而不是≤进行比较），则将0而不是1用作下限似乎很合理。

似乎使用浮点数比LENGTH的更长整数值更有效地提高了准确性。因此，我们可以使用单位圆，并摆脱一个任意常数。

TIMES_TO_REPEAT = 5000000

def in_circle(x, y):
    return x**2 + y**2 < 1

inside_count = 0
for _ in range(TIMES_TO_REPEAT):
    if in_circle(random.random(), random.random()):
        inside_count += 1

pi = (inside_count / TIMES_TO_REPEAT) * 4

\ $ \ begingroup \ $
整体建议可靠（+1）。最后一行依赖python3 float分区。您介意将来的进口部门吗？
\ $ \ endgroup \ $
–ev-br
2014年11月10日13:25

#5 楼

一些小问题：

inside_count / count是Python 2中的整数除法。您需要调用from __future__ import division在两个Python版本中都是正确的，或者在README中有注释以指出潜在用户。
有一些可以简化的代码；
您应该尝试遵循PEP8。

下面是考虑到上述问题的代码的简化。
/>

"""
[Your explanation]
"""
from __future__ import division

import random


def in_circle(x, y):
    return x**2 + y**2 < 1


def compute_pi(samples):
    """
    Returns an approximation of pi using Monte Carlo.

    The precision increases with sqrt(samples). See ref. [...] why.
    """

    counts = 0
    for _ in range(samples):
        if in_circle(random.random(), random.random()):
            counts += 1

    return (counts / samples) * 4  # Expression in file docstring


if __name__ == '__main__':
    print(compute_pi(1000))

#6 楼

我可以建议两件事：
1。为了简化代码，以原点为中心的单位圆。
2。 Python支持complex数据类型。这适用于此问题。使用此功能，您可以使用abs()函数。

我的解决方案：

num_points = 10**5
in_count = 0
for _ in range(num_points):
    point = random.uniform(-1,1) + random.uniform(-1,1)*1j
    if abs(point)<1: in_count += 1
print(in_count*4/num_points)

一个衬板（可能会使初学者感到困惑）：

print(sum(1 for _ in range(num_points) if abs(random.uniform(-1,1) + random.uniform(-1,1)*1j) < 1)*4/num_points)