Speeding up code with numba

onecodeall


Python 2014-10-31 23:28:12

Using numba to speed up Python

Using jit to speed up Python's slow for loops

jit stands for Just-in-time; in numba it refers specifically to just-in-time compilation.

Older numba releases did not support list comprehensions; see https://github.com/numba/numba/issues/504 for details. Later releases added support for them in nopython mode.

jit can speed up more than just for loops, but for loops are the most common case and usually show the largest gains.

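jit "precompiles" your code to some extent, which means it pins down the data type of each variable. So when you create a list inside a jit function and want it to hold floats, do not write [0] * len(x); add a decimal point after the 0 instead: [0.] * len(x)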

Using vectorize to implement numpy-style ufuncs

import numba as nb

@nb.vectorize(nopython=True)
def add_vec(a, b):
    return a + b

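A function decorated with vectorize receives individual scalar values as its arguments, not whole arrays. A more detailed explanation is in the official documentation: http://numba.pydata.org/numba-doc/0.23.0/user/vectorize.html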

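The speed differs depending on whether the constant c (the scalar passed as the second argument) is an integer or a float. My guess is that when c is an integer it has to be converted to a float during the actual computation, which slows things down.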

You can declare the argument types and the return type explicitly. If you know the constant c is an integer, you can write:

@nb.vectorize("float32(float32, int32)", nopython=True)
def add_with_vec(a, b):
    return a + b

If I know the constant c is a float, I can write:

@nb.vectorize("float32(float32, float32)", nopython=True)
def add_with_vec(a, b):
    return a + b

And if I know the constant c is either an integer or a float, I can list both signatures:

@nb.vectorize([
    "float32(float32, int32)",
    "float32(float32, float32)"
], nopython=True)
def add_with_vec(a, b):
    return a + b

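Note that float32 and float64 are different types, as are int32 and int64, so take care that the declared signature matches the data you actually pass in.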

On top of that, the coolest thing about vectorize is that it can run in "parallel":

@nb.vectorize("float32(float32, float32)", target="parallel", nopython=True)
def add_vec(a, b):
    return a + b

Note that the target parameter of vectorize accepts three values: cpu (the default), parallel, and cuda. The official documentation gives good guidance on which one to choose:

A general guideline is to choose different targets for different data sizes and algorithms. The “cpu” target works well for small data sizes (approx. less than 1KB) and low compute intensity algorithms. It has the least amount of overhead. The “parallel” target works well for medium data sizes (approx. less than 1MB). Threading adds a small delay. The “cuda” target works well for big data sizes (approx. greater than 1MB) and high compute intensity algorithms. Transfering memory to and from the GPU adds significant overhead.

Using jit(nogil=True) for efficient concurrency (multithreading)

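As we know, the GIL makes multithreading in Python rather unpleasant to use. However, numba's jit has a parameter called nogil, which releases the GIL while the compiled function is running: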

@nb.jit(nopython=True, nogil=True)

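Generally speaking, the larger the data, the more noticeable the benefit of concurrency. Conversely, with small data, concurrency may well make performance worse.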
