This is a Stack Overflow thread about how yield is used in Python. What follows is translated from the highest-voted answer. Original link: here

Question

What is the yield keyword used for in Python? What does it do?
For example, I'm trying to understand the following code ¹:


def _get_child_candidates(self, distance, min_dist, max_dist):
    if self._leftchild and distance - max_dist < self._median:
        yield self._leftchild
    if self._rightchild and distance + max_dist >= self._median:
        yield self._rightchild

And this is the caller:


result, candidates = [], [self]
while candidates:
    node = candidates.pop()
    distance = node._get_dist(obj)
    if distance <= max_dist and distance >= min_dist:
        result.extend(node._values)
    candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))
return result

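What happens when the method _get_child_candidates is called? Is a list returned? A single element? Is it called again? When do the subsequent calls stop?

¹ : This code was written by Jochen Schulz (jrschulz), who made a great Python library for metric spaces. This is the link to the complete source: Module mspace.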

Answer

To understand what yield does, you must understand what generators are. And before you can understand generators, you must understand iterables.

Iterables

When you create a list, you can read its items one by one. Reading its items one by one is called iteration.


>>> mylist = [1, 2, 3]
>>> for i in mylist:
...    print(i)
1
2
3

mylist is an iterable. When you use a list comprehension, you create a list, and therefore an iterable.


>>> mylist = [x*x for x in range(3)]
>>> for i in mylist:
...    print(i)
0
1
4

Everything you can use for... in... on is an iterable: lists, strings, files...
These iterables are handy because you can read them as often as you wish, but you store all the values in memory, and this is not always what you want when you have a lot of values.
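To make the "lists, strings, files" claim concrete, here's a small sketch. The helper `first_items` is a name invented for this illustration, and `io.StringIO` stands in for a real file so the snippet needs nothing on disk. The point is only that the same plain for loop works on all three.

```python
import io

def first_items(iterable, limit=3):
    """Collect up to `limit` items using nothing but a for loop."""
    items = []
    for item in iterable:
        if len(items) == limit:
            break
        items.append(item)
    return items

print(first_items([10, 20, 30, 40]))              # a list -> [10, 20, 30]
print(first_items("yield"))                       # a string -> ['y', 'i', 'e']
print(first_items(io.StringIO("a\nb\nc\nd\n")))   # a file-like object: first three lines
```

Iterating over a file-like object gives you lines, each still carrying its trailing newline.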

Generators

Generators are iterators, a kind of iterable you can only iterate over once. Generators do not store all the values in memory; they generate the values on the fly:


>>> mygenerator = (x*x for x in range(3))
>>> for i in mygenerator:
...    print(i)
0
1
4

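It is just the same except you used () instead of []. But you cannot run for i in mygenerator a second time, because generators can only be used once: they calculate 0, then forget about it and calculate 1, and finish by calculating 4, one value at a time.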
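Both claims, low memory and single use, are easy to check for yourself. This is just a sketch: the variable names are illustrative, and the byte counts from `sys.getsizeof` vary between Python versions, so only the comparison matters.

```python
import sys

numbers = range(100_000)

as_list = [x * x for x in numbers]        # all values stored at once
as_generator = (x * x for x in numbers)   # values produced on demand

# The generator object stays tiny no matter how many values it will produce.
# (getsizeof measures the container only, not the integers inside it.)
print(sys.getsizeof(as_list) > sys.getsizeof(as_generator))   # True

first_pass = sum(as_generator)    # consumes the generator
second_pass = sum(as_generator)   # nothing left: the generator is exhausted

print(first_pass == sum(as_list))   # True: same values either way
print(second_pass)                  # 0

# A list, by contrast, can be read again as often as you like.
print(sum(as_list) == first_pass)   # True
```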

yield

yield is a keyword that is used like return, except that the function will return a generator.


>>> def createGenerator():
...    mylist = range(3)
...    for i in mylist:
...        yield i*i
...
>>> mygenerator = createGenerator() # create a generator
>>> print(mygenerator) # mygenerator is an object!
<generator object createGenerator at 0xb7555c34>
>>> for i in mygenerator:
...     print(i)
0
1
4

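Here it's a useless example, but it's handy when you know your function will return a huge set of values that you will only need to read once.
To master yield, you must understand that when you call the function, the code you have written in the function body does not run.
The function only returns the generator object. This is a bit tricky :-)

Then, your code will continue from where it left off each time for uses the generator.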
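One way to convince yourself that the body really doesn't run at call time is to record the order in which things happen. In this sketch the `events` list and the function name `count_squares` are made up for the demonstration; they are not part of the original answer.

```python
events = []  # hypothetical log of what ran, in order

def count_squares(limit):
    events.append("body started")
    for i in range(limit):
        events.append(f"about to yield {i * i}")
        yield i * i
    events.append("body finished")

gen = count_squares(2)
events.append("generator created")   # the body has not run yet

for value in gen:
    events.append(f"loop received {value}")

for line in events:
    print(line)
```

The log starts with "generator created", and only then "body started". After that the function and the loop take turns: the function runs up to a yield, the loop receives the value, and the function resumes right after that yield on the next pass.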

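Now the hard part:

The first time the for calls the generator object created from your function, it will run the code in your function from the beginning until it hits yield, then it'll return the first value of the loop. Then, each subsequent call will run another iteration of the loop you have written in the function and return the next value. This will continue until the generator is considered empty.

The generator is considered empty once the function runs without hitting yield. That can be because the loop has come to an end, or because an if/else condition is no longer satisfied.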
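The for loop hides the signal a generator uses to say it's finished. Stepping through by hand with the built-in `next()` makes it visible. `evens_only` below is an invented helper, chosen because its `if` can leave the generator empty from the very start.

```python
def evens_only(values):
    """Yield only the even numbers; may never hit `yield` at all."""
    for v in values:
        if v % 2 == 0:
            yield v

gen = evens_only([1, 2, 4])
print(next(gen))   # 2: runs until the first yield
print(next(gen))   # 4: resumes after that yield, runs to the next one

try:
    next(gen)      # the loop ends without hitting yield again
except StopIteration:
    print("generator exhausted")

# If no value ever satisfies the condition, yield is never reached and
# the generator is empty from the start.
print(list(evens_only([1, 3, 5])))   # []
```

A for loop catches that StopIteration for you, which is why an empty generator simply makes the loop body run zero times.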

Your code explained

Generator:


# Here you create the method of the node object that will return the generator
def _get_child_candidates(self, distance, min_dist, max_dist):

    # Here is the code that will be called each time you use the generator object:

    # If there is still a child of the node object on its left
    # AND if distance is ok, return the next child
    if self._leftchild and distance - max_dist < self._median:
        yield self._leftchild

    # If there is still a child of the node object on its right
    # AND if distance is ok, return the next child
    if self._rightchild and distance + max_dist >= self._median:
        yield self._rightchild

    # If the function arrives here, the generator will be considered empty
    # there are no more than two values: the left and the right children
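If you want to poke at this method outside the original library, here is a minimal self-contained sketch. The `Node` class, its constructor, the `names` helper and the sample numbers are all hypothetical stand-ins; the real mspace implementation isn't shown in this post and will certainly differ. Only the body of `_get_child_candidates` is taken from the code above.

```python
class Node:
    """Hypothetical stand-in for the tree node in the question."""

    def __init__(self, name, median=0, left=None, right=None):
        self.name = name
        self._median = median
        self._leftchild = left
        self._rightchild = right

    def _get_child_candidates(self, distance, min_dist, max_dist):
        # Same two conditional yields as in the original method.
        if self._leftchild and distance - max_dist < self._median:
            yield self._leftchild
        if self._rightchild and distance + max_dist >= self._median:
            yield self._rightchild

root = Node("root", median=10, left=Node("L"), right=Node("R"))

def names(distance, max_dist):
    # min_dist is not used by the method body, so any value works here.
    gen = root._get_child_candidates(distance, min_dist=0, max_dist=max_dist)
    return [child.name for child in gen]

print(names(distance=10, max_dist=1))   # ['L', 'R']: both conditions hold
print(names(distance=5, max_dist=1))    # ['L']: only the left condition holds
print(names(distance=20, max_dist=1))   # ['R']: only the right condition holds

# A node without children never reaches a yield: the generator is empty.
print(list(Node("leaf")._get_child_candidates(0, 0, 0)))   # []
```

So one call can hand back zero, one or two children, and the caller doesn't need to know which in advance.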

Caller:

# Create an empty list and a list with the current object reference
result, candidates = list(), [self]

# Loop on candidates (they contain only one element at the beginning)
while candidates:

    # Get the last candidate and remove it from the list
    node = candidates.pop()

    # Get the distance between obj and the candidate
    distance = node._get_dist(obj)

    # If distance is ok, then you can fill the result
    if distance <= max_dist and distance >= min_dist:
        result.extend(node._values)

    # Add the children of the candidate in the candidates list
    # so the loop will keep running until it will have looked
    # at all the children of the children of the children, etc. of the candidate
    candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))

return result

This code contains several smart parts:

- The loop iterates on a list, but the list expands while the loop is being iterated :-) It's a concise way to go through all this nested data, even if it's a bit dangerous since you can end up with an infinite loop. In this case, candidates.extend(node._get_child_candidates(distance, min_dist, max_dist)) exhausts all the values of the generator, but while keeps creating new generator objects, and each one produces different values because it is applied to a different node.
- extend() is a list method that expects an iterable and adds its values to the list.

Usually we pass a list to it:

>>> a = [1, 2]
>>> b = [3, 4]
>>> a.extend(b)
>>> print(a)
[1, 2, 3, 4]

But in your code it gets a generator, which is good because:

1. You don't need to read the values twice.
2. You may have a lot of children and you don't want them all stored in memory.

And it works because Python does not care whether the argument of a method is a list or not. Python expects an iterable, so it will work with lists, tuples, strings and generators! This is called duck typing, and it is one of the reasons why Python is so cool. But this is another story, for another question...

You can stop here, or read on to see some advanced uses of generators:

Controlling a generator exhaustion

>>> class Bank():  # Let's create a bank, building ATMs
...     crisis = False
...     def create_atm(self):
...         while not self.crisis:
...             yield "$100"
...
>>> hsbc = Bank()  # When everything's ok the ATM gives you as much as you want
>>> corner_street_atm = hsbc.create_atm()
>>> print(next(corner_street_atm))
$100
>>> print(next(corner_street_atm))
$100
>>> print([next(corner_street_atm) for cash in range(5)])
['$100', '$100', '$100', '$100', '$100']
>>> hsbc.crisis = True  # Crisis is coming, no more money!
>>> print(next(corner_street_atm))
Traceback (most recent call last):
  ...
StopIteration
>>> wall_street_atm = hsbc.create_atm()  # It's even true for new ATMs
>>> print(next(wall_street_atm))
Traceback (most recent call last):
  ...
StopIteration
>>> hsbc.crisis = False  # The trouble is, even post-crisis the ATM remains empty
>>> print(next(corner_street_atm))
Traceback (most recent call last):
  ...
StopIteration
>>> brand_new_atm = hsbc.create_atm()  # Build a new one to get back in business
>>> for cash in brand_new_atm:
...     print(cash)
$100
$100
$100
$100
$100
$100
$100
$100
$100
...

Note: the built-in next() used here works in Python 3 and in Python 2.6 and later. In Python 3 you can also call corner_street_atm.__next__(); the .next() method only exists in Python 2.

This can be useful for various things, like controlling access to a resource.

Itertools, your best friend

The itertools module contains special functions to manipulate iterables. Ever wished you could duplicate a generator? Chain two generators? Group values in a nested list with a one-liner? Map/Zip without creating another list?

Then just import itertools.

An example? Let's see the possible orders of arrival for a four-horse race:

>>> import itertools
>>> horses = [1, 2, 3, 4]
>>> races = itertools.permutations(horses)
>>> print(races)
<itertools.permutations object at 0xb754f1dc>
>>> print(list(itertools.permutations(horses)))
[(1, 2, 3, 4),
 (1, 2, 4, 3),
 (1, 3, 2, 4),
 (1, 3, 4, 2),
 (1, 4, 2, 3),
 (1, 4, 3, 2),
 (2, 1, 3, 4),
 (2, 1, 4, 3),
 (2, 3, 1, 4),
 (2, 3, 4, 1),
 (2, 4, 1, 3),
 (2, 4, 3, 1),
 (3, 1, 2, 4),
 (3, 1, 4, 2),
 (3, 2, 1, 4),
 (3, 2, 4, 1),
 (3, 4, 1, 2),
 (3, 4, 2, 1),
 (4, 1, 2, 3),
 (4, 1, 3, 2),
 (4, 2, 1, 3),
 (4, 2, 3, 1),
 (4, 3, 1, 2),
 (4, 3, 2, 1)]
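The other questions asked about itertools (duplicating, chaining, working through nested lists) map onto a handful of its functions. Here's a short sketch of how that might look. `squares` is an illustrative helper rather than anything from the original answer, and reading "group values in a nested list" as flattening is my assumption.

```python
import itertools

def squares(limit):
    for i in range(limit):
        yield i * i

# Chain two generators into a single stream.
chained = itertools.chain(squares(3), squares(2))
print(list(chained))                                  # [0, 1, 4, 0, 1]

# Duplicate a generator: tee returns independent iterators over the same values.
first, second = itertools.tee(squares(3))
print(list(first), list(second))                      # [0, 1, 4] [0, 1, 4]

# Flatten a nested list in one line.
nested = [[1, 2], [3], [4, 5]]
print(list(itertools.chain.from_iterable(nested)))    # [1, 2, 3, 4, 5]

# Take only the first few values of an endless iterator.
print(list(itertools.islice(itertools.count(10), 3))) # [10, 11, 12]
```

One caveat with `tee`: once you've split a generator, keep using the copies and leave the original alone, otherwise the copies can miss values.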

Understanding the inner mechanisms of iteration

Iteration is a process involving iterables (which implement the __iter__() method) and iterators (which implement the __next__() method). Iterables are any objects you can get an iterator from. Iterators are objects that let you iterate on iterables.

There is more about it in this article about how for loops work: here
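To see the two roles side by side, here's a hand-written iterable and its iterator, followed by roughly what a for loop does with them. `Countdown` and `CountdownIterator` are names made up for this sketch; in everyday code a generator function would give you the same behaviour with far less typing.

```python
class Countdown:
    """An iterable: it knows how to hand out a fresh iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """An iterator: it produces the next value or raises StopIteration."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self  # iterators are themselves iterable

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


# Roughly what `for n in Countdown(3): ...` does behind the scenes.
iterator = iter(Countdown(3))
collected = []
while True:
    try:
        collected.append(next(iterator))
    except StopIteration:
        break

print(collected)                          # [3, 2, 1]
print(list(Countdown(3)) == collected)    # True
```

Because `Countdown.__iter__` builds a new iterator each time, the same `Countdown` object can be looped over repeatedly, unlike a generator.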

Source: https://segmentfault.com/a/1190000017405045

Original address: https://www.cnblogs.com/datiangou/p/10136498.html

Time: 2024-11-04 08:42:17
