Python Digest: Python with Context Managers

Original article: https://jeffknupp.com/blog/2016/03/07/python-with-context-managers/

Of all of the most commonly used Python constructs, context managers are neck-and-neck with decorators in a "Things I use but don't really understand how they work" contest. As every schoolchild will tell you, the canonical way to open and read from a file is:

with open('what_are_context_managers.txt', 'r') as infile:
    for line in infile:
        print('> {}'.format(line))

But how many of those who correctly handle file IO know why it's correct, or even that there's an incorrect way to do it? Hopefully a lot, or else this post won't get read much...

Managing Resources

Perhaps the most common (and important) use of context managers is to properly manage resources. In fact, that's the reason we use a context manager when reading from a file. The act of opening a file consumes a resource (called a file descriptor), and this resource is limited by your OS. That is to say, there is a maximum number of files a process can have open at one time. To prove it, try running this code:

files = []
for x in range(100000):
    files.append(open('foo.txt', 'w'))

If you're on Mac OS X or Linux, you probably got an error message like the following:

> python test.py
Traceback (most recent call last):
  File "test.py", line 3, in <module>
OSError: [Errno 24] Too many open files: 'foo.txt'

If you're on Windows, your computer probably crashed and your motherboard is now on fire. Let this be a lesson: don't leak file descriptors!

Joking aside, what is a "file descriptor" and what does it mean to "leak" one? Well, when you open a file, the operating system assigns an integer to the open file, allowing it to essentially give you a handle to the open file rather than direct access to the underlying file itself. This is beneficial for a variety of reasons, including being able to pass references to files between processes and to maintain a certain level of security enforced by the kernel.
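To make the "integer handle" idea concrete, here is a minimal sketch that opens a throwaway file and asks the file object for its descriptor. The temporary file and the variable names are only there for illustration; the exact integer you see will vary from run to run.

```python
import os
import tempfile

# Create a throwaway file so the example does not touch real data.
raw_fd, path = tempfile.mkstemp()  # mkstemp returns a raw descriptor and a path
os.close(raw_fd)                   # close the raw descriptor right away

f = open(path, 'w')
descriptor = f.fileno()            # the integer the OS assigned to this open file
print('file descriptor:', descriptor)

f.close()                          # hand the descriptor back to the OS
print('closed:', f.closed)
os.remove(path)
```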

So how does one "leak" a file descriptor? Simply by not closing opened files. When working with files, it's easy to forget that any file that is open()-ed must also be close()-ed. Failure to do so will lead you to discover that there is (usually) a limit to the number of file descriptors a process can be assigned. On UNIX-like systems, running ulimit -n in a shell should give you the value of that upper limit (it's 7168 on my system). If you want to prove it to yourself, re-run the example code and replace 100000 with whatever number you got (minus about 5, to account for files the Python interpreter opens on startup). You should now see the program run to completion.
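The same limit can also be read from inside Python. The sketch below assumes a UNIX-like system, since the standard resource module is not available on Windows; the variable names are illustrative.

```python
import resource

# RLIMIT_NOFILE is the per-process limit on open file descriptors.
# The soft limit is the one currently enforced (what ulimit -n reports);
# the hard limit is the ceiling the soft limit may be raised to.
soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)

print('soft limit:', soft_limit)
print('hard limit:', hard_limit)
```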

Of course, there's a simpler (and better) way to get the program to complete: close each file! Here's a contrived example of how to fix the issue:

files = []
for x in range(10000):
    f = open('foo.txt', 'w')
    f.close()
    files.append(f)

A Better Way To Manage Resources

In real systems, it's difficult to make sure that close() is called on every file opened, especially if the file is in a function that may raise an exception or has multiple return paths. In a complicated function that opens a file, how can you possibly be expected to remember to add close() to every place that function could return from? And that's not counting exceptions, either (which may happen from anywhere). The short answer is: you can't be.

In other languages, developers are forced to use try...except...finally every time they work with a file (or any other type of resource that needs to be closed, like sockets or database connections). Luckily, Python loves us and gives us a simple way to make sure all resources we use are properly cleaned up, regardless of whether the code returns or an exception is thrown: context managers.
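For comparison, here is a rough sketch of the manual try...finally pattern next to its context manager equivalent. The temporary path and the variable names are made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'example.txt')

# Manual cleanup: close() sits in a finally clause so it runs on every exit path.
f = open(path, 'w')
try:
    f.write('written the manual way\n')
finally:
    f.close()

# The same guarantee, with the cleanup handled by the context manager.
with open(path, 'a') as g:
    g.write('written using with\n')

with open(path) as h:
    contents = h.read()

print(f.closed, g.closed, h.closed)
print(contents)
```

Both versions close the file no matter how the block is left; the with version simply makes it impossible to forget the finally clause.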

By now, the premise should be obvious. We need a convenient method for indicating a particular variable has some cleanup associated with it, and to guarantee that cleanup happens, no matter what. Given that requirement, the syntax for using context managers makes a lot of sense:

with something_that_returns_a_context_manager() as my_resource:
    do_something(my_resource)
    ...
    print('done using my_resource')

That's it! Using with, we can call anything that returns a context manager (like the built-in open() function). We assign the resource it manages to a variable using ... as <variable_name>. Crucially, the resource is only meant to be used within the indented block below the with statement. Unlike a function, with does not create a new scope, so the name itself remains bound after the block ends; what changes is that the resource it refers to has been cleaned up. As soon as execution leaves the block, whether normally or because of an exception, Python automatically calls a special method that contains the code to clean up the resource.

But where is the code that is actually being called when the block exits? The short answer is, "wherever the context manager is defined." You see, there are a number of ways to create a context manager. The simplest is to define a class that contains two special methods: __enter__() and __exit__(). __enter__() returns the resource to be managed (like a file object in the case of open()). __exit__() does any cleanup work and usually returns nothing (returning a true value would suppress an exception raised inside the block).
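As a quick illustration of when those two methods run, here is a small sketch. The Traced class is hypothetical and exists only to record the order of calls and what __exit__() receives.

```python
class Traced:
    """Hypothetical context manager that records when each hook runs."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self  # this is the object bound by "as"

    def __exit__(self, exc_type, exc_value, traceback):
        # exc_type is None when the block finished without an exception.
        name = exc_type.__name__ if exc_type else None
        self.events.append('exit: {}'.format(name))
        return False  # a false value lets any exception propagate


with Traced() as t:
    t.events.append('body')

try:
    with Traced() as failing:
        raise ValueError('boom')
except ValueError:
    pass

print(t.events)        # ['enter', 'body', 'exit: None']
print(failing.events)  # ['enter', 'exit: ValueError']
```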

To make things a bit more clear, let's create a totally redundant context manager for working with files:

class File():

    def __init__(self, filename, mode):
        self.filename = filename
        self.mode = mode

    def __enter__(self):
        self.open_file = open(self.filename, self.mode)
        return self.open_file

    def __exit__(self, *args):
        self.open_file.close()

files = []
for _ in range(10000):
    with File('foo.txt', 'w') as infile:
        infile.write('foo')
        files.append(infile)

Let's go over what we have. Like any class, there's an __init__() method that sets up the object (in our case, setting the file name to open and the mode to open it in). __enter__() opens and returns the file (also creating an attribute open_file so that we can refer to it in __exit__()). __exit__() just closes the file. Running the code above works because the file is being closed when it leaves the with File('foo.txt', 'w') as infile: block. Even if code in that block raised an exception, the file would still be closed.
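The claim about exceptions is easy to check. The sketch below repeats the File class so that it runs on its own, writes to a temporary path chosen for the example, and raises deliberately inside the block.

```python
import os
import tempfile


class File:
    def __init__(self, filename, mode):
        self.filename = filename
        self.mode = mode

    def __enter__(self):
        self.open_file = open(self.filename, self.mode)
        return self.open_file

    def __exit__(self, *args):
        self.open_file.close()


path = os.path.join(tempfile.mkdtemp(), 'foo.txt')
manager = File(path, 'w')

try:
    with manager as outfile:
        outfile.write('foo')
        raise RuntimeError('something went wrong mid-write')
except RuntimeError as error:
    print('caught:', error)

# __exit__() ran before the exception reached the except clause.
print('closed after exception:', manager.open_file.closed)
```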

Other Useful Context Managers

Given that context managers are so helpful, they were added to the Standard Library in a number of places. Lock objects in threading are context managers, as are zipfile.ZipFile objects. subprocess.Popen, tarfile.TarFile, telnetlib.Telnet, pathlib.Path... the list goes on and on. Essentially, any object that needs to have close called on it after use is (or should be) a context manager.
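For objects that have a close() method but are not context managers themselves, the Standard Library offers contextlib.closing, which wraps them so they can be used with with. In this sketch the Connection class is hypothetical and stands in for any such resource.

```python
from contextlib import closing


class Connection:
    """Hypothetical resource that has close() but no __enter__/__exit__."""

    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False


# closing() returns a context manager that calls conn.close() on exit.
with closing(Connection()) as conn:
    print('inside block, open:', conn.is_open)

print('after block, open:', conn.is_open)
```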

The Lock usage is particularly interesting. In this case, the resource in question is a mutex (i.e. a "Lock"). Using context managers prevents a common source of deadlocks in multi-threaded programs, which occur when a thread "acquires" a mutex and never "releases" it. Consider the following:

from threading import Lock
lock = Lock()

def do_something_dangerous():
    lock.acquire()
    raise Exception('oops I forgot this code could raise exceptions')
    lock.release()  # never reached: the exception skips this line

try:
    do_something_dangerous()
except Exception:
    print('Got an exception')
lock.acquire()  # blocks forever because the lock was never released
print('Got here')

Clearly lock.release() will never be called, causing all other threads calling do_something_dangerous() to become deadlocked. In our program, this is represented by never hitting the print('Got here') line. This, however, is easily fixed by taking advantage of the fact that Lock is a context manager:

from threading import Lock
lock = Lock()

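def do_something_dangerous():
    with lock:  # the lock is released when the block exits, even on error
        raise Exception('oops I forgot this code could raise exceptions')

try:
    do_something_dangerous()
except Exception:
    print('Got an exception')
lock.acquire()  # succeeds immediately: the with block released the lock
print('Got here')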

Indeed, there is no reasonable way to acquire lock using a context manager and not release it. And that's exactly how it should be.
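One way to confirm the lock really was released, without risking a hang, is to inspect it afterwards. This sketch uses Lock.locked() and a non-blocking acquire; the variable names are illustrative.

```python
from threading import Lock

lock = Lock()


def do_something_dangerous():
    with lock:
        raise Exception('oops')


try:
    do_something_dangerous()
except Exception:
    pass

released = not lock.locked()             # True if the with block released it
acquired = lock.acquire(blocking=False)  # returns False instead of waiting
print('released:', released, 'acquired again:', acquired)

lock.release()  # tidy up after the successful acquire above
```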

Fun With contextlib

Context managers are so useful, they have a whole Standard Library module devoted to them! contextlib contains tools for creating and working with context managers. One nice shortcut that avoids writing a class is the @contextmanager decorator. To use it, decorate a generator function that yields exactly once. Everything before the yield is considered the code for __enter__(). Everything after is the code for __exit__(). Let's rewrite our File context manager using the decorator approach:

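from contextlib import contextmanager

@contextmanager
def open_file(path, mode):
    the_file = open(path, mode)
    try:
        yield the_file
    finally:
        # An exception raised inside the with block is re-raised at the
        # yield, so close() must sit in a finally clause to always run.
        the_file.close()

files = []

for x in range(100000):
    with open_file('foo.txt', 'w') as infile:
        files.append(infile)

# Verify that every file was closed when its with block ended.
for f in files:
    if not f.closed:
        print('not closed')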

As you can see, the implementation is considerably shorter. In fact, the generator function is only a handful of lines long! We open the file, yield it, then close it. The code that follows is just proof that all of the files are, indeed, closed. The fact that the program didn't crash is extra insurance it worked.
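One detail worth seeing in action: with @contextmanager, an exception raised inside the with block is re-raised inside the generator at the point of the yield, so cleanup code only runs reliably when it sits in a finally clause. The sketch below uses a hypothetical managed() generator that just records events.

```python
from contextlib import contextmanager

events = []


@contextmanager
def managed(name):
    events.append('acquire ' + name)
    try:
        yield name
    finally:
        # Runs whether the block finished normally or raised.
        events.append('release ' + name)


try:
    with managed('resource') as r:
        events.append('using ' + r)
        raise KeyError('failure inside the block')
except KeyError:
    events.append('handled')

print(events)
# ['acquire resource', 'using resource', 'release resource', 'handled']
```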

The official Python docs have a particularly fun/stupid example:

from contextlib import contextmanager

@contextmanager
def tag(name):
    print("<%s>" % name)
    yield
    print("</%s>" % name)

>>> with tag("h1"):
...    print("foo")
...
<h1>
foo
</h1>

My favorite piece of context manager-lunacy, however, has to be contextlib.ContextDecorator. It lets you define a context manager using the class-based approach, but inheriting from contextlib.ContextDecorator. By doing so, you can use your context manager with the with statement as normal or as a function decorator. We could do something similar to the HTML example above using this pattern (which is truly insane and shouldn't be done):

from contextlib import ContextDecorator

class makeparagraph(ContextDecorator):
    def __enter__(self):
        print('<p>')
        return self

    def __exit__(self, *exc):
        print('</p>')
        return False

@makeparagraph()
def emit_html():
    print('Here is some non-HTML')

emit_html()

The output will be:

<p>
Here is some non-HTML
</p>

Truly useless and horrifying...
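For a slightly less horrifying look at the same mechanism, here is a sketch showing one ContextDecorator subclass used both ways: around a block and around a function. The recorded class and its log list are hypothetical names invented for the example.

```python
from contextlib import ContextDecorator


class recorded(ContextDecorator):
    """Hypothetical helper that logs entry to and exit from a block or function."""

    def __init__(self, label, log):
        self.label = label
        self.log = log

    def __enter__(self):
        self.log.append('start ' + self.label)
        return self

    def __exit__(self, *exc):
        self.log.append('end ' + self.label)
        return False  # do not suppress exceptions


log = []

# Used as an ordinary context manager...
with recorded('block', log):
    log.append('inside block')


# ...and as a decorator wrapping every call to the function.
@recorded('function', log)
def work():
    log.append('inside function')
    return 42


result = work()
print(log)
print(result)
```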

Source: https://www.cnblogs.com/chickenwrap/p/9995896.html

Time: 2024-11-08 20:28:20
