Python Advanced Data Structures: The collections Module

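An earlier post, "A Careful Roundup of Python Data Type Methods: No Need to Memorize, the Source Code Has It All", covered Python's basic data types and data structures. This post introduces a more advanced set: the collections module.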

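This module builds on those basic data structures and adds several handy container types, for example:

a) Counter: a counter that tallies how many times each element occurs

b) OrderedDict: a dictionary that remembers insertion order

c) defaultdict: a dictionary whose missing keys get a default value from a factory

d) namedtuple: a tuple with named fields, so elements can be accessed by name

e) deque: a double-ended queue; items can be added and removed at both ends (a one-way queue, by contrast, only adds at one end and removes at the other)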

1. Counter

A counter that tallies how many times each element occurs in an object.

As usual, let's look at the source code first:

class Counter(dict):
    '''Dict subclass for counting hashable items.  Sometimes called a bag
    or multiset.  Elements are stored as dictionary keys and their counts
    are stored as dictionary values.

    >>> c = Counter('abcdeabcdabcaba')  # count elements from a string

    >>> c.most_common(3)                # three most common elements
    [('a', 5), ('b', 4), ('c', 3)]
    >>> sorted(c)                       # list all unique elements
    ['a', 'b', 'c', 'd', 'e']
    >>> ''.join(sorted(c.elements()))   # list elements with repetitions
    'aaaaabbbbcccdde'
    >>> sum(c.values())                 # total of all counts
    15

    >>> c['a']                          # count of letter 'a'
    5
    >>> for elem in 'shazam':           # update counts from an iterable
    ...     c[elem] += 1                # by adding 1 to each element's count
    >>> c['a']                          # now there are seven 'a'
    7
    >>> del c['b']                      # remove all 'b'
    >>> c['b']                          # now there are zero 'b'
    0

    >>> d = Counter('simsalabim')       # make another counter
    >>> c.update(d)                     # add in the second counter
    >>> c['a']                          # now there are nine 'a'
    9

    >>> c.clear()                       # empty the counter
    >>> c
    Counter()

    Note:  If a count is set to zero or reduced to zero, it will remain
    in the counter until the entry is deleted or the counter is cleared:

    >>> c = Counter('aaabbc')
    >>> c['b'] -= 2                     # reduce the count of 'b' by two
    >>> c.most_common()                 # 'b' is still in, but its count is zero
    [('a', 3), ('c', 1), ('b', 0)]

    '''
    # References:
    #   http://en.wikipedia.org/wiki/Multiset
    #   http://www.gnu.org/software/smalltalk/manual-base/html_node/Bag.html
    #   http://www.demo2s.com/Tutorial/Cpp/0380__set-multiset/Catalog0380__set-multiset.htm
    #   http://code.activestate.com/recipes/259174/
    #   Knuth, TAOCP Vol. II section 4.6.3

    def __init__(self, iterable=None, **kwds):
        ‘‘‘Create a new, empty Counter object.  And if given, count elements
        from an input iterable.  Or, initialize the count from another mapping
        of elements to their counts.

        >>> c = Counter()                           # a new, empty counter
        >>> c = Counter(‘gallahad‘)                 # a new counter from an iterable
        >>> c = Counter({‘a‘: 4, ‘b‘: 2})           # a new counter from a mapping
        >>> c = Counter(a=4, b=2)                   # a new counter from keyword args

        ‘‘‘
        super(Counter, self).__init__()
        self.update(iterable, **kwds)

    def __missing__(self, key):
        """ Return a count of 0 for elements that are not in the counter """
        'The count of elements not in the Counter is zero.'
        # Needed so that self[missing_item] does not raise KeyError
        return 0

    def most_common(self, n=None):
        """ Return the n most common elements with their counts, most common first """
        '''List the n most common elements and their counts from the most
        common to the least.  If n is None, then list all element counts.

        >>> Counter('abcdeabcdabcaba').most_common(3)
        [('a', 5), ('b', 4), ('c', 3)]

        '''
        # Emulate Bag.sortedByCount from Smalltalk
        if n is None:
            return sorted(self.items(), key=_itemgetter(1), reverse=True)
        return _heapq.nlargest(n, self.items(), key=_itemgetter(1))

    def elements(self):
        """ All elements in the counter. Note: this returns an iterator over the elements, not a collection """
        '''Iterator over elements repeating each as many times as its count.

        >>> c = Counter('ABCABC')
        >>> sorted(c.elements())
        ['A', 'A', 'B', 'B', 'C', 'C']

        # Knuth's example for prime factors of 1836:  2**2 * 3**3 * 17**1
        >>> prime_factors = Counter({2: 2, 3: 3, 17: 1})
        >>> product = 1
        >>> for factor in prime_factors.elements():     # loop over factors
        ...     product *= factor                       # and multiply them
        >>> product
        1836

        Note, if an element's count has been set to zero or is a negative
        number, elements() will ignore it.

        '''
        # Emulate Bag.do from Smalltalk and Multiset.begin from C++.
        return _chain.from_iterable(_starmap(_repeat, self.items()))

    # Override dict methods where necessary

    @classmethod
    def fromkeys(cls, iterable, v=None):
        # There is no equivalent method for counters because setting v=1
        # means that no element can have a count greater than one.
        raise NotImplementedError(
            ‘Counter.fromkeys() is undefined.  Use Counter(iterable) instead.‘)

    def update(self, iterable=None, **kwds):
        """ Update the counter by adding counts: missing elements are created, existing ones have their counts increased """
        '''Like dict.update() but add counts instead of replacing them.

        Source can be an iterable, a dictionary, or another Counter instance.

        >>> c = Counter('which')
        >>> c.update('witch')           # add elements from another iterable
        >>> d = Counter('watch')
        >>> c.update(d)                 # add elements from another counter
        >>> c['h']                      # four 'h' in which, witch, and watch
        4

        '''
        # The regular dict.update() operation makes no sense here because the
        # replace behavior results in the some of original untouched counts
        # being mixed-in with all of the other counts for a mismash that
        # doesn‘t have a straight-forward interpretation in most counting
        # contexts.  Instead, we implement straight-addition.  Both the inputs
        # and outputs are allowed to contain zero and negative counts.

        if iterable is not None:
            if isinstance(iterable, Mapping):
                if self:
                    self_get = self.get
                    for elem, count in iterable.items():
                        self[elem] = self_get(elem, 0) + count
                else:
                    super(Counter, self).update(iterable) # fast path when counter is empty
            else:
                self_get = self.get
                for elem in iterable:
                    self[elem] = self_get(elem, 0) + 1
        if kwds:
            self.update(kwds)

    def subtract(self, iterable=None, **kwds):
        """ Subtract counts: each element's count is reduced by its count in the given input """
        '''Like dict.update() but subtracts counts instead of replacing them.
        Counts can be reduced below zero.  Both the inputs and outputs are
        allowed to contain zero and negative counts.

        Source can be an iterable, a dictionary, or another Counter instance.

        >>> c = Counter('which')
        >>> c.subtract('witch')             # subtract elements from another iterable
        >>> c.subtract(Counter('watch'))    # subtract elements from another counter
        >>> c['h']                          # 2 in which, minus 1 in witch, minus 1 in watch
        0
        >>> c['w']                          # 1 in which, minus 1 in witch, minus 1 in watch
        -1

        '''
        if iterable is not None:
            self_get = self.get
            if isinstance(iterable, Mapping):
                for elem, count in iterable.items():
                    self[elem] = self_get(elem, 0) - count
            else:
                for elem in iterable:
                    self[elem] = self_get(elem, 0) - 1
        if kwds:
            self.subtract(kwds)

    def copy(self):
        """ Copy the counter """
        'Return a shallow copy.'
        return self.__class__(self)

    def __reduce__(self):
        """ Return a tuple of (class, argument tuple), used for pickling """
        return self.__class__, (dict(self),)

    def __delitem__(self, elem):
        """ Delete an element """
        'Like dict.__delitem__() but does not raise KeyError for missing values.'
        if elem in self:
            super(Counter, self).__delitem__(elem)

    def __repr__(self):
        if not self:
            return ‘%s()‘ % self.__class__.__name__
        items = ‘, ‘.join(map(‘%r: %r‘.__mod__, self.most_common()))
        return ‘%s({%s})‘ % (self.__class__.__name__, items)

    # Multiset-style mathematical operations discussed in:
    #       Knuth TAOCP Volume II section 4.6.3 exercise 19
    #       and at http://en.wikipedia.org/wiki/Multiset
    #
    # Outputs guaranteed to only include positive counts.
    #
    # To strip negative and zero counts, add-in an empty counter:
    #       c += Counter()

    def __add__(self, other):
        ‘‘‘Add counts from two counters.

        >>> Counter(‘abbb‘) + Counter(‘bcc‘)
        Counter({‘b‘: 4, ‘c‘: 2, ‘a‘: 1})

        ‘‘‘
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem, count in self.items():
            newcount = count + other[elem]
            if newcount > 0:
                result[elem] = newcount
        for elem, count in other.items():
            if elem not in self and count > 0:
                result[elem] = count
        return result

    def __sub__(self, other):
        ‘‘‘ Subtract count, but keep only results with positive counts.

        >>> Counter(‘abbbc‘) - Counter(‘bccd‘)
        Counter({‘b‘: 2, ‘a‘: 1})

        ‘‘‘
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem, count in self.items():
            newcount = count - other[elem]
            if newcount > 0:
                result[elem] = newcount
        for elem, count in other.items():
            if elem not in self and count < 0:
                result[elem] = 0 - count
        return result

    def __or__(self, other):
        ‘‘‘Union is the maximum of value in either of the input counters.

        >>> Counter(‘abbb‘) | Counter(‘bcc‘)
        Counter({‘b‘: 3, ‘c‘: 2, ‘a‘: 1})

        ‘‘‘
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem, count in self.items():
            other_count = other[elem]
            newcount = other_count if count < other_count else count
            if newcount > 0:
                result[elem] = newcount
        for elem, count in other.items():
            if elem not in self and count > 0:
                result[elem] = count
        return result

    def __and__(self, other):
        ‘‘‘ Intersection is the minimum of corresponding counts.

        >>> Counter(‘abbb‘) & Counter(‘bcc‘)
        Counter({‘b‘: 1})

        ‘‘‘
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem, count in self.items():
            other_count = other[elem]
            newcount = count if count < other_count else other_count
            if newcount > 0:
                result[elem] = newcount
        return result

    def __pos__(self):
        ‘Adds an empty counter, effectively stripping negative and zero counts‘
        result = Counter()
        for elem, count in self.items():
            if count > 0:
                result[elem] = count
        return result

    def __neg__(self):
        ‘‘‘Subtracts from an empty counter.  Strips positive and zero counts,
        and flips the sign on negative counts.

        ‘‘‘
        result = Counter()
        for elem, count in self.items():
            if count < 0:
                result[elem] = 0 - count
        return result

    def _keep_positive(self):
        ‘‘‘Internal method to strip elements with a negative or zero count‘‘‘
        nonpositive = [elem for elem, count in self.items() if not count > 0]
        for elem in nonpositive:
            del self[elem]
        return self

    def __iadd__(self, other):
        ‘‘‘Inplace add from another counter, keeping only positive counts.

        >>> c = Counter(‘abbb‘)
        >>> c += Counter(‘bcc‘)
        >>> c
        Counter({‘b‘: 4, ‘c‘: 2, ‘a‘: 1})

        ‘‘‘
        for elem, count in other.items():
            self[elem] += count
        return self._keep_positive()

    def __isub__(self, other):
        ‘‘‘Inplace subtract counter, but keep only results with positive counts.

        >>> c = Counter(‘abbbc‘)
        >>> c -= Counter(‘bccd‘)
        >>> c
        Counter({‘b‘: 2, ‘a‘: 1})

        ‘‘‘
        for elem, count in other.items():
            self[elem] -= count
        return self._keep_positive()

    def __ior__(self, other):
        ‘‘‘Inplace union is the maximum of value from either counter.

        >>> c = Counter(‘abbb‘)
        >>> c |= Counter(‘bcc‘)
        >>> c
        Counter({‘b‘: 3, ‘c‘: 2, ‘a‘: 1})

        ‘‘‘
        for elem, other_count in other.items():
            count = self[elem]
            if other_count > count:
                self[elem] = other_count
        return self._keep_positive()

    def __iand__(self, other):
        ‘‘‘Inplace intersection is the minimum of corresponding counts.

        >>> c = Counter(‘abbb‘)
        >>> c &= Counter(‘bcc‘)
        >>> c
        Counter({‘b‘: 1})

        ‘‘‘
        for elem, count in self.items():
            other_count = other[elem]
            if other_count < count:
                self[elem] = other_count
        return self._keep_positive()

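In fact, Counter is a subclass of dict. Examples:

import collections

# Count how many times each element occurs; the result behaves like a dict
res = collections.Counter('abcdabcaba')
print(res)                                  # Result: Counter({'a': 4, 'b': 3, 'c': 2, 'd': 1})

# Counter is a dict subclass, so key/value pairs can be read just as with a dict
for k in res:
    print(k, res[k], end='  |  ')           # Result: a 4  |  b 3  |  c 2  |  d 1  |
for k, v in res.items():
    print(k, v, end='  |  ')                # Result: a 4  |  b 3  |  c 2  |  d 1  |

# most_common(n) returns the n most frequent elements with their counts; with no argument it returns all of them
print(res.most_common())                    # Result: [('a', 4), ('b', 3), ('c', 2), ('d', 1)]
print(res.most_common(2))                   # Result: [('a', 4), ('b', 3)]

# update increases the counts of elements, subtract decreases them
a = collections.Counter('abcde')
res.update(a)
print(res)                                  # Result: Counter({'a': 5, 'b': 4, 'c': 3, 'd': 2, 'e': 1}), every count is higher than in the original res

b = collections.Counter('aaafff')
res.subtract(b)
print(res)                                  # Result: Counter({'b': 4, 'c': 3, 'a': 2, 'd': 2, 'e': 1, 'f': -3}), note that counts can go negative

# fromkeys is deliberately left undefined for Counter; calling it raises NotImplementedError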
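The operator methods in the source above (__add__, __sub__, __or__, __and__, __pos__) can be tried directly. The following is a small sketch, assuming Python 3.3 or later for the unary plus; the variable names are only for illustration.

```python
from collections import Counter

a = Counter('abbb')
b = Counter('bcc')

total = a + b     # add counts element by element
diff = a - b      # subtract, keeping only positive results
union = a | b     # maximum of the two counts
common = a & b    # minimum of the two counts

print(total)      # Counter({'b': 4, 'c': 2, 'a': 1})
print(diff)       # Counter({'b': 2, 'a': 1})
print(union)      # Counter({'b': 3, 'c': 2, 'a': 1})
print(common)     # Counter({'b': 1})

# subtract() keeps zero and negative counts, unlike the - operator
c = Counter('aab')
c.subtract('abbc')
print(c)          # Counter({'a': 1, 'b': -1, 'c': -1})

# Unary plus builds a new counter holding only the positive counts
print(+c)         # Counter({'a': 1})
```

The difference between the - operator and subtract() matters in practice: the operator silently drops non-positive counts, while subtract() preserves them.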

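2. OrderedDict

An ordered dictionary. Before Python 3.7 the built-in dict did not guarantee any key order, which was sometimes inconvenient. collections provides OrderedDict, which remembers the order in which keys were inserted.

Before looking at OrderedDict itself, note that you can build an ordered dictionary with what you already know, by keeping the keys in a separate list:

lst = []
dic = {}

lst.append('name')
dic['name'] = 'winter'
lst.append('age')
dic['age'] = 18

for k in lst:
    print(k, dic[k])

OrderedDict works on a similar principle: it stores the values in the inherited dict and tracks the key order separately. The pure-Python version keeps that order in a doubly linked list rather than a plain list.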
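The list-plus-dict idea can be wrapped in a small class. This is only an illustrative sketch: the class name ListBackedOrderedDict and its methods are hypothetical and are not how the standard library implements OrderedDict.

```python
class ListBackedOrderedDict:
    """Hypothetical illustration: a dict plus a list that records key order."""

    def __init__(self):
        self._keys = []   # keys in insertion order
        self._data = {}   # key -> value

    def __setitem__(self, key, value):
        if key not in self._data:
            self._keys.append(key)   # only new keys change the order
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]
        self._keys.remove(key)       # O(n) scan: the cost of using a plain list

    def __iter__(self):
        return iter(self._keys)

    def items(self):
        return [(k, self._data[k]) for k in self._keys]


d = ListBackedOrderedDict()
d['name'] = 'winter'
d['age'] = 18
d['name'] = 'frank'   # updating a key keeps its original position
del d['age']
d['age'] = 19         # deleting and re-inserting moves the key to the end
print(d.items())      # [('name', 'frank'), ('age', 19)]
```

Deleting a key requires a linear scan of the list here. A linked list combined with a key-to-link map makes deletion constant time, which is the approach described in the comments of the pure-Python OrderedDict source.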

Source code:

class OrderedDict(dict):
    ‘Dictionary that remembers insertion order‘
    # An inherited dict maps keys to values.
    # The inherited dict provides __getitem__, __len__, __contains__, and get.
    # The remaining methods are order-aware.
    # Big-O running times for all methods are the same as regular dictionaries.

    # The internal self.__map dict maps keys to links in a doubly linked list.
    # The circular doubly linked list starts and ends with a sentinel element.
    # The sentinel element never gets deleted (this simplifies the algorithm).
    # The sentinel is in self.__hardroot with a weakref proxy in self.__root.
    # The prev links are weakref proxies (to prevent circular references).
    # Individual links are kept alive by the hard reference in self.__map.
    # Those hard references disappear when a key is deleted from an OrderedDict.

    def __init__(*args, **kwds):
        ‘‘‘Initialize an ordered dictionary.  The signature is the same as
        regular dictionaries, but keyword arguments are not recommended because
        their insertion order is arbitrary.

        ‘‘‘
        if not args:
            raise TypeError("descriptor ‘__init__‘ of ‘OrderedDict‘ object "
                            "needs an argument")
        self, *args = args
        if len(args) > 1:
            raise TypeError(‘expected at most 1 arguments, got %d‘ % len(args))
        try:
            self.__root
        except AttributeError:
            self.__hardroot = _Link()
            self.__root = root = _proxy(self.__hardroot)
            root.prev = root.next = root
            self.__map = {}
        self.__update(*args, **kwds)

    def __setitem__(self, key, value,
                    dict_setitem=dict.__setitem__, proxy=_proxy, Link=_Link):
        ‘od.__setitem__(i, y) <==> od[i]=y‘
        # Setting a new item creates a new link at the end of the linked list,
        # and the inherited dictionary is updated with the new key/value pair.
        if key not in self:
            self.__map[key] = link = Link()
            root = self.__root
            last = root.prev
            link.prev, link.next, link.key = last, root, key
            last.next = link
            root.prev = proxy(link)
        dict_setitem(self, key, value)

    def __delitem__(self, key, dict_delitem=dict.__delitem__):
        ‘od.__delitem__(y) <==> del od[y]‘
        # Deleting an existing item uses self.__map to find the link which gets
        # removed by updating the links in the predecessor and successor nodes.
        dict_delitem(self, key)
        link = self.__map.pop(key)
        link_prev = link.prev
        link_next = link.next
        link_prev.next = link_next
        link_next.prev = link_prev
        link.prev = None
        link.next = None

    def __iter__(self):
        ‘od.__iter__() <==> iter(od)‘
        # Traverse the linked list in order.
        root = self.__root
        curr = root.next
        while curr is not root:
            yield curr.key
            curr = curr.next

    def __reversed__(self):
        ‘od.__reversed__() <==> reversed(od)‘
        # Traverse the linked list in reverse order.
        root = self.__root
        curr = root.prev
        while curr is not root:
            yield curr.key
            curr = curr.prev

    def clear(self):
        ‘od.clear() -> None.  Remove all items from od.‘
        root = self.__root
        root.prev = root.next = root
        self.__map.clear()
        dict.clear(self)

    def popitem(self, last=True):
        ‘‘‘od.popitem() -> (k, v), return and remove a (key, value) pair.
        Pairs are returned in LIFO order if last is true or FIFO order if false.

        ‘‘‘
        if not self:
            raise KeyError(‘dictionary is empty‘)
        root = self.__root
        if last:
            link = root.prev
            link_prev = link.prev
            link_prev.next = root
            root.prev = link_prev
        else:
            link = root.next
            link_next = link.next
            root.next = link_next
            link_next.prev = root
        key = link.key
        del self.__map[key]
        value = dict.pop(self, key)
        return key, value

    def move_to_end(self, key, last=True):
        ‘‘‘Move an existing element to the end (or beginning if last==False).

        Raises KeyError if the element does not exist.
        When last=True, acts like a fast version of self[key]=self.pop(key).

        ‘‘‘
        link = self.__map[key]
        link_prev = link.prev
        link_next = link.next
        soft_link = link_next.prev
        link_prev.next = link_next
        link_next.prev = link_prev
        root = self.__root
        if last:
            last = root.prev
            link.prev = last
            link.next = root
            root.prev = soft_link
            last.next = link
        else:
            first = root.next
            link.prev = root
            link.next = first
            first.prev = soft_link
            root.next = link

    def __sizeof__(self):
        sizeof = _sys.getsizeof
        n = len(self) + 1                       # number of links including root
        size = sizeof(self.__dict__)            # instance dictionary
        size += sizeof(self.__map) * 2          # internal dict and inherited dict
        size += sizeof(self.__hardroot) * n     # link objects
        size += sizeof(self.__root) * n         # proxy objects
        return size

    update = __update = MutableMapping.update

    def keys(self):
        "D.keys() -> a set-like object providing a view on D‘s keys"
        return _OrderedDictKeysView(self)

    def items(self):
        "D.items() -> a set-like object providing a view on D‘s items"
        return _OrderedDictItemsView(self)

    def values(self):
        "D.values() -> an object providing a view on D‘s values"
        return _OrderedDictValuesView(self)

    __ne__ = MutableMapping.__ne__

    __marker = object()

    def pop(self, key, default=__marker):
        ‘‘‘od.pop(k[,d]) -> v, remove specified key and return the corresponding
        value.  If key is not found, d is returned if given, otherwise KeyError
        is raised.

        ‘‘‘
        if key in self:
            result = self[key]
            del self[key]
            return result
        if default is self.__marker:
            raise KeyError(key)
        return default

    def setdefault(self, key, default=None):
        ‘od.setdefault(k[,d]) -> od.get(k,d), also set od[k]=d if k not in od‘
        if key in self:
            return self[key]
        self[key] = default
        return default

    @_recursive_repr()
    def __repr__(self):
        ‘od.__repr__() <==> repr(od)‘
        if not self:
            return ‘%s()‘ % (self.__class__.__name__,)
        return ‘%s(%r)‘ % (self.__class__.__name__, list(self.items()))

    def __reduce__(self):
        ‘Return state information for pickling‘
        inst_dict = vars(self).copy()
        for k in vars(OrderedDict()):
            inst_dict.pop(k, None)
        return self.__class__, (), inst_dict or None, None, iter(self.items())

    def copy(self):
        ‘od.copy() -> a shallow copy of od‘
        return self.__class__(self)

    @classmethod
    def fromkeys(cls, iterable, value=None):
        ‘‘‘OD.fromkeys(S[, v]) -> New ordered dictionary with keys from S.
        If not specified, the value defaults to None.

        ‘‘‘
        self = cls()
        for key in iterable:
            self[key] = value
        return self

    def __eq__(self, other):
        ‘‘‘od.__eq__(y) <==> od==y.  Comparison to another OD is order-sensitive
        while comparison to a regular mapping is order-insensitive.

        ‘‘‘
        if isinstance(other, OrderedDict):
            return dict.__eq__(self, other) and all(map(_eq, self, other))
        return dict.__eq__(self, other)

try:
    from _collections import OrderedDict
except ImportError:
    # Leave the pure Python version in place.
    pass

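Examples:

OrderedDict supports essentially all of the dict methods, such as keys(), values(), and clear().

Note that because OrderedDict is ordered, some methods are order-aware. For example, popitem() accepts a last argument that selects which end the pair is removed from.

OrderedDict also adds a move_to_end method.

import collections

# Create an ordered dictionary
dic = collections.OrderedDict()
dic['name'] = 'winter'
dic['age'] = 18
dic['gender'] = 'male'

print(dic)                         # Result: OrderedDict([('name', 'winter'), ('age', 18), ('gender', 'male')])

# Move a key/value pair to the end
dic.move_to_end('name')
print(dic)                         # Result: OrderedDict([('age', 18), ('gender', 'male'), ('name', 'winter')])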
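A short sketch of the order-aware methods shown in the source above: popitem with its last argument, move_to_end with last=False, and order-sensitive equality. The variable names are only for illustration.

```python
from collections import OrderedDict

od = OrderedDict([('name', 'winter'), ('age', 18), ('gender', 'male')])

# move_to_end(key, last=False) moves the key to the front instead of the back
od.move_to_end('gender', last=False)
print(list(od))                       # ['gender', 'name', 'age']

# popitem() removes from the end (LIFO); popitem(last=False) removes from the front (FIFO)
last_item = od.popitem()
first_item = od.popitem(last=False)
print(last_item)                      # ('age', 18)
print(first_item)                     # ('gender', 'male')
print(list(od.items()))               # [('name', 'winter')]

# Comparing two OrderedDicts is order-sensitive; comparing with a plain dict is not
a = OrderedDict([('x', 1), ('y', 2)])
b = OrderedDict([('y', 2), ('x', 1)])
print(a == b)                         # False
print(a == dict(b))                   # True
```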

3. defaultdict

A dictionary with a default factory: when a missing key is accessed, the factory is called to create its value. This is very useful. For example:

people = [['male', 'winter'], ['female', 'elly'], ['male', 'frank'], ['female', 'emma']]
# Group by gender: put all men under 'male' and all women under 'female'
gender_sort = {}

for info in people:
    if info[0] in gender_sort:
        gender_sort[info[0]].append(info[1])
    else:
        gender_sort[info[0]] = [info[1]]

print(gender_sort)                              # Result: {'male': ['winter', 'frank'], 'female': ['elly', 'emma']}

With defaultdict this becomes much simpler:

import collections

people = [['male', 'winter'], ['female', 'elly'], ['male', 'frank'], ['female', 'emma']]

gender_sort = collections.defaultdict(list)
for info in people:
    gender_sort[info[0]].append(info[1])

print(gender_sort)      # Result: defaultdict(<class 'list'>, {'male': ['winter', 'frank'], 'female': ['elly', 'emma']})

This is the most common use of defaultdict.
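The factory does not have to be list. A minimal sketch with int as the factory, plus one behavior worth knowing: indexing a missing key inserts it, while get() does not. The sample data is made up for illustration.

```python
from collections import defaultdict

# int() returns 0, so missing keys start counting from zero
word_count = defaultdict(int)
for word in ['a', 'b', 'a', 'c', 'a']:
    word_count[word] += 1
print(dict(word_count))        # {'a': 3, 'b': 1, 'c': 1}

groups = defaultdict(list)
print('x' in groups)           # False
groups['x']                    # indexing a missing key calls the factory and stores the result
print('x' in groups)           # True

# get() does not call the factory, so nothing is inserted
print(groups.get('y'))         # None
print('y' in groups)           # False
```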

4. namedtuple

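A named tuple gives each element of a tuple a name, so that elements can be accessed by name. This improves readability, especially for things like coordinates or the width and height of an HTML tag. In that respect it is a little like a dictionary.

Source code: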

def namedtuple(typename, field_names, *, verbose=False, rename=False, module=None):
    """Returns a new subclass of tuple with named fields.

    >>> Point = namedtuple(‘Point‘, [‘x‘, ‘y‘])
    >>> Point.__doc__                   # docstring for the new class
    ‘Point(x, y)‘
    >>> p = Point(11, y=22)             # instantiate with positional args or keywords
    >>> p[0] + p[1]                     # indexable like a plain tuple
    33
    >>> x, y = p                        # unpack like a regular tuple
    >>> x, y
    (11, 22)
    >>> p.x + p.y                       # fields also accessible by name
    33
    >>> d = p._asdict()                 # convert to a dictionary
    >>> d[‘x‘]
    11
    >>> Point(**d)                      # convert from a dictionary
    Point(x=11, y=22)
    >>> p._replace(x=100)               # _replace() is like str.replace() but targets named fields
    Point(x=100, y=22)

    """

    # Validate the field names.  At the user‘s option, either generate an error
    # message or automatically replace the field name with a valid name.
    if isinstance(field_names, str):
        field_names = field_names.replace(‘,‘, ‘ ‘).split()
    field_names = list(map(str, field_names))
    typename = str(typename)
    if rename:
        seen = set()
        for index, name in enumerate(field_names):
            if (not name.isidentifier()
                or _iskeyword(name)
                or name.startswith(‘_‘)
                or name in seen):
                field_names[index] = ‘_%d‘ % index
            seen.add(name)
    for name in [typename] + field_names:
        if type(name) is not str:
            raise TypeError(‘Type names and field names must be strings‘)
        if not name.isidentifier():
            raise ValueError(‘Type names and field names must be valid ‘
                             ‘identifiers: %r‘ % name)
        if _iskeyword(name):
            raise ValueError(‘Type names and field names cannot be a ‘
                             ‘keyword: %r‘ % name)
    seen = set()
    for name in field_names:
        if name.startswith(‘_‘) and not rename:
            raise ValueError(‘Field names cannot start with an underscore: ‘
                             ‘%r‘ % name)
        if name in seen:
            raise ValueError(‘Encountered duplicate field name: %r‘ % name)
        seen.add(name)

    # Fill-in the class template
    class_definition = _class_template.format(
        typename = typename,
        field_names = tuple(field_names),
        num_fields = len(field_names),
        arg_list = repr(tuple(field_names)).replace("‘", "")[1:-1],
        repr_fmt = ‘, ‘.join(_repr_template.format(name=name)
                             for name in field_names),
        field_defs = ‘\n‘.join(_field_template.format(index=index, name=name)
                               for index, name in enumerate(field_names))
    )

    # Execute the template string in a temporary namespace and support
    # tracing utilities by setting a value for frame.f_globals[‘__name__‘]
    namespace = dict(__name__=‘namedtuple_%s‘ % typename)
    exec(class_definition, namespace)
    result = namespace[typename]
    result._source = class_definition
    if verbose:
        print(result._source)

    # For pickling to work, the __module__ variable needs to be set to the frame
    # where the named tuple is created.  Bypass this step in environments where
    # sys._getframe is not defined (Jython for example) or sys._getframe is not
    # defined for arguments greater than 0 (IronPython), or where the user has
    # specified a particular module.
    if module is None:
        try:
            module = _sys._getframe(1).f_globals.get(‘__name__‘, ‘__main__‘)
        except (AttributeError, ValueError):
            pass
    if module is not None:
        result.__module__ = module

    return result

Examples:

import collections

position_module = collections.namedtuple('position', ['x', 'y', 'z'])   # 'position' is the type name; it appears in the repr, just as 'OrderedDict' does in OrderedDict([('age', 18), ('gender', 'male'), ('name', 'winter')]) above

a_position = position_module(3, 5, 7)
print(a_position)                                   # Result: position(x=3, y=5, z=7)
print(a_position.x, a_position.y, a_position.z)     # Result: 3 5 7

A more practical example:

import collections

login_user = [
    (r'http://www.baidu.com', 'usr1', 'pwd1'),
    (r'http://www.youdao.com', 'usr2', 'pwd2'),
    (r'http://mail.126.com', 'usr3', 'pwd3')
]

page_info = collections.namedtuple('login_info', ['url', 'username', 'password'])
for user in login_user:
    x = page_info(*user)
    print(x)

Result:

login_info(url='http://www.baidu.com', username='usr1', password='pwd1')
login_info(url='http://www.youdao.com', username='usr2', password='pwd2')
login_info(url='http://mail.126.com', username='usr3', password='pwd3')
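The docstring in the source above also mentions _asdict() and _replace(). A brief sketch of those helpers together with _make() and _fields; the Position type here is just an example name.

```python
from collections import namedtuple

Position = namedtuple('Position', ['x', 'y', 'z'])
p = Position(3, 5, 7)

# Unpacks like a regular tuple
x, y, z = p

# _replace returns a new tuple; the original is unchanged
moved = p._replace(z=0)
print(moved)                     # Position(x=3, y=5, z=0)
print(p)                         # Position(x=3, y=5, z=7)

# _asdict converts to a mapping; dict() makes the result a plain dict on any version
as_dict = dict(p._asdict())
print(as_dict)                   # {'x': 3, 'y': 5, 'z': 7}

# _make builds an instance from any iterable
from_list = Position._make([1, 2, 3])
print(from_list)                 # Position(x=1, y=2, z=3)
print(Position._fields)          # ('x', 'y', 'z')

# Fields are read-only, like the elements of any tuple
try:
    p.x = 10
except AttributeError:
    read_only = True
```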

5. deque

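deque is short for double-ended queue.

Queues are usually discussed together with stacks: a queue is FIFO (first in, first out), while a stack is LIFO (last in, first out).

Queues come in two kinds: one-way queues (items go in at one end and come out at the other) and double-ended queues (items can go in and come out at both ends).

In Python, the one-way queue is queue.Queue.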
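A minimal sketch of working at both ends of a deque, including rotate and the maxlen argument; the values are arbitrary.

```python
from collections import deque

d = deque([1, 2, 3])
d.append(4)            # add on the right: deque([1, 2, 3, 4])
d.appendleft(0)        # add on the left:  deque([0, 1, 2, 3, 4])

right = d.pop()        # remove from the right, returns 4
left = d.popleft()     # remove from the left, returns 0
print(d)               # deque([1, 2, 3])

d.rotate(1)            # rotate one step to the right
print(d)               # deque([3, 1, 2])

d.extendleft([8, 9])   # items are added one at a time on the left, so their order is reversed
print(d)               # deque([9, 8, 3, 1, 2])

# With maxlen set, appending to a full deque discards items from the opposite end
recent = deque(maxlen=3)
for i in range(5):
    recent.append(i)
print(recent)          # deque([2, 3, 4], maxlen=3)
```

Used with append() and popleft() a deque behaves as a FIFO queue; used with append() and pop() it behaves as a stack.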

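Source code (deque is implemented in C, so what follows is an auto-generated stub showing only signatures and docstrings):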

class deque(object):
    """
    deque([iterable[, maxlen]]) --> deque object

    A list-like sequence optimized for data accesses near its endpoints.
    """
    def append(self, *args, **kwargs): # real signature unknown
        """ Add an element to the right side of the deque. """
        pass

    def appendleft(self, *args, **kwargs): # real signature unknown
        """ Add an element to the left side of the deque. """
        pass

    def clear(self, *args, **kwargs): # real signature unknown
        """ Remove all elements from the deque. """
        pass

    def copy(self, *args, **kwargs): # real signature unknown
        """ Return a shallow copy of a deque. """
        pass

    def count(self, value): # real signature unknown; restored from __doc__
        """ D.count(value) -> integer -- return number of occurrences of value """
        return 0

    def extend(self, *args, **kwargs): # real signature unknown
        """ Extend the right side of the deque with elements from the iterable """
        pass

    def extendleft(self, *args, **kwargs): # real signature unknown
        """ Extend the left side of the deque with elements from the iterable """
        pass

    def index(self, value, start=None, stop=None): # real signature unknown; restored from __doc__
        """
        D.index(value, [start, [stop]]) -> integer -- return first index of value.
        Raises ValueError if the value is not present.
        """
        return 0

    def insert(self, index, p_object): # real signature unknown; restored from __doc__
        """ D.insert(index, object) -- insert object before index """
        pass

    def pop(self, *args, **kwargs): # real signature unknown
        """ Remove and return the rightmost element. """
        pass

    def popleft(self, *args, **kwargs): # real signature unknown
        """ Remove and return the leftmost element. """
        pass

    def remove(self, value): # real signature unknown; restored from __doc__
        """ D.remove(value) -- remove first occurrence of value. """
        pass

    def reverse(self): # real signature unknown; restored from __doc__
        """ D.reverse() -- reverse *IN PLACE* """
        pass

    def rotate(self, *args, **kwargs): # real signature unknown
        """ Rotate the deque n steps to the right (default n=1).  If n is negative, rotates left. """
        pass

    def __add__(self, *args, **kwargs): # real signature unknown
        """ Return self+value. """
        pass

    def __bool__(self, *args, **kwargs): # real signature unknown
        """ self != 0 """
        pass

    def __contains__(self, *args, **kwargs): # real signature unknown
        """ Return key in self. """
        pass

    def __copy__(self, *args, **kwargs): # real signature unknown
        """ Return a shallow copy of a deque. """
        pass

    def __delitem__(self, *args, **kwargs): # real signature unknown
        """ Delete self[key]. """
        pass

    def __eq__(self, *args, **kwargs): # real signature unknown
        """ Return self==value. """
        pass

    def __getattribute__(self, *args, **kwargs): # real signature unknown
        """ Return getattr(self, name). """
        pass

    def __getitem__(self, *args, **kwargs): # real signature unknown
        """ Return self[key]. """
        pass

    def __ge__(self, *args, **kwargs): # real signature unknown
        """ Return self>=value. """
        pass

    def __gt__(self, *args, **kwargs): # real signature unknown
        """ Return self>value. """
        pass

    def __iadd__(self, *args, **kwargs): # real signature unknown
        """ Implement self+=value. """
        pass

    def __imul__(self, *args, **kwargs): # real signature unknown
        """ Implement self*=value. """
        pass

    def __init__(self, iterable=(), maxlen=None): # known case of _collections.deque.__init__
        """
        deque([iterable[, maxlen]]) --> deque object

        A list-like sequence optimized for data accesses near its endpoints.
        # (copied from class doc)
        """
        pass

    def __iter__(self, *args, **kwargs): # real signature unknown
        """ Implement iter(self). """
        pass

    def __len__(self, *args, **kwargs): # real signature unknown
        """ Return len(self). """
        pass

    def __le__(self, *args, **kwargs): # real signature unknown
        """ Return self<=value. """
        pass

    def __lt__(self, *args, **kwargs): # real signature unknown
        """ Return self<value. """
        pass

    def __mul__(self, *args, **kwargs): # real signature unknown
        """ Return self*value. """
        pass

    @staticmethod # known case of __new__
    def __new__(*args, **kwargs): # real signature unknown
        """ Create and return a new object.  See help(type) for accurate signature. """
        pass

    def __ne__(self, *args, **kwargs): # real signature unknown
        """ Return self!=value. """
        pass

    def __reduce__(self, *args, **kwargs): # real signature unknown
        """ Return state information for pickling. """
        pass

    def __repr__(self, *args, **kwargs): # real signature unknown
        """ Return repr(self). """
        pass

    def __reversed__(self): # real signature unknown; restored from __doc__
        """ D.__reversed__() -- return a reverse iterator over the deque """
        pass

    def __rmul__(self, *args, **kwargs): # real signature unknown
        """ Return self*value. """
        pass

    def __setitem__(self, *args, **kwargs): # real signature unknown
        """ Set self[key] to value. """
        pass

    def __sizeof__(self): # real signature unknown; restored from __doc__
        """ D.__sizeof__() -- size of D in memory, in bytes """
        pass

    maxlen = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
    """maximum size of a deque or None if unbounded"""

    __hash__ = None
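
The stub above ends with a `maxlen` property and `__hash__ = None`. The following is a minimal sketch of what those two lines mean in practice; the variable names (`recent`, `hashable`) are illustrative and not from the original source.

```python
from collections import deque

# A bounded deque: once it is full, adding on one end discards from the other.
recent = deque(maxlen=3)
for item in range(5):
    recent.append(item)

print(recent)          # deque([2, 3, 4], maxlen=3)
print(recent.maxlen)   # 3

# appendleft on a full deque drops the rightmost item instead.
recent.appendleft(99)
print(recent)          # deque([99, 2, 3], maxlen=3)

# __hash__ = None means a deque cannot be a dict key or a set member.
try:
    hash(recent)
    hashable = True
except TypeError:
    hashable = False
print(hashable)        # False
```

The setter lambda shown in the stub appears to be a placeholder generated by the IDE; in CPython the `maxlen` attribute is read-only, so the bound is fixed when the deque is created.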

Example:

Many deque methods behave the same as their list counterparts, such as count and index. The ones specific to deque are shown here:

import collections

raw = [1, 2, 3]
d = collections.deque(raw)
print(d)                    # deque([1, 2, 3])

# Append on the right
d.append(4)
print(d)                    # deque([1, 2, 3, 4])
# Append on the left
d.appendleft(0)
print(d)                    # deque([0, 1, 2, 3, 4])

# Extend on the right
d.extend([5, 6, 7])
print(d)                    # deque([0, 1, 2, 3, 4, 5, 6, 7])
# Extend on the left (items are pushed one by one, so their order is reversed)
d.extendleft([-3, -2, -1])
print(d)                    # deque([-1, -2, -3, 0, 1, 2, 3, 4, 5, 6, 7])

# Pop from the right
r_pop = d.pop()
print(r_pop)                # 7
print(d)                    # deque([-1, -2, -3, 0, 1, 2, 3, 4, 5, 6])
# Pop from the left
l_pop = d.popleft()
print(l_pop)                # -1
print(d)                    # deque([-2, -3, 0, 1, 2, 3, 4, 5, 6])

# Move the rightmost n elements to the left end
print(d)                    # before: deque([-2, -3, 0, 1, 2, 3, 4, 5, 6])
d.rotate(3)
print(d)                    # after rotate: deque([4, 5, 6, -2, -3, 0, 1, 2, 3])
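
The rotate call can be read as repeated pop/appendleft steps, and a negative argument rotates to the left, as the docstring states. The sketch below checks that reading. `rotate_by_hand` is a hypothetical helper written for illustration, not part of the collections module.

```python
from collections import deque

def rotate_by_hand(d, n):
    """Hypothetical helper: rotate a copy of d by n steps using only pop/append calls."""
    result = deque(d)
    if not result:
        return result
    for _ in range(abs(n)):
        if n > 0:
            result.appendleft(result.pop())     # right end moves to the left end
        else:
            result.append(result.popleft())     # left end moves to the right end
    return result

d = deque([-2, -3, 0, 1, 2, 3, 4, 5, 6])

right = deque(d)
right.rotate(3)
print(right == rotate_by_hand(d, 3))    # True

left = deque(d)
left.rotate(-2)
print(left)                             # deque([0, 1, 2, 3, 4, 5, 6, -2, -3])
print(left == rotate_by_hand(d, -2))    # True

# extendleft pushes items one at a time, so they end up in reverse order.
e = deque([0])
e.extendleft([-3, -2, -1])
print(e)                                # deque([-1, -2, -3, 0])
```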

Becoming fluent with the collections module helps us write more Pythonic code.

Time: 2024-11-06 03:52:39
