I. Encoding
Recommended reading: "The Past and Present of Character Encoding" (《字符编码的前世今生》): http://tgideas.qq.com/webplat/info/news_version3/804/808/811/m579/201307/218730.shtml
1. Introduction to common encodings
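- GB2312: a Chinese national standard for Simplified Chinese characters, intended for information exchange between systems such as Chinese character processing and Chinese character communication.
- GBK: one of the Chinese character encoding standards. It is an internal-code extension specification built on top of GB2312-80 and uses double-byte encoding, i.e. each Chinese character occupies 2 bytes (ASCII characters still occupy 1 byte).
- ASCII: a unified specification of the mapping between English characters and binary values.
- Unicode: a character set that aims to cover every character in the world. It assigns each character a code point but does not itself prescribe how those code points are stored as bytes.
- UTF-8: short for Unicode Transformation Format - 8 bit. UTF-8 is one implementation of Unicode. It is a variable-length encoding that uses 1 to 4 bytes per character, depending on the symbol. Common Chinese characters occupy 3 bytes each.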
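The byte counts quoted in the list above can be checked directly. The following is a minimal sketch, assuming Python 3 (where str.encode returns bytes); the sample characters are chosen for illustration only.

```python
# Encode the same characters under several encodings and compare byte lengths.
text_cn = "汉"   # a common Chinese character (U+6C49), chosen for illustration
text_en = "A"    # an ASCII letter

lengths = {
    "gb2312": len(text_cn.encode("gb2312")),
    "gbk": len(text_cn.encode("gbk")),
    "utf-8": len(text_cn.encode("utf-8")),
}
print(lengths)  # {'gb2312': 2, 'gbk': 2, 'utf-8': 3}

# ASCII characters stay at one byte in ASCII and in UTF-8.
ascii_len = len(text_en.encode("ascii"))
utf8_ascii_len = len(text_en.encode("utf-8"))
print(ascii_len, utf8_ascii_len)  # 1 1

# Characters outside the Basic Multilingual Plane need four bytes in UTF-8.
emoji_len = len("😀".encode("utf-8"))
print(emoji_len)  # 4
```

The same character therefore has a different byte representation in each encoding, which is why the encoding must be known before the bytes can be interpreted.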
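2. Encoding conversion
Python generally represents strings internally as Unicode, while the default encoding of a string literal in code matches the encoding of the source file itself. Encoding conversions therefore normally use Unicode as the intermediate form: first decode the string from its original encoding into Unicode, then encode from Unicode into the target encoding. (The calls in this section follow Python 2 conventions, where str is a byte string; in Python 3, str is always Unicode text and decode is a method of bytes.)
- decode converts a string in some other encoding into Unicode, e.g. name.decode("GB2312") converts the GB2312-encoded string name into Unicode.
- encode converts Unicode into a string in some other encoding, e.g. name.encode("GB2312") converts the Unicode string name into GB2312.
So before converting, you must know which encoding name is in, then decode it into Unicode, and only then encode it into the encoding you need. If name is already Unicode, no decode step is needed; encode it directly into the target encoding. Note that decoding a Unicode string and encoding a str (an already-encoded byte string) are both mistakes.
To output or otherwise handle two encodings within the same text, the same conversion applies: use decode to turn the text from its original encoding into Unicode, then use encode to produce the target encoding.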
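The decode-then-encode round trip described above can be sketched as follows. This assumes Python 3, where decode is a method of bytes and encode is a method of str; the sample text and variable names are illustrative and not taken from the original.

```python
# Python 3 sketch: bytes in one encoding -> str (Unicode) -> bytes in another.
original = "编码"                       # a str, i.e. Unicode text
gb_bytes = original.encode("gb2312")    # stand-in for data received as GB2312

text = gb_bytes.decode("gb2312")        # step 1: decode the bytes to Unicode
utf8_bytes = text.encode("utf-8")       # step 2: encode Unicode to the target

print(len(gb_bytes), len(utf8_bytes))   # 4 6
round_trip_ok = utf8_bytes.decode("utf-8") == original
print(round_trip_ok)                    # True

# Decoding with the wrong codec fails, which is why the source encoding
# has to be known before converting.
try:
    gb_bytes.decode("utf-8")
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True
print(wrong_codec_failed)               # True
```

Not every wrong codec raises an error; some silently produce garbled text instead, so a successful decode is not proof that the right encoding was used.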
II. Sets
A set is an unordered collection of unique elements. The source stub documents its many methods:
class set(object):
    """
    set() -> new empty set object
    set(iterable) -> new set object

    Build an unordered collection of unique elements.
    """
    def add(self, *args, **kwargs): # real signature unknown
        """
        Add an element to a set.

        This has no effect if the element is already present.
        """
        pass

    def clear(self, *args, **kwargs): # real signature unknown
        """ Remove all elements from this set. """
        pass

    def copy(self, *args, **kwargs): # real signature unknown
        """ Return a shallow copy of a set. """
        pass

    def difference(self, *args, **kwargs): # real signature unknown
        """
        Return the difference of two or more sets as a new set.

        (i.e. all elements that are in this set but not the others.)
        """
        pass

    def difference_update(self, *args, **kwargs): # real signature unknown
        """ Remove all elements of another set from this set. """
        pass

    def discard(self, *args, **kwargs): # real signature unknown
        """
        Remove an element from a set if it is a member.

        If the element is not a member, do nothing.
        """
        pass

    def intersection(self, *args, **kwargs): # real signature unknown
        """
        Return the intersection of two sets as a new set.

        (i.e. all elements that are in both sets.)
        """
        pass

    def intersection_update(self, *args, **kwargs): # real signature unknown
        """ Update a set with the intersection of itself and another. """
        pass

    def isdisjoint(self, *args, **kwargs): # real signature unknown
        """ Return True if two sets have a null intersection. """
        pass

    def issubset(self, *args, **kwargs): # real signature unknown
        """ Report whether another set contains this set. """
        pass

    def issuperset(self, *args, **kwargs): # real signature unknown
        """ Report whether this set contains another set. """
        pass

    def pop(self, *args, **kwargs): # real signature unknown
        """
        Remove and return an arbitrary set element.
        Raises KeyError if the set is empty.
        """
        pass

    def remove(self, *args, **kwargs): # real signature unknown
        """
        Remove an element from a set; it must be a member.

        If the element is not a member, raise a KeyError.
        """
        pass

    def symmetric_difference(self, *args, **kwargs): # real signature unknown
        """
        Return the symmetric difference of two sets as a new set.

        (i.e. all elements that are in exactly one of the sets.)
        """
        pass

    def symmetric_difference_update(self, *args, **kwargs): # real signature unknown
        """ Update a set with the symmetric difference of itself and another. """
        pass

    def union(self, *args, **kwargs): # real signature unknown
        """
        Return the union of sets as a new set.

        (i.e. all elements that are in either set.)
        """
        pass

    def update(self, *args, **kwargs): # real signature unknown
        """ Update a set with the union of itself and others. """
        pass

    def __and__(self, *args, **kwargs): # real signature unknown
        """ Return self&value. """
        pass

    def __contains__(self, y): # real signature unknown; restored from __doc__
        """ x.__contains__(y) <==> y in x. """
        pass

    def __eq__(self, *args, **kwargs): # real signature unknown
        """ Return self==value. """
        pass

    def __getattribute__(self, *args, **kwargs): # real signature unknown
        """ Return getattr(self, name). """
        pass

    def __ge__(self, *args, **kwargs): # real signature unknown
        """ Return self>=value. """
        pass

    def __gt__(self, *args, **kwargs): # real signature unknown
        """ Return self>value. """
        pass

    def __iand__(self, *args, **kwargs): # real signature unknown
        """ Return self&=value. """
        pass

    def __init__(self, seq=()): # known special case of set.__init__
        """
        set() -> new empty set object
        set(iterable) -> new set object

        Build an unordered collection of unique elements.
        # (copied from class doc)
        """
        pass

    def __ior__(self, *args, **kwargs): # real signature unknown
        """ Return self|=value. """
        pass

    def __isub__(self, *args, **kwargs): # real signature unknown
        """ Return self-=value. """
        pass

    def __iter__(self, *args, **kwargs): # real signature unknown
        """ Implement iter(self). """
        pass

    def __ixor__(self, *args, **kwargs): # real signature unknown
        """ Return self^=value. """
        pass

    def __len__(self, *args, **kwargs): # real signature unknown
        """ Return len(self). """
        pass

    def __le__(self, *args, **kwargs): # real signature unknown
        """ Return self<=value. """
        pass

    def __lt__(self, *args, **kwargs): # real signature unknown
        """ Return self<value. """
        pass

    @staticmethod # known case of __new__
    def __new__(*args, **kwargs): # real signature unknown
        """ Create and return a new object.  See help(type) for accurate signature. """
        pass

    def __ne__(self, *args, **kwargs): # real signature unknown
        """ Return self!=value. """
        pass

    def __or__(self, *args, **kwargs): # real signature unknown
        """ Return self|value. """
        pass

    def __rand__(self, *args, **kwargs): # real signature unknown
        """ Return value&self. """
        pass

    def __reduce__(self, *args, **kwargs): # real signature unknown
        """ Return state information for pickling. """
        pass

    def __repr__(self, *args, **kwargs): # real signature unknown
        """ Return repr(self). """
        pass

    def __ror__(self, *args, **kwargs): # real signature unknown
        """ Return value|self. """
        pass

    def __rsub__(self, *args, **kwargs): # real signature unknown
        """ Return value-self. """
        pass

    def __rxor__(self, *args, **kwargs): # real signature unknown
        """ Return value^self. """
        pass

    def __sizeof__(self): # real signature unknown; restored from __doc__
        """ S.__sizeof__() -> size of S in memory, in bytes """
        pass

    def __sub__(self, *args, **kwargs): # real signature unknown
        """ Return self-value. """
        pass

    def __xor__(self, *args, **kwargs): # real signature unknown
        """ Return self^value. """
        pass

    __hash__ = None
Set source code (version 3.5.1)
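The docstrings for add, discard, remove and pop describe slightly different behavior when an element is already present or missing. A small sketch of those differences, assuming Python 3 and using made-up values:

```python
s = {1, 2, 3}

s.add(3)        # already present: no effect
s.add(4)        # new element is added
s.discard(99)   # absent element: silently ignored

# remove() insists that the element is a member.
try:
    s.remove(99)
    remove_raised = False
except KeyError:
    remove_raised = True

print(sorted(s))       # [1, 2, 3, 4]
print(remove_raised)   # True

# pop() removes and returns an arbitrary element; on an empty set it raises.
popped = s.pop()
print(popped in s)     # False
print(len(s))          # 3

try:
    set().pop()
    pop_raised = False
except KeyError:
    pop_raised = True
print(pop_raised)      # True
```

sorted() is used for printing because a set has no guaranteed display order.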
The following exercises walk through a few commonly used methods to reinforce understanding.
The test data is as follows:
a = {1, 2, 3, 4}
b = {2, 4, 6, 7}
2.1 Creating a set
set_1 = set()            # create an empty set
set_2 = {123, "456"}     # a set containing the elements 123 and "456"
# convert a list to a set
list_1 = [1, 2, 3, 4, 4]
set_3 = set(list_1)
# set_3 == {1, 2, 3, 4}  (the duplicate 4 is dropped)
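Two details of set creation are easy to trip over: empty braces create a dict rather than a set, and any iterable (not only a list) can be passed to set(). A short sketch, assuming Python 3, with illustrative variable names:

```python
empty_set = set()
empty_braces = {}   # this is an empty dict, not an empty set
print(type(empty_set).__name__, type(empty_braces).__name__)  # set dict

# Duplicates are dropped when building a set from a list.
unique = set([1, 2, 3, 4, 4])
print(sorted(unique))  # [1, 2, 3, 4]
print(len(unique))     # 4

# Any iterable works; a string contributes its individual characters.
letters = set("hello")
print(sorted(letters))  # ['e', 'h', 'l', 'o']
```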
2.2 set.difference and set.difference_update
a = {1, 2, 3, 4}
b = {2, 4, 6, 7}
c = a.difference(b)      # elements in a but not in b, returned as a new set
print(c)                 # out: {1, 3}
a.difference_update(b)   # remove from a, in place, every element that is also in b
print(a)                 # out: {1, 3}
set.difference
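The distinction between the two methods is that difference returns a new set and leaves the original untouched, while difference_update modifies the set in place and returns None. A minimal sketch, assuming Python 3 and reusing the test data:

```python
a = {1, 2, 3, 4}
b = {2, 4, 6, 7}

c = a.difference(b)
print(sorted(c))    # [1, 3]
print(sorted(a))    # [1, 2, 3, 4]  (a is unchanged)
same_as_operator = (a - b == c)
print(same_as_operator)  # True: the - operator is equivalent for two sets

result = a.difference_update(b)
print(result)       # None: the update happens in place
print(sorted(a))    # [1, 3]
```

Assigning the return value of difference_update back to the variable would therefore replace the set with None.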
2.3 set.symmetric_difference and set.symmetric_difference_update
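Going by the docstrings in the source stub, symmetric_difference returns the elements that are in exactly one of the two sets, and symmetric_difference_update applies the same result in place. A sketch under those assumptions, using Python 3 and the same test data:

```python
a = {1, 2, 3, 4}
b = {2, 4, 6, 7}

c = a.symmetric_difference(b)   # elements in exactly one of a and b
print(sorted(c))    # [1, 3, 6, 7]
print(sorted(a))    # [1, 2, 3, 4]  (a is unchanged)
same_as_operator = (a ^ b == c)
print(same_as_operator)  # True: the ^ operator is equivalent for two sets

a.symmetric_difference_update(b)   # modifies a in place, returns None
print(sorted(a))    # [1, 3, 6, 7]
```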
Time: 2024-11-09 16:29:59