'''pandas Index objects are responsible for holding the axis labels, for example those of a Series'''
import numpy as np
import pandas as pd
from pandas import Series, DataFrame

obj = Series(range(3), index=['a', 'b', 'c'])
index = obj.index
index
index[1:]

'''an Index is immutable: item assignment raises TypeError'''
# index[1] = 'd'

'''an Index can also be created explicitly with pd.Index and passed to a Series'''
index = pd.Index(np.arange(3))
obj2 = Series([1.5, -2.5, 0], index=index)
obj2

'''checking an index: use `is` to test identity and `in` to test membership'''
obj2.index is index
'Ohio' in frame3.columns  # frame3 is not created in this section; it must be defined beforehand
'2002' in obj2.index

'''Essential functionality'''
'''reindexing'''
obj = Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])
obj2

'''fill the missing data'''
obj.reindex(['a', 'b', 'c', 'd', 'e'], fill_value=0.0)

'''for ordered data, fill the missing values with method='ffill' (forward fill)'''
obj3 = Series(['blue', 'green', 'black'], index=[0, 2, 4])
obj3.reindex(np.arange(5), method='ffill')

'''reindex can alter the rows, the columns, or both in a DataFrame'''
frame = DataFrame(np.arange(9).reshape(3, 3), index=['a', 'b', 'c'], columns=['Ohio', 'Texas', 'California'])
frame.reindex(['a', 'b', 'c', 'd'])
frame.reindex(columns=['Ohio', 'Texas', 'California', 'NewYork'])

months = ['APR', 'MAY', 'JUN', 'JUL', 'AUG']
frame.reindex(columns=months)
label = ['a', 'b', 'c', 'd', 'e']
states = ['Ohio', 'Texas', 'California', 'NewYork']

'''here the ffill method fills along the row axis only'''
frame.reindex(label, method='ffill')

'''select a sub-matrix; .ix was removed in pandas 1.0 and .loc rejects missing labels, so reindex is used'''
frame.reindex(index=['a', 'b', 'd'], columns=states)

'''dropping entries from an axis'''
obj = Series(np.arange(5.), index=['a', 'b', 'c', 'd', 'e'])
new_obj = obj.drop('c')
new_obj

'''drop from a DataFrame'''
data = DataFrame(np.arange(16).reshape(4, 4), index=['Ohio', 'Colorado', 'Utah', 'New York'], columns=['one', 'two', 'three', 'four'])
'''drop from the index'''
data.drop(['Colorado', 'Utah'])
'''drop from the columns'''
data.drop('two', axis=1)

'''indexing, selection, and filtering'''
obj = Series(np.arange(4.), index=['a', 'b', 'c', 'd'])
'''a Series can be indexed like an array: by integer position or by index label, for one value or several'''
obj['b']
obj.iloc[1]
obj[1:2]
obj[['a', 'c', 'd']]
obj.iloc[[1, 3]]
obj[obj < 2]

obj['b':'c'] = 5

data = DataFrame(np.arange(16).reshape(4, 4), index=['Ohio', 'Colorado', 'Utah', 'New York'], columns=['one', 'two', 'three', 'four'])
'''indexing a DataFrame selects columns, but only along a single dimension'''
data['two']
data[['three', 'one']]
data.loc['Ohio']
data[data['three'] > 5]
data[:2]

'''set every value in data that is less than 5 to 0'''
data[data < 5] = 0

'''select values by location: .loc for labels, .iloc for integer positions'''
data.loc['Colorado', 'two']
data.loc['Colorado', ['two', 'three']]
data.loc[['Colorado', 'Utah'], ['three', 'four']]
data.iloc[2]
data.loc[:'Utah', 'two']
data['two'].iloc[:2]
data.loc[data.three > 5, data.columns[:3]]

'''select rows by label and reorder the columns by position'''
data.loc[['Colorado', 'Utah'], data.columns[[3, 0, 1]]]

'''arithmetic and data alignment'''
s1 = Series([7.3, -2.5, 3.4, 1.5], index=['a', 'c', 'd', 'e'])
s2 = Series([-2.1, 3.6, -1.5, 4, 3.1], index=['a', 'c', 'e', 'f', 'g'])
'''labels that do not overlap yield NA'''
s1 + s2

'''DataFrame'''
df1 = DataFrame(np.arange(9.).reshape(3, 3), columns=list('bcd'), index=['Ohio', 'Texas', 'Colorado'])
df2 = DataFrame(np.arange(12.).reshape(4, 3), columns=list('bde'), index=['Utah', 'Ohio', 'Texas', 'Oregon'])

'''if a value is missing in either operand, the result is missing'''
df1 + df2
'''with fill_value=0, a value missing in only one operand is treated as 0'''
df1.add(df2, fill_value=0)
'''reindex'''
df1.reindex(columns=df2.columns, fill_value=0)

df1 = DataFrame(np.arange(12.).reshape(3, 4), columns=list('abcd'))
df2 = DataFrame(np.arange(20.).reshape(4, 5), columns=list('abcde'))
df1.add(df2, fill_value=0)
df1.mul(df2, fill_value=0)
df1.div(df2, fill_value=0)
df1.sub(df2, fill_value=0)
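The effect of method='ffill' during reindexing is easiest to see next to a plain reindex. The following is a minimal sketch that reuses the obj3 values from the notes; the names `missing` and `filled` are illustrative only and do not appear in the original.

```python
import numpy as np
import pandas as pd

# Same data as obj3 in the notes: labels 1 and 3 are absent
obj3 = pd.Series(['blue', 'green', 'black'], index=[0, 2, 4])

# Plain reindex: new labels get NaN
missing = obj3.reindex(np.arange(5))

# Forward fill: each new label takes the last earlier value
filled = obj3.reindex(np.arange(5), method='ffill')

print(missing)
print(filled)
```

Forward filling requires the original index to be sorted (monotonic), which is why it suits ordered data such as time series.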
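The notes slice a Series both by position (1:2) and by label ('b':'c'). One difference is worth checking: a label slice includes its end point, while a positional slice does not. A minimal sketch, assuming the same four-element Series as above; `by_label` and `by_position` are illustrative names.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(4.), index=['a', 'b', 'c', 'd'])

# Label slice: both 'b' and 'c' are included
by_label = obj.loc['b':'c']

# Positional slice: position 2 is excluded, as with Python lists
by_position = obj.iloc[1:2]

print(by_label)
print(by_position)
```

Using .loc and .iloc explicitly makes it clear which of the two rules applies.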
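The alignment behaviour can be checked on the s1 and s2 values used in the notes. This is a minimal sketch rather than a full treatment; the names `plain` and `filled` are illustrative only.

```python
import pandas as pd

s1 = pd.Series([7.3, -2.5, 3.4, 1.5], index=['a', 'c', 'd', 'e'])
s2 = pd.Series([-2.1, 3.6, -1.5, 4, 3.1], index=['a', 'c', 'e', 'f', 'g'])

# The + operator aligns on the union of labels;
# labels present in only one Series give NaN
plain = s1 + s2

# add(..., fill_value=0) treats a value missing on one side as 0
filled = s1.add(s2, fill_value=0)

print(plain)
print(filled)
```

Here 'd' exists only in s1, and 'f' and 'g' only in s2, so `plain` has three missing values while `filled` has none.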
Time: 2024-11-07 06:05:00