Import the required packages:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
I. Creating Objects
Create a Series by passing a list:
s = pd.Series([1,3,5,np.nan,6,8])
>>> s
0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64
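The NaN entry is the reason the Series above ends up as float64 rather than an integer type. A minimal sketch of that behavior follows; the names `s_int` and `s_nan` are chosen here purely for illustration and do not come from the original text.

```python
import numpy as np
import pandas as pd

# Without missing values, a list of ints becomes an integer Series.
s_int = pd.Series([1, 3, 5, 6, 8])

# np.nan is a float, so including it upcasts the whole Series to float64.
s_nan = pd.Series([1, 3, 5, np.nan, 6, 8])

print(s_int.dtype)          # an integer dtype (int64 on most platforms)
print(s_nan.dtype)          # float64
print(s_nan.isna().sum())   # number of missing entries
```

If an integer column with missing values is needed, pandas also offers nullable integer dtypes, but that is outside what this section covers.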
Create a DataFrame by passing a numpy array together with a datetime index and column labels. First build the datetime index:
data = pd.date_range('20170701', periods=6)
>>> data
DatetimeIndex(['2017-07-01', '2017-07-02', '2017-07-03', '2017-07-04',
               '2017-07-05', '2017-07-06'],
              dtype='datetime64[ns]', freq='D')
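The date index is only the first ingredient; the step that actually builds the DataFrame is not shown in the text. The sketch below is one way it might look, assuming a 6x4 array of random values and column labels A to D. Both the array and the labels are assumptions made for illustration, not taken from the original.

```python
import numpy as np
import pandas as pd

# Same date index as in the text: six consecutive days from 2017-07-01.
data = pd.date_range('20170701', periods=6)

# Assumption: a 6x4 array of random values and column labels A-D.
# The original text does not say which array or labels were used.
values = np.random.randn(6, 4)
df = pd.DataFrame(values, index=data, columns=list('ABCD'))

print(df.shape)      # rows x columns
print(df.index[0])   # first date in the index
```

The number of rows in the array has to match the length of the index, and the number of columns has to match the number of labels; otherwise the constructor raises a ValueError.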
Create a DataFrame by passing a dict of objects that can be converted to series-like structures:
df2 = pd.DataFrame({'A': 1,
                    'B': pd.Timestamp('20170702'),
                    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                    'D': np.array([3] * 4, dtype='int32'),
                    'E': pd.Categorical(['test', 'train', 'test', 'train']),
                    'F': 'foo'})
>>> df2
   A          B    C  D      E    F
0  1 2017-07-02  1.0  3   test  foo
1  1 2017-07-02  1.0  3  train  foo
2  1 2017-07-02  1.0  3   test  foo
3  1 2017-07-02  1.0  3  train  foo
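A DataFrame built this way can hold a different dtype in every column, and scalar values such as 1 or 'foo' are broadcast to the length of the other columns. The following sketch rebuilds the same df2 and inspects its dtypes; it is meant as an illustration rather than part of the original walkthrough.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    'A': 1,                                                    # scalar, broadcast to 4 rows
    'B': pd.Timestamp('20170702'),                             # scalar timestamp, broadcast
    'C': pd.Series(1, index=list(range(4)), dtype='float32'),  # supplies the row index 0..3
    'D': np.array([3] * 4, dtype='int32'),
    'E': pd.Categorical(['test', 'train', 'test', 'train']),
    'F': 'foo',                                                # scalar string, broadcast
})

# One dtype per column; the exact datetime and string dtype names
# can vary between pandas versions.
print(df2.dtypes)
```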
II. Viewing Data
1. View the first and last rows of a frame
index = pd.date_range('1/1/2000', periods=8)
s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])
df = pd.DataFrame(np.random.randn(8, 3), index=index, columns=['A', 'B', 'C'])
# The values are random, so the numbers below will differ from run to run.
>>> df.head()
                   A         B         C
2000-01-01 -0.944924 -0.081706 -1.161476
2000-01-02 -0.205432  0.271903  0.668626
2000-01-03 -0.505713  1.804659 -1.232667
2000-01-04  0.048007  1.246067  0.083038
2000-01-05 -0.561152  0.987697 -0.268812
>>> df.tail(3)
                   A         B         C
2000-01-06  0.613778 -0.357461  1.649822
2000-01-07  1.045096 -1.840059  1.413085
2000-01-08 -1.481738 -0.663567  0.437833
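As the output above suggests, head() returns five rows when called without an argument, while tail(3) returns exactly the last three. The sketch below checks the row counts directly and also shows a negative argument; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

index = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 3), index=index, columns=['A', 'B', 'C'])

first_five = df.head()           # n defaults to 5
last_three = df.tail(3)          # the last 3 rows
all_but_last_two = df.head(-2)   # negative n: every row except the last 2

print(len(first_five), len(last_three), len(all_but_last_two))  # prints: 5 3 6
```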
Time: 2024-10-08 14:53:35