Data manipulation in Python (module 6)

1. Pandas plotting

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
%matplotlib notebook
# on matplotlib >= 3.6 this style is named "seaborn-v0_8-colorblind"
plt.style.use("seaborn-colorblind")

np.random.seed(123)

# cumsum: running total, so element i is the sum of elements 0..i (a random walk)
df = pd.DataFrame({'A': np.random.randn(365).cumsum(0),
                   'B': np.random.randn(365).cumsum(0) + 20,
                   'C': np.random.randn(365).cumsum(0) - 20},
                  index=pd.date_range('1/1/2017', periods=365))
# create a scatter plot of columns 'A' and 'C', with color (c) and size (s) varying with column 'B'
df.plot.scatter('A', 'C', c='B', s=df['B'], colormap='viridis')
#df.plot.box();
#df.plot.hist(alpha=0.7);
#df.plot.kde();

# scatter plots between each pair of variables, with histograms along the diagonal, to reveal obvious patterns
#pd.plotting.scatter_matrix(iris);

# visualize high-dimensional multivariate data: each variable corresponds to an equally spaced parallel vertical line
#pd.plotting.parallel_coordinates(iris, 'Name');
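
The cumsum call that builds columns A, B and C turns independent random steps into a random walk: each output element is the sum of all input elements up to that position. A minimal sketch with a small hand-picked array (the values are illustrative only and do not come from the data above):

```python
import numpy as np

# Hypothetical step sizes, chosen so the running total is easy to check by hand
steps = np.array([1, 2, 3, 4])

# Running total: element i is steps[0] + ... + steps[i]
walk = steps.cumsum()
print(walk.tolist())  # [1, 3, 6, 10]

# Adding a constant afterwards shifts the whole walk, as done for column 'B' (+20)
shifted = walk + 20
print(shifted.tolist())  # [21, 23, 26, 30]
```

Because the offset is added after the cumulative sum, it moves the whole series up or down without changing its shape.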


2. Seaborn

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib notebook

np.random.seed(1234)

v1 = pd.Series(np.random.normal(0,10,1000), name='v1')
v2 = pd.Series(2*v1 + np.random.normal(60,15,1000), name='v2')

# plot a kernel density estimate over a stacked histogram
plt.figure()
# density=True replaces the old normed=True argument, which newer matplotlib no longer accepts
plt.hist([v1, v2], histtype='barstacked', density=True);
v3 = np.concatenate((v1,v2))
sns.kdeplot(v3);

plt.figure()
# we can pass keyword arguments for each individual component of the plot
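# (distplot is deprecated since seaborn 0.11; sns.histplot(v3, kde=True, stat='density') is the newer counterpart)
sns.distplot(v3, hist_kws={'color': 'Teal'}, kde_kws={'color': 'Navy'});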

plt.figure()
# sns.jointplot(x=v1, y=v2, alpha=0.4);

# grid = sns.jointplot(x=v1, y=v2, alpha=0.4);
# grid.ax_joint.set_aspect('equal')

# sns.jointplot(x=v1, y=v2, kind='hex');

# set the seaborn style for all the following plots
# sns.set_style('white')
# space sets the gap between the joint and marginal axes
# sns.jointplot(x=v1, y=v2, kind='kde', space=0);
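
The kernel density estimate drawn by sns.kdeplot can be pictured as the average of small Gaussian bumps, one centred on each sample. The sketch below is a simplified illustration of that idea and not seaborn's actual implementation: the function name gaussian_kde_sketch, the fixed bandwidth of 3.0 and the grid limits are all assumptions made for this example (seaborn chooses the bandwidth automatically).

```python
import numpy as np

def gaussian_kde_sketch(samples, grid, bandwidth):
    """Average of Gaussian bumps centred on each sample, evaluated on grid."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardized distance from every grid point to every sample,
    # shape (len(grid), len(samples))
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Mean over samples, rescaled by the bandwidth so the result is a density
    return kernels.mean(axis=1) / bandwidth

rng = np.random.default_rng(1234)
samples = rng.normal(0, 10, 1000)      # same distribution as v1 above
grid = np.linspace(-60, 60, 601)       # hypothetical evaluation grid
density = gaussian_kde_sketch(samples, grid, bandwidth=3.0)

# A density should integrate to roughly 1 over a wide enough grid
area = np.sum(density) * (grid[1] - grid[0])
print(round(area, 2))
```

A smaller bandwidth gives a bumpier curve that follows individual samples, while a larger one gives a smoother curve that can hide real structure.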

Output: joint plots

Second example

iris = pd.read_csv('iris.csv')
# the size argument was renamed to height in seaborn 0.9
sns.pairplot(iris, hue='Name', diag_kind='kde', height=2);

Third example

iris = pd.read_csv('iris.csv')
plt.figure(figsize=(8,6))
plt.subplot(121)
# recent seaborn versions require x and y to be passed as keywords
sns.swarmplot(x='Name', y='PetalLength', data=iris);
plt.subplot(122)
sns.violinplot(x='Name', y='PetalLength', data=iris);


Time: 2024-07-30 13:49:31
