Data manipulation in Python (module 5)

1. Subplots

%matplotlib notebook
import matplotlib.pyplot as plt
import numpy as np

plt.figure()
# subplot with 1 row, 2 columns, and current axis is 1st subplot axes
plt.subplot(1, 2, 1)
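linear_data = np.array([1, 2, 3, 4, 5, 6, 7, 8])
# plot linear data on 1st subplot axes
plt.plot(linear_data, '-o')

# despite the name, this is squared (quadratic) data
exponential_data = linear_data ** 2
# subplot with 1 row, 2 columns, and current axis is 2nd subplot axes
plt.subplot(1, 2, 2)
plt.plot(exponential_data)

# calling subplot again with the same arguments makes the 1st axes current,
# so the squared data is added to the left-hand plot
plt.subplot(1, 2, 1)
plt.plot(exponential_data)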

# Create a new figure
plt.figure()
ax1 = plt.subplot(1, 2, 1)
plt.plot(linear_data, '-o')
# pass sharey=ax1 to ensure the two subplots share the same y axis
ax2 = plt.subplot(1, 2, 2, sharey=ax1)
plt.plot(exponential_data, '-x')

# create a 3x3 grid of subplots
fig, ((ax1,ax2,ax3), (ax4,ax5,ax6), (ax7,ax8,ax9)) = plt.subplots(3, 3, sharex=True, sharey=True)
# plot the linear_data on the 5th subplot axes
ax5.plot(linear_data, '-')
# set inside tick labels to visible
for ax in plt.gcf().get_axes():
    for label in ax.get_xticklabels() + ax.get_yticklabels():
        label.set_visible(True)
# necessary on some systems to update the plot
plt.gcf().canvas.draw()
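
The same side-by-side layout can also be written in the object-oriented style. The following is a minimal sketch, not the course's own code: it assumes a single call to plt.subplots with sharey=True, and the names left_ax, right_ax, and squared_data are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

linear_data = np.arange(1, 9)
squared_data = linear_data ** 2

# one call creates the figure and both axes; sharey=True links their y limits
fig, (left_ax, right_ax) = plt.subplots(1, 2, sharey=True)
left_ax.plot(linear_data, '-o')
right_ax.plot(squared_data, '-x')
left_ax.set_title('linear')
right_ax.set_title('squared')
```

Because the y axis is shared, both panels end up with the same limits, which makes the difference in growth between the two series easy to see.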

2. Histogram

import numpy as np
import matplotlib.pyplot as plt
# create 2x2 grid of axis subplots
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, sharex=True)
axs = [ax1, ax2, ax3, ax4]
# draw n = 10, 100, 1000, and 10000 samples from the normal distribution and plot corresponding histograms
for i, ax in enumerate(axs):
    sample = np.random.normal(0, 1, 10**(i+1))
    ax.hist(sample, bins=100)
    ax.set_title('n={}'.format(10**(i+1)))

import matplotlib.gridspec as gridspec
plt.figure()

gspec = gridspec.GridSpec(3,3)

top_histogram = plt.subplot(gspec[0, 1:])
side_histogram = plt.subplot(gspec[1:, 0])
lower_right = plt.subplot(gspec[1:, 1:])

Y = np.random.normal(loc=0.0, scale=1.0, size=10000)
X = np.random.random(size=10000)
lower_right.scatter(X, Y)
top_histogram.hist(X, bins=100)
s = side_histogram.hist(Y, bins=100, orientation='horizontal')

# clear the histograms and plot normalized histograms
# (the old `normed` keyword was removed from Matplotlib; `density` replaces it)
top_histogram.clear()
top_histogram.hist(X, bins=100, density=True)

side_histogram.clear()
side_histogram.hist(Y, bins=100, orientation='horizontal', density=True)
# flip the side histogram's x axis
side_histogram.invert_xaxis()

# change axes limits
for ax in [top_histogram, lower_right]:
    ax.set_xlim(0, 1)
for ax in [side_histogram, lower_right]:
    ax.set_ylim(-5, 5)
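
A normalized histogram rescales the bar heights so that the total area of the bars is 1. The following is a minimal sketch of that idea using NumPy's np.histogram rather than the plotting call; the seed and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=10000)

# raw counts: one integer per bin, summing to the sample size
counts, edges = np.histogram(sample, bins=100)

# density=True divides each count by (sample size * bin width)
density, _ = np.histogram(sample, bins=100, density=True)

# bar area = height * width, so the areas of a density histogram sum to 1
area = np.sum(density * np.diff(edges))
```

This is why the y axis of a normalized histogram shows small fractional values instead of counts.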

3. Box plots

import matplotlib.pyplot as plt
import mpl_toolkits.axes_grid1.inset_locator as mpl_il
import numpy as np
import pandas as pd
normal_sample = np.random.normal(loc=0.0, scale=1.0, size=10000)
random_sample = np.random.random(size=10000)
gamma_sample = np.random.gamma(2, size=10000)
df = pd.DataFrame({"normal":normal_sample,
                   "random": random_sample,
                   "gamma":gamma_sample})

plt.figure()
# if `whis` argument isn't passed, boxplot defaults to showing 1.5*interquartile (IQR) whiskers with outliers;
# whis=(0, 100) stretches the whiskers to the minimum and maximum instead
# (older Matplotlib releases spelled this whis='range')
_ = plt.boxplot([df['normal'], df['random'], df['gamma']], whis=(0, 100))
# overlay axis on top of another
ax2 = mpl_il.inset_axes(plt.gca(), width='60%', height='40%', loc=2)
ax2.hist(df['gamma'], bins=100)
# switch the y axis ticks for ax2 to the right side
ax2.yaxis.tick_right()
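
To make the default 1.5*IQR whisker rule concrete, here is a minimal sketch that computes the fences, the whisker ends, and the outliers by hand with NumPy. It is an illustration under stated assumptions (quartiles from np.percentile, a fixed seed), not a reproduction of Matplotlib's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma_sample = rng.gamma(2, size=10000)

# quartiles and interquartile range
q1, q3 = np.percentile(gamma_sample, [25, 75])
iqr = q3 - q1

# fences sit 1.5 * IQR beyond the quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# whiskers end at the most extreme data points that are still inside the fences
inside = gamma_sample[(gamma_sample >= lower_fence) & (gamma_sample <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# everything beyond the fences would be drawn as an individual outlier point
outliers = gamma_sample[(gamma_sample < lower_fence) | (gamma_sample > upper_fence)]
```

For a right-skewed sample such as the gamma draw, all of the outliers fall above the upper fence, which matches the long upper tail visible in the inset histogram.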

4. Heatmap

import matplotlib.pyplot as plt
import numpy as np
Y = np.random.normal(loc=0.0, scale=1.0, size=10000)
X = np.random.random(size=10000)
plt.figure()
_ = plt.hist2d(X, Y, bins=100)
# add a colorbar legend
plt.colorbar()
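
The colors in a 2D histogram encode how many points fall into each rectangular bin. As a minimal sketch of the underlying computation, the counts can be obtained without plotting via NumPy's np.histogram2d; the seed and variable names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random(size=10000)
Y = rng.normal(loc=0.0, scale=1.0, size=10000)

# counts[i, j] is the number of points in x-bin i and y-bin j
counts, x_edges, y_edges = np.histogram2d(X, Y, bins=100)

# the busiest cell is what the top of the colorbar corresponds to
max_count = counts.max()
```

With bins=100 the grid has 100 x 100 cells, so most cells hold only a few of the 10000 points.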

5. Animation

import matplotlib.animation as animation
import matplotlib.pyplot as plt
import numpy as np
plt.figure()
n = 100
x = np.random.randn(n)
plt.hist(x, bins=10)
# create the function that will do the plotting, where curr is the current frame
def update(curr):
    # check if animation is at the last frame, and if so, stop the animation `a`
    if curr == n:
        a.event_source.stop()
    # clear the current axis
    plt.cla()
    bins = np.arange(-4, 4, 0.5)
    plt.hist(x[:curr], bins=bins)
    plt.axis([-4, 4, 0, 30])
    plt.gca().set_title('Sampling the Normal Distribution')
    plt.gca().set_ylabel('Frequency')
    plt.gca().set_xlabel('Value')
    plt.annotate('n = {}'.format(curr), [3, 27])

# the animation draws into this new figure, calling update every 100 ms
fig = plt.figure()
a = animation.FuncAnimation(fig, update, interval=100)
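
When the number of frames is known in advance, an alternative to stopping the event source by hand is to give FuncAnimation an explicit frames sequence and repeat=False. The following is a minimal sketch of that variant, assuming an explicit Axes object; the names fig, ax, and anim are illustrative.

```python
import matplotlib.animation as animation
import matplotlib.pyplot as plt
import numpy as np

n = 100
x = np.random.randn(n)
bins = np.arange(-4, 4, 0.5)

fig, ax = plt.subplots()

def update(curr):
    # redraw the histogram using only the first curr samples
    ax.cla()
    ax.hist(x[:curr], bins=bins)
    ax.axis([-4, 4, 0, 30])
    ax.set_title('n = {}'.format(curr))

# frames supplies the values passed to update; repeat=False ends after the last one
anim = animation.FuncAnimation(fig, update, frames=range(1, n + 1),
                               interval=100, repeat=False)
```

Keeping a reference such as anim matters: if the animation object is garbage collected, the animation stops updating.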

6. Interactivity

Mouse clicking

import matplotlib.pyplot as plt
import numpy as np
plt.figure()
data = np.random.rand(10)
plt.plot(data)

def onclick(event):
    plt.cla()
    plt.plot(data)
    plt.gca().set_title('Event at pixels {},{} \nand data {},{}'.format(event.x, event.y, event.xdata, event.ydata))

# tell mpl_connect we want to pass a 'button_press_event' into onclick when the event is detected
plt.gcf().canvas.mpl_connect('button_press_event', onclick)

from random import shuffle
import pandas as pd

origins = ['China', 'Brazil', 'India', 'USA', 'Canada', 'UK', 'Germany', 'Iraq', 'Chile', 'Mexico']

shuffle(origins)

df = pd.DataFrame({'height': np.random.rand(10),
                   'weight': np.random.rand(10),
                   'origin': origins})
plt.figure()
# picker=10 means the mouse doesn't have to click directly on a point, but can be up to 10 pixels away
plt.scatter(df['height'], df['weight'], picker=10)
plt.gca().set_ylabel('Weight')
plt.gca().set_xlabel('Height')

def onpick(event):
    # event.ind lists the indices of the picked points; use the first one
    origin = df.iloc[event.ind[0]]['origin']
    plt.gca().set_title('Selected item came from {}'.format(origin))

# tell mpl_connect we want to pass a 'pick_event' into onpick when the event is detected
plt.gcf().canvas.mpl_connect('pick_event', onpick)

Time: 2024-08-21 21:54:37
