[Python Cookbook] Pandas Groupby

Groupby Count

# Party’s Frequency of donations
nyc.groupby(’Party’)[’contb receipt amt’].count()

The command returns a series where the index is the name of a Party and the value is the count of that Party. Note that the series is ordered by the name of Party alphabetically.

Multiple Variables

# Party’s Frequency of donations by Date
nyc.groupby([’Party’, ’Date’])[’contb receipt amt’].count()

Groupby Sum

# Party’s Sum of donations
nyc.groupby(’Party’)[’contb receipt amt’].sum()

# Define the format of float
pd.options.display.float format = ’{:,.2f}’.format nyc.groupby(’Party’)[’contb receipt amt’].sum()

Groupby Order

# Top 5 Donors, by Occupation
df7 = nyc.groupby(’contbr occupation’)[’contb receipt amt’]. sum(). reset  index ()
df7.sort_values(’contb receipt amt’, ascending=False, inplace =True)
df7.head(5)
#or
df7.nlargest(5,’contb receipt amt’)

# Bottom 5 Donors, by Occupation
df8 = nyc.groupby(’contbr occupation’)[’contb receipt amt’]. sum() . reset   index ()
df8 . sort_values (by=’ contb receipt amt ’ , inplace=True) df8.head(5)
# OR
df7.tail(5)
#OR
df8.nsmallest(5,’contb receipt amt’)

Get rid of negative values:

df8 [ df8 . contb receipt amt >0].head(5)

The following commands give an example to find the Top 5 occupations that donated to each cadidate. Note that we need to sort the table based on two variables, firtly sorted by candidate name alphabetically and then sorted by contribution amount in a descending order. Finally, we hope to show the Top 5 occupations for each candidate.

# Top 5 Occupations that donated to Each Candidate
df10 = nyc.groupby ([ ’cand_nm’ , ’contbr_occupation’ ]) [ ’contb_receipt_amt’ ].sum().reset_index ()
df10.sort_values ([ ’cand_nm’ , ’contb_receipt_amt’ ] , ascending =[True , False ], inplace=True)
df10.groupby(’cand_nm’).head(5)

Groupby Plot

#Top 5 Fundraising Candidates Line Graph
df11 = nyc.groupby(’cand_nm’)[’contb_receipt_amt’].sum(). reset_index ()
df11_p = df11.nlargest(5,’contb_receipt_amt’)
df11_g = nyc[nyc.cand_nm.isin(df11_p.cand_nm)][[ ’cand_nm’,’Date’,’contb_receipt_amt’]]
dfpiv=pd.pivot table(df11_g , values=’contb_receipt_amt’, index=[’Date’],columns=[’cand_nm’], aggfunc=np.sum)dfpiv.loc[‘2016-01-01‘:‘2016?01?30‘].plot.line()

原文地址:https://www.cnblogs.com/sherrydatascience/p/10360750.html

时间: 2024-08-12 18:04:37

[Python Cookbook] Pandas Groupby的相关文章

[Python Cookbook] Pandas: 3 Ways to define a DataFrame

Using Series (Row-Wise) import pandas as pd purchase_1 = pd.Series({'Name': 'Chris', 'Item Purchased': 'Dog Food', 'Cost': 22.50}) purchase_2 = pd.Series({'Name': 'Kevyn', 'Item Purchased': 'Kitty Litter', 'Cost': 2.50}) purchase_3 = pd.Series({'Name

[Python Cookbook] Pandas: Indexing of DataFrame

Selecting a Row df.loc[index] # if index is a string, add ' '; if index is a number, no ' ' or df.iloc[row_num] Selecting a Column df['col_name'] Or df.col_name Selecting an Element df.loc[index, 'col_name'] Selecting Multiple Discontinuous Rows df.l

Python数据分析--Pandas知识点(三)

本文主要是总结学习pandas过程中用到的函数和方法, 在此记录, 防止遗忘. Python数据分析--Pandas知识点(一) Python数据分析--Pandas知识点(二) 下面将是在知识点一, 二的基础上继续总结. 前面所介绍的都是以表格的形式中展现数据, 下面将介绍Pandas与Matplotlib配合绘制出折线图, 散点图, 饼图, 柱形图, 直方图等五大基本图形. Matplotlib是python中的一个2D图形库, 它能以各种硬拷贝的格式和跨平台的交互式环境生成高质量的图形,

python之pandas用法大全

python之pandas用法大全 更新时间:2018年03月13日 15:02:28 投稿:wdc 我要评论 本文讲解了python的pandas基本用法,大家可以参考下 一.生成数据表1.首先导入pandas库,一般都会用到numpy库,所以我们先导入备用:?12import numpy as npimport pandas as pd2.导入CSV或者xlsx文件:?12df = pd.DataFrame(pd.read_csv('name.csv',header=1))df = pd.D

Python之pandas用法

Python之pandas用法 导入 import pandas as pd Series 用pandas的Series函数从数组或列表中创建一个可自定义下标(index)并自动维护标号索引的一维数组 a = pd.Series([0.25, 0.5, 0.75, 1.0]) print(a) b = pd.Series([0.25, 0.5, 0.75, 1.0], index=['a', 'b', 'c', 'd']) # 自定义下标 print(b) c = pd.Series({'a':

《Python cookbook》 “定义一个属性可由用户修改的装饰器” 笔记

看<Python cookbook>的时候,第9.5部分,"定义一个属性可由用户修改的装饰器",有个装饰器理解起来花了一些时间,做个笔记免得二刷这本书的时候忘了 完整代码:https://github.com/blackmatrix7/python-learning/blob/master/python_cookbook/chapter_9/section_5/attach_wrapper.py 书中的装饰器(书中称之为访问器函数) def attach_wrapper(o

分享一个python cookbook的在线教程地址

分享一个python cookbook的在线教程地址: http://python3-cookbook.readthedocs.org/zh_CN/latest/ 翻译者:熊能

python cookbook —— Searching and Replacing Text in a File

要将文件中的某一个字符串替换为另一个,然后写入一个新文件中: 首先判断输入格式,至少需要输入被搜索的Text和替换的Text,输入输出文件可以是存在的文件,也可以是标准输入输出流 (os.path是个好东西) import os, sys nargs = len(sys.argv) if not 3 <= nargs <= 5: print "usage: %s search_text replace_text [infile [outfile]]" % os.path.b

Python Cookbook(第3版)中文版pdf

下载地址:网盘下载 内容简介  · · · · · · <Python Cookbook(第3版)中文版>介绍了Python应用在各个领域中的一些使用技巧和方法,其主题涵盖了数据结构和算法,字符串和文本,数字.日期和时间,迭代器和生成器,文件和I/O,数据编码与处理,函数,类与对象,元编程,模块和包,网络和Web编程,并发,实用脚本和系统管理,测试.调试以及异常,C语言扩展等. 本书覆盖了Python应用中的很多常见问题,并提出了通用的解决方案.书中包含了大量实用的编程技巧和示例代码,并在Py