Pandas中DataFrame数据合并、连接(concat、merge、join)之join

pandas.DataFrame.join

自己弄了很久,一看官网。感觉自己宛如智障。不要脸了,直接抄

DataFrame.join(otheron=Nonehow=‘left‘lsuffix=‘‘rsuffix=‘‘sort=False)

Join columns with other DataFrame either on index or on a key column. Efficiently Join multiple DataFrame objects by index at once by passing a list.

Parameters:
other : DataFrame, Series with name field set, or list of DataFrame

Index should be similar to one of the columns in this one. If a Series is passed, its name attribute must be set, and that will be used as the column name in the resulting joined DataFrame

on : column name, tuple/list of column names, or array-like

Column(s) in the caller to join on the index in other, otherwise joins index-on-index. If multiples columns given, the passed DataFrame must have a MultiIndex. Can pass an array as the join key if not already contained in the calling DataFrame. Like an Excel VLOOKUP operation

how : {‘left’, ‘right’, ‘outer’, ‘inner’}, default: ‘left’

How to handle the operation of the two objects.

  • left: use calling frame’s index (or column if on is specified)
  • right: use other frame’s index
  • outer: form union of calling frame’s index (or column if on is

    specified) with other frame’s index

  • inner: form intersection of calling frame’s index (or column if

    on is specified) with other frame’s index

lsuffix : string

Suffix to use from left frame’s overlapping columns

rsuffix : string

Suffix to use from right frame’s overlapping columns

sort : boolean, default False

Order result DataFrame lexicographically by the join key. If False, preserves the index order of the calling (left) DataFrame

Returns:
joined : DataFrame

See also

DataFrame.merge
For column(s)-on-columns(s) operations

Notes

on, lsuffix, and rsuffix options are not supported when passing a list of DataFrame objects

Examples

>>> caller = pd.DataFrame({‘key‘: [‘K0‘, ‘K1‘, ‘K2‘, ‘K3‘, ‘K4‘, ‘K5‘],
...                        ‘A‘: [‘A0‘, ‘A1‘, ‘A2‘, ‘A3‘, ‘A4‘, ‘A5‘]})
>>> caller
    A key
0  A0  K0
1  A1  K1
2  A2  K2
3  A3  K3
4  A4  K4
5  A5  K5
>>> other = pd.DataFrame({‘key‘: [‘K0‘, ‘K1‘, ‘K2‘],
...                       ‘B‘: [‘B0‘, ‘B1‘, ‘B2‘]})
>>> other
    B key
0  B0  K0
1  B1  K1
2  B2  K2

Join DataFrames using their indexes.==》join on indexes

>>> caller.join(other, lsuffix=‘_caller‘, rsuffix=‘_other‘)
>>>     A key_caller    B key_other
    0  A0         K0   B0        K0
    1  A1         K1   B1        K1
    2  A2         K2   B2        K2
    3  A3         K3  NaN       NaN
    4  A4         K4  NaN       NaN
    5  A5         K5  NaN       NaN

If we want to join using the key columns, we need to set key to be the index in both caller and other. The joined DataFrame will have key as its index.

>>> caller.set_index(‘key‘).join(other.set_index(‘key‘))
>>>      A    B
    key
    K0   A0   B0
    K1   A1   B1
    K2   A2   B2
    K3   A3  NaN
    K4   A4  NaN
    K5   A5  NaN

Another option to join using the key columns is to use the on parameter. DataFrame.join always uses other’s index but we can use any column in the caller. This method preserves the original caller’s index in the result.

>>> caller.join(other.set_index(‘key‘), on=‘key‘)
>>>     A key    B
    0  A0  K0   B0
    1  A1  K1   B1
    2  A2  K2   B2
    3  A3  K3  NaN
    4  A4  K4  NaN
    5  A5  K5  NaN

原文地址:https://www.cnblogs.com/wqbin/p/10363689.html

时间: 2024-10-07 23:16:37

Pandas中DataFrame数据合并、连接(concat、merge、join)之join的相关文章

Pandas中DataFrame数据合并、连接(concat、merge、join)之concat

一.concat:沿着一条轴,将多个对象堆叠到一起 concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, copy=True): objs:需要连接的对象集合,一般是列表或字典: axis:连接轴向: join:参数为'outer'或'inner': join_axes=[]:指定自定义的索

Pandas中DataFrame数据合并、连接(concat、merge、join)之merge

二.merge:通过键拼接列 类似于关系型数据库的连接方式,可以根据一个或多个键将不同的DatFrame连接起来. 该函数的典型应用场景是,针对同一个主键存在两张不同字段的表,根据主键整合到一张表里面. merge(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=True, suffixes=('_x', '_y'), copy=Tr

短视频学习 - 6、pandas之DataFrame数据合并

今日内容 # DataFrame的concat.merge.append操作 简介 # Pandas 的两个主要数据结构,Series(1维)和DataFrame(2维) # 整理数据.清理数据,分析数据.数据建模,然后将分析结果组织成适合绘图或表格显示的形式 常用操作 # 分片合并 concat # 融合结合 merge <{'key': ['func', 'func'], 'v1': [1, 2]},{'key': ['func', 'func'], 'v2': [3,4]}> # 追加

pandas中DataFrame

python数据分析工具pandas中DataFrame和Series作为主要的数据结构. 本文主要是介绍如何对DataFrame数据进行操作并结合一个实例测试操作函数. 1)查看DataFrame数据及属性 df_obj = DataFrame() #创建DataFrame对象 df_obj.dtypes #查看各行的数据格式 df_obj['列名'].astype(int)#转换某列的数据类型 df_obj.head() #查看前几行的数据,默认前5行 df_obj.tail() #查看后几

将pandas的DataFrame数据写入MySQL数据库 + sqlalchemy

将pandas的DataFrame数据写入MySQL数据库 + sqlalchemy [python] view plain copy print? import pandas as pd from sqlalchemy import create_engine ##将数据写入mysql的数据库,但需要先通过sqlalchemy.create_engine建立连接,且字符编码设置为utf8,否则有些latin字符不能处理 yconnect = create_engine('mysql+mysql

pandas中,dataframe 进行数据合并-pd.concat()

``# 通过数据框列向(左右)合并 a = pd.DataFrame(X_train) b = pd.DataFrame(y_train) # 合并数据框(合并前需要将数据设置成DataFrame格式), 其中,如果axis=1,ignore_index将改变的是列上的索引(属性名) print(pd.concat([a,b], axis=1, ignore_index=False)) 原文地址:https://www.cnblogs.com/komean/p/10670548.html

pandas中DataFrame相关

1.创建 1.1  标准格式创建 DataFrame创建方法有很多,常用基本格式是:DataFrame 构造器参数:DataFrame(data=[],index=[],coloumns=[]) In [272]: df2=DataFrame(np.arange(16).reshape((4,4)),index=['a','b','c','d'],columns=['one','two','three','four']) In [273]: df2 Out[273]: one two three

Pandas:DataFrame数据的更改、插入新增的列和行

一.更改DataFrame的某些值 1.更改DataFrame中的数据,原理是将这部分数据提取出来,重新赋值为新的数据. 2.需要注意的是,数据更改直接针对DataFrame原数据更改,操作无法撤销,如果做出更改,需要对更改条件做确认或对数据进行备份. 代码: import pandas as pd df1 = pd.DataFrame([['Snow','M',22],['Tyrion','M',32],['Sansa','F',18],['Arya','F',14]], columns=['

Tomcat中配置数据源和连接池

(1)为什么需要配置数据源和连接池? 我们知道在每次java程序俩接数据库的时候我们都需要请求连接数据库然后打开读取数据然后关闭, 这样使得每一个用户访问的时候都需要服务器做出相应,这样的话服务器端承受巨大的压力,如此效率就会下降, 为了解决这个问题我们可以让数据库提前打开连接等待用户连接当有用户连接的时候,就把数据库已存在的连接 给用户即可 而我们就把这条连接叫做"连接池"  当连接池中的连接足够的话 就分给用户,如果不够用户则在"|队列池"中等待: (2)实现过