On the & (AND) operation on DataFrames in pandas (top-voted Stack Overflow answer)

What explains the difference in behavior of boolean and bitwise operations on lists vs numpy.arrays?

I'm getting confused about the appropriate use of '&' vs 'and' in Python, illustrated in the following simple examples.

    mylist1 = [True,  True,  True,  False,  True]
    mylist2 = [False, True, False,  True, False]  

    >>> len(mylist1) == len(mylist2)
    True

    # ---- Example 1 ----
    >>> mylist1 and mylist2
    [False, True, False, True, False]
    # I am confused: I would have expected [False, True, False, False, False]

    # ---- Example 2 ----
    >>> mylist1 & mylist2
    *** TypeError: unsupported operand type(s) for &: 'list' and 'list'
    # I am confused: Why not just like Example 1?

    # ---- Example 3 ----
    >>> import numpy as np

    >>> np.array(mylist1) and np.array(mylist2)
    *** ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
    # I am confused: Why not just like Example 4?

    # ---- Example 4 ----
    >>> np.array(mylist1) & np.array(mylist2)
    array([False,  True, False, False, False], dtype=bool)
    # This is the output I was expecting!

This answer and this answer both helped me understand that 'and' is a boolean operation but '&' is a bitwise operation.

I was reading some information to better understand the concept of bitwise operations, but I am struggling to use that information to make sense of my above 4 examples.

Note, in my particular situation, my desired output is a newlist where:

    len(newlist) == len(mylist1)
    newlist[i] == (mylist1[i] and mylist2[i]) #for every element of newlist

Example 4, above, led me to my desired output, so that is fine.

But I am left feeling confused about when/how/why I should use 'and' vs '&'. Why do lists and numpy arrays behave differently with these operators?

Can anyone help me understand the difference between boolean and bitwise operations to explain why they handle lists and numpy.arrays differently?

I just want to make sure I continue to use these operations correctly going forward. Thanks a lot for the help!

Numpy version 1.7.1

python 2.7

References all inline with text.

EDITS

1) Thanks @delnan for pointing out that in my original examples I had an ambiguity that was masking my deeper confusion. I have updated my examples to clarify my question.

python numpy bit-manipulation boolean-expression ampersand


edited May 23 '17 at 12:26 by Community

asked Mar 25 '14 at 21:18 by rysqui

  • (4 votes) Example 1 only appears to give the correct output. It actually just returns the second list unaltered. Try some other lists, in particular anything where the second list contains a True in a position that's False in the first list: Boolean logic dictates a False output at that position, but you'll get a True. – user395760 Mar 25 '14 at 21:22

  • @delnan Thanks for noticing the ambiguity in my examples. I have updated my examples to highlight my confusion and focus on the aspect of this behavior that I do not understand. I'm clearly missing something important, because I did not expect the output of Example 1. – rysqui Mar 25 '14 at 21:37

  • (2 votes) In Numpy there's np.bitwise_and() and np.logical_and() and friends to avoid confusion. – Dietrich Mar 25 '14 at 21:54

  • In Example 1, mylist1 and mylist2 does not output the same result as mylist2 and mylist1, since what is being returned is the second list, as pointed out by delnan. – user2015487 Feb 16 '16 at 17:58

  • (1 vote) Possible duplicate of Python: Boolean operators vs Bitwise operators – Oliver Ni Nov 6 '16 at 16:09


7 Answers


Accepted answer (72 votes)

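'and' tests whether both expressions are logically True (truthy), while '&', when used with True/False values, tests whether both are True. More precisely, 'and' does not build a combined value: it returns one of its two operands.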

In Python, empty built-in objects are typically treated as logically False while non-empty built-ins are logically True. This facilitates the common use case where you want to do something if a list is empty and something else if the list is not. Note that this means that the list [False] is logically True:

    >>> if [False]:
    ...     print('True')
    ...
    True

So in Example 1, the first list is non-empty and therefore logically True, so 'and' evaluates to the second list, which is returned unaltered. (In our case the second list is also non-empty and therefore logically True, but 'and' does not need to check that: it simply returns its second operand.)
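The behavior in Example 1 can be checked directly. The following is a minimal sketch with illustrative variable names (a, b, empty are not from the original post). It shows that 'and' never looks inside the lists: it returns its first operand if that operand is falsy, and otherwise its second operand.

```python
a = [True, True, True, False, True]
b = [False, True, False, True, False]

# Both lists are non-empty, hence truthy, so `and` returns the second operand.
result_ab = a and b
result_ba = b and a

print(result_ab is b)   # True: the very same list object as b
print(result_ba is a)   # True: swapping the operands changes the result

# An empty list is falsy, so `and` short-circuits and returns it.
empty = []
print((empty and b) is empty)  # True
```

This is also why the order of the operands matters, as one of the comments on the question points out.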

For Example 2, lists cannot meaningfully be combined in a bitwise fashion because they can contain arbitrary, unlike elements. Things that can be combined bitwise include True and False values and integers.
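As a small illustration of what can and cannot be combined bitwise, here is a sketch using only built-in types. The name list_and_error is a hypothetical helper used to keep the exception around for inspection.

```python
# Bitwise AND on integers combines them bit by bit.
print(0b1100 & 0b1010)   # 8, i.e. 0b1000

# bool is a subclass of int, and & on two bools returns a bool.
print(True & False)      # False
print(True & True)       # True

# Lists do not define &, so the operator raises TypeError.
try:
    [True, False] & [True, True]
except TypeError as exc:
    list_and_error = exc
    print("TypeError:", exc)
```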

NumPy objects, by contrast, support vectorized calculations. That is, they let you perform the same operations on multiple pieces of data.

Example 3 fails because NumPy arrays of length > 1 have no truth value. Raising an error here prevents confusion about whether the array as a whole or each of its elements is being tested.
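The error from Example 3 can be reproduced without 'and' at all, since 'and' only needs the truth value of its first operand. A minimal sketch, assuming NumPy is installed (arr and ambiguous_error are illustrative names):

```python
import numpy as np

arr = np.array([True, False, True])

# Asking for the truth value of a multi-element array raises ValueError,
# because "any element is True" and "all elements are True" are both
# plausible readings.
try:
    bool(arr)
except ValueError as exc:
    ambiguous_error = exc
    print("ValueError:", exc)

# Being explicit removes the ambiguity.
print(arr.any())  # True: at least one element is True
print(arr.all())  # False: not every element is True
```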

Example 4 is simply a vectorized bitwise and operation.
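To see the vectorized operation at work, and how it relates to the np.bitwise_and() and np.logical_and() functions mentioned in the comments, here is a short sketch assuming NumPy is installed. The integer arrays m and n are made-up inputs chosen so that the two functions give different answers.

```python
import numpy as np

x = np.array([True, True, True, False, True])
y = np.array([False, True, False, True, False])

# On boolean arrays, & is an element-wise AND.
both = x & y
print(both.tolist())  # [False, True, False, False, False]

# On integer arrays the two functions differ: bitwise_and works on bits,
# logical_and works on the truthiness of each element.
m = np.array([1, 2, 3])
n = np.array([2, 2, 0])
print(np.bitwise_and(m, n).tolist())  # [0, 2, 0]
print(np.logical_and(m, n).tolist())  # [True, True, False]
```

For boolean arrays the two functions agree, which is why '&' gives the expected result in Example 4.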

Bottom Line

  • If you are not dealing with arrays and are not performing math manipulations of integers, you probably want 'and'.
  • If you have vectors of truth values that you wish to combine, use numpy with '&'.
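If NumPy is not available, the element-wise result asked for in the question can also be built from plain lists. This is a sketch of an alternative, not part of the original answer; it reuses mylist1 and mylist2 from the question.

```python
mylist1 = [True, True, True, False, True]
mylist2 = [False, True, False, True, False]

# zip pairs up elements by position; `and` is then applied to each pair
# of individual bools rather than to the lists themselves.
newlist = [a and b for a, b in zip(mylist1, mylist2)]
print(newlist)  # [False, True, False, False, False]
```

Note that zip stops at the shorter input, so this assumes both lists have the same length, as they do in the question.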

Original source: https://www.cnblogs.com/Rvin/p/9504341.html

Time: 2024-10-20 02:50:14
