np.nan is an invalid document, expected byte or unicode string.

ValueError                                Traceback (most recent call last)
<ipython-input-12-1dc462ae8893> in <module>()
     15     print('cv prepared!')
     16     return df_x.astype(np.float64)
---> 17 df_test = get_feature(test_data,all_table,ready_cols,vec_col)
     18 df_train = get_feature(train_data,all_table,ready_cols,vec_col)

<ipython-input-12-1dc462ae8893> in get_feature(df, all_data, cols, vec_col)
      9     cv=CountVectorizer()
     10     for feature in vec_col:
---> 11         cv.fit(all_data[feature])
     12         df_a = cv.transform(df[feature])
     13         df_x = sparse.hstack((df_x, df_a))

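import numpy as np
from scipy import sparse
from sklearn.feature_extraction.text import CountVectorizer

def get_feature(df, all_data, cols, vec_col):
    # Start from the columns that are already numeric
    df_x = np.int64(df[cols])
    cv = CountVectorizer()
    for feature in vec_col:
        # Fit the vocabulary on the full table; this is the call that
        # raises ValueError when the column still contains NaN
        cv.fit(all_data[feature])
        df_a = cv.transform(df[feature])
        # Append the bag-of-words columns to the feature matrix
        df_x = sparse.hstack((df_x, df_a))
        print('Done Feature ' + str(feature))
    print('cv prepared!')
    return df_x.astype(np.float64)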

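Cause: all_data contained NaN values. When loading the data I called all_table.fillna(-1) and assumed that was enough to fill the missing values. However, fillna returns a new DataFrame by default, so the values that were NaN in all_table stayed NaN. Changing the call so that the result is kept (presumably all_table = all_table.fillna(-1), or passing inplace=True) made the code run.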
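The behaviour described above can be checked with a small sketch. The DataFrame and its "tags" column are hypothetical stand-ins for all_table, and the fill value is the string "-1" rather than the integer -1 used in the post, on the assumption that a text column passed to CountVectorizer should hold only strings.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for all_table: one text column with a missing value.
all_table = pd.DataFrame({"tags": ["red blue", np.nan, "green"]})

# fillna returns a new DataFrame; without assignment the result is discarded
# and all_table itself still contains the NaN.
all_table.fillna("-1")
nan_left_without_assignment = int(all_table["tags"].isna().sum())

# Assigning the result back is what actually replaces the missing value.
all_table = all_table.fillna("-1")
nan_left_after_assignment = int(all_table["tags"].isna().sum())

print(nan_left_without_assignment, nan_left_after_assignment)
```

Running a check like all_table[feature].isna().sum() before cv.fit is a cheap way to confirm that the fill really took effect.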

Original article: https://www.cnblogs.com/smartwhite/p/9749168.html

Date: 2024-10-10 22:25:20
