Datasets for MachineLearning

Public datasets for machine learning  http://homepages.inf.ed.ac.uk/rbf/IAPR/researchers/MLPAGES/mldat.htm

Weka datasets http://www.cs.waikato.ac.nz/ml/weka/datasets.html

Time: 2024-10-12 14:20:56

Articles related to Datasets for MachineLearning

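Learning Machine Learning from Google [1]

Only today did I realize that the barrier to applying machine learning has dropped this low; it is practically within arm's reach. I really cannot find any reason not to study it in depth. As the title says, thanks to Google for its contribution to the development of this technology. Of course, others may have done 99% and Google only 1%, but I have to say it is a beautiful 1%. To the point: today I followed a Google engineer on YouTube and finished my first small machine learning program. Consider it the hello world of learning this skill. Recorded here for reference. 1 from scipy.spatial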

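Introduction to Spark 1.6 Datasets

Apache Spark provides powerful APIs that make complex analytics accessible to developers. With the introduction of Spark SQL, developers can use these high-level APIs to work with structured data (for example, database tables and JSON files), alongside the object-oriented RDD API; a developer only needs to call the relevant methods to use Spark for data storage and computation. So what great things does Spark 1.6 bring us? Well... Spark 1.6 provides the Datasets API, which will be a direction of development for Spark in later versions, just like DataFrame, Dataset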

NCEP Datasets

1. Global Change Master Directory http://gcmd.gsfc.nasa.gov/
2. ECMWF http://apps.ecmwf.int/datasets/data/interim-full-moda/levtype=sfc/
NCEP data download (2009-08-03 12:03)
1. 8-day forecast (including hour 0), 3-hour interval, 1x1 deg GFS (Global Forecast System), each file about 1 GB http://motherlode.ucar.edu/

Datasets for Data Mining and Data Science

From KDnuggets. Data repositories: AWS (Amazon Web Services) Public Data Sets provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. BigML big list of public data sources. Bioassay da

MachineLearning - Introduction (Week 1)

http://blog.csdn.net/pipisorry/article/details/43089121 Origins and use cases of machine learning: Machine Learning - Grew out of work in AI - New capability for computers Examples: - Database mining Large datasets from growth of automation/web. E.g., Web click data, medical recor

[MachineLearning]KNN

# -*- coding: utf-8 -*-
"""
Created on Wed Jun 18 11:46:15 2014
@author: hp
"""
import numpy as np
import operator

def createDataSet():
    group = np.random.rand(4, 2)
    labels = ['a', 'b', 'c', 'd']
    return group, labels

def classify0(inX,
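The excerpt above is cut off before the classifier body, so the author's `classify0` is not reproduced here. As a rough illustration of the k-nearest-neighbours idea the title refers to, here is a minimal sketch. The function name `knn_classify`, its parameters, and the sample points are all assumptions made for this example, not the original code.

```python
from collections import Counter

import numpy as np


def knn_classify(query, data, labels, k):
    """Return the majority label among the k training points nearest to query.

    Hypothetical helper for illustration; not the truncated classify0 above.
    """
    data = np.asarray(data, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query point to every training row.
    distances = np.sqrt(((data - query) ** 2).sum(axis=1))
    # Indices of the k smallest distances (stable sort keeps ties in input order).
    nearest = np.argsort(distances, kind="stable")[:k]
    # Majority vote over the labels of those k neighbours.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up sample data: two points near (1, 1), two near (0, 0).
    group = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
    labels = ["a", "a", "b", "b"]
    print(knn_classify([0.1, 0.2], group, labels, k=3))
```

With the sample data, the query point sits next to the two "b" points, so two of its three nearest neighbours vote "b".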

Spark notes 2: Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing

http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf UC Berkeley's paper on Spark. For the most original and fundamental understanding of RDD, the core component of Spark, there is no better resource than this one. A must-read. Abstract: RDDs provide a restricted form of shared memory, based on coarse grained transformations rather than fine-grained updates to s

R TUTORIAL: VISUALIZING MULTIVARIATE RELATIONSHIPS IN LARGE DATASETS

In two previous blog posts I discussed some techniques for visualizing relationships involving two or three variables and a large number of cases. In this tutorial I will extend that discussion to show some techniques that can be used on large datase

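Mathematical derivation of Alternating Least Squares (ALS) for Implicit Feedback Datasets and a Python implementation

I have recently been reading papers on CF (collaborative filtering). The ideas in "Collaborative Filtering for Implicit Feedback Datasets" are very good and very easy to understand. However, the derivation that leads from the objective function to the update formulas for X_u and Y_i is not described well, so the derivation is written out below. First, take the derivative with respect to X_u: here Y is the item matrix, of dimension n*f, where each row is an item_vec; C^u is an n*n diagonal matrix whose diagonal elements are c_ui; and P(u) is an n*1 column vector whose i-th element is p_ui. Then, setting the derivative to 0 gives: Because x_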
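The formulas did not survive in this excerpt. For reference, the closed-form user update given in the cited paper is x_u = (Y^T C^u Y + λI)^{-1} Y^T C^u p(u), with the item update obtained by swapping the roles of users and items. The sketch below implements that update with NumPy. It is not the author's implementation: the function names, the dense-matrix input, the confidence form c_ui = 1 + alpha * r_ui, and the default values of `alpha`, `lam`, and `n_iters` are assumptions chosen for illustration.

```python
import numpy as np


def update_user_factors(R, Y, alpha=40.0, lam=0.1):
    """Solve for every user vector x_u while the item factors Y stay fixed.

    R : (m, n) dense array of raw implicit feedback r_ui (assumed input format).
    Y : (n, f) item-factor matrix, one item vector per row.
    """
    m, _ = R.shape
    f = Y.shape[1]
    P = (R > 0).astype(float)   # preferences p_ui: 1 if any feedback, else 0
    C = 1.0 + alpha * R         # confidences c_ui (assumed linear form)
    reg = lam * np.eye(f)
    X = np.empty((m, f))
    for u in range(m):
        Cu = C[u]                            # diagonal of C^u as a length-n vector
        A = Y.T @ (Cu[:, None] * Y) + reg    # Y^T C^u Y + lam * I
        b = Y.T @ (Cu * P[u])                # Y^T C^u p(u)
        X[u] = np.linalg.solve(A, b)         # x_u without forming an explicit inverse
    return X


def implicit_als(R, f=2, alpha=40.0, lam=0.1, n_iters=10, seed=0):
    """Alternate the user and item updates for a fixed number of sweeps."""
    R = np.asarray(R, dtype=float)
    rng = np.random.default_rng(seed)
    Y = rng.normal(scale=0.1, size=(R.shape[1], f))
    for _ in range(n_iters):
        X = update_user_factors(R, Y, alpha, lam)
        # The item update has the same form with users and items swapped.
        Y = update_user_factors(R.T, X, alpha, lam)
    return X, Y
```

Treating C^u as a vector avoids building the n*n diagonal matrix for every user. A production version would typically also precompute Y^T Y and use sparse storage, which this sketch leaves out for brevity.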