Data Warehouse

Knowledge Discovery Process

OLTP & OLAP

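Online transaction processing (OLTP) systems: cover most of an organization's day-to-day operations, such as purchasing, inventory, banking, manufacturing, payroll, registration, and accounting.

Online analytical processing (OLAP) systems: organize and present data in different formats to meet the varied needs of different users, in service of data analysis and decision making.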

Distinct features (OLTP vs. OLAP):

User and system orientation: customer vs. market

Data contents: current, detailed vs. historical, consolidated

View: current, local vs. evolutionary, integrated

Access patterns: update vs. read-only but complex queries
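
To make the access-pattern contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns, and the sample rows are hypothetical and chosen only for illustration; a real warehouse would normally live in a separate system from the transactional database.

```python
import sqlite3

# Hypothetical in-memory table standing in for an operational database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer, region, amount, status) VALUES (?, ?, ?, ?)",
    [
        ("alice", "north", 120.0, "new"),
        ("bob", "south", 80.0, "new"),
        ("carol", "north", 50.0, "shipped"),
    ],
)

# OLTP-style access: a short transaction that updates one current, detailed record.
with conn:
    conn.execute("UPDATE orders SET status = 'shipped' WHERE id = ?", (2,))

# OLAP-style access: a read-only query that consolidates many records.
olap_rows = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) "
    "FROM orders GROUP BY region ORDER BY region"
).fetchall()

print(olap_rows)
```

The update touches a single row by key, while the analytical query scans and summarizes the whole table without modifying it.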

Data Warehouse

DBMS: tuned for OLTP (access methods, indexing, concurrency control, recovery)

Warehouse: tuned for OLAP (complex OLAP queries, multidimensional view, consolidation)

Data Warehouse:

A data warehouse integrates business data scattered across different information islands in the enterprise network and stores it in a single, integrated relational database. With this integrated information, users can access data easily, and decision makers can analyze historical data over a period of time to study how things are trending.

“A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process.” (W. H. Inmon)

Data stored in a data warehouse has been processed through extraction, cleaning, transformation, loading (sorting, summarizing, ...), and refresh.

Data warehouse model: dimensions and measures. You locate data by its dimensions and view it through its measures.

Conceptual model: star schema, snowflake schema (a refinement of the star schema), fact constellation (a collection of stars)
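
The processing steps above can be sketched as a tiny pipeline in plain Python. This is only an illustration under assumed inputs: the raw_rows records, the field names, and the dictionary used as the "warehouse" are all hypothetical, and real ETL tools do far more.

```python
from collections import defaultdict

# Hypothetical raw records as they might arrive from an operational source.
raw_rows = [
    {"date": "2024-01-05", "city": " Beijing ", "amount": "120.5"},
    {"date": "2024-01-05", "city": "beijing", "amount": "30"},
    {"date": "2024-01-06", "city": "Shanghai", "amount": ""},  # missing amount
    {"date": "2024-01-06", "city": "Shanghai", "amount": "80"},
]


def extract(rows):
    """Extraction: copy the records out of the source."""
    return list(rows)


def clean(rows):
    """Cleaning: drop records with no amount and normalize city names."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue
        cleaned.append({**row, "city": row["city"].strip().title()})
    return cleaned


def transform(rows):
    """Transformation: convert the amount from text to a number."""
    return [
        {"date": r["date"], "city": r["city"], "amount": float(r["amount"])}
        for r in rows
    ]


def load(rows, warehouse):
    """Loading: summarize by (date, city) and store the totals in sorted key order."""
    totals = defaultdict(float)
    for r in rows:
        totals[(r["date"], r["city"])] += r["amount"]
    for key in sorted(totals):
        warehouse[key] = warehouse.get(key, 0.0) + totals[key]
    return warehouse


warehouse = load(transform(clean(extract(raw_rows))), {})
print(warehouse)
```

In this sketch, a refresh would amount to running the same pipeline on newly arrived rows and passing the existing warehouse dictionary to load again.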

Example of Star Schema:
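
As a stand-in illustration, the sketch below builds a small star schema with Python's built-in sqlite3 module: one central fact table holding the measures, surrounded by dimension tables. All table names, column names, and rows (sales_fact, time_dim, item_dim, location_dim) are hypothetical and are not taken from the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: one per dimension, each with a surrogate key.
    CREATE TABLE time_dim (
        time_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER,
        quarter TEXT, year INTEGER);
    CREATE TABLE item_dim (
        item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT, item_type TEXT);
    CREATE TABLE location_dim (
        location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);

    -- Fact table: foreign keys to every dimension plus the measures.
    CREATE TABLE sales_fact (
        time_key INTEGER REFERENCES time_dim(time_key),
        item_key INTEGER REFERENCES item_dim(item_key),
        location_key INTEGER REFERENCES location_dim(location_key),
        units_sold INTEGER,
        dollars_sold REAL);
    """
)
conn.executemany(
    "INSERT INTO time_dim VALUES (?, ?, ?, ?, ?)",
    [(1, 5, 1, "Q1", 2024), (2, 10, 4, "Q2", 2024)],
)
conn.executemany(
    "INSERT INTO item_dim VALUES (?, ?, ?, ?)",
    [(1, "laptop", "BrandA", "computer"), (2, "phone", "BrandB", "mobile")],
)
conn.executemany(
    "INSERT INTO location_dim VALUES (?, ?, ?)",
    [(1, "Beijing", "China"), (2, "Shanghai", "China")],
)
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 2, 2000.0), (1, 2, 2, 5, 2500.0), (2, 1, 2, 1, 1000.0)],
)

# Locate data by dimensions (quarter, city) and view it through the measures.
star_rows = conn.execute(
    """
    SELECT t.quarter, l.city, SUM(f.units_sold), SUM(f.dollars_sold)
    FROM sales_fact AS f
    JOIN time_dim AS t ON f.time_key = t.time_key
    JOIN location_dim AS l ON f.location_key = l.location_key
    GROUP BY t.quarter, l.city
    ORDER BY t.quarter, l.city
    """
).fetchall()

print(star_rows)
```

A snowflake schema would further normalize a dimension table, for example by moving country out of location_dim into its own table.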

Typical OLAP Operations :

Roll up: summarize data by climbing up a hierarchy or by dimension reduction; rolling a dimension up to "all" removes that dimension

Drill down: reverse of roll up, from a higher-level summary to a lower-level summary or detailed data

Slice and dice: project and select

Pivot (rotate): reorient the cube for visualization, e.g. 3D to a series of 2D planes
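
The operations above can be sketched on a toy cube in plain Python. The cube contents, the dimension names in DIMS, the CITY_TO_COUNTRY hierarchy, and all function names are hypothetical helpers for illustration; OLAP servers implement these operations very differently.

```python
from collections import defaultdict

# Hypothetical cube: (quarter, city, item) -> units sold.
DIMS = ("quarter", "city", "item")
cube = {
    ("Q1", "Beijing", "laptop"): 2,
    ("Q1", "Shanghai", "phone"): 5,
    ("Q2", "Shanghai", "laptop"): 1,
    ("Q2", "Beijing", "phone"): 4,
}
CITY_TO_COUNTRY = {"Beijing": "China", "Shanghai": "China"}


def roll_up_drop(cells, dim):
    """Roll up by dimension reduction: aggregate `dim` to 'all' and drop it."""
    idx = DIMS.index(dim)
    out = defaultdict(int)
    for key, value in cells.items():
        out[key[:idx] + key[idx + 1:]] += value
    return dict(out)


def roll_up_hierarchy(cells, dim, mapping):
    """Roll up by climbing a hierarchy: replace each member with its parent."""
    idx = DIMS.index(dim)
    out = defaultdict(int)
    for key, value in cells.items():
        out[key[:idx] + (mapping[key[idx]],) + key[idx + 1:]] += value
    return dict(out)


def slice_cube(cells, dim, value):
    """Slice: select the cells where one dimension equals a single value."""
    idx = DIMS.index(dim)
    return {k: v for k, v in cells.items() if k[idx] == value}


def dice(cells, conditions):
    """Dice: select on several dimensions; `conditions` maps dim -> allowed values."""
    return {
        k: v
        for k, v in cells.items()
        if all(k[DIMS.index(d)] in allowed for d, allowed in conditions.items())
    }


def pivot(table2d):
    """Pivot: swap the two axes of a 2-D view."""
    return {(b, a): v for (a, b), v in table2d.items()}


by_quarter_city = roll_up_drop(cube, "item")
by_country = roll_up_hierarchy(cube, "city", CITY_TO_COUNTRY)
q1_only = slice_cube(cube, "quarter", "Q1")
shanghai = dice(cube, {"quarter": {"Q1", "Q2"}, "city": {"Shanghai"}})
city_by_quarter = pivot(by_quarter_city)
```

Drill down is not a separate function here: it corresponds to going back from a summary such as by_country to the more detailed cube it was computed from, which is why the detailed data has to be kept.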

References

Slides from the Data Mining course, University of Chinese Academy of Sciences

Time: 2024-10-10 10:33:20
