Machine Learning with Oracle Database Advanced Analytics

Article from: http://www.ateam-oracle.com/ml-with-oracle-database-cloud-advanced-analytics

  • Oracle DB + Oracle Data Mining + Oracle R Enterprise = Database Advanced Analytics (OAA)
  • The Database Advanced Analytics option lets you run Machine Learning algorithms within the database itself.
  • The on-prem flavor of OAA is available with Oracle Database Enterprise Edition, release 12c and later.
  • In Oracle Public Cloud (OPC), OAA is available in three forms: DBCS High Performance Edition in OCI Classic, the High Performance Edition of a DB System in OCI, and Autonomous Data Warehouse Cloud (ADWC) in OCI, which is offered on database version 18c only.
  • "Oracle Machine Learning" is a Zeppelin-based SQL notebook that is available with ADWC.

1. Introduction

If you are reading this blog then you already know what Machine Learning (ML) is. But it always helps to have a formal definition as a frame of reference, such as the one provided by Tom Mitchell:

Machine Learning is the study of algorithms that learn from experience E with respect to some class of tasks T and performance measure P, such that the algorithms' performance at tasks in T, as measured by P, improves with experience E.

The most important part of the definition above, as any data scientist would agree, is the experience E or the data the algorithm (a.k.a. ML model) trains on. Almost always it is the data that differentiates a great ML model from a good one.

Now, if you're an enterprise customer embarking on an ML project, chances are the data is generated by one of your back-end systems and stored in an Oracle database. If your data science team uses open source libraries (say, Python's scikit-learn) for ML, this data must typically be packaged and moved to a different computing infrastructure for analysis. Such data movement costs time and raises security concerns, among other issues.

But what if the data never had to be moved, and the Oracle database itself could do all the Machine Learning for you? This is where Oracle Database Advanced Analytics (OAA) comes in.
OAA provides parallel, in-database implementation of the commonly used Machine Learning algorithms, ensuring the data always stays within the database.
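
To make the idea of in-database computation concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module purely as a stand-in for a database (it is not Oracle and does not call any OAA function), and the table name `sales` and its columns are hypothetical. The least-squares fit is driven by SQL aggregates that run inside the database engine, so only a handful of summary values cross the connection rather than the rows themselves.

```python
import sqlite3


def fit_line_in_database(conn, table, x_col, y_col):
    """Fit y = slope * x + intercept using SQL aggregates only.

    The rows never leave the database engine; the client receives
    five aggregate values and derives the two coefficients from them.
    """
    # Identifiers are interpolated directly, so they must be trusted names.
    row = conn.execute(
        f"SELECT COUNT(*), SUM({x_col}), SUM({y_col}), "
        f"SUM({x_col} * {x_col}), SUM({x_col} * {y_col}) FROM {table}"
    ).fetchone()
    n, sx, sy, sxx, sxy = row
    # Closed-form ordinary least squares for a single predictor.
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical data: an in-memory table where revenue = 2 * ad_spend + 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (ad_spend REAL, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(float(x), 2.0 * x + 1.0) for x in range(10)],
)

slope, intercept = fit_line_in_database(conn, "sales", "ad_spend", "revenue")
print(slope, intercept)
```

The same pattern, with the computation pushed to where the data lives, is what OAA offers at a much larger scale and with far richer algorithms.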

OAA is the best place to start the ML journey for any enterprise customer because, as any data scientist would tell you, most modern enterprise ML problems can be solved by the simplest of regression algorithms if the data is good.

In this blog I will provide an overview of OAA, describe its on-prem and Oracle Public Cloud (OPC) avatars, outline the provisioning process for OPC, and finally point you to some awesome blogs and documentation pages to bookmark so you can stay up to date.

2. Product Overview

Oracle Advanced Analytics (OAA) provides an in-database implementation of various Machine Learning algorithms, and it integrates with the open source R language.

The description above can be broken down into two components, which are essentially what OAA consists of:

2.1 Oracle Data Mining

Historically ODM used to be a separate product, but it has now been bundled as part of the Advanced Analytics offering.

ODM provides a set of pre-implemented Machine Learning models that are available to use as SQL functions. These functions are executed in-memory within the database itself, taking full advantage of all the parallelism built within the database.

The list of algorithms includes most of the commonly used ones for various ML problem categories, such as:

  • Classification: Naive Bayes, SVM, Decision Tree
  • Regression: GLM, Logistic Regression
  • Anomaly Detection: One-Class SVM
  • Clustering: K-Means, Orthogonal Partitioning Clustering
  • Association: Apriori
  • Feature Extraction: Matrix Factorization, PCA, SVD

The list above is by no means exhaustive, and with every release of the product more algorithms are being added.
The most up-to-date list of algorithms and their usage can be found in the Oracle Data Mining documentation.
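
As a rough illustration of what "models as SQL functions" looks like in practice, the sketch below uses Python only to assemble the statements as text; nothing is executed against a database. It assumes the documented `DBMS_DATA_MINING.CREATE_MODEL` procedure and the `PREDICTION` SQL function, so please check the parameter names against the documentation for your release. The model, table, and column names (`CHURN_MODEL`, `CUSTOMERS`, `CUST_ID`, `CHURNED`) are hypothetical, and the optional settings table that selects a specific algorithm is left out.

```python
def create_model_plsql(model_name, table, case_id, target):
    """Return an anonymous PL/SQL block that trains a classification model.

    Assumption: parameter names follow DBMS_DATA_MINING.CREATE_MODEL;
    the optional settings table (algorithm choice) is omitted here.
    """
    return (
        "BEGIN\n"
        "  DBMS_DATA_MINING.CREATE_MODEL(\n"
        f"    model_name          => '{model_name}',\n"
        "    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,\n"
        f"    data_table_name     => '{table}',\n"
        f"    case_id_column_name => '{case_id}',\n"
        f"    target_column_name  => '{target}');\n"
        "END;"
    )


def scoring_sql(model_name, table, case_id):
    """Return a query that scores every row with the PREDICTION function."""
    return (
        f"SELECT {case_id}, PREDICTION({model_name} USING *) AS predicted "
        f"FROM {table}"
    )


# Hypothetical names, shown only to make the generated text readable.
train_block = create_model_plsql("CHURN_MODEL", "CUSTOMERS", "CUST_ID", "CHURNED")
score_query = scoring_sql("CHURN_MODEL", "CUSTOMERS", "CUST_ID")
print(train_block)
print(score_query)
```

The point of the sketch is that both training and scoring are ordinary database calls, so the rows being learned from and scored stay in their tables.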

Also, not only does ODM do all the heavy lifting of implementing the ML algorithms, it also provides a SQL Developer-based GUI component, called Data Miner GUI, which enables building the ML model in a UI-driven workflow, right from SQL Developer itself.

2.2 Oracle R Enterprise

ORE essentially integrates the R programming language with Oracle database. It is a set of R packages and Oracle database features that enable the R user to operate on database-resident data without using any SQL and execute R scripts that run directly on the database, thus offering the data scientist an R interface to the database.

R users can develop and test R scripts interactively, use CRAN and other packages with the database, and use Oracle database tables as R objects. ORE has overloaded functions that translate R operations into SQL that executes in the database. Similarly the output from the database operation is converted back to R objects.
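
The translation idea can be illustrated with a small Python analogy. This is not ORE and not its API; the `TableProxy`, `Column`, and `Predicate` classes are hypothetical names invented for this sketch, and sqlite3 stands in for the database. Overloaded operators build SQL text instead of computing on local data, the database executes it, and only the result comes back as an ordinary Python value.

```python
import sqlite3


class Predicate:
    """A WHERE-clause fragment plus its bind parameters."""

    def __init__(self, sql, params):
        self.sql = sql
        self.params = params


class Column:
    """Proxy for a database column; comparisons build SQL, not booleans."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        return Predicate(f"{self.name} > ?", [value])


class TableProxy:
    """Looks like a local data frame but defers all work to the database."""

    def __init__(self, conn, table, where=None):
        self.conn = conn
        self.table = table
        self.where = where

    def __getitem__(self, key):
        # A string selects a column; a Predicate filters the rows.
        if isinstance(key, str):
            return Column(key)
        return TableProxy(self.conn, self.table, key)

    def mean(self, column):
        # The average is computed by the database, not in Python.
        sql = f"SELECT AVG({column}) FROM {self.table}"
        params = []
        if self.where is not None:
            sql += f" WHERE {self.where.sql}"
            params = self.where.params
        return self.conn.execute(sql, params).fetchone()[0]


# Hypothetical table used only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?)", [(10.0,), (20.0,), (30.0,), (40.0,)])

orders = TableProxy(conn, "orders")
big_orders = orders[orders["amount"] > 15]  # builds SQL, fetches nothing
print(big_orders.mean("amount"))            # runs AVG(...) WHERE amount > ?
```

The user writes familiar data-frame style code, while the filtering and aggregation happen next to the data, which is the essence of the overloading approach described above.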

3. OAA on-prem

Now that we know what the product is, how do we try it out?

Easy. Advanced Analytics is basically a database product, available as part of the 'Enterprise Edition' of Oracle Database on-premises. It can be downloaded from Oracle's website, along with the associated R and SQL Developer components.

4. OAA in the Oracle Public Cloud (OPC)

While the on-prem approach described above works, much easier options exist if you have an OPC subscription.

There are three ways to test-drive OAA in OPC, as described below.

4.1.1 Database Cloud Service

The Oracle Database Cloud Service is the de facto database offering in OPC. It comes in several editions depending on the functionality you need; according to the pricing sheet, the High Performance package and above are provisioned with the Advanced Analytics features.

On the create database instance screen, simply select the right database version and edition, and you're good to go.

Please note that the option described here provisions the instance in Oracle Cloud Infrastructure Classic (OCI Classic).

4.1.2 DB System

The second approach is to launch a DB System in the newer Oracle Cloud Infrastructure (OCI) environment. In OCI, simply go to DB Systems -> Launch DB System, and select the right flavor.

4.1.3 Autonomous Data Warehouse Cloud (ADWC)

The third approach is to use the newly introduced, self-driving ADWC. Go to Autonomous Data Warehouse, click on Create, fill in the details, and the instance is provisioned for you. Please note that ADWC comes only with the 18c database version.

4.2 ADWC and 'Oracle Machine Learning'

The Oracle Autonomous Data Warehouse Cloud is a fully managed database service that is easy to set up (as the provisioning steps above show), based on Exadata technology, and truly elastic, so that compute and storage can be scaled up or down without any downtime.
It integrates with a number of other Oracle Cloud services including Analytics Cloud, Data Integration Cloud, etc.

ADWC provides two interfaces to access the database:

1. Traditional SQL Developer, over a SQL*Net connection.
2. The newly introduced "Oracle Machine Learning" (OML) notebook. OML is a Zeppelin-based SQL notebook interface, available with ADWC only. It lets you write SQL scripts alongside supporting documentation, assumptions, approaches, and so on, to increase productivity.

5. Conclusion

I hope this blog provided a good overview of the Oracle Advanced Analytics landscape.

In my next blog I will go through a worked example that uses OAA to detect anomalies in a sample dataset.

In the meantime, below are a few resources to get started with OAA:
1. Oracle Data Mining Tutorial Series
2. Oracle R Tutorial Series
3. Oracle Machine Learning Tutorial

Also, OAA is constantly evolving with new features in every release, and I strongly recommend following Charlie Berger (who leads the Product Management of OAA) at https://blogs.oracle.com/author/charlie-berger to remain updated with the latest features.

Original post: https://www.cnblogs.com/aiden-liu/p/10798486.html
