Data Warehouse & BI Specialization

Data Warehousing Concepts

A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources.

In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.
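As a rough illustration of the ETL flow described above, the sketch below extracts a few transaction rows, transforms them, loads them into a warehouse-style table, and runs an analytical query. It is a minimal sketch, not a production design: the sample rows, the `sales_fact` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# Hypothetical transaction rows, as they might arrive from an operational system.
SOURCE_ROWS = [
    ("2024-01-05", "north", "120.50"),
    ("2024-01-05", "south", "80.00"),
    ("2024-02-11", "north", "42.25"),
]


def extract():
    """Extract: read raw rows from the (hypothetical) source."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: derive a month key, normalize the region, cast the amount."""
    return [
        (date[:7], region.upper(), float(amount))
        for date, region, amount in rows
    ]


def load(conn, rows):
    """Load: write the cleaned rows into an illustrative fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(month TEXT, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()


def monthly_totals(conn):
    """Analysis query: total sales per month and region."""
    cursor = conn.execute(
        "SELECT month, region, SUM(amount) FROM sales_fact "
        "GROUP BY month, region ORDER BY month, region"
    )
    return cursor.fetchall()


# An in-memory database stands in for the warehouse (assumption for the sketch).
conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
totals = monthly_totals(conn)
for month, region, amount in totals:
    print(month, region, amount)
```

The query runs against the loaded table rather than the source rows, which mirrors how a warehouse keeps analysis workload separate from transaction workload.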

Introduction to Data Warehousing
  • Benefits of Data Warehousing
  • Operational System vs. Data Warehousing System
  • Data Warehouse Characteristics
  • Data Warehouse Definition
Data Warehouse Concepts
  • Extract, Transform, Load (ETL)
  • Data Mart
  • Data Integrity
  • Data Model
  • Slowly Changing Dimensions (SCD)
  • Online Analytical Processing (OLAP)
  • Schema
  • Dimensions
  • Fact
  • Dimensional Database
Database Concepts
  • Database
  • Database Keys
  • Database Anomalies
  • Normal Forms
Introduction to Business Intelligence

Data Science Specialization

Data Science is the process of extracting valuable insights from “data”.

Data Science refers to the skills, technologies, applications, and practices for continuous, iterative exploration and investigation of past business performance to gain insight and drive business planning.

Data scientists are those who crack complex data problems with their strong expertise in certain scientific disciplines. They work with several elements related to mathematics, statistics, computer science, etc. (though they may not be experts in all these fields).

Data Scientists are Business Analysts or Data Analysts, with a difference!

Key Concepts:
  • Data Science
  • Data Manipulations using ‘R’ Language
  • Machine Learning Techniques
  • Hadoop Architecture
  • R & Hadoop Integration
  • NoSQL Databases

Data Science syllabus outline
  • A. Introduction
    • Examples, data science articulated, history and context, technology landscape
  • B. Data Manipulation at Scale
    • Databases and the relational algebra
    • Parallel databases, parallel query processing, in-database analytics
    • MapReduce, Hadoop, relationship to databases, algorithms, extensions, languages
    • Key-value stores and NoSQL; tradeoffs of SQL and NoSQL
  • C. Data Analytics
    • Statistical modeling: basic concepts, experiment design, pitfalls
    • Machine learning: supervised learning (rules, trees, forests, nearest neighbor, regression), optimization (gradient descent and variants), unsupervised learning
  • D. Communicating Results
    • Visualization, data products, visual data analytics
    • Provenance, privacy, ethics, governance
  • E. Special Topics
    • Graph Analytics: structure, traversals, analytics, PageRank, community detection, recursive queries, and semantic web

