Data Architecture: A Primer for the Data Scientist, 2nd Edition, by WH Inmon, Daniel Linstedt, Mary Levins (ISBN 9780128169162, 0128169168)

Product details:
ISBN 10: 0128169168
ISBN 13: 9780128169162
Authors: WH Inmon, Daniel Linstedt, Mary Levins
Over the past 5 years, the concept of big data has matured, data science has grown exponentially, and data architecture has become a standard part of organizational decision-making. Throughout all this change, the basic principles that shape the architecture of data have remained the same. There remains a need for people to take a look at the “bigger picture” and to understand where their data fit into the grand scheme of things.
Data Architecture: A Primer for the Data Scientist, Second Edition addresses the larger architectural picture of how big data fits within the existing information infrastructure or data warehousing systems. This is an essential topic not only for data scientists, analysts, and managers but also for researchers and engineers who increasingly need to deal with large and complex sets of data. Until data are gathered and can be placed into an existing framework or architecture, they cannot be used to their full potential. Drawing upon years of practical experience and using numerous examples and case studies from across various industries, the authors seek to explain this larger picture into which big data fits, giving data scientists the necessary context for how pieces of the puzzle should fit together.
New case studies include expanded coverage of textual management and analytics
New chapters on visualization and big data
Discussion of new visualizations of the end-state architecture
Table of contents:
Chapter 1.1: An Introduction to Data Architecture
Abstract
Subdividing Data
Repetitive/Nonrepetitive Unstructured Data
The Great Divide of Data
Textual/Nontextual Data
The Different Forms of Data
Business Value
Chapter 1.2: The Data Infrastructure
Abstract
Two Types of Repetitive Data
Repetitive Structured Data
Repetitive Big Data
The Two Infrastructures
What’s Being Optimized?
Comparing the Two Infrastructures
Chapter 1.3: The “Great Divide”
Abstract
Classifying Corporate Data
The “Great Divide”
Repetitive Unstructured Data
Nonrepetitive Unstructured Data
Different Worlds
Chapter 1.4: Demographics of Corporate Data
Abstract
Chapter 1.5: Corporate Data Analysis
Abstract
Chapter 1.6: The Life Cycle of Data: Understanding Data Over Time
Abstract
Chapter 1.7: A Brief History of Data
Abstract
Paper Tape and Punch Cards
Magnetic Tapes
Disk Storage
Data Base Management System (DBMS)
Coupled Processors
Online Transaction Processing
Data Warehouse
Parallel Data Management
Data Vault
Big Data
The Great Divide
Chapter 2.1: The End-State Architecture—The “World Map”
Abstract
Architectural Components
Different Kinds of Data in the End State Architecture
Shaping the Data Through Models
Where Is the Data Warehouse?
Where Different Types of Questions Are Answered Across the End State Architecture
Data in the Data Lake
Metadata in the End State Architecture
Networked Metadata
An Evolutionary Experience
The Data Lake Architecture
Chapter 3.1: Transformations in the End-State Architecture
Abstract
Redundant Data
Transformations
Customizing Data
Transforming Text
Transforming Application Data
Transforming Data Into a Customized State
Transforming Data Into Bulk Storage
Transforming Data Generated Automatically
Transforming Bulk Data
Transformation and Redundancy
Chapter 4.1: A Brief History of Big Data
Abstract
An Analogy—Taking the High Ground
Taking the High Ground
Standardization With the 360
Online Transaction Processing
Enter Teradata and MPP Processing
Then Came Hadoop and Big Data
IBM and Hadoop
Holding the High Ground
Chapter 4.2: What Is Big Data?
Abstract
Another Definition
Large Volumes
Inexpensive Storage
The Roman Census Approach
Unstructured Data
Data in Big Data
Context in Repetitive Data
Nonrepetitive Data
Context in Nonrepetitive Data
Chapter 4.3: Parallel Processing
Abstract
Chapter 4.4: Unstructured Data
Abstract
Textual Information—Everywhere
Decisions Based on Structured Data
The Business Value Proposition
Repetitive and Nonrepetitive Unstructured Information
Ease of Analysis
Contextualization
Some Approaches to Contextualization
Map Reduce
Manual Analysis
Chapter 4.5: Contextualizing Repetitive Unstructured Data
Abstract
Parsing Repetitive Unstructured Data
Recasting the Output Data
Chapter 4.6: Textual Disambiguation
Abstract
From Narrative Into an Analytical Data Base
Input Into Textual Disambiguation
Mapping
Input/Output
Document Fracturing/Named Value Processing
Preprocessing a Document
E-mails—A Special Case
Spreadsheets
Report Decompilation
Chapter 4.7: Taxonomies
Abstract
Data Models/Taxonomies
Applicability of Taxonomies
What Is a Taxonomy?
Taxonomies in Multiple Languages
Commercial or Private Taxonomies?
Dynamics of Taxonomies and Textual Disambiguation
Taxonomies and Textual Disambiguation—Separate Technologies
Different Types of Taxonomies
Taxonomies—Maintenance Over Time
Chapter 5.1: The Siloed Application Environment
Abstract
The Challenge of Siloed Applications
Building Siloed Applications
What Does a Siloed Application Look Like?
Current Valued Data
Minimal Historical Data
High Availability
Overlap Between Siloed Applications
Frozen Business Requirements
Dismantling Siloed Applications
Chapter 6.1: Introduction to Data Vault 2.0
Abstract
Data Vault Origins and Background
What Is Data Vault 2.0 Modeling?
How Is Data Vault 2.0 Methodology Defined?
Why Do We Need a Data Vault 2.0 Architecture?
Where Does Data Vault 2.0 Implementation Fit?
What Are the Business Benefits of Data Vault 2.0?
What Is Data Vault 1.0?
Chapter 6.2: Introduction to Data Vault Modeling
Abstract
What Is a Data Vault Model Concept?
Data Vault Model Defined
Components of a Data Vault Model
What Makes Business Keys So Interesting?
What Does This Have to Do With Data Vault and Data Warehousing?
How Does This Translate to Data Vault Modeling?
Why Restructure the Data From the Staging Area?
What Are the Basic Rules of the Data Vault Model?
Why Do We Need Many to Many Link Structures?
Primary Key Options for Data Vault 2.0
Chapter 6.3: Introduction to Data Vault Architecture
Abstract
What Is a Data Vault 2.0 Architecture?
How Does NoSQL Fit in to the Architecture?
What Are the Objectives of the Data Vault 2.0 Architecture?
What Is the Objective of the Data Vault 2.0 Model?
What Are Hard and Soft Business Rules?
How Does Managed Self Service BI Fit in the Architecture?
Chapter 6.4: Introduction to Data Vault Methodology
Abstract
Data Vault 2.0 Methodology Overview
How Does CMMI Contribute to the Methodology?
If CMMI Is So Great, Why Should We Care About Agility Then?
Why Include PMP, SDLC If CMMI and Agile Should Be All That’s Needed?
So Then, What Does Six Sigma Contribute to the Data Vault 2 Methodology?
Where Does TQM (Total Quality Management) Fit in to All of This?
Chapter 6.5: Introduction to Data Vault Implementation
Abstract
Implementation Overview
What’s So Important About Patterns?
Why Does Reengineering Happen Because of Big Data?
Why Do We Need to Virtualize Our Data Marts?
What Is Managed Self-Service BI?
Chapter 7.1: The Operational Environment: A Short History
Abstract
Commercial Uses of the Computer
The First Applications
Ed Yourdon and the Structured Revolution
The SDLC
Disk Technology
Enter the DBMS
Response Time and Availability
Corporate Computing Today
Chapter 7.2: The Standard Work Unit
Abstract
Elements of Response Time
An Hourglass Analogy
The Racetrack Analogy
Your Vehicle Runs as Fast as the Vehicle in Front of It
The Standard Work Unit
The SLA
Chapter 7.3: Data Modeling for the Structured Environment
Abstract
The Purpose of the Roadmap
Granular Data Only
The ERD
The DIS
Physical Data Base Design
Relating the Different Levels of the Data Model
An Example of the Linkage
Generic Data Models
Operational Data Models/Data Warehouse Data Models
Chapter 8.1: A Brief History of Data Architecture
Abstract
Chapter 8.2: Big Data/Existing System Interface
Abstract
The Big Data/Existing Systems Interface
The Repetitive Raw Big Data/Existing Systems Interface
Exception Based Data
The Nonrepetitive Raw Big Data/Existing Systems Interface
Into the Existing Systems Environment
The “Context Enriched” Big Data Environment
Analyzing Structured Data/Unstructured Data Together
Chapter 8.3: The Data Warehouse/Operational Environment Interface
Abstract
The Operational/Data Warehouse Interface
The Classical ETL Interface
The ODS and the ETL Interface
The Staging Area
Changed Data Capture
Inline Transformation
ELT Processing
Chapter 8.4: Data Architecture: A High-Level Perspective
Abstract
A High Level Perspective
Redundancy
The System of Record
Different Types of Questions
Different Communities
Chapter 9.1: Repetitive Analytics: Some Basics
Abstract
Different Kinds of Analysis
Looking for Patterns
Heuristic Processing
Freezing Data
The Sandbox
The “Normal” Profile
Distillation, Filtering
Subsetting Data
Bias of the Sample
Filtering Data
Repetitive Data and Context
Linking Repetitive Records
Log Tape Records
Analyzing Points of Data
Outliers
Data Over Time
Chapter 9.2: Analyzing Repetitive Data
Abstract
Log Data
Active/Passive Indexing of Data
Summary/Detailed Data
Metadata in Big Data
Linking Data
Chapter 9.3: Repetitive Analysis
Abstract
Internal, External Data
Universal Identifiers
Security
Filtering, Distillation
Archiving Results
Metrics
Chapter 10.1: Nonrepetitive Data
Abstract
Inline Contextualization
Taxonomy/Ontology Processing
Custom Variables
Homographic Resolution
Acronym Resolution
Negation Analysis
Numeric Tagging
Date Tagging
Date Standardization
List Processing
Associative Word Processing
Stop Word Processing
Word Stemming
Document Metadata
Document Classification
Proximity Analysis
Functional Sequencing Within Textual ETL
Internal Referential Integrity
Preprocessing, Postprocessing
Chapter 10.2: Mapping
Abstract
Chapter 10.3: Analytics From Nonrepetitive Data
Abstract
Call Center Information
Medical Records
Chapter 11.1: Operational Analytics: Response Time
Abstract
Transaction Response Time
Chapter 12.1: Operational Analytics
Abstract
Different Perspectives of Data
Data Marts
The Operational Data Store—ODS
Chapter 13.1: Personal Analytics
Abstract
Chapter 14.1: Data Models Across the End-State Architecture
Abstract
The Different Data Models
Functional Decomposition and Data Flow Diagrams
The Corporate Data Model
The Star Join/Dimensional Data Model
Taxonomies/Ontologies
The Selective Subdivision of Data
Proactive/Reactive Data Models
Chapter 15.1: The System of Record
Abstract
The End User Cycle of Awareness
The System of Record
The System of Record in the End State Architecture
The Role of Age in the System of Record
A Simple Example
The Flow of Data in the System of Record
Other Data Than the System of Record
Is Data Updated in the System of Record?
Detailed and Summary Data in the System of Record
Auditing Data and the System of Record
Text and the System of Record
Chapter 16.1: Business Value and the End-State Architecture
Abstract
The Evolution of the End State Architecture
What Is Meant by “Business Value”
Tactical Business Value/Strategic Business Value
Volume of Data Versus Business Value
The “Million in One” Syndrome
Where Business Value Occurs
Data Relevancy Over Time
Where Tactical Decisions Are Made
Chapter 17.1: Managing Text
Abstract
The Challenge of Text
The Challenge of Context
The Processing Components of Textual ETL
Secondary Analysis
Visualization
Merging Text Based Data and Structured Data
Chapter 18.1: An Introduction to Data Visualizations
Abstract
Introduction to Data Visualizations—Overview
Purpose and Context
Visualization—A Science and an Art
Visualization Framework
Step 1: Define
Step 2: Data
Step 3: Design
Step 4: Distribute
Data Visualization Tools and Software


