SQL Server - Al Bada Services

Data Warehouse Architecture
James Serra, Data Warehouse/BI/MDM Architect
[email protected]
http://JamesSerra.com/

About me
  • In IT for 28 years
  • Worked as desktop/web/database developer, DBA, BI and DW architect, MDM, PDW
  • Been perm, contractor, consultant, business owner
  • MCSE for SQL Server 2012: Data Platform and BI
  • SME for SQL Server 2012 certs
  • Currently a consultant working with MDS at Schlumberger as an MDM Technical Lead
  • Contributing writer for SQL Server Pro magazine
  • Blog at JamesSerra.com

Agenda
  • Why use a data warehouse?
  • Fast Track Data Warehouse (FTDW)
  • Appliances
  • Data Warehouse vs Data Mart
  • Kimball vs Inmon (Normalized vs Dimensional)
  • Populating a Data Warehouse
  • ETL vs ELT
  • Normalizing and Surrogate Keys
  • SSAS Cubes
  • SQL Server 2012 Tabular Model
  • End-User Microsoft BI Tools

Why use a Data Warehouse?
All these solutions are for data warehouses only (not OLTP).
  • Reduce stress on production system
  • Optimized for read access, sequential disk scans
  • Integrate many sources of data
  • Keep historical records
  • Restructure/rename tables and fields
  • Use Master Data Management
  • No IT involvement needed for users to create reports
  • Improve data quality
  • One version of the truth
  • Easy to create BI solutions on top of it (SSAS cubes)

Why use a Data Warehouse?
Legacy applications + data marts = chaos
  Production Control, MRP, Inventory Control, Parts Management, Finance, Marketing, Sales, Accounting, Logistics, Management Reporting, Shipping, Engineering, Raw Goods, Actuarial, Order Control, Purchasing, Human Resources
Enterprise data warehouse = order
  • Continuity
  • Consolidation
  • Control
  • Compliance
  • Collaboration
  • Single version of the truth

Enterprise Data Warehouse
Every question = decision

Hardware Solutions
  • Fast Track Data Warehouse - A reference configuration optimized for data warehousing. This saves an organization from having to commit resources to configure and build the server hardware. Fast Track Data Warehouse hardware is tested for data warehousing, which eliminates guesswork and is designed to save you months of configuration, setup, testing and tuning. You just need to install the OS and SQL Server.
  • Appliances - Microsoft has made available SQL Server appliances that allow customers to deploy data warehouse (DW), business intelligence (BI) and database consolidation solutions in a very short time, with all the components pre-configured and pre-optimized. These appliances include all the hardware, software and services for a complete, ready-to-run, out-of-the-box, high performance,

Fast Track Data Warehouse
  • Software:
    • SQL Server 2008 R2 Enterprise
    • Windows Server 2008
  • Configuration guidelines:
    • Physical table structures
    • Indexes
    • Compression
    • SQL Server settings
    • Windows Server settings
    • Loading
  • Hardware:
    • Tight specifications for servers, storage and networking
    • Per core building block

Appliances
  • HP Business Data Warehouse Appliance
  • HP Business Decision Appliance
  • HP Database Consolidation Appliance
  • HP Enterprise Data Warehouse Appliance
  • Dell Quickstart Data Warehouse Appliance 1000
  • Dell Quickstart Data Warehouse Appliance 2000
  • Dell Parallel Data Warehouse Appliance

Data Warehouse vs Data Mart
  • Data Warehouse: A single organizational repository of enterprise-wide data across many or all subject areas
    • Holds multiple subject areas
    • Holds very detailed information
    • Works to integrate all data sources
    • Feeds dimensional model
  • Data Mart: Subset of the data warehouse that is usually oriented to a specific subject
    • The logical combination of all the data marts is a data warehouse

In short, a data warehouse contains many subject areas, and a data mart contains just one of those subject areas.

Kimball vs Inmon
Normalized (Inmon) vs Dimensional (Kimball)
  • Normalized:
    • Normalization rules
    • Many tables using joins
  • Dimensional:
    • Facts and dimensions
    • Fewer tables having duplicate data (de-normalized)
    • Easier for user to understand

Kimball vs Inmon
Top-Down (Inmon) vs Bottom-Up (Kimball)
  • Bottom-Up:
    • Data marts
    • Logical data warehouse
    • Decentralized
    • Quick results, iterative approach
  • Top-Down:
    • Enterprise data model
    • Centralized
    • Later create data marts
    • More upfront work but less redo
  • Hybrid: Data Vault

Populating a Data Warehouse
  • Frequency of data pull
  • Full Extraction: all data
  • Incremental Extraction: only data changed from last run
  • Determine data that has changed
    • Timestamp - Last Updated
    • CDC
    • Partitioning
    • Triggers
    • MERGE
  • Online Extraction: data from source
    • Replication
    • Database Snapshot
    • Availability Groups
  • Offline Extraction: data from flat file

ETL vs ELT
  • Extract, Transform, and Load (ETL)
    • Transform while hitting source system
    • No staging tables
    • Processing done by ETL tools (SSIS)
  • Extract, Load, Transform (ELT)
    • Uses staging tables
    • Processing done by target database engine (SSIS: Execute T-SQL Statement task instead of Data Flow Transform tasks)
    • Use for big volumes of data
    • Use when source and target databases are the same
    • Use with PDW
  • ELT is better since the database engine is more efficient than SSIS at transformations
    • Database engine: Transformations
    • SSIS: Data pipeline and workflow management

Normalizing and Surrogate Keys
  • Normalize to eliminate redundant data and set up table relationships
  • Surrogate Keys
    • Unique identifier not derived from source system
    • Embedded in fact tables as foreign keys to dimension tables
    • Allows integrating data from multiple source systems
    • Protects from changes in the source system
    • Allows for slowly changing dimensions
    • Allows you to create rows in the dimension that don't exist in the source (-1 in fact table for unassigned)
    • Improves join performance and reduces database size by using an integer type instead of text

SSAS Cubes
Reasons to use instead of data warehouse:
  • Aggregating (summarizing) the data for performance
  • Multidimensional analysis: slice, dice, drilldown
  • Hierarchies
  • Advanced time calculations, e.g. 12-month rolling average
  • Easily use Excel to view data
  • Slowly Changing Dimensions (SCD)

Data Warehouse Architecture

SQL Server 2012 Tabular Model
  • New xVelocity in-memory database in SSAS
  • Build model in Power Pivot or SSDT
  • Uses existing relational model
  • No star schema, no extra SSIS
  • Uses DAX
  • Faster and easier to use than multidimensional model

End-User Microsoft BI Tools
  • Excel PivotTables
  • SQL Server Reporting Services (SSRS)
  • Report Builder
  • PowerPivot
  • PerformancePoint Services (PPS)
  • Power View

Resources:
  • Data Warehouse Architecture - Kimball and Inmon methodologies: http://bit.ly/SrzNHy
  • SQL Server 2012: Multidimensional vs tabular: http://bit.ly/SrzX1x
  • Data Warehouse vs Data Mart: http://bit.ly/SrAi4p
  • Fast Track Data Warehouse Reference Guide for SQL Server 2012: http://bit.ly/SrAwsj
  • Complex reporting off a SSAS cube: http://bit.ly/SrAEYw
  • Surrogate Keys: http://bit.ly/SrAIrp
  • Normalizing Your Database: http://bit.ly/SrAHnc
  • Difference between ETL and ELT: http://bit.ly/SrAKQa
  • Microsoft's Data Warehouse offerings: http://bit.ly/xAZy9h
  • Microsoft SQL Server Reference Architecture and Appliances: http://bit.ly/y7bXY5
  • Methods for populating a data warehouse: http://bit.ly/SrARuZ
  • Great white paper: Microsoft EDW Architecture, Guidance and Deployment Best Practices: http://bit.ly/SrAZug
  • End-User Microsoft BI Tools - Clearing up the confusion: http://bit.ly/SrBMLT
  • Microsoft Appliances: http://bit.ly/YQIXzM
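
Example (Populating a Data Warehouse): the difference between full and incremental extraction can be sketched in a few lines of Python. This is only an illustration of the "Timestamp - Last Updated" approach from that slide. The row layout, `SOURCE_ROWS`, and the function names are hypothetical and do not come from the deck; in SQL Server the same idea would more likely be implemented with CDC, triggers, or MERGE.

```python
from datetime import datetime

# Hypothetical source rows; "last_updated" plays the role of the
# "Timestamp - Last Updated" column mentioned on the slide.
SOURCE_ROWS = [
    {"id": 1, "name": "Widget", "last_updated": datetime(2012, 1, 5)},
    {"id": 2, "name": "Gadget", "last_updated": datetime(2012, 2, 10)},
    {"id": 3, "name": "Gizmo", "last_updated": datetime(2012, 3, 15)},
]


def full_extract(rows):
    """Full extraction: return every row, regardless of when it changed."""
    return list(rows)


def incremental_extract(rows, watermark):
    """Incremental extraction: return only rows changed after the last run.

    Also returns the new watermark to store for the next run.
    """
    changed = [r for r in rows if r["last_updated"] > watermark]
    new_watermark = max((r["last_updated"] for r in changed), default=watermark)
    return changed, new_watermark


# Assume the previous load ran on 2012-02-01 (an illustrative value).
last_run = datetime(2012, 2, 1)
changed, last_run_after = incremental_extract(SOURCE_ROWS, last_run)
print([r["id"] for r in changed])  # prints [2, 3]
```

Running the incremental extract again with the returned watermark yields no rows, which is the property that makes repeated pulls cheap.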
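
Example (ETL vs ELT): a minimal sketch of the ELT pattern, where raw data is loaded into a staging table first and the transformation is then run by the target database engine as a set-based SQL statement. SQLite (via Python's standard `sqlite3` module) stands in for SQL Server here purely so the example is self-contained; the table names `staging_sales` and `fact_sales` and the sample rows are assumptions, not part of the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Staging table holds the data exactly as extracted (untyped, uncleaned).
conn.execute("CREATE TABLE staging_sales (customer TEXT, amount TEXT)")
conn.execute("CREATE TABLE fact_sales (customer TEXT, amount REAL)")

# Extract + Load: copy raw rows into staging without transforming them.
raw_rows = [(" contoso ", "100.50"), ("FABRIKAM", "20"), ("Contoso", "9.50")]
conn.executemany("INSERT INTO staging_sales VALUES (?, ?)", raw_rows)

# Transform: the database engine cleans, casts and aggregates in one statement.
conn.execute(
    """
    INSERT INTO fact_sales (customer, amount)
    SELECT UPPER(TRIM(customer)), SUM(CAST(amount AS REAL))
    FROM staging_sales
    GROUP BY UPPER(TRIM(customer))
    """
)

result = conn.execute(
    "SELECT customer, amount FROM fact_sales ORDER BY customer"
).fetchall()
print(result)
```

In an SSIS package the transform step would correspond to an Execute T-SQL Statement task rather than Data Flow transformations, as noted on the slide.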

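Example (Normalizing and Surrogate Keys): a small Python sketch of how surrogate keys behave. It shows integer keys that are not derived from the source system, two source systems sharing the same natural key, and the -1 "unassigned" row used when a fact has no matching dimension member. The class `CustomerDimension`, the source names "CRM" and "ERP", and the sample data are all hypothetical.

```python
UNKNOWN_KEY = -1  # surrogate key of the "unassigned" dimension row


class CustomerDimension:
    """Toy dimension table that hands out integer surrogate keys."""

    def __init__(self):
        # The -1 row exists only in the warehouse, not in any source system.
        self.rows = {
            UNKNOWN_KEY: {"source": None, "natural_key": None, "name": "Unassigned"}
        }
        self._lookup = {}  # (source system, natural key) -> surrogate key
        self._next_key = 1

    def get_or_add(self, source, natural_key, name):
        """Return the surrogate key for a source row, adding it if new."""
        lookup_key = (source, natural_key)
        if lookup_key not in self._lookup:
            self._lookup[lookup_key] = self._next_key
            self.rows[self._next_key] = {
                "source": source,
                "natural_key": natural_key,
                "name": name,
            }
            self._next_key += 1
        return self._lookup[lookup_key]

    def key_for_fact(self, source, natural_key):
        """Resolve a fact row's foreign key; fall back to -1 when unmatched."""
        return self._lookup.get((source, natural_key), UNKNOWN_KEY)


dim = CustomerDimension()
# Two source systems reuse the natural key "C001" for different customers.
crm_key = dim.get_or_add("CRM", "C001", "Contoso")
erp_key = dim.get_or_add("ERP", "C001", "Fabrikam")

fact_rows = [
    {"customer_key": dim.key_for_fact("CRM", "C001"), "amount": 100},
    {"customer_key": dim.key_for_fact("ERP", "C001"), "amount": 250},
    {"customer_key": dim.key_for_fact("ERP", "C999"), "amount": 40},  # no match
]
print([f["customer_key"] for f in fact_rows])  # prints [1, 2, -1]
```

Because the fact table stores only the integer key, a change to a natural key in a source system would touch the lookup, not the fact rows.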
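Example (SSAS Cubes): the "12-month rolling average" mentioned under advanced time calculations is, arithmetically, a trailing mean over the last 12 monthly values. The sketch below shows only that arithmetic in plain Python; in a cube or tabular model it would be written in MDX or DAX instead. The function name `rolling_average` and the sample figures are made up for illustration.

```python
def rolling_average(values, window=12):
    """Trailing average over the last `window` values.

    Positions with fewer than `window` values so far yield None.
    """
    result = []
    running_total = 0.0
    for i, value in enumerate(values):
        running_total += value
        if i >= window:
            # Drop the value that just fell out of the window.
            running_total -= values[i - window]
        result.append(running_total / window if i >= window - 1 else None)
    return result


# Thirteen months of hypothetical sales figures.
monthly_sales = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220]
averages = rolling_average(monthly_sales)
print(averages[11], averages[12])  # prints 155.0 165.0
```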