An overview of Data Warehousing and OLAP Technology

Presented by Manish Desai

- Introduction
- What is a data warehouse?
- Explanation of definition
- Data warehouse vs. operational database
- Data warehouse architecture
- Back end tools
- Conceptual model
- Database design
- Warehouse servers
- Index structures
- Meta data
- Conclusion
- References

Introduction
- Essential elements of decision support
- Enables the knowledge worker to make better and faster decisions
- Used in many industries, such as:
  - Manufacturing (for order shipment)
  - Retail (for inventory management)
  - Financial services (claims and risk analysis)
- Every major database vendor offers products in this area

What is a Data Warehouse?
- A data warehouse is a subject-oriented, integrated, time-varying, non-volatile collection of data that is used primarily in organizational decision making
- Typically maintained separately from operational databases

Explanation of definition
Subject-oriented:
- Designed around subjects such as customer, vendor, product, and activity
- Does not include data that are not needed for the decision support system (DSS)
Integrated:
- Most important feature

- Consistent naming conventions, measurement of variables, and so forth
- The data should be stored in a single, globally acceptable fashion

Explanation (continued)
Time-varying:
- All data in the warehouse should be accurate as of some moment in time
- Data is stored over a long time horizon (5 to 10 years)
- Key structure contains an element of time (implicitly or explicitly)
- Data, once correctly recorded, cannot be updated
Non-volatile:
- No update of data allowed

- Only loading and access operations are performed on the data

Data Warehouse vs. Operational Database

|              | Data Warehouse                                       | Operational Database          |
|--------------|------------------------------------------------------|-------------------------------|
| User         | Knowledge worker                                     | Clerk, IT professional        |
| Function     | Decision support                                     | Day-to-day operations         |
| Data         | Historical, summarized, multidimensional, integrated | Current, up-to-date, detailed |
| Unit of work | Complex query                                        | Short, simple transaction     |
| Metric       | Query throughput, response time                      | Transaction throughput        |

Architecture

- Data sourcing, migration, and cleanup tools
- Meta data repository
- Data marts
- Data query, reporting, analysis, and mining tools
- Data warehouse administration and management

Architecture (continued)
Distributed data warehouse:
- Load balancing, scalability, higher availability
- Meta data replicated and centrally administered
- Too expensive
Data marts:

- Departmental subsets focused on selected subjects
- Example: the marketing department's data mart includes customer, sales, and product tables
- Has its own repository and administration
- May lead to complex integration problems if not designed properly

Back end tools and utilities
- Data cleaning, loading, and refreshing tools
Cleaning:
- Multiple sources, so errors are possible
- Example: replace the string "sex" with "gender"
Loading:
- Building indices, sorting, and making access paths
- Large amounts of data

- Incremental loading: only updated tuples are inserted; the process is hard to manage
Refresh:
- Propagating updates
- When to refresh? Set by the administrator depending on user needs and traffic

Conceptual model and front end tools
Multidimensional view:
- Dimensions together uniquely determine the measure

- Example: sales can be represented by the dimensions city, product, and date
- Each dimension is described by a set of attributes
- Example: product consists of
  - Category of product
  - Industry of product
  - Year of introduction
Front end tools:
- Multidimensional spreadsheet
- Supports:
  - Pivoting: reorientation
  - Roll-up: summarized data
  - Drill-down: go from a high-level to a low-level summary

Database design
Two ways to represent the multidimensional model:

Star schema:
- Database consists of a single fact table and a single table for each dimension
- Each tuple in the fact table consists of a pointer to each of the dimensions
Snowflake schema:
- Refinement of the star schema
- Dimensional hierarchy is explicitly represented by normalizing the dimension tables

Warehouse servers
Specialized SQL servers:
- Provide an advanced query language and query processing support for SQL queries over star and snowflake schemas
- Example: Redbrick
ROLAP:
- Sits between the relational back end and the client front end tools
- Extends traditional relational servers to support multidimensional queries
- Example: MicroStrategy
MOLAP:
- Multidimensional storage engine
- Direct mapping
- Example: Essbase from Arbor Inc.

Index structures
Bitmap indices:
- Use a single bit to indicate a specific value of an attribute
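
The bitmap idea above can be sketched in a few lines of Python. This is only an illustration, not how any particular warehouse server implements it: the employee rows (other than John, id 1000), the skill values, and the helper names `build_bitmap_index` and `rows_matching` are all assumptions made for the example.

```python
# Hypothetical employee rows: (id, name, skill). Only "John" comes from the slides.
employees = [
    (1000, "John", "engineer"),
    (1001, "Mary", "analyst"),
    (1002, "Ravi", "engineer"),
    (1003, "Lena", "manager"),
]


def build_bitmap_index(rows, column):
    """Map each distinct value in `column` to an int used as a bit vector:
    bit i is set when row i holds that value."""
    index = {}
    for position, row in enumerate(rows):
        value = row[column]
        index[value] = index.get(value, 0) | (1 << position)
    return index


def rows_matching(rows, bitmap):
    """Return the rows whose bit is set in `bitmap`."""
    return [row for position, row in enumerate(rows) if (bitmap >> position) & 1]


skill_index = build_bitmap_index(employees, column=2)

# One bit per row answers "is this employee an engineer?"
engineers = rows_matching(employees, skill_index["engineer"])

# Combining predicates is a bitwise OR (or AND) of the bit vectors,
# with no comparison of the skill strings themselves.
eng_or_mgr = rows_matching(
    employees, skill_index["engineer"] | skill_index["manager"]
)

print(bin(skill_index["engineer"]))
print(engineers)
print(eng_or_mgr)
```

Using one Python integer per attribute value keeps the sketch short; a real index would typically store compressed bit vectors.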

- Example: instead of storing eight characters to record "engineer" as the skill of an employee, use a single bit

  id#   Name  Skill
  1000  John  1

Join indices:
- Maintain the relationship between a foreign key and its matching primary keys

Meta data and warehouse management
- It is data about data
- Used for building, maintaining, managing, and using the data warehouse
Administrative meta data:

- Information about setting up and using the warehouse
Business meta data:
- Business terms and definitions
Operational meta data:
- Information collected during operation of the warehouse

Conclusion
- The data warehouse is the technology for the future
- A data warehouse enables the knowledge worker to make faster and better decisions
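
As a closing illustration of the star schema, roll-up, and drill-down ideas covered earlier, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, columns, sample rows, and the `sales_by` helper are all hypothetical, and the city-to-state hierarchy is an assumption added so that there is something to roll up.

```python
import sqlite3

# Hypothetical star schema: one fact table (sales) plus one table per dimension.
# The city -> state hierarchy is kept denormalized inside the city dimension;
# a snowflake schema would move `state` into its own normalized table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city    (city_id INTEGER PRIMARY KEY, name TEXT, state TEXT);
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE sales (
    city_id    INTEGER REFERENCES city(city_id),       -- pointer to a dimension
    product_id INTEGER REFERENCES product(product_id), -- pointer to a dimension
    sale_date  TEXT,
    amount     REAL                                    -- the measure
);
""")
conn.executemany(
    "INSERT INTO city VALUES (?, ?, ?)",
    [(1, "Austin", "TX"), (2, "Dallas", "TX"), (3, "Boston", "MA")],
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(10, "Laptop", "Computers"), (11, "Phone", "Phones")],
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, 10, "1999-01-05", 1000.0),
        (2, 10, "1999-01-06", 1500.0),
        (2, 11, "1999-01-06", 300.0),
        (3, 11, "1999-01-07", 200.0),
    ],
)


def sales_by(level):
    """Total sales grouped by one attribute of the city dimension."""
    if level not in ("name", "state"):  # whitelist: the value is spliced into SQL
        raise ValueError(f"unsupported level: {level}")
    query = f"""
        SELECT c.{level}, SUM(s.amount)
        FROM sales AS s
        JOIN city AS c ON s.city_id = c.city_id
        GROUP BY c.{level}
        ORDER BY c.{level}
    """
    return conn.execute(query).fetchall()


drill_down = sales_by("name")  # low-level summary: one row per city
roll_up = sales_by("state")    # high-level summary: one row per state

print(drill_down)  # [('Austin', 1000.0), ('Boston', 200.0), ('Dallas', 1800.0)]
print(roll_up)     # [('MA', 200.0), ('TX', 2800.0)]
```

Moving between the two result sets is the roll-up and drill-down pair: the same fact rows are summarized at the state level or broken back out by city.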

References
- Inmon, W. H. Building the Data Warehouse.
- www.olapcouncil.org
- www.pwp.starnetinc.com
- www.arborsoft.com
- Kimball, R. The Data Warehouse Toolkit.