
Attending Workshop on Data Mining @ Chennai

I recently attended a CSIR-sponsored workshop on
Recent Trends in Data Warehousing, Mining & Biological Databases
Date: 27-28 August 2010
Venue: CSE Dept., Dr. M.G.R. University, Maduravoyal, Chennai - 95, TN.


Chennai (the gateway to South India and the fourth-largest city in India).


Advances in digital storage technology have resulted in growing volumes of enterprise data. A data warehouse is a repository of an organization's electronically stored data. It is used to systematically organize, understand and analyze that data and to arrive at strategic decisions in the enterprise. Data mining is a mechanism to extract hidden, interesting patterns from large databases. It is used for a variety of purposes such as engineering design, decision support systems, fault detection, predictive maintenance and customer relationship management. This workshop gives participants an opportunity to learn about data warehousing and data mining techniques, with hands-on sessions on tools like Sybase IQ, Clementine and RapidMiner. We are arranging one special session on biological databases by Dr. Sunil Kumar Verma, CCMB.


Click here for Brochure and Session Details


Click here for Dr.MGR University and Workshop Details 


Some Highlights of the workshop:
  • A Data Warehouse needs to satisfy 4 key properties:
        • Subject Oriented
        • Integrated
        • Time Variant / Historical Collection
        • Non - Volatile / Voluminous Data
  • Data = Raw Facts
  • Information = Processed Data
  • Objectives of Data Warehouse:
        • Decision making for the organization
  • Move from Data to Information and to Knowledge
  • Data Mining: Knowledge Discovery in Databases (KDD)
    • Definition: Extraction of interesting, non-trivial, implicit, previously unknown and potentially useful information or patterns from data in large databases.
  • Issues in Data Mining:
        • Defect classification and Prediction system
        • Developing a unifying theory of data mining
        • Mining complex knowledge from complex data
  • Data Mining: Data is passed through a Data Mining tool to get the hidden information
    • Raw data sitting in a Data Warehouse gives little information on its own until it is mined
  • Data Mining :
        • Interactive
        • Iterative Process
        • Enhance the value of existing information resources
        • To find underlying relationships and features
        • To predict future trend and behavior
        • 80% of work in DM is Data Preparation.
        • 20% effort goes into selection of Algorithm.
    • 3 Main Modeling Techniques:
        • Classification
        • Association
        • Segmentation
    • Outliers: Data points that deviate markedly from the rest of the data; filter out outliers using clustering tools.
    • Customer Churn: With respect to the telecom industry.
        • Churn: A customer who is going to drop the service.
        • Non-Churn: A customer who is going to stay with the service.
  • About Marketing Types:
    • RFM - Recency, Frequency, Monetary value
        • Interactive Marketing - Promotional Marketing (TV, Radio and Internet)
        • Market Basket Analysis
        • Trend Analysis - Future Business
  • About Types of Data:
        • Range Data / Continuous Data
        • Nominal Data / Set Data
        • Ordinal Data / Which can be ordered and ranked
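The RFM idea from the marketing notes above can be sketched in a few lines of Python. This is only an illustration, not something shown at the workshop: the transactions are made up, the function name `rfm` is hypothetical, and the M component is taken here to be the total amount a customer has spent.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount)
transactions = [
    ("A", date(2010, 8, 20), 500.0),
    ("A", date(2010, 7, 5), 250.0),
    ("B", date(2010, 3, 14), 1200.0),
    ("C", date(2010, 8, 25), 80.0),
    ("C", date(2010, 8, 1), 60.0),
    ("C", date(2010, 6, 30), 40.0),
]

def rfm(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)}."""
    summary = {}
    for customer, day, amount in transactions:
        # Track the latest purchase date, the purchase count and the total spend.
        last, count, total = summary.get(customer, (day, 0, 0.0))
        summary[customer] = (max(last, day), count + 1, total + amount)
    return {
        customer: ((today - last).days, count, total)
        for customer, (last, count, total) in summary.items()
    }

scores = rfm(transactions, today=date(2010, 8, 28))
for customer, (recency, frequency, monetary) in sorted(scores.items()):
    print(customer, recency, frequency, monetary)
```

A low recency with high frequency and spend (customer C buys often, customer A spends more) is the kind of pattern a marketing team would rank on; how the three numbers are weighted is a business decision and is left out of this sketch.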
  • -----------------------------------------------------------------------------------------------------------------
  • Session - 3 : Biological databases
        • Computational Predictions in Cell Signalling
  • -----------------------------------------------------------------------------------------------------------------
        • Operational Systems - Systems that help us run the day-to-day enterprise operations (OLTP).
        • Data -> Information -> Knowledge
  • Operational Data helps the organization meet Operational requirements for data. 
  • The Data Warehouse data helps the organization to meet Strategic Requirement for Information.
  • A Data Warehouse is a Subject-Oriented, Integrated, Time-Variant, Non-Volatile collection of data that is used primarily in organizational decision making.
  • A Data Warehouse is a Decision Support System (DSS) - a database integrated from different sources and cleaned.
  • Data Warehouse Features:
        • Historical, Descriptive - Caters to the entire spectrum of management, multidimensional view on the enterprise data.
  • Data Warehouse Application Areas
    • CRM
    • Claim Analysis
    • Procurement Analysis
    • Inventory Analysis
  • Data Mart: A small version of a Data Warehouse, built for one particular, specific purpose.
    • Main Features:
      • Good Performance
      • Easily Understood
      • Less Information than Warehouse
      • Single Subject Area
      • Fewer Dimensions
      • Time & Cost - Economical
    • Disadvantages:
      • A large number of Data Marts is complex to maintain
      • Redundancy / Duplicate Data
    • ODS vs OLTP and ODS vs DW
  • Data Warehouse Architecture:
    • Ralph Kimball World
    • Star Schema
    • ER model for DW
  • RapidMiner
    • Two Types of Data:
      • Training Data
      • Testing Data
    • Sybase IQ - Largest Data Warehouse in the World - 1 Petabyte (10^15 bytes).
    • ETL - Extract, Transform and Load into the Data Warehouse
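To make the star schema and ETL notes above a little more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is an assumption-laden toy, not what was demonstrated on Sybase IQ: the table names (`dim_product`, `dim_city`, `fact_sales`), the helper functions and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical operational (OLTP-style) records:
# (sale_date, product, city, units, unit_price)
source_rows = [
    ("2010-08-27", " Laptop ", "chennai", 2, 40000.0),
    ("2010-08-27", "Phone", "Chennai", 5, 9000.0),
    ("2010-08-28", "laptop", "Mumbai", 1, 40000.0),
]

def transform(row):
    """Clean one extracted row: trim and normalise text, derive the sale amount."""
    day, product, city, units, unit_price = row
    return day, product.strip().title(), city.strip().title(), units, units * unit_price

def dim_key(conn, table, name):
    """Return the surrogate key for a dimension value, inserting it if it is new."""
    conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    return conn.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE dim_city    (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE fact_sales (
        sale_date  TEXT,
        product_id INTEGER REFERENCES dim_product(id),
        city_id    INTEGER REFERENCES dim_city(id),
        units      INTEGER,
        amount     REAL
    );
""")

for row in source_rows:                                  # Extract
    day, product, city, units, amount = transform(row)   # Transform
    conn.execute(                                        # Load
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
        (day, dim_key(conn, "dim_product", product),
         dim_key(conn, "dim_city", city), units, amount),
    )

# A typical warehouse query: total sales per product across all cities and dates.
totals = conn.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales f JOIN dim_product p ON p.id = f.product_id
    GROUP BY p.name ORDER BY p.name
""").fetchall()
print(totals)
```

The central fact table holds the measures (units, amount) and points at small dimension tables, which is the star shape; the transform step is where inconsistent source values such as " Laptop " and "laptop" get integrated into one dimension row.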

Recently, TCS and Reliance have moved to Data Warehousing.

In brief, the sessions were packed with lots of information.




    Comments

    Abarshini said…
    I actually enjoyed reading through this posting.Many thanks.




    revathi said…

    Hey... You have nice Blog.. Keep follow this excellent work.