Big Data, one of the most talked-about information technology (IT) solutions, has emerged as a new technology paradigm for creating business agility and predictability by analyzing data from a wide range of sources. To manage the massive volumes of data generated by social media, online transactions, web logs, and sensors, Big Data incorporates innovative technologies for data management (of unstructured, semi-structured, and structured data), processing, real-time analytics, and visualization.
This course is designed for managers, analysts, architects, and developers seeking an understanding of Big Data concepts, the related technology landscape, and deployment patterns.
In this program, the instructor will explain and illustrate Big Data concepts, Grid and Cluster computing architecture, Data warehousing and Data Marts, Hadoop (MapReduce, HDFS), Hive, NoSQL (Couchbase, MongoDB, Cassandra, Neo4j), and R, with various industry use cases.
PURPOSE OF THE COURSE
The purpose of this three (3) week intensive certificate program is to provide a high-level understanding of Big Data concepts, Grid and Cluster computing architecture, Data warehousing and Data Marts, Hadoop (MapReduce, HDFS), Hive, NoSQL (Couchbase, MongoDB, Cassandra, Neo4j), and R, with industry use cases.
WHO SHOULD ATTEND?
This course is designed for individuals seeking to master the Big Data concepts, tools, and platforms (Hadoop, NoSQL, and R) needed for advanced analytics and reporting. It is also designed to build awareness of Big Data and its significance among professionals across industries, and to fulfill professional development training requirements (3 Continuing Education Units are conferred upon successful completion of the course exit examination).
Upon completion of this course, participants will be able to:
- Understand the concepts of Big Data, Hadoop, NoSQL, and R
- Learn Grid and Cluster computing architecture.
- Understand Data warehousing and Data Marts.
- Learn the basic concepts and architecture of Hadoop (MapReduce, HDFS), Hive, NoSQL (Couchbase, MongoDB, Cassandra, Neo4j), and R, with industry use cases.
There are no formal prerequisites for this course; it is open to any professional seeking practical knowledge of Big Data concepts, tools, platforms, and deployment use cases as part of their professional career development.
This class will be presented as an ONLINE course spanning a period of three (3) weeks. In addition to the study materials presented in your classroom (the text is included as part of your registration) and recorded lectures, you will also interact directly with your instructor and fellow students through asynchronous discussions. While the course is designed for students and working professionals and is delivered in an interactive online classroom environment that accommodates busy schedules, each participant should understand that the required progression of weekly assignments and discussions calls for a minimum of ten (10) hours of study per week of the class.
Module 1: Basic Concepts and characteristics of Big Data
- Volume of Data
- Variety of Data
- Velocity of Data
Module 2: Data Management in the warehouse and in the Big Data
- Acquire data
- Organize data
- Analyze data
- Visualize and decide (Analytics and Reporting)
- Data Marts and Data Warehousing concepts
- CAP Theorem
Module 3: Why is Big Data important?
- When to consider a Big Data solution
- Patterns for Big Data deployment
- IT log analytics
- Fraud detection pattern
- Social media pattern
- Call center pattern
- Risk modelling and management patterns
Module 4: Big Data computing architecture
- Understanding the various technology architectures needed for Big Data processing
- Grid computing architecture
- Scale-out cluster computing architecture
- Building and assessing robust scale-out cloud deployments for Big Data processing workloads
Module 5: Basics of Virtualization
- Virtualization concepts and deployment in a cloud-based data center
- Hardware and OS virtualization
- Hypervisor and bare-metal deployment
- VirtualBox for portability and VM instance images
- Maximum utilization of resources and monetization
Module 6: History of Hadoop
- Components of Hadoop
- Hadoop Distributed File System (HDFS)
- Basics of MapReduce
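As a conceptual preview of this module, the sketch below shows the word-count exercise commonly used to introduce the MapReduce programming model. It simulates the map and reduce phases in plain Python with no Hadoop cluster; the function names and sample documents are illustrative, not Hadoop APIs.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    """Shuffle + reduce: group the pairs by word and sum the counts."""
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

# Two tiny "documents" standing in for files stored in HDFS.
docs = ["big data big ideas", "data drives decisions"]
print(reduce_phase(map_phase(docs)))
```

On a real cluster, the framework runs many map tasks in parallel over HDFS blocks and routes each word's pairs to a reducer; this single-process version only mirrors the data flow.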
Module 7: Understanding NoSQL
- Basic Schema in NoSQL
- Compare and contrast with Hadoop
- High-level understanding of Couchbase, MongoDB, Cassandra, and Neo4j
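To preview the "basic schema" idea covered in this module: document stores such as MongoDB and Couchbase keep JSON-like documents rather than fixed relational rows, so records in the same collection may carry different fields. The sketch below models that with plain Python dictionaries; the customer data and field names are hypothetical examples.

```python
import json

# Two "documents" in the same logical collection. Unlike rows in a
# relational table, each record may have a different set of fields.
customers = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Lin", "phones": ["555-0101", "555-0102"],
     "address": {"city": "Austin", "zip": "78701"}},
]

# A simple query: names of customers that have a phone number on file.
with_phones = [c["name"] for c in customers if "phones" in c]
print(json.dumps(with_phones))
```

The trade-off behind this flexibility (availability and partition tolerance versus strict consistency) is exactly what the CAP theorem from Module 2 describes.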
Module 8: Data Discovery and Visualization
- Business intelligence reporting, dashboards, and analytics
- DVT tools: Endeca and Tableau
Module 9: Advanced Analytics using R
Module 10: Industry use cases
Criteria For Passing The Course:
Complete all graded quizzes and assignments on time.