
Advanced Citizen Data Scientist Certification Powered by Apache

Public Class
7 Days Offline Training
6 Hours e-Learning
HRDF Claimable
Certificate of Attendance
Reference Book
Light Refreshment
In-house
7 Days Offline Training
6 Hours e-Learning
HRDF Claimable
Certificate of Attendance
Reference Book
All of our in-house classes can be customised to your organisation’s needs
Introduction
The Advanced Citizen Data Scientist role is designed for individuals in the organisation who aspire to use open-source technology to perform sophisticated machine learning and simple predictive analytics. It is also highly relevant to other key staff involved in the requirements input, design, development, delivery and ultimate use of digital initiatives, including data consumers, digital initiative decision makers, business analysts, and operations managers or staff. Possible use cases will be discussed. In addition, there will be hands-on assignments for participants to practise and apply the topics covered.
Who Should Attend
- Data or Business Analysts
- Accountants or Finance Personnel
- Digital Transformation Team
- Managerial roles
- Executives
Certification Outcomes
- Discuss, explore and communicate big data and the application of open-source technology in context with other stakeholders
- Show business data and present scenarios
- Prepare data and create data models for simple to moderately complex self-service descriptive and diagnostic analytics with open-source technologies – Superset and SQL Lab
- Implement simple predictive analytics with machine learning via open-source technologies – PySpark API, MLlib
Pre-requisites
- Basic computer software skills (e.g. Excel)
- Basic internet skills (e.g. Chrome, Firefox)
- Basic English skills
Module 1 : Big Data and Open-source Fundamentals Certification
Big Data Analytics
- The foundation concept of big data
- How Big Data impacts the business world today
- Analytics capabilities: Descriptive, Diagnostic, Predictive & Prescriptive from the big data angle
- Introduction to open-source technologies and their development history
- The top projects from the world’s popular open-source software foundations such as Apache, Linux and Python
- How open-source technology has become the trend that drives innovation and a critical part of enriching big data analytics
- Design concept and feature of data visualisation
- The introduction of Apache Superset, a user-friendly data visualisation tool that is essential to most of the data users today
Module 2 : Data Visualisation Essentials Certification Descriptive Analytics
- The goal and value of descriptive analytics
- Data prep and tasks for descriptive analytics
- The key value and best practices in data visualisation
- The key attributes and styles in performing data visualisation
- The use and key features of Apache Superset
- Data prepping for effective data visualisation with Superset - modelling, filtering, style & form
Module 3 : Data Analytics Essentials Certification Analytics Process
- The analytics process of diagnostic and predictive analytics
- Data prep tasks - data collection, data cleansing, data munging and data visualisation
- Build analytics model - convert unstructured data into quantified metrics
- Diagnostic analytics objectives, processes, data prepping
- Data modelling for diagnostic analytics with Hive, Spark SQL and PySpark
- Data visualisation for diagnostic analytics - Apache Superset
- Predictive analytics objectives, best practices, processes, data prepping and model building using Hive, Spark SQL and PySpark
- Popular machine learning algorithms
- Predictive modelling - decision-tree, clustering with Python and Spark MLlib
Module 4 : Python for Big Data Analytics Certification Machine Learning Essentials
- Machine Learning (ML) and Artificial Intelligence (AI)
- Machine Learning best practice – CRISP-DM
- Popular yet powerful Machine Learning algorithms
- Python data structure and programming
- Lambda and Python libraries
- Data profiling, statistical computation, data cleansing & munging with Pandas
- Perform interactive data analytics with Zeppelin
E-Learning : Modern Data Analytics Primers Data Visualisation 1 (Descriptive Analytics)
- How we utilise a popular tool (from a multinational technology company) to perform descriptive analytics
- The process in its entirety when executing descriptive analytics for a business
- What data to observe and analyse to obtain hindsight metrics
- How the obtained data can be presented to answer the business question of “What Happened?”
- Valuable opportunities (free and paid) for you to learn how to perform descriptive analytics in your current work
- How we utilise a popular software (from a data visualisation company) to perform diagnostic analytics
- The process in its entirety when executing diagnostic analytics for a business
- What can be done to get insight metrics; typically produced by cross-tabbing hindsight metrics with other factors
- How the obtained data can be presented to help answer the business question of “Why Did It Happen?”
- Valuable opportunities (free and paid) for you to learn how to perform diagnostic analytics in your current work
- How we utilise a popular open-source software to perform predictive analytics
- The process in its entirety when executing predictive analytics for a business
- What can be done with data to project foresight metrics to help with business continuity planning
- How the obtained data can be presented to help answer the business question of “What Will Happen?”
- Valuable opportunities (free and paid) for you to learn how to perform predictive analytics in your current work
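To illustrate the idea of cross-tabbing a hindsight metric with other factors to obtain insight metrics, here is a minimal sketch in plain Python. The sales records, field names and helper functions are hypothetical and are not taken from the course material; the course itself uses tools such as Superset and PySpark for this kind of work.

```python
from collections import defaultdict

# Hypothetical sales records, for illustration only (not course data).
sales = [
    {"region": "North", "channel": "Online", "amount": 120},
    {"region": "North", "channel": "Retail", "amount": 80},
    {"region": "South", "channel": "Online", "amount": 50},
    {"region": "South", "channel": "Retail", "amount": 150},
    {"region": "North", "channel": "Online", "amount": 30},
]


def total_sales(records):
    """Hindsight metric: a single figure describing what happened."""
    return sum(r["amount"] for r in records)


def cross_tab(records, row_key, col_key, value_key):
    """Insight metric: the same total broken down by two factors."""
    table = defaultdict(lambda: defaultdict(int))
    for r in records:
        table[r[row_key]][r[col_key]] += r[value_key]
    # Convert nested defaultdicts to plain dicts for easier reading.
    return {row: dict(cols) for row, cols in table.items()}


total = total_sales(sales)
by_region_channel = cross_tab(sales, "region", "channel", "amount")

print("Total sales:", total)
for region, channels in by_region_channel.items():
    print(region, channels)
```

The total answers "What Happened?", while the breakdown by region and channel gives a starting point for asking "Why Did It Happen?".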
Methodology
This program will be conducted through interactive lectures, PowerPoint presentations, discussions, practical exercises and self-paced learning.
Examination
- 40 Multiple-Choice Questions | 60 Minutes (Module 1)
- 50 Multiple-Choice Questions | 75 Minutes (Module 2)
- 50 Multiple-Choice Questions | 75 Minutes (Module 3)
- 50 Multiple-Choice Questions | 75 Minutes (Module 4)
Computer System Requirement
As this course involves hands-on exercises, please bring your own laptop with minimum specifications equivalent to an Intel Core i3 processor, 8GB of RAM and 40GB of hard disk storage space.

