
CCA Data Analyst

A CCA Data Analyst has proven their core analyst skills to load, transform, and model Hadoop data in order to define relationships and extract meaningful results from the raw input.

CCA Data Analyst Exam (CCA159)

 

Required Skills

Prepare the Data

 

Use Extract, Transform, Load (ETL) processes to prepare data for queries.

    Import data from a MySQL database into HDFS using Sqoop

    Export data to a MySQL database from HDFS using Sqoop

    Move data between tables in the metastore

    Transform values, columns, or file formats of incoming data before analysis
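
The two Sqoop objectives above can be sketched as follows. This is only an illustration, written in Python so the commands can be assembled and inspected without a cluster. The JDBC URL, user name, password file, table name, and HDFS paths are all hypothetical placeholders; the flags shown are standard Sqoop 1 options, but the right set for a given task depends on the data.

```python
import shlex


def sqoop_import_cmd(jdbc_url, user, password_file, table, target_dir):
    """Build the argument list for importing one MySQL table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,               # e.g. jdbc:mysql://host/database
        "--username", user,
        "--password-file", password_file,    # file holding the DB password
        "--table", table,                    # source table in MySQL
        "--target-dir", target_dir,          # destination directory in HDFS
        "--fields-terminated-by", ",",       # write comma-delimited text
    ]


def sqoop_export_cmd(jdbc_url, user, password_file, table, export_dir):
    """Build the argument list for exporting HDFS files into a MySQL table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", user,
        "--password-file", password_file,
        "--table", table,                       # existing target table in MySQL
        "--export-dir", export_dir,             # HDFS directory to read from
        "--input-fields-terminated-by", ",",    # delimiter used in the HDFS files
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    argv = sqoop_import_cmd(
        "jdbc:mysql://dbhost/retail_db", "analyst",
        "/user/analyst/.mysql_password", "orders", "/user/analyst/orders",
    )
    # Print a shell-safe command line; on a cluster it could be run instead.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Building the command as a list rather than one string keeps each value as a separate argument, which avoids quoting problems if it is later passed to subprocess.run.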

Provide Structure to the Data

Use Data Definition Language (DDL) statements to create or alter structures in the metastore for use by Hive and Impala.

    Create tables using a variety of data types, delimiters, and file formats

    Create new tables using existing tables to define the schema

    Improve query performance by creating partitioned tables in the metastore

    Alter tables to modify existing schema

    Create views in order to simplify queries
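
Several of the DDL objectives above can be practiced without a cluster. The sketch below uses Python's built-in sqlite3 module as a stand-in for Hive and Impala, so it is an approximation only: the table and column names are hypothetical, SQLite's types and syntax differ from HiveQL in places, and partitioned tables (PARTITIONED BY in Hive and Impala) have no SQLite equivalent and are left out.

```python
import sqlite3


def build_demo_schema():
    """Create a small in-memory schema exercising common DDL statements."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Create a table with several column types (hypothetical schema).
    cur.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER, customer TEXT, amount REAL, order_date TEXT)"
    )
    cur.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [
            (1, "alice", 120.0, "2017-01-05"),
            (2, "bob", 35.5, "2017-01-06"),
            (3, "alice", 250.0, "2017-02-01"),
        ],
    )

    # Create a new, empty table whose columns come from an existing table.
    # "WHERE 0" copies the column layout without copying any rows.
    cur.execute("CREATE TABLE orders_archive AS SELECT * FROM orders WHERE 0")

    # Alter an existing table by adding a column.
    cur.execute("ALTER TABLE orders ADD COLUMN region TEXT")

    # Create a view that hides a filter behind a simple name.
    cur.execute(
        "CREATE VIEW large_orders AS "
        "SELECT order_id, customer, amount FROM orders WHERE amount >= 100"
    )
    conn.commit()
    return conn


def column_names(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


if __name__ == "__main__":
    conn = build_demo_schema()
    print(column_names(conn, "orders"))
    print(conn.execute("SELECT * FROM large_orders ORDER BY order_id").fetchall())
```

The archive table is created before the ALTER TABLE statement, so it keeps the original four columns while orders gains a fifth.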

 

Data Analysis

 

Use Query Language (QL) statements in Hive and Impala to analyze data on the cluster.

    Prepare reports using SELECT commands including unions and subqueries

    Calculate aggregate statistics, such as sums and averages, during a query

    Create queries against multiple data sources by using join commands

    Transform the output format of queries by using built-in functions

    Perform queries across a group of rows using windowing functions
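
The query objectives above can be illustrated with a small, self-contained sketch. It again uses Python's sqlite3 module in place of Hive or Impala, which is an assumption made for portability: the tables and data are hypothetical, and the window function requires a Python build bundled with SQLite 3.25 or later. It shows a join with aggregates and a built-in function, a windowed ranking, and a subquery; unions are not covered here.

```python
import sqlite3


def run_demo_queries():
    """Run a join/aggregate, a window query, and a subquery on sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (customer_id INTEGER, name TEXT);
        CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 30.0);
        """
    )

    # Join two tables, aggregate per customer, and reformat the name
    # with the built-in UPPER function.
    totals = conn.execute(
        """
        SELECT UPPER(c.name) AS name,
               SUM(o.amount) AS total,
               AVG(o.amount) AS average
        FROM customers c
        JOIN orders o ON o.customer_id = c.customer_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()

    # Window function: rank each customer's orders from largest to smallest.
    ranked = conn.execute(
        """
        SELECT order_id,
               customer_id,
               RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rnk
        FROM orders
        ORDER BY customer_id, rnk
        """
    ).fetchall()

    # Subquery: orders whose amount exceeds the overall average.
    above_average = conn.execute(
        """
        SELECT order_id
        FROM orders
        WHERE amount > (SELECT AVG(amount) FROM orders)
        ORDER BY order_id
        """
    ).fetchall()

    return totals, ranked, above_average


if __name__ == "__main__":
    for result in run_demo_queries():
        print(result)
```

Unlike GROUP BY, the window query keeps one output row per order, which is why it can report a rank alongside each order_id.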

 

What should you expect?

 

You are given eight to twelve customer problems, each with a unique, large data set, along with a CDH cluster and 120 minutes. For each problem, you must implement a technical solution with a high degree of precision that meets all the requirements. You may use any tool or combination of tools on the cluster (see list below) -- you get to pick the tool(s) that are right for the job. You must possess enough knowledge to analyze the problem and arrive at an optimal approach given the time allowed. You need to know what you should do and then do it on a live cluster, under a time limit and while being watched by a proctor.

 

    Number of Questions: 8 to 12 performance-based (hands-on) tasks on a CDH 5 cluster. See below for full cluster configuration.

    Time Limit: 120 minutes

    Passing Score: 70%

    Language: English

 

Who is this for?

 

Candidates for CCA Data Analyst can be SQL developers, data analysts, business intelligence specialists, developers, system architects, and database administrators. There are no prerequisites.

What is the best way to prepare?

 

The CCA Data Analyst exam was created to identify talented SQL developers who want to stand out and be recognized by employers seeking these skills. Candidates are advised to start by taking Cloudera's Data Analyst training course, which has the same objectives as the exam.


