Drive your career forward

Professional Certificate in
NoSQL, Big Data and Spark Fundamentals
IBM

What you will learn

  • Differentiate between the four main categories of NoSQL repositories and work hands-on with MongoDB, Cassandra and IBM Cloudant.
  • Apply your knowledge of the characteristics, features, benefits, limitations, and applications of the more popular Big Data processing tools, including Hadoop, HDFS, Hive and HBase.
  • Describe parallel programming using Resilient Distributed Datasets (RDDs), DataFrames and SparkSQL. Understand how Catalyst and Tungsten benefit Spark programmers and see how ETL works using DataFrames.
  • Acquire real-world data engineering and machine learning skills using Spark Structured Streaming, DataFrames, GraphFrames, Spark ML, regression, classification, and clustering, including the k-means algorithm and ETL using Spark.
  • Gain hands-on experience using SparkSQL and Apache Spark on IBM Cloud.
  • Learn about scaling out using the IBM Spark Environment in Watson Studio, running Spark on Kubernetes, setting Spark configurations, and performing monitoring and performance tuning.

Data engineers and Big Data professionals are in overwhelming demand. NoSQL and Big Data technology skills such as Apache Spark are a must-have for modern data-driven decision-making. This three-course Professional Certificate from IBM opens the door to data engineering and big data careers.

The program starts with NoSQL Database Basics, a course that introduces you to NoSQL fundamentals, including the four key non-relational database categories. By the end of the course, you will have hands-on skills working with MongoDB, Cassandra, and IBM Cloudant NoSQL databases.

A crucial aspect of data engineering is acquiring and managing Big Data while keeping Big Data analytics scalable and performant. When you enroll in Big Data, Hadoop, and Spark Basics, you'll discover the characteristics, features, benefits, limitations, and applications of some of the more popular Big Data processing tools. You'll explore the open-source ecosystem of Apache tools, including Apache Hadoop, Apache Hive, and Apache Spark, as well as Spark on Kubernetes. Discover how to leverage Spark to deliver reliable insights. You'll gain hands-on data analysis skills using PySpark and Spark SQL, create a streaming analytics application using Spark Streaming, and more.

Then enroll in Apache Spark for Data Engineering and Machine Learning to discover how data and machine learning engineers use Spark Structured Streaming, GraphFrames, regression, classification, and clustering. Learn how to apply the k-means clustering algorithm using Spark MLlib. Extract, transform, and load (ETL) is at the heart of data and machine learning engineering, and you'll gain skills using Spark to perform ETL tasks. This course culminates with a hands-on Spark project.

This Professional Certificate does not require any prior programming or data science skills; however, prior basic data literacy and SQL skills will prove valuable in completing this program.

Expert instruction
3 skill-building courses
Self-paced
Progress at your own speed
4 months
2–3 hours per week
Discounted price: $222.30
Pre-discounted price: $247 USD
For the full program experience

Courses in this program

IBM's NoSQL, Big Data and Spark Fundamentals Professional Certificate

  1. NoSQL Database Basics
    2–3 hours per week, for 5 weeks

    This course introduces you to the fundamentals of NoSQL, including the four key non-relational database categories. By the end of the course you will have hands-on skills for working with MongoDB, Cassandra and IBM Cloudant NoSQL databases.

  2. Big Data, Hadoop, and Spark Basics
    2–3 hours per week, for 6 weeks

    This course provides foundational big data practitioner knowledge and analytical skills using popular big data tools, including Hadoop and Spark. Learn and practice your big data skills hands-on.

  3. Apache Spark for Data Engineering and Machine Learning
    2–3 hours per week, for 3 weeks

    This short course introduces you to the fundamentals of Data Engineering and Machine Learning with Apache Spark, including Spark Structured Streaming, ETL for Machine Learning (ML) Pipelines, and Spark ML. By the end of the course, you will have hands-on experience applying Spark skills to ETL and ML workflows.

  • The Dice Tech Job Report lists Data Engineering as the fastest-growing tech occupation, with year-over-year growth of 50%.
  • Data engineering is listed among the top 10 jobs in Glassdoor's Best Jobs in America for 2020.
  • Jefferson Parker lists NoSQL second in its list of the top eight in-demand Big Data skills. Multiple sources report expected NoSQL growth of 30% through 2026, with salaries, based on PayScale rankings, of more than 107K annually.
  • In a Towards Data Science 2020 analysis of major site job listings, Apache Spark appears in half of job listings for data engineers. Spark is the third most requested Big Data technology skill among employers.

Meet your instructors
from IBM

Aije Egwaikhide
Senior Data Scientist
IBM
Karthik Muthuraman
Software Engineer (Machine Learning)
IBM
Romeo Kienzler
Chief Data Scientist
IBM
Rav Ahuja
AI and Data Science Program Director
IBM
Steve Ryan
Instructor & Content Developer
Skill-Up Technologies
Ramesh Sannareddy
Content Developer
Skillup

Experts from IBM committed to teaching online
