
APPLIED DATA SCIENCE AND MACHINE LEARNING

We’ve teamed up with GTK Cyber, the best at the intersection of data science and cybersecurity, for this interactive course. Together we’ve created an on-demand version of a course that was previously taught only live and privately to leading industry organizations.

Learn how to use data science techniques to quickly manipulate and analyze network and security data and uncover valuable insights from it.

Next Live Class Date – May 17th 2022


Course Objectives

  • Rapidly explore, visualize and analyze security data using open source tools

  • Analyze big data and make data-driven predictions through probabilistic modeling and statistical inference

  • Identify and deploy modeling in order to extract meaningful information for decision making

  • Construct, train, evaluate & deploy supervised ML models to solve difficult security related problems

  • Construct unsupervised models for anomaly detection and other exploratory analysis

32 Lecture Hours
Over 60 videos with your instructor

10 Modules
Curriculum detailed below

15 Labs
With over 12 hours of lab content

Digital Badge
Awarded upon course completion

Questions?

Book a call with our team

Course Overview

The Applied Data Science for Cybersecurity interactive course will teach cybersecurity professionals how to use data science techniques to quickly manipulate and analyze network & security data and ultimately uncover valuable insights from this data. Through cutting-edge labs and real-world data sets, students will gain real and applicable knowledge in data science specifically applied to cybersecurity challenges.

The course covers the entire data science process: data preparation, feature engineering and selection, exploratory data analysis, data visualization, machine learning, model evaluation and optimization, and finally implementation at scale. Participants will learn how to read data in a variety of common formats and then write scripts to analyze and visualize the data in meaningful ways. The course is specifically designed to provide sophisticated training in data science as applied to cybersecurity-related challenges and scenarios. It is beneficial for participants to be comfortable with the basics of Python programming, but this is not required to take the course.

Once you’ve completed this course, you will have the skills to read data in a variety of common formats and then write scripts to analyze and visualize that data.

Meet Your Instructors


Charles Givre, CISSP

Charles Givre is a solutions-focused Senior Technical Executive and Data Scientist with 20+ years of success across the technology, data science, fintech, education, and cybersecurity industries.


Curtis Lambert, CISSP

Curtis is a Lead Data Scientist at DataDistillr. He has been everything from a heavy equipment mechanic to a linguist but has always fostered a love of technology. Before joining DataDistillr and GTK he conducted cyber security analysis for the U.S. Army and DOD in roles ranging from data scientist to malware reverse engineer.

Curriculum

This module introduces the course, as well as key concepts of data science and their application to security:

• Introduce the data science process

• Discuss case studies of the application of data science and machine learning to security

• Introduce the Python data science ecosystem, including Jupyter Notebook and various Python modules

• Introduce the Griffon VM

This module introduces the concept of vectorized computing and shows how to create, manipulate, and summarize one-dimensional data using the Pandas module in Python. It also covers basic statistical concepts.

By the end of this module, the students will be able to:

• Create a Series object

• Explore and summarize data within a Series

• Understand and generate Tukey five-number summaries, as well as calculate other common statistical measures
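As a rough illustration of these objectives (a sketch, not actual course material), the snippet below builds a Pandas Series and summarizes it. The `bytes_sent` values are made up for the example. Note that `Series.quantile` uses linear interpolation by default, which can differ slightly from Tukey's original hinge definition on some data sets.

```python
import pandas as pd

# Hypothetical sample: bytes transferred per connection (made-up values).
bytes_sent = pd.Series([120, 340, 560, 780, 1000], name="bytes_sent")

# Five-number summary: minimum, first quartile, median, third quartile, maximum.
five_num = bytes_sent.quantile([0.0, 0.25, 0.5, 0.75, 1.0])
five_num.index = ["min", "q1", "median", "q3", "max"]

# Other common measures, computed with vectorized Series methods.
mean = bytes_sent.mean()
std = bytes_sent.std()  # sample standard deviation (ddof=1)

print(five_num)
print(f"mean={mean:.1f}, std={std:.1f}")
```

Because the operations are vectorized, the same calls work unchanged on a Series with millions of rows.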

This module builds on the concepts taught in Module 1 and introduces students to the DataFrame: a two-dimensional, vectorized data structure. This module also covers how to ingest security data directly into a Pandas DataFrame.

By the end of this module, students will be able to:

• Create a DataFrame using the various "read_" functions

• Import various security data into DataFrames

• Flatten complex nested data

• Join and merge data sets

• Calculate aggregate statistics
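The steps listed above can be sketched end to end in a few lines of Pandas. This is only an illustration under stated assumptions: the CSV text, the nested asset records, and all column names are invented for the example and are not taken from the course labs.

```python
import io

import pandas as pd

# Hypothetical connection log in CSV form (made-up values).
csv_text = """src_ip,dst_port,bytes
10.0.0.1,443,1200
10.0.0.2,22,300
10.0.0.1,80,500
"""
conns = pd.read_csv(io.StringIO(csv_text))

# Hypothetical nested records, e.g. parsed from a JSON asset inventory.
assets = [
    {"ip": "10.0.0.1", "host": {"name": "web01", "os": "linux"}},
    {"ip": "10.0.0.2", "host": {"name": "jump01", "os": "linux"}},
]
# json_normalize flattens nested dicts into columns such as "host.name".
assets_df = pd.json_normalize(assets)

# Left join: keep every connection and attach host details where available.
merged = conns.merge(assets_df, left_on="src_ip", right_on="ip", how="left")

# Aggregate: total bytes and number of connections per host.
summary = merged.groupby("host.name")["bytes"].agg(["sum", "count"])
print(summary)
```

A left join is used here so that connections from unknown hosts would still appear in `merged` (with missing host details) rather than being silently dropped.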

Data visualization is a powerful technique to have in your analytic toolkit.  This module will cover the theory of data visualization as well as the actual process and coding of creating effective visualizations.

By the end of this module, students will:

• Understand techniques for making effective data visualizations

• Be familiar with various modules in Python to create visualizations

• Be able to create both static and interactive visualizations

The Machine Learning modules walk the student through the machine learning process from beginning to end: feature engineering and selection, model creation, and ultimately evaluating and improving model performance.

This module introduces students to the concepts behind classifiers and various classification algorithms.  Students will also learn to evaluate classifier performance.

By the end of this module, students will:

• Understand the functioning of a classifier and various classification algorithms including Support Vector Machines, Decision Trees, and Random Forests

• Be able to create models using various classification algorithms in Python using Scikit-Learn.

• Be able to evaluate the performance of a model and tune its hyperparameters.
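A minimal sketch of this workflow with Scikit-Learn might look like the following. It assumes synthetic data from `make_classification` as a stand-in for labeled security data, and the hyperparameter grid is deliberately tiny and purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for labeled security data (e.g. benign vs. malicious).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Small, illustrative hyperparameter grid searched with 3-fold cross-validation.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="f1"
)
search.fit(X_train, y_train)

# Evaluate the best model on held-out data it never saw during tuning.
y_pred = search.predict(X_test)
print(search.best_params_)
print(classification_report(y_test, y_pred))
```

Keeping a separate test split matters: the cross-validated score guides tuning, while the held-out report estimates how the tuned model behaves on new data.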

Module 6 introduces students to unsupervised learning and how to apply it to security problems.   The module mainly focuses on clustering and its uses and limitations.  

By the end of this module, students will:

• Understand various distance measurement functions

• Understand the concepts behind K-Means and DBSCAN

• Be able to create clusterers and evaluate their performance
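To make these ideas concrete, here is a hedged sketch that clusters the same synthetic data with K-Means and DBSCAN. The blob centers, `eps`, and `min_samples` values are assumptions chosen so the example separates cleanly; real security data usually needs more careful tuning.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic data: three well-separated groups (stand-in for real features).
X, _ = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=0.5,
    random_state=0,
)
# Distance-based methods are sensitive to feature scale, so standardize first.
X = StandardScaler().fit_transform(X)

# K-Means needs the number of clusters up front.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# DBSCAN infers the number of clusters from density; label -1 marks noise.
dbscan_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)

# Silhouette score ranges from -1 to 1; higher means tighter, better-separated clusters.
kmeans_score = silhouette_score(X, kmeans_labels)
n_dbscan_clusters = len(set(dbscan_labels) - {-1})

print(f"K-Means silhouette: {kmeans_score:.2f}")
print(f"DBSCAN clusters found: {n_dbscan_clusters}")
```

The contrast illustrates one limitation of each approach: K-Means requires choosing `n_clusters` in advance, while DBSCAN's results depend heavily on `eps` and `min_samples`.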

This module covers various techniques that can be used to detect anomalies in security data.  This module will introduce the students to various approaches to anomaly detection, including forecasting, unsupervised machine learning and other statistical techniques. 

By the end of this module, students will:

• Understand how to detect anomalies in security data

• Be able to implement common unsupervised machine learning algorithms such as one-class support vector machines and isolation forests to detect anomalous data

• Be able to perform anomaly detection using statistical metrics such as the Grubbs’ test

• Understand the challenges associated with using Machine Learning to detect anomalies
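Two of the approaches named above can be sketched on a single synthetic series. This is an illustration only: the "requests per minute" data is generated, the spike is injected by hand, and `grubbs_test` is a hypothetical helper written for this example (it is not a function from any library). A one-class SVM would follow the same fit-and-predict pattern as the isolation forest.

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Synthetic "requests per minute": 200 normal observations plus one injected spike.
normal = rng.normal(loc=100, scale=5, size=200)
data = np.append(normal, 250.0)

# Isolation forest: fit_predict returns -1 for anomalies and 1 for inliers.
iso = IsolationForest(contamination=0.01, random_state=0)
iso_labels = iso.fit_predict(data.reshape(-1, 1))


def grubbs_test(values, alpha=0.05):
    """Two-sided Grubbs' test for a single outlier (assumes roughly normal data)."""
    n = len(values)
    mean, std = values.mean(), values.std(ddof=1)
    # The candidate outlier is the point farthest from the mean.
    idx = int(np.argmax(np.abs(values - mean)))
    g = abs(values[idx] - mean) / std
    # Critical value derived from the t distribution with n - 2 degrees of freedom.
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t**2 / (n - 2 + t**2))
    return idx, g, g_crit, bool(g > g_crit)


idx, g, g_crit, is_outlier = grubbs_test(data)
print(f"Isolation forest label for the spike: {iso_labels[-1]}")
print(f"Grubbs: index={idx}, G={g:.2f}, critical={g_crit:.2f}, outlier={is_outlier}")
```

The `contamination` setting shows one of the practical challenges: it forces the model to flag a fixed share of points, so some ordinary observations may be labeled anomalous even when only one true spike exists.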

Deep learning is one of the newest and most exciting areas in machine learning. In this module, we give students a conceptual overview of deep learning and its application to security problems. Given the complexity of the topic, this module is intended as a conceptual introduction, not an in-depth technical course.

By the end of this module, students will:

• Understand the basic concepts behind deep neural nets (DNN)

• Become familiar with the Python deep learning ecosystem

• Understand the concepts behind more advanced neural networks, such as convolutional neural networks (CNNs)

One of the most interesting topics in security data science is the possibility that machine learning models can be hacked. This module covers the real-world implications of hacking machine learning models and the techniques used to hack them.

By the end of this module, students will:

• Understand the risks of deploying ML models

• Understand the techniques used to hack ML models

• Understand ways of minimizing risk to a production model

Frequently Asked Questions

Tuition for the course is a one-time payment of $3000. This includes lifetime access to the course on our custom learning platform, along with any updates to the labs, lectures, and virtual machines.

The live course is $4000 and includes full access to the on-demand content.

  • Python for Data Analysis
  • Data Science for Business
  • Creating a Data-Driven Organization
  • Data-Driven Security
  • Mastering Machine Learning with scikit-learn
  • Hands-On Machine Learning with Scikit-Learn and TensorFlow
  • Learning Apache Drill
  • Deep Learning

It is required that participants are comfortable with the basics of Python programming.

You must be able to run a virtual machine on your computer. This requires a minimum of 8GB of RAM, a 2-core processor, and 60GB of free hard drive space.

It is strongly recommended to have at least 16GB of RAM, a 4-core processor, and 120GB of free hard drive space.

  • Anaconda
  • TensorFlow (and supporting libraries)
  • Numpy
  • Scikit-learn
  • YellowBrick
  • Seaborn
  • Pandas Profiling
  • Matplotlib
  • VMWare Workstation/Player/Fusion

If you purchase the on-demand version, the course is fully self-paced and can be completed whenever it is convenient for you.

If you register for the limited-time live classes, you will also receive the fully self-paced content.

Yes.

If you purchase this course, you are eligible for $500 off the Cyber Defense Analyst Bootcamp, on top of any other discounts available for that course.

You will receive a confirmation email after purchasing the course and must then create an account on the Level Effect Cyber Learning Platform.