ADS201 Applied Data Science for Artificial Intelligence and Cybersecurity

The Why Behind the Course

We created this course to teach cybersecurity professionals how to use AI/ML to defend their organizations. We also want cybersecurity professionals to understand how artificial intelligence applications can be attacked and what risks those attacks carry.

Course Description

The course covers the entire data science process: data preparation, feature engineering and selection, exploratory data analysis, data visualization, machine learning, model evaluation and optimization, and finally implementation at scale. Participants will learn how to read data in a variety of common formats, then write scripts to analyze and visualize that data in meaningful ways. The course is specifically designed to provide sophisticated training in data science as applied to cybersecurity-related challenges and scenarios.

Learning Outcomes

Anyone who wishes to incorporate automated data analysis, artificial intelligence, machine learning and data science into their cybersecurity work should take this course and expect the following outcomes:

Rapidly explore, visualize and analyze security data using open source tools
Analyze big data and make data-driven predictions through probabilistic modeling and statistical inference
Identify and deploy appropriate models to extract meaningful information for decision making
Construct, train, evaluate & deploy supervised ML models to solve difficult security related problems
Construct unsupervised models for anomaly detection and other exploratory analysis

Prerequisites

Level Effect’s Cybersecurity Fundamentals courses starting with IT
GTK Cyber's Python Programming for Data Science and Cybersecurity
0-1+ years of professional experience in technology, preferably within Data Science
Hobbyists with a solid understanding of data science, cybersecurity, or IT, plus some Python and Jupyter knowledge or a willingness to learn them

Course Author

Name: Charles Givre (LinkedIn)

Currently: Head of Artificial Intelligence, Stealth Startup

Bio: Charles is the CEO and founder of DataDistillr, which is dedicated to making the world's data easy to use and query. Prior to founding DataDistillr, Charles worked as a data scientist in cyber for JP Morgan and Deutsche Bank. Mr. Givre has taught (and is teaching) security data science courses at Blackhat and is a sought-after instructor. Mr. Givre co-authored the O'Reilly book Learning Apache Drill and is the PMC Chair for the Apache Drill project.

Course Author

Name: Curtis Lambert (LinkedIn)

Currently: Senior Data Scientist, Raytheon

Bio: Curtis has more than 15 years of experience supporting cyber security missions for the U.S. DOD, specializing in the application of data science techniques to national security challenges across cyberspace. He is a CISSP, holds multiple SANS certifications in cyber security, and loves taking on challenges others say can't be solved. Curtis started his career as a heavy equipment mechanic in central California, working on agricultural equipment. He spent 6 years in the U.S. Army as a linguist and data analyst before becoming a consultant with BAH, where he spent 9 more years supporting a variety of national security missions. He is a relentless pursuer of knowledge and constantly engages in self-education through books, videos, and courses.

Comprehensive Data Analysis

Gain hands-on experience with vectorized computing, data frame management, and creating both static and interactive visualizations, essential for data interpretation and presentation.

Machine Learning in Cybersecurity

Receive training tailored to cybersecurity applications, including hands-on work with classifiers, clustering, anomaly detection, and deep learning, all framed within security contexts. The course also addresses how machine learning models can be hacked, equipping students with the knowledge to protect AI systems.

AI Model Security Risks

Focus on the practical implications for cybersecurity and AI model hacking. Students explore neural networks, including CNNs and RNNs, learning to apply these to security tasks and understand how to safeguard against vulnerabilities in AI technologies.

The GTK Cyber course was well organized and flowed nicely, all of the topics were relevant.

Student

BlackHat

I found it informative and interesting.

Student

BlackHat

The Jupyter notebooks and answers were a big help.

Student

BlackHat

Learning Modules

Note - this content is not finalized and may be subject to change prior to release.

Introduction to Data Science
This module introduces the course, as well as key concepts of data science and their application to security.
- Define Python
- Introduce the data science process
- Discuss case studies of the application of data science and machine learning to security
- Introduce the Python data science ecosystem to include Jupyter Notebook and various Python modules
- Introduce the Griffon VM
Exploratory Data Analysis in One Dimension
This module introduces the concept of vectorized computing, and how to create, manipulate and summarize one dimensional data using the Pandas module in Python. Additionally, we will cover basic statistical concepts.
- Create a Series object
- Explore and summarize data within a Series
- Understand and generate Tukey five-number summaries, as well as calculate other common statistical measures
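As a rough illustration of what this module covers, the sketch below builds a Pandas Series and summarizes it. The variable names and sample values are made up for illustration and are not taken from the course materials. Note that quantile interpolation can differ slightly from Tukey's hinges on some data sets.

```python
import pandas as pd

# Hypothetical sample: bytes transferred per connection (made-up values).
bytes_sent = pd.Series([120, 340, 560, 780, 15000], name="bytes_sent")

# Vectorized arithmetic: the division applies to every element at once.
kilobytes = bytes_sent / 1024

# Five-number summary: minimum, lower quartile, median, upper quartile, maximum.
five_number = bytes_sent.quantile([0, 0.25, 0.5, 0.75, 1.0])
print(five_number)

# Other common statistical measures.
mean_bytes = bytes_sent.mean()
std_bytes = bytes_sent.std()
print(mean_bytes, std_bytes)
```

The single large value pulls the mean far above the median, which is the kind of skew a five-number summary makes easy to spot.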
Exploratory Data Analysis in Two Dimensions
This module builds on the concepts taught in module 1, and introduces the students to the DataFrame: a two dimensional, vectorized data structure. This module also covers how to directly ingest security data into a Pandas DataFrame.
- Create a DataFrame using the various read_ functions
- Import various security data into dataframes
- Flatten complex nested data
- Join and merge data sets
- Calculate aggregate statistics
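The steps listed above can be sketched in a few lines of Pandas. This is only an illustrative example: the event records, the asset inventory, and all column names are hypothetical and not drawn from the course data sets.

```python
import pandas as pd

# Hypothetical nested log records, e.g. parsed from JSON (made-up values).
events = [
    {"id": 1, "src": {"ip": "10.0.0.5", "port": 51515}, "bytes": 200},
    {"id": 2, "src": {"ip": "10.0.0.5", "port": 51516}, "bytes": 300},
    {"id": 3, "src": {"ip": "10.0.0.9", "port": 40000}, "bytes": 50},
]

# Flatten nested fields into columns such as "src.ip" and "src.port".
flat = pd.json_normalize(events)

# Hypothetical asset inventory used to enrich the events.
assets = pd.DataFrame({"ip": ["10.0.0.5", "10.0.0.9"], "host": ["web01", "db01"]})

# Join the two data sets on the source IP address.
merged = flat.merge(assets, left_on="src.ip", right_on="ip", how="left")

# Calculate aggregate statistics per host.
summary = merged.groupby("host")["bytes"].agg(["count", "sum", "mean"])
print(summary)
```

For data stored in files, the read_ functions (for example pd.read_csv or pd.read_json) would replace the in-memory list used here.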
Data Visualization
Data visualization is a powerful technique to have in your analytic toolkit. This module will cover the theory of data visualization as well as the actual process and coding of creating effective visualizations.
- Understand techniques for making effective data visualizations
- Be familiar with various modules in Python to create visualizations
- Be able to create both static and interactive visualizations
Machine Learning - Feature Engineering
The Machine Learning modules walk the student through the machine learning process from beginning to end, starting with feature engineering and selection, model creation and ultimately evaluating and improving model performance.
- Complete walkthrough of the machine learning process
Supervised Machine Learning
This module introduces students to the concepts behind classifiers and various classification algorithms. Students will also learn to evaluate classifier performance.
- Understand the functioning of a classifier and various classification algorithms including Support Vector Machines, Decision Trees, and Random Forest
- Be able to create models using various classification algorithms in Python using Scikit-Learn
- Be able to evaluate the performance of a model and tune its hyperparameters.
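A minimal sketch of this workflow in Scikit-Learn follows. It assumes synthetic data from make_classification as a stand-in for labeled security data, and the hyperparameter grid is an arbitrary choice for illustration, not the course's recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for labeled security data (e.g. benign vs. malicious).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Tune two hyperparameters with cross-validated grid search.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="f1"
)
search.fit(X_train, y_train)

# Evaluate the best model on data it has not seen during training.
y_pred = search.predict(X_test)
accuracy = (y_pred == y_test).mean()
print(search.best_params_)
print(classification_report(y_test, y_pred))
```

Holding out a test set before tuning keeps the final evaluation independent of the hyperparameter search.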
Unsupervised Learning
This module introduces students to unsupervised learning and how to apply it to security problems. The module mainly focuses on clustering and its uses and limitations.
- Understand various distance measurement functions
- Understand the concepts behind K-Means and DBSCAN
- Be able to create clusterers and evaluate their performance
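The concepts above can be sketched with Scikit-Learn as follows. The blob data, the eps and min_samples values, and the choice of silhouette score as the evaluation metric are assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances, silhouette_score

# Two distance functions applied to the same pair of hypothetical points.
points = np.array([[0.0, 0.0], [3.0, 4.0]])
euclidean = pairwise_distances(points, metric="euclidean")
manhattan = pairwise_distances(points, metric="manhattan")

# Synthetic data: three well-separated groups (made-up, not security data).
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)

# K-Means needs the number of clusters up front.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
kmeans_score = silhouette_score(X, kmeans_labels)

# DBSCAN infers the number of clusters from density; label -1 marks noise.
dbscan_labels = DBSCAN(eps=1.5, min_samples=5).fit_predict(X)
n_dbscan_clusters = len(set(dbscan_labels) - {-1})

print(kmeans_score, n_dbscan_clusters)
```

On real security data the groups are rarely this clean, which is one reason the module also covers the limitations of clustering.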
Anomaly Detection
This module covers various techniques that can be used to detect anomalies in security data. This module will introduce the students to various approaches to anomaly detection, including forecasting, unsupervised machine learning and other statistical techniques.
- Understand different approaches for detecting anomalies in security data
- Be able to implement common unsupervised machine learning algorithms such as one-class support vector machines and isolation forests to detect anomalous data
- Be able to perform anomaly detection using statistical metrics such as the Grubbs’ test
- Understand the challenges associated with using Machine Learning to detect anomalies
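Two of the approaches named above can be sketched briefly. The grubbs_outlier helper below is a hypothetical function written for this example (a two-sided Grubbs' test for a single outlier, assuming roughly normal data), and the login counts are made-up values. The contamination setting for the isolation forest is likewise an assumption.

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import IsolationForest


def grubbs_outlier(values, alpha=0.05):
    """Two-sided Grubbs' test: return (index, is_outlier) for the most extreme value."""
    x = np.asarray(values, dtype=float)
    n = len(x)
    deviations = np.abs(x - x.mean())
    idx = int(deviations.argmax())
    # Test statistic: largest absolute deviation in units of sample standard deviation.
    g = deviations[idx] / x.std(ddof=1)
    # Critical value derived from the t distribution with n - 2 degrees of freedom.
    t_crit = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t_crit**2 / (n - 2 + t_crit**2))
    return idx, bool(g > g_crit)


# Hypothetical logins per hour for one account (made-up values).
logins_per_hour = [12, 15, 11, 14, 13, 12, 16, 13, 14, 95]

idx, is_outlier = grubbs_outlier(logins_per_hour)
print(idx, is_outlier)

# Isolation forest on the same data; predict() returns -1 for anomalies.
X = np.array(logins_per_hour, dtype=float).reshape(-1, 1)
forest = IsolationForest(contamination=0.1, random_state=0).fit(X)
flags = forest.predict(X)
print(flags)
```

Both methods flag the spike here, but on real traffic the choice of alpha or contamination strongly affects the false positive rate, which is one of the challenges this module discusses.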
Hunting with Data Science
Deep learning is one of the most exciting and new areas in machine learning. In this module, we will give the students a conceptual overview of deep learning and its application to security problems. Given the complexity of this topic, this module is intended to be a more conceptual introduction, not an in-depth technical course.
- Understand the basic concepts behind deep neural nets (DNN)
- Become familiar with the Python deep learning ecosystem
- Understand the concepts behind more advanced NN such as convolutional NNs
Deep Learning
This module covers the topics of deep learning, neural networks, convolutional neural networks, recurrent neural networks, and their applications in cybersecurity. It also discusses the use of deep learning tools such as TensorFlow and Keras.
- Gain understanding of deep learning concepts including neural networks, CNNs, and RNNs.
- Learn how to apply deep learning techniques to cybersecurity tasks.
- Acquire practical skills in using TensorFlow, Keras, and Word2Vec for implementing deep learning models and processing cybersecurity data.
Hacking Machine Learning Models
One of the most interesting topics in security data science is the possibility that machine learning models can be hacked. This module covers the real-world implications of hacking machine learning models, and the techniques used to hack.
- Understand the risks of deploying ML models
- Understand the techniques used to hack ML models
- Understand ways of minimizing risk to a production model
Course Assessment

Course Cost

Frequently Asked Questions

What are some recommended readings for this course?

Are there any prerequisites for this course?

What additional resources are recommended?