As data plays a larger part in every organisation, especially digital ones, it becomes more and more necessary to understand where data comes from, how it's stored, and how it's manipulated.
Module 1 - Data collection
Introduction to data pipelines
Introduction to APIs
HTTP requests
Selenium for web scraping
Performing basic web automation actions
Referencing HTML elements using XPath
Cron
Module 2 - Data formats & the Pandas Python library
CSV, JSON, Parquet file formats
Intro to Pandas
Module 3 - Data cleaning
Reasons for and approaches to data cleaning
Handling missing data
Module 4 - Intro to Cloud
Data Lakes and Warehouses
AWS S3
AWS DynamoDB
AWS RDS
Module 5 - SQL
SQL basics
SQL Join operations
SQL Aggregations
SQL Subqueries
Module 6 - ETL, Distributed Computing & Data Versioning
Intro to Databricks
Distributed computing with PySpark
ETL
Data Versioning
This chapter is a comprehensive introduction to data science. We'll start by learning how to explore and visualise your data, introducing the industry-standard tools for doing so.
Modelling data is a key part of data science, and a key tool for doing this is machine learning (ML). There will be a deep focus on being practical - actually building every algorithm you learn and knowing how and when to apply it. We’ll begin with an introduction to the basic machine learning problems and the different types of ML. In successive sessions, you'll learn the theory behind a particular algorithm before implementing it and using it on a real dataset.
In this unit, we'll also introduce deep learning - a class of models for representing more complex relationships within data, especially unstructured data like images.
Module 1 - Exploratory data analysis
Descriptive statistics
Data visualisation
Module 2 - Introduction to machine learning
What is ML?
When (NOT) to use ML?
Linear regression
Scikit-learn
Validation and testing
Hyperparameters, grid search and K-fold cross validation
Module 3 - Theory
Evaluation metrics
Fitting Polynomials and the curse of dimensionality
Bias & variance, underfitting and overfitting
Regularisation
Maximum Likelihood Estimation (MLE)
Module 4 - Supervised models
Classification
Multiclass classification
Decision trees
Module 5 - Ensembles
Random forests
Module 6 - Unsupervised learning techniques
K-means clustering
PCA and t-SNE
Module 7 - PyTorch
Automatic differentiation
PyTorch Datasets and DataLoaders
PyTorch Lightning
Making custom datasets
Module 8 - Deep learning
Neural networks
Dropout
Batch Normalisation
Convolutional Neural Networks (CNNs)
Optimisation for deep learning
In this chapter, we'll focus on the engineering required to put models into the real world.
Module 1 - Acceleration
GPUs for PyTorch
AWS EC2
Cloud training
Module 2 - Running experiments
AWS SageMaker for training
Ray Tune
Module 3 - Pretrained models
Pretrained models
Transfer learning
Module 4 - Building cloud solutions
AWS SageMaker for deployment
AWS Lambda
AWS API Gateway
Demo building an end-to-end cloud solution
As more companies' AI systems mature, a need is emerging for a role that crosses ML engineering with DevOps - the ML Operations Engineer, or MLOps Engineer for short. This role is responsible for building, deploying and maintaining the ML infrastructure that other teams work with.
Module 1 - Deployment
Testing using pytest
Building an API
Docker
TorchServe
Module 2 - Automated deployment, CI/CD and scaling
Kubernetes
Kubeflow
AWS CloudFormation
GitHub Actions for CI/CD
Module 3 - Monitoring & continuous training (CT)
Prometheus
Grafana
Thanos
AWS SageMaker for monitoring
Continuous training
"By the time you graduate, you'll know everything a data scientist or machine learning engineer will need to have an impact in the workplace."
Evgeny Dyshlyuk - Research Scientist, Imperial College London
Data science and machine learning are about understanding how data can be used to make key business decisions and automate processes. Given the huge amounts of data being collected in the digital age, companies across all fields want to utilise their data to inform their decisions and improve operational efficiency. The demand for data scientists has tripled over the past 5 years. The number of machine learning engineer positions on Indeed quadrupled between 2015 and 2018.
This course was created with the intention of helping meet that demand. Most of our students have a STEM background and are required to have a basic understanding of linear algebra, statistics and coding. The 15-minute quiz you complete during the application process will assess this and will give you access to pre-course material to fill in any gaps in knowledge you may have.
If you love solving problems across different fields using data and are looking to get hired doing this, you have come to the right place.
Throughout the programme, you will build a portfolio of projects that will showcase your practical skills. But we know there's more to getting hired than technical knowledge.
At the start of the programme you will have a consultation with one of our coaches to help figure out the optimal career for you. As you progress in your learning, our career coaches will help you polish your CV, hold mock interviews, audit your LinkedIn and keep you accountable in your job search process.
You will be recommended to exclusive roles with our hiring partners and be matched with an industry mentor to make sure you are ready for the workplace.
Our goal is for you to feel 100% confident going into any hiring process.
We connect our students to world-class AI industry mentors. They’ll lecture on technical topics in class, answer questions and share informal career advice in scheduled office hours.
Dedicated support means that on top of the 12 hours in class per week, you’ll have scheduled group office hours weekly, support through Slack and 1-on-1 sessions available to book.
Don’t waste a second. Learn from the comfort of your own home. Reach instructors instantly. Be ready with just an internet connection and your laptop.
Daniel started out as an analyst over 10 years ago. He quickly transitioned into data science, taking on various roles and honing his skills for 4 years.
He has since grown a passion for teaching, and has been delivering data science education for the past 3 years.
Nihir is a talented NLP specialist, starting his journey many years ago with his bachelor's in Computer Science and master's in Data Processing.
He is currently pursuing his PhD in Natural Language Processing at Imperial College London, in Dr Lucia Specia's Multimodal AI lab.
Szymon is our very own AI prodigy. He is a voracious coder and contributor to the coding community.
Over 408,000 people have read his answers to Python and machine learning questions on Stack Overflow (the Quora of coding). This puts him in the top 0.4% of contributors in the world!
He has written many open source libraries, which over 1,000 people have starred on GitHub.
Having contracted in machine learning for over 10 companies across 4 years, Ali knows all about how machine learning is deployed at a wide variety of companies.
He is passionate about education, having taught over 1,200 students over the years.
© Core AI Limited 2020