@realasib
Last active August 16, 2024 16:39
Data Analyst Roadmap

Months 1-3: Math, Statistics, Python (Beginner to Intermediate), and MS Excel

1. Math Foundation

Weeks 1-3: Algebra

  • Day 1-2: Introduction to Algebra

    • Topics: Variables, expressions, and equations.
    • Resources: Khan Academy - Algebra 1: Intro to algebra.
    • Practice: Solve simple equations and inequalities.
  • Day 3-4: Solving Linear Equations

    • Topics: One-variable linear equations.
    • Resources: Khan Academy - Solving equations.
    • Practice: 10-15 practice problems.
  • Day 5-6: Graphing Linear Functions

    • Topics: Slope-intercept form, graphing.
    • Resources: Khan Academy - Graphing lines.
    • Practice: Graph various linear equations.
  • Day 7: Review and Practice

    • Activities: Review concepts, solve mixed problems, and take a practice quiz.

Weeks 4-6: Probability Basics

  • Day 8-10: Introduction to Probability

    • Topics: Basic probability concepts, probability rules.
    • Resources: Khan Academy - Probability and Statistics.
    • Practice: Simple probability exercises.
  • Day 11-12: Conditional Probability

    • Topics: Conditional probability, independence.
    • Resources: Khan Academy - Conditional probability.
    • Practice: Problems involving conditional probability.
  • Day 13-14: Bayes' Theorem

    • Topics: Bayes' theorem and its applications.
    • Resources: Khan Academy - Bayes' theorem.
    • Practice: Exercises applying Bayes' theorem.
  • Day 15: Review and Practice

    • Activities: Review probability topics, solve mixed problems, and take a quiz.
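
As a worked instance of the Bayes' theorem topic above, here is a minimal sketch using a made-up diagnostic-test scenario. The function name and all three input numbers are assumptions chosen for illustration; exact fractions are used so the arithmetic can be checked by hand.

```python
from fractions import Fraction


def bayes_posterior(prior, sensitivity, false_positive_rate):
    """Return P(condition | positive test) using Bayes' theorem."""
    # Law of total probability:
    # P(+) = P(+ | C) * P(C) + P(+ | not C) * P(not C)
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


# Hypothetical inputs: 1% prevalence, 95% sensitivity, 5% false-positive rate.
posterior = bayes_posterior(Fraction(1, 100), Fraction(95, 100), Fraction(5, 100))
print(posterior, float(posterior))
```

Even with a fairly accurate test, the posterior here stays below 20% because the condition is rare, which is the usual point of this kind of exercise.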

Weeks 7-9: Statistics

  • Day 16-18: Descriptive Statistics

    • Topics: Mean, median, mode, variance, standard deviation.
    • Resources: Khan Academy - Descriptive statistics.
    • Practice: Calculate descriptive statistics for sample datasets.
  • Day 19-21: Probability Distributions

    • Topics: Normal distribution, binomial distribution.
    • Resources: Khan Academy - Probability distributions.
    • Practice: Work with distributions and calculate probabilities.
  • Day 22-23: Central Limit Theorem

    • Topics: Understanding the central limit theorem and its importance.
    • Resources: Khan Academy - Central Limit Theorem.
    • Practice: Simulations and exercises using the central limit theorem.
  • Day 24: Review and Practice

    • Activities: Review statistics concepts, complete practice problems, and take a quiz.
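
One way to approach the central limit theorem simulations mentioned above is sketched below, using only the standard library. The exponential population, the sample size of 50, and the seed are arbitrary choices for illustration, not part of the roadmap.

```python
import random
import statistics


def sample_means(n_samples, sample_size, rng):
    """Means of repeated samples drawn from a skewed (exponential) population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]


rng = random.Random(42)  # fixed seed so the run is repeatable
means = sample_means(n_samples=2000, sample_size=50, rng=rng)

# The population has mean 1 and standard deviation 1, so theory predicts the
# sample means cluster near 1 with spread close to 1 / sqrt(50), about 0.141.
print(f"mean of sample means: {statistics.fmean(means):.3f}")
print(f"std of sample means:  {statistics.stdev(means):.3f}")
```

Plotting a histogram of `means` should show a roughly bell-shaped curve even though the underlying population is strongly skewed.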

Weeks 10-12: Linear Algebra and Calculus Basics

  • Day 25-27: Linear Algebra Basics

    • Topics: Vectors, matrices, matrix operations.
    • Resources: Khan Academy - Linear Algebra.
    • Practice: Solve problems related to vectors and matrices.
  • Day 28-30: Matrix Operations

    • Topics: Matrix addition, multiplication, inversion.
    • Resources: Khan Academy - Matrix operations.
    • Practice: Perform various matrix operations.
  • Day 31-33: Introduction to Calculus

    • Topics: Derivatives, limits, integrals.
    • Resources: Khan Academy - Calculus.
    • Practice: Basic calculus problems involving differentiation and integration.
  • Day 34-35: Review and Practice

    • Activities: Review linear algebra and calculus topics, solve mixed problems, and take a quiz.
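
The matrix operations and basic differentiation listed above can be tried out in NumPy. This is a small sketch with arbitrary example matrices; it assumes NumPy is installed, and `np.gradient` is used here as one simple way to approximate a derivative numerically.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0, 0.0],
              [4.0, 1.0]])

total = A + B              # element-wise addition
product = A @ B            # matrix multiplication
A_inv = np.linalg.inv(A)   # inverse (A must be square and non-singular)

# A times its inverse should be the identity matrix, up to rounding error.
print(np.allclose(A @ A_inv, np.eye(2)))

# Numerical derivative of f(x) = x**2 on a grid; the exact derivative is 2x.
x = np.linspace(0.0, 2.0, 201)
dfdx = np.gradient(x**2, x)
max_error = np.max(np.abs(dfdx[1:-1] - 2 * x[1:-1]))  # interior points only
print(max_error)
```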

2. Python for Data Analysis (Beginner to Intermediate)

Weeks 1-2: Python Basics (Revisit)

  • Day 1-2: Python Syntax and Variables

    • Topics: Basic syntax, variables, and data types.
    • Resources: “Python for Everybody” on Coursera.
    • Practice: Write simple scripts to practice syntax and variable usage.
  • Day 3-4: Control Flow

    • Topics: Loops (for, while), conditional statements (if, elif, else).
    • Resources: Python documentation and tutorials.
    • Practice: Create scripts that use loops and conditionals.
  • Day 5-6: Functions

    • Topics: Defining and calling functions, arguments, and return values.
    • Resources: “Python for Everybody” - Functions.
    • Practice: Write functions to perform various tasks.
  • Day 7: Review and Practice

    • Activities: Review Python basics, complete exercises, and small projects.

Weeks 3-5: Data Manipulation with Pandas

  • Day 8-10: Introduction to Pandas

    • Topics: Series, DataFrames, basic operations.
    • Resources: “Python for Data Analysis” by Wes McKinney.
    • Practice: Create and manipulate DataFrames.
  • Day 11-13: Data Cleaning

    • Topics: Handling missing values, data types, and indexing.
    • Resources: Pandas documentation.
    • Practice: Clean and preprocess sample datasets.
  • Day 14-15: Data Aggregation and Grouping

    • Topics: GroupBy operations, aggregations.
    • Resources: Pandas tutorials.
    • Practice: Perform grouping and aggregation tasks on datasets.
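
The cleaning and grouping steps above might look like the following in pandas. The DataFrame contents and column names are hypothetical, and filling a missing count with 0 is just one possible choice; dropping the row or imputing a value may suit a real dataset better.

```python
import pandas as pd

# Hypothetical sales records: one missing value, and numbers stored as text.
df = pd.DataFrame({
    "region": ["North", "South", "North", "South", "East"],
    "units": ["10", "4", None, "6", "3"],
    "price": [2.5, 3.0, 2.5, 3.0, 4.0],
})

# Cleaning: convert text to numbers, then fill the missing count with 0.
df["units"] = pd.to_numeric(df["units"]).fillna(0).astype(int)
df["revenue"] = df["units"] * df["price"]

# Aggregation: total units and revenue per region.
summary = df.groupby("region")[["units", "revenue"]].sum()
print(summary)
```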

Weeks 6-8: NumPy and Matplotlib

  • Day 16-18: NumPy Basics

    • Topics: Arrays, array operations, broadcasting.
    • Resources: NumPy documentation and tutorials.
    • Practice: Work with NumPy arrays and perform mathematical operations.
  • Day 19-21: Introduction to Matplotlib

    • Topics: Basic plotting, customizing plots.
    • Resources: Matplotlib documentation.
    • Practice: Create and customize basic plots.
  • Day 22-23: Advanced Matplotlib

    • Topics: Subplots, multi-axes, and complex visualizations.
    • Resources: Matplotlib tutorials and examples.
    • Practice: Build more complex visualizations.
  • Day 24: Review and Practice

    • Activities: Review NumPy and Matplotlib concepts, complete exercises, and small projects.
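
Broadcasting, listed under NumPy Basics above, is easiest to see on a tiny array. The sketch below centers each column of a made-up 3x2 array by subtracting the column means; the data values are assumptions for illustration.

```python
import numpy as np

# Hypothetical data: 3 observations (rows) of 2 features (columns).
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 60.0]])

col_means = data.mean(axis=0)   # shape (2,)
centered = data - col_means     # shape (3, 2) minus shape (2,): applied to every row

print(col_means)
print(centered)
```

No explicit loop is needed: NumPy stretches the shape `(2,)` array across the three rows because the trailing dimensions match.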

Weeks 9-12: Seaborn and Advanced Visualizations

  • Day 25-27: Introduction to Seaborn

    • Topics: Statistical plots, color palettes.
    • Resources: Seaborn documentation and tutorials.
    • Practice: Create various types of plots with Seaborn.
  • Day 28-30: Advanced Seaborn

    • Topics: Heatmaps, pair plots.
    • Resources: Seaborn documentation.
    • Practice: Build complex visualizations.
  • Day 31-33: Integrating Visualizations

    • Topics: Combining Matplotlib and Seaborn plots.
    • Resources: Example notebooks and tutorials.
    • Practice: Create integrated visual reports.
  • Day 34-35: Review and Practice

    • Activities: Review advanced visualization techniques, complete exercises, and small projects.

Months 4-6: SQL, EDA, Advanced Python, and Excel Projects

3. Mastering MS Excel

Weeks 1-3: Excel Basics

  • Day 1-2: Excel Interface and Basic Functions

    • Topics: Navigation, cell formatting, basic formulas (SUM, AVERAGE).
    • Resources: Excel Easy.
    • Practice: Perform basic calculations and formatting.
  • Day 3-4: Working with Data

    • Topics: Data entry, sorting, filtering.
    • Resources: YouTube tutorials on Excel basics.
    • Practice: Sort and filter data in sample spreadsheets.
  • Day 5-6: Formatting and Charts

    • Topics: Conditional formatting, basic charts (bar, line).
    • Resources: ExcelJet.
    • Practice: Create and format basic charts.
  • Day 7: Review and Practice

    • Activities: Review Excel basics, complete exercises, and small projects.

Weeks 4-6: Advanced Excel Formulas and Functions

  • Day 8-10: Advanced Formulas

    • Topics: VLOOKUP, INDEX-MATCH, IFERROR.
    • Resources: ExcelJet, ExcelIsFun YouTube channel.
    • Practice: Use advanced formulas in sample scenarios.
  • Day 11-13: Data Manipulation

    • Topics: Text functions, date functions, complex formula combinations.
    • Resources: Excel tutorials.
    • Practice: Apply functions to manipulate and analyze data.
  • Day 14-15: PivotTables and PivotCharts

    • Topics: Creating PivotTables, summarizing data, creating PivotCharts.

    • Resources: Coursera’s “Excel Skills for Business” by Macquarie University.

    • Practice: Create and use PivotTables and PivotCharts with sample data.

Weeks 7-9: Data Analysis with Excel

  • Day 16-18: Advanced Data Analysis

    • Topics: Data validation, conditional formatting.
    • Resources: Excel tutorials and guides.
    • Practice: Apply data validation rules and conditional formatting.
  • Day 19-21: Excel Dashboards

    • Topics: Creating interactive dashboards, combining charts and data.
    • Resources: YouTube tutorials on creating dashboards.
    • Practice: Build a dashboard for a sample dataset.
  • Day 22-23: Macros and VBA Basics

    • Topics: Recording macros, basic VBA programming.
    • Resources: “Excel VBA Programming For Dummies” by Michael Alexander.
    • Practice: Record and edit macros, write simple VBA scripts.
  • Day 24: Review and Practice

    • Activities: Review advanced Excel topics, complete exercises, and small projects.

4. SQL Mastery

Weeks 10-12: SQL Mastery

  • Day 25-27: SQL Basics

    • Topics: SELECT, WHERE, JOIN, GROUP BY.
    • Resources: “SQL for Data Science” on Coursera.
    • Practice: Write and execute basic SQL queries.
  • Day 28-30: Intermediate SQL

    • Topics: Subqueries, complex joins, UNION.
    • Resources: Mode Analytics SQL Tutorial.
    • Practice: Solve complex SQL problems.
  • Day 31-33: Advanced SQL

    • Topics: Window functions, CTEs (Common Table Expressions).
    • Resources: “Advanced SQL for Data Scientists” on Udemy.
    • Practice: Write advanced queries and optimize performance.
  • Day 34-35: Review and Practice

    • Activities: Review SQL concepts, complete exercises, and small projects.
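
To practice CTEs and window functions without installing a database server, Python's built-in `sqlite3` module is one option. The sketch below uses a hypothetical `orders` table; it assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, order_day INTEGER, amount REAL);
    INSERT INTO orders VALUES
        ('ana', 1, 20.0), ('ana', 2, 35.0), ('ana', 3, 10.0),
        ('ben', 1, 50.0), ('ben', 2, 5.0);
""")

# The CTE adds a running total and a per-customer rank by amount;
# the outer query keeps each customer's largest order.
query = """
WITH ranked AS (
    SELECT customer,
           order_day,
           amount,
           SUM(amount) OVER (PARTITION BY customer ORDER BY order_day) AS running_total,
           RANK() OVER (PARTITION BY customer ORDER BY amount DESC) AS amount_rank
    FROM orders
)
SELECT customer, order_day, amount, running_total
FROM ranked
WHERE amount_rank = 1
ORDER BY customer;
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```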

5. Exploratory Data Analysis (EDA)

Weeks 1-2: Data Cleaning

  • Day 1-2: Data Cleaning Techniques

    • Topics: Handling missing values, outliers.
    • Resources: Kaggle’s “Data Cleaning” tutorial.
    • Practice: Clean sample datasets.
  • Day 3-4: Data Normalization

    • Topics: Standardization, normalization.
    • Resources: Tutorials on data normalization.
    • Practice: Apply normalization techniques.
  • Day 5-6: Data Transformation

    • Topics: Reshaping, merging datasets.
    • Resources: Pandas documentation.
    • Practice: Transform datasets for analysis.
  • Day 7: Review and Practice

    • Activities: Review data cleaning and transformation, complete exercises.
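
The two rescaling techniques named above, normalization and standardization, can be written in a few lines. This is a minimal sketch with hypothetical function names and sample values; it assumes the input is not constant, since a constant column would cause a division by zero.

```python
import statistics


def min_max_scale(values):
    """Normalization: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Standardization: shift to mean 0 and scale to population std 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]  # made-up sample
scaled = min_max_scale(heights_cm)
standardized = z_score(heights_cm)
print(scaled)
print(standardized)
```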

Weeks 3-5: Descriptive Statistics and Visualization

  • Day 8-10: Descriptive Statistics

    • Topics: Mean, median, mode, dispersion.
    • Resources: “Python Data Science Handbook” by Jake VanderPlas.
    • Practice: Calculate and interpret descriptive statistics.
  • Day 11-13: Exploratory Data Analysis (EDA)

    • Topics: Distribution analysis, correlation.
    • Resources: Kaggle EDA tutorials.
    • Practice: Perform EDA on sample datasets.
  • Day 14-15: Reporting and Documentation

    • Topics: Writing clear reports, using Jupyter notebooks.
    • Resources: Medium articles on data storytelling.
    • Practice: Document EDA results in a Jupyter notebook.
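
For the descriptive statistics and correlation topics above, the standard library is enough for a first pass. The sketch below computes Pearson's correlation coefficient by hand so the formula stays visible; the `hours` and `scores` values are invented for illustration.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


hours = [1, 2, 3, 4, 5]          # hypothetical study hours
scores = [52, 55, 61, 64, 70]    # hypothetical test scores

print("mean:", statistics.fmean(scores))
print("median:", statistics.median(scores))
print("sample std:", round(statistics.stdev(scores), 2))
print("r:", round(pearson_r(hours, scores), 3))
```

In a notebook, `pandas.DataFrame.describe()` and `DataFrame.corr()` give the same kind of summary for whole tables.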

Weeks 6-12: Advanced Python Projects and Excel Projects

  • Day 16-21: Python Projects

    • Projects: Automate Excel tasks, build a web scraper, build a data pipeline.
    • Resources: Python libraries (openpyxl, BeautifulSoup).
    • Practice: Develop and document projects.
  • Day 22-27: Advanced Excel Projects

    • Projects: Financial dashboard, inventory system, business data analysis.
    • Resources: Excel resources and VBA tutorials.
    • Practice: Implement and refine projects.
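
The data pipeline project idea above can start very small. The sketch below is one possible shape for it, with three hypothetical stages (`ingest`, `clean`, `analyze`) and an inline CSV string standing in for a real file or scraped source.

```python
import csv
import io
import statistics

# Stand-in for a downloaded file; the columns and values are hypothetical.
RAW_CSV = """city,temp_c
Oslo,4.5
Lima,
Cairo,31.0
Oslo,5.5
Cairo,n/a
"""


def ingest(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Keep only rows whose temperature parses as a number."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append({"city": row["city"], "temp_c": float(row["temp_c"])})
        except ValueError:
            continue  # skip blank or non-numeric readings
    return cleaned


def analyze(rows):
    """Average temperature per city."""
    by_city = {}
    for row in rows:
        by_city.setdefault(row["city"], []).append(row["temp_c"])
    return {city: statistics.fmean(temps) for city, temps in by_city.items()}


report = analyze(clean(ingest(RAW_CSV)))
print(report)
```

Keeping each stage as its own function makes it straightforward to swap the inline string for a file, an API call, or scraper output later.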

Months 7-9: Data Visualization, Advanced Projects, and Excel Mastery

Weeks 1-3: Tableau/Power BI Basics

  • Day 1-2: Tableau Basics

    • Topics: Interface, basic charts, data import.
    • Resources: “Data Visualization with Tableau” on Coursera.
    • Practice: Create simple dashboards.
  • Day 3-4: Power BI Basics

    • Topics: Interface, importing data, basic visualizations.
    • Resources: Power BI YouTube tutorials.
    • Practice: Build basic reports.
  • Day 5-6: Advanced Visualizations

    • Topics: Interactive dashboards, advanced charts.
    • Resources: Tableau Public gallery.
    • Practice: Create interactive visualizations.
  • Day 7: Review and Practice

    • Activities: Review visualization tools, complete exercises.

Weeks 4-6: Advanced Data Projects

  • Day 8-10: Machine Learning Basics

    • Topics: Linear regression, decision trees.
    • Resources: “Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow” by Aurélien Géron.
    • Practice: Implement basic ML models.
  • Day 11-13: Model Evaluation

    • Topics: Cross-validation, hyperparameter tuning.
    • Resources: Coursera’s “Machine Learning” by Andrew Ng.
    • Practice: Evaluate and improve ML models.
  • Day 14-15: Portfolio Projects

    • Projects: Customer churn prediction, market basket analysis.
    • Resources: Kaggle competitions.
    • Practice: Develop and showcase projects.
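
Cross-validation, listed under Model Evaluation above, is usually done with scikit-learn, but writing it once by hand helps clarify what the library does. The sketch below is a from-scratch illustration under stated assumptions: a one-variable least-squares line as the model, synthetic data, and hypothetical helper names.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_mse(xs, ys, k, seed=0):
    """Average held-out mean squared error over k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint index groups
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
        scores.append(statistics.fmean(errors))
    return statistics.fmean(scores)


# Synthetic data: y = 3x + 2 plus Gaussian noise with standard deviation 1.
rng = random.Random(1)
xs = [float(x) for x in range(40)]
ys = [3 * x + 2 + rng.gauss(0, 1) for x in xs]

cv_mse = k_fold_mse(xs, ys, k=5)
print(f"5-fold CV MSE: {cv_mse:.3f}")
```

Because each point is scored only by a model that never saw it, the resulting error should land near the noise variance rather than looking optimistically low.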

Weeks 7-12: Job Preparation

  • Day 16-18: Resume and Portfolio

    • Tasks: Update resume, create portfolio.
    • Resources: LinkedIn Learning.
    • Practice: Refine and tailor resume.
  • Day 19-21: Networking and Applications

    • Tasks: Connect with professionals, apply to jobs.
    • Resources: LinkedIn, Data Science conferences.
    • Practice: Apply to jobs, follow up on applications.
  • Day 22-28: Interview Preparation

    • Technical: Practice SQL, Python, Excel problems.
    • Behavioral: Prepare for common questions.
    • Resources: Pramp, LeetCode.
    • Practice: Mock interviews, review answers.

Data Analyst Detailed Roadmap and Syllabus

Table of Contents

  1. Months 1-3: Math, Statistics, Python (Beginner to Intermediate), and MS Excel

  2. Months 4-6: SQL, EDA, Advanced Python, and Excel Projects

  3. Months 7-9: Data Visualization, Advanced Projects, and Excel Mastery

  4. Months 10-12: Job Preparation, Networking, and Advanced Topics

Months 1-3: Math, Statistics, Python (Beginner to Intermediate), and MS Excel

  1. Math Foundation

    • Week 1-3: Algebra

      • Topics: Equations, inequalities, functions, logarithms.
      • Resources: Khan Academy’s Algebra course.
      • Practice: Solve 10-15 problems daily.
    • Week 4-6: Probability Basics

      • Topics: Probability rules, conditional probability, Bayes' theorem.
      • Resources: “Introduction to Probability” by MIT OpenCourseWare.
      • Practice: Solve simple probability problems and simulate them in Python.
    • Week 7-9: Statistics

      • Topics: Descriptive statistics, distributions (normal, binomial), central limit theorem.
      • Resources: Coursera’s “Statistics with Python” by University of Michigan.
      • Practice: Analyze datasets to calculate mean, variance, and visualize distributions.
    • Week 10-12: Linear Algebra and Calculus Basics

      • Linear Algebra: Vectors, matrices, dot product, matrix operations.
      • Calculus: Derivatives, integrals, application to data trends.
      • Resources: Khan Academy for Linear Algebra and Calculus.
      • Practice: Use Python (NumPy) for matrix operations and differentiation.
  2. Python for Data Analysis (Beginner to Intermediate)

    • Week 1-2: Python Basics (Revisit)

      • Topics: Variables, data types, loops, functions, file handling.
      • Resources: “Python for Everybody” by University of Michigan on Coursera.
      • Practice: Write basic scripts daily, focus on understanding syntax and control flow.
    • Week 3-5: Data Manipulation with Pandas

      • Topics: DataFrames, indexing, filtering, groupby, merging, and joining.
      • Resources: “Python for Data Analysis” by Wes McKinney.
      • Practice: Work with datasets, clean data, and perform operations like merging and grouping.
    • Week 6-8: NumPy and Matplotlib

      • NumPy: Arrays, broadcasting, matrix operations.
      • Matplotlib: Basic plots (line, bar, scatter), customizations.
      • Resources: Official documentation, YouTube tutorials.
      • Practice: Recreate charts from examples, apply NumPy operations to data.
    • Week 9-12: Seaborn and Advanced Visualizations

      • Seaborn: Statistical plots, heatmaps, pairplots.
      • Advanced Visualizations: Subplots, multi-axes, 3D plots.
      • Resources: Seaborn documentation, notebooks.
      • Practice: Create visual reports for small datasets, refine aesthetics.
  3. Mastering MS Excel

    • Week 1-3: Excel Basics

      • Topics: Interface, basic formulas, formatting, data entry.
      • Resources: Excel Easy, YouTube tutorials.
      • Practice: Daily tasks involving cell operations, basic calculations, and simple formatting.
    • Week 4-6: Advanced Excel Formulas and Functions

      • Topics: VLOOKUP, INDEX-MATCH, IFERROR, COUNTIF, SUMIFS, text functions.
      • Resources: ExcelJet, ExcelIsFun YouTube channel.
      • Practice: Use complex formulas in real-life scenarios, solve Excel problems.
    • Week 7-9: Data Analysis with Excel

      • Topics: PivotTables, PivotCharts, data validation, conditional formatting.
      • Resources: Coursera’s “Excel Skills for Business” by Macquarie University.
      • Practice: Analyze datasets, summarize data with PivotTables, and visualize with PivotCharts.
    • Week 10-12: Excel Macros and VBA

      • Topics: Introduction to Macros, recording macros, basic VBA programming.
      • Resources: “Excel VBA Programming For Dummies” by Michael Alexander.
      • Practice: Automate repetitive tasks, create custom functions, and enhance productivity with VBA.

Months 4-6: SQL, EDA, Advanced Python, and Excel Projects

  1. SQL Mastery

    • Week 1-3: SQL Basics

      • Topics: SELECT, WHERE, GROUP BY, HAVING, JOIN, subqueries.
      • Resources: “SQL for Data Science” by University of California, Davis on Coursera.
      • Practice: Daily SQL challenges on HackerRank or LeetCode.
    • Week 4-6: Advanced SQL

      • Topics: Window functions, CTEs (Common Table Expressions), complex joins.
      • Resources: Mode Analytics SQL Tutorial, “Advanced SQL for Data Scientists” on Udemy.
      • Practice: Write complex queries, optimize performance, join multiple tables.
  2. Exploratory Data Analysis (EDA)

    • Week 1-2: Data Cleaning

      • Topics: Handling missing values, outliers, data normalization.
      • Resources: Kaggle’s “Data Cleaning” challenge.
      • Practice: Clean and prepare datasets for analysis.
    • Week 3-5: Descriptive Statistics and Visualization

      • Topics: Descriptive stats, distribution analysis, correlation.
      • Resources: “Python Data Science Handbook” by Jake VanderPlas.
      • Practice: Perform EDA on public datasets (e.g., Titanic, Iris).
    • Week 6: Reporting and Documentation

      • Topics: Writing clear, concise analysis reports.
      • Resources: Medium articles on storytelling with data.
      • Practice: Document EDA results in Jupyter notebooks, share on GitHub.
  3. Advanced Python Projects

    • Week 7-12: Complete 2-3 Small to Intermediate Projects
      • Project Ideas:
        • Automate Excel tasks with Python (e.g., data extraction and analysis).
        • Build a simple web scraper to gather data for analysis.
        • Create a data pipeline that ingests, cleans, and analyzes data.
      • Resources: Python libraries like openpyxl for Excel, BeautifulSoup for web scraping.
      • Practice: Implement projects end-to-end, document and share on GitHub.
  4. Advanced Excel Projects

    • Week 7-12: Work on Real-World Excel Projects
      • Project Ideas:
        • Create a financial dashboard using advanced Excel functions and charts.
        • Develop an inventory management system with Excel and VBA.
        • Analyze business data and present findings using PivotTables and advanced visualizations.
      • Practice: Apply Excel skills to business scenarios, automate repetitive tasks with VBA.

Months 7-9: Data Visualization, Advanced Projects, and Excel Mastery

  1. Data Visualization Skills

    • Week 1-3: Tableau/Power BI Basics

      • Topics: Interface, importing data, basic charts, calculated fields.
      • Resources: “Data Visualization with Tableau” on Coursera, Power BI YouTube tutorials.
      • Practice: Create dashboards for small datasets.
    • Week 4-6: Advanced Visualization Techniques

      • Topics: Interactive dashboards, storytelling with data, custom visualizations.
      • Resources: Tableau Public gallery for inspiration.
      • Practice: Recreate advanced dashboards, integrate real-time data.
    • Week 7-9: Portfolio Projects

      • Project Ideas:
        • Build a Tableau dashboard for a public dataset.
        • Visualize global economic indicators.
        • Build an interactive Power BI report on company performance.
      • Practice: Document projects, share dashboards publicly.
  2. Advanced Data Projects

    • Week 1-6: Work on 2-3 Complex Projects

      • Project Ideas:
        • Customer churn prediction with logistic regression.
        • Market basket analysis using association rules.
        • Time series forecasting for sales data.
      • Resources: Kaggle competitions, data.world.
      • Practice: Apply full data analysis workflow: EDA, modeling, visualization, and reporting.
    • Week 7-12: Freelance and Open Source Contributions

      • Freelancing: Offer data analysis services on Upwork or Fiverr.
      • Open Source: Contribute to data science projects on GitHub.
      • Practice: Build a professional network, get feedback from the community.
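
For the market basket analysis project idea listed in this section, the core association-rule metrics can be computed directly before reaching for a dedicated library. This is a minimal sketch with made-up baskets and hypothetical helper names.

```python
# Hypothetical transactions; each set is one customer's basket.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset):
    """Fraction of baskets containing every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    joint = support(antecedent | consequent)
    confidence = joint / support(antecedent)
    lift = confidence / support(consequent)
    return joint, confidence, lift


joint, confidence, lift = rule_metrics({"bread"}, {"milk"})
print(joint, confidence, lift)
```

A lift below 1, as in this toy data, suggests the two items appear together slightly less often than they would if purchases were independent.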

Months 10-12: Job Preparation, Networking, and Advanced Topics

  1. Advanced Topics in Data Analysis

    • Week 1-3: Machine Learning Basics

      • Topics: Linear regression, decision trees, clustering.
      • Resources: “Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow” by Aurélien Géron.
      • Practice: Implement basic ML models using scikit-learn, analyze results.
    • Week 4-6: Model Evaluation and Improvement

      • Topics: Cross-validation, hyperparameter tuning, model interpretability.
      • Resources: Coursera’s “Machine Learning” by Andrew Ng.
      • Practice: Experiment with improving models on Kaggle datasets.
  2. Job Preparation

    • Week 1-3: Resume and Portfolio

      • Tasks: Update resume, focus on projects and skills.
      • Resources: LinkedIn Learning courses on resume writing.
      • Practice: Tailor resume for each job application, use keywords from job descriptions.
    • Week 4-6: Networking and Applications

      • Tasks: Connect with professionals on LinkedIn, attend webinars, apply to jobs.
      • Resources: LinkedIn, Data Science conferences.
      • Practice: Apply to 10-15 jobs weekly, follow up with connections.
    • Week 7-12: Interview Preparation

      • Technical Interviews: Practice SQL, Python, Excel, and data analysis problems.
      • Behavioral Interviews: Prepare answers to common questions using the STAR method.
      • Resources: Pramp for mock interviews, LeetCode for coding practice.
      • Practice: Daily practice sessions, review key concepts, and refine answers.