This project focuses on analyzing the Data Analyst job market using SQL to uncover insights related to salary trends, skill demand, and optimal skill combinations.
The analysis answers practical, career-focused questions using real-world job posting data.
All insights are derived from SQL queries executed on a PostgreSQL database, with results visualized for clearer interpretation.
With the rapid rise of data-driven decision-making, Data Analyst roles have become highly competitive. Job seekers often struggle to understand:
- Which roles offer the highest salaries
- Which skills are truly in demand
- How skill demand correlates with compensation
This project addresses these questions by analyzing job postings, salary information, company data, and required skills using SQL.
- PostgreSQL (SQL)
  - Joins, aggregations, filtering
  - CTEs and subqueries
  - Window functions (`RANK`)
  - Date & time zone conversions
- Python
  - `pandas` for handling query outputs
  - `matplotlib` for generating visualizations
- Git & GitHub
  - Version control
  - Project documentation
📌 Large datasets, generated results, and visualizations are excluded from the repository for cleanliness and efficiency.
📂 Sql_Project_Data_job_Analysis
└── 📁.vscode
└── 🛠️settings.json
└── 📁advanced_sql
├── ⛁ Case_Expressions.sql
├── ⛁ Database_Creation(sql_course).sql
├── ⛁ Date_Functions.sql
├── ⛁ Monthwise_job_tables.sql
├── ⛁ Sample TABLE jobs_applied.sql
├── ⛁ Subqueries_&_CTEs.sql
└── ⛁ Union_Operators.sql
└── 📁csv_files
├── 👻.DS_Store
├── 🧾company_dim.csv
├── 🧾job_postings_fact.csv
├── 🧾skills_dim.csv
└── 🧾skills_job_dim.csv
└── 📁images
├── 🖼️ 1_top_paying_jobs.png
├── 🖼️ 2_top_paying_jobs_skills.png
├── 🖼️ 3_top_demanded_skills.png
├── 🖼️ 4_top_skills_by_salary.png
├── 🖼️ 5_top_optimal_skills.png
└── 🖼️ 6_latest_jobs.png
└── 📁project_files_sql
├── ⛁ 1_top_paying_jobs.sql
├── ⛁ 2_top_paying_job's_skills.sql
├── ⛁ 3_top_demanded_skills.sql
├── ⛁ 4_top_skills_by_salary.sql
├── ⛁ 5_top_optimal_skills.sql
└── ⛁ 6_latest_jobs.sql
└── 📁results_csv
├── 🧾latest_jobs.csv
├── 🧾top_optimal_skills.csv
├── 🧾top_pay_jobs_skills.csv
├── 🧾top_paying_jobs.csv
├── 🧾top_salary_skills.csv
└── 🧾top_skills_DA.csv
└── 📁sql_load
├── ⛁ 1_create_database.sql
├── ⛁ 2_create_tables.sql
└── ⛁ 3_modify_tables.sql
└── 🚫.gitignore
└── 🐍generate_vizualizations.py
└── 📝README.md
📌 Significance:
- `.vscode/`: VS Code + SQLTools configuration (gitignored)
- `advanced_sql/`: SQL learning and experimentation (7 fundamental files)
- `csv_files/`: 129MB of raw CSV files for the jobs dataset, loaded by `sql_load/3_modify_tables.sql` into the empty tables created by `sql_load/2_create_tables.sql` (gitignored; the main file is the 123MB `job_postings_fact.csv`)
- `images/`: Generated PNG visualizations from the analysis (gitignored)
- `project_files_sql/`: The 6 core queries used to analyze the jobs dataset, each written against the problem statement given for it (main deliverables)
- `results_csv/`: Raw SQL query outputs (gitignored)
- `sql_load/`: Production-ready ETL pipeline (database setup + data loading)
- `.gitignore`: Excludes the 129MB+ of data, keeping the GitHub repo at 1.5MB
- `generate_visualizations.py`: Python script that creates the charts (gitignored)
📂 Working in the `project_files_sql/` folder, which contains the 6 core SQL queries used to analyze the jobs dataset. Each query addresses the problem statement given in its `--` comments:
SQL File: 1_top_paying_jobs.sql
Problem Explanation:
This query identifies the top 10 highest-paying Data Analyst jobs that are available remotely and have non-null salary information.
The objective is to evaluate whether high compensation is limited to on-site roles or achievable in remote positions.
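The filter described above can be sketched as follows. This is a minimal illustration, not the contents of `1_top_paying_jobs.sql`: it runs on an in-memory SQLite database instead of PostgreSQL so that it is self-contained, the rows are invented, and the column names (`job_title_short`, `job_location`, `salary_year_avg`, `name`) and the use of `'Anywhere'` to mark remote roles are assumptions about the schema.

```python
import sqlite3

# Toy stand-in for the PostgreSQL database; every row here is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company_dim (company_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE job_postings_fact (
    job_id INTEGER PRIMARY KEY,
    company_id INTEGER,
    job_title_short TEXT,
    job_location TEXT,
    salary_year_avg REAL
);
INSERT INTO company_dim VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO job_postings_fact VALUES
    (1, 1, 'Data Analyst',  'Anywhere',     180000),
    (2, 2, 'Data Analyst',  'Anywhere',     NULL),
    (3, 2, 'Data Analyst',  'New York, NY', 210000),
    (4, 1, 'Data Engineer', 'Anywhere',     250000),
    (5, 2, 'Data Analyst',  'Anywhere',     95000);
""")

top_paying_jobs = conn.execute("""
    SELECT j.job_id, c.name AS company_name, j.salary_year_avg
    FROM job_postings_fact AS j
    LEFT JOIN company_dim AS c ON c.company_id = j.company_id
    WHERE j.job_title_short = 'Data Analyst'
      AND j.job_location = 'Anywhere'      -- assumed marker for remote roles
      AND j.salary_year_avg IS NOT NULL    -- drop postings without a salary
    ORDER BY j.salary_year_avg DESC
    LIMIT 10
""").fetchall()

for row in top_paying_jobs:
    print(row)
```

Only postings 1 and 5 survive the three filters here: posting 2 has no salary, posting 3 is on-site, and posting 4 is a different role.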
Result Analysis:
The results show that several remote Data Analyst roles offer annual salaries of $150K–$200K or more.
This suggests that location independence does not restrict earning potential and that high-paying opportunities exist across a range of companies.
SQL File: 2_top_paying_job's_skills.sql
Problem Explanation:
This analysis determines which skills are most commonly required in the highest-paying Data Analyst roles.
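One plausible shape for this analysis is a CTE that selects the top-paying postings, joined to the skill tables and counted per skill. The sketch below is an assumption about the approach, not the contents of the project file: it uses an in-memory SQLite database in place of PostgreSQL, invented rows, and assumed column names (`job_title_short`, `salary_year_avg`, `skills`).

```python
import sqlite3

# Toy stand-in for the PostgreSQL database; every row here is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_postings_fact (
    job_id INTEGER PRIMARY KEY, job_title_short TEXT, salary_year_avg REAL
);
CREATE TABLE skills_dim (skill_id INTEGER PRIMARY KEY, skills TEXT);
CREATE TABLE skills_job_dim (job_id INTEGER, skill_id INTEGER);
INSERT INTO job_postings_fact VALUES
    (1, 'Data Analyst', 200000),
    (2, 'Data Analyst', 150000),
    (3, 'Data Analyst', 60000);
INSERT INTO skills_dim VALUES (1, 'sql'), (2, 'python'), (3, 'excel');
INSERT INTO skills_job_dim VALUES (1, 1), (1, 2), (2, 1), (3, 3);
""")

skill_counts = conn.execute("""
    WITH top_paying_jobs AS (
        SELECT job_id
        FROM job_postings_fact
        WHERE job_title_short = 'Data Analyst'
          AND salary_year_avg IS NOT NULL
        ORDER BY salary_year_avg DESC
        LIMIT 2   -- small cutoff for the toy data; real data would use a larger one
    )
    SELECT s.skills, COUNT(*) AS job_count
    FROM top_paying_jobs AS t
    INNER JOIN skills_job_dim AS sj ON sj.job_id = t.job_id
    INNER JOIN skills_dim AS s ON s.skill_id = sj.skill_id
    GROUP BY s.skills
    ORDER BY job_count DESC, s.skills
""").fetchall()

for row in skill_counts:
    print(row)
```

The inner joins drop any top-paying posting that lists no skills, so such postings do not appear in the counts.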
Result Analysis:
SQL appears in nearly all top-paying jobs, reinforcing its role as a foundational skill.
Python and Tableau are also highly prevalent, highlighting the importance of programming and data visualization in high-compensation roles.
SQL File: 3_top_demanded_skills.sql
Problem Explanation:
This query ranks skills by the number of Data Analyst job postings that require them, revealing overall market demand.
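The ranking described above amounts to a join across the three skill-related tables followed by a grouped count. The sketch below illustrates that on an in-memory SQLite database rather than PostgreSQL; the rows are invented and the column names (`job_title_short`, `skills`) are assumptions about the schema.

```python
import sqlite3

# Toy stand-in for the PostgreSQL database; every row here is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_postings_fact (job_id INTEGER PRIMARY KEY, job_title_short TEXT);
CREATE TABLE skills_dim (skill_id INTEGER PRIMARY KEY, skills TEXT);
CREATE TABLE skills_job_dim (job_id INTEGER, skill_id INTEGER);
INSERT INTO job_postings_fact VALUES
    (1, 'Data Analyst'), (2, 'Data Analyst'),
    (3, 'Data Analyst'), (4, 'Data Scientist');
INSERT INTO skills_dim VALUES (1, 'sql'), (2, 'excel'), (3, 'python');
INSERT INTO skills_job_dim VALUES
    (1, 1), (1, 2), (2, 1), (2, 3), (3, 1), (3, 2), (4, 3);
""")

demanded_skills = conn.execute("""
    SELECT s.skills, COUNT(sj.job_id) AS demand_count
    FROM job_postings_fact AS j
    INNER JOIN skills_job_dim AS sj ON sj.job_id = j.job_id
    INNER JOIN skills_dim AS s ON s.skill_id = sj.skill_id
    WHERE j.job_title_short = 'Data Analyst'   -- other roles are excluded
    GROUP BY s.skills
    ORDER BY demand_count DESC
    LIMIT 5
""").fetchall()

for row in demanded_skills:
    print(row)
```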
Result Analysis:
SQL and Excel dominate demand, confirming they are baseline requirements for most roles.
Python, Tableau, and Power BI follow closely, emphasizing the importance of analytical and visualization skills.
SQL File: 4_top_skills_by_salary.sql
Problem Explanation:
This analysis identifies skills associated with the highest average salaries, regardless of how frequently they appear in job postings.
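Averaging salary per skill, independent of how often the skill appears, might look like the sketch below. It is an illustration under assumptions, not the project query: SQLite in memory stands in for PostgreSQL, the rows (including the `kafka` skill) are invented, and the column names `salary_year_avg` and `skills` are assumed.

```python
import sqlite3

# Toy stand-in for the PostgreSQL database; every row here is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_postings_fact (
    job_id INTEGER PRIMARY KEY, job_title_short TEXT, salary_year_avg REAL
);
CREATE TABLE skills_dim (skill_id INTEGER PRIMARY KEY, skills TEXT);
CREATE TABLE skills_job_dim (job_id INTEGER, skill_id INTEGER);
INSERT INTO job_postings_fact VALUES
    (1, 'Data Analyst', 100000),
    (2, 'Data Analyst', 140000),
    (3, 'Data Analyst', NULL);
INSERT INTO skills_dim VALUES (1, 'sql'), (2, 'kafka');
INSERT INTO skills_job_dim VALUES (1, 1), (2, 1), (2, 2), (3, 1), (3, 2);
""")

salary_by_skill = conn.execute("""
    SELECT s.skills, ROUND(AVG(j.salary_year_avg), 0) AS avg_salary
    FROM job_postings_fact AS j
    INNER JOIN skills_job_dim AS sj ON sj.job_id = j.job_id
    INNER JOIN skills_dim AS s ON s.skill_id = sj.skill_id
    WHERE j.job_title_short = 'Data Analyst'
      AND j.salary_year_avg IS NOT NULL   -- postings without salary are ignored
    GROUP BY s.skills
    ORDER BY avg_salary DESC
""").fetchall()

for row in salary_by_skill:
    print(row)
```

In this toy data the rarer skill ranks first on a single posting, which is why an average taken without a minimum posting count should be read with care.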
Result Analysis:
Highly specialized and infrastructure-related skills command premium pay.
These skills are less common but significantly increase earning potential, often reflecting scarcity and advanced expertise.
SQL File: 5_top_optimal_skills.sql
Problem Explanation:
This query combines job demand and average salary to identify the most optimal skills to learn.
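One way to combine the two measures is to compute the posting count and the average salary in the same grouped query and require a minimum count before ranking. The sketch below assumes that approach; the threshold of 2, the invented rows, the assumed column names, and the use of in-memory SQLite in place of PostgreSQL are all illustrative choices rather than details taken from `5_top_optimal_skills.sql`.

```python
import sqlite3

# Toy stand-in for the PostgreSQL database; every row here is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_postings_fact (
    job_id INTEGER PRIMARY KEY, job_title_short TEXT, salary_year_avg REAL
);
CREATE TABLE skills_dim (skill_id INTEGER PRIMARY KEY, skills TEXT);
CREATE TABLE skills_job_dim (job_id INTEGER, skill_id INTEGER);
INSERT INTO job_postings_fact VALUES
    (1, 'Data Analyst', 100000), (2, 'Data Analyst', 120000),
    (3, 'Data Analyst', 140000), (4, 'Data Analyst', 80000),
    (5, 'Data Analyst', 200000);
INSERT INTO skills_dim VALUES (1, 'sql'), (2, 'python'), (3, 'excel'), (4, 'go');
INSERT INTO skills_job_dim VALUES
    (1, 1), (1, 3), (2, 1), (2, 2), (3, 2), (4, 3), (5, 4);
""")

optimal_skills = conn.execute("""
    SELECT s.skills,
           COUNT(*) AS demand_count,
           ROUND(AVG(j.salary_year_avg), 0) AS avg_salary
    FROM job_postings_fact AS j
    INNER JOIN skills_job_dim AS sj ON sj.job_id = j.job_id
    INNER JOIN skills_dim AS s ON s.skill_id = sj.skill_id
    WHERE j.job_title_short = 'Data Analyst'
      AND j.salary_year_avg IS NOT NULL
    GROUP BY s.skills
    HAVING COUNT(*) >= 2                  -- hypothetical minimum-demand threshold
    ORDER BY avg_salary DESC, demand_count DESC
""").fetchall()

for row in optimal_skills:
    print(row)
```

The `HAVING` clause removes `go`, which pays the most here but appears in only one posting, so the ranking reflects skills that are both in demand and well paid.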
Result Analysis:
SQL offers the strongest balance between demand and compensation.
Python and Tableau provide strong salary upside, while Excel, though highly demanded, offers less differentiation in pay.
SQL File: 6_latest_jobs.sql
Problem Explanation:
This query retrieves the most recent Data Analyst job postings after February 2023 and converts posting times from UTC to U.S. Eastern Time.
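In PostgreSQL, a conversion like this is typically written with `AT TIME ZONE`. Because SQLite has no time zone support, the self-contained sketch below filters in SQL and converts in Python instead. The rows are invented, the `job_posted_date` column name is an assumption, and `EST` is modeled as a fixed UTC-5 offset that ignores daylight saving.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Fixed UTC-5 offset used as a simplification; it ignores daylight saving.
EST = timezone(timedelta(hours=-5), "EST")

# Toy stand-in for the PostgreSQL database; every row here is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_postings_fact (
    job_id INTEGER PRIMARY KEY, job_title_short TEXT, job_posted_date TEXT
);
INSERT INTO job_postings_fact VALUES
    (1, 'Data Analyst',  '2023-01-15 09:00:00'),
    (2, 'Data Analyst',  '2023-03-01 02:30:00'),
    (3, 'Data Analyst',  '2023-06-10 18:00:00'),
    (4, 'Data Engineer', '2023-07-01 12:00:00');
""")

rows = conn.execute("""
    SELECT job_id, job_posted_date
    FROM job_postings_fact
    WHERE job_title_short = 'Data Analyst'
      AND job_posted_date >= '2023-03-01'   -- postings after February 2023 (UTC)
    ORDER BY job_posted_date DESC
""").fetchall()

latest_jobs = []
for job_id, posted_utc in rows:
    # Stored timestamps are treated as UTC, then shifted to the fixed offset.
    utc_dt = datetime.strptime(posted_utc, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    latest_jobs.append((job_id, utc_dt.astimezone(EST).strftime("%Y-%m-%d %H:%M")))

for row in latest_jobs:
    print(row)
```

Posting 2 shows why the order of filtering and conversion matters: it falls on March 1 in UTC but on February 28 after conversion, so filtering on the converted value would exclude it.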
Result Analysis:
The latest postings show continued hiring activity across regions and remote roles.
Salary ranges vary widely, reflecting differences in role scope, seniority, and company expectations.
- How to write real-world analytical SQL queries
- Practical use of CTEs, subqueries, and window functions
- How to translate raw SQL results into actionable insights
- That demand and salary do not always correlate directly
- How to document a complete data analysis project professionally
This project demonstrates that:
- SQL is the most critical skill for Data Analysts
- Combining SQL with Python/R and visualization tools maximizes career potential
- Specialized skills unlock higher salary tiers
- Remote roles can offer compensation comparable to on-site positions
Overall, this project shows how SQL-driven analysis can provide meaningful insights into the data job market and support informed, data-driven career decisions.





