Where is this marketplace losing money, time, and customers? That is the central question behind this project.
Using 100K real orders from Olist, a Brazilian e-commerce marketplace, this analysis covers the full operational picture across 9 relational tables: revenue by category, delivery performance by state and seller, a composite seller scorecard, and the relationship between late deliveries and customer review scores.
Every insight came from SQL. No Python, no shortcuts. Queries ranged from basic aggregations to multi-table joins, window functions, CTEs, and subqueries. The results were visualized in a Tableau dashboard themed around the Brazilian flag colors.
The Olist dataset was downloaded from Kaggle as 9 CSV files and imported into a local SQLite database using the command line. All analysis was written in SQL using VSCode with the SQLTools and SQLite extensions.
The project was structured across 6 phases: data setup and exploration, revenue and category analysis, delivery performance, seller scorecard, review score investigation, and Tableau dashboard build. Each phase produced exported CSV results used to build the final dashboard.
The seller scorecard used two chained CTEs and three window functions to rank sellers across review score, delivery performance, and order volume simultaneously, then combined them into a composite score. Sellers were flagged as High Volume Low Quality or Hidden Gem based on their metric combinations.
The review score investigation used subqueries and CASE WHEN bucketing to find the exact delivery delay threshold where customer satisfaction drops sharply, and then cross-referenced that finding against product categories to separate delivery problems from product quality problems.
All findings were visualized in a Tableau Public dashboard using Brazilian flag colors.
The 4-day late threshold is the most actionable finding. An automated alert system that flags orders approaching that threshold would give Olist a window to intervene before a bad review is submitted.
High Volume Low Quality sellers represent a platform risk. They drive revenue but erode trust. A seller review process triggered by sustained low scores would protect the platform without cutting volume unnecessarily.
The categories with poor review scores and low late delivery rates need a different fix entirely. Office Furniture and Fashion Male Clothing are product quality problems, not logistics problems. Stricter listing standards or seller onboarding requirements in those categories would address the root cause.
A natural next step would be rebuilding this analysis in a cloud SQL environment like BigQuery, automating the data pipeline, and connecting Tableau directly to a live data source rather than static CSV exports.