// Data Analysis · SQL · March 2026

CART TO COMPLAINT: AN OPERATIONAL SQL ANALYSIS OF BRAZILIAN E-COMMERCE

Date: March 2026
Type: Data Analysis · SQL · Visualization
Dataset: Olist Brazilian E-Commerce, 9 tables, ~100K orders
Overview / Goal

Where is this marketplace losing money, time, and customers? That is the central question behind this project.

Using 100K real orders from Olist, a Brazilian e-commerce marketplace, this analysis covers the full operational picture across 9 relational tables: revenue by category, delivery performance by state and seller, a composite seller scorecard, and the relationship between late deliveries and customer review scores.

Every insight came from SQL. No Python, no shortcuts. Queries ranged from basic aggregations to multi-table joins, window functions, CTEs, and subqueries. The results were visualized in a Tableau dashboard themed around the Brazilian flag colors.

Key Findings
  • Health and Beauty leads revenue at $1.26M, but Watches and Gifts has the highest average order value at $201. High revenue does not always mean high value per transaction.
  • Revenue grew more than 8x in one year, from $111K in January 2017 to $924K in January 2018. A Black Friday spike to $987K in November 2017 confirms seasonal demand is significant.
  • 91.7% of orders arrive before the estimated delivery date. Olist deliberately pads estimates to manage customer expectations.
  • 4 days late is the critical threshold. Crossing it drops average review scores from 4.29 to below 2.5, a drop of more than one full star.
  • 37% of all 1-star reviews are linked to late delivery. But 63% are not, meaning product quality and unmet expectations matter just as much.
  • High volume sellers do not dominate the composite scorecard. One seller with nearly 1,000 orders scores below 3.5 on reviews, quietly damaging the platform at scale.
  • The worst rated categories have low late delivery rates. Office Furniture, Fashion Male Clothing, and Fixed Telephony score poorly because of product quality issues, not logistics problems.
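The revenue-versus-AOV contrast in the first finding comes from a straightforward join and aggregation. A minimal sketch, assuming the imported tables keep the standard Olist column names (`order_items.price`, `products.product_category_name`):

```sql
-- Revenue and average order value per category (sketch; assumes Olist
-- column names survived the CSV import unchanged).
SELECT
    p.product_category_name                                AS category,
    ROUND(SUM(oi.price), 2)                                AS total_revenue,
    ROUND(SUM(oi.price) / COUNT(DISTINCT oi.order_id), 2)  AS avg_order_value
FROM order_items oi
JOIN products p ON p.product_id = oi.product_id
GROUP BY p.product_category_name
ORDER BY total_revenue DESC;
```

Sorting the same result by `avg_order_value` instead is what surfaces categories like Watches and Gifts, which rank modestly on revenue but high on value per transaction.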
Methods / Process

The Olist dataset was downloaded from Kaggle as 9 CSV files and imported into a local SQLite database using the command line. All analysis was written in SQL in VS Code with the SQLTools and SQLite extensions.
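The command-line import relies on the sqlite3 shell's `.mode csv` and `.import` commands, which create the table from the CSV header row when it does not already exist. A minimal sketch using a tiny stand-in file (the real project imports the nine `olist_*_dataset.csv` files the same way):

```shell
# Create a tiny demo CSV standing in for one of the Olist exports.
printf 'order_id,order_status\n1,delivered\n2,shipped\n' > demo_orders.csv

# Import it; .import creates the table from the header row if needed.
sqlite3 demo.db <<'EOF'
.mode csv
.import demo_orders.csv orders
EOF

# Confirm the rows landed.
sqlite3 demo.db 'SELECT COUNT(*) FROM orders;'
```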

The project was structured across 6 phases: data setup and exploration, revenue and category analysis, delivery performance, seller scorecard, review score investigation, and Tableau dashboard build. Each phase produced exported CSV results used to build the final dashboard.

The seller scorecard used two chained CTEs and three window functions to rank sellers across review score, delivery performance, and order volume simultaneously, then combined them into a composite score. Sellers were flagged as High Volume Low Quality or Hidden Gem based on their metric combinations.
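The chained-CTE-plus-window-function shape described above can be sketched as follows. This is an illustrative reconstruction, not the project's exact query: the rank weighting, the volume cutoff of 50, and the 3.5/4.5 review thresholds are hypothetical, and the Olist-style table and column names are assumed.

```sql
WITH seller_metrics AS (
    -- First CTE: one row per seller with raw metrics.
    SELECT
        oi.seller_id,
        COUNT(DISTINCT oi.order_id)                      AS order_count,
        AVG(r.review_score)                              AS avg_review,
        AVG(julianday(o.order_delivered_customer_date)
          - julianday(o.order_estimated_delivery_date))  AS avg_days_vs_estimate
    FROM order_items oi
    JOIN orders o             ON o.order_id = oi.order_id
    LEFT JOIN order_reviews r ON r.order_id = oi.order_id
    WHERE o.order_status = 'delivered'
    GROUP BY oi.seller_id
),
ranked AS (
    -- Second CTE: three window functions rank each dimension.
    SELECT
        seller_id, order_count, avg_review,
        RANK() OVER (ORDER BY avg_review DESC)           AS review_rank,
        RANK() OVER (ORDER BY avg_days_vs_estimate ASC)  AS delivery_rank,
        RANK() OVER (ORDER BY order_count DESC)          AS volume_rank
    FROM seller_metrics
)
SELECT
    seller_id,
    review_rank + delivery_rank + volume_rank AS composite_score,
    CASE  -- hypothetical flagging thresholds
        WHEN volume_rank <= 50 AND avg_review < 3.5  THEN 'High Volume, Low Quality'
        WHEN volume_rank > 50  AND avg_review >= 4.5 THEN 'Hidden Gem'
        ELSE 'Standard'
    END AS seller_flag
FROM ranked
ORDER BY composite_score;
```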

The review score investigation used subqueries and CASE WHEN bucketing to find the exact delivery delay threshold where customer satisfaction drops sharply, and then cross-referenced that finding against product categories to separate delivery problems from product quality problems.
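The bucketing logic can be sketched like this, again assuming Olist-style table names; the bucket boundaries shown are illustrative, with the 4-day break matching the threshold reported in the findings.

```sql
-- Average review score by delivery-delay bucket (sketch).
SELECT
    CASE
        WHEN delay_days <= 0 THEN 'on time or early'
        WHEN delay_days <= 3 THEN '1-3 days late'
        WHEN delay_days <= 7 THEN '4-7 days late'
        ELSE '8+ days late'
    END                            AS delay_bucket,
    ROUND(AVG(r.review_score), 2)  AS avg_review,
    COUNT(*)                       AS orders
FROM (
    -- Subquery: days delivered past the estimate (negative = early).
    SELECT
        o.order_id,
        julianday(o.order_delivered_customer_date)
      - julianday(o.order_estimated_delivery_date) AS delay_days
    FROM orders o
    WHERE o.order_status = 'delivered'
      AND o.order_delivered_customer_date IS NOT NULL
) d
JOIN order_reviews r ON r.order_id = d.order_id
GROUP BY delay_bucket
ORDER BY MIN(delay_days);
```

Narrowing the buckets to single days around the break point is how the exact 4-day threshold was isolated.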

All findings were visualized in a Tableau Public dashboard using Brazilian flag colors.

Recommendations / Next Steps

The 4-day late threshold is the most actionable finding. An automated alert system that flags orders approaching that threshold would give Olist a window to intervene before a bad review is submitted.
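One way such an alert could be expressed in the same SQL environment is a daily query over undelivered orders. This is a hypothetical sketch, not part of the original analysis; the two-day warning window is an assumed parameter.

```sql
-- Hypothetical alert feed: undelivered orders within two days of, or
-- already past, their estimated delivery date.
SELECT
    o.order_id,
    o.order_estimated_delivery_date,
    ROUND(julianday('now')
        - julianday(o.order_estimated_delivery_date), 1) AS days_past_estimate
FROM orders o
WHERE o.order_status NOT IN ('delivered', 'canceled')
  AND julianday('now') >= julianday(o.order_estimated_delivery_date) - 2
ORDER BY days_past_estimate DESC;
```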

High Volume Low Quality sellers represent a platform risk. They drive revenue but erode trust. A seller review process triggered by sustained low scores would protect the platform without cutting volume unnecessarily.

The categories with poor review scores and low late delivery rates need a different fix entirely. Office Furniture and Fashion Male Clothing are product quality problems, not logistics problems. Stricter listing standards or seller onboarding requirements in those categories would address the root cause.

A natural next step would be rebuilding this analysis in a cloud SQL environment like BigQuery, automating the data pipeline, and connecting Tableau directly to a live data source rather than static CSV exports.
