Our Work

Projects That Run in Production

Every project listed here was delivered to a real client, runs on real infrastructure, and handles real data. No concept projects. No internal experiments.

CASE / 01

Automotive Data Platform for Swedish E-commerce

automotive_etl.py
Supplier Feeds → Python ETL → PostgreSQL → TecDoc Matching → PrestaShop Storefront
5M+ Parts Records | 149 Brands | Real-Time Sync
CLIENT TYPE
Swedish Automotive E-commerce Platform
INDUSTRY
Automotive / E-commerce / Retail
SERVICES
Data Engineering, Web Scraping, ETL Pipeline
STACK
Python, Apache Airflow, PostgreSQL, Scrapy, PrestaShop, TecDoc, Docker

THE CHALLENGE

A Swedish automotive parts retailer needed to integrate over 5 million parts records from TecDoc and multiple supplier feeds into their PrestaShop storefront. Manual catalog management was causing product gaps, pricing errors, and inventory mismatches that were costing them sales. Existing processes involved overnight manual batch uploads that frequently failed without notification.

WHAT WE BUILT

  • A full TecDoc integration pipeline extracting parts data, vehicle compatibility mappings, and brand hierarchies across 149 automotive brands
  • Automated supplier feed ingestion from multiple European parts distributors with schema normalization and conflict resolution
  • A data enrichment layer matching TecDoc article numbers to supplier stock and pricing data
  • A custom PrestaShop PHP plugin for real-time catalog sync from the PostgreSQL warehouse
  • Apache Airflow DAGs for scheduled pipeline execution with retry logic, failure alerting, and full task logging
  • Data quality checks at every transformation stage catching null anomalies and schema drift before records reach the storefront
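
The data quality checks described above can be sketched roughly as follows. This is a minimal illustration, not the client's code: the schema in EXPECTED_SCHEMA, the field names, and the 5 percent null-rate threshold are all hypothetical.

```python
from typing import Any

# Hypothetical expected schema for a normalized parts record (illustrative only).
EXPECTED_SCHEMA: dict[str, type] = {
    "article_number": str,
    "brand": str,
    "price": float,
    "stock": int,
}


def check_schema_drift(record: dict[str, Any]) -> list[str]:
    """Return schema problems for one record: missing, unexpected, or mistyped fields."""
    problems = []
    for field in sorted(EXPECTED_SCHEMA.keys() - record.keys()):
        problems.append(f"missing field: {field}")
    for field in sorted(record.keys() - EXPECTED_SCHEMA.keys()):
        problems.append(f"unexpected field: {field}")
    for field, expected_type in EXPECTED_SCHEMA.items():
        value = record.get(field)
        # None is handled by the null-rate check, so only typed values are compared here.
        if value is not None and not isinstance(value, expected_type):
            problems.append(f"wrong type for {field}: {type(value).__name__}")
    return problems


def null_rate(records: list[dict[str, Any]], field: str) -> float:
    """Fraction of records where the field is missing or None."""
    if not records:
        return 0.0
    return sum(1 for r in records if r.get(field) is None) / len(records)


def validate_batch(records: list[dict[str, Any]], max_null_rate: float = 0.05) -> list[str]:
    """Collect every issue in a batch; an empty list means the batch may proceed."""
    issues = []
    for index, record in enumerate(records):
        for problem in check_schema_drift(record):
            issues.append(f"record {index}: {problem}")
    for field in EXPECTED_SCHEMA:
        rate = null_rate(records, field)
        if rate > max_null_rate:
            issues.append(f"null rate for {field} is {rate:.0%}, above {max_null_rate:.0%}")
    return issues
```

In a setup like this, a transformation stage would call validate_batch on its output and stop the run, or raise an alert, whenever the returned list is non-empty, so that bad records never reach the storefront.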

THE OUTCOME

  • 5 million+ parts records fully integrated and searchable by vehicle registration number
  • 149 brands covered with complete vehicle compatibility data
  • Catalog updates moved from an overnight manual batch to automated sub-hourly runs
  • Zero silent failures: every pipeline error surfaces immediately via Slack alerting
  • Client team operates the system independently with runbooks and observability dashboards

CASE / 02

Medallion Architecture Pipeline for Healthcare Claims Data

medallion_dag.py
Bronze Layer: Raw Ingestion and Schema Validation → Silver Layer: Transformation and Deduplication → Gold Layer: Analytical Tables and Reporting
3-Layer Architecture | Full Observability | Apache Airflow Orchestrated
CLIENT TYPE
Healthcare Data Platform
INDUSTRY
Healthcare / Data Engineering
SERVICES
Data Engineering, Data Architecture, ETL Pipeline
STACK
Python, Apache Airflow, PostgreSQL, dbt, Docker

THE CHALLENGE

A healthcare data platform needed to process large volumes of CMS Medicare claims data across inpatient, outpatient, carrier, and prescription drug domains. The existing pipeline was a collection of unorchestrated scripts with no data quality enforcement, no observability, and no clear separation between raw and analytical data. Data engineers spent significant time debugging failures that were discovered only when downstream reports broke.

WHAT WE BUILT

  • A three-layer medallion architecture with clear separation between raw ingestion, transformation, and analytical output
  • Bronze layer: raw file ingestion with schema validation, audit logging, and idempotent load patterns
  • Silver layer: standardized transformations, deduplication, business rule enforcement, and data type normalization
  • Gold layer: clean, aggregated analytical tables structured for direct consumption by reporting and AI systems
  • Apache Airflow DAGs with task-level logging, SLA monitoring, and failure alerting across all three layers
  • dbt models for declarative, version-controlled transformations with automated testing
  • Full data lineage tracking showing the path from raw source file to analytical table for every record
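
One way to picture the idempotent load pattern mentioned for the bronze layer is the sketch below. It is illustrative only: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the table name bronze_claims, its columns, and the choice of (source_file, line_number) as the natural key are assumptions rather than details of the delivered pipeline.

```python
import sqlite3


def create_bronze_table(conn: sqlite3.Connection) -> None:
    """Create a hypothetical bronze table keyed by source file and line number."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS bronze_claims (
            source_file TEXT    NOT NULL,
            line_number INTEGER NOT NULL,
            raw_line    TEXT    NOT NULL,
            PRIMARY KEY (source_file, line_number)
        )
        """
    )


def count_rows(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM bronze_claims").fetchone()[0]


def load_file(conn: sqlite3.Connection, source_file: str, lines: list[str]) -> int:
    """Load raw lines and return how many rows were newly inserted.

    Re-running the load for the same file inserts nothing, because rows that
    already exist under the same key are skipped instead of duplicated.
    """
    before = count_rows(conn)
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT INTO bronze_claims (source_file, line_number, raw_line) "
            "VALUES (?, ?, ?) "
            "ON CONFLICT (source_file, line_number) DO NOTHING",
            [(source_file, number, line) for number, line in enumerate(lines, start=1)],
        )
    return count_rows(conn) - before
```

Because a retried task simply finds its rows already present, an orchestrator can re-run a failed load without manual cleanup. Content-level deduplication would still belong to the silver layer in a design like this.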

THE OUTCOME

  • Zero undetected pipeline failures since deployment: every error surfaces with context before downstream impact
  • Time the data engineering team spends on debugging reduced by over 60 percent
  • New claim types added to the pipeline without restructuring existing DAGs
  • Complete data lineage from raw source to analytical layer for governance and audit requirements

CASE / 04

AI Automation Pipeline for Business Operations

automation_flow.json
Trigger → Data Fetch → LLM Processing → Decision Logic → Database Write → Slack Alert
80% Manual Work Eliminated | Self-Monitoring | Multi-Channel Output
CLIENT TYPE
Business Operations Platform
INDUSTRY
Operations / AI Automation
SERVICES
AI Automation, Agent Development, Workflow Engineering
STACK
n8n, Python, LangChain, OpenAI API, PostgreSQL, Slack API, Docker

THE CHALLENGE

A business operations team was spending multiple hours per day on a multi-step data handling process that involved fetching data from external sources, applying classification logic, writing results to a database, and notifying relevant stakeholders. The process was manual, error-prone, and delayed decisions because the data was always hours behind.

WHAT WE BUILT

  • An end-to-end n8n workflow replacing the entire manual process with a fully autonomous pipeline
  • Custom Python function nodes for data transformation and business logic that exceeded n8n's native capabilities
  • LangChain integration for LLM-powered classification and decision steps within the workflow
  • Webhook and scheduled polling triggers replacing manual process initiation
  • PostgreSQL write-back with structured output and full audit trail of every automated decision
  • Slack alerting for both successful completions and failures with full context for rapid resolution
  • Error handling, retry logic, and circuit breaker patterns preventing cascade failures
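
The retry and circuit breaker patterns listed above could look something like the following in a Python function node. This is a generic sketch of the technique under stated assumptions, not the workflow's actual code: the class and function names, the thresholds, and the injectable clock and sleep parameters are all hypothetical.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Stop calling a failing dependency after repeated consecutive failures."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; skipping call")
            # Timeout elapsed: let one trial call through (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def call_with_retry(breaker, func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry with exponential backoff, but give up at once if the circuit is open."""
    for attempt in range(1, attempts + 1):
        try:
            return breaker.call(func)
        except CircuitOpenError:
            raise  # retrying would only add load to a dependency that is already down
        except Exception:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

The design intent is that retries absorb short, transient errors, while the breaker keeps a prolonged outage in one step from turning into a flood of calls that takes down the steps after it.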

THE OUTCOME

  • Manual processing time reduced by over 80 percent
  • Data freshness improved from hours to minutes
  • Zero missed processing cycles since deployment
  • The same automation pattern is now reused across three additional workflows for the same client
RESPONSE TIME < 24H

Ready to build something that actually works in production?

Tell us about your data challenge. We will respond within 24 hours with a clear assessment and a practical plan.

Start a Conversation