Webskyne

21 May 2026 · 11 min read

From Zero to $2M ARR: How a Bootstrapped Fintech Startup Built a Scalable Payment Platform on Flutter and NestJS

When PayStream approached us in mid-2024, they were processing only $50,000 in monthly transactions through a brittle no-code prototype built on Firebase and Bubble.io — and dreaming of $2 million ARR. Eighteen months, two major architecture decisions, a full Firebase-to-NestJS migration, and one PCI DSS compliance sprint later, they had a production-grade payment infrastructure handling $420,000 in monthly transaction volume across 12,000 active users — with 99.98% platform uptime and a dispute rate that fell from 3.1 percent to 0.34 percent. This case study breaks down every critical decision point, from why Flutter won over native mobile and whether NestJS truly beats Express or .NET for a money-handling backend, to how the team structured their AWS deployment so regulators in three countries signed off in under a year. It also covers the costly mistakes — undetected idempotency gaps, underbudgeted KYC localization, and accidentally expensive Lambda retry loops — and the specific metrics that finally convinced founders their architecture investment was worth every dollar.

Case Study · fintech, flutter, nestjs, aws, microservices, payments, pci-compliance, event-sourcing
## Overview

In June 2024, PayStream, a peer-to-peer payment platform focused on gig workers and micro-merchants across Indonesia and Southeast Asia, was sitting on a $2.2M seed round but had no product team, no production-grade codebase, and four months of runway left to prove product-market fit. Their launch prototype, cobbled together on Firebase and Bubble.io, had crashed six times during the previous quarter's promotional campaign and cost them more than $80,000 in failed transaction disputes. Their chief technology officer, Maya Sintani, described the situation plainly: "We had investors asking for scale metrics while our infrastructure couldn't survive Black Friday traffic on a good day."

What made PayStream's situation particularly interesting, and the reason this case study is worth reading if you are leading a fintech startup, was the clash between the two options every founder eventually faces: keep moving fast with the tools that got you here, or slow down and build the infrastructure that can take you there.

Over the next 18 months, working alongside their engineering team, we helped them replace the Firebase monolith with a Flutter mobile client and a NestJS microservice backend deployed on AWS, implemented real-time transaction monitoring, and navigated the regulatory compliance requirements that come with managing payment licenses across three countries. By March 2026, PayStream was processing $420,000 in monthly transaction volume, serving 12,000 active users, and had just closed a $5M Series A at a $38M valuation.

This study traces every critical decision point: why we chose Flutter over native mobile, why NestJS won over Express and .NET, how we structured the migration to keep the business running, the architecture that finally stopped the "it's working in staging" paradox, and the specific numbers (revenue, cost, uptime, team velocity) that proved the investment worth it.
---

## Challenge

### Product Risk and Technical Debt

PayStream's launch prototype, built on Firebase (Realtime Database, Firestore, Cloud Functions) and supplemented by Bubble.io for the merchant dashboard, was never designed for regulation-grade payment flows. At 3,000 concurrent users, a modest volume for a startup with their ambitions, the system exhibited consistent failure modes:

- **Firebase Realtime Database hot paths:** Financial transaction records and user session data lived in the same Firebase database. Peak-hour reads saturated the connection pool, pushing latency from 300ms to over 4 seconds and triggering cascading write failures across the entire app.
- **No audit trail or reconciliation:** Because Firebase is a real-time database without robust transaction logging, the team had no way to verify whether a money transfer had been committed. When disputes arose, resolving them required manually reading through Firebase logs. Four out of every seven disputes had required outside legal review.
- **Monolithic Cloud Functions:** The entire backend logic (user registration, KYC verification, payment orchestration, notifications, fraud detection) lived in a single Firebase Cloud Function with over 12,000 lines of JavaScript. Testing it required deploying to Firebase. Rollbacks took 45 minutes.
- **Bubble.io vendor lock-in:** The merchant dashboard was built in Bubble.io, which meant the team could neither customize the code nor integrate it with the mobile app in a way that passed the API contract requirements of their payment processor partners. Any API change required rebuilding the entire Bubble.io workflow, a multi-day effort.
- **Single-region Firebase hosting:** All users, including merchants in Indonesia, Vietnam, and Singapore, connected to Firebase hosted in Singapore. Latency measurements from Jakarta averaged 620ms, well above the 300ms threshold set by their payment processors.
### Regulatory and Compliance Pressure

By late 2024, PayStream was in active discussions with regulators in Indonesia (OJK), Singapore (MAS), and Vietnam (SBV) about obtaining payment service licenses. Each regulator required comprehensive operational controls: transaction immutability, PCI DSS compliance, audit trails spanning seven years, geo-fenced data residency, and real-time fraud detection. Firebase, while fast to prototype on, offered no native support for any of those requirements.

Payment processors (Stripe, Xendit, DANA) also mandated PCI audit documentation. On the Firebase stack, the team could not produce it, which created a hard blocker in partner negotiations.

### Business Impact

![PayStream fintech dashboard, dashboard with analytics charts](https://images.unsplash.com/photo-1551288049-bebda4e38f71?auto=format&fit=crop&w=1200&q=80)

The technical debt was not abstract. These were the downstream business consequences:

- **$18,000 in monthly payment processor fines:** Missed settlement windows and failed transaction feeds led to fees from Stripe and Xendit every single month of Q4 2024.
- **89% support ticket escalation rate:** When the mobile app crashed at payment confirmation, the support team had to reconstruct transactions manually from Firebase logs. The team of three support engineers was spending 18 hours per day on escalations instead of building customer relationships.
- **Hiring difficulty for senior engineers:** Every candidate who interviewed described the stack as a liability and requested significant compensation to come on board.
- **Stalled product roadmap:** The engineering team had delivered only one product feature every eight weeks throughout 2024, against a stated roadmap of one feature every three weeks.

---

## Goals

### Primary Technical Goals

1. **Achieve 99.99% payment system uptime:** Less than 4.38 minutes of downtime per month, roughly a 30-fold cut in downtime from the 99.7% uptime recorded during Q4 2024.
2. **Process 25,000 concurrent transactions per second:** Set with a 14% margin above their projected Q4 2025 peak of 22,000 TPS.
3. **Meet PCI DSS Level 2 compliance within 12 months:** Every technical and operational requirement for payment processor audits.
4. **Reduce infrastructure cost per transaction to below $0.08:** Down from $0.17 in Q4 2024, a 53% reduction target.

### Non-Technical Business Goals

- **Board readiness:** A technical roadmap by January 2025 that inspired investor confidence.
- **One-click internationalization:** Arabic, Bahasa Indonesia, and Vietnamese language support accessible by product teams without engineering involvement.
- **Developer retention:** Address the reasons senior engineers were considering offers elsewhere.

---

## Approach

### Architecture: Event-Driven Microservices with CQRS

We rejected two obvious paths: staying on Firebase and iterating, which would have delivered 99.9% uptime at best, or a knee-jerk full rewrite, estimated at 24+ months of burn with no revenue during the process. Instead, we chose an incremental strangler fig migration with an event-driven backbone, replacing one subsystem at a time while keeping Firebase serving production traffic throughout.

The adoption of **Command Query Responsibility Segregation (CQRS)** was another deliberate choice. Financial transactions demand write-side transactionality; the same data, read for dashboards and reports, demands a different access pattern. Separating the write and read models allowed us to use PostgreSQL for writes (transactional guarantees) and DynamoDB for reads (single-digit millisecond queries), a classic CQRS trade-off that saved approximately $3,200 monthly in database costs.
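The write/read split described above can be illustrated with a minimal in-memory sketch. It is not PayStream's code: `TransferWriteModel`, `BalanceReadModel`, and the field names are hypothetical, and plain arrays and maps stand in for the PostgreSQL write model and the DynamoDB read model.

```typescript
// Hypothetical command payload; field names are assumptions for illustration.
interface Transfer {
  id: string;
  fromUser: string;
  toUser: string;
  amountCents: number;
}

// Denormalized per-user view served to dashboards and reports.
interface BalanceView {
  userId: string;
  netCents: number; // received minus sent
  transferCount: number;
}

// Write side: validates and commits commands (stand-in for PostgreSQL).
class TransferWriteModel {
  private readonly transfers: Transfer[] = [];
  private readonly subscribers: Array<(t: Transfer) => void> = [];

  onCommitted(handler: (t: Transfer) => void): void {
    this.subscribers.push(handler);
  }

  // Command handler: validate, commit, then notify projections.
  recordTransfer(t: Transfer): void {
    if (!Number.isInteger(t.amountCents) || t.amountCents <= 0) {
      throw new Error("amountCents must be a positive integer");
    }
    if (t.fromUser === t.toUser) {
      throw new Error("sender and recipient must differ");
    }
    this.transfers.push({ ...t });
    for (const notify of this.subscribers) notify(t);
  }

  get committedCount(): number {
    return this.transfers.length;
  }
}

// Read side: answers queries from its own store (stand-in for DynamoDB).
class BalanceReadModel {
  private readonly views = new Map<string, BalanceView>();

  // Projection: fold each committed transfer into the per-user views.
  apply(t: Transfer): void {
    this.bump(t.fromUser, -t.amountCents);
    this.bump(t.toUser, t.amountCents);
  }

  // Query handler: never touches the write model.
  getBalance(userId: string): BalanceView {
    return this.views.get(userId) ?? { userId, netCents: 0, transferCount: 0 };
  }

  private bump(userId: string, deltaCents: number): void {
    const current = this.getBalance(userId);
    this.views.set(userId, {
      userId,
      netCents: current.netCents + deltaCents,
      transferCount: current.transferCount + 1,
    });
  }
}

const writeModel = new TransferWriteModel();
const readModel = new BalanceReadModel();
writeModel.onCommitted((t) => readModel.apply(t));

writeModel.recordTransfer({ id: "t1", fromUser: "alice", toUser: "bob", amountCents: 2500 });
writeModel.recordTransfer({ id: "t2", fromUser: "bob", toUser: "carol", amountCents: 1000 });
```

In a deployed system the projection would typically run asynchronously off an event bus, so the read model is eventually consistent with the write model; the synchronous callback here only keeps the sketch short.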
### Technology Decisions

| Layer | Technology | Rationale |
|-------|-----------|-----------|
| **Mobile Client** | Flutter (Dart) | Single codebase, 60fps payment UI, deterministic rendering across OEM skins |
| **Backend** | NestJS (TypeScript) | Type safety for financial logic, built-in DI, proven microservice module |
| **API Gateway** | AWS API Gateway + AppSync | Rate limiting, auth integration, managed scaling, GraphQL optionality |
| **Transaction Database** | PostgreSQL (RDS/Aurora) | ACID guarantees, JSONB document flexibility, full-text search |
| **Cache/Read Layer** | DynamoDB + ElastiCache Redis | Single-digit ms reads, DynamoDB TTL-driven session expiry |
| **Event Backbone** | EventBridge + SQS | Decoupled service communication, audit trail, replay capability |
| **Observability** | Datadog + X-Ray + CloudWatch | Full request tracing, business KPI monitoring, synthetic front-door monitors |
| **Infra as Code** | AWS CDK (TypeScript) | Version-controlled infrastructure, CDK Pipelines for staged deploys |
| **Queue Processing** | BullMQ (Redis) | Reliable job queue for settlement reconciliations and fraud scoring |

### Migration Strategy: The Strangler Fig in Four Phases

| Phase | Duration | Focus | Go Criteria |
|-------|----------|-------|-------------|
| **Phase 0: Compliance Foundation** | Weeks 1–4 | PCI scope definition, data residency, audit logging | PCI audit signed off by external QSA |
| **Phase 1: Core Payment Engine** | Weeks 5–16 | Stripe/Xendit integration, transaction write path | 10,000 TPS benchmarked, zero disputes |
| **Phase 2: KYC and User Onboarding** | Weeks 17–24 | Identity verification service, liveness check | KYC approval from OJK and MAS |
| **Phase 3: Flutter App Launch** | Weeks 25–36 | Mobile client, push notifications, fraud detection | App Store + Play Store approved |
| **Phase 4: Decommission Firebase** | Weeks 37–52 | Cutover plan, monolith retirement | 100% traffic on new infra, zero incidents for 30 days |

---

## Implementation

### Phase 0: Compliance Before Code

Before writing a single line of migration code, we spent four weeks on what mattered most in a payment business: compliance architecture. The external Qualified Security Assessor (QSA) defined PCI scope before any infrastructure was provisioned. We used the AWS Well-Architected Framework to model cardholder data processing, identifying which AWS accounts would contain sensitive resources and establishing cross-account IAM boundaries using AWS Organizations.

The single most consequential early implementation was payment token vaulting using AWS KMS with envelope encryption; raw card numbers never crossed the application boundary, and we could prove it to auditors with automated compliance reports generated via AWS Artifact.

### Phase 1: The Payment Engine (Weeks 5–16)

The core payment engine was the highest-risk and highest-reward subsystem to migrate. We designed it around five principles: idempotency, immutability, auditability, observability by design, and graceful degradation.

The write path was built as an event-sourced system. Every payment initiation produced an immutable `PaymentInitiated` event written to EventBridge. The downstream payment processors consumed these events and produced `PaymentProcessed` events. A PostgreSQL aggregate derived the ledger balance from the event stream, giving us a complete, append-only, cryptographically linked audit chain with zero mutable ledger rows. This meant any engineer could reconstruct the payment history of any user by replaying the event stream, which proved invaluable when regulators requested transaction evidence.

To avoid data residency failures, we deployed the PostgreSQL cluster in AWS `ap-southeast-1` and encrypted data at rest using customer-managed KMS keys.
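As a rough illustration of the idempotency and replay properties described for the write path, here is a minimal in-memory sketch. The event names come from the text above; the `PaymentEventLog` class, the idempotency-key format, and the event fields are assumptions made for the example, and an array stands in for the real event store.

```typescript
// Event names follow the article; the fields are hypothetical.
type PaymentEvent =
  | { type: "PaymentInitiated"; paymentId: string; userId: string; amountCents: number }
  | { type: "PaymentProcessed"; paymentId: string; userId: string; amountCents: number };

class PaymentEventLog {
  private readonly events: PaymentEvent[] = [];
  private readonly seenKeys = new Set<string>();

  // Idempotent append: a retried delivery carrying a key that was already
  // recorded is ignored, so a retry can never double-count a payment.
  append(idempotencyKey: string, event: PaymentEvent): boolean {
    if (this.seenKeys.has(idempotencyKey)) return false;
    this.seenKeys.add(idempotencyKey);
    this.events.push(Object.freeze({ ...event }));
    return true;
  }

  // Events are exposed read-only; nothing is ever updated or deleted.
  all(): readonly PaymentEvent[] {
    return this.events;
  }
}

// Replay: the settled total is derived by folding over the event stream
// rather than read from a mutable balance row.
function settledTotalCents(events: readonly PaymentEvent[], userId: string): number {
  let total = 0;
  for (const e of events) {
    if (e.type === "PaymentProcessed" && e.userId === userId) {
      total += e.amountCents;
    }
  }
  return total;
}

const log = new PaymentEventLog();
log.append("pay-1:initiated", {
  type: "PaymentInitiated", paymentId: "pay-1", userId: "u1", amountCents: 5000,
});
log.append("pay-1:processed", {
  type: "PaymentProcessed", paymentId: "pay-1", userId: "u1", amountCents: 5000,
});

// A processor retry redelivers the same event under the same key.
const retryAccepted = log.append("pay-1:processed", {
  type: "PaymentProcessed", paymentId: "pay-1", userId: "u1", amountCents: 5000,
});

const settled = settledTotalCents(log.all(), "u1");
```

With a durable store, the same check would usually be enforced by a unique constraint on the idempotency key, so that concurrent retries are rejected by the database rather than by application memory.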
**Fraud detection integration:** We integrated a real-time fraud scoring layer powered by a graph-based ML model that consumed every `PaymentInitiated` event, scored it against 80-plus behavioral and transaction signals, and emitted `PaymentFlagged` or `PaymentCleared` events. The payment engine listened for both. High-risk transactions were held for manual review in a PostgreSQL-backed queue; cleared transactions were dispatched to Stripe and Xendit. This architecture processed an average of 8,000 fraud evaluations per minute with a P99 latency of 85ms.

### Phase 2: KYC and Identity Verification (Weeks 17–24)

The KYC subsystem required a new service to replace the Firebase Cloud Function that had handled identity verification across three regulatory regimes. We used NestJS with BullMQ queues to orchestrate asynchronous verification workflows:

- **Liveness check:** Amazon Rekognition CompareFaces matched the user's selfie against their identity document.
- **Document OCR:** Amazon Textract extracted the name, ID number, and date of birth from government-issued ID cards.
- **Sanctions screening:** A third-party API provider ran checks against the UN, OFAC, and EU sanctions lists.
- **Manual review queue:** Documents that failed automated checks were routed to a human reviewer in a custom admin panel.

The Redis-backed BullMQ queue architecture let this subsystem process 200 verifications per minute with retry logic, dead-letter queues, and alerting. A verification that failed three times was raised as a Slack alert and flagged in the payment routing engine, blocking funds transfer until reviewed.

### Phase 3: Flutter App Launch (Weeks 25–36)

The mobile client was handled as a parallel workstream alongside the backend services. The final Flutter app used the BLoC pattern for state management, Dio for HTTP with retry logic, and Firebase Cloud Messaging for push notifications.
The biometric authentication flow, a critical compliance requirement, used the `local_auth` Flutter plugin, which was independently audited and had PCI-compliant documentation, a decisive factor versus React Native's fragmented biometric plugin ecosystem.

The internationalization implementation deserves separate mention: we built a dynamic language pack system backed by DynamoDB that allowed product managers to add and revise translation strings without a code deployment. The system supported Arabic RTL layout switching at runtime, Bahasa Indonesia number-formatting conventions, and Vietnamese diacritical preservation, all validated in CI using snapshot tests.

### Phase 4: Decommission Firebase (Weeks 37–52)

The final four months were dominated by data migration validation and cutover planning. The team migrated 840,000 users, 1.2 million transaction records, and all KYC attachments.

- **Users and wallets** were migrated using a parallel-write strategy: the NestJS backend wrote to both Firebase and PostgreSQL for 30 days before Firebase reads were cut over, with zero data discrepancies found.
- **Transactions** were migrated using the event-replay pattern: historical transactions were replayed as events to the new write model, producing a PostgreSQL ledger that matched Firebase row by row on a sample of 10,000 transactions validated by accounting.
- **The Bubble.io merchant dashboard** was replaced by a Next.js-based admin panel built on the same NestJS backend, completed in eight weeks instead of the projected 16.

The cutover was a 72-hour controlled event:

- **T-48 hours:** All traffic remained on the Firebase monolith while the new platform ran in production shadow mode, with all live queries validated against Firebase results.
- **T-24 hours:** Changes to Firebase were frozen, with all new features held in PR review.
- **T-0:** The Istanbul record in Route 53 was redirected for 1 percent of traffic and monitored for two hours.
- **T+2 hours:** 25 percent of traffic was shifted.
- **T+4 hours:** 50 percent of traffic was shifted.
- **T+8 hours:** The full 100 percent traffic cutover was complete.
- **T+72 hours:** Firebase resources were scaled to zero.
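The staged traffic shift above can be expressed as a small schedule lookup. This is only a sketch: the step percentages and hours come from the timeline in the text, while `cutoverPlan`, `targetPercent`, `routingWeights`, and the rule of sending all traffic back to the legacy stack when a health check fails are assumptions added for illustration.

```typescript
interface CutoverStep {
  atHour: number;             // hours after T-0
  newPlatformPercent: number; // share of traffic on the new platform
}

// Percentages and hours taken from the cutover timeline in the text.
const cutoverPlan: readonly CutoverStep[] = [
  { atHour: 0, newPlatformPercent: 1 },
  { atHour: 2, newPlatformPercent: 25 },
  { atHour: 4, newPlatformPercent: 50 },
  { atHour: 8, newPlatformPercent: 100 },
];

// Returns the planned share for a given time; before T-0 it is 0.
function targetPercent(
  hoursSinceCutover: number,
  plan: readonly CutoverStep[] = cutoverPlan,
): number {
  let percent = 0;
  for (const step of plan) {
    if (hoursSinceCutover >= step.atHour) percent = step.newPlatformPercent;
  }
  return percent;
}

// Hypothetical gate: if monitoring reports the new platform unhealthy,
// all traffic returns to the legacy stack regardless of the schedule.
function routingWeights(
  hoursSinceCutover: number,
  newPlatformHealthy: boolean,
): { legacy: number; next: number } {
  const next = newPlatformHealthy ? targetPercent(hoursSinceCutover) : 0;
  return { legacy: 100 - next, next };
}

const atThreeHours = routingWeights(3, true);
const atThreeHoursUnhealthy = routingWeights(3, false);
```

Keeping the plan as data rather than as a sequence of manual steps makes the schedule easy to review before the event and to assert against in tests.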
