How we cut page load times by 65% with Edge Caching, Image Optimization, and Adaptive Compression for a Global Fintech Platform

A global fintech platform serving 12 million monthly active users was struggling with 4.8-second median page load times, especially in emerging markets. In this case study, we walk through how a focused, three-track performance program — combining edge-level caching, modern image pipelines, and adaptive compression — reduced median load times by 65% while actually improving core business metrics. We share the technical architecture, the rollout strategy, the trade-offs we debated, and the results that convinced the executive team to fund a second phase.

## Overview

In early 2025, the product engineering team at a global fintech platform came to us with a problem that was costing them both users and revenue. The platform, a personal finance and payments application used by 12 million monthly active users across 28 countries, had a median page load time of 4.8 seconds on mobile, with emerging market users experiencing times above 7 seconds. The company's own analytics showed a 1.2 percent drop in completed transactions for every additional second of load time, and their support team was fielding thousands of complaints every month about app lag.

We were asked to lead a performance transformation program that could deliver measurable improvements within six months. The CEO made it clear: this wasn't an infrastructure vanity project. It was a revenue and retention initiative.

## Challenge

The challenge was multifaceted, and it didn't reduce neatly to a single bottleneck.

**Inconsistent geographic performance.** The application relied on a single origin region in Western Europe. Users in Southeast Asia, Africa, and Latin America were experiencing round-trip latencies of 200-400 milliseconds before a single byte of application data was transferred. This wasn't theoretical: it showed up directly in their Core Web Vitals, where 62 percent of mobile sessions were rated "poor" by Google's thresholds.

**Unoptimized media.** A full 43 percent of total page weight came from images. The platform served unoptimized hero images, product screenshots, and user-generated content at full resolution regardless of the device or network conditions. On a 3G connection, a typical dashboard page would require the transfer of 6.2 megabytes before it became interactive.

**Monolithic frontend bundle.** The single-page application had grown organically over four years. The initial JavaScript bundle was 1.8 megabytes gzipped, with significant portions of the application code loading eagerly even for pages that most users never visited.
Code splitting existed in theory but hadn't been maintained aggressively as new features shipped.

**Legacy compression settings.** The CDN was configured with conservative compression settings that hadn't been revisited since 2021. Modern compression algorithms like Brotli were available but not enabled, and several critical API endpoints returned uncompressed JSON.

## Goals

We established four concrete goals with the executive team, each tied to business outcomes rather than just technical benchmarks:

1. **Reduce median mobile page load time to under two seconds.** This was the headline metric, chosen because it correlated directly with transaction completion rates in their existing data.
2. **Reduce bounce rate on slow networks by at least 20 percent.** We identified the critical 3G/4G segment and set a specific target for that user group.
3. **Improve Core Web Vitals "good" ratings from 32 percent to 70 percent of sessions.** We wanted a metric that already influenced Google Search ranking and that the team could track independently through existing analytics.
4. **Achieve all improvements without a major frontend rewrite.** This was a hard constraint: the team had committed to new feature development during the same period, so engineering capacity was limited to approximately 30 percent of frontend headcount.

## Approach

We designed a three-track program that could be executed in parallel by small feature teams, with a central performance guild providing standards, tooling, and measurement infrastructure.

**Track 1: Edge Caching and Origin Offloading**

We migrated the application to a multi-region deployment model using edge compute nodes. Rather than routing all requests to a single origin, we deployed API and static asset serving to 14 edge locations. For content that could tolerate short staleness windows (product catalogs, user dashboard summaries, help content), we implemented edge-level caching with a 60-second TTL and stale-while-revalidate semantics.
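
A minimal sketch of the freshness decision behind a 60-second TTL with stale-while-revalidate, written in Python for illustration. The 30-second revalidation window, the `CacheDecision` enum, and the `decide` and `cache_control_header` helpers are assumptions made for this example, not the platform's actual edge configuration.

```python
from enum import Enum
from typing import Optional


class CacheDecision(Enum):
    FRESH = "serve cached copy"
    STALE_REVALIDATE = "serve cached copy and refresh in the background"
    MISS = "fetch from origin"


TTL_SECONDS = 60          # TTL stated in the text
SWR_WINDOW_SECONDS = 30   # assumed revalidation window, not from the source


def decide(age_seconds: Optional[float],
           ttl: int = TTL_SECONDS,
           swr: int = SWR_WINDOW_SECONDS) -> CacheDecision:
    """Decide how to answer a request given the age of the cached entry.

    age_seconds is None when nothing is cached for the key.
    """
    if age_seconds is None:
        return CacheDecision.MISS
    if age_seconds < ttl:
        return CacheDecision.FRESH
    if age_seconds < ttl + swr:
        # Still usable: answer immediately, refresh asynchronously.
        return CacheDecision.STALE_REVALIDATE
    return CacheDecision.MISS


def cache_control_header(ttl: int = TTL_SECONDS,
                         swr: int = SWR_WINDOW_SECONDS) -> str:
    """Build the equivalent Cache-Control response header value."""
    return f"public, max-age={ttl}, stale-while-revalidate={swr}"


if __name__ == "__main__":
    for age in (None, 10, 75, 120):
        print(age, "->", decide(age).value)
    print(cache_control_header())
```

With these assumed values, a cached response that is 75 seconds old is still served immediately while the edge refreshes it in the background; once the entry is 90 seconds old or more, the request goes to the origin.
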
For truly dynamic content like account balances and transaction histories, we implemented a tiered cache strategy where the edge held a lightweight index, and the deep origin was only queried when the index indicated a data change.

**Track 2: Modern Image Pipeline**

We built a centralized image optimization service that sat between the application and the object storage layer. Every image uploaded or referenced by the application was processed at request time through this service, which applied responsive sizing (generating five variants between 320 and 2,048 pixels wide), format negotiation (automatically serving WebP or AVIF with JPEG fallbacks), and quality calibration (adjusting compression levels based on network speed detection).

**Track 3: Adaptive Compression and Bundle Optimization**

We implemented Brotli compression across all text-based responses and API endpoints. On the frontend, we introduced a module federation architecture that allowed the dashboard bundle to be split into 12 independently loadable chunks. Critical-path code was identified through real-user monitoring and marked for preload, while secondary feature modules were marked for prefetch only after the main application became interactive.

## Implementation

The implementation was phased over five months, with each track rolling out to production in a staged manner.

**Phase 1: Foundation and Measurement (Weeks 1-4)**

Before any changes went live, we built a performance observability stack. We instrumented the application with custom metrics for every performance boundary: time to first byte, DOM content loaded, first contentful paint, and largest contentful paint. We also set up synthetic monitoring from 18 geographic locations to catch regressions that real-user data might miss in the early weeks. This investment in measurement turned out to be critical: it gave the team confidence to ship changes independently because the observability stack would catch cross-cutting impacts.

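
One way to summarize the real-user measurements described in Phase 1 is to report a median and a 95th percentile per region. The sketch below assumes samples arrive as hypothetical `(region, lcp_seconds)` pairs; the `summarize_lcp` name and the data shape are illustrative, not the team's actual tooling.

```python
from collections import defaultdict
from statistics import median, quantiles
from typing import Dict, Iterable, List, Tuple


def summarize_lcp(samples: Iterable[Tuple[str, float]]) -> Dict[str, Dict[str, float]]:
    """Group largest-contentful-paint samples by region and summarize them."""
    by_region: Dict[str, List[float]] = defaultdict(list)
    for region, lcp_seconds in samples:
        by_region[region].append(lcp_seconds)

    summary: Dict[str, Dict[str, float]] = {}
    for region, values in by_region.items():
        if len(values) > 1:
            # n=20 yields 19 cut points; the last one is the 95th percentile.
            p95 = quantiles(values, n=20, method="inclusive")[-1]
        else:
            # A single sample is its own percentile.
            p95 = values[0]
        summary[region] = {
            "count": len(values),
            "median": median(values),
            "p95": p95,
        }
    return summary


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    demo = [("eu-west", 1.0), ("eu-west", 2.0), ("eu-west", 3.0), ("sea", 6.5)]
    for region, stats in summarize_lcp(demo).items():
        print(region, stats)
```

Reporting the tail alongside the median matters here because the slowest sessions, not the typical ones, were the ones concentrated in emerging markets.
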

**Phase 2: Edge Migration (Weeks 5-10)**

The edge migration was the most operationally sensitive phase. We used a blue-green deployment model, routing 5 percent of traffic to the new edge infrastructure and gradually increasing the percentage. We wrote custom cache invalidation logic that respected the fintech regulatory requirement that certain account-related content could never be cached at the edge in a way that might expose it to unauthorized users. The team ran weekly cache penetration audits to ensure that sensitive data never leaked into shared edge caches.

**Phase 3: Image Pipeline Rollout (Weeks 8-14)**

The image service was rolled out behind a feature flag that allowed us to serve optimized and unoptimized images in parallel. This A/B capability was invaluable: we could compare identical pages with and without optimization, controlling for content variation. The initial rollout showed a 72 percent reduction in image payload and a 15 percent improvement in largest contentful paint, but it also revealed an important problem: the image service itself became a latency bottleneck under peak load because it was performing on-demand transcoding.

To address this, we added a pre-warm step that processed the most frequently requested images during low-traffic periods, and we introduced a CDN-level cache for optimized variants. After these changes, p95 latency for the image service dropped from 340 milliseconds to 28 milliseconds.

**Phase 4: Bundle Optimization (Weeks 12-18)**

The bundle optimization required the most careful coordination because it touched code owned by four different product squads. We established a bundle budget policy: no feature could add more than 25 kilobytes to the initial load bundle without a performance guild review. We used a combination of dynamic imports, route-based code splitting, and third-party script lazy loading to bring the initial bundle down from 1.8 megabytes to 420 kilobytes.
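
The bundle budget policy above lends itself to an automated check in continuous integration. The following is a minimal sketch, assuming the build can report the initial bundle size in bytes before and after a change; the function names are hypothetical, and treating a kilobyte as 1,024 bytes is also an assumption.

```python
KIB = 1024  # assumption: the 25 kilobyte budget is counted in 1,024-byte units
REVIEW_THRESHOLD_BYTES = 25 * KIB


def initial_bundle_growth(baseline_bytes: int, candidate_bytes: int) -> int:
    """Return how many bytes a change adds to the initial load bundle."""
    return candidate_bytes - baseline_bytes


def needs_guild_review(baseline_bytes: int,
                       candidate_bytes: int,
                       threshold_bytes: int = REVIEW_THRESHOLD_BYTES) -> bool:
    """True when a change adds more than the budget to the initial bundle."""
    return initial_bundle_growth(baseline_bytes, candidate_bytes) > threshold_bytes


if __name__ == "__main__":
    baseline = 420 * KIB
    for candidate in (430 * KIB, 450 * KIB):
        growth_kib = initial_bundle_growth(baseline, candidate) / KIB
        verdict = "review required" if needs_guild_review(baseline, candidate) else "within budget"
        print(f"+{growth_kib:.0f} KiB: {verdict}")
```

A check like this only flags changes for review rather than blocking them, which matches the policy as described: growth beyond the budget is allowed, but not without a conversation.
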
A particularly impactful change was removing an analytics library from the main application bundle: it was loaded on every page but only used on two administrative pages. The library accounted for 78 kilobytes of initial JavaScript. Its removal alone reduced time-to-interactive by 0.8 seconds.

![Performance optimization workflow showing analytics and monitoring](https://images.unsplash.com/photo-1551288049-bebda4e38f71?w=1200&q=80)

## Results

The results exceeded our initial targets and, more importantly, held steady as user traffic grew by 18 percent in the three months following the rollout.

**Page load performance.** Median mobile page load time dropped from 4.8 seconds to 1.68 seconds, a 65 percent improvement. For the emerging market segment that had been most affected, median load time dropped from 7.2 seconds to 2.1 seconds, which was above our stretch goal.

**Business metrics.** Transaction completion rate increased by 8.3 percent in the month following the full rollout, directly correlating with the load time improvements. Customer support tickets related to "slow app" or "loading issues" decreased by 41 percent. Monthly active user retention in the 30-day cohort improved by 2.4 percentage points.

**Core Web Vitals.** The percentage of sessions rated "good" by Google's Core Web Vitals thresholds increased from 32 percent to 78 percent. This improvement had a measurable secondary benefit: organic search traffic to the platform's help center and marketing pages increased by 12 percent, which the SEO team attributed directly to the improved performance signals.

**Infrastructure cost.** An unexpected positive outcome was a 22 percent reduction in origin server load. By shifting traffic to the edge and implementing more aggressive caching, the team was able to decommission two origin servers during the quarterly capacity planning cycle.

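
The headline percentages follow directly from the before and after figures reported in this case study. As a quick arithmetic check, the sketch below recomputes them; the `percent_change` helper is illustrative.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# Before/after pairs taken from the figures reported in the text.
reported = {
    "Median mobile page load (s)": (4.8, 1.68),
    "Emerging market median load (s)": (7.2, 2.1),
    "Core Web Vitals 'good' sessions (%)": (32, 78),
    "Initial JavaScript bundle (KB)": (1800, 420),
}

if __name__ == "__main__":
    for metric, (before, after) in reported.items():
        print(f"{metric}: {percent_change(before, after):+.0f}%")
```

Note that the Core Web Vitals figure is a relative change in the share of "good" sessions; in absolute terms the share rose by 46 percentage points.
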

## Metrics Summary

| Metric | Before | After | Change |
|--------|--------|-------|--------|
| Median mobile page load | 4.8s | 1.68s | -65% |
| Emerging market median load | 7.2s | 2.1s | -71% |
| Core Web Vitals "good" sessions | 32% | 78% | +144% |
| Transaction completion rate | Baseline | +8.3% | Significant |
| "Slow app" support tickets | Baseline | -41% | Significant |
| 30-day user retention | Baseline | +2.4pp | Significant |
| Organic search traffic | Baseline | +12% | Significant |
| Origin server load | Baseline | -22% | Cost savings |
| Initial JavaScript bundle | 1.8MB | 420KB | -77% |
| Image payload per page | 2.7MB | 0.76MB | -72% |

## Lessons Learned

Through this engagement, we identified several lessons that we have applied to subsequent performance work and that we believe have broad applicability.

**Measure before you optimize.** The observability investment in Phase 1 paid for itself many times over. Without granular metrics, we would have shipped optimizations that looked good on paper but didn't address the actual bottlenecks. For example, initial instinct suggested that API response times were the primary problem, but the data showed that image transfer times were actually the dominant factor on mobile.

**Incremental beats big-bang.** The phased rollout with feature flags allowed us to validate each track independently and identify problems early. The image service latency issue was caught during the 5 percent test phase, before it affected all users. A single big-bang release would have made that incident much more costly.

**Staleness windows are a feature, not a compromise.** The team's initial instinct was to set cache TTLs to 10 seconds because they were nervous about serving stale financial data. By carefully analyzing what content could tolerate various staleness windows and implementing the tiered cache strategy, we found that most content could be cached for 60 seconds or more without any user-visible impact.
This was one of the highest-leverage decisions in the program.

**Bundle budgets prevent bundle drift.** The 25 kilobyte review policy was initially unpopular with product teams, but it created a culture of performance awareness that continued after the engagement ended. Two months after our involvement concluded, the team self-organized to reduce the bundle by an additional 60 kilobytes without any external prompting.

**Performance is a product feature.** The 8.3 percent increase in transaction completion rate transformed how the organization talked about performance. What had been an engineering concern became a business metric, and that shift in framing is what secured funding for the second phase of work, which is now focused on reducing initial payload for new users who don't have cached assets.

We are now working with the same engineering leadership to apply the same methodology to their international expansion into three new markets, where the first experience on the platform will determine whether a new user becomes a long-term customer.
