22 April 2026 • 6 min
Building a Real-Time Collaborative Editor for Distributed Teams
How we architected and delivered a WebSocket-powered collaborative document platform enabling 50+ team members to edit simultaneously, achieving sub-100ms sync latency and zero data conflicts across six time zones. This case study explores the technical challenges, architectural decisions, and measurable outcomes of implementing operational transformation for a global enterprise.
Overview
When a leading enterprise software company needed to unify their fragmented document workflows across offices in San Francisco, London, Singapore, Sydney, Tokyo, and São Paulo, they turned to us for a solution. Their existing tools couldn't support real-time simultaneous editing, forcing teams to rely on cumbersome version control systems, email attachments, and endless comment threads that fragmented decision-making and delayed product releases.
We built a custom collaborative editor platform from the ground up, using WebSocket connections for real-time synchronization, Operational Transformation (OT) algorithms for conflict resolution, and a presence system that shows who is viewing and editing each document in real time. The result was a 78% reduction in time-to-decision for cross-functional reviews (from an average of 18 days to 4 days) and a fundamentally different way of working for their distributed teams.
The Challenge
The client's product teams were spread across six time zones, with core engineering in San Francisco, design in London, product management split between Singapore and Tokyo, and sales operations in Sydney and São Paulo. Their existing workflow involved:
- Google Docs with limited version history and confusing comment chains
- GitHub wikis that required technical setup for non-technical team members
- Endless email threads with attachment versions
- In-person or Zoom review meetings to consolidate feedback
The pain points were severe: product spec reviews that should take hours instead took 2-3 weeks due to the sync overhead. Design reviews required everyone to be in the same Zoom call simultaneously, which was impossible across their time zone spread. Technical documentation lived in Confluence but wasn't linked to product specs, creating knowledge silos.
They needed a unified platform where team members could:
- Simultaneously edit documents without overriding each other's changes
- See who's actively working on which section
- Maintain full version history with attribution
- Integrate with their existing GitHub and Slack workflows
- Access granular permissions at section level, not just document level
Goals
We established clear success criteria with the client:
- Real-time sync latency under 200ms - Users should see colleagues' changes within roughly one network round-trip
- Zero data conflicts - No user edits should ever be lost, regardless of network conditions
- Seamless offline support - Local editing should sync transparently when connectivity resumes
- Section-level permissions - Different team members can have edit, comment, or view access to specific sections
- One-click import from GitHub - Technical specs should flow bidirectionally between code repos and collaborative docs
- Full audit trail - Every change attributed to the right person with timestamps
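The section-level permission goal can be pictured as a document-wide default per user plus optional per-section overrides. The sketch below is only an illustration under that assumption: the `DocumentAcl` shape, the `none` level, and the function names are hypothetical and are not taken from the client's platform.

```typescript
// Access levels from the goals list, plus a hypothetical "none" for users with no grant.
type Access = "none" | "view" | "comment" | "edit";

const RANK: Record<Access, number> = { none: 0, view: 1, comment: 2, edit: 3 };

// Assumed model: a document-wide default per user, with optional per-section overrides.
interface DocumentAcl {
  defaults: Record<string, Access | undefined>;
  sections: Record<string, Record<string, Access | undefined> | undefined>;
}

// A section override wins over the document default; no entry at all means no access.
function effectiveAccess(acl: DocumentAcl, userId: string, sectionId: string): Access {
  const override = acl.sections[sectionId]?.[userId];
  return override ?? acl.defaults[userId] ?? "none";
}

function can(acl: DocumentAcl, userId: string, sectionId: string, needed: Access): boolean {
  return RANK[effectiveAccess(acl, userId, sectionId)] >= RANK[needed];
}

// Usage: dana can edit everywhere except the pricing section, where she can only comment.
const acl: DocumentAcl = {
  defaults: { dana: "edit", lee: "view" },
  sections: { pricing: { dana: "comment" } },
};
const danaEditsIntro = can(acl, "dana", "intro", "edit");
const danaEditsPricing = can(acl, "dana", "pricing", "edit");
```

Note that in this model an override can lower a user's access as well as raise it, which is one reason section-level rules are harder to reason about than document-level ones.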
Our Approach
Technology Stack Selection
After evaluating several approaches including CRDTs (Conflict-free Replicated Data Types), OT (Operational Transformation), and centralized lock-based systems, we chose Operational Transformation for several key reasons:
- Extensive production hardening in tools like Google Docs and Etherpad
- Predictable behavior in edge cases
- Lower client-side computational requirements
- Mature library support in JavaScript
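To make the core idea of OT concrete, here is a minimal sketch for plain text, assuming only two operation types: inserting a string and deleting a single character. The `Op` shape and the `applyOp` and `transform` names are illustrative assumptions; production engines such as ShareDB operate on much richer operation types.

```typescript
// Hypothetical plain-text operations; "delete" removes exactly one character.
type Op =
  | { kind: "insert"; pos: number; text: string }
  | { kind: "delete"; pos: number };

function applyOp(doc: string, op: Op | null): string {
  if (op === null) return doc; // the operation was cancelled out by a concurrent one
  if (op.kind === "insert") {
    return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
  }
  return doc.slice(0, op.pos) + doc.slice(op.pos + 1);
}

// Rewrites `op` so that it can be applied after the concurrent operation `other`.
// `opWinsTies` decides which of two inserts at the same position ends up first.
function transform(op: Op, other: Op, opWinsTies: boolean): Op | null {
  if (op.kind === "insert") {
    if (other.kind === "insert") {
      const staysPut = op.pos < other.pos || (op.pos === other.pos && opWinsTies);
      return staysPut ? op : { ...op, pos: op.pos + other.text.length };
    }
    // `other` deleted one character: shift left only if it was strictly before us.
    return op.pos <= other.pos ? op : { ...op, pos: op.pos - 1 };
  }
  // `op` is a delete.
  if (other.kind === "insert") {
    return op.pos < other.pos ? op : { ...op, pos: op.pos + other.text.length };
  }
  if (op.pos === other.pos) return null; // both sides deleted the same character
  return op.pos < other.pos ? op : { ...op, pos: op.pos - 1 };
}

// Two users edit "abc" concurrently; both application orders must converge.
const base = "abc";
const insertX: Op = { kind: "insert", pos: 1, text: "X" };
const deleteC: Op = { kind: "delete", pos: 2 };
const viaInsertFirst = applyOp(applyOp(base, insertX), transform(deleteC, insertX, false));
const viaDeleteFirst = applyOp(applyOp(base, deleteC), transform(insertX, deleteC, true));
```

Both paths end with the same text, which is the convergence property that lets concurrent edits be merged without discarding either user's change.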
The full stack included:
- WebSocket Server: Custom Node.js server using the 'ws' library with custom framing for OT operations
- Operational Transformation Engine: Modified ShareDB with custom transformation functions
- Frontend: React with ProseMirror for rich text editing
- Database: PostgreSQL for document storage with JSONB columns for structured content
- Redis: For presence state and temporary operation cache
- Object Storage: S3-compatible storage for document attachments
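As an illustration of the presence state kept in Redis, the sketch below models it as per-document entries that expire when a client stops sending heartbeats. An in-memory `Map` stands in for Redis here, and the `PresenceRegistry` class, its method names, and the TTL value are all assumptions made for the example.

```typescript
interface Presence {
  userId: string;
  cursor: number;      // character offset of the user's cursor
  lastSeenMs: number;  // timestamp of the most recent heartbeat
}

// In-memory stand-in for Redis keys with an expiry; not the client's actual code.
class PresenceRegistry {
  private readonly byDoc = new Map<string, Map<string, Presence>>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Called whenever a client reports its cursor; refreshes the expiry window.
  heartbeat(docId: string, userId: string, cursor: number, nowMs: number): void {
    let users = this.byDoc.get(docId);
    if (!users) {
      users = new Map<string, Presence>();
      this.byDoc.set(docId, users);
    }
    users.set(userId, { userId, cursor, lastSeenMs: nowMs });
  }

  // Drops entries older than the TTL, then returns everyone still present.
  active(docId: string, nowMs: number): Presence[] {
    const users = this.byDoc.get(docId);
    if (!users) return [];
    users.forEach((entry, id) => {
      if (nowMs - entry.lastSeenMs > this.ttlMs) users.delete(id);
    });
    return Array.from(users.values());
  }
}

// Usage: alice goes quiet and expires, bob is still within the 5-second window.
const registry = new PresenceRegistry(5000);
registry.heartbeat("spec-42", "alice", 10, 0);
registry.heartbeat("spec-42", "bob", 20, 4000);
const stillHere = registry.active("spec-42", 6000).map((p) => p.userId);
```

Passing the clock in as `nowMs` keeps the sketch deterministic; a real deployment would more likely rely on the store's own key expiry.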
Architecture Design
We designed a hybrid architecture that balances consistency with performance:
- Document Authority: Each document has a designated authority server that sequences all operations
- Operation Forwarding: All connected clients forward their operations to the document's authority server
- Anti-Entropy: Periodic state reconciliation catches desync from network partitions
- Local Shadow Copies: Every client maintains a local shadow for offline editing
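The sequencing role of the document authority can be sketched as follows, assuming each client submits an operation together with the document version it was based on. The message shapes, the `DocumentAuthority` class, and the toy insert-only operation are hypothetical; the case study does not describe the actual wire format.

```typescript
// Hypothetical message shapes for submissions and committed operations.
interface Submission<Op> { clientId: string; baseVersion: number; op: Op }
interface Commit<Op> { clientId: string; version: number; op: Op }

class DocumentAuthority<Op> {
  private readonly log: Commit<Op>[] = [];
  private readonly transform: (incoming: Op, committed: Op) => Op;

  constructor(transform: (incoming: Op, committed: Op) => Op) {
    this.transform = transform;
  }

  get version(): number {
    return this.log.length;
  }

  // Rebases the incoming op over everything committed since its base version,
  // then appends it to the log under the next version number.
  submit(msg: Submission<Op>): Commit<Op> {
    if (msg.baseVersion < 0 || msg.baseVersion > this.log.length) {
      throw new RangeError(`unknown base version ${msg.baseVersion}`);
    }
    let op = msg.op;
    for (const entry of this.log.slice(msg.baseVersion)) {
      op = this.transform(op, entry.op);
    }
    const commit: Commit<Op> = { clientId: msg.clientId, version: this.log.length + 1, op };
    this.log.push(commit);
    return commit;
  }
}

// Usage with a toy insert-only operation: earlier commits win position ties.
interface Insert { pos: number; text: string }
const shiftInsert = (incoming: Insert, committed: Insert): Insert =>
  committed.pos <= incoming.pos
    ? { ...incoming, pos: incoming.pos + committed.text.length }
    : incoming;

const authority = new DocumentAuthority<Insert>(shiftInsert);
const first = authority.submit({ clientId: "alice", baseVersion: 0, op: { pos: 0, text: "Hi" } });
const second = authority.submit({ clientId: "bob", baseVersion: 0, op: { pos: 0, text: "Yo" } });
```

Because a single server assigns the version numbers, every client sees the same total order of operations, which is what makes the transformation step deterministic.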
Implementation
The implementation took 14 weeks across four distinct phases:
Phase 1: Foundation (Weeks 1-4)
- Set up WebSocket infrastructure with connection pooling and health checks
- Implemented basic document CRUD with PostgreSQL
- Created initial ProseMirror integration with custom schema
- Built authentication and authorization layer
Phase 2: Real-Time Sync (Weeks 5-8)
- Integrated ShareDB OT engine with ProseMirror
- Built operation transform functions for all document operations
- Implemented presence system (who's viewing, cursor position)
- Created client-side operation buffering and batching
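One common way to do client-side buffering and batching in OT systems is to keep at most one batch in flight and accumulate new operations until the server acknowledges it. The sketch below assumes that approach; the `OperationBuffer` class, its `send` callback, and the batch size are illustrative and not confirmed details of this project.

```typescript
// Keeps at most one batch in flight; ops created meanwhile wait in `pending`.
class OperationBuffer<Op> {
  private pending: Op[] = [];
  private inFlight: Op[] | null = null;
  private readonly send: (batch: Op[]) => void;
  private readonly maxBatchSize: number;

  constructor(send: (batch: Op[]) => void, maxBatchSize = 20) {
    this.send = send;
    this.maxBatchSize = maxBatchSize;
  }

  // Queue a local operation and send it immediately if the channel is idle.
  push(op: Op): void {
    this.pending.push(op);
    this.trySend();
  }

  // Called when the server acknowledges the in-flight batch.
  ack(): void {
    this.inFlight = null;
    this.trySend();
  }

  private trySend(): void {
    if (this.inFlight !== null || this.pending.length === 0) return;
    this.inFlight = this.pending.splice(0, this.maxBatchSize);
    this.send(this.inFlight);
  }
}

// Usage: the first op goes out alone, later ops are batched while it is in flight.
const sentBatches: string[][] = [];
const buffer = new OperationBuffer<string>((batch) => sentBatches.push(batch.slice()), 2);
buffer.push("a");
buffer.push("b");
buffer.push("c");
buffer.push("d");
buffer.ack(); // sends ["b", "c"]
buffer.ack(); // sends ["d"]
```

Waiting for the acknowledgement before sending the next batch means each batch is based on a known server version, at the cost of one round-trip of added delay under heavy typing.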
Phase 3: Offline & Integrations (Weeks 9-11)
- Built IndexedDB local storage with sync queue
- Implemented conflict resolution UI for merge decisions
- Created GitHub import/export pipeline
- Built Slack notifications for mentions and comments
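The offline sync queue from this phase might look roughly like the sketch below: operations are persisted locally and drained in order once the connection returns. A synchronous in-memory store stands in for IndexedDB (whose real API is asynchronous), and the `QueueStore`, `MemoryStore`, and `SyncQueue` names are assumptions made for the example.

```typescript
interface QueuedOp { id: number; docId: string; payload: string }

// Minimal persistence contract; an IndexedDB-backed version would be asynchronous.
interface QueueStore {
  load(): QueuedOp[];
  save(ops: QueuedOp[]): void;
}

class MemoryStore implements QueueStore {
  private ops: QueuedOp[] = [];
  load(): QueuedOp[] { return this.ops.slice(); }
  save(ops: QueuedOp[]): void { this.ops = ops.slice(); }
}

class SyncQueue {
  private readonly store: QueueStore;
  private readonly trySend: (op: QueuedOp) => boolean;
  private nextId: number;

  constructor(store: QueueStore, trySend: (op: QueuedOp) => boolean) {
    this.store = store;
    this.trySend = trySend;
    // Continue numbering after anything that survived a previous session.
    this.nextId = store.load().reduce((max, op) => Math.max(max, op.id), 0) + 1;
  }

  enqueue(docId: string, payload: string): void {
    const ops = this.store.load();
    ops.push({ id: this.nextId++, docId, payload });
    this.store.save(ops);
  }

  // Sends queued ops oldest-first and stops at the first failure to preserve order.
  drain(): number {
    const ops = this.store.load();
    let sent = 0;
    while (sent < ops.length && this.trySend(ops[sent])) sent++;
    this.store.save(ops.slice(sent));
    return sent;
  }

  get size(): number {
    return this.store.load().length;
  }
}

// Usage: edits made offline stay queued until the connection comes back.
let online = false;
const delivered: string[] = [];
const queue = new SyncQueue(new MemoryStore(), (op) => {
  if (!online) return false;
  delivered.push(op.payload);
  return true;
});
queue.enqueue("spec-42", "insert:0:Hello");
queue.enqueue("spec-42", "insert:5:!");
const sentWhileOffline = queue.drain();
online = true;
const sentAfterReconnect = queue.drain();
```

Stopping at the first failed send keeps operations in their original order, which matters because later operations are usually expressed relative to the effects of earlier ones.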
Phase 4: Polish & Scale (Weeks 12-14)
- Performance optimization for 50+ concurrent editors
- Comprehensive load testing
- Security audit and penetration testing
- User acceptance testing with 20 beta users across all time zones
Results
The platform launched successfully and exceeded all performance targets. Here's what the client achieved:
- 78% reduction in time-to-decision for cross-functional product reviews (from an average of 18 days to 4 days)
- 156 hours saved weekly in coordination overhead across all teams
- 89% user adoption within the first month (against a 70% target)
- Zero data loss incidents in the first 6 months of production
- Average sync latency of 87ms (well under the 200ms target)
Key Metrics
| Metric | Before | After | Improvement |
|---|---|---|---|
| Time to review consensus | 18 days | 4 days | -78% |
| Weekly coordination hours | 42 hours | 8 hours | -81% |
| Document version conflicts | 12 per month | 0 | -100% |
| User satisfaction score | 3.2/10 | 8.7/10 | +172% |
| Cross-timezone meetings | 15/week | 3/week | -80% |
| Average sync latency | N/A | 87ms | Target: <200ms ✓ |
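The Improvement column can be reproduced from the Before and After values with a simple relative-change formula, rounded to the nearest whole percent. The helper name `percentChange` below is ours, introduced only for this check.

```typescript
// Relative change from `before` to `after`, rounded to a whole percent.
function percentChange(before: number, after: number): number {
  return Math.round(((after - before) / before) * 100);
}

const reviewConsensus = percentChange(18, 4);     // days
const coordinationHours = percentChange(42, 8);   // hours per week
const versionConflicts = percentChange(12, 0);    // conflicts per month
const satisfaction = percentChange(3.2, 8.7);     // score out of 10
const crossTzMeetings = percentChange(15, 3);     // meetings per week
```

The sync latency row has no baseline, so no percentage applies there; it is compared against the 200ms target instead.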
Lessons Learned
Several insights emerged from this project that inform our recommendations for similar initiatives:
- Start with OT, not CRDTs: While CRDTs are theoretically superior for decentralized systems, OT is better understood and has battle-tested library support. Don't choose CRDTs unless you truly need offline-first with eventual consistency.
- Invest in presence early: The presence system (showing who's viewing, cursor positions) was more valuable than we anticipated for reducing perceived latency and building collaborative trust.
- Offline is essential: Real-world distributed teams often have unreliable connectivity. Building robust offline support from the start saved us significant rework later.
- Integration points matter: The GitHub and Slack integrations drove adoption more than the core editing feature. Think about workflow integration from day one.
- Section-level permissions are complex: Implementing granular permissions at the section level added significant complexity. Consider whether document-level permissions meet your needs before going more granular.
This platform continues to serve the client's 500+ person organization, with plans to expand to their customer-facing documentation. If you're building collaborative tools for distributed teams, we'd love to share our learnings.
