erichowens

real-time-collaboration-engine

20
3
# Install this skill:
npx skills add erichowens/some_claude_skills --skill "real-time-collaboration-engine"

Install specific skill from multi-skill repository

# Description

Build real-time collaborative editing with WebSockets, OT/CRDT conflict resolution, and presence awareness. Implements cursor tracking, optimistic updates, and offline sync. Use for collaborative editors, whiteboards, video editing. Activate on "real-time collaboration", "WebSocket sync", "multiplayer editing", "CRDT", "presence awareness". NOT for simple chat, request-response APIs, or single-user apps.

# SKILL.md


name: real-time-collaboration-engine
description: Build real-time collaborative editing with WebSockets, OT/CRDT conflict resolution, and presence awareness. Implements cursor tracking, optimistic updates, and offline sync. Use for collaborative editors, whiteboards, video editing. Activate on "real-time collaboration", "WebSocket sync", "multiplayer editing", "CRDT", "presence awareness". NOT for simple chat, request-response APIs, or single-user apps.
allowed-tools: Read,Write,Edit,Bash(npm:,websocket:)


Real-Time Collaboration Engine

Expert in building Google Docs-style collaborative editing with WebSockets, conflict resolution, and presence awareness.

When to Use

βœ… Use for:
- Collaborative text/code editors
- Shared whiteboards and design tools
- Multi-user video editing timelines
- Real-time data dashboards
- Multiplayer game state sync

❌ NOT for:
- Simple chat applications (use basic WebSocket)
- Request-response APIs (use REST/GraphQL)
- Single-user applications
- Read-only data streaming (use Server-Sent Events)

Quick Decision Tree

Need real-time collaboration?
β”œβ”€β”€ Text editing? β†’ Operational Transform (OT)
β”œβ”€β”€ JSON data structures? β†’ CRDTs
β”œβ”€β”€ Cursor tracking only? β†’ Simple WebSocket + presence
β”œβ”€β”€ Offline-first? β†’ CRDTs (better offline merge)
└── No conflicts possible? β†’ Basic broadcast

Technology Selection

Conflict Resolution Strategies (2024)

Strategy Best For Complexity Offline Support
Operational Transform (OT) Text, ordered sequences High Limited
CRDTs JSON objects, sets Medium Excellent
Last-Write-Wins Simple state Low Basic
Three-Way Merge Git-style editing High Good

Timeline:
- 2010: Google Wave uses OT
- 2014: Figma adopts CRDTs
- 2019: Yjs (CRDT library) released
- 2022: Automerge 2.0 (CRDT library) released
- 2024: PartyKit simplifies real-time infrastructure


Common Anti-Patterns

Anti-Pattern 1: Broadcasting Every Keystroke

Novice thinking: "Send every change immediately for real-time feel"

Problem: Network floods with tiny messages, poor performance.

Wrong approach:

// ❌ Sends message on every keystroke
function Editor() {
  const handleChange = (text: string) => {
    socket.emit('text-change', { text });  // Every keystroke!
  };

  return <textarea onChange={(e) => handleChange(e.target.value)} />;
}

Why wrong: 100 WPM typing = 500 messages/minute = network congestion.

Correct approach:

// βœ… Batches changes every 200ms
function Editor() {
  const [pendingChanges, setPendingChanges] = useState<Change[]>([]);

  useEffect(() => {
    const interval = setInterval(() => {
      if (pendingChanges.length > 0) {
        socket.emit('text-batch', { changes: pendingChanges });
        setPendingChanges([]);
      }
    }, 200);

    return () => clearInterval(interval);
  }, [pendingChanges]);

  const handleChange = (change: Change) => {
    setPendingChanges(prev => [...prev, change]);
  };

  return <textarea onChange={handleChange} />;
}

Impact: 500 messages/minute β†’ 5 messages/second (90% reduction).


Anti-Pattern 2: No Conflict Resolution Strategy

Problem: Concurrent edits cause data loss or corruption.

Symptom: Users see their changes disappear, documents become inconsistent.

Wrong approach:

// ❌ Last write wins, overwrites concurrent changes
socket.on('text-change', ({ userId, text }) => {
  setDocument(text);  // Loses concurrent edits!
});

Why wrong: If User A and B edit simultaneously, one change is lost.

Correct approach (OT):

// βœ… Operational Transform for text
import { TextOperation } from 'ot.js';

socket.on('operation', ({ userId, operation, revision }) => {
  const transformed = transformOperation(
    operation,
    pendingOperations,
    revision
  );

  applyOperation(transformed);
  incrementRevision();
});

function transformOperation(
  incoming: Operation,
  pending: Operation[],
  baseRevision: number
): Operation {
  // Transform incoming against pending operations
  let transformed = incoming;
  for (const op of pending) {
    transformed = TextOperation.transform(transformed, op)[0];
  }
  return transformed;
}

Correct approach (CRDT):

// βœ… CRDT for JSON objects
import * as Y from 'yjs';

const ydoc = new Y.Doc();
const ytext = ydoc.getText('document');

// Automatically handles conflicts
ytext.insert(0, 'Hello');

// Sync with peers
const provider = new WebsocketProvider('ws://localhost:1234', 'room', ydoc);

Impact: Concurrent edits merge correctly, no data loss.


Anti-Pattern 3: Not Handling Disconnections

Problem: User goes offline, loses work or sees stale state.

Wrong approach:

// ❌ No offline handling
socket.on('disconnect', () => {
  console.log('Disconnected');  // That's it?!
});

Why wrong: Pending changes lost, no reconnection strategy, bad UX.

Correct approach:

// βœ… Queue changes offline, sync on reconnect
const [isOnline, setIsOnline] = useState(true);
const [offlineQueue, setOfflineQueue] = useState<Change[]>([]);

socket.on('disconnect', () => {
  setIsOnline(false);
  showToast('Offline - changes will sync when reconnected');
});

socket.on('connect', () => {
  setIsOnline(true);

  // Send queued changes
  if (offlineQueue.length > 0) {
    socket.emit('sync-offline-changes', { changes: offlineQueue });
    setOfflineQueue([]);
  }
});

const handleChange = (change: Change) => {
  if (isOnline) {
    socket.emit('change', change);
  } else {
    setOfflineQueue(prev => [...prev, change]);
  }
};

Timeline context:
- 2015: Offline-first apps rare
- 2020: PWAs make offline UX standard
- 2024: Users expect seamless offline editing


Anti-Pattern 4: Client-Only State Sync

Problem: No server authority, clients get out of sync.

Wrong approach:

// ❌ Clients broadcast to each other directly
socket.on('peer-change', ({ userId, change }) => {
  applyChange(change);  // No validation, no server state
});

Why wrong: Malicious client can send invalid data, no recovery from desync.

Correct approach:

// βœ… Server is source of truth
// Client
socket.emit('operation', { operation, clientRevision });

socket.on('ack', ({ serverRevision }) => {
  if (serverRevision !== expectedRevision) {
    // Desync detected, request full state
    socket.emit('request-full-state');
  }
});

// Server
io.on('connection', (socket) => {
  socket.on('operation', ({ operation, clientRevision }) => {
    // Validate operation
    if (!isValid(operation)) {
      socket.emit('error', { message: 'Invalid operation' });
      return;
    }

    // Apply to server state
    const serverRevision = applyOperation(operation);

    // Broadcast to all clients
    io.emit('operation', { operation, serverRevision });
  });
});

Impact: Data integrity guaranteed, can recover from client bugs.


Anti-Pattern 5: No Presence Awareness

Problem: Users can't see who's editing what, causing edit conflicts.

Symptom: Two people editing same section unknowingly.

Wrong approach:

// ❌ No awareness of other users
function Editor() {
  return <textarea />;  // Flying blind!
}

Correct approach:

// βœ… Show active users and cursors
import { usePresence } from './usePresence';

function Editor() {
  const { users, updateCursor } = usePresence();

  const handleCursorMove = (position: number) => {
    socket.emit('cursor-move', { userId: myId, position });
  };

  return (
    <div>
      {/* Show who's online */}
      <UserList users={users} />

      {/* Show remote cursors */}
      <EditorWithCursors
        content={content}
        cursors={users.map(u => u.cursor)}
        onCursorMove={handleCursorMove}
      />
    </div>
  );
}

Features:
- Active user list with avatars
- Cursor positions color-coded by user
- Selection ranges highlighted
- "User X is typing..." indicators


Implementation Patterns

Pattern 1: WebSocket Setup with Reconnection

import { io } from 'socket.io-client';

const socket = io('ws://localhost:3000', {
  reconnection: true,
  reconnectionDelay: 1000,
  reconnectionDelayMax: 5000,
  reconnectionAttempts: Infinity,
  transports: ['websocket', 'polling']  // Fallback
});

socket.on('connect', () => {
  console.log('Connected:', socket.id);
});

socket.on('disconnect', (reason) => {
  if (reason === 'io server disconnect') {
    // Server disconnected, manually reconnect
    socket.connect();
  }
});

socket.on('connect_error', (error) => {
  console.error('Connection error:', error);
});

Pattern 2: Operational Transform (Text)

import { TextOperation } from 'ot.js';

class OTEditor {
  private revision = 0;
  private pendingOperations: TextOperation[] = [];

  applyLocalOperation(op: TextOperation): void {
    // Apply immediately (optimistic update)
    this.applyToEditor(op);

    // Send to server
    this.sendOperation(op);

    // Store as pending
    this.pendingOperations.push(op);
  }

  receiveRemoteOperation(op: TextOperation, serverRevision: number): void {
    // Transform against pending operations
    let transformed = op;
    for (const pending of this.pendingOperations) {
      [transformed, pending] = TextOperation.transform(transformed, pending);
    }

    // Apply transformed operation
    this.applyToEditor(transformed);
    this.revision = serverRevision;
  }

  acknowledgeOperation(serverRevision: number): void {
    // Remove acknowledged operation from pending
    this.pendingOperations.shift();
    this.revision = serverRevision;
  }
}

Pattern 3: CRDT with Yjs

import * as Y from 'yjs';
import { WebsocketProvider } from 'y-websocket';

// Create shared document
const ydoc = new Y.Doc();

// Define shared types
const ytext = ydoc.getText('content');
const ymap = ydoc.getMap('metadata');
const yarray = ydoc.getArray('users');

// Connect to sync server
const provider = new WebsocketProvider(
  'ws://localhost:1234',
  'room-name',
  ydoc
);

// Listen to changes
ytext.observe(event => {
  console.log('Text changed:', event.changes);
});

// Make changes (automatically synced)
ytext.insert(0, 'Hello ');
ytext.insert(6, 'World!');

// Undo/redo support
const undoManager = new Y.UndoManager(ytext);
undoManager.undo();
undoManager.redo();

Pattern 4: Presence Awareness

import { Awareness } from 'y-protocols/awareness';

const awareness = provider.awareness;

// Set local state
awareness.setLocalState({
  user: {
    name: 'Alice',
    color: '#ff0000',
    cursor: { line: 10, ch: 5 }
  }
});

// Listen to changes
awareness.on('change', ({ added, updated, removed }) => {
  // Update UI with user cursors/selections
  const states = awareness.getStates();
  states.forEach((state, clientId) => {
    if (clientId !== awareness.clientID) {
      renderCursor(state.user.cursor, state.user.color);
    }
  });
});

Pattern 5: Optimistic Updates with Rollback

class OptimisticEditor {
  private optimisticChanges = new Map<string, Change>();

  async applyChange(change: Change): Promise<void> {
    const changeId = generateId();

    // Apply immediately (optimistic)
    this.applyToUI(change);
    this.optimisticChanges.set(changeId, change);

    try {
      // Send to server
      const result = await this.sendToServer(change);

      // Success - remove from optimistic
      this.optimisticChanges.delete(changeId);

    } catch (error) {
      // Failed - rollback
      this.rollback(changeId);
      this.showError('Could not apply change');
    }
  }

  private rollback(changeId: string): void {
    const change = this.optimisticChanges.get(changeId);
    if (change) {
      this.revertInUI(change);
      this.optimisticChanges.delete(changeId);
    }
  }
}

Production Checklist

β–‘ WebSocket connection with auto-reconnect
β–‘ Offline queue for pending changes
β–‘ Conflict resolution strategy (OT or CRDT)
β–‘ Server authority (clients can't desync)
β–‘ Presence awareness (cursors, active users)
β–‘ Optimistic updates with rollback
β–‘ Change batching (not per-keystroke)
β–‘ Message compression for large payloads
β–‘ Authentication and authorization
β–‘ Rate limiting (prevent spam)
β–‘ Heartbeat/ping-pong to detect dead connections
β–‘ Graceful degradation (falls back to polling if WebSocket fails)

When to Use vs Avoid

Scenario Strategy
Text editing (Google Docs) βœ… Operational Transform
JSON objects (Figma) βœ… CRDTs (Yjs, Automerge)
Simple cursor sharing βœ… Basic WebSocket + presence
Chat messages βœ… Simple append-only (no OT/CRDT)
Video timeline editing βœ… CRDTs for timeline, OT for text
Read-only dashboards ❌ Use Server-Sent Events instead

References

  • /references/ot-vs-crdt.md - Deep comparison of conflict resolution strategies
  • /references/websocket-scaling.md - Scaling to millions of concurrent connections
  • /references/presence-patterns.md - Cursor tracking, user awareness, activity indicators

Scripts

  • scripts/collaboration_tester.ts - Simulate concurrent edits, test conflict resolution
  • scripts/latency_simulator.ts - Test behavior under high latency/packet loss

This skill guides: Real-time collaboration | WebSocket architecture | Operational Transform | CRDTs | Presence awareness | Conflict resolution

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.