Gemini SOR Integration Documentation

Overview

This document describes the integration of Google's Gemini AI model for SOR (Statement of Repair) document analysis, featuring intelligent PDF page splitting to handle large documents within Gemini's token limits.

Features

🤖 Dual AI Provider Support

  • Claude Sonnet 4: Traditional approach, processes entire PDF at once
  • Gemini 2.5 Flash: New approach with PDF splitting for large documents

📄 PDF Page Splitting

  • Automatically splits PDFs into 5-page chunks
  • Processes each chunk separately through Gemini
  • Merges the per-chunk results back into a single SOR structure
  • Respects Gemini's 10 requests-per-minute (RPM) rate limit by pausing 6 seconds between requests
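
The 5-page chunking can be sketched as a pure index calculation. This is a minimal illustration, not the service's actual code: `chunkPageIndices` is a hypothetical helper name, and it only computes which zero-based page indices belong to each chunk. Index arrays of this form are what pdf-lib's `PDFDocument.copyPages(srcDoc, indices)` accepts when pages are copied into a new document.

```typescript
// Hypothetical helper: groups zero-based page indices into fixed-size chunks.
function chunkPageIndices(pageCount: number, chunkSize = 5): number[][] {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError('pageCount must be a non-negative integer');
  }
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  const chunks: number[][] = [];
  for (let start = 0; start < pageCount; start += chunkSize) {
    const end = Math.min(start + chunkSize, pageCount);
    const indices: number[] = [];
    for (let i = start; i < end; i++) indices.push(i);
    chunks.push(indices);
  }
  return chunks;
}

// A 12-page PDF yields chunks of 5, 5, and 2 pages.
console.log(chunkPageIndices(12).map((c) => c.length)); // [ 5, 5, 2 ]
```

The last chunk is simply shorter when the page count is not a multiple of five, so no padding is needed.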

🎯 Enhanced Data Structure

Updated to support comprehensive SOR data extraction:

  • Document metadata (consultant, property, borrower, lender, inspection, contractor info)
  • Construction sections (1-35) with detailed line items
  • Materials and labor cost breakdowns
  • Recap subtotals and allowable fees
  • Final acceptance signatures and additional notes

Usage

Basic Usage

import { SORAnalysisService } from '@/utils/ai/sorAnalysis';

// Using Gemini (recommended for large PDFs)
const geminiService = new SORAnalysisService(
  'your-gemini-api-key',
  'gemini-2.5-flash-preview-05-20',
  'gemini'
);

// Using Claude (faster for smaller documents)
const claudeService = new SORAnalysisService(
  'your-claude-api-key',
  'claude-sonnet-4-20250514',
  'claude'
);

// Analyze document
const result = await geminiService.analyzeSORDocument(
  documentContent,
  documentBinary,
  'application/pdf'
);

API Endpoint

POST /api/documents/analyze-sor
Content-Type: multipart/form-data

# Form fields:
# - file: PDF or text file
# - provider: 'gemini' or 'claude'
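
A client-side call to this endpoint might look like the sketch below. The field names `file` and `provider` come from the form fields listed above; the helper names `buildAnalyzeRequest` and `analyzeSor`, and the assumption that the endpoint responds with JSON, are illustrative only.

```typescript
type Provider = 'gemini' | 'claude';

// Builds the multipart request without sending it (hypothetical helper).
function buildAnalyzeRequest(file: Blob, filename: string, provider: Provider) {
  const body = new FormData();
  body.append('file', file, filename);
  body.append('provider', provider);
  // No Content-Type header is set here: fetch derives the multipart
  // boundary from the FormData body on its own.
  return { url: '/api/documents/analyze-sor', init: { method: 'POST', body } };
}

// Sends the request; the JSON response shape is an assumption.
async function analyzeSor(file: Blob, filename: string, provider: Provider): Promise<unknown> {
  const { url, init } = buildAnalyzeRequest(file, filename, provider);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`SOR analysis failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Setting `Content-Type: multipart/form-data` by hand would omit the boundary parameter, which is why the sketch leaves the header to the runtime.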

Test Interface

Visit /dashboard/sor-extraction/test to test the functionality with a user-friendly interface.

Technical Details

Gemini Processing Flow

  1. PDF Analysis: Checks whether the document is a PDF and determines its page count
  2. Chunking: Splits PDF into 5-page chunks using pdf-lib
  3. Sequential Processing:
    • Processes each chunk with Gemini API
    • First chunk focuses on document metadata
    • Subsequent chunks focus on construction sections
    • 6-second delay between requests (rate limiting)
  4. Result Stitching: Merges all chunk results into the final structure
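
The flow above can be sketched roughly as follows. Everything here is an assumption made for illustration: `ChunkResult` is a reduced stand-in for the real result type, `analyzeChunk` stands for whatever function calls the Gemini API, and sorting by `sectionNumber` is one plausible stitching rule rather than the service's confirmed behavior.

```typescript
// Reduced, hypothetical result shape; the real chunk result has more fields.
type ChunkResult = {
  documentMetadata?: Record<string, unknown>;
  constructionSections: Array<{ sectionNumber: number; sectionName: string }>;
};

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function processChunksSequentially<T>(
  chunks: T[],
  analyzeChunk: (chunk: T, index: number) => Promise<ChunkResult>,
  delayMs = 6000,
): Promise<ChunkResult> {
  const merged: ChunkResult = { constructionSections: [] };
  let index = 0;
  for (const chunk of chunks) {
    // Wait between requests, but not before the first one.
    if (index > 0) await sleep(delayMs);
    const result = await analyzeChunk(chunk, index);
    // Metadata is taken from the first chunk that reports any.
    if (!merged.documentMetadata && result.documentMetadata) {
      merged.documentMetadata = result.documentMetadata;
    }
    merged.constructionSections.push(...result.constructionSections);
    index++;
  }
  // Assumed stitching rule: keep sections in document order.
  merged.constructionSections.sort((a, b) => a.sectionNumber - b.sectionNumber);
  return merged;
}
```

A section that spans a chunk boundary would appear twice in this sketch; merging entries that share a `sectionNumber` is left out for brevity.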

Rate Limiting

  • Gemini has a 10 RPM (requests per minute) limit
  • A 6-second delay between chunk requests keeps the service at that limit (60 seconds / 10 requests)
  • Large documents may take several minutes to process
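
The delay arithmetic can be written out as a small calculation. The function names are hypothetical, and the totals cover only the time spent waiting between requests; the time Gemini takes to answer each request comes on top.

```typescript
// Minimum spacing between requests for a given requests-per-minute limit.
function minDelayMs(rpmLimit: number): number {
  if (rpmLimit <= 0) throw new RangeError('rpmLimit must be positive');
  return Math.ceil(60000 / rpmLimit);
}

// Total time spent waiting between chunk requests (excludes model latency).
function totalWaitMs(pageCount: number, pagesPerChunk: number, rpmLimit: number): number {
  const chunkCount = Math.ceil(pageCount / pagesPerChunk);
  return Math.max(0, chunkCount - 1) * minDelayMs(rpmLimit);
}

console.log(minDelayMs(10)); // 6000
console.log(totalWaitMs(20, 5, 10)); // 18000 (4 chunks, 3 waits)
console.log(totalWaitMs(100, 5, 10)); // 114000 (20 chunks, 19 waits)
```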

Error Handling

  • Graceful fallback for failed chunks
  • Detailed logging for debugging
  • Comprehensive error messages
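
One way a per-chunk fallback could work is to record each failure and continue with the remaining chunks. The sketch below is an assumption about the approach, not a description of the service: `analyzeWithFallback`, the `ChunkOutcome` type, and the exact wording of the log message are all hypothetical.

```typescript
type ChunkOutcome<R> =
  | { ok: true; index: number; value: R }
  | { ok: false; index: number; error: string };

async function analyzeWithFallback<C, R>(
  chunks: C[],
  analyzeChunk: (chunk: C) => Promise<R>,
): Promise<Array<ChunkOutcome<R>>> {
  const outcomes: Array<ChunkOutcome<R>> = [];
  let index = 0;
  for (const chunk of chunks) {
    try {
      outcomes.push({ ok: true, index, value: await analyzeChunk(chunk) });
    } catch (err) {
      // A failed chunk is logged and recorded; later chunks still run.
      const message = err instanceof Error ? err.message : String(err);
      console.error(`[Gemini SOR] Chunk ${index + 1}/${chunks.length} failed: ${message}`);
      outcomes.push({ ok: false, index, error: message });
    }
    index++;
  }
  return outcomes;
}
```

Keeping the failed indices lets a caller report which page ranges are missing from the stitched result, or retry only those chunks.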

Configuration

Environment Variables

# Gemini API Key
GEMINI_API_KEY=your_gemini_api_key_here

# Claude API Key (for comparison)
ANTHROPIC_API_KEY=your_claude_api_key_here
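
A small lookup can make a missing key fail early with a clear message. This is a sketch under assumptions: `resolveApiKey` and `KEY_VARIABLE` are hypothetical names, and only the two variable names come from the configuration above.

```typescript
type AiProvider = 'gemini' | 'claude';

// Maps each provider to the environment variable that holds its key.
const KEY_VARIABLE: Record<AiProvider, string> = {
  gemini: 'GEMINI_API_KEY',
  claude: 'ANTHROPIC_API_KEY',
};

function resolveApiKey(
  provider: AiProvider,
  env: Record<string, string | undefined>,
): string {
  const variable = KEY_VARIABLE[provider];
  const value = env[variable]?.trim();
  if (!value) {
    throw new Error(`Missing ${variable}; set it in the environment before using ${provider}.`);
  }
  return value;
}

// In a Node.js process this would typically be called as:
//   const key = resolveApiKey('gemini', process.env);
```

Passing the environment in as a parameter keeps the function easy to test and keeps the key itself out of source files.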

Model Configuration

// Available models
const GEMINI_MODEL = 'gemini-2.5-flash-preview-05-20';
const CLAUDE_MODEL = 'claude-sonnet-4-20250514';

// Limits
const GEMINI_RPM_LIMIT = 10;
const GEMINI_INPUT_TOKENS = 1048576;
const GEMINI_OUTPUT_TOKENS = 65536;

Data Structure

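The analysis returns a comprehensive SOR structure. The listing below is abbreviated shorthand: field types are omitted, and "etc." marks fields that are not listed here.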

type SORAnalysisResult = {
  sorDocument: {
    documentMetadata: {
      consultantInfo: { name, company, address, phone, email, consultantId, fileNumber };
      propertyInfo: { address, city, state, zipCode, lotSize, buildingSize, etc. };
      borrowerInfo: { name, address, phone, email, contactName, contactPhone };
      lenderInfo: { company, address, originator, loanNumber, loanType, etc. };
      inspectionInfo: { inspectionDate, estimatedMonthsToCompletion, etc. };
      contractorInfo: { name, company, address, phone, email };
    };
    constructionSections: Array<{
      sectionNumber: number;
      sectionName: string;
      subTotal: number;
      lineItems: Array<{
        itemName, location, level, details;
        materials: { quantity, unit, unitCost, total, taxMarginPercentage, etc. };
        labor: { quantity, unit, unitCost, total, taxMarginPercentage, etc. };
        // ... more fields
      }>;
    }>;
    recapSubtotals: { constructionSubTotals, constructionCostSubtotal };
    allowableFeesAndTotals: { constructionCostsSubtotal, allowableFees, grandTotal, etc. };
    finalAcceptance: { dateOfFinalAcceptance, signatures };
    additionalNotes: { consultantComments, stepByStepProcedures, abbreviationMeanings };
  };
};
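
As an example of consuming this structure, the sketch below totals the `subTotal` of each construction section. It uses only the `sectionNumber`, `sectionName`, and `subTotal` fields shown above; the reduced `SectionSummary` type, the sample section names and amounts, and the assumption that `subTotal` is a currency amount with two decimal places are all illustrative.

```typescript
// Reduced view of a construction section (hypothetical type name).
type SectionSummary = { sectionNumber: number; sectionName: string; subTotal: number };

function sumSectionSubtotals(sections: SectionSummary[]): number {
  // Work in cents to avoid floating-point drift when adding currency values.
  const cents = sections.reduce((total, s) => total + Math.round(s.subTotal * 100), 0);
  return cents / 100;
}

// Made-up sample values, for illustration only.
const sampleSections: SectionSummary[] = [
  { sectionNumber: 1, sectionName: 'Masonry', subTotal: 1250.5 },
  { sectionNumber: 2, sectionName: 'Siding', subTotal: 830.25 },
];
console.log(sumSectionSubtotals(sampleSections)); // 2080.75
```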

Performance Comparison

Feature           | Gemini 2.5 Flash        | Claude Sonnet 4
------------------|-------------------------|--------------------------
PDF Size Limit    | Unlimited (chunking)    | 32MB
Processing Speed  | Slower (rate limited)   | Faster
Token Limit       | 1M input / 65K output   | 200K context / 64K output
Cost              | Lower per token         | Higher per token
Context Awareness | Per-chunk               | Full document
Best For          | Large PDFs              | Smaller PDFs

Testing

  1. Visit /dashboard/sor-extraction/test
  2. Select AI provider (Gemini recommended)
  3. Upload SOR PDF document
  4. Click "Analyze Document"
  5. Review extracted data structure

Troubleshooting

Common Issues

  1. Rate Limiting: Gemini requests may be slow due to the 10 RPM limit
  2. Large PDFs: May take 5-10 minutes for documents with 20+ pages
  3. API Keys: Ensure both Gemini and Claude keys are configured
  4. PDF Format: Some scanned PDFs may not extract text properly

Debugging

The service logs progress messages of the following form, where X and Y stand for runtime values:

console.log('[Gemini SOR] Processing chunk X/Y...');
console.log('[Gemini SOR] Chunk response length: X characters');
console.log('[Gemini SOR] Stitching complete. Found X construction sections');

Future Enhancements

  • Parallel processing (respecting rate limits)
  • OCR integration for scanned PDFs
  • Progress callbacks for UI updates
  • Caching for repeated analyses
  • Validation of extracted data
  • Export to various formats (Excel, CSV, etc.)

API Key Information

Gemini API Key: supplied through the GEMINI_API_KEY environment variable (see Configuration); do not store the key in this document

  • Model: gemini-2.5-flash-preview-05-20
  • Input token limit: 1,048,576
  • Output token limit: 65,536
  • RPM limit: 10

This integration provides a robust solution for analyzing large SOR documents with intelligent chunking and comprehensive data extraction.