Gemini SOR Integration Documentation

Overview

This document describes the integration of Google's Gemini AI model for SOR (Statement of Repair) document analysis, featuring intelligent PDF page splitting to handle large documents within Gemini's token limits.

Features

🤖 Dual AI Provider Support

Claude Sonnet 4: Traditional approach, processes entire PDF at once
Gemini 2.5 Flash: New approach with PDF splitting for large documents

📄 PDF Page Splitting

Automatically splits PDFs into 5-page chunks
Processes each chunk separately through Gemini
Intelligently stitches results back together
Handles Gemini's 10 RPM rate limit with automatic delays
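
The splitting step can be illustrated with a small planning function. This is a minimal sketch, not the service's actual code: `planChunks` and `PageRange` are hypothetical names, and the only detail taken from this document is the 5-page chunk size.

```typescript
// Hypothetical helper (illustrative only): plans which pages go into each chunk.
interface PageRange {
  start: number; // zero-based index of the first page in the chunk
  end: number;   // zero-based index of the last page in the chunk (inclusive)
}

function planChunks(pageCount: number, chunkSize = 5): PageRange[] {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError('pageCount must be a non-negative integer');
  }
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  const ranges: PageRange[] = [];
  for (let start = 0; start < pageCount; start += chunkSize) {
    // The last chunk may be shorter than chunkSize.
    ranges.push({ start, end: Math.min(start + chunkSize, pageCount) - 1 });
  }
  return ranges;
}
```

With pdf-lib, each planned range could then be copied into its own document using `PDFDocument.copyPages`, which accepts a source document and an array of page indices.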

🎯 Enhanced Data Structure

Updated to support comprehensive SOR data extraction:

Document metadata (consultant, property, borrower, lender, inspection, contractor info)
Construction sections (1-35) with detailed line items
Materials and labor cost breakdowns
Recap subtotals and allowable fees
Final acceptance signatures and additional notes

Usage

Basic Usage

import { SORAnalysisService } from '@/utils/ai/sorAnalysis';

// Using Gemini (recommended for large PDFs)
const geminiService = new SORAnalysisService(
  'your-gemini-api-key',
  'gemini-2.5-flash-preview-05-20',
  'gemini'
);

// Using Claude (faster for smaller documents)
const claudeService = new SORAnalysisService(
  'your-claude-api-key',
  'claude-sonnet-4-20250514',
  'claude'
);

// Analyze document
const result = await geminiService.analyzeSORDocument(
  documentContent,
  documentBinary,
  'application/pdf'
);

API Endpoint

POST /api/documents/analyze-sor
Content-Type: multipart/form-data

# Form fields:
# - file: PDF or text file
# - provider: 'gemini' or 'claude'
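
A client call to this endpoint might look like the sketch below. The field names `file` and `provider` come from the list above; the helper names, the `baseUrl` parameter, and the assumption that the endpoint answers with JSON are illustrative and not confirmed by this document.

```typescript
type Provider = 'gemini' | 'claude';

// Builds the multipart body with the two documented form fields.
function buildAnalyzeForm(file: Blob, filename: string, provider: Provider): FormData {
  const form = new FormData();
  form.append('file', file, filename);
  form.append('provider', provider);
  return form;
}

// Hypothetical wrapper: posts the form and returns the parsed body.
async function analyzeSor(
  baseUrl: string,
  file: Blob,
  filename: string,
  provider: Provider,
): Promise<unknown> {
  const response = await fetch(`${baseUrl}/api/documents/analyze-sor`, {
    method: 'POST',
    // No Content-Type header here: fetch adds the multipart boundary itself.
    body: buildAnalyzeForm(file, filename, provider),
  });
  if (!response.ok) {
    throw new Error(`analyze-sor failed with HTTP ${response.status}`);
  }
  // Assumption: the endpoint responds with JSON. Its shape is left as `unknown`.
  return response.json();
}
```

Setting the `Content-Type` header by hand is a common mistake with multipart uploads, because the boundary parameter would then be missing.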

Test Interface

Visit /dashboard/sor-extraction/test to test the functionality with a user-friendly interface.

Technical Details

Gemini Processing Flow

PDF Analysis: Checks whether the document is a PDF and determines its page count
Chunking: Splits PDF into 5-page chunks using pdf-lib
Sequential Processing:
- Processes each chunk with Gemini API
- First chunk focuses on document metadata
- Subsequent chunks focus on construction sections
- 6-second delay between requests (rate limiting)
Result Stitching: Intelligently merges all chunk results into final structure
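
The stitching step could work along the lines sketched below. The types are trimmed-down stand-ins for the full SOR structure, `stitchChunks` is a hypothetical name, and the merge rules (metadata from the first chunk that has any, sections merged by `sectionNumber`, last non-zero subtotal wins) are assumptions rather than the documented behavior.

```typescript
interface LineItem { itemName: string; }
interface ConstructionSection {
  sectionNumber: number;
  sectionName: string;
  subTotal: number;
  lineItems: LineItem[];
}
interface ChunkResult {
  documentMetadata?: Record<string, unknown>;
  constructionSections: ConstructionSection[];
}
interface StitchedResult {
  documentMetadata: Record<string, unknown>;
  constructionSections: ConstructionSection[];
}

function stitchChunks(chunks: ChunkResult[]): StitchedResult {
  // Assumption: metadata appears on the first pages, so the first chunk that has it wins.
  const documentMetadata =
    chunks.find((c) => c.documentMetadata !== undefined)?.documentMetadata ?? {};

  const bySection = new Map<number, ConstructionSection>();
  for (const chunk of chunks) {
    for (const section of chunk.constructionSections) {
      const existing = bySection.get(section.sectionNumber);
      if (!existing) {
        // Copy so the input chunks are never mutated.
        bySection.set(section.sectionNumber, { ...section, lineItems: [...section.lineItems] });
      } else {
        // A section that spans a chunk boundary: append its remaining line items.
        existing.lineItems.push(...section.lineItems);
        // Assumption: the subtotal is printed where the section ends.
        if (section.subTotal !== 0) existing.subTotal = section.subTotal;
      }
    }
  }
  const constructionSections = Array.from(bySection.values()).sort(
    (a, b) => a.sectionNumber - b.sectionNumber,
  );
  return { documentMetadata, constructionSections };
}
```

Keying on `sectionNumber` matters because a 5-page boundary can fall in the middle of a section, which would otherwise show up twice in the merged output.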

Rate Limiting

Gemini has a limit of 10 requests per minute (RPM)
A 6-second delay between chunk requests keeps the service within that limit (60 s / 10 requests = 6 s)
Large documents may take several minutes to process
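
The delay follows directly from the RPM limit, as the sketch below shows. The function names are hypothetical, and the `sleep` parameter is injectable only so the loop can be exercised without real waiting.

```typescript
const GEMINI_RPM_LIMIT = 10;

// Smallest spacing between requests that stays within the limit: 60 000 ms / 10 = 6 000 ms.
function minDelayMs(rpmLimit: number): number {
  if (rpmLimit <= 0) throw new RangeError('rpmLimit must be positive');
  return Math.ceil(60_000 / rpmLimit);
}

// Time spent purely waiting between chunks (no delay is needed before the first one).
function totalDelayMs(chunkCount: number, rpmLimit: number = GEMINI_RPM_LIMIT): number {
  return Math.max(0, chunkCount - 1) * minDelayMs(rpmLimit);
}

// Runs the handler on each item in order, pausing between consecutive calls.
async function processSequentially<T, R>(
  items: T[],
  handler: (item: T, index: number) => Promise<R>,
  delayMs: number = minDelayMs(GEMINI_RPM_LIMIT),
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i++) {
    if (i > 0) await sleep(delayMs);
    results.push(await handler(items[i], i));
  }
  return results;
}
```

For example, a 20-page document yields four 5-page chunks, so the delays alone add 3 × 6 s = 18 s; the remainder of the processing time is the model's own response latency.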

Error Handling

Graceful fallback for failed chunks
Detailed logging for debugging
Comprehensive error messages
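
One way to get graceful fallback is to record each chunk's outcome instead of letting a single failure abort the run. The sketch below assumes that approach; `processWithFallback` and `ChunkOutcome` are hypothetical names, and the service's actual fallback policy is not described in this document.

```typescript
type ChunkOutcome<R> =
  | { index: number; ok: true; value: R }
  | { index: number; ok: false; error: string };

async function processWithFallback<T, R>(
  chunks: T[],
  analyze: (chunk: T) => Promise<R>,
): Promise<ChunkOutcome<R>[]> {
  const outcomes: ChunkOutcome<R>[] = [];
  for (let index = 0; index < chunks.length; index++) {
    try {
      outcomes.push({ index, ok: true, value: await analyze(chunks[index]) });
    } catch (err) {
      // Log the failure, keep the message, and continue with the next chunk.
      const message = err instanceof Error ? err.message : String(err);
      console.error(`[Gemini SOR] Chunk ${index + 1}/${chunks.length} failed: ${message}`);
      outcomes.push({ index, ok: false, error: message });
    }
  }
  return outcomes;
}
```

Keeping the failed indices lets the caller report which page ranges are missing from the merged result, or retry only those chunks.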

Configuration

Environment Variables

# Gemini API Key
GEMINI_API_KEY=your_gemini_api_key_here

# Claude API Key (for comparison)
ANTHROPIC_API_KEY=your_claude_api_key_here
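
Server code could resolve the right key for the selected provider as sketched below. The two variable names come from the listing above; `resolveApiKey` is a hypothetical helper, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
type Provider = 'gemini' | 'claude';

// Maps each provider to the environment variable documented above.
const KEY_VARIABLE: Record<Provider, string> = {
  gemini: 'GEMINI_API_KEY',
  claude: 'ANTHROPIC_API_KEY',
};

function resolveApiKey(
  provider: Provider,
  env: Record<string, string | undefined>,
): string {
  const variable = KEY_VARIABLE[provider];
  const key = env[variable]?.trim();
  if (!key) {
    // Fail early with the variable name, but never echo a key value.
    throw new Error(`${variable} is not set`);
  }
  return key;
}
```

In a Node.js process this would typically be called as `resolveApiKey('gemini', process.env)`.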

Model Configuration

// Available models
const GEMINI_MODEL = 'gemini-2.5-flash-preview-05-20';
const CLAUDE_MODEL = 'claude-sonnet-4-20250514';

// Limits
const GEMINI_RPM_LIMIT = 10;
const GEMINI_INPUT_TOKENS = 1048576;
const GEMINI_OUTPUT_TOKENS = 65536;

Data Structure

The analysis returns a comprehensive SOR structure:

type SORAnalysisResult = {
  sorDocument: {
    documentMetadata: {
      consultantInfo: { name, company, address, phone, email, consultantId, fileNumber };
      propertyInfo: { address, city, state, zipCode, lotSize, buildingSize, etc. };
      borrowerInfo: { name, address, phone, email, contactName, contactPhone };
      lenderInfo: { company, address, originator, loanNumber, loanType, etc. };
      inspectionInfo: { inspectionDate, estimatedMonthsToCompletion, etc. };
      contractorInfo: { name, company, address, phone, email };
    };
    constructionSections: Array<{
      sectionNumber: number;
      sectionName: string;
      subTotal: number;
      lineItems: Array<{
        itemName, location, level, details;
        materials: { quantity, unit, unitCost, total, taxMarginPercentage, etc. };
        labor: { quantity, unit, unitCost, total, taxMarginPercentage, etc. };
        // ... more fields
      }>;
    }>;
    recapSubtotals: { constructionSubTotals, constructionCostSubtotal };
    allowableFeesAndTotals: { constructionCostsSubtotal, allowableFees, grandTotal, etc. };
    finalAcceptance: { dateOfFinalAcceptance, signatures };
    additionalNotes: { consultantComments, stepByStepProcedures, abbreviationMeanings };
  };
};
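
As an example of consuming this structure, the sketch below flags sections whose stated `subTotal` does not match their line items. It uses trimmed-down types and a hypothetical function name, and it assumes a section subtotal equals the sum of each item's `materials.total` and `labor.total`; real SOR forms may apply tax or margin differently.

```typescript
interface CostBreakdown { total: number; }
interface LineItem {
  itemName: string;
  materials: CostBreakdown;
  labor: CostBreakdown;
}
interface ConstructionSection {
  sectionNumber: number;
  sectionName: string;
  subTotal: number;
  lineItems: LineItem[];
}
interface SubtotalMismatch {
  sectionNumber: number;
  stated: number;
  computed: number;
}

function findSubtotalMismatches(
  sections: ConstructionSection[],
  tolerance = 0.01, // allows for rounding to the nearest cent
): SubtotalMismatch[] {
  return sections
    .map((section) => {
      // Assumption: subtotal = sum of materials and labor totals for the section.
      const computed = section.lineItems.reduce(
        (sum, item) => sum + item.materials.total + item.labor.total,
        0,
      );
      return { sectionNumber: section.sectionNumber, stated: section.subTotal, computed };
    })
    .filter((row) => Math.abs(row.stated - row.computed) > tolerance);
}
```

A check like this is useful after per-chunk extraction, since a section split across two chunks is the most likely place for a line item to be dropped or counted twice.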

Performance Comparison

Feature	Gemini 2.5 Flash	Claude Sonnet 4
PDF Size Limit	No fixed cap (document is split into chunks)	32MB
Processing Speed	Slower (rate limited)	Faster
Token Limit	1M input / 65K output	200K context window (standard)
Cost	Lower per token	Higher per token
Context Awareness	Per-chunk	Full document
Best For	Large PDFs	Smaller PDFs

Testing

Visit /dashboard/sor-extraction/test
Select AI provider (Gemini recommended)
Upload SOR PDF document
Click "Analyze Document"
Review extracted data structure

Troubleshooting

Common Issues

Rate Limiting: Gemini processing may be slow because of the 10 RPM limit and the 6-second delay between chunks
Large PDFs: May take 5-10 minutes for documents with 20+ pages
API Keys: Ensure both Gemini and Claude keys are configured
PDF Format: Some scanned PDFs may not extract text properly

Debugging

Enable detailed logging with statements such as the following (X and Y are placeholders for runtime values):

console.log('[Gemini SOR] Processing chunk X/Y...');
console.log('[Gemini SOR] Chunk response length: X characters');
console.log('[Gemini SOR] Stitching complete. Found X construction sections');

Future Enhancements

Parallel processing (respecting rate limits)
OCR integration for scanned PDFs
Progress callbacks for UI updates
Caching for repeated analyses
Validation of extracted data
Export to various formats (Excel, CSV, etc.)

API Key Information

Gemini API Key: supply via the GEMINI_API_KEY environment variable (never commit a real key to documentation or source control)

Model: gemini-2.5-flash-preview-05-20
Input token limit: 1,048,576
Output token limit: 65,536
RPM limit: 10

This integration analyzes large SOR documents by splitting them into 5-page chunks, processing each chunk through Gemini, and merging the results into a single structured record.