BeigeOCR Documentation

A .NET 8 Razor Pages application for OCR using the Claude API

Overview

BeigeOCR is a .NET 8 Razor Pages web application that uses Anthropic's Claude API to extract text from document images and PDFs. Named after Claude's beige color theme, it provides a simple interface for OCR (Optical Character Recognition) and automatically preprocesses images before they are sent to the API.

Features

Simple Web Interface

Upload images and PDFs through an easy-to-use interface

Multiple Format Support

Process JPEG, PNG, HEIC (with automatic conversion), and PDF documents

Image Preprocessing

Automatic enhancement of images for better text extraction:

  • Trim excess whitespace
  • Normalize and improve contrast
  • Resize large images
  • Convert to optimal format for processing

PDF Support

Direct processing of PDF documents with Claude's document understanding capabilities

Customizable Developer Prompt

A developer prompt, hidden from end users, guides Claude's analysis (set via AnthropicApi:DeveloperPrompt in appsettings.json)

JSON Response Formatting

Clean, formatted JSON responses

Access Control

Key-based authentication with time-limited sessions

Cost Estimation

Calculate API usage costs based on token consumption
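As a rough illustration of how a token-based estimate works, the sketch below multiplies input and output token counts by a per-million-token price. The function name estimate_cost and the prices in the example call are placeholders chosen for illustration, not BeigeOCR's actual code or Anthropic's current rates; check the current pricing page for real values.

```shell
# estimate_cost INPUT_TOKENS OUTPUT_TOKENS INPUT_PRICE OUTPUT_PRICE
# Prices are in USD per million tokens (hypothetical values in the example).
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" -v in_price="$3" -v out_price="$4" \
    'BEGIN { printf "%.6f\n", (in_tok * in_price + out_tok * out_price) / 1000000 }'
}

# 1500 input tokens and 400 output tokens at placeholder prices
# of 3.00 and 15.00 USD per million tokens.
estimate_cost 1500 400 3.00 15.00   # prints 0.010500
```

The token counts would typically come from the usage object returned with each Claude API response.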

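Prerequisites

  • .NET 8 SDK
  • An Anthropic API key
  • ImageMagick (for image preprocessing)
  • Ghostscript (for PDF preprocessing)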
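The Limitations section notes that ImageMagick and Ghostscript must be installed. A quick way to check for them on Linux or macOS is sketched below. The helper names have_tool and check_prereqs are made up for this example, and the command names (magick or convert for ImageMagick, gs for Ghostscript) are the usual ones on those platforms; Windows installs use different executable names.

```shell
# have_tool NAME: succeeds if NAME is found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# check_prereqs: report whether each external tool appears to be installed.
check_prereqs() {
  # ImageMagick 7 installs `magick`; ImageMagick 6 installs `convert`.
  if have_tool magick || have_tool convert; then
    echo "ImageMagick: found"
  else
    echo "ImageMagick: missing"
  fi

  if have_tool gs; then
    echo "Ghostscript: found"
  else
    echo "Ghostscript: missing"
  fi
}

check_prereqs
```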

Installation

1. Clone the repository:

git clone https://github.com/yourusername/BeigeOCR.git

2. Open the solution in Visual Studio 2022 or your preferred IDE.

3. Configure your appsettings.json file:

{
  "AnthropicApi": {
    "ApiKey": "your-api-key",
    "DeveloperPrompt": "Your custom prompt for Claude"
  },
  "AccessKeys": {
    "ValidKeys": [
      "test-key-1",
      "test-key-2",
      "demo-key-3"
    ]
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
  "AllowedHosts": "*"
}
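To keep a real API key out of a committed appsettings.json, the value can usually be supplied through an environment variable instead. The sketch below assumes the app uses the default ASP.NET Core configuration providers, where environment variables override appsettings.json and a double underscore stands in for the ":" section separator.

```shell
# Assumption: default ASP.NET Core configuration, so this variable
# overrides the "AnthropicApi" -> "ApiKey" entry in appsettings.json.
export AnthropicApi__ApiKey="your-api-key"

# Start the app from the same shell so it inherits the variable:
# dotnet run
```

For local development, the .NET user-secrets tool is another option for the same purpose.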

4. Restore the required NuGet packages:

dotnet restore

5. Build and run the application:

dotnet build
dotnet run

Usage

  1. Access the application through your web browser at https://localhost:7191 or http://localhost:5281 (default ports).
  2. Enter a valid access key (one of the values listed under AccessKeys:ValidKeys in appsettings.json).
  3. Upload a document image or PDF.
  4. Review the extracted text in the API Response section.

Dependencies

Limitations

  • Image processing requires ImageMagick installation
  • PDF preprocessing requires Ghostscript installation
  • Maximum file size is determined by Claude API limits
  • API costs are incurred for each document processed
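The preprocessing steps listed under Features (trim, contrast normalization, resize, format conversion) correspond to standard ImageMagick operations. The sketch below shows roughly equivalent command-line options. It is an illustration only: the preprocess function and the 2000-pixel default are assumptions, and BeigeOCR's own pipeline may use different settings or call ImageMagick through a library rather than the CLI.

```shell
# preprocess INPUT OUTPUT [MAX_EDGE]
# Assumptions: ImageMagick 7 (the `magick` command) is on PATH, and the
# 2000-pixel default for MAX_EDGE is an illustrative value, not the app's.
#
#   -trim +repage   remove uniform borders, then reset the canvas offset
#   -normalize      stretch contrast across the full intensity range
#   -resize WxH>    shrink only if the image exceeds W x H
#   OUTPUT          the file extension selects the output format
preprocess() {
  max_edge="${3:-2000}"
  magick "$1" \
    -trim +repage \
    -normalize \
    -resize "${max_edge}x${max_edge}>" \
    "$2"
}

# Example (requires ImageMagick):
# preprocess scan.jpg scan.png
```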

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License.