Overview Features Prerequisites Installation Usage Dependencies Limitations Contributing License

BeigeOCR Documentation

A .NET 8 Razor Pages application for OCR using Claude API

Overview

BeigeOCR is a .NET 8 Razor Pages web application that utilizes Anthropic's Claude API to extract text from document images and PDFs. Named after Claude's beige color theme, this application provides a simple interface for OCR (Optical Character Recognition) with advanced preprocessing capabilities.

Features

Simple Web Interface

Upload images and PDFs through an easy-to-use interface

Multiple Format Support

Process JPEG, PNG, HEIC (with automatic conversion), and PDF documents

Image Preprocessing

Automatic enhancement of images for better text extraction:

Trim excess whitespace
Normalize and improve contrast
Resize large images
Convert to optimal format for processing

PDF Support

Direct processing of PDF documents with Claude's document understanding capabilities

Customizable Developer Prompt

Hidden prompts guide Claude's analysis (configured in settings)

JSON Response Formatting

Clean, formatted JSON responses

Access Control

Key-based authentication with time-limited sessions

Cost Estimation

Calculate API usage costs based on token consumption

Prerequisites

.NET 8 SDK
Visual Studio 2022 or other compatible IDE
Anthropic Claude API key
ImageMagick installed on the server (for advanced image preprocessing)
Ghostscript installed on the server (for PDF processing)

Installation

1. Clone the repository:

git clone https://github.com/yourusername/BeigeOCR.git

2. Open the solution in Visual Studio 2022 or your preferred IDE.

3. Configure your appsettings.json file:

{
  "AnthropicApi": {
    "ApiKey": "your-api-key",
    "DeveloperPrompt": "Your custom prompt for Claude"
  },
  "AccessKeys": {
    "ValidKeys": [
      "test-key-1",
      "test-key-2",
      "demo-key-3"
    ]
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
  "AllowedHosts": "*"
}

4. Install required NuGet packages:

dotnet restore

5. Build and run the application:

dotnet build
dotnet run

Usage

Access the application through your web browser at https://localhost:7191 or http://localhost:5281 (default ports).
Enter a valid access key (configured in your appsettings.json).
Upload a document image or PDF.
Review the extracted text in the API Response section.

Dependencies

Magick.NET: .NET wrapper for ImageMagick
Newtonsoft.Json: JSON processing library
Anthropic Claude API: AI service for text extraction

Limitations

Image processing requires ImageMagick installation
PDF preprocessing requires Ghostscript installation
Maximum file size determined by Claude API limits
API costs are incurred for each document processed

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License.