Document-GPT

Document-GPT combines OpenAI’s ChatGPT API, Tesseract.js, and the Next.js framework into a document analysis tool. The integration is kept simple and easy to follow, which also makes it easy to build on top of.

Imagine you have a stack of documents that need quick analysis and information extraction. With Document-GPT, you upload pictures of the documents and the AI-driven engine takes care of the heavy lifting. Snapping a photo of a document is also far easier than scanning individual pages into a PDF.

What sets Document-GPT apart is its versatility. Users can ask questions about a document and get answers drawn from its text. For those who need specific information in a structured format, Document-GPT generates a JSON object with user-defined keys, which simplifies data organization.

Now, let’s look at how Document-GPT works. At its core, the system uses OpenAI’s ChatGPT API for natural language processing. Tesseract.js acts as the optical character recognition (OCR) engine, converting images into text locally without any third-party service. The Next.js framework serves as the foundation, providing a user-friendly web interface and a developer-friendly environment to work in.
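The flow described above (OCR first, then a call to the language model) can be sketched as a small pipeline. This is only an illustration, not the project's actual code: the names analyzeDocument, Ocr, and Ask are hypothetical, and the OCR and model calls are passed in as functions so the sketch stays independent of how Tesseract.js and the ChatGPT API are wired up in the real project.

```typescript
// Hypothetical function types. In the real project these would be backed by
// Tesseract.js (OCR) and the ChatGPT API (ask); here they are injected.
type Ocr = (image: string) => Promise<string>;
type Ask = (ocrText: string, question: string) => Promise<string>;

// Run OCR on every uploaded image, then pass the combined text and the
// user's question to the language model.
async function analyzeDocument(
  images: string[],
  question: string,
  ocr: Ocr,
  ask: Ask,
): Promise<string> {
  // Pages are recognized in parallel; Promise.all preserves the input order.
  const pages = await Promise.all(images.map((image) => ocr(image)));
  // Assumption: pages are joined with a blank line between them.
  const ocrText = pages.join("\n\n");
  return ask(ocrText, question);
}
```

Injecting the two functions also makes the pipeline easy to exercise with stubs before any API token is configured.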

Importance Of Prompts

Prompts are vital for this project to succeed. If a prompt is not engineered correctly, incorrect information could be communicated to the user, or improperly formatted JSON objects could be returned.

Chat Prompt:

Given the following raw OCR output of a document please answer the following question in a short direct form: ${question}

Example question: “what bank is this mortgage through?”
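As a rough sketch, the chat prompt could be turned into a list of chat messages like this. The role/content message shape follows the ChatGPT API's chat format, but the post does not show how the OCR text is attached to the prompt, so splitting the instruction and the OCR text into a system and a user message is an assumption, as is the helper name buildChatMessages.

```typescript
// Shape of a single chat message in the ChatGPT API's chat format.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: builds the messages for a question about a document.
function buildChatMessages(question: string, ocrText: string): ChatMessage[] {
  // Prompt wording taken from the chat prompt above.
  const instruction =
    "Given the following raw OCR output of a document please answer " +
    `the following question in a short direct form: ${question}`;
  return [
    // Assumption: the instruction goes in the system message...
    { role: "system", content: instruction },
    // ...and the raw OCR text is sent as the user message.
    { role: "user", content: ocrText },
  ];
}
```

For example, buildChatMessages("what bank is this mortgage through?", ocrText) yields two messages: the instruction ending with the question, followed by the OCR text.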

JSON Prompt:

Given the following OCR text, create a JSON object that is constructed as: ${objectPrompt}. Do not include any additional information that is not asked for. If you don’t know the answer, mark it as null.

Example objectPrompt: “{mortgage: string, mortgage_amount: number, mortgage_due_date: date}”
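Since an improperly formatted JSON object is one of the failure modes mentioned earlier, it can help to parse the model's reply defensively. The sketch below is one possible approach, not the project's implementation: buildJsonPrompt and parseJsonReply are hypothetical names, and the assumption is that the reply may contain extra text around the JSON object.

```typescript
// Hypothetical helper: fills the user-defined object shape into the JSON prompt.
function buildJsonPrompt(objectPrompt: string): string {
  return (
    "Given the following OCR text, create a JSON object that is constructed as: " +
    `${objectPrompt}. Do not include any additional information that is not ` +
    "asked for. If you don’t know the answer, mark it as null."
  );
}

// Hypothetical helper: extracts and parses the JSON object from a model reply.
// Returns null when no valid JSON object can be found.
function parseJsonReply(reply: string): Record<string, unknown> | null {
  // Assumption: the object spans from the first "{" to the last "}".
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    const parsed: unknown = JSON.parse(reply.slice(start, end + 1));
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>;
    }
    return null;
  } catch {
    // JSON.parse throws on malformed input, e.g. unquoted keys.
    return null;
  }
}
```

Returning null instead of throwing lets the caller decide whether to retry the request or show an error to the user.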


Running Locally

Running Document-GPT is super easy!

  1. (Optional) Create a .env file
    • Add your OpenAI API token: OPEN_AI_API_TOKEN=...
    • Source your .env file
  2. Install dependencies by running yarn
  3. Start the app locally by running yarn dev
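Because the .env step is marked optional, server-side code has to cope with the token being absent. The post does not say how the project handles that case, so the following is only a hypothetical sketch: readApiToken is an invented helper that takes the environment as a parameter (in a Node.js or Next.js server context this would typically be process.env).

```typescript
// A plain key/value view of the environment, e.g. process.env in Node.js.
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the configured token, or undefined when the
// optional .env step was skipped or the value is blank.
function readApiToken(env: Env): string | undefined {
  const token = env["OPEN_AI_API_TOKEN"]?.trim();
  return token ? token : undefined;
}
```

A caller can then check the result once at startup and report a clear message if no token is available, rather than failing on the first API request.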

Thanks for sticking around and learning about Document-GPT!
