Document Management

Uploading, managing, and organizing documents

Uploading Documents

You can upload documents to collections in several ways, depending on your needs.

From Collection View

  1. Navigate to the collection where you want to add documents
  2. Click "Upload Document" or drag and drop files into the upload area
  3. Select one or more files from your computer
  4. Wait for the upload and processing to complete

The system automatically extracts text from supported file formats and calculates metadata such as character count and token count.
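The metadata calculation described above can be sketched roughly as follows. This is an illustration only: the names `TextStats` and `compute_text_stats` are hypothetical, and the token count uses a simple regular expression as a stand-in, since the tokenizer the tool actually uses is not stated here.

```python
import re
from dataclasses import dataclass


@dataclass
class TextStats:
    character_count: int
    word_count: int
    token_count: int


def compute_text_stats(text: str) -> TextStats:
    """Compute simple size statistics for extracted document text."""
    # Characters: every code point, including whitespace and newlines.
    character_count = len(text)
    # Words: runs of non-whitespace characters.
    word_count = len(text.split())
    # Tokens: ASSUMPTION, a rough approximation that counts word runs and
    # individual punctuation marks. A real tokenizer will give other numbers.
    token_count = len(re.findall(r"\w+|[^\w\s]", text))
    return TextStats(character_count, word_count, token_count)


stats = compute_text_stats("Hello, world! This is a test.")
print(stats)
```

Counts produced this way are only comparable with each other; token counts in particular vary with the tokenizer used.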

Pasting Text Directly

  1. Navigate to the collection
  2. Click "New Document from Text"
  3. Enter a title and optional description
  4. Paste your text into the rich text editor
  5. Click "Save" to create the document

This is useful for quickly adding text without creating a file first.

Supported Formats

The Universal Annotation Tool supports multiple document formats:

Plain Text (.txt)

Plain text files are uploaded directly. The content is stored as-is in the database.

  • No processing required
  • Fastest upload time
  • Preserves exact text content

PDF (.pdf)

PDF files are processed to extract text content. The original PDF is stored for download.

  • Text extraction using pdf-parse
  • Page breaks are preserved or marked in the extracted text
  • Original PDF stored as binary data
  • Metadata extracted (title, author, etc.)

Word Documents (.docx)

DOCX files are processed to extract text while preserving formatting information.

  • Text extraction using mammoth
  • Formatting information preserved
  • Tables handled appropriately
  • Original file stored for download

Note: For best results with PDFs and DOCX files, ensure the documents contain selectable text (not just images). Scanned documents may require OCR preprocessing.
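The format list above can be summarized as a small lookup table. The sketch below is hypothetical: the extensions and library names come from this page, but the function names and the case-insensitive extension matching are assumptions, and it does not call pdf-parse or mammoth itself.

```python
from pathlib import Path
from typing import Optional

# Extension -> text extraction library named in this guide.
# None means no extraction step (plain text is stored as-is).
EXTRACTORS = {
    ".txt": None,
    ".pdf": "pdf-parse",
    ".docx": "mammoth",
}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    # ASSUMPTION: extensions are matched case-insensitively.
    return Path(filename).suffix.lower() in EXTRACTORS


def extractor_for(filename: str) -> Optional[str]:
    """Return the extractor name for a file, or None for plain text."""
    suffix = Path(filename).suffix.lower()
    if suffix not in EXTRACTORS:
        raise ValueError(f"Unsupported format: {suffix or '(no extension)'}")
    return EXTRACTORS[suffix]


for name in ["notes.txt", "Report.PDF", "draft.docx", "scan.png"]:
    print(name, "supported" if is_supported(name) else "not supported")
```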

Document Preview

Before annotating, you can preview documents to verify their content and metadata.

Viewing Document Preview

  1. Navigate to the collection containing the document
  2. Click on a document in the list
  3. The preview modal will open showing:
    • Full document text
    • Document metadata (size, character count, word count)
    • Creation date and creator
    • Language detection (if available)

Actions from Preview

From the preview modal, you can:

  • Click "Start Annotating" to open the annotation workspace
  • Download the original file (if uploaded)
  • View quick statistics (character count, word count, token count)

Editing Documents

You can edit a document's text to fix OCR errors or typos, or to make other corrections.

How to Edit

  1. Open the document in the annotation workspace or document view
  2. Click "Edit" or enter edit mode
  3. Make your changes to the text
  4. Click "Save" to apply changes

Versioning

When you edit a document:

  • A new version is automatically created
  • The text hash is updated to reflect changes
  • Existing annotations remain linked to the original text version
  • You can view version history and restore previous versions
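As an illustration of how a text hash can drive version creation, here is a minimal sketch. It assumes SHA-256 as the hash and skips creating a version when the text is unchanged; neither detail is confirmed by this guide, and the `Document` and `Version` classes are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


def compute_digest(text: str) -> str:
    """Hash the document text (ASSUMPTION: SHA-256 over UTF-8 bytes)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Version:
    number: int
    text: str
    digest: str


@dataclass
class Document:
    versions: List[Version] = field(default_factory=list)

    def save(self, text: str) -> Version:
        """Append a new version, unless the text hash is unchanged."""
        digest = compute_digest(text)
        # ASSUMPTION: saving identical text does not create a new version.
        if self.versions and self.versions[-1].digest == digest:
            return self.versions[-1]
        version = Version(len(self.versions) + 1, text, digest)
        self.versions.append(version)
        return version


doc = Document()
v1 = doc.save("Teh quick brown fox.")
v2 = doc.save("The quick brown fox.")  # typo fixed, so a new version is created
print(v1.number, v2.number, v1.digest != v2.digest)
```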

Impact on Annotations

When document text changes:

  • Annotations are preserved and linked to their original text positions
  • If text at annotation positions changes significantly, you may need to review and adjust annotations
  • The changelog shows what changed between versions
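One way to find annotations that need review after an edit is to compare the text each annotation originally covered with the text now at the same offsets. The sketch below assumes annotations store character offsets plus the quoted text; the `Annotation` fields and the function name are hypothetical, not the tool's data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Annotation:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    quote: str  # text the annotation covered when it was created


def annotations_needing_review(
    new_text: str, annotations: List[Annotation]
) -> List[Annotation]:
    """Return annotations whose span no longer matches the quoted text."""
    return [a for a in annotations if new_text[a.start:a.end] != a.quote]


annotations = [
    Annotation(4, 7, "cat"),
    Annotation(19, 22, "mat"),
]
edited_text = "The cat sat on the rug."
flagged = annotations_needing_review(edited_text, annotations)
print(flagged)
```

An exact comparison like this flags any change inside the span, including small ones, so it tends to over-report. Edits that insert or delete text earlier in the document shift every later offset, which this simple check will also flag.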

Document Versioning

The system automatically tracks document versions, allowing you to see the history of changes and restore previous versions.

Viewing Version History

  1. Open the document view
  2. Navigate to the "Versions" tab or section
  3. View the list of all versions with:
    • Version number
    • Who made the change
    • When the change was made
    • Optional description of changes

Comparing Versions

You can view differences between versions:

  • Select two versions to compare
  • View a diff showing added, removed, and modified text
  • See which annotations were created in each version
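A comparison between two versions can be approximated with a line-based diff. The sketch below uses Python's standard `difflib` module purely as an illustration; it is not a description of how the tool computes its diffs, and `diff_versions` is a hypothetical helper.

```python
import difflib
from typing import List


def diff_versions(old_text: str, new_text: str,
                  old_label: str = "old", new_label: str = "new") -> List[str]:
    """Return a unified diff of two document versions, line by line."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


version_1 = "line one\nline two\nline three"
version_2 = "line one\nline 2\nline three\nline four"
diff = diff_versions(version_1, version_2, "v1", "v2")
print("\n".join(diff))
```

In a unified diff, lines starting with `-` were removed and lines starting with `+` were added; a modified line appears as a removal followed by an addition.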

Restoring Versions

To restore a previous version:

  1. View the version history
  2. Select the version you want to restore
  3. Click "Restore" or "Revert to This Version"
  4. Confirm the restoration

Restoring creates a new version with the restored content, preserving the version history.
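The restore behavior described above (append a new version rather than rewind the history) can be sketched as follows. The `Version` class, the `restore_version` function, and the wording of the generated description are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Version:
    number: int
    text: str
    description: str = ""


def restore_version(history: List[Version], number: int) -> Version:
    """Restore an earlier version by appending a copy of it to the history."""
    source = next((v for v in history if v.number == number), None)
    if source is None:
        raise ValueError(f"No version {number} in history")
    restored = Version(
        number=history[-1].number + 1,
        text=source.text,
        description=f"Restored from version {number}",
    )
    # Earlier versions are left untouched, so the full history is preserved.
    history.append(restored)
    return restored


history = [Version(1, "first draft"), Version(2, "second draft")]
restored = restore_version(history, 1)
print(restored.number, restored.text, len(history))
```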