Document Management
Uploading, managing, and organizing documents
Uploading Documents
You can upload documents to collections in several ways, depending on your needs.
From Collection View
- Navigate to the collection where you want to add documents
- Click "Upload Document" or drag and drop files into the upload area
- Select one or more files from your computer
- Wait for the upload and processing to complete
The system automatically extracts text from supported file formats and calculates metadata such as character count and token count.
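As a rough illustration of the metadata step, the sketch below computes character, word, and token counts for a piece of extracted text. The function name `compute_text_stats` and the regex-based token rule are assumptions made for this example; this guide does not specify the tool's tokenizer, so real token counts may differ.

```python
import re

def compute_text_stats(text: str) -> dict:
    """Return simple size metadata for a document's extracted text."""
    words = text.split()
    # Assumption: a token is a run of word characters or a single punctuation mark.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    return {
        "character_count": len(text),
        "word_count": len(words),
        "token_count": len(tokens),
    }

stats = compute_text_stats("Hello, world! This is a test.")
print(stats)
```

Word count splits on whitespace only, so punctuation stays attached to words, while the token rule counts each punctuation mark separately.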
Pasting Text Directly
- Navigate to the collection
- Click "New Document from Text"
- Enter a title and optional description
- Paste your text into the rich text editor
- Click "Save" to create the document
This is useful for quickly adding text without creating files first.
Supported Formats
The Universal Annotation Tool supports the following document formats:
Plain Text (.txt)
Plain text files are uploaded directly. The content is stored as-is in the database.
- No processing required
- Fastest upload time
- Preserves exact text content
PDF (.pdf)
PDF files are processed to extract text content. The original PDF is stored for download.
- Text extraction using pdf-parse
- Page breaks are preserved or marked in the extracted text
- Original PDF stored as binary data
- Metadata extracted (title, author, etc.)
Word Documents (.docx)
DOCX files are processed to extract text while preserving formatting information.
- Text extraction using mammoth
- Formatting information preserved
- Tables handled appropriately
- Original file stored for download
Note: For best results with PDFs and DOCX files, ensure the documents contain selectable text (not just images). Scanned documents may require OCR preprocessing.
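The per-format handling described in this section can be summarized as a lookup keyed by file extension. This is a hypothetical sketch: `EXTRACTORS` and `describe_handling` are illustrative names, not part of the tool, and the table only restates which extraction library this section names for each format.

```python
from pathlib import Path

# Restates the format notes in this section; keys are lowercase file extensions.
# None means the text is stored as-is with no extraction step.
EXTRACTORS = {
    ".txt": None,
    ".pdf": "pdf-parse",
    ".docx": "mammoth",
}

def describe_handling(filename: str) -> str:
    """Describe how a file would be handled, based only on its extension."""
    ext = Path(filename).suffix.lower()
    if ext not in EXTRACTORS:
        raise ValueError(f"Unsupported format: {ext or 'no extension'}")
    extractor = EXTRACTORS[ext]
    if extractor is None:
        return "stored as-is"
    return f"text extracted with {extractor}"

print(describe_handling("Report.PDF"))
```

Checking the extension says nothing about whether a PDF or DOCX file contains selectable text, so scanned documents still need the OCR preprocessing mentioned in the note.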
Document Preview
Before annotating, you can preview documents to verify their content and metadata.
Viewing Document Preview
- Navigate to the collection containing the document
- Click on a document in the list
- The preview modal will open showing:
  - Full document text
  - Document metadata (size, character count, word count)
  - Creation date and creator
  - Language detection (if available)
Actions from Preview
From the preview modal, you can:
- Click "Start Annotating" to open the annotation workspace
- Download the original file (available only if the document was created from an uploaded file)
- View quick statistics (character count, word count, token count)
Editing Documents
You can edit a document's text to fix OCR errors and typos or to make other corrections.
How to Edit
- Open the document in the annotation workspace or document view
- Click "Edit" to enter edit mode
- Make your changes to the text
- Click "Save" to apply changes
Versioning
When you edit a document:
- A new version is automatically created
- The text hash is updated to reflect changes
- Existing annotations remain linked to the original text version
- You can view version history and restore previous versions
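A minimal sketch of the behavior listed above, assuming the text hash is a SHA-256 digest of the UTF-8 text. The hash algorithm is not stated in this guide, and the `Document` and `Version` classes are hypothetical stand-ins for whatever the tool stores internally.

```python
import hashlib
from dataclasses import dataclass, field

def text_hash(text: str) -> str:
    # Assumption: SHA-256 over UTF-8 bytes; the tool may use a different hash.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

@dataclass
class Version:
    number: int
    text: str
    text_hash: str

@dataclass
class Document:
    versions: list = field(default_factory=list)

    def save(self, text: str) -> Version:
        """Append a new version; earlier versions are never modified."""
        version = Version(len(self.versions) + 1, text, text_hash(text))
        self.versions.append(version)
        return version

    @property
    def current(self) -> Version:
        return self.versions[-1]

doc = Document()
v1 = doc.save("Teh quick brown fox")
v2 = doc.save("The quick brown fox")  # fix a typo
print(v1.text_hash != v2.text_hash)
```

Because each save appends rather than overwrites, an annotation can keep pointing at the version number it was created against.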
Impact on Annotations
When document text changes:
- Annotations are preserved and linked to their original text positions
- If text at annotation positions changes significantly, you may need to review and adjust annotations
- The changelog shows what changed between versions
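One way to find annotations that need review after an edit is to store the annotated text alongside its character offsets and compare it with the new version. The sketch below assumes annotations are stored as start/end character offsets plus the quoted text. This guide does not describe the tool's annotation model, so `Annotation` and `needs_review` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    quote: str  # text that was annotated in the original version

def needs_review(annotation: Annotation, new_text: str) -> bool:
    """True if the text at the annotation's offsets no longer matches its quote."""
    return new_text[annotation.start:annotation.end] != annotation.quote

original = "The patient recieved aspirin."
edited = "The patient received aspirin."  # typo fixed, same length

verb = Annotation(12, 20, original[12:20])  # "recieved"
drug = Annotation(21, 28, original[21:28])  # "aspirin"

print(needs_review(verb, edited), needs_review(drug, edited))
```

Here only the annotation over the corrected word is flagged. An edit that inserts or deletes characters would shift later offsets, so annotations after the change would also be flagged under this simple check.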
Document Versioning
The system automatically tracks document versions, allowing you to see the history of changes and restore previous versions.
Viewing Version History
- Open the document view
- Navigate to the "Versions" tab or section
- View the list of all versions with:
  - Version number
  - Who made the change
  - When the change was made
  - Optional description of changes
Comparing Versions
You can view differences between versions:
- Select two versions to compare
- View a diff showing added, removed, and modified text
- See which annotations were created in each version
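A comparison like the one described above can be approximated with Python's standard `difflib` module. This is a sketch of a word-level diff, not the tool's own diff algorithm, and `diff_words` is a hypothetical helper.

```python
import difflib

def diff_words(old: str, new: str) -> list:
    """Return (operation, old_words, new_words) tuples for each changed region."""
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # tag is "insert" (added), "delete" (removed), or "replace" (modified)
        changes.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes

changes = diff_words("The quick brown fox", "The slow brown fox jumps")
for change in changes:
    print(change)
```

The three opcode tags map directly onto the added, removed, and modified categories shown in the diff view.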
Restoring Versions
To restore a previous version:
- View the version history
- Select the version you want to restore
- Click "Restore" or "Revert to This Version"
- Confirm the restoration
Restoring creates a new version with the restored content, preserving the version history.
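The restore behavior can be sketched as an append-only list: restoring copies the old content into a new entry instead of truncating the history. `restore_version` and the list-of-strings history are hypothetical simplifications for illustration.

```python
def restore_version(history: list, version_number: int) -> list:
    """Return a new history with the chosen version's text appended as the latest version.

    Version numbers are 1-based; existing entries are left untouched.
    """
    if not 1 <= version_number <= len(history):
        raise ValueError(f"No such version: {version_number}")
    return history + [history[version_number - 1]]

history = ["draft", "draft with typo fixes", "accidental deletion"]
restored = restore_version(history, 2)
print(len(restored), restored[-1])
```

After the restore, the unwanted third version is still in the history, so the restore itself can be undone the same way.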
Full-Text Search
Full-text search helps you find documents by name, content, or creator name.
Searching Documents
- Use the search bar on the dashboard or collection view
- Enter your search query
- Results list the documents where your query matches any of these fields:
  - Document name
  - Document content
  - Creator name
Search Features
The search supports:
- Case-insensitive matching
- Partial word matching
- Search across all collections or filter by collection
- Results sorted by relevance
- Highlighted search terms in result snippets
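To illustrate the features listed above, here is a small sketch of case-insensitive substring search with highlighted snippets and a naive relevance score. The scoring rule (number of matches), the `**` highlight markers, and the function name are assumptions; the tool's actual ranking is not documented in this guide.

```python
import re

def search(documents: dict, query: str, context: int = 20) -> list:
    """Return (name, match_count, snippet) tuples, most matches first."""
    if not query:
        return []
    # re.escape makes the query a literal substring, so partial words match.
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    results = []
    for name, text in documents.items():
        matches = list(pattern.finditer(text))
        if not matches:
            continue
        first = matches[0]
        start = max(0, first.start() - context)
        end = min(len(text), first.end() + context)
        # Highlight every occurrence inside the snippet window.
        snippet = pattern.sub(lambda m: f"**{m.group(0)}**", text[start:end])
        results.append((name, len(matches), snippet))
    # Assumption: relevance is approximated by match count.
    return sorted(results, key=lambda r: r[1], reverse=True)

docs = {
    "a.txt": "Annotation tools help annotators.",
    "b.txt": "Nothing relevant here.",
    "c.txt": "An annotated corpus.",
}
hits = search(docs, "annot")
for hit in hits:
    print(hit)
```

The query "annot" matches "Annotation", "annotators", and "annotated", which shows both case-insensitive and partial word matching.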
Filtering Results
You can filter search results by:
- Collection
- Document status (annotated, pending)
- Date range
- Creator
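The filters above combine naturally as optional conditions that must all hold. In this sketch the `SearchResult` fields, the sample data, and the `filter_results` signature are hypothetical; only the four filter dimensions come from this guide.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class SearchResult:
    name: str
    collection: str
    status: str    # "annotated" or "pending"
    created: date
    creator: str

def filter_results(results, collection=None, status=None,
                   date_from=None, date_to=None, creator=None):
    """Keep results that satisfy every filter that was provided (None = no filter)."""
    def keep(r):
        return (
            (collection is None or r.collection == collection)
            and (status is None or r.status == status)
            and (date_from is None or r.created >= date_from)
            and (date_to is None or r.created <= date_to)
            and (creator is None or r.creator == creator)
        )
    return [r for r in results if keep(r)]

# Made-up sample data for the example.
results = [
    SearchResult("intro.txt", "Pilot", "annotated", date(2024, 1, 10), "ana"),
    SearchResult("notes.pdf", "Pilot", "pending", date(2024, 3, 5), "ben"),
    SearchResult("memo.docx", "Main", "pending", date(2024, 3, 20), "ana"),
]
pending_march = filter_results(
    results, status="pending",
    date_from=date(2024, 3, 1), date_to=date(2024, 3, 31),
)
print([r.name for r in pending_march])
```

Leaving a filter unset skips that condition, so calling `filter_results(results)` with no filters returns every result.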