SureLink discovers and indexes text from your local files and feeds the most relevant excerpts into a large language model (LLM) during a query. This is a retrieval‑augmented generation (RAG) workflow designed for answers with citations, not a full‑text replacement for your DMS or BI tools.
1) Content & file‑handling limits
- Text extractability matters. SureLink performs best with machine‑readable text (PDF, DOCX, Markdown, HTML, TXT). Scanned PDFs, images, handwriting, and diagrams need OCR or Image Description turned on; without it, the system will index only the file's metadata.
- Formatting fidelity is lossy. Complex tables, formulas, pivot sheets, tracked changes, comments, and footnotes may lose structure when converted to plain text, which can reduce retrieval precision.
- Unsupported or restricted files. Password‑protected/encrypted files, very large compressed archives, or files with embedded binaries may be skipped by the indexer.
- Links & attachments. Embedded links or referenced files outside the indexed scope won’t be followed unless those targets are also linked and indexed.
- Versioning & staleness. Updated files are not retroactively reflected until re‑indexed; deleted files may persist in caches until the next scheduled sync.
What this means for you: Prefer original, text‑native sources; run OCR on scans; export crucial tables to CSV; keep a light, flat folder structure for content you expect to query often.
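A quick way to catch scans before they end up as metadata‑only entries is to check whether a PDF yields any extractable text at all. The sketch below is a minimal heuristic, not a SureLink feature: it assumes the pypdf library and a local ./docs folder, and the 50‑character threshold is an arbitrary cut‑off you should tune for your corpus.

```python
from pathlib import Path
from pypdf import PdfReader  # assumes pypdf is installed (pip install pypdf)

def needs_ocr(pdf_path: Path, min_chars: int = 50) -> bool:
    """Heuristic: flag PDFs whose pages yield almost no extractable text."""
    reader = PdfReader(str(pdf_path))
    extracted = "".join((page.extract_text() or "") for page in reader.pages)
    return len(extracted.strip()) < min_chars

# Flag likely scans under a local ./docs folder before linking them.
for pdf in Path("./docs").rglob("*.pdf"):
    if needs_ocr(pdf):
        print(f"Run OCR (or enable Image Description) before indexing: {pdf}")
```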
2) Retrieval & answer‑quality limits (RAG realities)
- Token budgets force summarization. SureLink can’t send “all your docs” to the model; it selects and compresses the top‑ranked passages. Important context can be missed if it’s never retrieved (see the sketch after this list).
- Conflicting sources. When two documents disagree, the LLM may blend or hedge. Ask for direct quotes with file and page to anchor answers.
- Multilingual & jargon. Cross‑language retrieval and domain‑specific acronyms depend on your embedding model; expect lower recall unless your corpus and embeddings are aligned.
- Determinism. LLM outputs vary run‑to‑run. For regulated uses, require citations and enable human review before decisions.
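To see why relevant context can drop out, consider how a client might pack retrieved passages into a fixed token budget. This is a minimal sketch for illustration, not SureLink's internal logic: the passages are assumed to be pre‑ranked, the 8,000‑token budget is an arbitrary placeholder, and tiktoken is used only as a convenient stand‑in tokenizer for counting.

```python
import tiktoken  # used here purely for token counting; any tokenizer works

enc = tiktoken.get_encoding("cl100k_base")

def pack_context(passages: list[str], budget_tokens: int = 8000) -> list[str]:
    """Keep the highest-ranked passages until the budget is spent.
    Anything after the cut-off never reaches the model, however relevant."""
    packed, used = [], 0
    for text in passages:  # passages assumed pre-sorted by retrieval score
        cost = len(enc.encode(text))
        if used + cost > budget_tokens:
            break
        packed.append(text)
        used += cost
    return packed
```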
3) Hard technical limits (model & API)
- Model in use: SureLink currently supports retrieval augmentation via gpt-oss-120B. Supported input context window: up to 65,536 tokens in total, covering the retrieved document snippets and your user prompt combined. Model output limit: up to 16,384 tokens per response.
- Rate limiting: Groq enforces organization‑level limits on requests and tokens per minute and per day. When exceeded, you’ll receive HTTP 429 with headers (e.g., x-ratelimit-*, retry-after); your client must back off and retry.
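A client‑side pattern for the 429 case is shown below as a hedged sketch: the URL and payload are placeholders rather than SureLink's actual API, and the point is simply to honour retry-after when present and fall back to exponential backoff otherwise.

```python
import time
import requests  # any HTTP client that exposes response headers works the same way

def post_with_backoff(url: str, payload: dict, max_retries: int = 5) -> requests.Response:
    """POST with retries on HTTP 429, honouring the server's retry-after hint."""
    delay = 1.0
    for _ in range(max_retries):
        resp = requests.post(url, json=payload, timeout=60)
        if resp.status_code != 429:
            return resp
        # Prefer the server-provided wait; otherwise back off exponentially.
        time.sleep(float(resp.headers.get("retry-after", delay)))
        delay *= 2
    raise RuntimeError(f"Still rate-limited after {max_retries} attempts")
```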
4) Scale & operations limits
- Index size & throughput: Practical limits depend on your storage/index choice and hardware. Very large corpora increase index time and can raise tail latencies for retrieval.
- Change detection: Frequent file edits can trigger re‑chunking and re‑embedding; plan for scheduled refresh windows.
- Access control at answer‑time: If user permissions change after indexing, filtered retrieval may still surface titles or metadata unless you enforce ACL checks before query‑time retrieval (see the sketch after this list).
- Observability: Without token, rate‑limit, and hit‑rate telemetry, it’s hard to diagnose misses. Instrument these metrics early.
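One way to avoid surfacing titles or metadata after a permission change is to enforce the ACL check at query time, before anything reaches the prompt. The sketch below is only an outline: vector_search and user_can_read are hypothetical placeholders for your own index lookup and permission system.

```python
def vector_search(query: str, k: int) -> list[dict]:
    """Placeholder for your vector-index lookup; each hit carries file_id and text."""
    raise NotImplementedError

def user_can_read(user_id: str, file_id: str) -> bool:
    """Placeholder for a live ACL lookup against your permission system."""
    raise NotImplementedError

def retrieve_for_user(query: str, user_id: str, k: int = 8) -> list[dict]:
    """Filter out anything the user can no longer read before building the prompt."""
    candidates = vector_search(query, k=k * 2)  # over-fetch to offset filtered-out hits
    return [hit for hit in candidates if user_can_read(user_id, hit["file_id"])][:k]
```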
5) Customer‑side best practices (to stay within limits)
- Turn on OCR when linking PDFs whose content you want to query.
- Turn on Image Scanning for images when you want the system to generate a text summary of the file.
- Ask the model to cite file name + section/page; for long answers, request quoted excerpts.
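The citation request can be baked into the prompt itself. The wording below is only an example template with hypothetical placeholders, not a required SureLink format:

```python
# Example prompt template; {question} and {excerpts} are filled in by your client.
CITATION_PROMPT = (
    "Answer the question using only the excerpts provided. "
    "After every claim, cite the source as (file name, section or page). "
    "For long answers, include short quoted excerpts for the key claims. "
    "If the excerpts do not contain the answer, say so explicitly.\n\n"
    "Question: {question}\n\nExcerpts:\n{excerpts}"
)
```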
6) 60‑minute self‑test plan (quick validation before go‑live)
- Coverage: Index 10 files spanning DOCX, text‑PDF, scanned‑PDF (with/without OCR), CSV, and MD. Confirm each is discoverable and that citations map back to the right file/page.
- Stress token budget: Ask a question that touches all 10 files; verify whether the answer includes correct citations and whether the app gracefully truncates or compresses context.
- Rate‑limit behavior: Fire 100 rapid queries; confirm 429 handling and that retries succeed without user‑visible errors (a throwaway test driver follows this list).
- Change drift: Edit a source file; verify re‑index latency and that the old passage no longer appears in answers.
- Permissions: Remove access to one file for a test user; ensure it never appears in retrieval results or citations.
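For the rate‑limit step, a throwaway driver like the one below is enough. The endpoint and payload are placeholders for whatever your SureLink client exposes; the check is simply how many 429s come back and whether any request surfaces an unhandled error.

```python
from concurrent.futures import ThreadPoolExecutor
import requests

URL = "https://surelink.example.internal/api/query"  # placeholder endpoint

def one_query(i: int) -> int:
    """Fire a single query and return the HTTP status code."""
    resp = requests.post(URL, json={"q": f"test question {i}"}, timeout=60)
    return resp.status_code

# Fire 100 queries with modest concurrency to trip the rate limiter on purpose.
with ThreadPoolExecutor(max_workers=20) as pool:
    codes = list(pool.map(one_query, range(100)))

print("429 responses:", codes.count(429))
print("server errors:", sum(1 for c in codes if c >= 500))
```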