Unlock the Advanced Scores in Content Intelligence!
Shelf Content Intelligence assesses your content across several categories to pinpoint sections or entire documents that will pose challenges for AI Applications (including Shelf's Search Copilot). Shelf identifies content that will hinder an AI Application's information retrieval, reduce the accuracy of its responses, or increase its susceptibility to hallucinations.

Please Note:
Standard Scores are included with the Content Intelligence platform and accessible to all users.
Advanced Scores unlock deeper insights with Shelf AI and are available as a premium feature. Advanced Scores must be individually activated for each account. If you are interested in a more in-depth discussion/demo, please reach out to your CSM.
Gaps are displayed on the Quality Assessment dashboard as well as on other reports and pages:
Gap Focus: Accuracy
Gap | Description | Score's Value |
---|---|---|
Similar Documents Advanced Score | Replication of information refers to instances where two pieces of content are nearly identical. If two documents are at least 99% semantically similar, it's a gap. | This score enables the removal or consolidation of redundant information to improve the efficiency of your system and maintain data accuracy. Documents with 99% or higher content similarity (near-duplicates) are distinct from exact duplicates but still reduce system efficiency. |
Partial Duplicates Standard Score | Searches for sections that are 100% identical in two or more documents. If two or more documents contain identical sections, it's a gap. | Identical content fragments shared between different documents create redundancy in your knowledge base. This partial duplication leads to maintenance challenges and increases the risk of inconsistent updates. |
Exact document duplicates Standard Score | If the content of two or more documents is 100% identical, it's a gap. | Documents containing identical content create information redundancy across your knowledge base. This leads to maintenance challenges and increases the risk of inconsistent updates. |
Contradictions Advanced Score | Information within the knowledge base that is inconsistent or conflicting, presenting different facts or interpretations on the same subject matter. A contradiction occurs when two statements are extremely unlikely to be true at the same time. In practice, it is flagged when two semantically similar Q&A pairs provide confusing or contradicting answers. | Such conflicting information can lead LLMs to give misleading or wrong answers. The Contradictions page and report let users view the contradictions in the content stored on the account and fix them. |
Past-dated Advanced Score | Indicates whether a document's content is still current as of the present day. Note: The Past-dated gap is not based on the article's creation or last-update date. | Shelf AI analyzes document titles and summaries to identify potentially outdated content in your knowledge base. This automated detection helps maintain content freshness by highlighting documents that may need review or updates. Out-of-date information will cause Gen AI applications to provide incorrect answers and lead to poor user satisfaction. |
Empty documents Standard Score | Documents without content (empty). | Empty documents reduce LLM and search efficiency and should be removed from the knowledge base. |
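
The duplicate and empty-document rules in the table above can be illustrated with a minimal Python sketch. It is not Shelf's implementation: the whitespace normalization, the SHA-256 grouping, the use of cosine similarity over embedding vectors as a stand-in for "99% semantically similar", and all function names are assumptions made for illustration.

```python
import hashlib
import math
from collections import defaultdict


def normalize(text: str) -> str:
    """Collapse whitespace so trivial formatting differences are ignored (assumption)."""
    return " ".join(text.split())


def find_empty_documents(docs: dict[str, str]) -> list[str]:
    """Return IDs of documents with no content after normalization."""
    return sorted(doc_id for doc_id, text in docs.items() if not normalize(text))


def find_exact_duplicates(docs: dict[str, str]) -> list[list[str]]:
    """Group document IDs whose normalized content is 100% identical."""
    groups: dict[str, list[str]] = defaultdict(list)
    for doc_id, text in docs.items():
        content = normalize(text)
        if not content:
            continue  # empty documents are reported as their own gap
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        groups[digest].append(doc_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


def is_near_duplicate(vec_a: list[float], vec_b: list[float], threshold: float = 0.99) -> bool:
    """Treat cosine similarity >= threshold as 'similar' (hypothetical mapping of the 99% rule)."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm = math.sqrt(sum(a * a for a in vec_a)) * math.sqrt(sum(b * b for b in vec_b))
    return norm > 0 and dot / norm >= threshold
```

Hashing only catches exact matches, which is why near-duplicates need a separate similarity check. Where the embedding vectors come from is left open here.
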
Gap Focus: Incomplete Knowledge
Gap | Description | Score's Value |
---|---|---|
Undefined Acronyms Advanced Score | An acronym is a shortened form of a phrase or a set of words formed by taking the initial letters or syllables. Companies often use acronyms specific to their domain or organization. | Properly defined and described acronyms may result in AI applications providing more accurate responses during interactions with users. |
Crossed-out text Standard Score | Text that has been crossed out indicates it should no longer be treated as relevant, accurate, or up-to-date. | Crossed-out text should be filtered out, as it could be used for generating inaccurate responses. |
Work in progress Standard Score | Documents marked as drafts or work-in-progress may contain incomplete information and lead to inaccurate AI responses. For example: "WIP", "TBD", "Pending", "draft", "in progress", "for review", "incomplete". | Such documents usually do not add valuable information to your knowledge base and can be filtered out to improve efficiency and to avoid LLMs generating misleading answers based on incomplete thoughts. |
Broken links Standard Score | A link without a valid target address. If a document contains at least one broken link, it’s a gap. | Broken links reduce content reliability and can lead to incomplete AI-generated responses. |
Missing link captions Standard Score | A link without a descriptive caption. If a document contains at least one uncaptioned link, it’s a gap. | Links without descriptive captions reduce content context and limit AI systems' understanding of referenced materials. |
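
As a rough illustration of two rules from the table above, the Python sketch below scans text for the listed work-in-progress markers and finds Markdown links with no usable caption. The case-insensitive whole-word matching, the Markdown link syntax, and the rule that a caption equal to its URL counts as uncaptioned are all assumptions, not documented Shelf behavior.

```python
import re

# Marker list taken from the "Work in progress" row; the matching rules are assumptions.
WIP_MARKERS = ["WIP", "TBD", "Pending", "draft", "in progress", "for review", "incomplete"]

_WIP_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(marker) for marker in WIP_MARKERS) + r")\b",
    re.IGNORECASE,
)

# Markdown inline link: [caption](target). The lookbehind skips image syntax ![...](...).
_MD_LINK = re.compile(r"(?<!!)\[([^\]]*)\]\(([^)\s]*)\)")


def find_wip_markers(text: str) -> list[str]:
    """Return the distinct work-in-progress markers found in the text, lowercased."""
    return sorted({match.group(0).lower() for match in _WIP_PATTERN.finditer(text)})


def find_uncaptioned_links(markdown: str) -> list[str]:
    """Return targets of links whose caption is empty or merely repeats the URL."""
    flagged = []
    for caption, target in _MD_LINK.findall(markdown):
        if not caption.strip() or caption.strip() == target:
            flagged.append(target)
    return flagged
```

A plain keyword match like this will also flag ordinary prose such as "your order is pending", so a real detector would need more context than the sketch uses.
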
Gap Focus: Compliance
Gap | Description | Score's Value |
---|---|---|
Internal information Advanced Score | Identification of sections that contain disclaimers that content within a document/section is strictly/partially confidential. | Recognizing and protecting internal sensitive data ensures company privacy and reduces the risk of data breaches or policy violations. Such content can be modified within a RAG pipeline or filtered. |
Unsecured credentials Standard Score | Identification of documents that contain passwords, security tokens, private keys, or password hashes. | Content containing sensitive credentials (passwords, tokens, keys) requires protection from exposure through AI systems and public resources. |
Toxic or Biased Advanced Score | Identification of sections that contain inappropriate content (hate, self-harm, violence, or sexual content). Identifies and addresses content that may be toxic or biased, promoting inclusivity and ethical content creation. | Ensures your content and LLM generations are free of harmful language and biased perspectives. Content flagged for policy violations requires review to maintain appropriate professional standards and ensure AI system compliance. |
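
To make the "Unsecured credentials" row more concrete, here is a small Python sketch of pattern-based credential detection. The two patterns (a PEM private-key header and a `password = value` style assignment) are illustrative assumptions only. The article does not describe how Shelf detects credentials, and a real scanner would cover many more formats.

```python
import re

# Hypothetical pattern set for illustration; not an exhaustive or official list.
CREDENTIAL_PATTERNS = {
    # PEM headers such as "-----BEGIN RSA PRIVATE KEY-----" or "-----BEGIN PRIVATE KEY-----"
    "private_key": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
    # Inline assignments such as "password: hunter2" or "pwd=hunter2"
    "password_assignment": re.compile(r"(?i)\b(?:password|passwd|pwd)\s*[:=]\s*\S+"),
}


def find_credentials(text: str) -> list[str]:
    """Return the names of the credential patterns that match the text."""
    return sorted(name for name, pattern in CREDENTIAL_PATTERNS.items() if pattern.search(text))
```

Documents flagged this way could then be filtered out or redacted before they reach a RAG pipeline.
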
Gap Focus: Missing Metadata
Gap | Description | Score's Value |
---|---|---|
Missing description Standard Score | Document description includes details about a document's background, purpose, target audience, and content. | Documents missing descriptions reduce searchability and limit AI-powered features that rely on textual context. |
Missing table of contents Standard Score | A Table of Contents is an organized list of sections or topics in a document, outlining its structure. Documents exceeding 2000 words require a table of contents to enhance navigation and help AI systems better understand document structure. | A Table of Contents can be used for improved content splitting, where documents are split into semantically meaningful sections rather than fixed-length chunks. Additionally, it can be utilized to contextualize sections by incorporating information about their position in the document's structure. This can significantly improve information retrieval and LLM generation processes. |
Skipped heading levels Standard Score | Document headings are structured titles or labels within text that serve to organize and delineate its content. If a subheading's level does not directly follow its parent's level, it’s a gap. For example, a Heading 4 placed directly under a Heading 2 skips Heading 3. | Headings can be used for improved content splitting, where the document is segmented into semantically meaningful sections. Properly ordered heading levels improve document structure for better content navigation and enhance AI-powered content analysis. |
Missing image text Standard Score | An image caption is a brief, descriptive text accompanying an image, providing context or additional information to improve comprehension. If a document contains images with captions of fewer than 3 words (or no captions), it’s a gap. | Image captions can be embedded (vectorized) and used as image representations in information retrieval systems. Well-written image captions improve content discovery and enhance AI-powered features by providing clear, searchable descriptions of visual content. |
Missing tags Standard Score | Document tags are metadata labels or keywords assigned to a piece of content. Tags represent the content's key themes, topics, or attributes and improve content organization. | Tags can be used to provide additional context and in hybrid retrieval systems where search is done using both semantics and keywords. Missing document tags reduce content discoverability and limit the effectiveness of semantic search systems. |
Missing categories Standard Score | Document categories refer to broader classifications that group related content together based on overarching themes, subjects, or purposes. Categories provide a high-level organizational structure. | Categories can be used to provide additional context and in hybrid retrieval systems where search is done using both semantics and keywords. Missing document categories reduce content organization and limit the effectiveness of content classification and LLMs. |
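
Three of the metadata rules above are simple threshold checks, sketched below in Python. The thresholds (more than 2000 words, captions shorter than 3 words, non-consecutive heading levels) come from the table. The function names, the whitespace-based word counting, and the representation of headings as a list of integer levels are assumptions for illustration.

```python
from typing import Optional

TOC_WORD_THRESHOLD = 2000  # documents exceeding this need a table of contents
MIN_CAPTION_WORDS = 3      # captions shorter than this count as missing


def skipped_heading_levels(levels: list[int]) -> list[tuple[int, int]]:
    """Return (previous, current) pairs where a heading jumps down more than one level."""
    return [(prev, cur) for prev, cur in zip(levels, levels[1:]) if cur > prev + 1]


def needs_table_of_contents(text: str, has_toc: bool) -> bool:
    """Flag documents over the word threshold that have no table of contents."""
    return not has_toc and len(text.split()) > TOC_WORD_THRESHOLD


def caption_is_missing(caption: Optional[str]) -> bool:
    """Flag images with no caption or a caption of fewer than three words."""
    return caption is None or len(caption.split()) < MIN_CAPTION_WORDS
```

For example, heading levels `[1, 2, 4, 2, 3]` yield one gap, the jump from Heading 2 to Heading 4, while moving back up from Heading 4 to Heading 2 is not flagged.
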
Gap Focus: User Navigation
Gap | Description | Score's Value |
---|---|---|
Overpopulated folders Standard Score | Folders that contain more than 49 documents. | Folders containing more than 49 documents reduce navigation efficiency and make information harder to find. |
Underpopulated folders Standard Score | Empty or near-empty folders that have no subfolders. | Empty or near-empty folders without subfolders create unnecessary navigation layers and reduce organizational efficiency. |
Folders with excessive nesting Standard Score | Folders nested beyond 4 levels create excessive hierarchy and complicate content navigation. | Such folder structure leads to too many hierarchy levels and decreases the efficiency of navigation. |
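
The three folder rules above can be checked in a single tree traversal. The Python sketch below assumes a hypothetical `Folder` structure, counts the root folder as level 1, and treats "near-empty" as at most one document. None of these details are specified in the article, so adjust them to match your own interpretation.

```python
from dataclasses import dataclass, field

MAX_DOCUMENTS = 49  # more than this is "overpopulated"
MAX_DEPTH = 4       # nesting beyond this is "excessive"


@dataclass
class Folder:
    """Hypothetical folder model: a name, a document count, and child folders."""
    name: str
    document_count: int = 0
    subfolders: list["Folder"] = field(default_factory=list)


def folder_gaps(root: Folder, near_empty_max: int = 1) -> dict[str, list[str]]:
    """Walk the folder tree and collect folder names for each navigation gap."""
    gaps: dict[str, list[str]] = {
        "overpopulated": [],
        "underpopulated": [],
        "excessive_nesting": [],
    }

    def visit(folder: Folder, depth: int) -> None:
        if folder.document_count > MAX_DOCUMENTS:
            gaps["overpopulated"].append(folder.name)
        # "Near-empty" threshold is an assumption; folders with subfolders are exempt.
        if folder.document_count <= near_empty_max and not folder.subfolders:
            gaps["underpopulated"].append(folder.name)
        if depth > MAX_DEPTH:
            gaps["excessive_nesting"].append(folder.name)
        for child in folder.subfolders:
            visit(child, depth + 1)

    visit(root, 1)  # assumption: the root folder is level 1
    return gaps
```

A single folder can appear under more than one gap, for instance an empty folder at level 5 is both underpopulated and excessively nested.
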