File-based knowledge refers to information stored and handled in document form, such as PDF, DOCX, and TXT files and web pages. With Tencent Cloud Agent Development Platform (Tencent Cloud ADP), users can import these documents into the system, where the platform parses and manages them to build a business knowledge base. Once an intelligent agent application is connected to this knowledge base, it can provide users with Q&A and information retrieval services based on its content, supporting structured use of knowledge in business scenarios.
Note:
During the Application Evaluation process, you cannot change the knowledge base content. This includes importing, deleting, and modifying knowledge settings.
The knowledge base has a finite capacity. To use more than the available capacity, you need to purchase a knowledge base expansion pack.
Upon expiry of the expansion pack, files and Q&A exceeding the character limit change to [overlimit invalid] status. After you expand capacity again, they must be restored manually.
Overlimit invalid knowledge can no longer be retrieved in the dialogue interface and is deleted automatically after 1 month of invalidity. It is advisable to renew on time or purchase additional capacity before expiration.
Click Knowledge on the left, then click View in the operation column to enter the knowledge management interface and manage file-based knowledge. The interface supports importing, downloading, deleting, and classifying documents.
Importing Documentation
You can import documents from several sources: local document, webpage file, Tencent Cloud Object Storage (COS), SharePoint, OneDrive, Google Drive, and Notion.
Local Document Import Steps
1. After entering the knowledge base document page, click Import and select Local Document.
2. Upload the file, complete the advanced settings, and click Next.
Note:
Conditions for importing local files:
Supported formats: pdf, doc, docx, ppt, mhtml, pptx, wps, ppsx. Size limit: 200 MB.
Supported formats: keynote, pages. Size limit: 50 MB.
Supported formats: xlsx, xls, csv, md, txt, html, xmind, json, xml, log, numbers (row count limited to 30,000, column count limited to 180). Size limit: 20 MB.
Supported formats for importing images with text: jpg, png, jpeg, tiff, bmp, gif. Size limit: 50 MB, aspect ratio not more than 1:7.
Spreadsheet files (xlsx, xls, csv format) support a maximum of 10,000 rows and 100 columns. It is advisable to store only one table per sheet. Empty rows in the table will affect the Q&A effect.
Support batch import of files.
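The format and size limits above can be checked locally before uploading. The following is a minimal sketch, not part of Tencent Cloud ADP: the function name check_local_file is hypothetical, and it assumes 1 MB means 1024 × 1024 bytes, which the note does not state. It does not check row counts, column counts, or image aspect ratio.

```python
from pathlib import Path
from typing import Optional

MB = 1024 * 1024  # assumption: the platform may count megabytes differently

# Size limit in MB per file extension, transcribed from the note above.
SIZE_LIMIT_MB = {}
for limit, exts in [
    (200, "pdf doc docx ppt mhtml pptx wps ppsx"),
    (50, "keynote pages"),
    (20, "xlsx xls csv md txt html xmind json xml log numbers"),
    (50, "jpg png jpeg tiff bmp gif"),
]:
    for ext in exts.split():
        SIZE_LIMIT_MB[ext] = limit


def check_local_file(name: str, size_bytes: int) -> Optional[str]:
    """Return None if the file passes the checks, else a short reason."""
    ext = Path(name).suffix.lower().lstrip(".")
    limit = SIZE_LIMIT_MB.get(ext)
    if limit is None:
        return f"unsupported format: .{ext}"
    if size_bytes > limit * MB:
        return f".{ext} files are limited to {limit} MB"
    return None


# Example: screen a batch of candidate files before importing them.
for name, size in [("guide.pdf", 150 * MB), ("data.csv", 25 * MB), ("tool.exe", 1)]:
    print(name, "->", check_local_file(name, size) or "ok")
```

A pre-check like this only saves a failed upload attempt; the platform's own validation remains authoritative.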
Web File Import Steps
Note:
Website Link Limit:
Ensure the web page to be crawled requires no authorization: the page must be reachable without verifying the user's identity and without any access privileges.
Web content that is loaded asynchronously cannot currently be crawled.
Use this web page parsing tool only within the limits of applicable laws and regulations, comply with the management rules of the target platform, and protect the legitimate rights and interests of rights holders. You bear sole liability for your use of the tool. Tencent Cloud ADP, as a tool provider, does not assume any responsibility for your parsing or downloading actions.
1. After entering the knowledge base document page, click Import and select Webpage File.
2. Enter the URL, then set the update frequency and the web page retrieval level.
Note:
Upload method: supports one-by-one upload and batch upload.
One-by-one upload: Directly upload one web link. Web page retrieval level supports selecting level 1, 2, or 3.
Batch upload: Only supports uploading a single .xlsx file. You may click the download template link on the page to download and fill in the template before uploading. Web page retrieval level only supports selecting level 1.
3. Fill in the advanced settings and click Next.
4. Tencent Cloud ADP takes some time to automatically crawl images, text, and other content from the web page. Once crawling completes, you can preview and edit the result, then click Save as Document.
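The upload rules in the steps above (retrieval levels 1 to 3 for one-by-one upload, level 1 only for batch upload) can be expressed as a small local check. This is an illustrative sketch under stated assumptions: validate_web_import and the mode names "single" and "batch" are hypothetical, and the requirement that links be http or https URLs is an assumption rather than something the page states.

```python
from urllib.parse import urlparse

# Retrieval levels allowed per upload method, as described in the note above.
ALLOWED_LEVELS = {"single": {1, 2, 3}, "batch": {1}}


def validate_web_import(mode, level, urls):
    """Return a list of problems; an empty list means the request looks valid."""
    if mode not in ALLOWED_LEVELS:
        return [f"unknown upload mode: {mode!r}"]
    problems = []
    if level not in ALLOWED_LEVELS[mode]:
        allowed = sorted(ALLOWED_LEVELS[mode])
        problems.append(f"{mode} upload supports retrieval level(s) {allowed}, got {level}")
    if mode == "single" and len(urls) != 1:
        problems.append("one-by-one upload takes exactly one URL")
    for url in urls:
        parts = urlparse(url)
        # Assumption: only http(s) links with a host are meaningful to crawl.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"not an http(s) URL: {url}")
    return problems


print(validate_web_import("single", 3, ["https://example.com/docs"]))
print(validate_web_import("batch", 2, ["https://example.com/a", "example.com/b"]))
```

The check cannot tell whether a page needs authorization or loads its content asynchronously; those limits still have to be verified by opening the page.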
Tencent Cloud Object Storage (COS) Import Steps
1. After entering the knowledge base document page, click Import and select Tencent Cloud Object Storage (COS).
2. Follow the on-page instructions to complete COS authorization and select the document to be imported.
3. Configure document advanced settings as needed, including effective scope, document tags, expiration time, reference sources, and document classification.
4. Set document segmentation rules. Please refer to Document Segmentation Settings to understand how to split documents. After completing the configuration, click Import Document.
SharePoint File Import Steps
1. Go to the knowledge base document page, click Import, and select SharePoint.
2. Follow the page prompts to complete the SharePoint authorization.
3. Select the SharePoint site containing the documents to be imported.
4. Select the documents to be uploaded from the SharePoint site.
5. Configure document update frequency and advanced settings as needed, including effective scope, document tags, expiration time, reference sources, and document classification.
6. Set document segmentation rules. Please refer to Document Segmentation Settings to understand how to split documents. After completing the configuration, click Import Document.
OneDrive File Import Steps
1. Go to the knowledge base document page, click Import, and select OneDrive.
2. Follow the page prompts to complete OneDrive authorization and select the documents to be imported.
3. Configure document update frequency and advanced settings as needed, including effective scope, document tags, expiration time, reference sources, and document classification.
4. Set document segmentation rules. Please refer to Document Segmentation Settings to understand how to split documents. After completing the configuration, click Import Document.
Google Drive File Import Steps
1. Go to the knowledge base document page, click Import, and select Google Drive.
2. Follow the page prompts to complete Google Drive authorization and select the documents to be imported.
3. Configure document update frequency and advanced settings as needed, including effective scope, document tags, expiration time, reference sources, and document classification.
4. Set document segmentation rules. Please refer to Document Segmentation Settings to understand how to split documents. After completing the configuration, click Import Document.
Notion File Import Steps
1. Go to the knowledge base document page, click Import, and select Notion.
2. Follow the page prompts to complete Notion authorization and select the documents to be imported.
3. Configure document update frequency and advanced settings as needed, including effective scope, document tags, expiration time, reference sources, and document classification.
4. Set document segmentation rules. Please refer to Document Segmentation Settings to understand how to split documents. After completing the configuration, click Import Document.
Note:
Data Source Management:
You can manage data sources uniformly in the knowledge base settings, including authorization, deauthorization, and batch deauthorization.
After a data source is deauthorized, files in the knowledge base that were imported from it no longer receive scheduled updates. To resume scheduled updates, reauthorize the data source.
File Import Settings
Effective scope: Select "Not Effective", "Only Effective in Debug", "Only Effective in Publish", or "Effective in Debug/Publish". Once selected, the knowledge takes effect according to this setting without the application needing to be published.
Not Effective: knowledge cannot be retrieved in either the development or the release domain.
Only Effective in Debug: knowledge can be retrieved only in the development domain.
Only Effective in Publish: knowledge can be retrieved only in the release domain.
Effective in Debug/Publish: knowledge can be retrieved in both the development and release domains.
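The four effective scope options amount to a lookup from option to the domains in which knowledge can be retrieved. The sketch below only models that table for illustration; EffectiveScope, is_retrievable, and the domain strings "development" and "release" are hypothetical names, not platform identifiers.

```python
from enum import Enum


class EffectiveScope(Enum):
    """Each option maps to the set of domains where retrieval is possible."""
    NOT_EFFECTIVE = frozenset()
    ONLY_DEBUG = frozenset({"development"})
    ONLY_PUBLISH = frozenset({"release"})
    DEBUG_AND_PUBLISH = frozenset({"development", "release"})


def is_retrievable(scope: EffectiveScope, domain: str) -> bool:
    """True if knowledge with this scope can be retrieved in the given domain."""
    return domain in scope.value


# Print the full table: one line per option and domain.
for scope in EffectiveScope:
    for domain in ("development", "release"):
        print(f"{scope.name:18} {domain:12} {is_retrievable(scope, domain)}")
```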
File Label: Used to tag files. You can configure the matching relationship between API parameters and tags in Knowledge Base Retrieval Related Settings, and pass the API parameters through the custom_variables field of the application dialogue endpoint API. When users ask questions with different API parameter values, the system retrieves file content whose tags match those values. For details, see Knowledge Base Retrieval Related Settings.
Expiration Time: Set the effective time for file knowledge. You can set it to permanent validity or customize the expiration time. If a custom expiration time is set, the file knowledge automatically expires after that time.
Show Reference Source: Once enabled, the source is displayed at the end of the response, and online viewing is supported. You can choose to show the document link referenced by the large model or customize the reference link (such as the home page).
File Category: Establishing file classifications in the knowledge base makes it easier to manage knowledge Q&A by category. Up to 10 levels of classification are supported, and classifications can be renamed, deleted, and searched. When you hover over any classification, ... is displayed on the right. Click ... to open a dropdown, click Add subclass, enter the name, and press Enter to create a new subclass under the current classification.
The product lets you configure document segmentation. When importing a document, you can set segmentation rules suited to that document, and the system splits the document into slices according to those rules.
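To make the File Label setting above more concrete, the sketch below models how values passed in custom_variables could select files by tag. It is a local illustration only and not the platform's retrieval logic: match_files, the param_to_tag mapping, and the file records are hypothetical, and the exact matching behavior is defined in Knowledge Base Retrieval Related Settings.

```python
def match_files(files, param_to_tag, custom_variables):
    """Return names of files whose tags match the passed API parameter values.

    files:            list of {"name": str, "tags": {tag_name: tag_value}}
    param_to_tag:     configured mapping from API parameter name to tag name
    custom_variables: parameter values sent with the user's question
    """
    # Translate API parameters into the tag values we want to see on a file.
    wanted = {
        param_to_tag[param]: value
        for param, value in custom_variables.items()
        if param in param_to_tag
    }
    # In this sketch, an empty `wanted` places no restriction on the files.
    return [
        f["name"]
        for f in files
        if all(f["tags"].get(tag) == value for tag, value in wanted.items())
    ]


files = [
    {"name": "hr_policy.pdf", "tags": {"department": "hr"}},
    {"name": "it_handbook.pdf", "tags": {"department": "it"}},
]
print(match_files(files, {"dept": "department"}, {"dept": "hr"}))
```

With this model, two users who send different values for the same parameter are answered from different subsets of the knowledge base.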
File Operations
View Document: After entering the Knowledge Base QA application details page, select the Knowledge Management > File Tab and click the file name to view the file content.
File Status: Refers to the processing status of the file by the system and the custom active status after the file is uploaded.
Status Description:
Parsing in progress: The file is being parsed. Settings cannot be adjusted for files in this status.
Parsing failure: File parsing failed. A pop-up notifies you, and you can view the details.
Under review: The file is being reviewed. Settings cannot be adjusted for files in this status.
Review failure: The file review failed. The reason may be that the content of the file does not conform to the specified standard or requirement.
Learning: The file is being learned. Settings cannot be adjusted for files in this status.
Learn failure: The knowledge base QA application fails to learn the file content and is unable to conduct Q&A in dialogue tests and the official environment based on the file.
Expired: The file has expired and is invalid. QA cannot be conducted based on the file in the dialogue test and the official environment.
Manual Appeal: When the file review fails and is submitted for manual review, the status during the manual review process is Manual Appeal.
Manual Appeal Failed: The manual review is not approved, the file status is Manual Appeal Failed, and it is required to modify the file offline and then import it again.
Excess Capacity Invalidation: When purchased knowledge base capacity expires and the used capacity exceeds the available capacity, the files beyond the capacity limit are placed in the excess capacity invalidation state.
Excess Capacity Recovery: The process of restoring knowledge in the excess capacity invalidation state to its state before invalidation. Files in excess capacity invalidation require manual recovery.
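The status descriptions above state that settings cannot be adjusted while a file is being parsed, reviewed, or learned. A minimal sketch of that rule follows; FileStatus and settings_locked are hypothetical names, the enum covers only a few of the listed statuses, and the sketch says nothing about whether every other status allows adjustment.

```python
from enum import Enum


class FileStatus(Enum):
    PARSING = "Parsing in progress"
    PARSING_FAILURE = "Parsing failure"
    UNDER_REVIEW = "Under review"
    REVIEW_FAILURE = "Review failure"
    LEARNING = "Learning"
    LEARN_FAILURE = "Learn failure"
    EXPIRED = "Expired"


# The three in-progress statuses for which the descriptions rule out
# any setting adjustment.
LOCKED_STATUSES = {FileStatus.PARSING, FileStatus.UNDER_REVIEW, FileStatus.LEARNING}


def settings_locked(status: FileStatus) -> bool:
    """True if the descriptions explicitly forbid adjusting settings."""
    return status in LOCKED_STATUSES


for status in FileStatus:
    print(f"{status.value:20} locked={settings_locked(status)}")
```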
File Search: Search for files by file name or tag name.
File Download: Download imported files from Tencent Cloud ADP to your local system.
File Deletion: Delete files in the knowledge base.
Rename: Update the file name.
Note:
1. Deleting a file does not delete the corresponding Q&A in the Q&A database.
2. After a document is modified, it takes effect within the configured effective scope once its status changes to "import complete".