For the AI knowledge base, I import documents through an API call, like this:
app.rags.integratedManagementOfTheKnowled.addDocumentByBusinessId('aa', [{"fileName": 'aa', "size": size, "url": 'https://wd.hbxtxzl.cn/group1/M00/02/3E/CsiNamj4RdmAdH19ABkvYlzOmJY995.pdf', "type": ''}], {"chunkSeparator": ["\n"], "chunkSize": 1024, "chunkOverlap": 100}, {"chunkCleaning": True})
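To make the shape of that call easier to read, here is a minimal sketch of how its four positional arguments could be assembled. The helper name `build_import_args` and its parameter names are hypothetical (they are not part of the platform), and the meaning I give each argument (business ID, file list, chunking config, options) is only my reading of the call above.

```python
from typing import Any


def build_import_args(
    business_id: str,
    file_name: str,
    size: int,
    url: str,
    file_type: str = "",
    chunk_size: int = 1024,
    chunk_overlap: int = 100,
) -> tuple[str, list[dict[str, Any]], dict[str, Any], dict[str, Any]]:
    """Hypothetical helper: build the four positional arguments of the import call."""
    # The overlap has to be smaller than the chunk itself, otherwise chunking cannot advance.
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    # One entry per document to import (same keys as in the call above).
    files = [{"fileName": file_name, "size": size, "url": url, "type": file_type}]
    # Chunking settings (same keys and defaults as in the call above).
    chunk_config = {
        "chunkSeparator": ["\n"],
        "chunkSize": chunk_size,
        "chunkOverlap": chunk_overlap,
    }
    # Extra options (same key as in the call above).
    options = {"chunkCleaning": True}
    return business_id, files, chunk_config, options
```

The result could then be unpacked into the real call, for example `addDocumentByBusinessId(*build_import_args('aa', 'aa', size, url))`.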
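Recently I noticed that some documents imported this way never show up in the knowledge base.
So I tried downloading the document locally and importing it from there instead:
With that approach, vectorization fails.
The error message is as follows: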
文档 processing failed: {‘code’: 60101, ‘message’: ‘Document loading failed [{file_path}https://jit.hbxtxzl.cn/api/xtjit/hbxt/storages/services/StorageSvc/preview?file=4ff40de7673e07a70777b9b868fa51b0.pdf\]: {\‘code\’: 60101, \‘message\’: \‘Document loading failed [https://jit.hbxtxzl.cn/api/xtjit/hbxt/storages/services/StorageSvc/preview?file=4ff40de7673e07a70777b9b868fa51b0.pdf{file_url}\]: {“code”: 20005, “reason”: "Element rags.NormalType.services.RemoteLoader [HTTP Document Download Service] failed. Error: {\\\‘code\\\’: 60101, \\\‘message\\\’: \\\‘Document loading failed [/tmp/tmp_4wba758.pdf{file_url}]: {“code”: 20005, “reason”: “Element rags.NormalType.services.PdfLoader [PDF Document Loader Service] failed. Error: Could not read Boolean object”}\\\’, \\\‘description\\\’: \\\‘Unable to load specified document file\\\’, \\\‘file_path\\\’: \\\’/tmp/tmp_4wba758.pdf\\\’, \\\‘error\\\’: \\\‘{“code”: 20005, “reason”: “Element rags.NormalType.services.PdfLoader [PDF Document Loader Service] failed. Error: Could not read Boolean object”}\\\’}“}\‘, \‘description\’: \‘Unable to load specified document file\’, \‘file_path\’: \‘https://jit.hbxtxzl.cn/api/xtjit/hbxt/storages/services/StorageSvc/preview?file=4ff40de7673e07a70777b9b868fa51b0.pdf\’, \‘error\’: \’{“code”: 20005, “reason”: “Element rags.NormalType.services.RemoteLoader [HTTP Document Download Service] failed. Error: {\\\‘code\\\’: 60101, \\\‘message\\\’: \\\‘Document loading failed [/tmp/tmp_4wba758.pdf{file_url}]: {“code”: 20005, “reason”: “Element rags.NormalType.services.PdfLoader [PDF Document Loader Service] failed. Error: Could not read Boolean object”}\\\’, \\\‘description\\\’: \\\‘Unable to load specified document file\\\’, \\\‘file_path\\\’: \\\‘/tmp/tmp_4wba758.pdf\\\’, \\\‘error\\\’: \\\‘{“code”: 20005, “reason”: “Element rags.NormalType.services.PdfLoader [PDF Document Loader Service] failed. 
Error: Could not read Boolean object”}\\\’}”}\‘}’, ‘description’: ‘Unable to load specified document file’, ‘file_url’: ‘https://jit.hbxtxzl.cn/api/xtjit/hbxt/storages/services/StorageSvc/preview?file=4ff40de7673e07a70777b9b868fa51b0.pdf’, ‘error’: ‘{\‘code\’: 60101, \‘message\’: \‘Document loading failed [https://jit.hbxtxzl.cn/api/xtjit/hbxt/storages/services/StorageSvc/preview?file=4ff40de7673e07a70777b9b868fa51b0.pdf{file_url}\]: {“code”: 20005, “reason”: "Element rags.NormalType.services.RemoteLoader [HTTP Document Download Service] failed. Error: {\\\‘code\\\’: 60101, \\\‘message\\\’: \\\‘Document loading failed [/tmp/tmp_4wba758.pdf{file_url}]: {“code”: 20005, “reason”: “Element rags.NormalType.services.PdfLoader [PDF Document Loader Service] failed. Error: Could not read Boolean object”}\\\’, \\\‘description\\\’: \\\‘Unable to load specified document file\\\’, \\\‘file_path\\\’: \\\’/tmp/tmp_4wba758.pdf\\\’, \\\‘error\\\’: \\\‘{“code”: 20005, “reason”: “Element rags.NormalType.services.PdfLoader [PDF Document Loader Service] failed. Error: Could not read Boolean object”}\\\’}”}\‘, \‘description\’: \‘Unable to load specified document file\’, \‘file_path\’: \‘https://jit.hbxtxzl.cn/api/xtjit/hbxt/storages/services/StorageSvc/preview?file=4ff40de7673e07a70777b9b868fa51b0.pdf\’, \‘error\’: \’{“code”: 20005, “reason”: “Element rags.NormalType.services.RemoteLoader [HTTP Document Download Service] failed. Error: {\\\‘code\\\’: 60101, \\\‘message\\\’: \\\‘Document loading failed [/tmp/tmp_4wba758.pdf{file_url}]: {“code”: 20005, “reason”: “Element rags.NormalType.services.PdfLoader [PDF Document Loader Service] failed. Error: Could not read Boolean object”}\\\’, \\\‘description\\\’: \\\‘Unable to load specified document file\\\’, \\\‘file_path\\\’: \\\‘/tmp/tmp_4wba758.pdf\\\’, \\\‘error\\\’: \\\‘{“code”: 20005, “reason”: “Element rags.NormalType.services.PdfLoader [PDF Document Loader Service] failed. Error: Could not read Boolean object”}\\\’}”}\‘}’}
So my question is: what causes this error, and how can it be fixed?
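In the log, the innermost failure is reported by PdfLoader as "Could not read Boolean object", so my guess (unconfirmed) is that the file being parsed is damaged, or is not really a PDF, for instance because the preview URL returned something else. Below is a minimal sketch of how a locally downloaded copy could be sanity-checked before import. It uses only the Python standard library, and the function name `inspect_pdf_file` is hypothetical. It checks only the header and the end-of-file marker, so it cannot prove that a file is a valid PDF.

```python
from pathlib import Path


def inspect_pdf_file(path: str) -> dict:
    """Collect basic structural facts about a file that is expected to be a PDF."""
    data = Path(path).read_bytes()
    return {
        # Total size in bytes; a very small file is often an error page.
        "size": len(data),
        # A PDF normally starts with "%PDF-" (some leading junk is tolerated here).
        "has_header": b"%PDF-" in data[:1024],
        # A complete PDF has an "%%EOF" marker near the end of the file.
        "has_eof": b"%%EOF" in data[-1024:],
        # First bytes of the file, handy for spotting HTML or JSON responses.
        "head": data[:32],
    }


if __name__ == "__main__":
    import sys

    # Usage: python inspect_pdf.py downloaded_file.pdf
    for name in sys.argv[1:]:
        print(name, inspect_pdf_file(name))
```

If `has_header` is false, the download did not return a PDF at all. If the header is present but `has_eof` is false, the file may be truncated. If both are true, the PDF may still contain objects the loader cannot parse, in which case re-saving it with another PDF tool might be worth trying.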

