dify/api/core/rag/extractor
Charlie.Wei 97b65f9b4b
Optimize webscraper (#4392)
Co-authored-by: luowei <glpat-EjySCyNjWiLqAED-YmwM>
Co-authored-by: crazywoola <427733928@qq.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2024-05-15 15:23:16 +08:00
blod chore: remove Langchain tools import (#3407) 2024-04-12 16:26:09 +08:00
entity Fix some RAG bugs (#2570) 2024-02-27 11:39:05 +08:00
unstructured feat:api Add support for extracting EPUB files in ExtractProcessor (#3254) 2024-04-12 11:25:02 +08:00
csv_extractor.py fix dataset retrival in dataset mode (#3334) 2024-04-11 02:11:21 +08:00
excel_extractor.py add xls file suport (#3321) 2024-04-12 14:53:44 +08:00
extract_processor.py Optimize webscraper (#4392) 2024-05-15 15:23:16 +08:00
extractor_base.py Feat/dify rag (#2528) 2024-02-22 23:31:57 +08:00
helpers.py Feat/dify rag (#2528) 2024-02-22 23:31:57 +08:00
html_extractor.py Fix some RAG bugs (#2570) 2024-02-27 11:39:05 +08:00
markdown_extractor.py Feat/dify rag (#2528) 2024-02-22 23:31:57 +08:00
notion_extractor.py fix: notion extractor raise 'NoneType' object has no attribute 'curre… (#2608) 2024-02-28 17:08:27 +08:00
pdf_extractor.py Feat/dify rag (#2528) 2024-02-22 23:31:57 +08:00
text_extractor.py Feat/dify rag (#2528) 2024-02-22 23:31:57 +08:00
word_extractor.py fix: UP031 style rule violation (#3866) 2024-04-26 11:24:08 +08:00