Commit Graph

69 Commits

Author SHA1 Message Date
Jyong
ba5f8afaa8
Feat/firecrawl data source (#5232)
Co-authored-by: Nicolas <nicolascamara29@gmail.com>
Co-authored-by: chenhe <guchenhe@gmail.com>
Co-authored-by: takatost <takatost@gmail.com>
2024-06-15 02:46:02 +08:00
Bowen Liang
f976740b57
improve: modernizing validation by migrating pydantic from 1.x to 2.x (#4592) 2024-06-14 01:05:37 +08:00
takatost
d1dbbc1e33
feat: backend model load balancing support (#4927) 2024-06-05 00:13:04 +08:00
Jyong
b6631cd878
modify rerank and splitter code directory (#4924) 2024-06-04 20:07:25 +08:00
Jyong
233c4150d1
support images and tables extract from docx (#4619) 2024-05-23 18:05:23 +08:00
Rain Chen
c255a20d7c
allow to config max segmentation tokens length for RAG document using environment variable (#4375) 2024-05-20 13:20:27 +08:00
Bowen Liang
04ad46dd31
chore: skip unnecessary key checks prior to accessing a dictionary (#4497) 2024-05-19 18:30:45 +08:00
LIU HONGWEI
c227f3d985
feat: Deprecate datetime.utcnow() in favor of datetime.now(timezone.utc).replace(tzinfo=None) for better timezone handling (#3408) (#3416) 2024-04-12 16:22:24 +08:00
Jyong
33ea689861
fix detached instance error in keyword index create thread and fix question classifier node out of index error (#3219) 2024-04-09 14:34:51 +08:00
Jyong
283979fc46
fix keyword index error when storage source is S3 (#3182) 2024-04-09 01:42:58 +08:00
takatost
7753ba2d37
FEAT: NEW WORKFLOW ENGINE (#3160)
Co-authored-by: Joel <iamjoel007@gmail.com>
Co-authored-by: Yeuoly <admin@srmxy.cn>
Co-authored-by: JzoNg <jzongcode@gmail.com>
Co-authored-by: StyleZhang <jasonapring2015@outlook.com>
Co-authored-by: jyong <jyong@dify.ai>
Co-authored-by: nite-knite <nkCoding@gmail.com>
Co-authored-by: jyong <718720800@qq.com>
2024-04-08 18:51:46 +08:00
Jyong
b0b0cc045f
add multi-thread document embedding (#3016)
Co-authored-by: jyong <jyong@dify.ai>
2024-03-28 17:02:35 +08:00
Jyong
6454e1d644
chunk-overlap None check (#2781)
Co-authored-by: jyong <jyong@dify.ai>
2024-03-11 15:36:56 +08:00
Jyong
31070ffbca
fix qa index processor tenant id is None error (#2713)
Co-authored-by: jyong <jyong@dify.ai>
2024-03-06 16:46:08 +08:00
Charlie.Wei
fa7ba30ba3
Fix rebuild index&csv parsing (#2705)
Co-authored-by: luowei <glpat-EjySCyNjWiLqAED-YmwM>
Co-authored-by: crazywoola <427733928@qq.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2024-03-06 11:33:32 +08:00
Jyong
5b953c1ef2
Fix some RAG bugs (#2570)
Co-authored-by: jyong <jyong@dify.ai>
2024-02-27 11:39:05 +08:00
Jyong
0620fa3094
Feat/vdb migrate command (#2562)
Co-authored-by: jyong <jyong@dify.ai>
2024-02-26 19:47:29 +08:00
Jyong
4be3087642
Fix/new RAG bugs (#2547)
Co-authored-by: jyong <jyong@dify.ai>
2024-02-23 16:54:15 +08:00
Jyong
91ea6fe4ee
Fix/langchain document schema (#2539)
Co-authored-by: jyong <jyong@dify.ai>
2024-02-23 14:16:44 +08:00
Jyong
6c4e6bf1d6
Feat/dify rag (#2528)
Co-authored-by: jyong <jyong@dify.ai>
2024-02-22 23:31:57 +08:00
Jyong
97fe817186
Fix/upload limit (#2521)
Co-authored-by: jyong <jyong@dify.ai>
Co-authored-by: StyleZhang <jasonapring2015@outlook.com>
2024-02-22 17:16:22 +08:00
Bowen Liang
063191889d
chore: apply ruff's pyupgrade linter rules to modernize Python code with targeted version (#2419) 2024-02-09 15:21:33 +08:00
crazywoola
243ca5b1e2
fix: typo in package path of core.splitter (#2411) 2024-02-07 15:34:02 +08:00
Bowen Liang
843280f82b
enhancement: introduce Ruff for Python linter for reordering and removing unused imports with automated pre-commit and style check (#2366) 2024-02-06 13:21:13 +08:00
takatost
9f637ead38
bump version to 0.5.3 (#2306) 2024-02-01 18:11:57 +08:00
KVOJJJin
89fcf4ea7c
Feat: chunk overlap supported (#2209)
Co-authored-by: jyong <jyong@dify.ai>
2024-01-26 13:24:40 +08:00
takatost
6cf93379b3
fix: split chunks return empty strings (#2197) 2024-01-25 13:59:18 +08:00
Jyong
869690c485
fix notion estimate (#2090)
Co-authored-by: jyong <jyong@dify.ai>
2024-01-19 13:27:12 +08:00
Jyong
cb7a608d75
ascii filter Unicode U+FFFE (#2038)
Co-authored-by: jyong <jyong@dify.ai>
2024-01-15 16:52:18 +08:00
Jyong
a63a9c7d45
text splitter length method use default embedding model tokenizer (#2011)
Co-authored-by: jyong <jyong@dify.ai>
2024-01-12 18:45:34 +08:00
Bowen Liang
cc9e74123c
improve: introduce isort for linting Python imports (#1983) 2024-01-12 12:34:01 +08:00
Jyong
24bdedf802
fix get embedding model provider in empty dataset (#1986)
Co-authored-by: jyong <jyong@dify.ai>
2024-01-10 20:48:16 +08:00
Jyong
4a3d15b6de
fix customer splitter character (#1915)
Co-authored-by: jyong <jyong@dify.ai>
2024-01-04 16:21:48 +08:00
takatost
a938e1f184
fix: notion_indexing_estimate embedding_model_instance NPE (#1907) 2024-01-04 13:28:52 +08:00
Yeuoly
9134849744
fix: remove tiktoken from text splitter (#1876) 2024-01-03 13:02:56 +08:00
takatost
d069c668f8
Model Runtime (#1858)
Co-authored-by: StyleZhang <jasonapring2015@outlook.com>
Co-authored-by: Garfield Dai <dai.hai@foxmail.com>
Co-authored-by: chenhe <guchenhe@gmail.com>
Co-authored-by: jyong <jyong@dify.ai>
Co-authored-by: Joel <iamjoel007@gmail.com>
Co-authored-by: Yeuoly <admin@srmxy.cn>
2024-01-02 23:42:00 +08:00
Jyong
df1509983c
ppt & pptx improve (#1790)
Co-authored-by: jyong <jyong@dify.ai>
2023-12-19 18:11:27 +08:00
Jyong
5e34f938c1
Feat/add unstructured support (#1780)
Co-authored-by: jyong <jyong@dify.ai>
2023-12-18 23:24:06 +08:00
crazywoola
994fceece3
fix: qa regex (#1738) 2023-12-11 15:53:37 +08:00
Pascal M
bc54cdc537
refactor: typo in dataset docstore (#1711) 2023-12-07 09:24:52 +08:00
Pascal M
5d10cf0fe6
fix: error Class 'builtins.list' is not mapped (#1710) 2023-12-07 09:24:39 +08:00
Jyong
4588831bff
Feat/add retriever rerank (#1560)
Co-authored-by: jyong <jyong@dify.ai>
2023-11-17 22:13:37 +08:00
crazywoola
d0e1ea8f06
1506 remove duplicated code (#1511) 2023-11-13 19:05:32 +08:00
Garfield Dai
42a5b3ec17
feat: advanced prompt backend (#1301)
Co-authored-by: takatost <takatost@gmail.com>
2023-10-12 10:13:10 -05:00
Jyong
289c93d081
Feat/improve document delete logic (#1325)
Co-authored-by: jyong <jyong@dify.ai>
2023-10-12 13:30:44 +08:00
yezhwi
8b8e510bfe
fix: handle AttributeError for datasets and index (#1052) 2023-08-30 11:14:16 +08:00
Jyong
a55ba6e614
Fix/ignore economy dataset (#1043)
Co-authored-by: jyong <jyong@dify.ai>
2023-08-29 03:37:45 +08:00
Jyong
2d604d9330
Fix/filter empty segment (#1004)
Co-authored-by: jyong <jyong@dify.ai>
2023-08-25 15:50:29 +08:00
Jyong
5623839c71
update document segment (#950)
Co-authored-by: jyong <jyong@dify.ai>
2023-08-22 17:59:24 +08:00
takatost
3a0a9e2d8f
fix: embedding get price definition missing (#922) 2023-08-19 21:31:40 +08:00