Commit Graph

321 Commits

Author SHA1 Message Date
rafaelsideguide
b5045d1661 [feat] improved the scrape for gdrive pdfs 2024-06-04 17:47:28 -03:00
Nicolas
96257b7b17 Update handleCustomScraping.ts 2024-06-04 12:22:46 -07:00
Nicolas
674500affa Nick: 2024-06-04 12:15:39 -07:00
rafaelsideguide
5ae4d1caf5 Update single_url.ts 2024-06-04 15:28:09 -03:00
rafaelsideguide
64a4338ff0 Update single_url.ts 2024-06-04 14:40:05 -03:00
Rafael Miller
02fe470e20 Merge pull request #148 from mendableai/nsc/improvemnts-fixes-misc
Better fallbacks for initial crawl start
2024-06-04 14:31:10 -03:00
Rafael Miller
b80fb374e5 Merge branch 'main' into playwright-service-bug-222 2024-06-04 11:57:17 -03:00
rafaelsideguide
6920ec8a61 bugfixing. already on main 2024-06-04 11:05:50 -03:00
Nicolas
d91b725c6f Update fly.toml 2024-06-04 00:41:15 -07:00
Nicolas
cbf8d79cce Update pdfProcessor.ts 2024-06-04 00:13:37 -07:00
Nicolas
3fc9004ba8 Update fly.toml 2024-06-03 23:49:46 -07:00
Nicolas
2ea01f1456 Update single_url.ts 2024-06-03 23:42:39 -07:00
Nicolas
854d5b3cb3 Update single_url.ts 2024-06-03 23:32:55 -07:00
Nicolas
99059814a8 Nick: 2024-06-03 21:32:48 -07:00
Nicolas
918059ee9e Merge branch 'main' into nsc/improvemnts-fixes-misc 2024-06-03 16:46:02 -07:00
Nicolas
38e583f66c Update socialBlockList.test.ts 2024-06-03 16:44:23 -07:00
Nicolas
c69c89f838 Nick: 2024-06-03 16:42:42 -07:00
Nicolas
48d1ec05b2 Merge branch 'main' into nsc/improved-blocklist 2024-06-03 16:38:03 -07:00
Nicolas
d30ced4394 Merge pull request #221 from mendableai/nsc/fwd-header-auth
feat: Ability to forward headers to reliable providers for auth etc...
2024-06-03 16:33:40 -07:00
Romain Bruyère
4987f901d1 Merge branch 'mendableai:main' into main 2024-06-03 21:29:33 +02:00
rombru
3ff91ddd1f fix: use @ instead of # for default BULL_AUTH_KEY. hash mark is reserved for URI fragments. 2024-06-03 21:28:25 +02:00
rafaelsideguide
c1aed1360e Update index.test.ts 2024-06-03 15:51:07 -03:00
rafaelsideguide
1fc3a15149 Update single_url.ts 2024-06-03 15:24:40 -03:00
Nicolas
fde522c3e1 Update single_url.ts 2024-06-02 20:23:45 -07:00
Matt Joyce
deefe65cbe Change the way the playwright response is parsed
Was failing with a Type Error, but actually looked ok.
This fixes the type error, and stop scraper fallback.
2024-06-01 19:16:56 +10:00
Matt Joyce
14896a9fdd Fix PLAYWRIGHT_MICROSERVICE_URL
It needs to end in html, otherwise scrape will 404
2024-06-01 19:03:16 +10:00
Nicolas
8cb62dde92 Update website_params.ts 2024-05-31 16:09:39 -07:00
Nicolas
3b8059edb6 Update single_url.ts 2024-05-31 15:43:06 -07:00
Nicolas
6bea803120 Nick: 2024-05-31 15:39:54 -07:00
Nicolas
2139129296 Nick: v12 2024-05-31 11:39:55 -07:00
Nicolas
260e31c68b Merge branch 'nsc/new-pricing' 2024-05-30 16:08:31 -07:00
Nicolas
aa8133ca7f Update load-testing-example.ts 2024-05-30 16:07:14 -07:00
Nicolas
0c115c6181 Merge pull request #216 from mendableai/nsc/new-pricing
feat: New pricing/limits changes
2024-05-30 15:36:59 -07:00
Nicolas
6860ace4af Nick: 2024-05-30 15:07:49 -07:00
Nicolas
6ceb7ff50a Nick: 2024-05-30 14:46:55 -07:00
Nicolas
33f10a7f91 Nick: fixes 2024-05-30 14:42:32 -07:00
Nicolas
ace46f340b Nick: new limits, new pricing 2024-05-30 14:31:36 -07:00
Nicolas
6c939d534d Nick: small refactor 2024-05-29 19:43:51 -07:00
Eric Ciarla
37915e11e8 Final push 2024-05-29 21:18:24 -04:00
Eric Ciarla
a0e404f94e init commit 2024-05-29 18:56:57 -04:00
rafaelsideguide
ee9a2184e2 Added custom scraping conditions for readme docs 2024-05-29 13:39:43 -03:00
Nicolas
c20c38721d Update index.test.ts 2024-05-28 17:17:20 -07:00
Nicolas
0f43a12906 Update index.test.ts 2024-05-28 17:17:12 -07:00
Nicolas
1b3547dcf2 Nick: 2024-05-28 12:56:24 -07:00
Nicolas
1ef307cb6f Nick: checks 2024-05-27 10:01:12 -07:00
Nicolas
1bbfb98d7e Merge pull request #186 from Keredu/main
Limit on /search is not deterministic
2024-05-26 18:08:16 -07:00
Nicolas
7e2df7bd5e Update auth.ts 2024-05-26 18:07:21 -07:00
Simon H
115204e6b6 Feat: Provide more details for 429 error msg
- Added better error code for when rate limit exceeded including
consumed/remaining points, reset date and retry-after seconds
2024-05-25 12:03:20 -04:00
Keredu
2192978f91 Limit on /search is not deterministic 2024-05-25 00:12:26 +02:00
Nicolas
e98434606d Update blocklist.ts 2024-05-24 15:04:15 -07:00