Commit Graph

1869 Commits

Author SHA1 Message Date
Nicolas
445fc432e9 Reapply "fix(v1/crawl): always use sitemap"
This reverts commit 339b19ce9d.
2024-10-01 14:03:07 -03:00
Nicolas
339b19ce9d Revert "fix(v1/crawl): always use sitemap"
This reverts commit 5dc0fcf644.
2024-10-01 13:59:49 -03:00
Gergő Móricz
5dc0fcf644 fix(v1/crawl): always use sitemap 2024-10-01 18:49:44 +02:00
Gergő Móricz
3c045c43a4 feat(queue-worker): always crawl links from content even if sitemapped 2024-10-01 18:32:53 +02:00
Nicolas
1af26fe1b4 Nick: sitemap fix 2024-10-01 12:38:48 -03:00
Nicolas
ff4b7a835b
Merge pull request #685 from devflowinc/main
bugfix: using onlyIncludeTags and removeTags together
2024-09-30 17:18:30 -03:00
Nicolas
986262e1d4 Update search.ts 2024-09-30 15:23:43 -03:00
Gergő Móricz
0dd06d33ef fix(v0/search): pass job priority 2024-09-30 19:20:24 +02:00
Gergő Móricz
20ffdbd15c hotfix 2024-09-30 19:17:52 +02:00
Gergő Móricz
a8df85fd9b fix(acuc): remove sentry capture 2024-09-30 19:10:24 +02:00
Gergő Móricz
c6a83ab92c fix(api): entrypoint
2024-09-27 22:16:27 +02:00
Gergő Móricz
e44bdf7a54 bad dockerfile 2024-09-27 21:07:11 +02:00
Gergő Móricz
f0a1a2e45b fix: increase ulimit -n in docker 2024-09-27 20:44:52 +02:00
Gergő Móricz
d5e2a80e4a fix(crawl-status): keep 10 megabyte pages if they're the only thing in the output
2024-09-27 20:41:41 +02:00
Nicolas
975f0575b4 Nick: max retries with axios-retry 2024-09-27 12:58:57 -04:00
Nicolas
92961cf74f Merge branch 'main' of https://github.com/mendableai/firecrawl 2024-09-27 12:23:45 -04:00
Nicolas
1fdff87b3e Update single_url.ts 2024-09-27 12:23:44 -04:00
Gergő Móricz
6283e8fc47 fix(logger): set default level to trace 2024-09-27 17:46:43 +02:00
Gergő Móricz
5e8ef4954e feat(auth): log cache key in acuc update error 2024-09-27 17:13:10 +02:00
Gergő Móricz
e98f858eb6 fix(api): playground scrape errors
2024-09-26 22:28:14 +02:00
Nicolas
8d44cb33bb Nick: fixed error message 2024-09-26 22:15:15 +02:00
Gergő Móricz
2cb493321a fix(ACUC): do not refresh cache every set 2024-09-26 22:15:15 +02:00
Gergő Móricz
9bdd344b36 fix(redlock): use redlock.using for stability 2024-09-26 22:15:15 +02:00
Gergő Móricz
250c3bb5c6 fix(auth): move redlock settings 2024-09-26 22:15:15 +02:00
Gergő Móricz
81245e68fa fix(auth/redlock): retry cached ACUC lock for 20 seconds 2024-09-26 22:15:15 +02:00
Gergő Móricz
0f89f5e7cb fix(billTeam): cache update race condition 2024-09-26 22:15:15 +02:00
Gergő Móricz
d13a97f979 fix(credit_billing): allow spending of exact credits 2024-09-26 22:15:15 +02:00
Gergő Móricz
84bff8add8 fix(billTeam): update cached ACUC after billing 2024-09-26 22:15:15 +02:00
Gergő Móricz
f22ab5ffaf feat(db): implement bill_team RPC 2024-09-26 22:15:15 +02:00
Gergő Móricz
c1f68c3e0a fix(credit_billing): return chunk.remaining_credits 2024-09-26 22:15:15 +02:00
Gergő Móricz
2073063fb7 fix(db): fix caching and rpc error 2024-09-26 22:15:15 +02:00
Gergő Móricz
f8c70fe5dd feat(db): implement auth_credit_usage_chunk RPC 2024-09-26 22:15:15 +02:00
Eric Ciarla
75e32b0e5c
Merge pull request #707 from mendableai/new_examples
o1 job recommender
2024-09-26 14:51:45 -04:00
Eric Ciarla
20b998e66a Delete o1_job_recommender.ipynb 2024-09-26 14:51:07 -04:00
Eric Ciarla
5c4d436f1e Create o1_job_recommender.py 2024-09-26 14:46:48 -04:00
Eric Ciarla
51bc2f25fe remove actions crawler 2024-09-26 11:44:55 -04:00
Eric Ciarla
289af6f89e example 2024-09-25 21:10:09 -04:00
Nicolas
a9773a24a3 Nick: increased timeout for chrome-cdp due to smart wait
2024-09-25 19:27:02 -04:00
Eric Ciarla
abdc08edea
Merge pull request #679 from h4r5h4/fix/folder-name
remove space in the examples/o1_web_crawler folder name
2024-09-25 10:40:09 -04:00
Nicolas
1da026b26e Update single_url.ts
2024-09-24 23:29:48 -04:00
Nicolas
b8266cc329 Update website_params.ts 2024-09-24 23:28:58 -04:00
Gergő Móricz
f00c0b82f9 fix(v1/scrape): add total wait specified in request to timeout
2024-09-24 21:56:22 +02:00
Nicolas
3f138e559e Update website_params.ts 2024-09-24 15:14:26 -04:00
Gergő Móricz
43730b5db6 feat(WebScraper): always report error of last scraper in order 2024-09-24 20:03:49 +02:00
Gergő Móricz
3e661a2087 fix(v1/crawl-cancel): avoid double authing 2024-09-24 20:01:34 +02:00
Nicolas
86744f6deb Update README.md 2024-09-24 13:22:09 -04:00
Gergő Móricz
4194525640 fix(blocklist): unblock TikTok Business page
This is just a regular business site, not social media.
2024-09-24 16:55:19 +02:00
Gergő Móricz
4a623c084a fix(fly): don't use Depot builders (doesn't work) 2024-09-24 10:50:30 +02:00
Gergő Móricz
a59b5836d5 Revert error tallying 2024-09-24 10:27:49 +02:00
Gergő Móricz
a4b128e8b7 fix(rust): blocklisted error test
2024-09-23 23:03:00 +02:00