Gergo Moricz
d591e0f51c
block corterix.com for performance issues
2024-08-25 20:06:12 +02:00
rafaelsideguide
6f9a2687ae
fixed turndown bug
2024-08-25 15:04:32 -03:00
Gergo Moricz
96e91ab9ec
convert webhook call to v1
2024-08-25 14:05:46 +02:00
Nicolas
1f99bfd3c8
Update queue.ts
2024-08-23 22:47:12 -03:00
Nicolas
b80277d4de
Update queue.ts
2024-08-23 22:46:44 -03:00
Nicolas
d87b62fed9
Nick:
2024-08-23 22:33:17 -03:00
Nicolas
b9e06e27f4
Update queue.ts
2024-08-23 22:17:27 -03:00
Nicolas
8e78511ed4
Update queue.ts
2024-08-23 22:15:47 -03:00
Nicolas
28d7a637c2
Update queue.ts
2024-08-23 22:07:49 -03:00
Nicolas
173f4ee1bf
Nick: chrome cdp main | simple autoscaler
2024-08-23 20:09:59 -03:00
Gergő Móricz
064ebfc54d
fix websocket
2024-08-23 19:55:41 +02:00
Gergő Móricz
05c250d3b8
Merge branch 'main' into v1-webscraper
2024-08-23 19:38:57 +02:00
Gergő Móricz
2ab0dd2e15
fix(scrape): add further llm extraction catch
2024-08-23 19:20:17 +02:00
Gergő Móricz
1054a1397b
Merge branch 'main' into v1-webscraper
2024-08-23 19:14:49 +02:00
Nicolas
3d53f4e213
Nick: unblocking pin
2024-08-23 13:56:05 -03:00
Gergő Móricz
5ef3926d2a
fix(scrape,search): handle failed jobs
2024-08-23 18:47:56 +02:00
Gergő Móricz
866e71910c
further fixes
2024-08-23 18:27:00 +02:00
Gergő Móricz
eea530e0ad
feat(v1): update for sentry
2024-08-23 17:29:42 +02:00
Gergő Móricz
e7f267b6fe
Merge branch 'main' into v1-webscraper
2024-08-23 17:21:54 +02:00
Gergő Móricz
52a05b8c6e
rename "dragonfly" to "redis"
2024-08-23 17:05:59 +02:00
Gergő Móricz
64e9be0cd4
feat(redis): use bitnami image
2024-08-22 23:38:04 +02:00
Gergő Móricz
8d9ff90bcb
feat(fire-engine): propagate sentry trace
2024-08-22 23:38:04 +02:00
rafaelsideguide
74ea820bc6
fix: url and check for metadata
2024-08-22 18:32:19 -03:00
Nicolas
1f0abacadf
Merge branch 'main' of https://github.com/mendableai/firecrawl
2024-08-22 18:30:54 -03:00
Nicolas
1f779e261a
Update rate-limiter.ts
2024-08-22 18:30:45 -03:00
Gergő Móricz
8e3c2b2855
fix(crawler): verify URL
2024-08-22 23:30:19 +02:00
Gergő Móricz
e690a6fda7
fix: remove QueueEvents
2024-08-22 22:38:39 +02:00
Gergő Móricz
76c8e9f996
fix
2024-08-22 22:24:24 +02:00
Gergő Móricz
ad82175fb8
fix(scrape): poll
2024-08-22 22:12:02 +02:00
rafaelsideguide
5f60a55967
workflow and npm now running v1 tests
2024-08-22 15:28:49 -03:00
rafaelsideguide
30e809966f
Merge remote-tracking branch 'origin/v1/python-sdk' into v1-webscraper
2024-08-22 15:18:05 -03:00
rafaelsideguide
a37681bdff
fix: replace jest, removed map for v0
2024-08-22 15:16:46 -03:00
rafaelsideguide
7473b74021
fix: html and rawHtml for pdfs
2024-08-22 15:15:45 -03:00
Gergő Móricz
dd737f1235
feat(sentry): add queue instrumentation to
2024-08-22 19:17:51 +02:00
Nicolas
d2521612b4
Update .gitignore
2024-08-22 14:15:19 -03:00
Gergő Móricz
7265ab7c67
fix(search): filter docs properly
2024-08-22 18:46:56 +02:00
rafaelsideguide
b1d61d8557
Merge remote-tracking branch 'origin/v1-webscraper' into v1/python-sdk
2024-08-22 13:39:09 -03:00
rafaelsideguide
ab88a75c70
fixes sdks
2024-08-22 13:38:34 -03:00
Gergő Móricz
d036738da0
fix(bullmq): duplicate redis connection for QueueEvents
2024-08-22 18:04:09 +02:00
Gergő Móricz
6d48dbcd38
feat(sentry): add trace continuity for queue
2024-08-22 16:47:38 +02:00
Gergő Móricz
6d92b8524d
feat(scrape): record job result in span
2024-08-22 16:00:13 +02:00
Gergő Móricz
5ca36fe9fc
feat(api): add more captureExceptions
2024-08-22 15:49:16 +02:00
Gergő Móricz
0e8fd6ce70
fix(scrape): ensure extractionSchema is an object if llm-extraction is specified
2024-08-22 14:50:51 +02:00
Gergő Móricz
4bd2ff26d3
fix(llm-extract): pass stacktrace properly
2024-08-22 14:37:09 +02:00
Gergő Móricz
e4adbaa88e
fix(llm-extract): handle llm-extract if scrape failed
2024-08-22 14:12:52 +02:00
Gergő Móricz
670d253a8c
fix(auth): fix error reporting
2024-08-22 14:08:09 +02:00
Gergő Móricz
7d9f5bf8b1
fix(crawl): don't use sitemap if it's empty
...
Fixes FIRECRAWL-SCRAPER-JS-11
2024-08-22 13:41:33 +02:00
Gergő Móricz
1f580deefc
fix(crawl): validate includes/excludes regexes
2024-08-22 13:29:11 +02:00
Gergő Móricz
fbbc3878f1
fix(crawler): make sure includes/excludes is an array
2024-08-22 13:18:26 +02:00
Gergő Móricz
508568f943
fix(search): handle scrape timeouts on search
...
Fixes FIRECRAWL-SCRAPER-JS-15
2024-08-22 13:10:58 +02:00
Gergő Móricz
14fa75cae6
fix(crawl): send error if url is not a string
...
Fixes FIRECRAWL-SCRAPER-JS-1E and FIRECRAWL-SCRAPER-JS-Z
2024-08-22 13:09:08 +02:00
Nicolas
8a778278a9
Merge branch 'main' into nsc/job-priority
2024-08-21 22:57:55 -03:00
Gergo Moricz
0cdf41587e
feat(sentry): add error handles to try-catch blocks
2024-08-22 03:55:40 +02:00
Nicolas
53ca704620
Update index.ts
2024-08-21 22:55:39 -03:00
Nicolas
477c3257dc
Nick:
2024-08-21 22:53:33 -03:00
Nicolas
c7bfe4ffe8
Nick:
2024-08-21 22:20:40 -03:00
Nicolas
6bdb1d045d
Merge branch 'main' into nsc/job-priority
2024-08-21 21:52:05 -03:00
Nicolas
e78d2af1f0
Nick:
2024-08-21 21:51:54 -03:00
Nicolas
e64d3815ea
Merge branch 'main' into nsc/job-priority
2024-08-21 20:54:57 -03:00
Nicolas
0ea0a5db46
Nick: wip
2024-08-21 20:54:39 -03:00
rafaelsideguide
a4686e3c8c
fixing tests
2024-08-21 15:56:48 -03:00
rafaelsideguide
fe2e8c0b7a
includehtml fix
2024-08-21 15:54:00 -03:00
Gergő Móricz
629da74a5c
fix(sentry): decrease tracesSampleRate
2024-08-21 20:51:35 +02:00
Gergő Móricz
55009e51f5
fix: filter out invalid URLs from crawl links
2024-08-21 20:49:25 +02:00
Gergő Móricz
dae1408e66
fix(Dockerfile): retain sentry auth token properly
2024-08-21 20:40:42 +02:00
Gergő Móricz
ac9783ed2f
fix(sentry): adjust profiles sample rate to be even lower
2024-08-21 20:21:16 +02:00
Gergő Móricz
9579f03c4b
fix: import resolution
2024-08-21 20:16:06 +02:00
Gergő Móricz
6104d74213
fix(sentry): drop profiling sample rate
2024-08-21 20:12:47 +02:00
Gergő Móricz
3d5dc9d90a
feat(sentry): add log + server name
2024-08-21 19:39:10 +02:00
Gergő Móricz
85ff0c311e
Add worker ID to job attribute
2024-08-21 19:21:29 +02:00
Gergő Móricz
920702cdde
Update builder to handle uploading sourcemaps
2024-08-21 19:08:03 +02:00
Gergő Móricz
86942728e3
Add metadata for queue-worker and Express
2024-08-21 17:58:27 +02:00
Nicolas
35decb1af2
Nick:
2024-08-21 12:35:03 -03:00
rafaelsideguide
af0e47a30e
Merge remote-tracking branch 'origin/v1/node-sdk' into v1/python-sdk
2024-08-21 12:09:53 -03:00
rafaelsideguide
52abec41c2
fixing delete
2024-08-21 10:35:50 -03:00
Nicolas
db8c84ff0f
Update requests.http
2024-08-21 10:19:37 -03:00
rafaelsideguide
b66553867e
reverting delete, fixed express bug on checkCredits
2024-08-21 09:28:20 -03:00
rafaelsideguide
138437d616
commenting out delete, crashing on fire-engine
2024-08-21 08:11:24 -03:00
rafaelsideguide
5e48bec1fd
commenting out delete, crashing on fire-engine
2024-08-21 08:10:46 -03:00
Nicolas
90b32f16c8
Nick: fixes
2024-08-20 21:38:11 -03:00
Nicolas
819ad50af3
Update fireEngine.ts
2024-08-20 21:16:33 -03:00
rafaelsideguide
e9d6ca197e
tests passing now
2024-08-20 20:00:41 -03:00
Nicolas
1b3ad60a2c
Reapply "Merge pull request #561 from mendableai/bug/dealing-with-dns-error"
...
This reverts commit ffe11a5bf7.
2024-08-20 19:22:09 -03:00
Nicolas
441628998f
Reapply "Merge pull request #561 from mendableai/bug/dealing-with-dns-error"
...
This reverts commit ffe11a5bf7.
2024-08-20 19:16:48 -03:00
Nicolas
ffe11a5bf7
Revert "Merge pull request #561 from mendableai/bug/dealing-with-dns-error"
...
This reverts commit 2030ec6031, reversing changes made to f494d2b707.
2024-08-20 18:16:11 -03:00
Gergő Móricz
5818236659
fix: remove rawHtml properly
2024-08-20 22:51:12 +02:00
rafaelsideguide
0f48823c9c
Merge remote-tracking branch 'origin/v1-webscraper' into v1/node-sdk
2024-08-20 17:24:59 -03:00
rafaelsideguide
9d64c8eedc
screenshot should not be undefined also
2024-08-20 17:24:45 -03:00
Gergő Móricz
1368f9a87f
fix: treat existing screenshot as a scraper success condition
2024-08-20 22:24:18 +02:00
rafaelsideguide
70d81ca69e
Merge remote-tracking branch 'origin/v1-webscraper' into v1/node-sdk
2024-08-20 17:08:07 -03:00
rafaelsideguide
537fc689b6
fixing request
2024-08-20 17:08:02 -03:00
rafaelsideguide
f98be7d94e
Update fireEngine.ts
2024-08-20 16:53:01 -03:00
Nicolas
0c48c8a436
Nick: billing for map
2024-08-20 16:43:46 -03:00
Gergő Móricz
f494d2b707
prioritize search lower
2024-08-20 21:18:31 +02:00
Nicolas
59eb552df5
Merge pull request #560 from mendableai/feat/cancel-fire-engine
...
[Feat] Added check job and cancel to fire-engine requests
2024-08-20 15:43:31 -03:00
rafaelsideguide
1f27182a13
added try catch
2024-08-20 15:42:39 -03:00
Gergő Móricz
efb91f9ca5
fix(search): delete jobs after done
2024-08-20 20:29:08 +02:00
Nicolas
39388cdc35
Update crawl.ts
2024-08-20 14:41:43 -03:00
Nicolas
674adee144
Merge branch 'v1-webscraper' of https://github.com/mendableai/firecrawl into v1-webscraper
2024-08-20 14:41:05 -03:00
Nicolas
b36faeaf54
Nick:
2024-08-20 14:39:52 -03:00
Gergő Móricz
cf32893c2e
add strict enforcement + move crawlerOptions to top-level in /crawl
2024-08-20 19:31:26 +02:00
rafaelsideguide
e326249a57
added check job and cancel to fire-engine requests
2024-08-20 14:26:42 -03:00
Gergő Móricz
70d50b3640
fix(queue-worker): move dotenv config up
2024-08-20 19:25:19 +02:00
Nicolas
c5ad4dedeb
Update crawl.ts
2024-08-20 14:19:20 -03:00
Nicolas
de0dc20a02
Update credit_billing.ts
2024-08-20 14:18:14 -03:00
Nicolas
5abd26a267
Nick: set the crawl limit to the remaining credits
2024-08-20 14:16:54 -03:00
Nicolas
e200ec9e12
Nick:
2024-08-20 12:24:14 -03:00
Nicolas
55dad82df1
Nick: fixed map search
2024-08-20 12:17:53 -03:00
Nicolas
27903247b6
Nick: map tests and fixes
2024-08-20 12:04:08 -03:00
rafaelsideguide
fa89d2e535
v1 support for crawl/monitor status
2024-08-20 10:37:24 -03:00
rafaelsideguide
7727302ef1
Merge remote-tracking branch 'origin/v1-webscraper' into v1/node-sdk
2024-08-20 09:22:55 -03:00
rafaelsideguide
e160d5529a
fixed test
2024-08-20 09:22:38 -03:00
rafaelsideguide
e1c9cbf709
bug fixed. crawl should not stop if sitemap url is invalid
2024-08-20 09:11:58 -03:00
rafaelsideguide
ecd472356b
added variables to beta customers
2024-08-19 16:41:54 -03:00
Nicolas
3dc298be54
Nick: 2x rate limits for standard and growth for /scrape
2024-08-19 13:52:54 -03:00
rafaelsideguide
32aba44167
fixing merge issues
2024-08-19 13:37:20 -03:00
rafaelsideguide
72461ce9a6
Update index.test.ts
2024-08-19 13:29:52 -03:00
rafaelsideguide
fd7fdc1d52
added blocklist middleware
2024-08-19 13:28:54 -03:00
rafaelsideguide
e516e4993b
Merge remote-tracking branch 'origin/v1-webscraper' into v1/node-sdk
2024-08-19 11:24:57 -03:00
Nicolas
8e4ca86463
Update crawl.ts
2024-08-19 11:02:24 -03:00
Nicolas
36b35dbc67
Update crawl.ts
2024-08-19 11:01:26 -03:00
rafaelsideguide
4ffc60596a
Update queue-worker.ts
2024-08-19 09:29:23 -03:00
rafaelsideguide
b8170aaa47
Update blocklist.ts
2024-08-19 08:51:48 -03:00
Nicolas
3fe82b4f12
Update queue-worker.ts
2024-08-17 03:09:31 -04:00
Nicolas
f797380112
Nick:
2024-08-16 22:17:38 -04:00
Nicolas
47123be783
Nick: weird activity block
2024-08-16 22:01:56 -04:00
Nicolas
ff84f1fe5e
Update map.ts
2024-08-16 20:42:36 -04:00
Nicolas
4314313477
Update map.ts
2024-08-16 19:56:18 -04:00
Nicolas
af9a0a6f0b
Update map.ts
2024-08-16 19:56:03 -04:00
Nicolas
ba5279eafc
Nick: all tests passing
2024-08-16 19:55:44 -04:00
Nicolas
5205c5f005
Update map.ts
2024-08-16 19:37:00 -04:00
Nicolas
0c05d096a9
Merge branch 'v1-webscraper' of https://github.com/mendableai/firecrawl into v1-webscraper
2024-08-16 19:33:58 -04:00
Nicolas
ab48353226
Nick: /map almost good
2024-08-16 19:33:57 -04:00
Gergő Móricz
eb84673b06
feat: crawl status websocket WIP
2024-08-17 01:04:14 +02:00
Gergő Móricz
e2a6ef26d3
mount v1Router under v1 path
2024-08-16 23:48:50 +02:00
Gergő Móricz
4c1b74dab3
fix(map): remove robots.txt
2024-08-16 23:46:10 +02:00
Gergő Móricz
c281fe62c0
fix(crawl): propagate db fix to preview endpoint
2024-08-16 23:43:54 +02:00
Gergő Móricz
803577eeba
feat(crawl): webhook
2024-08-16 23:42:48 +02:00
Gergő Móricz
e6738abf96
fix(crawl-status): retrieve from DB in bulk
2024-08-16 23:39:39 +02:00
rafaelsideguide
086ba6280b
fixed markdown format
2024-08-16 18:39:13 -03:00
Gergő Móricz
aabfaf0ac5
clean up crawl-status, fix db ddos
2024-08-16 23:29:39 +02:00
rafaelsideguide
e5b807ccc4
Merge branch 'v1-webscraper' of https://github.com/mendableai/firecrawl into v1-webscraper
2024-08-16 17:57:31 -03:00
rafaelsideguide
7a61325500
map + search + scrape markdown bug
2024-08-16 17:57:11 -03:00
Gergő Móricz
5896153d19
fix: crawl status and redis fixes
2024-08-16 22:52:48 +02:00
Gergő Móricz
3fcb21930e
remove log
2024-08-16 22:48:23 +02:00
Gergő Móricz
f20328bdbb
crawl status and document stuff
2024-08-16 22:48:05 +02:00
Nicolas
0c057bb649
Update index.test.ts
2024-08-16 16:45:10 -04:00
Nicolas
b32464558a
Update index.test.ts
2024-08-16 16:41:09 -04:00
Nicolas
5bac7988a6
Update index.test.ts
2024-08-16 16:08:38 -04:00
Nicolas
290c7ee936
Update index.test.ts
2024-08-16 16:06:46 -04:00
Nicolas
23a033fe61
Nick: fixes and more e2e tests
2024-08-16 16:03:35 -04:00
Nicolas
78ca94251c
Merge pull request #480 from mendableai/nsc/hyper-v81
...
Reduce metrics ingestion w/ HyperDX v0.8.1
2024-08-16 14:34:14 -04:00
Nicolas
37ae9a9043
Update index.test.ts
2024-08-16 14:17:43 -04:00
Nicolas
200ce8e2ce
Merge branch 'v1-webscraper' of https://github.com/mendableai/firecrawl into v1-webscraper
2024-08-16 14:16:35 -04:00
Nicolas
21d3798e49
Nick: initial e2e v1 tests for /scrape
2024-08-16 14:16:30 -04:00
rafaelsideguide
3f998b688d
scrape ready
2024-08-16 15:14:37 -03:00
Nicolas
b0d211ecc1
Merge branch 'main' into v1-webscraper
2024-08-16 13:43:28 -04:00
Gergő Móricz
fd6432e7fd
fix(queue-worker): correct job success
2024-08-16 19:16:08 +02:00
Gergő Móricz
6e54942265
fix(queue-worker): add cancelled to crawl log
2024-08-16 19:11:53 +02:00
rafaelsideguide
9b1cb266a0
added origin to request types
2024-08-16 13:49:50 -03:00
Gergő Móricz
d0a8382a5b
fix(queue-worker): crawl finishing race condition
2024-08-16 18:48:52 +02:00
Gergő Móricz
6bd52e63bf
fix(queue-worker): fix linksOnPage undefined error
2024-08-16 18:42:24 +02:00
Gergő Móricz
5a6570cba2
fix(webhooks): call back with parent crawl ID
2024-08-16 17:42:42 +02:00
rafaelsideguide
7d324bd2c8
Create checkCredits.ts
2024-08-16 11:21:52 -03:00
Nicolas
ec361609d2
Nick: added growth-2x plan
2024-08-15 18:37:19 -04:00
Gergő Móricz
8b7569f8f3
add zod, create middleware, update openapi declaration, add crawl logic
2024-08-15 23:30:33 +02:00
Gergő Móricz
4165de1773
v1 restructure
2024-08-15 21:51:59 +02:00
Gergő Móricz
af08ab0b1a
fix bad module resolution
2024-08-15 21:51:52 +02:00
Nicolas
c917c8fbcd
Merge branch 'main' into v1-webscraper
2024-08-15 15:14:29 -04:00
Nicolas
32c6b1f136
Nick: remove active job alerts
2024-08-15 14:50:30 -04:00
Gergő Móricz
0c14366720
fix: add checkandupdateurl to crawlPreview
2024-08-15 20:30:25 +02:00
Nicolas
81b2479db3
Merge pull request #459 from mendableai/feat/queue-scrapes
...
feat: Move scraper to queue
2024-08-15 14:19:55 -04:00
Gergő Móricz
fc08ff450d
search port
2024-08-15 20:10:59 +02:00
Nicolas
86326f34e9
Update single_url.test.ts
2024-08-15 13:48:42 -04:00
Gergő Móricz
129a882bcc
fix(scrape): give scrapes their real job id
2024-08-15 19:29:47 +02:00
Gergő Móricz
965a5817d1
fix(queue-worker): log jobs correctly
2024-08-15 19:27:15 +02:00
Gergő Móricz
dad9d353d9
use thomas's url validation
2024-08-15 19:19:02 +02:00
Gergő Móricz
e3279274f1
fix: make playground crawl work
2024-08-15 19:14:32 +02:00
Gergő Móricz
c5597bc722
fix: robots.txt loading
2024-08-15 19:11:07 +02:00
Gergő Móricz
29f0d9ec94
propagate priority to fire-engine
2024-08-15 19:04:46 +02:00
Gergő Móricz
b79d3d1754
fix
2024-08-15 19:02:05 +02:00
Gergő Móricz
57730f6a35
priority changes
2024-08-15 18:58:07 +02:00
Gergő Móricz
846610681b
fix: fix posthog, add dummy crawl DB items
2024-08-15 18:55:18 +02:00
rafaelsideguide
81066cf90a
updating cargo package name and version
2024-08-15 10:11:27 -03:00
Nicolas
6e1074cdd1
Update website_params.ts
2024-08-14 17:39:54 -04:00
Thomas Kosmas
6410e1a81d
Update params
2024-08-15 00:10:14 +03:00
rafaelsideguide
697501cc8a
Merge remote-tracking branch 'origin/main' into f/rust-sdk
2024-08-14 17:30:43 -03:00
Gergő Móricz
8a5cad72f6
fix(queue-worker): variable name collision
2024-08-14 22:02:05 +02:00
Gergő Móricz
b8ec40dd72
fix(crawl): submit sitemapped jobs in bulk
2024-08-14 20:34:19 +02:00
Gergő Móricz
2ca1017fc3
fix(crawl): make request 0 of crawl jobs higher priority
2024-08-14 19:34:18 +02:00
Gergő Móricz
f4466f6bb0
fix(test-suite): add artillery
2024-08-14 19:33:09 +02:00
Gergő Móricz
cfad067a63
fix(fly): change proxy limits
2024-08-14 18:52:40 +02:00
Gergő Móricz
a6c81f9d62
fix: return all data when calling webhook
2024-08-14 17:53:47 +02:00
rafaelsideguide
f86d2bb291
added go-sdk as submodule
2024-08-13 18:17:35 -03:00
Nicolas
e2472b9b0d
Merge remote-tracking branch 'origin/v1/mockup-controllers' into v1-webscraper
2024-08-13 16:34:57 -04:00
Gergo Moricz
2e5e480cc2
fix(crawl): call webhooks
2024-08-13 22:10:17 +02:00
Gergo Moricz
a33596de3c
fix(log_job): add crawl_id
2024-08-13 22:03:46 +02:00
Gergo Moricz
9252940b52
fix(crawl-status): sort data
2024-08-13 21:55:13 +02:00
Gergo Moricz
8dbac0268c
feat: offload crawl results to the DB
2024-08-13 21:40:59 +02:00
Gergo Moricz
4bbc9db1df
fix: prioritize scrape jobs over crawl jobs
2024-08-13 21:31:34 +02:00
Gergo Moricz
5f2af37880
fix(scrape): remove scrape job from queue after the job is done
2024-08-13 21:26:41 +02:00
Gergo Moricz
2413e33359
fix(queue-worker): remove console.log
2024-08-13 21:07:36 +02:00
Gergo Moricz
d7549d4dc5
feat: remove webScraperQueue
2024-08-13 21:03:24 +02:00
Gergő Móricz
4a2c37dcf5
Merge branch 'main' into feat/queue-scrapes
2024-08-13 20:53:49 +02:00
Gergo Moricz
86e136beca
feat: crawl to scrape conversion
2024-08-13 20:51:43 +02:00
rafaelsideguide
a4be95ac27
fixed tests
2024-08-13 13:42:26 -03:00
KentHsu
fd060c7ef1
fix: go-sdk module name
2024-08-13 10:10:45 +08:00
Rafael Miller
76160a38db
Update single_url.ts
2024-08-12 17:57:00 -03:00
Rafael Miller
7c339ea125
Update single_url.ts
2024-08-12 17:55:10 -03:00
Nicolas
09ca165d2e
Merge pull request #531 from kevinswiber/fix/respect-docker-env-file-comments
...
Self-host fix: Moving comments of .env.example values from end-of-line to above-line.
2024-08-12 16:54:56 -04:00
Nicolas
d06f40810c
Merge pull request #515 from wahpiangle/main
...
Update redis urls in example .env
2024-08-12 16:51:59 -04:00
Nicolas
6810338271
Update search.ts
2024-08-12 16:51:43 -04:00
Thomas Kosmas
98be29c963
Update parameters for platform.openai.com
2024-08-12 22:49:28 +03:00
rafaelsideguide
c3aeed510b
Update single_url.ts
2024-08-12 16:40:31 -03:00
Kevin Swiber
33aa5cf0de
Moving comments of .env.example values from end-of-line to above-line. Self-host docs suggest using .env.example as a base. However, Docker doesn't respect end-of-line comments. It sets the comment as the actual value of the variable. This fix prevents that.
2024-08-12 12:24:46 -07:00
Nicolas
74a5125185
Nick: removed redlock
2024-08-12 15:07:30 -04:00
Nicolas
0bd1a820ee
Update auth.ts
2024-08-12 13:42:09 -04:00
Nicolas
25a899eae3
Nick: redlock cache in auth
2024-08-12 13:37:47 -04:00
rafaelsideguide
bbed6ef23d
added validation on every USE_DB_AUTHENTICATION call
2024-08-12 14:20:41 -03:00
Rafael Miller
36e4b2cf49
Update .env.example
2024-08-12 10:37:00 -03:00
Quan Ming
a96ad4b0e2
Update redis url to use comment
2024-08-10 12:33:26 +08:00
Kevin Swiber
ba2af74adf
Ensuring USE_DB_AUTHENTICATION is true in single URL scraper.
2024-08-09 15:29:18 -07:00
Nicolas
e28c415cf4
Nick:
2024-08-09 14:07:46 -04:00
Gergo Moricz
5a778f2c22
fix(js-sdk): add type metadata to exports
2024-08-09 20:05:36 +02:00
Rafael Miller
6a78f6fe78
Merge pull request #497 from KentHsu/feat/add-go-sdk
...
[Feat] Add Go SDK implementation
2024-08-09 14:58:20 -03:00
rafaelsideguide
0591000b64
bugfix includes excludes
2024-08-09 14:30:41 -03:00
Kent (Chia-Hao), Hsu
1fda882983
Merge branch 'mendableai:main' into feat/add-go-sdk
2024-08-10 00:46:15 +08:00
Quan Ming
0221872a70
Update redis urls in example .env
2024-08-10 00:40:11 +08:00
rafaelsideguide
b802ea02a1
small improvements
...
- wait for results on crawl: sometimes crawl takes a second to save the data to the DB, which causes response.data to be empty
- added timeout value to test script
- increased http client timeout (llm extract was failing on e2e tests)
- fixed env path on test script
2024-08-09 11:13:14 -03:00
rafaelsideguide
0b8df5e264
python sdk and tests
2024-08-08 14:25:09 -03:00
Nicolas
f1f5605010
Update website_params.ts
2024-08-08 12:31:58 -04:00
rafaelsideguide
cf9d77d889
typescript fixes
2024-08-08 11:41:13 -03:00
Nicolas
b0abad07da
Merge pull request #496 from tak-s/improve-logging-level
...
Improve logs
2024-08-07 22:01:12 -04:00
rafaelsideguide
c16437e933
fixed bunch of types
2024-08-07 17:05:18 -03:00
Gergo Moricz
920b7f2f44
fix(runWebScraper): don't filter empty docs
2024-08-07 21:00:22 +02:00
Gergo Moricz
55ec96c23f
fix(queue-worker): bad job lock extension time
2024-08-07 20:24:16 +02:00
Gergo Moricz
ab7a35c581
fix(queue-worker): log lock extensions
2024-08-07 19:49:48 +02:00
Gergo Moricz
a1c2ee5aa9
fix: always complete job, no try
2024-08-07 19:39:09 +02:00
Gergo Moricz
191dfbd9ca
fix: move to completed in one place
2024-08-07 18:49:58 +02:00
Nicolas
457c082ba1
Nick: fixed tests
2024-08-07 11:08:53 -04:00
Nicolas
8a992b1596
Merge branch 'main' of https://github.com/mendableai/firecrawl
2024-08-07 10:40:06 -04:00
Nicolas
b12e1157cc
Nick: v35 bump
2024-08-07 10:40:00 -04:00
Gergő Móricz
5fc7fcb77c
Merge branch 'main' into feat/queue-scrapes
2024-08-07 16:35:44 +02:00
Gergo Moricz
fe9fdb578b
revert bad hotfixes
2024-08-07 16:34:25 +02:00
Gergo Moricz
b7c01dcb9b
fix(webScraperQueue): reduce retries to 2
2024-08-07 16:31:50 +02:00
Gergo Moricz
cdf7bad5b4
fix(runWebScraper): don't move to completed
2024-08-07 15:20:56 +02:00
Gergo Moricz
9df8719efa
fix(queue-worker): raise queue log level to info
2024-08-07 14:56:04 +02:00
Gergo Moricz
7bb922071c
fix(queue-worker): manually renew lock (testing)
2024-08-07 14:35:20 +02:00
Gergo Moricz
8216266d16
fix(scrape_log): display error properly
2024-08-07 14:19:20 +02:00
Gergo Moricz
2e2e80d679
fix(scrape-events): updateScrapeResult fix
2024-08-07 14:17:50 +02:00