Nicolas
|
2c1221750b
|
Merge pull request #449 from mendableai/bugfix/malformed-url-sitemap
Added regex for links in sitemap
|
2024-07-24 20:37:35 -04:00 |
|
Nicolas
|
92843a356d
|
Merge branch 'main' of https://github.com/mendableai/firecrawl
|
2024-07-24 18:13:36 -04:00 |
|
Nicolas
|
1e13ddbe8e
|
Nick: changes to the ui component
|
2024-07-24 18:13:34 -04:00 |
|
Gergő Móricz
|
623b547292
|
fix(fly.toml): scale up memory limit
|
2024-07-24 23:39:00 +02:00 |
|
Nicolas
|
15890772be
|
Scale bump
|
2024-07-24 16:56:19 -04:00 |
|
rafaelsideguide
|
4eca6bd301
|
fix/check-for-auth-on-scrape-log
|
2024-07-24 12:54:14 -03:00 |
|
Nicolas
|
3a1b8a9797
|
Update website_params.ts
|
2024-07-24 11:04:47 -04:00 |
|
Nicolas
|
8b48ec8d30
|
Update website_params.ts
|
2024-07-24 11:02:20 -04:00 |
|
Rafael Miller
|
5e728c1a4d
|
Update apps/api/src/scraper/WebScraper/crawler.ts
no need for regex
Co-authored-by: Gergő Móricz <mo.geryy@gmail.com>
|
2024-07-24 08:33:00 -03:00 |
|
Nicolas
|
f0b07b509b
|
Update index.ts
|
2024-07-23 15:15:56 -04:00 |
|
rafaelsideguide
|
a684bd3c5d
|
added regex for links in sitemap
|
2024-07-23 09:07:23 -03:00 |
|
Nicolas
|
30e706b43f
|
Update scrape.ts
|
2024-07-22 19:15:24 -04:00 |
|
Nicolas
|
8916fec66c
|
Update index.ts
|
2024-07-22 19:14:53 -04:00 |
|
Nicolas
|
575ddc9e6e
|
Update scrape.ts
|
2024-07-22 19:12:51 -04:00 |
|
Nicolas
|
e31a5007d5
|
Nick: speed improvements
|
2024-07-22 18:30:58 -04:00 |
|
Nicolas
|
b229fbebd8
|
Update scrape_log.ts
|
2024-07-19 12:53:26 -04:00 |
|
rafaelsideguide
|
5c02dbe20c
|
fix(isFile): added .tiff extension
|
2024-07-18 17:07:21 -03:00 |
|
Gergo Moricz
|
f0e95ce399
|
fix(WebCrawler): filter out file URLs when taking URLs from sitemap
|
2024-07-18 21:49:37 +02:00 |
|
Gergo Moricz
|
95c6c63b85
|
fix(fly): raise heap limit to 4G per process
|
2024-07-18 20:56:54 +02:00 |
|
Nicolas
|
5f14f4f788
|
Update blocklist.ts
|
2024-07-18 14:20:19 -04:00 |
|
Nicolas
|
6161b83890
|
Update scrape_log.ts
|
2024-07-18 14:17:08 -04:00 |
|
Nicolas
|
2dd7398aad
|
Update scrape_log.ts
|
2024-07-18 14:16:46 -04:00 |
|
Nicolas
|
f10f3f886b
|
Merge pull request #410 from mendableai/feat/fire-engine-chrome-cdp
Support chrome-cdp and restructure sitemap fire-engine support.
|
2024-07-18 13:52:08 -04:00 |
|
Nicolas
|
9a1a227797
|
Update crawl-cancel.ts
|
2024-07-18 13:49:51 -04:00 |
|
Nicolas
|
11768571ed
|
Update crawl-cancel.ts
|
2024-07-18 13:43:03 -04:00 |
|
Nicolas
|
ce804d3c20
|
Update crawl-cancel.ts
|
2024-07-18 13:40:24 -04:00 |
|
Nicolas
|
d2de01d342
|
Nick: fixes
|
2024-07-18 13:19:44 -04:00 |
|
Gergo Moricz
|
0b8047c7a0
|
fix(WebScraper): infinite regex leading to fly.io instance hangs
|
2024-07-18 19:13:43 +02:00 |
|
Nicolas
|
f11137352c
|
Merge branch 'main' into feat/fire-engine-chrome-cdp
|
2024-07-18 12:48:42 -04:00 |
|
Nicolas
|
01b5e8fc73
|
Merge pull request #429 from mendableai/mog/fix-job-stuck-2
Fix queue stuck bug via lock settings changes
|
2024-07-18 12:39:21 -04:00 |
|
Nicolas
|
b134ba92bc
|
Merge pull request #427 from mendableai/docs/update-docs
[Docs] Updating docs
|
2024-07-18 11:49:08 -04:00 |
|
rafaelsideguide
|
f13ef02a08
|
Update openapi.json
|
2024-07-18 10:34:03 -03:00 |
|
Nicolas
|
2fab2d8d29
|
Update scrape.ts
|
2024-07-17 20:44:34 -04:00 |
|
Nicolas
|
6609c1b6e5
|
Update .env.local
|
2024-07-17 16:22:27 -04:00 |
|
Nicolas
|
17a1f9b55f
|
Update .env.example
|
2024-07-17 16:22:04 -04:00 |
|
rafaelsideguide
|
eda616d728
|
Merge remote-tracking branch 'origin/main' into docs/update-docs
|
2024-07-17 16:44:51 -03:00 |
|
rafaelsideguide
|
2b4ce12097
|
Update openapi.json
|
2024-07-17 16:43:22 -03:00 |
|
Gergo Moricz
|
8160c311c0
|
fix queue stuck bug via lock setting changes
|
2024-07-17 21:31:25 +02:00 |
|
Caleb Peffer
|
8d5ebc9b9f
|
Merge pull request #423 from mendableai/cjp/linksOnPage
Caleb: Return a list of links on a page by default
|
2024-07-17 12:36:07 -06:00 |
|
Caleb Peffer
|
5b24d26c84
|
Caleb; fixed test
|
2024-07-17 11:33:12 -07:00 |
|
Caleb Peffer
|
c5d1e7260d
|
Caleb: made changes per Rafaels requests
|
2024-07-17 11:29:05 -07:00 |
|
rafaelsideguide
|
205cd63c2f
|
Update openapi.json
|
2024-07-17 15:07:06 -03:00 |
|
Rafael Miller
|
f020048a46
|
Merge pull request #420 from mendableai/bugfix/empty-tags
Small fix for empty pageOptions
|
2024-07-17 10:10:24 -03:00 |
|
Caleb Peffer
|
da3c6bca37
|
Caleb: added a simple test
|
2024-07-16 21:23:22 -07:00 |
|
Caleb Peffer
|
0b3c0ede49
|
Added tests per @nicks request
|
2024-07-16 21:15:59 -07:00 |
|
Caleb Peffer
|
98c788ca7a
|
Caleb: added a test to ensure links on page exists and isn't zero on mendable
|
2024-07-16 21:13:52 -07:00 |
|
Nicolas
|
3c3412e893
|
Update rate-limiter.test.ts
|
2024-07-16 22:45:12 -04:00 |
|
Nicolas
|
ffc3b7c5fb
|
Update index.ts
|
2024-07-16 22:42:40 -04:00 |
|
Nicolas
|
c9073a747c
|
Nick:
|
2024-07-16 22:41:13 -04:00 |
|
Caleb Peffer
|
d39d3be649
|
Caleb: now extracting and returning a list of all links on the page for a customer
|
2024-07-16 18:38:03 -07:00 |
|