1137 Commits

Author SHA1 Message Date
rafaelsideguide
81aa919262 fix 2024-07-25 17:47:43 -03:00
Nicolas
10e80f00cf Merge branch 'main' into nsc/admin-router 2024-07-25 16:46:38 -04:00
Nicolas
11e6b2680e Merge pull request #455 from mendableai/feat/scrape-monitoring
Add scrape monitoring
2024-07-25 16:27:07 -04:00
Nicolas
e5b797549e Merge branch 'main' into feat/scrape-monitoring 2024-07-25 16:21:02 -04:00
Nicolas
50d2426fc4 Update scrape-events.ts 2024-07-25 16:20:29 -04:00
Nicolas
28a8a98491 Update admin.ts 2024-07-25 14:58:14 -04:00
Nicolas
2014d9dd2e Nick: admin router 2024-07-25 14:54:20 -04:00
Nicolas
a75d6889c7 Merge pull request #450 from mendableai/feat/logger
[wip] Added logger
2024-07-25 14:40:19 -04:00
rafaelsideguide
1f1c068eea changing from error to debug 2024-07-25 10:00:50 -03:00
rafaelsideguide
e720e1bacf Merge remote-tracking branch 'origin/main' into feat/logger 2024-07-25 09:49:27 -03:00
rafaelsideguide
309728a482 updated logs 2024-07-25 09:48:06 -03:00
Nicolas
2c1221750b Merge pull request #449 from mendableai/bugfix/malformed-url-sitemap
Added regex for links in sitemap
2024-07-24 20:37:35 -04:00
Gergő Móricz
d1a3df6d08 fix: aaaaahhh 2024-07-25 00:50:03 +02:00
Nicolas
6ad7e24403 Update ingestion.tsx 2024-07-24 18:15:51 -04:00
Gergő Móricz
6798695ee4 feat: move scraper to queue 2024-07-25 00:14:25 +02:00
Nicolas
92843a356d Merge branch 'main' of https://github.com/mendableai/firecrawl 2024-07-24 18:13:36 -04:00
Nicolas
1e13ddbe8e Nick: changes to the ui component 2024-07-24 18:13:34 -04:00
Gergő Móricz
623b547292 fix(fly.toml): scale up memory limit 2024-07-24 23:39:00 +02:00
Nicolas
15890772be Scale bump 2024-07-24 16:56:19 -04:00
Eric Ciarla
a4bccbe3bb Firecrawl UI Template
Firecrawl UI template
2024-07-24 15:05:55 -04:00
Eric Ciarla
a62c0730c1 Delete package-lock.json 2024-07-24 15:00:19 -04:00
Eric Ciarla
4cb091ad05 Update .gitignore 2024-07-24 14:59:34 -04:00
Eric Ciarla
4596d0b2e6 Add ReadMe and LICENSE 2024-07-24 14:56:53 -04:00
Eric Ciarla
9654721bf2 Vite commit 2024-07-24 14:27:50 -04:00
rafaelsideguide
cc98f83fda added failed and completed log events 2024-07-24 15:25:36 -03:00
Jakob Stadlhuber
2dc7be3869 Remove liveness and readiness probes from worker.yaml
This commit removes the liveness and readiness probes configuration from the Kubernetes worker manifest. Additionally, a Service definition for the worker application has been removed. These changes might be necessary to update the deployment strategy or simplify the configuration.
2024-07-24 19:38:54 +02:00
Jakob Stadlhuber
d68f349109 Update Kubernetes YAMLs and add worker service
Refactored container configurations in worker, api, and playwright-service YAMLs to streamline syntax and add missing fields. Added a service definition for the worker component and included a new environment variable in the configmap for rate-limiting. These changes enhance configuration clarity and ensure proper resource definitions.
2024-07-24 19:31:37 +02:00
Jakob Stadlhuber
f26bda2477 Update Docker build paths in Kubernetes setup README
Corrected relative paths for Docker build commands to ensure the appropriate directories are targeted. This fix is crucial for successful image builds and deployment consistency in the Kubernetes cluster setup.
2024-07-24 19:06:19 +02:00
Jakob Stadlhuber
895e80caa4 Add liveness and readiness probes to Kubernetes configs
Introduced liveness and readiness probes for the Playwright service, API, and worker components. This ensures that Kubernetes can better manage the health and availability of these services by periodically checking their endpoints. This enhancement will improve the robustness and reliability of the deployed applications.
2024-07-24 19:00:23 +02:00
Jakob Stadlhuber
be9e7f9edf Update Kubernetes configs for playwright-service, api, and worker
Added new ConfigMap for playwright-service and adjusted existing references.
Applied imagePullPolicy: Always to ensure all images are updated promptly.
Updated README to include --no-cache for Docker build instructions.
2024-07-24 18:54:16 +02:00
Gergo Moricz
60c74357df feat(ScrapeEvents): log queue events 2024-07-24 18:44:14 +02:00
Jakob Stadlhuber
497aa5d25e Update Kubernetes configs for playwright-service, api, and worker
Added new ConfigMap for playwright-service and adjusted existing references.
Applied imagePullPolicy: Always to ensure all images are updated promptly.
Updated README to include --no-cache for Docker build instructions.
2024-07-24 17:55:45 +02:00
rafaelsideguide
4eca6bd301 fix/check-for-auth-on-scrape-log 2024-07-24 12:54:14 -03:00
Nicolas
4ead89f983 Merge pull request #453 from mendableai/nsc/notion-fix
Notion Website Fixes
2024-07-24 11:40:19 -04:00
Nicolas
3a1b8a9797 Update website_params.ts 2024-07-24 11:04:47 -04:00
Nicolas
8b48ec8d30 Update website_params.ts 2024-07-24 11:02:20 -04:00
Gergo Moricz
4d35ad073c feat(monitoring/scrape): include url, worker, response_size 2024-07-24 16:43:39 +02:00
Gergo Moricz
64bcedeefc fix(monitoring): bad success check on scrape 2024-07-24 16:21:59 +02:00
Gergo Moricz
d57dbbd0c6 fix: add jobId for scrape 2024-07-24 15:18:12 +02:00
Gergo Moricz
71072fef3b fix(scrape-events): bad logic 2024-07-24 14:46:41 +02:00
Gergo Moricz
7cd9bf92e3 feat: scrape event logging to DB 2024-07-24 14:31:25 +02:00
Rafael Miller
5e728c1a4d Update apps/api/src/scraper/WebScraper/crawler.ts
no need for regex

Co-authored-by: Gergő Móricz <mo.geryy@gmail.com>
2024-07-24 08:33:00 -03:00
Eric Ciarla
1b7a00624d Delete old comp 2024-07-23 21:51:08 -04:00
Eric Ciarla
565bc09439 Basic react app 2024-07-23 21:48:11 -04:00
rafaelsideguide
6208ecdbc0 added logger 2024-07-23 17:30:46 -03:00
Eric Ciarla
a0d89169ed init 2024-07-23 15:48:12 -04:00
Nicolas
f0b07b509b Update index.ts 2024-07-23 15:15:56 -04:00
rafaelsideguide
a684bd3c5d added regex for links in sitemap 2024-07-23 09:07:23 -03:00
Nicolas
252bc09ee2 Merge pull request #447 from mendableai/nsc/speed-improvements
/scrape should now be 600ms-900ms faster
2024-07-22 19:18:24 -04:00
Nicolas
ac692ef09c Update CONTRIBUTING.md 2024-07-22 19:17:53 -04:00