Matt Joyce
cd7f260288
Added PyPI classifiers
...
These classifiers will help potential users find the project and understand its purpose and status. Python 3.8 is used as the base because that is what the 'requests' module needs.
2024-05-26 11:33:28 +10:00
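Note: as a rough illustration of the classifiers commit above, the package metadata might look like the sketch below. The commit message does not list the classifiers that were added, so the strings and the `PACKAGE_METADATA` name are assumptions chosen for illustration; only the Python 3.8 floor comes from the message.

```python
# Hypothetical packaging metadata. Only the 3.8 baseline is taken from the
# commit message; the classifier choices are illustrative assumptions.
PACKAGE_METADATA = {
    "python_requires": ">=3.8",
    "classifiers": [
        "Development Status :: 5 - Production/Stable",  # assumed status
        "Intended Audience :: Developers",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.8",
        "Topic :: Internet :: WWW/HTTP :: Indexing/Search",
    ],
}
```

Both keys correspond to keyword arguments accepted by `setuptools.setup`, so a dictionary like this could be unpacked into that call.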
Matt Joyce
e5c6ac23fe
Added long description to PyPI
...
https://packaging.python.org/en/latest/guides/making-a-pypi-friendly-readme/
2024-05-26 10:01:35 +10:00
Nicolas
e5c8719554
Update blocklist.ts
2024-05-24 14:53:04 -07:00
Rafael Miller
4ce28593b2
Merge pull request #132 from mendableai/feat/idempotency-key
...
[Feat] Added idempotency key to crawl route
2024-05-24 14:22:50 -03:00
rafaelsideguide
d39860c08b
Merge branch 'main' into feat/idempotency-key
2024-05-24 14:15:37 -03:00
Nicolas
605ba4c031
Merge pull request #178 from mattjoyce/main
...
Fix FIRECRAWL_API_URL bug, also various PyLint fixes
2024-05-24 09:49:42 -07:00
Nicolas
8c380d70a5
Update firecrawl.py
2024-05-24 09:48:48 -07:00
Nicolas
65fe9c4f80
Merge branch 'main' into main
2024-05-24 09:47:12 -07:00
Rafael Miller
53a7ec0f6e
Removed hard coded timeout
2024-05-24 13:46:16 -03:00
Nicolas
e0d979edad
Merge pull request #176 from mendableai/bug/data-check-in-python-sdk
...
[Bug] Added data check for python SDK
2024-05-24 09:45:39 -07:00
Nicolas
53a214cefb
Merge pull request #168 from mendableai/nsc/allowed-keywords-in-blocklist
...
feat: Allow privacy/legal/other pages in social media websites
2024-05-24 09:43:15 -07:00
Nicolas
e166c07690
Merge pull request #170 from qyou/fix-hardcode-timeout
...
update: wait until body attached in playwright-service
2024-05-24 09:41:27 -07:00
Rafael Miller
c00580eabd
Merge pull request #181 from JakobStadlhuber/feat/proxy-support
...
[Feat] Added proxy and media blocking support for Playwright
2024-05-24 13:34:34 -03:00
Jakob Stadlhuber
9fc5a0ff98
Update comment in .env.example for proxy settings
...
This commit modifies the comment in .env.example to specify that the proxy settings are for Playwright. The clarification gives users clearer context about when and why these proxy settings are used.
2024-05-24 17:45:59 +02:00
Jakob Stadlhuber
b001aded46
Add proxy and media blocking configurations
...
Updated environment variables and application settings to include proxy configurations and a media blocking option. The proxy settings allow users to route requests through a proxy service, while media blocking is an optional feature that can help save bandwidth. Changes have been made in the .env.example, docker-compose.yaml, and main.py files.
2024-05-24 17:41:34 +02:00
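Note: the configuration described in the commit above could be read from the environment roughly as follows. This is a sketch, not the code from main.py: the variable names `PROXY_SERVER`, `PROXY_USERNAME`, `PROXY_PASSWORD`, and `BLOCK_MEDIA`, and both function names, are assumptions made for illustration.

```python
import os


def playwright_proxy_settings(env=os.environ):
    """Return a proxy dict for Playwright, or None when no proxy is configured.

    The environment variable names are assumed, not taken from the commit.
    """
    server = env.get("PROXY_SERVER")
    if not server:
        return None
    proxy = {"server": server}
    username = env.get("PROXY_USERNAME")
    password = env.get("PROXY_PASSWORD")
    # Only attach credentials when both parts are present.
    if username and password:
        proxy["username"] = username
        proxy["password"] = password
    return proxy


def should_block_media(env=os.environ):
    """Treat BLOCK_MEDIA (assumed name) as an opt-in flag, off by default."""
    return env.get("BLOCK_MEDIA", "false").strip().lower() == "true"
```

The returned dict uses the `server`, `username`, and `password` keys that Playwright's `proxy` option expects, so it could be passed when launching a browser or creating a context.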
Rafael Miller
6a5b9ca314
Merge pull request #180 from mendableai/added-issue-templates
...
Update issue templates
2024-05-24 10:17:40 -03:00
Rafael Miller
9562c837eb
Update issue templates
2024-05-24 09:34:43 -03:00
rafaelsideguide
c201ea1986
added idempotency key to python sdk
2024-05-23 12:52:59 -03:00
rafaelsideguide
35927a65a5
Merge branch 'main' into feat/idempotency-key
2024-05-23 12:20:06 -03:00
rafaelsideguide
184e4678f1
bugfix on idempotency key check
2024-05-23 11:47:04 -03:00
Matt Joyce
96630154d3
Merge pull request #1 from mendableai/main
...
Fix FIRECRAWL_API_URL bug, also various PyLint fixes
2024-05-23 09:16:03 +10:00
Matt Joyce
106c18d11f
Use truthiness check for 'success' key in API response
...
PyLint C0121
2024-05-23 08:57:53 +10:00
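Note: PyLint C0121 flags comparisons such as `== True`. A minimal sketch of the kind of change the commit above describes, assuming a response dictionary with `success` and `data` keys (the function name and error message are hypothetical):

```python
def extract_data(response):
    """Return the payload of a successful API response.

    Before the change, the check would have looked like:
        if response["success"] == True:
    which PyLint reports as C0121 (singleton-comparison).
    """
    if response.get("success"):  # truthiness check instead of '== True'
        return response.get("data")
    raise ValueError("API response did not indicate success")
```

One design consequence: a truthiness check also accepts other truthy values such as `1` or a non-empty string, which is usually acceptable for a JSON boolean field.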
Matt Joyce
5c21aed9c7
adding pylintrc to allow longer lines
2024-05-23 08:45:56 +10:00
Matt Joyce
48e91c89e7
Removed unnecessary If block
...
PyLint R1731
2024-05-23 08:42:07 +10:00
Matt Joyce
7d2efe5acb
Added request timeouts
...
Set the connection timeout to 5 seconds and the response timeout to 10 seconds.
PyLint W3101
2024-05-23 08:39:19 +10:00
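Note: PyLint W3101 warns about HTTP calls made without a timeout. With the `requests` library, a `(connect, read)` tuple sets both values from the commit above. The sketch below takes the HTTP function as a parameter so it runs without a network; the helper and constant names are hypothetical.

```python
CONNECT_TIMEOUT_S = 5   # seconds allowed to establish the connection
READ_TIMEOUT_S = 10     # seconds allowed to wait for the server's response


def get_with_timeouts(url, http_get):
    """Call http_get with explicit timeouts.

    http_get is any callable with the same signature as requests.get,
    for example: get_with_timeouts("https://example.com", requests.get)
    """
    return http_get(url, timeout=(CONNECT_TIMEOUT_S, READ_TIMEOUT_S))
```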
Matt Joyce
96b19172a1
Removed trailing whitespace
...
PyLint C0303: Trailing whitespace (trailing-whitespace)
2024-05-23 08:30:23 +10:00
Matt Joyce
6216c85322
Time module already imported
...
Pylint
W0404: Reimport 'time' (imported line 16) (reimported)
C0415: Import outside toplevel (time) (import-outside-toplevel)
2024-05-23 08:21:32 +10:00
Matt Joyce
8adf2b7132
Added Docstrings for functions
...
PyLint C0116: Missing function or method docstring (missing-function-docstring)
2024-05-23 08:20:32 +10:00
Matt Joyce
971e1f85c4
Added module docstring
...
PyLint C0114 - missing-module-docstring
2024-05-23 08:03:58 +10:00
Matt Joyce
8d041c05b4
rearranged logic for FIRECRAWL_API_URL
...
It would not use the ENV unless the param was set to None, which was counter-intuitive.
2024-05-23 08:00:56 +10:00
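Note: one way to express the lookup order the commit above implies is: explicit argument first, then the `FIRECRAWL_API_URL` environment variable, then a built-in default. This is a sketch under assumptions, not the SDK's code; `resolve_api_url` and the default URL value are illustrative.

```python
import os

DEFAULT_API_URL = "https://api.firecrawl.dev"  # assumed default, for illustration


def resolve_api_url(api_url=None, env=os.environ):
    """Pick the API base URL.

    Defaulting the parameter to None (rather than to the URL itself) is
    what lets the environment variable apply when the caller passes nothing.
    """
    if api_url:
        return api_url
    return env.get("FIRECRAWL_API_URL", DEFAULT_API_URL)
```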
Nicolas
4e39701644
Update main.py
2024-05-22 12:59:56 -07:00
Rafael Miller
df0550d2f6
Merge pull request #143 from mendableai/bug/crawl-limit
...
[Bug] Fixing /crawl limit
2024-05-22 14:51:54 -03:00
Nicolas
3aa5f26627
Update main.py
2024-05-22 10:45:43 -07:00
Nicolas
3e63985e53
Update main.py
2024-05-22 10:40:47 -07:00
rafaelsideguide
4dfc371241
Update index.test.ts
2024-05-22 14:38:41 -03:00
rafaelsideguide
f4a3469b9e
Merge branch 'main' into bug/crawl-limit
2024-05-22 14:27:28 -03:00
Nicolas
0d187f0425
Merge pull request #77 from tractorjuice/patch-1
...
Add additional file extensions to crawler.ts
2024-05-22 10:16:49 -07:00
rafaelsideguide
f9ae1729b6
Update firecrawl.py
2024-05-22 09:40:38 -03:00
Nicolas
f915b08016
Merge pull request #174 from mendableai/nsc/fire-engine-beta
...
beta: Fire-Engine fallback
2024-05-21 19:25:51 -07:00
Nicolas
cb2bd0e71f
Update index.test.ts
2024-05-21 19:03:32 -07:00
Nicolas
253abb849f
Update rate-limiter.ts
2024-05-21 18:53:58 -07:00
Nicolas
229b9908d2
Nick: only enable hyper dx in prod
2024-05-21 18:52:46 -07:00
Nicolas
a8ff295977
Update single_url.ts
2024-05-21 18:50:42 -07:00
Nicolas
a5e718b084
Nick: improvements
2024-05-21 18:34:23 -07:00
Nicolas
2e264a4c75
Update ci.yml
2024-05-21 13:24:09 -07:00
Nicolas
6285f12cd1
Merge pull request #167 from mendableai/nsc/hyper-dx-integration
...
feat: HyperDX Integration
2024-05-21 13:19:38 -07:00
youqiang
c47dae13a9
update: wait until body attached in playwright-service
2024-05-21 14:53:57 +08:00
Nicolas
7f64fe884a
Update blocklist.ts
2024-05-20 17:26:01 -07:00
Nicolas
756f54466d
Nick: allowed keywords for now
2024-05-20 17:24:21 -07:00
Nicolas
01783dc336
Update openapi.json
2024-05-20 17:10:55 -07:00