Commit Graph

  • d5eee95175 Merge pull request #9 from iOSDevSK/main main intergalacticalvariable 2024-10-04 17:45:35 +0200
  • e7dc0366bf The issue with the server crash has been solved, if the user made a mistake in the URL. Filip Dvoran 2024-10-01 14:41:19 +0200
  • 008e00940b Merge pull request #7 from Ashes47/main intergalacticalvariable 2024-09-30 17:42:21 +0200
  • 5ba38588ee update Ashes47 2024-09-30 19:49:26 +0530
  • d75fc56a51 Merge pull request #1 from Ashes47/patch-1 Anuj Verma 2024-09-30 19:44:58 +0530
  • 4c24f3b810 add screenshots Ashes47 2024-09-30 19:44:26 +0530
  • 228a2f3aca Create docker-compose.yaml Anuj Verma 2024-09-30 19:40:13 +0530
  • a91599a936 Update README.md intergalacticalvariable 2024-09-29 12:45:11 +0200
  • 22b58be001 Update README.md intergalacticalvariable 2024-09-29 12:40:29 +0200
  • 0310b00df9 Update README.md intergalacticalvariable 2024-09-29 12:35:12 +0200
  • 5e947f4cfd Update README.md intergalacticalvariable 2024-09-29 12:20:25 +0200
  • 3ecb275e35 Update README.md intergalacticalvariable 2024-09-28 12:51:20 +0200
  • 20945d733d Update README.md intergalacticalvariable 2024-09-28 10:54:28 +0200
  • e1d45875fe Update README.md intergalacticalvariable 2024-09-28 10:49:56 +0200
  • a846c24307 Update README.md intergalacticalvariable 2024-09-28 10:47:36 +0200
  • 1e2a35e04e Updated README intergalacticalvariable 2024-09-28 08:45:57 +0000
  • 7debf71cb8 Updated Readme intergalacticalvariable 2024-09-28 08:39:52 +0000
  • 4f5d1b519a updated Readme Generic Developer 2024-09-28 08:15:56 +0000
  • 9b0a0b91b8 works ;) Generic Developer 2024-09-28 02:31:42 +0000
  • 025c8b67b0 works with urls Generic Developer 2024-09-28 02:01:01 +0000
  • 1bcfead104 screenshots allmost work Generic Developer 2024-09-28 01:10:17 +0000
  • 3d3863f369 docker markdown working Generic Developer 2024-09-27 23:56:51 +0000
  • 66e5d42d2e cleanup Harsh Gupta 2024-09-18 09:48:30 +0530
  • cc4d316764 fix: Improve logging and validation for cookie setting in web scraping Harsh Gupta (aider) 2024-08-26 17:40:50 +0530
  • 6309cbd7a0 update the start script Harsh Gupta 2024-08-26 17:36:54 +0530
  • 66183d8216 set cookies properly Harsh Gupta 2024-08-15 22:48:52 +0530
  • 49a90ee7d4 feat: use urlToCrawl in cookie url Harsh Gupta (aider) 2024-08-15 22:43:50 +0530
  • dcc3e5294d fix: Update crawler.ts to log cookies Harsh Gupta 2024-08-15 22:43:49 +0530
  • 1a5f2eb408 feat: Read cookies from x-set-cookie header and set those cookies in crawlOpts, the url needs to be read from the request parameters Harsh Gupta (aider) 2024-08-15 22:40:49 +0530
  • fc0023f381 manually set cookie Harsh Gupta 2024-08-15 22:33:58 +0530
  • 953429218a Revert "WIP: Cookie fixes" Harsh Gupta 2024-08-15 21:31:17 +0530
  • 12850d79c7 WIP: Cookie fixes Harsh Gupta 2024-08-15 21:01:49 +0530
  • bb3f6b3199 Update README.md Harsh Gupta 2024-08-15 16:48:09 +0530
  • abef29075c remove empty file Harsh Gupta 2024-08-15 16:38:15 +0530
  • f60a2a19fb update installation/usage instruction Harsh Gupta 2024-08-15 16:30:32 +0530
  • aa0dcea9b0 docs: Add setup and usage instructions to README Harsh Gupta (aider) 2024-08-15 16:05:40 +0530
  • a7fbe3cb38 finish responding properly Harsh Gupta 2024-08-15 15:42:39 +0530
  • 7677ec77ce Parse request headers properly Harsh Gupta (aider) 2024-08-15 15:03:41 +0530
  • 19dc9df9cb more console logs Harsh Gupta 2024-08-15 15:01:48 +0530
  • f6ee7ca6e5 fix: Improve header handling in CrawlerOptions.from() Harsh Gupta (aider) 2024-08-15 14:59:51 +0530
  • c77135490b feat: Add logging for scrapping options context Harsh Gupta 2024-08-15 14:59:49 +0530
  • 77be0d08ff more console logs Harsh Gupta 2024-08-14 20:44:25 +0530
  • 56c1d461ec feat: Add console.log statements to crawler.ts Harsh Gupta (aider) 2024-08-14 20:42:51 +0530
  • 3e2bf6d39d add an express endpoint to run the crawl endpoint Harsh Gupta 2024-08-14 19:34:53 +0530
  • 57b07507d1 feat: Add Express server with crawl endpoint Harsh Gupta (aider) 2024-08-14 19:17:25 +0530
  • 32263c7e9e feat: Add express server for crawling functionality Harsh Gupta 2024-08-14 19:17:23 +0530
  • 1d8b3eae0d increase memory limit Harsh Gupta 2024-08-14 16:59:51 +0530
  • b682ee5bb5 feat: add hello world endpoint in firebase cloud functions Harsh Gupta (aider) 2024-08-14 16:42:03 +0530
  • f927aab144 update deps Harsh Gupta 2024-08-14 16:40:13 +0530
  • 54aae972ae fix the puppeteer thingy Harsh Gupta 2024-08-14 16:39:59 +0530
  • a72373f815 fix: Add try-catch block to handle errors in salvage method Harsh Gupta (aider) 2024-08-14 16:04:57 +0530
  • 888546e064 fix: Make the salvage method private in the puppeteer service Harsh Gupta 2024-08-14 16:04:56 +0530
  • ef138360c2 fix: Remove private modifier from salvage method Harsh Gupta (aider) 2024-08-14 16:01:00 +0530
  • f6f3fc5bea fix: Improve error handling and add retry mechanism in PuppeteerControl Harsh Gupta (aider) 2024-08-14 15:49:13 +0530
  • a3a299fb38 fix: Implement retry mechanism and improve error handling for scraping function Harsh Gupta (aider) 2024-08-14 15:46:41 +0530
  • ddbf0030b4 fix the logger thingy Harsh Gupta 2024-08-14 15:35:20 +0530
  • a3f222638e feat: Add shared module dependencies and exports Harsh Gupta (aider) 2024-08-14 15:15:07 +0530
  • 02abc2aaaa fix: Register Logger class with dependency injection container Harsh Gupta (aider) 2024-08-14 15:11:14 +0530
  • 2d6447e8fc add mock shared libraries Harsh Gupta 2024-08-14 14:53:52 +0530
  • 88a6bd7131 remove submodule shared Harsh Gupta 2024-08-14 14:53:22 +0530
  • cbe4fa94c1 remove alt-text service Harsh Gupta 2024-08-14 14:46:37 +0530
  • 4c957adbce remove PDF extraction functionality Harsh Gupta 2024-08-14 14:44:01 +0530
  • db6cd7d76c fixes Harsh Gupta 2024-08-14 14:41:02 +0530
  • e9ac98a628 fix: Remove new keyword when using RPCReflection Harsh Gupta (aider) 2024-08-14 14:39:32 +0530
  • 4e9b6b7ca5 fix: Update type annotations for mixins and suffixMixins arrays in crawler.ts Harsh Gupta 2024-08-14 14:39:31 +0530
  • 87d9f772c1 more fixes Harsh Gupta 2024-08-14 14:35:07 +0530
  • 2343c1d28b feat: Modify crawler.ts and index.ts to make crawl function usable as a Firebase function Harsh Gupta (aider) 2024-08-14 14:30:08 +0530
  • 80547abf38 fix: Remove unused code and dependencies Harsh Gupta 2024-08-14 14:30:07 +0530
  • c33929afb2 refactor: remove usage of cache Harsh Gupta (aider) 2024-08-14 13:51:35 +0530
  • 127c32abc9 fix: Remove unnecessary code for calculating charge amount Harsh Gupta 2024-08-14 13:51:33 +0530
  • 6804b99533 fix: Remove billing and related flow Harsh Gupta (aider) 2024-08-14 13:49:46 +0530
  • 6c17175c43 fix: Remove unused getChargeAmount function Harsh Gupta 2024-08-14 13:49:44 +0530
  • fbdc266660 fix: Remove auth and user info Harsh Gupta (aider) 2024-08-14 13:48:29 +0530
  • d380599986 strip more stuff Harsh Gupta 2024-08-14 13:47:25 +0530
  • bf27d39f1b fix: Replace estimateToken with a mock implementation Harsh Gupta (aider) 2024-08-14 13:45:02 +0530
  • aeb6ebed67 fix: Remove unnecessary SecurityCompromiseError import and usage Harsh Gupta 2024-08-14 13:45:01 +0530
  • d15b721bfa refactor: Remove rate limiting from crawler.ts Harsh Gupta (aider) 2024-08-14 13:43:04 +0530
  • aa862d4247 fix: Refactor crawler.ts by removing unused imports and code Harsh Gupta 2024-08-14 13:43:03 +0530
  • df58fcb3fa fix: alleviate search performance issue Yanlong Wang 2024-08-09 15:03:24 +0800
  • eb74e9c6f8 fix: remove select element from markdown to walk around turndown performance issue Yanlong Wang 2024-08-09 10:55:36 +0800
  • e4ef6cb0f9 chore: reduce fetch count in search Yanlong Wang 2024-08-09 10:29:50 +0800
  • e529369ba6 fix: search with failed pages Yanlong Wang 2024-08-08 15:49:23 +0800
  • 0dd05b5dab chore: tweak concurrency Yanlong Wang 2024-08-06 17:58:27 +0800
  • 7af2bde01f fix: html rebasing with <base> tag Yanlong Wang 2024-08-06 13:15:10 +0800
  • 40e91853e2 fix Yanlong Wang 2024-08-02 20:10:17 +0800
  • cda0f371e1 feat: updated rate policy Yanlong Wang 2024-08-02 19:39:51 +0800
  • 0a2c0932fd fix Yanlong Wang 2024-08-02 17:13:50 +0800
  • ee632199df fix Yanlong Wang 2024-08-02 17:12:10 +0800
  • 0a33207f8f fix: another approach to suspected DoS abuse Yanlong Wang 2024-08-02 17:04:13 +0800
  • e658e8102c fix yanlong.wang 2024-08-01 20:07:39 +0800
  • f4f189c8e6 fix yanlong.wang 2024-08-01 19:51:53 +0800
  • 54fa5feb7f fix yanlong.wang 2024-08-01 19:49:40 +0800
  • d0a922144d fix yanlong.wang 2024-08-01 19:48:00 +0800
  • 6fb5df97cc fix: abuse of flooding elements yanlong.wang 2024-08-01 19:34:39 +0800
  • 8b7af6d076 fix: ignore match all target selectors for performance yanlong.wang 2024-07-31 14:06:22 +0800
  • a08218506e fix: truncate svg in jsdom yanlong.wang 2024-07-31 13:12:57 +0800
  • 4e5aff3332 debug: log jsdom and turndown operations Yanlong Wang 2024-07-31 11:12:12 +0800
  • 0f239793d2 fix: also recover screenshot pricing Yanlong Wang 2024-07-30 20:11:55 +0800
  • d3f3a8502a fix: revert screenshot behavior and introduce pageshot Yanlong Wang 2024-07-30 20:09:06 +0800
  • 57cbae864e fix: jsdom, cache tolerance, screenshot pricing yanlong.wang 2024-07-30 17:47:26 +0800