GitHub - jadbox/solomonagent: This project provides a command-line interface (CLI) tool to interact with web pages. It fetches page content, summarizes it using an AI model, and allows users to select and interact with identified actions (like forms or links) on the page.

Solomon's Agent: CLI interaction to a more concise web

Screenshot 2025-06-06 124815

This project provides a command-line interface (CLI) tool for interacting with the web from the terminal. It fetches page content, summarizes it using an AI model, and lets users select and interact with identified actions (such as forms or links) on the page. The tool acts as a smart filter for the web: it parses a page to surface only the essential content and the most common next steps, streamlining tasks like reading the news, researching on Wikipedia, or getting a quick weather update.

Prompt a webpage from the CLI:

Screenshot 2025-06-06 120950

Interact with a simple page form:

Screenshot 2025-06-06 121008 Screenshot 2025-06-06 121026

Project Structure

The project is organized into a modular structure for better maintainability and scalability:

.
├── src/
│   ├── types.ts            # Defines shared TypeScript interfaces (e.g., PageAction).
│   ├── browserUtils.ts     # Handles Playwright browser automation and Chrome profile detection.
│   ├── aiUtils.ts          # Contains logic for AI-powered page summarization using OpenAI/Gemini.
│   └── cliHandler.ts       # Manages all command-line user interactions using @clack/prompts.
├── index.ts                # The main entry point, orchestrating the application flow.
├── package.json            # Project metadata and dependencies.
├── tsconfig.json           # TypeScript configuration.
├── .gitignore              # Specifies intentionally untracked files to ignore.
└── README.md               # Project documentation.
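As an illustration of the kind of shared type that src/types.ts is described as holding, here is a minimal sketch of a PageAction shape. The field names (kind, description, target, fields) and the describeAction helper are assumptions made for this example; they are not taken from the repository.

```typescript
// Hypothetical sketch only: the real PageAction in src/types.ts may differ.
type PageActionKind = "link" | "form";

interface FormField {
  name: string;  // input name submitted with the form
  label: string; // human-readable prompt shown in the CLI
}

interface PageAction {
  kind: PageActionKind;
  description: string;  // short label for the CLI menu
  target: string;       // href for a link, action URL for a form
  fields?: FormField[]; // only meaningful when kind is "form"
}

// Render one action as a single menu line.
function describeAction(action: PageAction): string {
  if (action.kind === "form") {
    const names = (action.fields ?? []).map((f) => f.name).join(", ");
    return `[form] ${action.description} (fields: ${names || "none"})`;
  }
  return `[link] ${action.description} -> ${action.target}`;
}
```

Keeping the action shape in one module would let the browser, AI, and CLI modules exchange actions without depending on each other's internals.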

How to Run

  1. Set up Environment Variable: Ensure your Gemini API key is set as an environment variable:

    export GEMINI_API_KEY="YOUR_GEMINI_API_KEY"

  2. Install Dependencies (use either Bun or npm):

    bun install

    or

    npm install

  3. Run the Application:

    Node example, relying on Node's TypeScript type stripping:

    npm run start https://example.com
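The steps above imply two inputs at startup: the GEMINI_API_KEY environment variable and a URL argument. A minimal sketch of validating both before any work begins might look like the following. The loadConfig function and RunConfig type are hypothetical names introduced for this example, not part of the project.

```typescript
// Hypothetical startup check; names are illustrative, not from the repository.
interface RunConfig {
  apiKey: string;
  url: string;
}

function loadConfig(
  env: Record<string, string | undefined>,
  args: string[],
): RunConfig {
  const apiKey = env["GEMINI_API_KEY"]?.trim();
  if (!apiKey) {
    throw new Error("GEMINI_API_KEY is not set; export it before running.");
  }

  const rawUrl = args[0];
  if (!rawUrl) {
    throw new Error("Usage: npm run start <url>");
  }

  let parsed: URL;
  try {
    parsed = new URL(rawUrl);
  } catch {
    throw new Error(`Not a valid URL: ${rawUrl}`);
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }

  // href is the normalized form, e.g. "https://example.com/".
  return { apiKey, url: parsed.href };
}
```

In a Node or Bun entry point this could be called as loadConfig(process.env, process.argv.slice(2)). Passing the environment in as a parameter keeps the function easy to test.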

Features

  • Web Page Fetching: Retrieves HTML content and title from a given URL.
  • AI Summarization: Uses a configured AI model (Gemini) to summarize page content concisely.
  • Action Identification: Identifies potential user actions (forms, links) on the page based on AI analysis.
  • Interactive CLI: Provides a user-friendly command-line interface to select and interact with identified page actions.
  • Chrome Profile Detection: [WIP] Automatically detects common Chrome user profile paths for persistent browser sessions (though currently configured for non-persistent headless mode).
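To make the fetching and action-identification features more concrete, here is a rough sketch of pulling a title, readable text, and link actions out of raw HTML. It uses plain regular expressions as a stand-in; the project itself is described as using Playwright and AI analysis for these steps, so this is a different, simpler technique. All function names are hypothetical, and the AI summarization call is omitted because it depends on the model client.

```typescript
// Simplified stand-in, not the project's implementation (which is described
// as relying on Playwright and an AI model). All names here are illustrative.
interface LinkAction {
  label: string;
  href: string;
}

// Pull the <title> text, or return "" when there is none.
function extractTitle(html: string): string {
  const m = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return m?.[1]?.trim() ?? "";
}

// Drop scripts, styles, and tags, leaving text suitable for a summarizer.
function htmlToText(html: string): string {
  return html
    .replace(/<(script|style)\b[\s\S]*?<\/\1>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Collect anchors with visible text, resolving relative hrefs against baseUrl.
function extractLinks(html: string, baseUrl: string): LinkAction[] {
  const actions: LinkAction[] = [];
  const re = /<a\b[^>]*?\bhref\s*=\s*["']([^"']+)["'][^>]*>([\s\S]*?)<\/a>/gi;
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    const label = (m[2] ?? "").replace(/<[^>]+>/g, "").trim();
    if (!label) continue; // skip anchors with no visible text
    actions.push({ label, href: new URL(m[1] ?? "", baseUrl).href });
  }
  return actions;
}
```

Regular expressions are fragile on real-world HTML, so a browser-driven or parser-based approach is the safer choice outside of a sketch like this.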

Plug


This is a silly project I built while looking for new opportunities building AI-powered platforms and tools. If you're hiring, message me on LinkedIn: https://www.linkedin.com/in/jonathandunlap/
