GitHub - LocoreMind/locoagent: AI-powered social media agent with real browser automation

What is LocoAgent?

LocoAgent is an AI-powered social media agent that autonomously operates social media accounts through real browser automation. It combines an LLM-driven agentic loop with the agent-browser CLI to perceive, decide, and act on live web pages, performing tasks such as liking posts, writing replies, following users, and publishing content.

Key differentiators:

Real browser, real sessions — Operates through Chrome CDP with your actual login cookies, not API hacks
Platform skill system — Injects full platform operation playbooks (32+ operations for X.com) so the agent completes composite tasks in one pass
Workflow engine — Pure browser-automation pipelines that run without LLM involvement, controlled by the agent as a supervisor
Operation log — Persistent deduplication across sessions prevents repeated actions
Multi-provider LLM — Works with any OpenAI-compatible API (OpenRouter, DeepSeek, Ollama, etc.)

Installation

Prerequisites

Requirement	Version	Notes
Bun	Latest	Runtime and package manager
Node.js	>= 18	Required by some dependencies
agent-browser	Latest	Browser automation CLI
Git	Any	For context features

Setup

git clone https://github.com/LocoreMind/locoagent.git
cd locoagent
bun install

Configuration

Create a .env file in the project root (auto-loaded at startup):

# LLM Provider (pick one)

# Option A: OpenRouter (access 200+ models)
CLAUDE_CODE_USE_OPENAI=1        # Enable OpenAI-compatible provider
OPENAI_API_KEY=sk-or-v1-...
OPENAI_BASE_URL=https://openrouter.ai/api/v1
OPENAI_MODEL=anthropic/claude-sonnet-4.5

# Option B: DeepSeek (with thinking mode support)
CLAUDE_CODE_USE_OPENAI=1
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.deepseek.com
OPENAI_MODEL=deepseek-v4-flash

# Option C: Anthropic direct (omit CLAUDE_CODE_USE_OPENAI)
ANTHROPIC_API_KEY=sk-ant-...

# Agent behavior
SKIP_PERMISSIONS=1               # Required for non-interactive/automated mode
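
The three provider options above come down to a small decision over environment variables. The sketch below only illustrates that decision; it is not LocoAgent's startup code. `resolveProvider`, `ProviderConfig`, and `Env` are hypothetical names, and the sketch assumes that `CLAUDE_CODE_USE_OPENAI=1` selects the OpenAI-compatible path and that its absence falls back to `ANTHROPIC_API_KEY`, as the comments in the example suggest.

```typescript
// Hypothetical helper: names and shapes are illustrative, not LocoAgent's API.
type ProviderConfig =
  | { kind: 'openai-compatible'; apiKey: string; baseUrl: string; model: string }
  | { kind: 'anthropic'; apiKey: string }

type Env = Record<string, string | undefined>

function resolveProvider(env: Env): ProviderConfig {
  if (env.CLAUDE_CODE_USE_OPENAI === '1') {
    // Options A and B: the three OPENAI_* values are expected together.
    const apiKey = env.OPENAI_API_KEY
    const baseUrl = env.OPENAI_BASE_URL
    const model = env.OPENAI_MODEL
    if (!apiKey || !baseUrl || !model) {
      throw new Error('OPENAI_API_KEY, OPENAI_BASE_URL and OPENAI_MODEL must all be set')
    }
    return { kind: 'openai-compatible', apiKey, baseUrl, model }
  }
  // Option C: Anthropic direct (CLAUDE_CODE_USE_OPENAI omitted).
  const apiKey = env.ANTHROPIC_API_KEY
  if (!apiKey) {
    throw new Error('Set ANTHROPIC_API_KEY, or enable CLAUDE_CODE_USE_OPENAI=1')
  }
  return { kind: 'anthropic', apiKey }
}
```

Failing fast on a half-filled configuration is a design choice of this sketch: a missing base URL or model is easier to diagnose at startup than as a failed API call mid-session.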

Run

# Interactive mode
bun start

# Single query (headless)
bun start -p "open X.com and like the first post about AI agents"

# With specific model
bun start --model anthropic/claude-sonnet-4.5

Model Providers

LocoAgent supports any OpenAI-compatible API through a built-in translation shim. The rest of the system is provider-agnostic.

Provider	Base URL	Notes
OpenRouter	`https://openrouter.ai/api/v1`	Access 200+ models
DeepSeek	`https://api.deepseek.com`	Thinking mode (`reasoning_content`) fully supported
OpenAI	`https://api.openai.com/v1`	GPT-4o, o1, etc.
Ollama	`http://localhost:11434/v1`	Local models
LM Studio	`http://localhost:1234/v1`	Local models
Anthropic	(native SDK)	Set `ANTHROPIC_API_KEY` only
AWS Bedrock	(native SDK)	AWS credentials
Google Vertex AI	(native SDK)	GCP credentials
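
To make the idea of a translation shim more concrete, here is a minimal sketch of the request-side mapping such a shim might perform: an Anthropic-style request (top-level `system` plus `messages`) becomes an OpenAI-style chat request, where the system prompt is the first message. The type and function names are hypothetical and only plain-text content is handled. LocoAgent's real shim presumably also deals with tools, streaming, and fields such as `reasoning_content`, none of which are shown here.

```typescript
// Illustrative only: simplified shapes of the two request formats.
interface AnthropicStyleRequest {
  model: string
  system?: string
  max_tokens: number
  messages: { role: 'user' | 'assistant'; content: string }[]
}

interface OpenAIStyleRequest {
  model: string
  max_tokens: number
  messages: { role: 'system' | 'user' | 'assistant'; content: string }[]
}

// Hypothetical name. Moves the top-level system prompt into a leading
// system message and optionally swaps in the provider's model ID.
function toOpenAIRequest(
  req: AnthropicStyleRequest,
  modelOverride?: string,
): OpenAIStyleRequest {
  const messages: OpenAIStyleRequest['messages'] = []
  if (req.system) messages.push({ role: 'system', content: req.system })
  for (const m of req.messages) messages.push({ role: m.role, content: m.content })
  return { model: modelOverride ?? req.model, max_tokens: req.max_tokens, messages }
}
```

Keeping the mapping in one function at the provider boundary is what would let the rest of the system stay provider-agnostic.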

Browser Automation

LocoAgent uses the agent-browser CLI to control a real Chrome browser via CDP (Chrome DevTools Protocol).

Why Chrome CDP?

Social media platforms often detect and block headless browsers and API-based automation. LocoAgent instead operates through a copy of your real Chrome profile, with the same cookies, login sessions, and fingerprint.

Setup

# One-time: copy Chrome profile + launch with CDP
bun run setup-chrome

# agent-browser connects to the running Chrome
agent-browser connect 9222

How the Agent Uses It

agent-browser open https://x.com/home        # Navigate
agent-browser snapshot -i                      # Perceive: get interactive elements with @ref IDs
agent-browser click @e5                        # Act: click a like button
agent-browser fill @e3 "Great research!"       # Act: type in a reply box
agent-browser screenshot result.png            # Verify: capture result

The full agent-browser CLI reference is embedded in the agent's system prompt, so it knows every command natively.
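
The perceive step returns interactive elements tagged with ref IDs, and the act step targets one of them. As a rough sketch of that hand-off, the helper below picks a ref out of snapshot text. The line shape it parses (`- button "Like" [ref=e5]`) is an assumption about the snapshot output, not something this README specifies, and `parseSnapshot` and `findRef` are hypothetical names.

```typescript
interface SnapshotElement {
  role: string
  name: string
  ref: string // in the "@e5" form used by click/fill
}

// Assumed line shape: - button "Like" [ref=e5]
const SNAPSHOT_LINE = /^\s*-\s+(\w+)\s+"([^"]*)".*\[ref=(e\d+)\]/

// Collect every line that carries a role, a quoted name, and a ref.
function parseSnapshot(snapshot: string): SnapshotElement[] {
  const elements: SnapshotElement[] = []
  for (const line of snapshot.split('\n')) {
    const m = SNAPSHOT_LINE.exec(line)
    if (m) elements.push({ role: m[1], name: m[2], ref: '@' + m[3] })
  }
  return elements
}

// Return the ref of the first element with the given role whose name matches.
function findRef(snapshot: string, role: string, name: RegExp): string | undefined {
  return parseSnapshot(snapshot).find(el => el.role === role && name.test(el.name))?.ref
}
```

A ref returned by `findRef` could then be passed to `agent-browser click` or `agent-browser fill`. In LocoAgent itself the LLM reads the snapshot and chooses the ref, so a parser like this would mainly matter in scripted workflows.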

Platform Skills

Skills are operation playbooks loaded on demand via slash commands. Each skill injects a complete manual into the agent's context, enabling composite task execution in one pass.

Available Skills

Platform	Command	Operations	Description
X.com	`/x-com`	32+	Browse, engage, post, social graph, profile, navigation, lists

Usage

# Interactive: load skill then give task
> /x-com open home timeline, like first 3 posts about AI, reply to the best one

# Headless
bun start -p "/x-com like 5 posts about 'large language models', then follow the authors"

Adding a New Platform

mkdir -p skills/linkedin

Create skills/linkedin/SKILL.md:

---
description: "LinkedIn platform operations playbook"
allowed-tools:
  - Bash
user-invocable: true
---

# LinkedIn Operations

## 1. Navigation
...

The skill is auto-discovered at startup and becomes available as /linkedin.
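
As a sketch of what discovery involves, the function below reads the frontmatter of a SKILL.md file and derives the slash command from the directory name. It is an illustration under stated assumptions: `parseSkill` and `SkillMeta` are hypothetical names, the parser handles only the small frontmatter subset shown above, and a real loader would more likely use a proper YAML parser.

```typescript
interface SkillMeta {
  command: string        // e.g. "/linkedin", derived from the directory name
  description: string
  allowedTools: string[]
  userInvocable: boolean
}

// Parses the "---" delimited frontmatter block at the top of SKILL.md.
// Returns null when the file has no frontmatter.
function parseSkill(dirName: string, fileText: string): SkillMeta | null {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(fileText)
  if (!match) return null

  const meta: SkillMeta = {
    command: '/' + dirName,
    description: '',
    allowedTools: [],
    userInvocable: false,
  }
  let inTools = false
  for (const raw of match[1].split(/\r?\n/)) {
    // Indented "- item" lines belong to the allowed-tools list.
    const item = /^\s+-\s+(.+)$/.exec(raw)
    if (item && inTools) {
      meta.allowedTools.push(item[1].trim())
      continue
    }
    const kv = /^([\w-]+):\s*(.*)$/.exec(raw)
    if (!kv) continue
    inTools = kv[1] === 'allowed-tools'
    if (kv[1] === 'description') meta.description = kv[2].replace(/^"|"$/g, '')
    if (kv[1] === 'user-invocable') meta.userInvocable = kv[2].trim() === 'true'
  }
  return meta
}
```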

Workflow Engine

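Workflows are scripted pipelines that run outside the agent's LLM loop. The agent acts as a supervisor: it can inspect status and start or stop workflows, but execution itself is scripted automation rather than step-by-step agent decision-making. Individual executors may still call an LLM API for a single step, as the search-and-reply workflows do when generating replies.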

Built-in Workflows

Workflow	ID	Schedule	Description
HuggingFace Papers Fetcher	`hf-daily-papers`	Daily	Fetch paper list, abstracts, and thumbnails from HuggingFace
HuggingFace → X.com	`hf-papers-to-x`	Daily	Full pipeline: fetch HF papers → download thumbnails → post as tweets
X.com Search & Reply	`x-search-reply`	Daemon	Search X.com → read posts → generate AI reply → post reply
LinkedIn Search & Comment	`linkedin-search-reply`	Daemon	Search LinkedIn → read posts → generate AI comment → post comment

CLI

bun run workflow list                          # List all workflows + status
bun run workflow run --id hf-papers-to-x       # Run once (blocking)
bun run workflow start --id hf-papers-to-x     # Run once (background)
bun run workflow daemon --id x-search-reply --interval 3   # Run every 3 min
bun run workflow stop --id x-search-reply      # Stop at next checkpoint
bun run workflow status                        # Show status of all workflows
bun run workflow history --id hf-papers-to-x   # Show execution history

Creating a Custom Workflow

Workflows are code-driven pipelines that can include browser automation, LLM API calls, or any scripted logic. You can create your own in two files:

Step 1. Create the definition — workflows/<id>.json:

{
  "id": "my-workflow",
  "name": "My Custom Workflow",
  "description": "What this workflow does",
  "schedule": "daily",
  "executor": "executors/my-workflow.ts",
  "config": {
    "searchQuery": "ai agent",
    "maxPosts": 5,
    "cdpPort": 9222
  }
}

Step 2. Create the executor — workflows/executors/my-workflow.ts:

#!/usr/bin/env bun
import { execSync } from 'node:child_process'

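// Parse config from workflow engine (passed as: --config '<json>')
const configArg = process.argv.find((_, i, a) => a[i - 1] === '--config')
if (!configArg) {
  console.error('[my-workflow] Missing --config <json>')
  process.exit(1)
}
const config = JSON.parse(configArg)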

// agent-browser helper
function ab(cmd: string): string {
  return execSync(`agent-browser --cdp ${config.cdpPort} ${cmd}`, {
    encoding: 'utf-8', timeout: 30000,
  }).trim()
}

// Logs go to stderr (visible during execution)
console.error('[my-workflow] Step 1: ...')
// ... your automation logic using ab() ...

// Final JSON summary goes to stdout (last line, required)
console.log(JSON.stringify({ stepsCompleted: 1, stepsTotal: 1 }))

Step 3. Test:

bun run workflow run --id my-workflow

The executor contract: accept --config <json>, log to stderr, output a JSON summary with stepsCompleted and stepsTotal as the last line on stdout.
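
Seen from the engine's side, the contract has two halves: pass the `config` object as a JSON string after `--config`, then read the summary from the last stdout line. The sketch below illustrates both halves. It is not the code in scripts/workflow-engine.ts; `buildExecutorArgs`, `parseSummary`, and the two interfaces are hypothetical names modeled on the example definition above.

```typescript
// Mirrors the fields of the example workflow definition.
interface WorkflowDefinition {
  id: string
  name: string
  description: string
  schedule: string
  executor: string
  config: Record<string, unknown>
}

interface RunSummary {
  stepsCompleted: number
  stepsTotal: number
}

// First half: the arguments an engine might hand to the executor script.
function buildExecutorArgs(def: WorkflowDefinition): string[] {
  return [def.executor, '--config', JSON.stringify(def.config)]
}

// Second half: take the last non-empty stdout line and validate it.
function parseSummary(stdout: string): RunSummary {
  const lines = stdout.split('\n').map(l => l.trim()).filter(l => l.length > 0)
  const last = lines[lines.length - 1]
  if (last === undefined) throw new Error('executor produced no stdout')

  const parsed: unknown = JSON.parse(last)
  if (typeof parsed !== 'object' || parsed === null) {
    throw new Error('summary is not a JSON object')
  }
  const { stepsCompleted, stepsTotal } = parsed as Record<string, unknown>
  if (typeof stepsCompleted !== 'number' || typeof stepsTotal !== 'number') {
    throw new Error('summary must contain numeric stepsCompleted and stepsTotal')
  }
  return { stepsCompleted, stepsTotal }
}
```

Reading only the last line is why progress logs belong on stderr: anything printed to stdout after the summary would be mistaken for it.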

For the full development guide covering deduplication, checkpoint protocol, LLM integration, and daemon mode, see docs/workflow-development-guide.md.

Operation Log

Persistent memory across sessions. The agent checks the log before acting and records every action after — preventing duplicate likes, follows, and replies.

# Check before acting (exit 0 = already done, exit 1 = not done)
bun run scripts/log-operation.ts check \
  --platform x --action like --url "https://x.com/.../status/123"

# Record after acting
bun run scripts/log-operation.ts add \
  --platform x --action like --url "https://x.com/.../status/123" \
  --status success --note "AI agents research post"

# View recent operations
bun run scripts/log-operation.ts recent --limit 20

# 30-day summary (auto-injected into system prompt at startup)
bun run scripts/log-operation.ts summary --days 30

State stored in persona/operation-log.json (human-readable JSON).
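
The check-then-record pattern can be sketched as two small functions over an in-memory list. Everything here is an assumption for illustration: the `Operation` fields echo the CLI flags shown above but are not the actual schema of operation-log.json, the `failed` status is hypothetical, and the sketch treats only successful entries as "already done".

```typescript
interface Operation {
  platform: string
  action: string
  url: string
  status: 'success' | 'failed' // 'failed' is a hypothetical second value
  note?: string
  timestamp: string // ISO 8601
}

type OperationTarget = Pick<Operation, 'platform' | 'action' | 'url'>

// Assumed dedup key: same action on the same URL on the same platform.
function operationKey(op: OperationTarget): string {
  return [op.platform, op.action, op.url].join('|')
}

// Mirrors `check`: true means the action was already performed.
function alreadyDone(log: Operation[], target: OperationTarget): boolean {
  const key = operationKey(target)
  return log.some(op => op.status === 'success' && operationKey(op) === key)
}

// Mirrors `add`: returns a new log rather than mutating the old one.
function recordOperation(log: Operation[], op: Operation): Operation[] {
  return [...log, op]
}
```

Ignoring failed entries in `alreadyDone` means a failed like can be retried in a later session; whether the real log behaves this way is not stated in this README.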

Task Scheduling

Structured daily/weekly task execution replaces ad-hoc prompts.

Define Tasks

Edit persona/tasks.md:

## Daily Tasks
1. Engage with relevant content (like posts matching topic queries)
2. Monitor own project mentions
3. Leave 1 technical comment on the most relevant post

## Weekly Tasks (Monday)
4. Follow 3-5 relevant researchers
5. Post 1 original tweet about recent research findings

## Session Constraints
| Action   | Max per session |
|----------|----------------|
| Likes    | 10             |
| Comments | 2              |
| Follows  | 5              |
| Posts    | 1              |

Run

bun run run-tasks              # Execute today's tasks
bun run run-tasks:dry          # Preview the prompt without running
bun run run-tasks -- --platform x   # Restrict to one platform
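
The Session Constraints table is plain markdown, so enforcing it starts with turning its rows into numbers. The sketch below shows one way that could look. It is not taken from scripts/run-tasks.ts: `parseSessionConstraints` and `withinLimit` are hypothetical names, and in LocoAgent the limits may simply be passed to the agent as part of the prompt rather than enforced in code.

```typescript
// Turn rows like "| Likes    | 10             |" into { likes: 10, ... }.
// The header and separator rows are skipped because their second cell
// is not a number.
function parseSessionConstraints(markdown: string): Record<string, number> {
  const limits: Record<string, number> = {}
  for (const line of markdown.split('\n')) {
    const m = /^\|\s*([A-Za-z ]+?)\s*\|\s*(\d+)\s*\|\s*$/.exec(line.trim())
    if (m) limits[m[1].toLowerCase()] = Number(m[2])
  }
  return limits
}

// True while one more action of this kind is still allowed this session.
// Actions without a listed limit are treated as unrestricted.
function withinLimit(
  limits: Record<string, number>,
  counts: Record<string, number>,
  action: string,
): boolean {
  const max: number | undefined = limits[action]
  return max === undefined || (counts[action] ?? 0) < max
}
```

For example, with the table above, `withinLimit(limits, { comments: 2 }, 'comments')` is false because the session has already used both allowed comments.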

Realtime Trajectory Monitor

--print mode (the headless -p runs shown above) is a black box: you do not see intermediate steps while the agent works. The trajectory monitor watches the session log and prints live execution status.

# Terminal 1: start the monitor
bun run tail

# Terminal 2: run the agent
bun start -p "/x-com open timeline, like first post"

Output:

═══ New Task ═══
/x-com open timeline, like first post

[6:30:47 PM] ⚡ Bash: agent-browser connect 9222
[6:30:47 PM] ✓ Result: Done
[6:31:10 PM] ⚡ Bash: agent-browser open https://x.com/home
[6:31:27 PM] ⚡ Bash: agent-browser snapshot -i -c -s 'article'
[6:31:44 PM] ● Agent: Found first post, like button ref=e136
[6:31:44 PM] ⚡ Bash: agent-browser click e136
[6:31:45 PM] ✓ Result: Done

Other monitor commands:

bun run tail:history           # Replay latest session from beginning
bun run tail:list              # List recent sessions
bun run tail <id>              # Watch specific session

Project Structure

locoagent/
├── src/
│   ├── entrypoints/         # CLI entry point
│   ├── services/api/        # LLM provider layer (multi-provider shim)
│   ├── tools/               # 43 tool implementations
│   ├── skills/              # Bundled skills (20)
│   ├── commands/            # Slash commands (90+)
│   ├── components/          # Terminal UI components (130+)
│   ├── screens/             # REPL screen
│   ├── hooks/               # React hooks (80+)
│   ├── services/mcp/        # MCP server management
│   ├── query.ts             # Agentic loop engine
│   └── constants/prompts.ts # System prompt assembly
├── scripts/
│   ├── setup-chrome.sh      # Chrome CDP setup
│   ├── log-operation.ts     # Operation log CLI
│   ├── run-tasks.ts         # Task scheduler
│   ├── tail-agent.ts        # Trajectory monitor
│   └── workflow-engine.ts   # Workflow lifecycle manager
├── workflows/
│   ├── *.json               # Workflow definitions
│   ├── executors/           # Workflow executor scripts
│   └── state.json           # Workflow state persistence
├── persona/
│   ├── tasks.md             # Task schedule definitions
│   └── operation-log.json   # Action history for dedup
├── docs/                    # Public documentation (tracked in git)
├── internal-docs/           # Internal documentation (gitignored)
├── .env                     # Local config (auto-loaded)
└── package.json

Tech Stack

Component	Technology
Runtime	Bun
Language	TypeScript (TSX)
UI	React + Ink (terminal rendering, custom fork)
CLI	Commander.js
Browser automation	agent-browser + Chrome CDP
LLM integration	Multi-provider (Anthropic SDK + OpenAI-compatible shim)
Extension protocol	MCP (Model Context Protocol)

Contributing

Contributions welcome. Key areas:

New platform skills — Add playbooks for LinkedIn, Reddit, etc.
New workflows — Automated pipelines for content creation/distribution (see docs/workflow-development-guide.md)
New tools — Extend agent capabilities
Bug fixes — Especially in browser automation edge cases

License

MIT License. See LICENSE for details.
