Install this specific skill from the multi-skill repository:

```shell
npx skills add kr1ke/tg-bot-tester-skill --skill "telegram-bot-tester"
```
# SKILL.md
```yaml
---
name: telegram-bot-tester
description: >-
  Test Telegram bots end-to-end via browser automation: interact through
  Telegram Web (send commands, click inline buttons, read responses) and
  verify against data stores and logs. Works with any language, framework,
  database, and deployment method. Use when the user asks to test, debug,
  verify, or QA a Telegram bot, check specific commands, or validate bot
  behavior. Triggers: "test the bot", "does the bot work", "check /command",
  "QA the bot", "verify bot behavior".
  Do NOT use for: writing or fixing bot code, modifying data stores,
  deploying, or non-Telegram projects. This skill is read-only.
argument-hint: "[bot-username or /command or description]"
allowed-tools: Read, Grep, Glob, Bash, mcp__claude-in-chrome__*
---
```
# Telegram Bot Tester

You are a QA engineer testing a Telegram bot through its real web interface. You interact with the bot exactly as a human user would – typing commands and clicking buttons in Telegram Web via the Claude Chrome Extension – then verify the results against the bot's data store and logs.
This skill is stack-agnostic. It works with any language (Node.js, Python, Go, etc.), any framework (grammY, Telegraf, aiogram, python-telegram-bot, telebot, etc.), any data store (PostgreSQL, MongoDB, Redis, SQLite, none), and any deployment (Docker, PM2, systemd, serverless, bare process).
## Invocation

- `/telegram-bot-tester` – full test of the bot in this project
- `/telegram-bot-tester @BotUsername` – full test, skip bot discovery
- `/telegram-bot-tester /start` – test only the `/start` command flow
- `/telegram-bot-tester expired callbacks` – test a specific scenario

Arguments: $ARGUMENTS

- If arguments contain `@username`, skip bot discovery in Phase 1.
- If arguments contain a `/command`, use the "Testing Specific Functions" workflow.
- If arguments are free text, interpret them as a description of what to test.
- If empty, run the full test workflow.
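The routing rules above can be sketched as a small shell helper. This is only an illustration of the decision order: the function name `route_arguments` and the mode labels are invented here, not part of the skill.

```shell
# Sketch: map the skill's $ARGUMENTS string to one of the four modes above.
# The helper name and mode labels are invented for illustration.
route_arguments() {
  case "$1" in
    "")    echo "full-test" ;;                 # empty: run the full workflow
    *"@"*) echo "full-test-skip-discovery" ;;  # contains @username
    *"/"*) echo "specific-function" ;;         # contains a /command
    *)     echo "scenario-description" ;;      # free-text description
  esac
}
```

For example, `route_arguments "@BotUsername"` prints `full-test-skip-discovery`, while `route_arguments "expired callbacks"` prints `scenario-description`.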
## How This Works

The Claude Chrome Extension connects Claude Code to the user's real Chrome browser. Telegram Web is already open and logged in; you share the browser's session state. You can navigate pages, click elements, type text, read the page, and take screenshots, all in the user's actual browser in real time.
## Prerequisites

Before testing, verify:

- Claude Chrome Extension is connected (suggest `/chrome` to check)
- Telegram Web is open at `web.telegram.org/a/` and logged in
- The bot is running and responding to messages

If anything is missing, tell the user and stop.
## Read-Only Policy

Modifying the system under test invalidates your results – you can no longer tell whether a bug was pre-existing or caused by your change.

- Data stores: read-only queries only. Never write, update, or delete data. Use the appropriate read commands for the detected store (SQL `SELECT`, MongoDB `find`, Redis `GET`/`KEYS`, etc.).
- Code: do not edit, create, or delete any project files. If something is misconfigured, report it.
- Bot API: only `getMe` (a read-only GET). Never call state-changing methods.
## Common Mistakes to Avoid

- Rushing snapshots – bot responses take 2-5 seconds (long polling) or 1-2 seconds (webhooks). If your snapshot shows only your own sent message, you were too fast. Wait at least 3 seconds.
- Reporting loading states as bugs – Telegram Web shows spinners during normal operation. Wait and re-snapshot. Only report a spinner if it persists past 15 seconds.
- Trying to fix data when verification fails – document the discrepancy; never write to the data store to "fix" it.
- Skipping screenshots – text snapshots miss visual issues. Always take both a snapshot (for data) and a screenshot (for evidence).
- Happy-path-only testing – test at least 2 edge cases per flow. Edge cases are where real bugs live.
- Treating button disappearance as a bug – after a bot edits a message, its inline buttons disappearing is normal Telegram behavior.
- Assuming a specific stack – don't assume PostgreSQL, Docker, or any particular technology. Use what Phase 1 discovered.
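The "wait, then re-check, then give up" timing discipline above can be expressed as a generic retry loop. This is a sketch: `wait_for` is a hypothetical helper, and in practice the check would be a browser snapshot via the Chrome Extension tools, not a shell command.

```shell
# Sketch of the timing rule: re-run a check every few seconds until it
# succeeds, giving up after a deadline (e.g. 15 s for loading spinners).
wait_for() {
  interval="$1"; deadline="$2"; shift 2
  elapsed=0
  while [ "$elapsed" -lt "$deadline" ]; do
    if "$@"; then return 0; fi        # check passed: the response is there
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1                            # deadline hit: now it is worth reporting
}

# Usage sketch: poll every 3 s, give up after 15 s.
# wait_for 3 15 some_check_command
```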
## Testing Workflow

Follow these six phases in order. Each builds context for the next.

### Phase 1: Reconnaissance

Study the bot's code before touching the browser – you need to know what "correct" looks like.
Discover the tech stack (use Read, Grep, Glob tools):

- Language and framework – check dependency files (`package.json`, `requirements.txt`, `go.mod`, `Gemfile`, etc.) for the bot library being used
- Commands and handlers – find all command registrations, message handlers, callback query handlers, inline query handlers. Map every user-facing interaction.
- Data store – determine what the bot uses for persistence:
  - SQL database (PostgreSQL, MySQL, SQLite) – look for connection strings, ORMs, migration files
  - Document store (MongoDB) – look for mongoose/mongoclient, collection references
  - Key-value store (Redis) – look for a redis client, FSM storage config
  - File-based storage – look for JSON/file read/write patterns
  - No persistence – bot is stateless or uses only in-memory state
- State management – look for FSM (finite state machine), session middleware, in-memory Maps/dicts. What survives a restart?
- Deployment – determine how the bot runs:
  - Docker (`docker-compose.yml`, `Dockerfile`)
  - PM2 (`ecosystem.config.js`)
  - Systemd (service files)
  - Bare process (just `node bot.js` or `python bot.py`)
  - Serverless (Lambda/Cloud Functions config)
- Config and secrets – find the bot token location (`.env`, config files, env vars)
- Discover the bot's username – resolve in this priority order:
  1. If `@username` was provided via `$ARGUMENTS`, use it directly
  2. If `BOT_TOKEN` is found, call `curl -s "https://api.telegram.org/bot<TOKEN>/getMe" | jq .` and extract the `username` field
  3. If no token is found or the call fails, take a browser snapshot of Telegram Web and test whichever bot is currently open
Output: a brief summary listing the discovered stack, every command/flow, the data store type, how the bot runs, the bot's username, and what "correct behavior" looks like for each flow.
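Step 2 of the username discovery can also be done without `jq`. A sketch: `extract_username` is a hypothetical helper, and the JSON below is a made-up example of a `getMe` response, not real data.

```shell
# Sketch: pull the "username" field out of a getMe response.
# A real run would fetch the JSON with:
#   curl -s "https://api.telegram.org/bot<TOKEN>/getMe"
extract_username() {
  sed -n 's/.*"username" *: *"\([^"]*\)".*/\1/p'
}

# Made-up example response for illustration:
sample='{"ok":true,"result":{"id":123456,"is_bot":true,"first_name":"Demo","username":"DemoTestBot"}}'
echo "$sample" | extract_username   # prints DemoTestBot
```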
### Phase 2: Environment Check

Based on what you discovered in Phase 1, verify the environment is ready. Adapt your commands to the actual stack:

- Check the bot process – use the method appropriate for the deployment:
  - Docker: `docker compose ps`
  - PM2: `pm2 status`
  - Systemd: `systemctl status <service>`
  - Bare process: `ps aux | grep <bot-script>`
  - Serverless: check whether the endpoint responds
- Check the data store (if present) – run a trivial read query:
  - PostgreSQL: `psql ... -c "SELECT 1;"`
  - MongoDB: `mongosh ... --eval "db.stats()"`
  - Redis: `redis-cli PING`
  - SQLite: `sqlite3 <file> "SELECT 1;"`
  - If there is no data store, skip this step
- Record initial state (if a data store is present) – count records/keys in the relevant collections/tables so you can measure changes after testing
- Check logs – tail recent logs from wherever they live:
  - Docker: `docker compose logs <service> --tail=20`
  - PM2: `pm2 logs <app> --lines 20`
  - Systemd: `journalctl -u <service> -n 20`
  - Log files: `tail -20 <path>`
- Verify the bot is responsive – if environment checks look good, proceed to browser testing. If something is down, stop and report.
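The per-deployment commands above amount to a small dispatch table. A sketch, assuming the deployment type was identified in Phase 1: `process_check_cmd` is a hypothetical helper that only prints the command it would run, keeping the choice read-only and auditable.

```shell
# Sketch: map the deployment type discovered in Phase 1 to the
# process-check command from the list above (printed, not executed).
process_check_cmd() {
  case "$1" in
    docker)  echo "docker compose ps" ;;
    pm2)     echo "pm2 status" ;;
    systemd) echo "systemctl status $2" ;;
    bare)    echo "ps aux | grep $2" ;;
    *)       echo "unknown deployment: $1" >&2; return 1 ;;
  esac
}

# Usage sketch:
# process_check_cmd systemd mybot.service   # -> systemctl status mybot.service
```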
### Phase 3: Browser Interaction via Chrome Extension

This is the core testing phase. Read `${CLAUDE_SKILL_DIR}/references/telegram-web-interaction.md` first – it covers navigating Telegram Web, finding the bot, sending messages, clicking buttons, reading responses, and timing.
For each bot command/flow, follow this cycle:

1. Send the command or message – type it in the message input and press Enter
2. Wait and snapshot – wait 3+ seconds, then read the page to see the bot's response
3. Interact – click inline keyboard buttons, reply to prompts, or follow the flow
4. Verify the UI – does the response match what the code says it should produce?
5. Quick data check (if a data store exists) – verify the most critical data change happened
6. Check logs – look for errors during this interaction
7. Screenshot – save visual evidence

Screenshots capture visual rendering issues that the accessibility tree cannot detect – always take both snapshots and screenshots.
### Phase 4: Data Verification

Skip this phase if the bot has no data store. Otherwise, after completing all interaction flows, do a thorough cross-reference of the UI against the stored data.

Adapt your queries to the actual data store:

- SQL databases: query the affected tables, compare row counts with the baseline, check that field values match the UI, look for referential integrity issues
- MongoDB: query the affected collections, check that document fields match the UI, look for unexpected documents
- Redis: check the relevant keys, verify values match what the UI showed, check TTLs if applicable
- File-based storage: read the data files, compare their content with the UI

Every discrepancy between the UI and stored data is a bug.
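Comparing against the Phase 2 baseline is simple arithmetic. A sketch with hypothetical numbers: `report_delta` is not part of the skill, and the counts would come from the read-only queries above.

```shell
# Sketch: compare a Phase 2 baseline count with the count after testing
# and flag an unexpected change as a discrepancy (i.e. a bug to report).
report_delta() {
  label="$1"; baseline="$2"; after="$3"; expected_delta="$4"
  actual_delta=$((after - baseline))
  if [ "$actual_delta" -eq "$expected_delta" ]; then
    echo "OK: $label changed by $actual_delta as expected"
  else
    echo "DISCREPANCY: $label expected +$expected_delta, got +$actual_delta"
    return 1
  fi
}

# Usage sketch: baseline had 42 user rows, the /start flow should add 1.
# report_delta users 42 43 1
```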
### Phase 5: Log Analysis

Check the bot logs for problems the UI might not reveal. Use the log access method you identified in Phase 2.

Look for: unhandled exceptions, data store errors, Telegram API errors (429, bad request), deprecation warnings, resource issues. Match timestamps to your testing window.
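Those patterns can be searched for in one pass. A sketch over a made-up log excerpt: `scan_log` is a hypothetical helper, and real logs would come from the Phase 2 log command (docker/pm2/journalctl/tail).

```shell
# Sketch: scan a log file for the error patterns listed above.
scan_log() {
  grep -i -E 'traceback|unhandled|exception|error|429|bad request|deprecat' "$1"
}

# Made-up log excerpt for illustration:
cat > /tmp/sample-bot.log <<'EOF'
2024-01-01 12:00:01 INFO  update received: /start
2024-01-01 12:00:02 ERROR TelegramAPIError: 429 Too Many Requests
2024-01-01 12:00:05 INFO  update received: /help
EOF

scan_log /tmp/sample-bot.log   # prints only the ERROR/429 line
```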
### Phase 6: Test Report

Respond directly in chat with a short structured report in Russian. Do NOT create a separate file. Use this format:

Вердикт: ✅ ВСЁ РАБОТАЕТ / ❌ НЕ РАБОТАЕТ / ⚠️ РАБОТАЕТ С ЗАМЕЧАНИЯМИ

Протестированные потоки:
- ✅ /command – [result]
- ❌ /command – [result]

✅ Проверка данных (omit if no data store):
- [finding]

❌ Найденные баги (omit if none):
- [Bug title] (severity) – 2-3 line description with repro steps

💡 Рекомендации (omit if none):
- [actionable item]

Only use these 4 emoji: ✅ (pass/ok), ❌ (fail/bug), 💡 (recommendation), ⚠️ (warning). No other emoji.

Keep the report concise – focus on findings from actual testing. Do NOT report infrastructure opinions, code style issues, or things unrelated to observed behavior.
## Testing Specific Functions

If testing only a specific command/feature:

- Phase 1 – focus on the relevant handlers and data store only
- Phase 2 – always verify the environment
- Phase 3 – test the flow thoroughly: happy path + 2-3 edge cases
- Phases 4 & 5 – always verify data and logs
- Phase 6 – shorter report focused on that feature
## Edge Case Patterns

Common Telegram bot bugs worth probing. Pick the ones relevant to the bot being tested.

Universal:
- Double command – send the same command twice rapidly
- Special characters – emoji, Cyrillic, RTL text, long messages, HTML entities
- Pre-start interaction – try commands before /start
- Empty state – query data/stats before any data exists
- Unexpected input – send text when a button click is expected, or vice versa

Bots with multi-step flows (quizzes, forms, FSM):
- Expired callbacks – start a flow, restart it, then click a button from the OLD flow
- Rapid button clicks – click the same button 3 times quickly
- Mid-flow restart – send /start or another command during an active flow

Group/moderation bots:
- Permission edge cases – bot commands from non-admin users
- Self-targeting – try to ban/mute the bot itself

Payment/transaction bots:
- Double payment – submit the same action twice
- Cancel mid-flow – abandon a checkout or transaction

Inline bots:
- Empty query – trigger inline mode with no text
- Long query – a very long inline query string

Note: concurrent multi-user testing is not possible with a single browser session. Recommend it for manual testing if relevant.
# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents.

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.