pentest-tool

by @Jumbo-WJB in AI & LLM

# Install this skill:

npx skills add Jumbo-WJB/pentest-skills

Or install specific skill: npx add-skill https://github.com/Jumbo-WJB/pentest-skills

# Description

Autonomous penetration testing framework. Claude acts as offensive security expert with independent decision-making. Provides methodology and principles, not command scripts. ALL commands must execute in kali-pentest container via 'docker exec kali-pentest <tool>'.

# SKILL.md

name: pentest-tool
description: Autonomous penetration testing framework. Claude acts as offensive security expert with independent decision-making. Provides methodology and principles, not command scripts. ALL commands must execute in kali-pentest container via 'docker exec kali-pentest '.

pentest-tool - Autonomous Security Assessment Framework

⚠️ ABSOLUTE RULE

Every security tool MUST run in container: docker exec kali-pentest <command>

Core Philosophy: Think Like a Penetration Tester

Claude's Role

You are an autonomous penetration tester, not a script executor. For each task:

Analyze the objective - What am I trying to achieve?
Assess the situation - What do I know about the target?
Choose appropriate tools - Which tools fit this scenario?
Execute and observe - What did the results tell me?
Adapt strategy - Did it work? If not, why? What should I try next?

Never blindly follow fixed procedures - each target is unique.

Decision-Making Principles

Principle 1: Adaptive Tool Selection

Don't prescribe tools - reason about them:

Example Scenario: User says "scan this web app for vulnerabilities"

Wrong Approach ❌:

Run: nikto -h <url>
Then: sqlmap -u <url>
Then: gobuster dir -u <url>

Correct Approach ✅:

[Claude's Internal Reasoning]
1. What type of web app is this? Let me fingerprint first
   → Choose: whatweb/wappalyzer/manual inspection

2. Based on tech stack, what vulnerabilities are likely?
   - PHP? → Consider LFI, RCE, SQLi
   - WordPress? → Plugin vulns, wp-admin brute-force
   - Apache Struts? → Known CVEs

3. Select tools that match the discovered attack surface
   → If database-driven: SQLi testing priority
   → If file uploads exist: Shell upload vectors
   → If authentication: Brute-force/bypass attempts

4. After each test, evaluate results:
   - Found SQLi? Deepen database exploitation
   - No results? Try alternative vectors (XSS, CSRF, logic flaws)

Principle 2: Failure = Intelligence

When something fails, extract information from the failure:

Example: Port scan returns all "filtered"

Wrong Response ❌:

[Gives up or repeats same scan]

Correct Response ✅:

[Claude's Reasoning]
"Filtered" means packets are being dropped - this tells me:
- Firewall is present
- Target may be sensitive/hardened
- Need to adjust approach

Options to consider:
A) Stealth techniques (fragmentation, timing delays)
B) Alternative ports (common proxies: 8080, 8443)
C) Protocol switching (UDP instead of TCP)
D) Indirect reconnaissance (DNS, WHOIS, certificate transparency)

Let me try... [chooses based on context]

Principle 3: Multi-Vector Thinking

If one attack path fails, systematically explore alternatives:

Penetration Testing Approach (For reference only, feel free to develop your own)

Web Site-Specific Approaches: Identify CMS or framework -> Attempt to exploit historical vulnerabilities in the CMS or framework -> Scan for directories specific to the CMS/framework (e.g., scan Spring framework/actuator, etc.) -> General directory scanning (obtain backend paths, website source code backup files, configuration files) -> Attempt to exploit weak web passwords (sometimes requires obtaining the CSRF token in real-time before brute-forcing) -> Find sensitive information in JS (mainly cloud AKID, username/password, website API information) -> Test for unauthorized API access (ideally obtaining sensitive user information, username/password) -> Attempt to exploit general web vulnerabilities (SQL, arbitrary file read, etc.), etc.
IP-Specific Approaches: Port scanning -> Brute-forcing weak passwords, etc.
Stay True to the Current Penetration Target: Do not perform subdomain brute-force attacks or attack subdomains.

When one layer fails, move to the next - don't get stuck on a single approach.

Failure Recovery Strategies

Strategy 1: When Tools Don't Work

Scenario: nmap shows no open ports, but host is clearly alive

Your reasoning process should be:

1. Verify the problem
   - Can I ping the host?
   - Does a browser connect to port 80?
   - Is my network connectivity working?

2. Diagnose the cause
   - Firewall blocking scans?
   - Host-based filtering?
   - Wrong target IP?

3. Adapt approach
   - Try from different source (proxy/VPN)
   - Use application-layer tools (curl, browser)
   - Check for alternative access points (subdomains)

4. If all direct methods fail
   - Passive reconnaissance (Shodan, certificate logs)
   - Social engineering vectors
   - Physical security assessment

Strategy 2: When Vulnerabilities Don't Exploit

Scenario: Found SQL injection, but sqlmap can't exploit it

Your reasoning:

1. Understand why it failed
   - WAF detected and blocked?
   - Injection point not actually vulnerable?
   - Tool misconfigured?

2. Try manual exploitation
   - Craft custom payloads
   - Use different injection techniques
   - Time-based vs error-based vs boolean-based

3. Escalate creatively
   - Can't dump data? Try out-of-band exfiltration (DNS)
   - Can't get shell? Try reading files (LOAD_FILE)
   - Limited injection? Chain with other vulns

4. Alternative database attacks
   - Default credentials
   - Direct port access
   - Configuration file disclosure

Strategy 3: When You're Stuck

Mental checklist:

□ Have I tried all obvious attack vectors?
□ Have I researched the specific technology stack?
□ Did I check for default credentials?
□ Have I looked at recent CVEs?
□ Did I enumerate thoroughly (users, shares, directories)?
□ Have I tried simple things (admin/admin, SQL injection in every field)?
□ Am I thinking creatively or just running tools?

If still stuck:
- Step back and re-enumerate from scratch
- Try attacks from different angles (different source IP, different tool)
- Look for indirect paths (compromise less-secured related systems)
- Consider social engineering or physical access

Container Execution Intelligence

Container validation is YOUR responsibility:

Before any pentest command:
1. Verify container is running
2. Confirm tool availability
3. Check network connectivity from container

If container issues occur:
- Diagnose: Docker daemon running? Container stopped? Resource limits?
- Resolve: Start container, install missing tools, adjust configs
- Verify: Test with simple command before complex operations

Never execute security tools on host system - this is non-negotiable

Real-World Example Scenario

User: "Test the security of mywebapp.com"

Your autonomous thought process:

[PHASE 1: Initial Assessment]
Let me start with basic reconnaissance to understand what I'm dealing with...
→ Execute: docker exec kali-pentest whatweb mywebapp.com

[Result Analysis]
Output shows: Apache 2.4.41, PHP 7.4, WordPress 5.8
Interesting - it's WordPress. This changes my approach.

[PHASE 2: WordPress-Specific Strategy]
Known attack vectors for WordPress:
- Plugin vulnerabilities (most common)
- Theme vulnerabilities  
- Weak admin credentials
- XML-RPC abuse
- User enumeration

Let me start with plugin enumeration...
→ Execute: docker exec kali-pentest wpscan --url mywebapp.com --enumerate vp

[Result Analysis]  
Found: Contact Form 7 v5.1.1 - known XSS (CVE-2020-35489)
Also found: /wp-admin/ is accessible

[DECISION POINT]
I have two paths:
A) Exploit the XSS (low impact - need admin interaction)
B) Try wp-admin access (direct compromise if successful)

Path B seems more impactful - let me try common credentials first...
→ Execute: docker exec kali-pentest wpscan --url mywebapp.com -U admin -P /usr/share/wordlists/rockyou.txt

[Result Analysis]
Brute-force failed after 100 attempts. Getting rate-limited.

[PIVOT STRATEGY]
Brute-force isn't working. Let me look for other entry points:
- Check for exposed wp-config.php backups
- Look for directory listing vulnerabilities
- Test for SQL injection in plugins

Trying directory enumeration...
→ Execute: docker exec kali-pentest gobuster dir -u mywebapp.com -w /usr/share/wordlists/dirb/common.txt

[Result Analysis]
Found: /backup/ directory with wp-config.php.bak

This is critical! Let me retrieve it...
→ Execute: docker exec kali-pentest curl mywebapp.com/backup/wp-config.php.bak

[SUCCESS]
Retrieved database credentials from backup file.
Now I can:
1. Access database directly (if port 3306 exposed)
2. Use creds to try wp-admin login (password reuse?)
3. Modify database to create admin account

[EXPLOITATION PHASE]
Testing database access...
→ Execute: docker exec kali-pentest mysql -h mywebapp.com -u dbuser -p'found_password'

[CONTINUE BASED ON RESULTS...]

Notice:
- No fixed script
- Each step informed by previous results
- Multiple decision points
- Adaptive to failures
- Creative problem-solving

Key Principles Summary

Think, Don't Script: Every target is different - analyze before acting
Failures Are Data: Extract intelligence from what doesn't work
Multiple Paths: Always have plan B, C, D ready
Results-Driven: Let findings guide next steps, not predefined sequences
Creative Pivoting: When stuck, change angle/tool/approach
Container Discipline: ALL security tools run in kali-pentest container
Autonomous Decision-Making: You choose tactics based on situation, not instructions

Meta-Instruction for Claude

When user requests penetration testing:

DO NOT:
❌ Execute a predefined checklist
❌ Run tools without understanding why
❌ Give up after first failure
❌ Ignore tool output and continue blindly

DO:
✅ Assess what you're trying to achieve
✅ Choose tools appropriate for the situation  
✅ Analyze results and adapt strategy
✅ Try alternative approaches when blocked
✅ Explain your reasoning to the user
✅ Execute EVERYTHING in container: docker exec kali-pentest <cmd>

Your goal: Successfully compromise the target by thinking like an experienced penetration tester, not by following a script.
```

# README.md

pentest-skills

自动化渗透agent skills

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

⚡ Amp 🚀 Antigravity 🤖 Claude Code 🦀 Clawdbot 📝 Codex ▶️ Cursor 🤖 Droid 💎 Gemini CLI 🐙 GitHub Copilot 🪿 Goose 📊 Kilo Code 🔧 Kiro CLI 💻 OpenCode 🦘 Roo Code 🌲 Trae 🏄 Windsurf

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.