Documentation Discovery and Analysis (Docs Seeker)

Overview

Intelligently discover and analyze technical documentation through multiple strategies: llms.txt lookup (prioritizing context7.com), GitHub repository analysis via Repomix, parallel exploration with multiple agents, and fallback research when other methods are unavailable.

Core Workflow

Phase 1: Initial Discovery

Identify Target
- Extract library/framework name from user request
- Note version requirements (default: latest)
- Clarify scope if ambiguous
- Identify whether target is a GitHub repository or website

Search for llms.txt (Prefer context7.com)

First: Try context7.com pattern

For GitHub repositories:

Pattern: https://context7.com/{org}/{repo}/llms.txt
Examples:
- https://github.com/imagick/imagick -> https://context7.com/imagick/imagick/llms.txt
- https://github.com/vercel/next.js -> https://context7.com/vercel/next.js/llms.txt
- https://github.com/better-auth/better-auth -> https://context7.com/better-auth/better-auth/llms.txt

For websites:

Pattern: https://context7.com/websites/{normalized-domain-path}/llms.txt
Examples:
- https://docs.imgix.com/ -> https://context7.com/websites/imgix/llms.txt
- https://docs.byteplus.com/en/docs/ModelArk/ -> https://context7.com/websites/byteplus_en_modelark/llms.txt
- https://docs.haystack.deepset.ai/docs -> https://context7.com/websites/haystack_deepset_ai/llms.txt
- https://ffmpeg.org/doxygen/8.0/ -> https://context7.com/websites/ffmpeg_doxygen_8_0/llms.txt

Topic-specific search (when user asks about a specific feature):

Pattern: https://context7.com/{path}/llms.txt?topic={query}
Examples:
- https://context7.com/shadcn-ui/ui/llms.txt?topic=date
- https://context7.com/shadcn-ui/ui/llms.txt?topic=button
- https://context7.com/vercel/next.js/llms.txt?topic=cache
- https://context7.com/websites/ffmpeg_doxygen_8_0/llms.txt?topic=compress

Fallback: Traditional llms.txt search

WebSearch: "[library-name] llms.txt site:[docs-domain]"

Common patterns:

https://docs.[library].com/llms.txt
https://[library].dev/llms.txt
https://[library].io/llms.txt

If found, proceed to Phase 2; if not found, proceed to Phase 3.

Phase 2: llms.txt Processing

Single URL:

Use WebFetch to retrieve content
Extract and present information

Multiple URLs (3+):

Critical: Launch multiple Explorer agents in parallel
Each agent handles a major documentation section (up to 5 in the first batch)
Each agent reads its assigned URLs
Consolidate findings into a comprehensive report

Example:

Launch 3 Explorer agents simultaneously:
- Agent 1: getting-started.md, installation.md
- Agent 2: api-reference.md, core-concepts.md
- Agent 3: examples.md, best-practices.md

Phase 3: Repository Analysis

When llms.txt is not found:

Find the GitHub repository via WebSearch

Use Repomix to package the repository:

npm install -g repomix  # Install if needed
git clone [repo-url] /tmp/docs-analysis
cd /tmp/docs-analysis
repomix --output repomix-output.xml

Read repomix-output.xml and extract documentation

Repomix advantages:

Entire repository packaged into a single AI-friendly file
Preserves directory structure
Optimized for AI consumption

Phase 4: Fallback Research

When no GitHub repository exists:

Launch multiple Researcher agents in parallel
Focus areas: official documentation, tutorials, API references, community guides
Consolidate findings into a comprehensive report

Agent Assignment Guidelines

1-3 URLs: Single Explorer agent
4-10 URLs: 3-5 Explorer agents (2-3 URLs each)
11+ URLs: 5-7 Explorer agents (prioritize most relevant)

Version Handling

Latest (default):

Search without version identifiers
Use current documentation paths

Specific version:

Include version in search: [library] v[version] llms.txt
Check versioned paths: /v[version]/llms.txt
For repositories: check out specific tags/branches

Output Format

# [Library] [Version] Documentation

## Source
- Method: [llms.txt / Repository / Research]
- URLs: [list of sources]
- Date Accessed: [current date]

## Key Information
[Extracted relevant information organized by topic]

## Additional Resources
[Related links, examples, references]

## Notes
[Any limitations, missing information, or caveats]

Quick Reference

Tool selection:

WebSearch -> Find llms.txt URLs, GitHub repositories
WebFetch -> Read single documentation pages
Task (Explore) -> Multiple URLs, parallel exploration
Task (Researcher) -> Scattered documentation, diverse sources
Repomix -> Full codebase analysis

Popular llms.txt addresses (try context7.com first):

Astro: https://context7.com/withastro/astro/llms.txt
Next.js: https://context7.com/vercel/next.js/llms.txt
Remix: https://context7.com/remix-run/remix/llms.txt
shadcn/ui: https://context7.com/shadcn-ui/ui/llms.txt
Better Auth: https://context7.com/better-auth/better-auth/llms.txt

Fallback to official sites if context7.com is unavailable:

Astro: https://docs.astro.build/llms.txt
Next.js: https://nextjs.org/llms.txt
Remix: https://remix.run/llms.txt
SvelteKit: https://kit.svelte.dev/llms.txt

Error Handling

llms.txt inaccessible -> Try alternative domains -> Repository analysis
Repository not found -> Search official website -> Use Researcher agents
Repomix fails -> Try /docs directory only -> Manual exploration
Multiple conflicting sources -> Prioritize official sources -> Note versions

Key Principles

Prefer context7.com for llms.txt -- most comprehensive and up-to-date aggregator
Use topic parameter when applicable -- targeted search via ?topic=...
Aggressively use parallel agents -- faster results, better coverage
Fall back to official sources for validation -- when context7.com is unavailable
Report methodology -- inform user which method was used
Handle versions explicitly -- do not assume latest

Detailed Documentation

Complete guides, examples, and best practices:

Workflows:

WORKFLOWS.md -- Detailed workflow examples and strategies

Reference Guides:

Tool Selection -- Complete guide to choosing and using tools
Documentation Sources -- Common sources and patterns across ecosystems
Error Handling -- Troubleshooting and resolution strategies
Best Practices -- 8 essential principles for effective discovery
Performance -- Optimization tips and benchmarks
Limitations -- Boundaries and success criteria

Documentation Discovery and Analysis (Docs Seeker)

Documentation Discovery and Analysis (Docs Seeker)

Overview

Core Workflow

Phase 1: Initial Discovery

Phase 2: llms.txt Processing

Phase 3: Repository Analysis

Phase 4: Fallback Research

Agent Assignment Guidelines

Version Handling

Output Format

Quick Reference

Error Handling

Key Principles

Detailed Documentation

相关技能 Related Skills

Repomix

DuckDuckGo Web Search

Mermaid.js v11