GitHub PR Comments Retriever
Technology Radar
- TypeScript
- VS Code Extension API
- GitHub Octokit
- Node.js FS
- PNPM
The Challenge
GitHub’s web interface is excellent for active reviews, but extracting review data for offline documentation, compliance audits, or LLM context is difficult. Comments are spread across deeply nested replies and multiple review threads, so the UI makes it nearly impossible to get a "bird's-eye view" of a PR's history in a portable format.
Fig 1. User Interaction Flow
The Solution
I built a streamlined retriever that uses the GitHub Octokit API to traverse PR data. It paginates through high-volume threads and sorts the three comment types (Review, Issue, and Description) into a structured folder system. It also captures comments from AI agents (such as CodeRabbit or Claude), which makes it useful for tracking AI-assisted development cycles.
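To make the pagination and categorization steps concrete, here is a minimal sketch rather than the extension's actual code. Octokit ships its own `paginate` helper; the loop is spelled out here against a hypothetical `fetchPage` callback so the example runs without network access. The `PrComment` shape and the names `fetchAllPages` and `groupByKind` are assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; the extension's real types are not shown in the source.
type CommentKind = "review" | "issue" | "description";

interface PrComment {
  kind: CommentKind;
  author: string;
  body: string;
  path?: string; // set only for line-level review comments
  line?: number;
}

// A page fetcher returns the items for one 1-based page number.
type PageFetcher<T> = (page: number, perPage: number) => Promise<T[]>;

// Keep requesting pages until one comes back shorter than perPage,
// which signals that the last page has been reached.
async function fetchAllPages<T>(fetchPage: PageFetcher<T>, perPage = 100): Promise<T[]> {
  const all: T[] = [];
  for (let page = 1; ; page++) {
    const items = await fetchPage(page, perPage);
    all.push(...items);
    if (items.length < perPage) break;
  }
  return all;
}

// Group a flat list of comments by kind so each kind can be written to its own folder or file.
function groupByKind(comments: PrComment[]): Record<CommentKind, PrComment[]> {
  const groups: Record<CommentKind, PrComment[]> = { review: [], issue: [], description: [] };
  for (const c of comments) groups[c.kind].push(c);
  return groups;
}
```

In a real extension, `fetchPage` would wrap the relevant GitHub REST call for each comment type. Stopping on a short page costs one extra request when the total is an exact multiple of `perPage`, but it avoids depending on response headers.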
- Comprehensive Data Extraction: Captures metadata, line-level comments, and general discussions.
- AI-Agent Support: Specifically designed to preserve context from AI code reviewers.
- Structured Local Storage: Automatically organizes data into ~/github-prs/ by project title.
- Smart Authentication: Supports Personal Access Tokens (PAT) with secure VS Code settings storage.
- Pagination Logic: Robustly fetches data for massive PRs without missing threads.
- Portable Formats: Exports to standard Markdown (.md) and JSON for easy sharing or processing.
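The storage and export features listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the extension's implementation: the source only says that data lands under ~/github-prs/ by project title, so the `pr-<number>` subfolder, the slug rules, the Markdown layout, and the names `toFolderName`, `prOutputDir`, and `renderMarkdown` are all hypothetical.

```typescript
// Hypothetical comment shape for illustration only.
interface ExportedComment {
  author: string;
  body: string;
  path?: string; // file path for line-level comments
  line?: number;
}

// Turn a project title into a folder-safe name (slug rules are assumed).
function toFolderName(title: string): string {
  const slug = title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of other characters into one dash
    .replace(/^-+|-+$/g, ""); // drop leading and trailing dashes
  return slug || "untitled";
}

// Build an output directory under a base such as "~/github-prs".
// The per-PR subfolder is an assumption, not something the source specifies.
function prOutputDir(baseDir: string, projectTitle: string, prNumber: number): string {
  return [baseDir.replace(/\/+$/, ""), toFolderName(projectTitle), `pr-${prNumber}`].join("/");
}

// Render comments as Markdown: one heading per comment, with file and line when present.
function renderMarkdown(prTitle: string, comments: ExportedComment[]): string {
  const lines: string[] = [`# ${prTitle}`, ""];
  for (const c of comments) {
    const where = c.path ? ` (${c.path}${c.line !== undefined ? `:${c.line}` : ""})` : "";
    lines.push(`## ${c.author}${where}`, "", c.body, "");
  }
  return lines.join("\n");
}
```

The JSON export needs no custom rendering: `JSON.stringify(comments, null, 2)` on the same array yields a readable file that scripts or LLM pipelines can parse directly.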