Ingest GitHub pull requests, issues, and commits into CortexDB.
GitHub Connector
Captures pull requests, issues, comments, and commits from your GitHub org / repos as CortexDB experiences. Webhooks provide low-latency ingest; cursor-based polling is the fallback.
Two ways to run this connector:
- Run it yourself (Free + paid): pip install 'cortexdb-connectors[github]', then cortexdb-sync sync github. The connector reads your token from ~/.cortexdb/state.json (written by cortexdb init) or from env vars.
- Managed sync (Starter and up): CortexDB runs the connector as a worker on its infrastructure. Configure it once from your dashboard; CortexDB handles webhook registration, retries, backfill, and idempotency.
1. Prepare credentials in GitHub
- Settings → Developer settings → GitHub Apps → New GitHub App.
- Permissions:
  - Repository: Contents (read), Issues (read), Pull requests (read), Metadata (read)
  - Account: (none)
- Subscribe to events: issues, issue_comment, pull_request, pull_request_review, pull_request_review_comment, push.
- Install the App on the target org / repos.
- Note the App ID and the Installation ID, then generate a private key (.pem).
2. Configure in your CortexDB dashboard
- Sign in at cortexdb.ai/login.
- Settings → Connectors → Add Connector → GitHub.
- Paste the App ID, Installation ID, and private key (PEM contents).
- Set the scope template. Default: org:<your-org>/source:github/repo:{repo}. Placeholders: {owner}, {repo}, {author}.
- Pick sync mode: Real-time (webhooks; the dashboard provisions the delivery URL and validates deliveries with the webhook secret automatically) or Polling.
- Click Start sync.
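To make the scope template step concrete, here is a minimal sketch of how the {owner}, {repo}, and {author} placeholders could expand for one event. It assumes plain Python str.format substitution, which may not match how the connector actually renders templates; render_scope, the org name acme, and the sample values are hypothetical.

```python
def render_scope(template: str, *, owner: str, repo: str, author: str) -> str:
    """Fill the {owner}, {repo}, and {author} placeholders in a scope template.

    Assumption: str.format-style substitution; the real connector may differ.
    """
    return template.format(owner=owner, repo=repo, author=author)


# Default template from the dashboard, with <your-org> replaced by a
# hypothetical org name.
template = "org:acme/source:github/repo:{repo}"
scope = render_scope(template, owner="acme", repo="billing-api", author="octocat")
print(scope)  # org:acme/source:github/repo:billing-api
```

Placeholders that the template does not use (here {owner} and {author}) are simply ignored, so the same call works for any template built from the three documented names.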
What gets written
Each GitHub event becomes one experience:
| Event | Modality | Content kind | Notes |
|---|---|---|---|
| Pull request opened/edited | document | text | Title + body; preceded_by points to the PR's prior version |
| Pull request review | feedback | message | Reviewer's body |
| Issue opened/edited | document | text | Title + body |
| Comment (issue / PR) | conversation | message | role=user; context.preceded_by links to the parent |
| Push commit | observation | text | Commit message + author |
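The table above can be read as a lookup from event to (modality, content kind). The sketch below encodes it in Python, keyed by the webhook event names from the subscription list. Pairing those names with the table rows is an assumption, in particular treating pull_request_review_comment as a "Comment (issue / PR)" row; EVENT_MAP and classify are hypothetical names, not part of the connector.

```python
from typing import Optional, Tuple

# (modality, content kind) per GitHub webhook event name.
# Assumption: the event names map onto the table rows as shown here.
EVENT_MAP = {
    "pull_request": ("document", "text"),
    "pull_request_review": ("feedback", "message"),
    "issues": ("document", "text"),
    "issue_comment": ("conversation", "message"),
    "pull_request_review_comment": ("conversation", "message"),  # assumed
    "push": ("observation", "text"),
}


def classify(event_name: str) -> Optional[Tuple[str, str]]:
    """Return (modality, content kind) for an event, or None if unmapped."""
    return EVENT_MAP.get(event_name)


print(classify("push"))           # ('observation', 'text')
print(classify("issue_comment"))  # ('conversation', 'message')
```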
idempotency_key shape: gh:<event_type>:<owner>/<repo>/<id>. Because the key is derived from the event itself, retries and backfills of the same event write it only once.
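As an illustration of why this key shape makes replays harmless, the following sketch builds a key in the documented format and deduplicates on it with an in-memory set. The function names, the sample values, and the set-based store are hypothetical; which GitHub field fills <id> is not specified here, so "42" is only a stand-in.

```python
from typing import Set


def idempotency_key(event_type: str, owner: str, repo: str, event_id: str) -> str:
    """Build a key in the documented gh:<event_type>:<owner>/<repo>/<id> shape."""
    return f"gh:{event_type}:{owner}/{repo}/{event_id}"


# Hypothetical stand-in for server-side deduplication.
seen: Set[str] = set()


def write_once(key: str) -> bool:
    """Return True when the key is new (a write happens), False on a replay."""
    if key in seen:
        return False
    seen.add(key)
    return True


key = idempotency_key("pull_request", "acme", "billing-api", "42")
first = write_once(key)   # original delivery
retry = write_once(key)   # retry or backfill of the same event
print(key, first, retry)  # gh:pull_request:acme/billing-api/42 True False
```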
Permissions
CortexDB mints a service actor with scope.write on your target scope. Managed for you.
Run it yourself
If you'd rather host the connector yourself instead of using CortexDB's managed worker, every connector ships in the cortexdb-connectors PyPI package:
pip install 'cortexdb-connectors[github]'
# Step 1: get a CortexDB token + actor (one-time, free tier)
pip install cortexdb-cli
cortexdb init
# Step 2: supply the connector's third-party credentials
export GITHUB_TOKEN=... # required
# Step 3: one-shot sync, or `watch` for a poll loop
cortexdb-sync sync github
cortexdb-sync watch github --interval 60
cortexdb-sync reads your CortexDB token + actor from ~/.cortexdb/state.json automatically. Cursor state is persisted in ~/.cortexdb/connectors-state.json, so re-running picks up where the last cycle left off.
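The resume behavior described above can be pictured with a small sketch of cursor persistence. The layout of connectors-state.json is not documented here, so the one-cursor-per-connector JSON object, the function names, and the timestamp cursor are all assumptions for illustration; the demo writes to a temporary directory rather than ~/.cortexdb.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def load_cursor(state_file: Path, connector: str) -> Optional[str]:
    """Return the saved cursor for a connector, or None on a first run."""
    if not state_file.exists():
        return None
    return json.loads(state_file.read_text()).get(connector)


def save_cursor(state_file: Path, connector: str, cursor: str) -> None:
    """Persist the cursor, replacing the file atomically."""
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    state[connector] = cursor
    tmp = state_file.with_suffix(".tmp")
    tmp.write_text(json.dumps(state))
    os.replace(tmp, state_file)  # atomic rename on the same filesystem


# Hypothetical file layout, exercised in a temp dir.
state_file = Path(tempfile.mkdtemp()) / "connectors-state.json"
before = load_cursor(state_file, "github")            # None: nothing synced yet
save_cursor(state_file, "github", "2024-01-01T00:00:00Z")
after = load_cursor(state_file, "github")             # next cycle resumes here
print(before, after)  # None 2024-01-01T00:00:00Z
```

Writing to a temporary file and renaming it means an interrupted sync cannot leave a half-written state file behind.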
See also
- GitLab Connector — same model
- POST /v1/experience