# Knowledge graph

Bito's **Knowledge graph** is the context engine behind [Bito's AI Architect](https://docs.bito.ai/ai-architect/overview) that captures your entire engineering system. It ingests your codebase, business context, and tribal knowledge into a unified graph, giving AI and dev tools a shared understanding of the system.

This always-updated graph powers smarter planning, grounded code generation, and faster problem-solving across a variety of tools, including:

* Issue trackers (Jira, Linear)
* Coding agents (Claude Code, Cursor, etc.)
* Code reviews (GitHub, GitLab, Bitbucket)
* Team communication (Slack)

The knowledge graph is great at answering questions like "What will be affected if we change Service A?" or "Have we seen this issue before?" without manual research.

<figure><img src="https://2860197046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYgNBTrPKG0DuVdAyDvSa%2Fuploads%2FyOMf7vhqESsd4yIkKWXF%2FIllustration-Desktop-2.webp?alt=media&#x26;token=6fd72965-70ad-48ea-8459-5758ccb38982" alt=""><figcaption><p><em>Bito's AI Architect knowledge graph indexes your codebase, business context, and tribal knowledge. AI and dev tools can query this graph to give context-aware answers.</em></p></figcaption></figure>

## Table of contents

1. [**How the knowledge graph is built**](#how-the-knowledge-graph-is-built)
2. [**Capabilities enabled by the knowledge graph**](#capabilities-enabled-by-the-knowledge-graph)
3. [**How the knowledge graph differs from other AI tools**](#how-the-knowledge-graph-differs-from-other-ai-tools)
4. [**Why the knowledge graph matters**](#why-the-knowledge-graph-matters)

## How the knowledge graph is built

The knowledge graph ingests data from multiple sources to form a complete context layer.

The following sources are fed into Bito's indexing pipeline, which scans and parses each type of data to populate the knowledge graph.

* **Code and commit history:** All source code (microservices, libraries, modules) and Git history (commits, branches). The graph records entities like classes, functions, and API endpoints, and notes code changes (e.g. refactor patterns) from commit metadata.
* **Issue trackers:** Jira or Linear tickets, epics, and bugs. Each ticket is connected to the code or service it involves, and recurring incident patterns (like frequent hotfixes) become part of the context.
* **Documentation:** Design documents, architecture decisions, wikis (e.g. Confluence). The graph links these to the relevant code components, capturing business intent and past decisions for reference.
* **Observability data:** Logs, errors, and performance metrics. For example, if a service has repeated errors or missing logs, the graph notes that instability. This means operational risk indicators (like outage-prone services) are surfaced in the graph.
  * **Note:** Observability data integration is available as a custom-built solution for your organization. We can integrate with your existing observability platform (such as DataDog, New Relic, or others) to enrich the knowledge graph with operational insights. Contact the Bito team at <support@bito.ai> to discuss your observability integration needs.
* **Team communication:** Slack messages that contain tribal knowledge, such as incident discussions, architecture debates, and undocumented context.
* **Custom instructions:** Enrich the knowledge graph with additional context that's relevant to your engineering workflow. By providing supplementary documentation, you help AI Architect develop a more complete understanding of your project.
  * [Learn more](https://docs.bito.ai/ai-architect/custom-instructions)
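As a rough illustration of what this ingestion produces (the node and edge types below are hypothetical, not Bito's actual schema), the sources above can be modeled as a property graph of typed entities connected by named relationships. Once a service, a ticket, and an ADR live in the same graph, everything known about a component is one traversal away:

```python
from dataclasses import dataclass, field

# Illustrative sketch only: node kinds, relation names, and IDs are
# made up for this example and do not reflect Bito's internal schema.

@dataclass(frozen=True)
class Node:
    id: str
    kind: str  # e.g. "service", "ticket", "adr"

@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (src_id, relation, dst_id)

    def add(self, node: Node) -> None:
        self.nodes[node.id] = node

    def link(self, src: str, relation: str, dst: str) -> None:
        self.edges.append((src, relation, dst))

    def neighbors(self, node_id: str) -> list:
        """Every node directly connected to node_id, with its relation."""
        out = []
        for src, rel, dst in self.edges:
            if src == node_id:
                out.append((rel, self.nodes[dst]))
            elif dst == node_id:
                out.append((rel, self.nodes[src]))
        return out

g = Graph()
g.add(Node("auth-service", "service"))
g.add(Node("JIRA-142", "ticket"))
g.add(Node("adr-007", "adr"))
g.link("JIRA-142", "affects", "auth-service")
g.link("adr-007", "documents", "auth-service")

# Everything the graph knows about auth-service, one hop out:
context = g.neighbors("auth-service")
```

The point of the sketch is the `neighbors` query: because tickets and design records are linked to code entities rather than stored as separate documents, a single lookup returns the ticket that affected a service alongside the ADR that documents it.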

## Capabilities enabled by the knowledge graph

#### Feasibility analysis

The knowledge graph evaluates proposed changes against your actual system state. It understands service boundaries, existing patterns, historical constraints, and known limitations. This allows it to determine what is realistically buildable, highlight risks early, and surface constraints that may not be obvious from specifications alone.

#### Impact assessment

Before any change is made, the knowledge graph maps its full system impact. It traces dependencies across services, APIs, and shared components, and combines this with historical usage and change patterns. This ensures that decisions are based on a complete view of downstream effects, not partial assumptions.
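Tracing downstream effects like this amounts to a transitive traversal over the dependency edges. A minimal sketch, with hypothetical service names and a plain caller-to-callee map standing in for the graph:

```python
from collections import deque

# Hypothetical dependency edges (caller -> callees); names are illustrative.
depends_on = {
    "checkout": ["payments", "inventory"],
    "payments": ["notifications"],
    "inventory": [],
    "notifications": [],
}

def impacted_by(changed: str) -> set:
    """Everything that transitively depends on `changed` (reverse BFS)."""
    # Invert the edges so we can walk from a callee back to its callers.
    dependents = {}
    for caller, callees in depends_on.items():
        for callee in callees:
            dependents.setdefault(callee, []).append(caller)

    seen, queue = set(), deque([changed])
    while queue:
        service = queue.popleft()
        for caller in dependents.get(service, []):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen

# Changing notifications impacts payments directly and checkout transitively.
impacted = impacted_by("notifications")
```

A real impact assessment layers more onto this traversal (API contracts, shared components, historical change frequency), but the core question is the same: which nodes are reachable, in reverse, from the thing you are about to change.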

#### Technical design

The knowledge graph grounds design decisions in how your system actually works. It reflects existing architectural patterns, prior design decisions, and service-level responsibilities. This leads to designs that are consistent with your environment and easier to implement, operate, and maintain.

#### Epic breakdown

The knowledge graph connects high-level work to real system components. It uses past implementation patterns, system dependencies, and team workflows to break down epics into actionable units. This results in tasks that are aligned with how work is actually executed in your organization.

#### Context for AI coding agents

When generating code, the knowledge graph provides system-specific context such as API contracts, service interactions, error handling patterns, and operational constraints. This ensures that generated code aligns with your architecture and integrates correctly with existing systems.

#### Code review enhancement

The knowledge graph adds system-wide context to code reviews. It highlights relevant patterns, prior issues, and cross-service implications that may not be visible within a single pull request. This helps reviewers make more informed decisions with less manual investigation.

## How the knowledge graph differs from other AI tools

#### Beyond RAG and vector search

Most AI coding tools use retrieval-augmented generation: embedding code as vectors and retrieving relevant snippets. This treats your codebase as searchable text rather than an interconnected system.

When you ask about authentication, vector search finds files containing "auth" keywords. It doesn't understand that your authentication service depends on specific Redis configuration, that an Architecture Decision Record (ADR) documents why Redis was chosen, or that a recent incident involved rate limiting.

The difference is between finding information and understanding relationships. A vector database can retrieve your authentication service code and a Jira ticket separately, but can't connect that the ticket describes an incident that led to the code change, which implemented a pattern from an ADR, and monitoring shows the pattern's effectiveness.
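That contrast can be sketched in a few lines. The documents, IDs, and relation names below are invented for illustration; the point is that keyword-style retrieval returns matches independently, while a graph can walk the stored relationships to reconstruct the causal chain:

```python
# Illustrative only: document contents and edge names are hypothetical.
documents = {
    "auth_service.py": "class AuthService: ...  # auth token validation",
    "JIRA-891": "Incident: auth rate limiting caused 500s",
    "ADR-012": "Chose Redis-backed token store for auth sessions",
}

def keyword_search(query: str) -> list:
    """Flat retrieval: each match is returned with no link to the others."""
    return [doc_id for doc_id, text in documents.items()
            if query.lower() in text.lower()]

# The graph additionally stores *why* items are connected:
edges = [
    ("JIRA-891", "led_to_change_in", "auth_service.py"),
    ("auth_service.py", "implements_pattern_from", "ADR-012"),
]

def chain_from(start: str) -> list:
    """Follow outgoing edges to reconstruct the causal chain from `start`."""
    path, node = [start], start
    while True:
        nxt = [(rel, dst) for src, rel, dst in edges if src == node]
        if not nxt:
            return path
        rel, node = nxt[0]
        path += [rel, node]
```

Searching for "auth" returns all three documents as unrelated hits, whereas walking the edges from the incident ticket yields the full story: the incident led to a code change, which implemented a pattern from the ADR.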

#### More context isn't better context

Some tools add more context to prompts — feeding more code and documentation to the AI. But retrieving fifty potentially relevant files when only three matter creates noise, not clarity.

The knowledge graph solves a different problem: identifying which information matters for specific decisions and why it matters.

#### Understanding system evolution

Traditional AI tools treat codebases as relatively static, re-indexing periodically without modeling change over time. The knowledge graph understands that your current architecture results from decisions made in specific contexts.

When your team has been migrating from monolith to microservices for a year, that's not historical trivia — it's a pattern that should inform how new features are designed.

#### Operational context

Code analysis tools build dependency graphs and trace function calls, but they don't capture operational reality. Knowing Service A calls Service B is useful. Knowing that Service B has rate limits that caused a production incident eight months ago, that the response involved implementing circuit breakers, and that a shared library now exists for this pattern — that's the context difference between adequate code and production-ready code.

## Why the knowledge graph matters

#### Scaling engineering organizations

The constraint in scaling engineering isn't hiring developers — it's enabling them to make good decisions without complete context. When critical knowledge lives in senior engineers' heads, scaling means either accepting decisions with less context (leading to rework) or creating bottlenecks.

The knowledge graph distributes accumulated knowledge. Mid-level engineers access the same contextual understanding that staff engineers bring, not through years of experience but through the graph making that context explicit and queryable.

#### Technical debt visibility

Technical debt is the gap between how your system is designed and how it needs to work. The graph makes this visible by connecting what was decided (ADRs, design docs), what was built (code), what happened (incidents, monitoring), and what's planned (Jira epics).

This visibility enables informed decisions about what technical debt costs and what remediation provides value.

#### Compliance and auditability

For organizations with regulatory requirements, the graph provides an auditable record of technical decisions. When you need to demonstrate why architectural choices were made, what alternatives were considered, and how security requirements were addressed, the graph traces these relationships explicitly.

#### AI for high-level decisions

Most AI tools excel at isolated tasks: writing functions, explaining code, suggesting refactors. The knowledge graph makes AI useful for decisions requiring system-wide understanding: feasibility analysis, architectural design, impact assessment, etc.

These are the tasks senior engineers spend most of their time on, and the ones where AI has historically provided the least value because it lacked context.

#### Organizational learning

Every decision captured in the graph becomes training data for better future decisions. Organizations accumulate knowledge over time, but it usually remains implicit until people leave and understanding walks out the door.

The knowledge graph makes accumulation explicit, preserving it in a form that informs decision-making as teams evolve.
