Part 2: EDT – Deploying on Antigravity

Chapter 2 Overview

Chapter 2 moves from theory to practice. It introduces Antigravity – Google's AI IDE that serves as the operational engine for an Enterprise Digital Twin – then walks through system architecture, agent configuration, the KWSR capability evolution model, and file processing capabilities.

2.1 Antigravity – The Processing Plant for Your Digital Twin

2.1.1 From the ChatGPT Wave to the Data Management Problem

Late 2022: ChatGPT launched and created an unprecedented wave of adoption inside organizations. For the first time, AI became accessible enough that anyone could use it – no programming knowledge, no expensive license. Just open a browser, sign up, and start asking questions in plain English.

Accounting staff used it to write more professional emails. Marketing teams used it to brainstorm campaign ideas. Project managers used it to condense long meeting transcripts into key takeaways. Everyone talked about doubling their productivity. "Prompt engineering" courses multiplied overnight. The question shifted from "Should we use AI?" to "How do we maximize its effectiveness?"

But after the initial wave of excitement, a different reality emerged – not from newcomers trying it out, but from the people who had been using AI every single day.

Chatbots only live inside the chat window. Imagine you're a senior accountant who needs to consolidate expense reports from 12 regional offices every week. Each office sends a separate Excel file. To use ChatGPT to analyze them, you have to open the file, select the table, copy it, open ChatGPT, paste it, ask your question, read the result – then repeat for the remaining 11 files. For a short file, this is tolerable. For dozens or hundreds of files? You spend most of your time copy-pasting instead of doing actual work.

Results only exist as text inside the chat window. AI analyzes your data, surfaces insights, writes a complete report. But the result sits in the chat window. To save it as a Word document you have to copy it, open Word, paste it, reformat. To get an Excel file you have to create a new workbook, paste the data, adjust columns. AI does 80% of the hard work, but humans still have to do 20% of the easy work – every single time. This paradox drove many people back to their old methods because "using AI adds extra steps."

Every conversation starts from zero. Today you spend 15 minutes explaining your company's reporting process. AI understands and replies perfectly. Open a new chat tomorrow? You explain it all over again. A colleague needs AI support for the same task? They have to train AI from scratch too. Nothing accumulates. Nothing is shared. Everyone has to repeat everyone else's effort.

By 2025, a meaningful step forward arrived: AI Agents – allowing AI to not just answer questions but take specific actions. ChatGPT launched Canvas. Gemini integrated deeply into Google Workspace. Claude introduced Artifacts. Users could now ask AI to directly create Word documents, Excel spreadsheets, and PowerPoint presentations.

Yet humans still had to click "Download" themselves, choose the folder, name the file, organize the data. AI had stepped out of the chat window – but only halfway. It could create files, but couldn't organize them into the right place in the system. It still stood outside the actual workflow.

The core problem isn't AI's thinking capacity. It's data management: collecting inputs, organizing output files, maintaining folder structure, ensuring consistency, sharing with the team, reusing processes. That's the real bottleneck in enterprise AI adoption.

2.1.2 The Rise of AI IDEs

While businesses were struggling with chatbot limitations, another group of users had already found a solution for their own work: software developers.

AI IDEs (Integrated Development Environments) emerged from a simple but powerful idea: integrate AI directly into the working environment. When AI lives inside the code editor, it can read files, understand project context, and operate directly on the file system – no copy-paste middleman.

Year	Tool	Capability	Milestone
2021	GitHub Copilot	Intelligent line-completion	Smart autocomplete
2023	Cursor	Understands entire codebase, edits multiple files	From autocomplete to context-aware
2024	Windsurf	Analyzes build errors, suggests fixes	Deep repository understanding
2025	Claude Code	CLI, multi-agent, parallel work	Multi-agent collaboration
2025	Antigravity	Manages the full data lifecycle	From developer tool to everyone's tool

In early 2025, Cursor made "Agent Mode" its default, letting AI decide which files to edit, which new files to create, and even run tests automatically to verify results. Later in 2025, OpenAI negotiated to acquire Windsurf for $3 billion, but the deal collapsed. Google then hired Windsurf's founders into DeepMind as part of a $2.4 billion licensing deal, signaling the strategic importance of the AI IDE space.

However, all of these tools share one important limitation: they were designed for developers, to build software. Their interfaces are optimized for writing code. Their features revolve around reading/writing code files and running build commands. Their terminology is developer terminology.

For non-technical business users – accountants working in Excel, marketers working in PowerPoint, sales teams working in CRM – these tools remain out of reach. The question was: when would these capabilities arrive for users who don't know how to code?

2.1.3 Antigravity – Managing the Full Data Lifecycle

Antigravity, announced alongside Gemini 3 in November 2025, looks at first glance like Cursor or Windsurf – an AI IDE with file read/write and terminal capabilities. Similar interface: editor in the center, file tree on the left, terminal at the bottom.

But Antigravity represents a qualitatively different step forward – not just further along the same road, but in a different direction.

Cursor, Windsurf, and Claude Code solve one problem well: helping developers write and fix code faster. They are processing tools – code in, fixed code out. Antigravity solves a broader problem: the full data lifecycle. Not just processing existing data, but collecting data from multiple sources, packaging results into complete deliverables, storing them in the right location in the system, distributing them to the team, and defining processes to be reused.

Six core capabilities that distinguish Antigravity:

Integrated browser for data collection. Antigravity includes a built-in Chrome browser that Cursor, Windsurf, and Claude Code lack. The Agent can open websites, navigate to needed pages, collect information, capture screenshots as evidence, and even record video of the process. This means the Agent doesn't just work with files already on the machine – it can go out and collect data from any source accessible through a browser.

Large-scale processing and packaging. The Agent works through whole folders of files rather than one pasted table at a time. 50 expense reports, 100 contracts, 200 email threads – the Agent reads them all within minutes. Results aren't text in a chat window; they're complete, ready-to-use output files.

Automatic storage to the right location. The Agent doesn't just create files – it saves them to the correct location in the system. You specify the folder and naming conventions; the Agent executes precisely without you manually navigating folders, naming files, or moving things around.

Distribution via cloud sync. If the folder is synced with Google Drive or OneDrive, files automatically appear in the cloud. Teammates access them immediately without email attachments. Other AI systems in the Google ecosystem can also use this data.

Reuse via the KWSR system. Instead of re-explaining everything each time, define it once and reuse indefinitely through four mechanisms: Knowledge (stores accumulated expertise), Workflow (defines action sequences), Rule (ensures consistency), Skill (packages specialized capabilities). KWSR is covered in depth in Chapter 3.

Oversight via Artifacts. The Agent creates evidence throughout its work: a plan for you to review before action, screenshots of key steps, video of the process, detailed logs of every operation. Artifacts let humans understand exactly what AI is doing with their data – a critical requirement in enterprise environments where accountability is mandatory.
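The "automatic storage" capability above depends on you stating a folder and a naming convention once. As a minimal sketch of what such a convention amounts to, the function below builds a destination path from a few fields. The function name, the folder name, and the date_client_type pattern are all hypothetical choices for illustration, not something Antigravity prescribes.

```python
from datetime import date
from pathlib import Path


def output_path(root: Path, folder: str, client: str, doc_type: str,
                day: date, ext: str = "xlsx") -> Path:
    """Build <root>/<folder>/<YYYY-MM-DD>_<client>_<doc_type>.<ext>."""
    # Replace spaces so the file name stays easy to handle in scripts and URLs.
    safe_client = client.strip().replace(" ", "-")
    name = f"{day.isoformat()}_{safe_client}_{doc_type}.{ext}"
    return root / folder / name


p = output_path(Path("project"), "Reports", "Client A",
                "expense-summary", date(2026, 1, 31))
print(p.as_posix())  # project/Reports/2026-01-31_Client-A_expense-summary.xlsx
```

In practice you would describe the same convention to the Agent in plain language; the code only makes the rule explicit enough to check.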

2.1.4 Summary – Antigravity in the Digital Twin Landscape

Looking back at three years of AI development, from chatbots in 2022 to AI Agents in 2025, the evolution in solving the data management problem is clear:

Capability	Chatbot 2022	AI Agent 2025	Antigravity
Data collection	Manual copy-paste	Local files only	Local files + web browser
Data processing	Inside the chat	On files	On files, unlimited volume
Packaging results	Text in the chat	Creates files	Files + evidence Artifacts
Automatic storage	No	No	Saves to specified location
Distribution	Manual	Limited	Local + cloud sync
Reusability	No	Limited	Knowledge, Workflow, Rule, Skill
Oversight	No	Limited	Artifacts, detailed logs

In the context of the Digital Twin, Antigravity plays the role of the processing plant. The organized knowledge repository is the Knowledge Base. Antigravity is where the Agent reads data from the repository, processes it according to the System Prompt defined through KWSR, and returns complete deliverables saved back into the repository.

If a Digital Twin is like a factory: the knowledge repository is the raw materials warehouse, and Antigravity is the production line. Unlike a conventional production line, it can go out and source raw materials from the web via browser, and deliver finished products to exactly where they're needed.

The old question was: "How can AI help me work?" This led to using AI as a supplemental tool – open a chatbot when needed, ask and answer, then close it.

The new question to ask is: "How do I organize my data so AI can take over entire workflows?" This leads to building a Digital Twin – designing the system from the ground up with the assumption that AI will be the "employee" doing the daily work.

2.2 Antigravity System Architecture

2.2.1 Overview: Two Workspaces

When first opening Antigravity, many users feel disoriented. The interface looks like a code editor – file tree on the left, editor in the middle, terminal at the bottom. But where does data live? Where are settings saved? How does the Agent "remember" what you've taught it? How do you share configurations with a teammate?

Answering these questions requires understanding Antigravity's architecture, a design built on the recognition that users have two fundamentally different types of needs.

Need #1: Personal consistency. Some things you want the Agent to follow everywhere, regardless of which project you're working on. Your writing style preference. Always respond in English. Always ask before deleting a file. These are personal preferences you don't want to reconfigure every time you start a new project.

Need #2: Project flexibility. But other things differ between projects. A report for Client A uses one template; a report for Client B uses a different format. An internal project uses a simple format; a client-facing project requires a more professional presentation. Each project has its own "rules of the game."

Solution: Two configuration levels.

Global level stores everything belonging to you personally – applied to all projects, all workspaces. When you open a brand new project, your Global settings still take effect. This is where the Agent "remembers" who you are, how you prefer to work, and what lessons it has learned from previous conversations. Think of this as the Agent's "default personality" when working with you.

Workspace level stores everything belonging to one specific project – active only when you're working inside that project. Open the "Q1 2026 Budget" project, and that workspace's settings apply. Switch to the "Marketing Strategy 2026" project? Completely different workspace settings activate. Think of this as the "role" the Agent takes on in each project.

This two-level architecture solves a real problem that many other AI tools can't: combining personal consistency with contextual flexibility.

2.2.2 Folder Structure: The System Map

Global Level: The .gemini folder

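Everything at the Global level lives in a hidden .gemini folder in the user home directory (written ~/.gemini/ in the paths used later in this chapter). On Windows: C:\Users\[YourUsername]\.gemini\. GEMINI.md sits directly in this folder, while brain/, knowledge/, skills/, and global_workflows/ sit in its antigravity/ subfolder.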


Key components:

GEMINI.md – the most important file at the Global level. Think of this as the "constitution" for your Agent – fundamental rules it must follow in every situation. Write "Always respond in English" here, and the Agent responds in English no matter which project you're in. Write "Never delete a file without asking first" and the Agent will always request permission before deleting anything.

brain/ – where Antigravity "remembers" conversations. Each new chat creates a subfolder with a unique ID. Inside are logs of Agent actions, created Artifacts, plans, screenshots, and recorded videos. This data is auto-generated; don't edit it directly.

knowledge/ – the Agent's "long-term memory." After each conversation, a sub-agent analyzes the session and extracts useful knowledge, saved here as Knowledge Items. When you explain a company-specific process, specialized terminology, or report formatting preferences, that information doesn't disappear when you close the chat. Next time, the Agent retrieves it from knowledge/ instead of asking you again.

skills/ – global Skills: specialized capabilities you define once and use everywhere. Example: a Skill for writing reports in your company's standard format, a Skill for analyzing financial data.

global_workflows/ – global Workflows: processes you use frequently regardless of which project you're in. Example: a workflow for converting DOCX to Markdown, a workflow for packaging a project to send to a client.

scratch/ – a temporary workspace where the Agent can create drafts and run experiments without affecting actual project folders.

Workspace Level: The Project Folder

The Workspace level isn't in a fixed location – it lives inside whatever project folder you're working in. When you open Antigravity and use "Add Folder" to point it at a project directory, the Agent searches for configuration inside that folder.

.agent/ – the hidden folder containing all Agent configuration for this project:

rules/ – project rules. In contrast to Global, you can split these into multiple topic-based files: general.md for general project rules, report-format.md for client report formatting, naming-convention.md for file naming conventions. The Agent reads all files in this folder and applies them all.
workflows/ – project-specific processes: action sequences the Agent executes on request.
skills/ – project-specific Skills: specialized capabilities with meaning only in this project's context.

Recommended content structure following the IPO model from Chapter 1:

project/
├── .agent/
│   ├── rules/
│   ├── workflows/
│   └── skills/
├── 01_Inputs/       ← Raw data, materials to be processed
├── 02_Process/      ← Working area, intermediate files in progress
└── 03_Outputs/      ← Final deliverables, ready to use

This structure isn't mandatory – Antigravity works with any folder organization. But the Input-Process-Output pattern has proven its effectiveness: when anyone (or any Agent) looks at the folder, they immediately understand how data flows through the project.
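The layout above can be created by hand, but a short script makes it repeatable across projects. The following is a minimal sketch using only the folder names shown in the tree; the function name scaffold_project is an illustrative choice, and nothing here is an official Antigravity setup command.

```python
from pathlib import Path

# Folder names taken from the layout above.
AGENT_DIRS = [".agent/rules", ".agent/workflows", ".agent/skills"]
IPO_DIRS = ["01_Inputs", "02_Process", "03_Outputs"]


def scaffold_project(root: Path) -> list[Path]:
    """Create the .agent/ configuration folders and the IPO folders under root."""
    created = []
    for rel in AGENT_DIRS + IPO_DIRS:
        target = root / rel
        # exist_ok=True makes the script safe to run on an existing project.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

Calling scaffold_project(Path("Q1-2026-Budget")) creates the six folders and leaves any existing files untouched.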

2.2.3 Automatic vs. Manual: Who Does What?

Automatic components (Agent-managed):

Brain (Conversation history) is created every time you talk with the Agent. The conversation is automatically saved to ~/.gemini/antigravity/brain/ with a unique ID. Inside are detailed logs of each step the Agent took, created Artifacts, plans, screenshots, and recorded videos. You can review these if you need to revisit past work, but don't edit them directly – they follow Antigravity's internal format.

Knowledge (Distilled expertise) is extracted from conversations. After each session, a sub-agent analyzes the conversation and saves useful knowledge to ~/.gemini/antigravity/knowledge/. This happens automatically.

However, Knowledge has a special feature: you can also explicitly ask the Agent to update it. When you've just explained an important process and want to make sure the Agent remembers it, say: "Please save this process to Knowledge for future use." The Agent will synthesize the process and save it on request.

Manual components (You create and manage):

Rules, Workflows, and Skills are all created and managed by you.

Component	Global Level	Workspace Level
Rules	Written in `~/.gemini/GEMINI.md` (one file)	Multiple `.md` files in `.agent/rules/` (topic-separated)
Workflows	Created in `~/.gemini/antigravity/global_workflows/`	Created in `.agent/workflows/`
Skills	Created in `~/.gemini/antigravity/skills/`	Created in `.agent/skills/`

Why are Global Rules limited to one file while Workspace Rules can span many?

Global Rules are typically short – a few basic preferences and style guidelines. One file is enough. But Workspace Rules can be very detailed – each client may have different requirements, each report type has its own format, each data source has its own processing rules. Splitting them into topic-specific files keeps things organized:

general.md – general project rules
report-format.md – client-specific report formatting
data-processing.md – specific data handling rules
naming-convention.md – file and folder naming conventions
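To make "the Agent reads all files in this folder" concrete, here is a rough sketch of how topic-separated rule files could be gathered into one block of text. It is an illustration of the idea, not Antigravity's actual loader: the function name, the alphabetical ordering, and the "## filename" separators are assumptions introduced here.

```python
from pathlib import Path


def load_workspace_rules(project_root: Path) -> str:
    """Concatenate every .md file in .agent/rules/ into one block of rule text."""
    rules_dir = project_root / ".agent" / "rules"
    if not rules_dir.is_dir():
        return ""  # no workspace rules defined for this project
    sections = []
    # Alphabetical order is an assumption, chosen so the result is reproducible.
    for path in sorted(rules_dir.glob("*.md")):
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {path.name}\n{body}")
    return "\n\n".join(sections)
```

A helper like this is also handy for reviewing, in one place, everything a project currently tells the Agent.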

2.2.4 Priority Order: When There's a Conflict

What happens when the same rule is defined at both Global and Workspace levels?

Example: Global says "Respond in English," but Workspace says "Respond in Spanish" because the client is based in Mexico. Which does the Agent follow?

Priority order (high → low): WORKSPACE .agent/rules/ → WORKSPACE .agent/workflows/ and skills/ → GLOBAL GEMINI.md → GLOBAL workflows/ and skills/ → DEFAULT

Simple rule: Workspace always wins over Global.

When there's a conflict, the Agent prioritizes what's defined at the Workspace level. This makes complete sense: Workspace represents the specific context of the project being worked on, while Global is just a general default.

Real tested example:

Step 1: In Global GEMINI.md, write: "When asked 'Who are you?', respond 'I am GLOBAL Agent.'"
Step 2: In Workspace .agent/rules/test-rule.md, write: "When asked 'Who are you?', respond 'I am WORKSPACE Agent.'"
Result: Agent responds "I am WORKSPACE Agent." – confirming Workspace rules take priority over Global.
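The behavior observed in this test matches a simple "first layer that defines it wins" lookup. The sketch below models that logic under stated simplifications: it collapses the full priority order into three layers (workspace, global, default), and the function and key names are hypothetical. It describes the observable outcome, not how Antigravity is implemented internally.

```python
from typing import Optional

# Highest priority first, mirroring WORKSPACE -> GLOBAL -> DEFAULT.
PRIORITY = ("workspace", "global", "default")


def resolve(setting: str, layers: dict[str, dict[str, str]]) -> Optional[str]:
    """Return the value from the highest-priority layer that defines the setting."""
    for level in PRIORITY:
        value = layers.get(level, {}).get(setting)
        if value is not None:
            return value
    return None


layers = {
    "global": {"identity": "I am GLOBAL Agent.", "language": "English"},
    "workspace": {"identity": "I am WORKSPACE Agent."},
}
print(resolve("identity", layers))  # I am WORKSPACE Agent.
print(resolve("language", layers))  # English
```

The second call shows the other half of the design: a setting the workspace does not mention falls through to the Global value.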

Three practical benefits of this design:

Baseline consistency – rules in Global GEMINI.md apply everywhere as a foundation. Define personal preferences, writing style, habits, language preference once – they automatically take effect in every project.

Contextual flexibility – when a project needs different rules, just create a file in .agent/rules/ for that workspace. No need to touch Global; no effect on other projects.

No ambiguity – the priority order is clear and consistent. No situation where you wonder "what will the Agent do?" WORKSPACE → GLOBAL → DEFAULT, simple and predictable.

2.2.5 Practical Application: Where to Start?

For beginners: The first step is simpler than you'd expect.

Open Antigravity and use "Add Folder" to point it at your project directory. No need to create any configuration files. The Agent can read files in that folder immediately.
Start working. Ask the Agent to read files, analyze data, generate reports. The Agent automatically saves history to brain/ and learns knowledge into knowledge/. No configuration needed.
After a few days or weeks of use, you'll notice rules you want the Agent to always follow. At that point, create GEMINI.md in ~/.gemini/ and write those rules in. The Agent will read this file and apply it in every workspace.

That's all you need to start. The system works even without any configuration files – the defaults are sensible for most situations.
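When you reach the third step, creating GEMINI.md is just a matter of writing a text file. As a minimal sketch, the function below writes a starter file using two example rules quoted earlier in this chapter. The function name is hypothetical, and plain one-sentence rules are an assumption about what works well; the chapter does not specify a required format.

```python
from pathlib import Path

# Example rules drawn from this chapter; replace them with your own.
STARTER_RULES = """\
Always respond in English.
Never delete a file without asking first.
"""


def create_global_rules(home: Path) -> Path:
    """Write a starter GEMINI.md into <home>/.gemini/ unless one already exists."""
    target = home / ".gemini" / "GEMINI.md"
    if target.exists():
        return target  # never overwrite rules you have already written
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(STARTER_RULES, encoding="utf-8")
    return target
```

Running create_global_rules(Path.home()) targets the real Global location; passing a temporary folder instead lets you try it without touching your own setup.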

For power users: When comfortable with Antigravity and ready to optimize:

Set up Global GEMINI.md – define personal rules: writing and communication style, preferred language, work habits (always ask before deleting, always create backups), universal file naming conventions.
Create Global Workflows – for tasks you do frequently regardless of project: convert DOCX to Markdown, package a project for a client, backup important data.
Build .agent/ structure for important or long-running projects: topic-separated rule files, project-specific workflows, specialized skills.

Sharing with your team:

Sharing Workspace: Put .agent/ in version control (Git). When a colleague clones the project, they automatically have the same Rules, Workflows, and Skills. No instruction emails; no manual setup on each machine.
Sharing Skills: Copy the Skill folder from ~/.gemini/antigravity/skills/ to a colleague's machine. They put it in the same location and can use it immediately.
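Copying a Skill folder can be scripted so that nobody has to browse hidden directories. The sketch below assumes, as the text describes, that a Skill is a self-contained folder under the skills directory; the function name and the idea of exporting to a shared location are illustrative.

```python
import shutil
from pathlib import Path


def export_skill(skill_name: str, skills_dir: Path, destination: Path) -> Path:
    """Copy one Skill folder to a destination such as a shared drive."""
    source = skills_dir / skill_name
    if not source.is_dir():
        raise FileNotFoundError(f"No Skill folder named {skill_name!r} in {skills_dir}")
    target = destination / skill_name
    # dirs_exist_ok (Python 3.8+) lets a re-export refresh an earlier copy.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

A colleague can run the same function in reverse, with the shared location as skills_dir and their own ~/.gemini/antigravity/skills/ as the destination.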

2.2.6 Key Takeaways

Antigravity's two-level architecture – Global and Workspace – is built on four principles:

Simple to start. No configuration needed; the system works with sensible defaults. Use Antigravity immediately without reading about rules or workflows.

Powerful when needed. When you want detailed control, every mechanism is available: Rules for behavior, Workflows for processes, Skills for specialized capabilities, Knowledge for accumulated expertise.

Clear ownership. Data from conversations (Brain, Knowledge) is automatically managed – no need to worry about it. Configuration for working methods (Rules, Workflows, Skills) is defined by you – fully under your control.

No conflicts. Clear priority order: Workspace beats Global; Global beats Default. No ambiguity.

2.3 Agent Configuration

2.3.1 Overview: Why Configure?

Have you ever experienced this: you ask the Agent to write a report and the result is too long or too short? Or the Agent does a thorough analysis but takes minutes, when you just needed a quick answer? Or the Agent makes a series of changes you didn't have time to review?

These issues aren't because the Agent is "bad." They happen because the default configuration doesn't fit your specific needs in that specific situation.

Agent configuration in Antigravity revolves around two fundamental questions:

Question 1: Which "brain" do I use? Antigravity supports multiple AI models from multiple providers – Google, Anthropic, and open-source. Each model has its own strengths: some think fast, some think deep, some write beautifully. Choosing the right model for the right task saves time and improves quality.

Question 2: What working style? The Agent can operate in two modes: Planning Mode (plan before acting) and Fast Mode (act immediately without planning). Each mode fits different situations.

2.3.2 AI Models: Choosing the "Brain"

Why multiple models? No single model is perfect for every situation. Just like people – some calculate quickly but write average prose; some analyze deeply but need time; some tell stories beautifully but lack logical rigor. AI models are the same.

Antigravity supports seven models from three providers:

Google Gemini family: Versatile and reliable

Google built Gemini with a "one model, all tasks" philosophy. Gemini can handle text, images, code, and logical reasoning equally well. It's the safe default when you're unsure which model to choose.

Model	Characteristics	Best for
Gemini 3 Pro (High)	Deepest reasoning, highest quality	Complex analysis, long-term planning, multi-step workflows
Gemini 3 Pro (Low)	Balanced quality and cost	Batch processing, repetitive tasks
Gemini 3 Flash	Fastest speed, lowest cost	Quick Q&A, everyday coding

Gemini 3 Pro (High) is the most powerful version. As of Q1 2026, it leads the LMArena Leaderboard with Elo 1501 and achieves 91.8% on GPQA Diamond (doctoral-level reasoning). It also leads multi-step task benchmarks, scoring 54.2% on Terminal-Bench 2.0 and topping WebDev Arena with Elo 1487.

Gemini 3 Pro's unique strength: a 2 million token context window – the largest among commercial models. This lets the Agent "hold in mind" thousands of pages of documents simultaneously, essential for complex multi-step tasks.

Gemini 3 Flash has a surprise: in coding benchmarks (SWE-bench Verified), Flash scores 78% – higher than Pro (76.2%). For everyday coding tasks, Flash is simultaneously faster, cheaper, and delivers better results. A valuable finding for anyone working with code daily.

Anthropic Claude family: Natural writing style

Anthropic, the company behind Claude, is famous for its philosophy of "helpful, harmless, and honest" AI. Claude is specially trained to write naturally, coherently, and readably. Many readers can't tell Claude-written content is from an AI.

Model	Characteristics	Best for
Claude Sonnet 4.5	Most natural prose, 0% error rate in writing	Content creation, email drafting, documentation
Claude Sonnet 4.5 (Thinking)	Step-by-step reasoning with visible process	Debugging code, complex reasoning
Claude Opus 4.5 (Thinking)	80.9% SWE-bench, highest in the world	Complex programming, deep research

Claude Opus 4.5 (Thinking) set a new record of 80.9% on SWE-bench Verified, a benchmark that measures the ability to solve real-world software problems from GitHub, and was the first model to break 80%. If you encounter a complex bug that other models can't resolve, Opus is the model of last resort.

A notable feature: at medium thinking effort, Opus achieves results equivalent to Sonnet while saving 76% of tokens – meaning you can use the most powerful model without excessive cost.
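To see what a 76% saving means in practice, a quick calculation helps. The baseline of 10,000 tokens below is a made-up figure for illustration only; the 76% comes from the paragraph above.

```python
def tokens_after_saving(baseline_tokens: int, saving_percent: int = 76) -> int:
    """Tokens used after a percentage saving, computed with integer arithmetic."""
    return baseline_tokens * (100 - saving_percent) // 100


print(tokens_after_saving(10_000))  # 2400
```

In other words, a task that would cost 10,000 tokens at the baseline would cost about 2,400 at the reduced rate.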

Claude Sonnet 4.5 achieves a 0% error rate in text editing benchmarks and produces the most natural writing style among the models listed here. When your output will be read by customers, a boss, or partners, Sonnet is the natural first choice.

Extended Thinking: When "Thinking" mode is enabled, the Agent not only gives you an answer but displays its reasoning process – each step of logic, which hypotheses were considered, why a decision was made. You not only know what the Agent decided but understand why. If the reasoning is flawed somewhere, you can point it out and ask the Agent to reconsider.

Open-source family: Alternative option

GPT-OSS 120B is OpenAI's open-weight model with roughly 120 billion parameters. Despite the "OSS" label, this model runs on Google's infrastructure through Antigravity – just choose it from the dropdown; no self-deployment is required.

Estimated performance: ~65-70% on SWE-bench and 85-88% on GPQA. Suitable for standard tasks with lower requirements on writing style or absolute accuracy.

2.3.3 Benchmark Comparison (Q1 2026)

Model	Coding (SWE-bench)	Reasoning (GPQA)	Multi-step (Terminal)	Multi-modal	Context window
Claude Opus 4.5	80.9%	~92%	–	–	~200K
Claude Sonnet 4.5	70.6%	~90%	–	–	~200K
Gemini 3 Pro	76.2%	91.8%	54.2%	81.2%	2M+
Gemini 3 Flash	★78%	90.4%	–	–	2M+
GPT-OSS 120B	~65-70%	~85-88%	–	–	~128K

★ Flash scores higher than Pro in coding – a surprising result

Benchmark-based recommendations:

Multi-step workflows → Gemini 3 Pro. Only model with Terminal-Bench data (54.2%), 2M+ context to "hold in mind" the entire workflow, 81.2% multi-modal handling.
Complex coding → Claude Opus 4.5. 80.9% SWE-bench (highest in the world), Thinking mode to understand root cause.
Everyday coding → Gemini 3 Flash. Surprisingly reaches 78% SWE-bench, fast and affordable.
Content writing → Claude Sonnet 4.5. 0% error rate in writing benchmarks, most natural prose.
Strategic analysis → Claude Opus or Gemini Pro. Opus: 92% GPQA with transparent Thinking. Pro: 91.8% GPQA, high accuracy, minimal hallucination.
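The recommendations above amount to a lookup table, which some teams may want to keep in a shared script or checklist. The sketch below simply transcribes the five recommendations; the dictionary keys and the function name are illustrative, and where two models are listed they appear in the order given above, which is not a ranking.

```python
# Task-to-model mapping transcribed from the recommendations above.
RECOMMENDED_MODELS = {
    "multi-step workflows": ["Gemini 3 Pro"],
    "complex coding": ["Claude Opus 4.5"],
    "everyday coding": ["Gemini 3 Flash"],
    "content writing": ["Claude Sonnet 4.5"],
    "strategic analysis": ["Claude Opus 4.5", "Gemini 3 Pro"],
}


def recommend(task_type: str) -> list[str]:
    """Return the recommended models for a task type, in the order listed."""
    key = task_type.strip().lower()
    if key not in RECOMMENDED_MODELS:
        raise KeyError(f"Unknown task type: {task_type!r}")
    return RECOMMENDED_MODELS[key]


print(recommend("Everyday coding"))  # ['Gemini 3 Flash']
```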

2.3.4 Industry-by-Industry Recommendations

Industry	Model #1	Model #2	Primary reason
HR	Claude Sonnet	–	Natural prose
Marketing	Claude Sonnet	Gemini Flash	Quality vs. Volume
Legal	Claude Opus (T)	Gemini Pro	Reasoning + Context
Accounting	Gemini Pro	Claude Opus	Multi-modal + Reasoning
Education	Claude Sonnet	Gemini Flash	Feedback quality + Speed
Engineering/IT	Claude Opus (T)	Gemini Flash	80.9% coding + Speed
Healthcare	Gemini Pro	Claude Opus	Multi-modal + Accuracy
Sales	Claude Sonnet	–	Personalization
Research	Gemini Pro	Claude Opus (T)	Context + PhD-level reasoning
Customer Support	Gemini Flash	Claude Sonnet	Volume vs. Sensitive cases

(T) = Thinking mode

HR example (US): Screening 100 resumes for a Senior Product Manager role at a SaaS company. Claude Sonnet reads and summarizes each resume, rates fit against the job description, and outputs a ranked shortlist of 10 candidates with justification – all matching your company's hiring criteria without needing repeated explanations.

Legal example (US): Reviewing a 150-page commercial lease for a retail chain expansion. Gemini Pro reads the full document at once (2M+ context window), while Claude Opus in Thinking mode flags unfavorable clauses and explains why each clause poses a risk under California commercial real estate law.

2.3.5 Working Modes: Planning vs. Fast

Planning Mode: Think first, then act

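In Planning Mode, the Agent doesn't rush into action. Instead, it goes through a controlled process: it drafts a plan, waits for your review, executes the approved steps, and then summarizes what it did.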

Two important files are created:

implementation_plan.md – detailed plan before execution. Lists each step the Agent will take, which files will be created or modified, and potential risks. You can review, adjust, or reject this plan.
walkthrough.md – summary after completion. What the Agent did, what the results were, any issues to note.

When to use Planning Mode:

Complex tasks with many steps and significant impact
You need tight control over each change
First time working with the Agent for this type of task
High-stakes work where mistakes are unacceptable

Fast Mode: Quick and efficient

The Agent executes immediately – no planning phase, no waiting for approval.

When to use Fast Mode:

Simple, familiar tasks
Quick Q&A that doesn't change any files
Small edits you can verify immediately
You know the Agent's behavior well

Criterion	Planning Mode	Fast Mode
Speed	Slower	Fast
Control	High – you approve first	Lower – Agent decides
Best for	Complex, high-stakes work	Simple, familiar tasks
Risk	Low	Higher
Artifacts	Creates `implementation_plan.md` and `walkthrough.md`	None created
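The "when to use" criteria above can be condensed into a single rule of thumb: if any Planning Mode criterion applies, plan first. The function below is a minimal sketch of that rule; the parameter names paraphrase the bullet points and are not Antigravity settings.

```python
def choose_mode(
    many_steps: bool = False,
    high_stakes: bool = False,
    first_time: bool = False,
    needs_tight_control: bool = False,
) -> str:
    """Return "Planning" if any Planning Mode criterion applies, else "Fast"."""
    if many_steps or high_stakes or first_time or needs_tight_control:
        return "Planning"
    return "Fast"


print(choose_mode(high_stakes=True))  # Planning
print(choose_mode())                  # Fast
```

Treating Planning Mode as the result whenever any one criterion is true errs on the side of control, which matches the table's lower-risk rating for that mode.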

2.3.6 Combining Models and Modes

Gemini Flash + Fast Mode – "Maximum speed" combo for simple daily tasks.

Claude Sonnet + Planning Mode – For writing important content needing both prose quality and control.

Gemini Pro + Planning Mode – For strategic analysis and complex multi-step workflows.

Claude Opus (Thinking) + Fast Mode – For debugging code when you need to see the reasoning process immediately.

For beginners: Start with Gemini 3 Pro (High) + Planning Mode. Safest configuration – smart model, controlled mode.

After a few days, experiment: switch to Fast Mode for simple questions; try Gemini Flash for quick tasks and coding; try Claude Sonnet when writing content.

Key takeaway: No configuration is "right" for every situation. Develop the habit of checking before each important task: "Am I using the right model and mode for this?"

2.4 The Agent Capability Evolution Model

2.4.1 The Paradox of AI Maturity

Today we see large language models with remarkable reasoning capabilities. In theory, they're ready to take over complex work.

Yet when deployed in real operational workflows, many organizations encounter a paradox: the Agent is very smart at answering questions, but "immature" at running systems.

A typical scenario: you're a manager who just deployed an Agent to help process periodic reports. Week one, you guide it step by step – where to get data, which columns to extract, how to calculate. The Agent does excellent work and produces accurate results.

Week two, you say "Do the report like last time," and the Agent asks basic questions as if it's new: "Where do I get the data?" "What's the format?" You have to explain everything from scratch.

The problem isn't the AI's intelligence – it's the mechanism for crystallizing knowledge. The Agent can process information but needs a structured path to transform those isolated actions into reusable experience.

The parallel to human staff development:

A new hire goes through a natural development journey:

Observation phase – watches colleagues, asks questions, takes notes. Works based on direct guidance with no fixed process.
Process recognition phase – starts identifying patterns: "This task always starts with Step A, then B, ends at C." Forms a mental workflow; no longer confused as before.
Skill mastery phase – focuses on quality: how to make the report most readable, most accurate, precisely meeting company standards without being reminded.
Compliance phase – internalizes the boundaries: safety rules, security requirements, the "lines you don't cross" that protect organizational interests.

AI Agents need a similar journey to move from passive support tool to capable, trusted assistant.

2.4.2 Layer 1 – Knowledge (Discovery Phase)

Knowledge in Antigravity is equivalent to hands-on work experience – accumulated gradually through interaction, like a new employee "picking up" the culture and habits of their manager.

During day-to-day work, the Agent quietly observes patterns. It gradually understands how you want data presented, what writing style you prefer (personal preference). It remembers the project folder structure, the specialized terminology you use (work context). It notices which tasks you regularly do at specific times each week (habits).

This information doesn't need to be formally taught. The Agent derives it from conversation history and daily interactions.

Strengths: Flexibility. No need to prepare complex documentation upfront – just start working, and the system learns on its own. Perfect for the early phase of a project when everything is still taking shape.

Limitations: Consistency. Because it's based on personal experience and specific context, this knowledge is hard to apply broadly to other people or projects. An Agent that "gets you" may not work well with your colleague.

2.4.3 Layer 2 – Workflow (Standardization Phase)

When you notice yourself having to guide the Agent through the same action sequence more than three times, that's the signal to move from Knowledge to Workflow.

Instead of relying on the Agent's memory ("Do it like last time"), you need a more specific, explicit guide.

What Workflow is: Like a Standard Operating Procedure (SOP). A sequential list of steps to complete a task.

Key difference: Workflow is a manual process. It doesn't run itself. The user must actively trigger it with a specific command – for example, /weekly_report in Antigravity.

This is by design. Workflows focus on the processing steps – which often vary slightly depending on the situation. Requiring manual activation gives users control and the ability to adjust before the Agent begins.

Flexibility: Workflows are a framework, not a mold. You can ask the Agent to run a Workflow but skip step two, change the data source, or combine multiple Workflows together.

Workflow solves the efficiency problem – no need to re-explain the process each time. But it doesn't fully solve the quality problem – ensuring the output is consistently excellent.
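The "framework, not a mold" idea can be sketched in a few lines of Python. This is an illustration only, not Antigravity's internal format: the `Workflow` class, the step names, and the `skip` parameter are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical model of a Workflow: an ordered list of named steps."""
    name: str
    steps: list[str] = field(default_factory=list)

    def plan(self, skip: set[int] | None = None) -> list[str]:
        """Return the steps to run, in order, leaving out the 1-based step numbers in `skip`."""
        skip = skip or set()
        return [step for number, step in enumerate(self.steps, start=1)
                if number not in skip]


# Assumed example steps; a real Workflow would describe your own process.
weekly_report = Workflow(
    name="weekly_report",
    steps=["Collect office files", "Validate columns", "Sum expenses", "Write summary"],
)

full_run = weekly_report.plan()              # every step, in order
adjusted_run = weekly_report.plan(skip={2})  # "run it, but skip step two"
```

The step order is fixed once, but each run can still be adjusted, which is the kind of control manual activation is meant to preserve.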

2.4.4 Layer 3 – Skill (Specialization Phase)

If Workflow answers "What steps come first and next?", Skill answers "How do I do this excellently?"

Skill corresponds to deep expert capabilities developed through repeated practice. A veteran employee doesn't just remember the process – they've formed their own quality standards: the font must be precise, the layout must breathe, numbers must be accurate to the right unit.

In Antigravity, a Skill is a pre-packaged capability bundle. Unlike Workflows, Skills can activate automatically. When the Agent recognizes that your request matches a Skill's capabilities, it applies that Skill's quality standards without you calling the Skill by name.

Workflow vs. Skill summarized:

Workflow – focuses on execution logic, the steps to take. Flexible, easy to adjust, activated manually, guides the work.
Skill – focuses on product quality, what the output should look like. Stable, high standards, auto-activated when Agent recognizes a pattern, ensures quality.

Simply put: Workflow tells the Agent what to do. Skill teaches the Agent how to do it well.
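To make the contrast concrete, here is a toy sketch of auto-activation. How Antigravity actually decides that a request matches a Skill is not described here, so simple keyword matching stands in for it; the `Skill` class, the trigger words, and the standards listed are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Hypothetical Skill: trigger words plus the quality standards it enforces."""
    name: str
    triggers: tuple[str, ...]
    standards: tuple[str, ...]


# Assumed example Skills, for illustration only.
SKILLS = [
    Skill("report_formatting", ("report", "summary"),
          ("Amounts in USD with two decimals", "One-page executive summary first")),
    Skill("email_tone", ("email", "reply"),
          ("Professional tone", "Under 150 words")),
]


def matching_skills(request: str, skills: list[Skill]) -> list[Skill]:
    """Return every Skill with at least one trigger word in the request (keyword stand-in)."""
    text = request.lower()
    return [skill for skill in skills
            if any(trigger in text for trigger in skill.triggers)]


activated = matching_skills("Please build the quarterly expense report", SKILLS)
```

Nothing in the request names `report_formatting`, yet its standards would apply. A Workflow, by contrast, runs only when the user triggers it.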

When to upgrade from Workflow to Skill:

The process is stable with rarely-changing steps
You have strict requirements for output format and quality
You want to reuse this capability across multiple projects or share it with teammates

Skills turn personal experience into organizational assets.

2.4.5 Layer 4 – Rule (Control Phase)

There are situations where "doing the right process" and "doing it with skill" can still lead to disaster – when the Agent's actions, however well-intentioned, violate core safety or ethical principles.

Example: A request to extract customer data and email it to a personal address. The Agent knows how to extract data (Workflow). The Agent can format the data beautifully (Skill). But from a safety perspective, this is a serious data breach.

Without a control mechanism, the Agent's "enthusiasm" becomes a liability.

What Rule is: A set of mandatory limits and regulations the Agent must follow in every situation. It answers: "What is NOT ALLOWED?" and "Which principles must be followed absolutely?"

Rule categories in enterprise environments:

Data security: No sending sensitive data outside the company domain. No storing passwords in plaintext. No exporting customer PII to personal email – ever.

System integrity: Never delete or overwrite original source files. Always create a backup before editing any file.

Compliance: Mask personally identifiable information (PII) in any external-facing reports. Log all data access operations.

Rule lets managers delegate tasks to the Agent with confidence, knowing that even if something goes wrong, core safety boundaries remain protected.
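As a rough sketch, the data-security rule above behaves like a guard that runs before an action, whatever Workflow or Skill requested it. The function name, the `contains_sensitive_data` flag, and the placeholder domain `example.com` are assumptions for illustration, not part of Antigravity.

```python
COMPANY_DOMAIN = "example.com"  # placeholder; substitute your organization's domain


def is_allowed_recipient(recipient: str, contains_sensitive_data: bool,
                         company_domain: str = COMPANY_DOMAIN) -> bool:
    """Block sensitive data from leaving the company domain; allow everything else."""
    # Take the part after the last "@"; an address without "@" never matches the domain.
    domain = recipient.rsplit("@", 1)[-1].lower()
    if contains_sensitive_data and domain != company_domain.lower():
        return False
    return True


internal_ok = is_allowed_recipient("jane@example.com", contains_sensitive_data=True)
external_blocked = is_allowed_recipient("jane@gmail.com", contains_sensitive_data=True)
```

The check is deliberately blunt. A Rule expresses what is never allowed, so it should not depend on how well the rest of the task was carried out.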

2.4.6 The Adaptation and Evolution Mechanism

The KWSR system may appear to be a linear progression: learn → process → skill → rules. In reality, business environments are constantly changing.

A "perfect" process today may become obsolete tomorrow due to staffing changes, technology shifts, or strategy pivots. The system needs to be able to "learn again."

Two-directional evolution model:

Evolution (left to right): transforms scattered experience (Knowledge) into standardized processes (Workflow/Skill/Rule).

Adaptation (right to left): when a Rule or Skill is no longer relevant, retire it and return the Agent to a learning state (Knowledge) to find a new approach. Once the Agent has mastered the new way, establish a new Workflow, then package a new Skill.

Example: A company shifts from "Email Reports" to "Real-Time Dashboard Reports." The action: retire the old email Workflow and Rule, and use Knowledge to let the Agent learn the new Dashboard tool. Once it is proficient, set up a new Workflow for data updates, then package a new Skill for data visualization on the Dashboard.

This flexibility prevents the system from "hardening" – keeps it alive and relevant.

2.4.7 Measuring Agent Maturity

To manage effectively, you need to know which phase your Agent is in.

Maturity score = 100% × (1 − Correction interventions ÷ Total tasks assigned)

Scale:

Below 70% – Learning phase: needs close supervision and step-by-step guidance. Focus on building Knowledge.
70%-90% – Proficient phase: can be assigned work by process; verify only the final results. Focus on optimizing Workflow and Skill.
Above 90% – Expert phase: can grant high autonomy. Focus on governance through Rule and overseeing exceptions.
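A short worked example makes the score and scale concrete. It reads the intervention ratio as a percentage of tasks, so 6 corrections across 40 tasks gives 85%. The function names are hypothetical, and the numbers are invented for illustration.

```python
def maturity_score(interventions: int, total_tasks: int) -> float:
    """Percentage of assigned tasks that needed no correction."""
    if total_tasks <= 0:
        raise ValueError("total_tasks must be positive")
    return 100.0 * (total_tasks - interventions) / total_tasks


def maturity_phase(score: float) -> str:
    """Map a score onto the three phases of the scale above."""
    if score < 70:
        return "Learning"
    if score <= 90:
        return "Proficient"
    return "Expert"


score = maturity_score(interventions=6, total_tasks=40)  # 85.0
phase = maturity_phase(score)                            # "Proficient"
```

Tracking the score per task type rather than per Agent is probably more useful, since the same Agent can be an expert at one report and a beginner at another.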

Diagnosing intervention causes:

Intervention about context – Agent doesn't understand terminology, can't find files, misunderstands intent → Add Knowledge
Intervention about process – Agent did steps in wrong order or skipped a step → Refine Workflow
Intervention about quality – Agent did it correctly but result looked bad or had wrong format → Standardize Skill
Intervention about safety – Agent did something dangerous or risky → Establish Rule
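If you log each correction with one of the four causes above, a simple tally shows which layer needs attention first. This is a minimal sketch: the cause labels and the log format are assumptions, not something Antigravity records for you.

```python
from __future__ import annotations

from collections import Counter

# Cause of an intervention -> the remedy from the list above.
REMEDY_FOR_CAUSE = {
    "context": "Add Knowledge",
    "process": "Refine Workflow",
    "quality": "Standardize Skill",
    "safety": "Establish Rule",
}


def prioritized_remedies(intervention_log: list[str]) -> list[tuple[str, int]]:
    """Return (remedy, count) pairs, most frequent cause first.

    Raises KeyError if the log contains a cause outside the four categories.
    """
    counts = Counter(intervention_log)
    return [(REMEDY_FOR_CAUSE[cause], count) for cause, count in counts.most_common()]


# Invented example log for one week of work.
log = ["process", "context", "process", "quality", "process", "context"]
remedies = prioritized_remedies(log)
```

In this invented log, half of the corrections were about step order, so the Workflow is the first thing to refine.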

2.4.8 The Operator's Creative Role

Antigravity provides the foundation and tools, but you are the architect.

The Agent's maturity reflects the clarity of your management thinking:

Work haphazardly → Agent remains at the level of scattered Knowledge
Think in processes → Agent becomes an efficient operational engine (Workflow)
Prioritize quality → Agent becomes a specialist (Skill)
Think about risk management → Agent becomes a trustworthy assistant (Rule)

Don't expect magic to happen overnight. Patiently guide the Agent through each phase of the evolution model. That's the most sustainable way to build a Digital Twin that truly serves your business.

2.5 File Processing Capabilities

2.5.1 Opening Story

Imagine you're the accounting manager at a mid-sized retail company. One Monday morning, your controller asks: "Consolidate expenses from these 50 vendor invoices for our quarterly report. We need it by Wednesday."

You look at the stack of invoices and sigh. Each one needs to be opened, read for amount, date, and vendor name, then manually entered into Excel. At an average of three minutes per invoice, 50 invoices add up to 150 minutes – two and a half hours of tedious, error-prone work.

You remember the Agent your IT team deployed last week. The IT manager mentioned the Agent could "read files and process data automatically." You decide to try it.

You photograph all 50 invoices with your phone, save them to a folder on your computer, and open Antigravity: "Read all image files in the folder Invoices_October, extract amounts and dates, consolidate into Excel."

The Agent starts working. A few minutes later, results appear. You open the Excel file and… disappointment.

Several invoices have wrong amounts. The number 1,500 was read as 1,800. Several files were skipped entirely because they were "unreadable." The total is off by nearly $20,000 from reality.

You conclude: "The Agent is terrible – doing it by hand is better." And you go back to the manual approach.

But the truth is: the Agent wasn't bad. You just didn't yet understand the Agent's file processing capabilities.

2.5.2 Why the Agent Misreads

To understand why the Agent got things wrong in that scenario, you need to understand how the Agent "sees" different file types.

Imagine two kinds of gift boxes.

Box type one is transparent. You can see inside immediately – a book, a watch, a toy – and describe each detail precisely without opening the box.

Box type two is locked shut. Nothing on the outside hints at what's inside. You need a key to open it, or you have to guess based on weight and whether it rattles when shaken.

Computer files divide the same way.

"Transparent" files are plain-text files like TXT, CSV, and Markdown. Open them in Notepad and you can see every character clearly. The Agent reads these like looking into a transparent box – perfectly accurate, never wrong.

"Locked" files come in two subtypes:

Packaged files like Word (.docx) and Excel (.xlsx) are actually "boxes" containing many small files inside, compressed and packaged in a proprietary structure. The Agent doesn't have a built-in "key" to open these directly – it needs a special tool (like a Python library) to decode the internal structure. Opening them incorrectly breaks the structure and corrupts data.

Image files like JPG, PNG, or scanned PDFs aren't "locked boxes" but "frosted boxes." The Agent must look through frosted glass and guess what's inside. If the glass is very frosted (poor image quality), the Agent will guess wrong.

This is exactly what happened with the invoices. The Agent didn't misread because it was incompetent – it misread because those photos were like frosted boxes: some slightly frosted (read correctly), some heavily frosted (read wrong or unreadable).
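The "packaged file" idea can be checked directly: a .docx file is a ZIP archive whose main text lives in an XML part named `word/document.xml`. The sketch below uses only the Python standard library to pull out paragraph text. The function name is hypothetical, and real tools (and whatever library the Agent uses) handle much more, such as tables, images, and styles.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path: str) -> list[str]:
    """Return the plain text of each paragraph in a .docx file."""
    # Step 1: open the "box". A .docx is a ZIP archive of smaller files.
    with zipfile.ZipFile(path) as package:
        xml_bytes = package.read("word/document.xml")

    # Step 2: decode the internal structure. Text sits in <w:t> runs inside <w:p>.
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for paragraph in root.iter(f"{W_NS}p"):
        text = "".join(node.text or "" for node in paragraph.iter(f"{W_NS}t"))
        paragraphs.append(text)
    return paragraphs
```

Opening the same file as if it were plain text would show only compressed bytes, which is why the Agent needs a "key" for this file type.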

2.5.3 The Golden Rule of File Processing

From that story, one simple rule:

If the Agent can create that file type itself, it can definitely read and edit that file type perfectly.

The Agent can create text files, CSV files, Markdown files, and code files. These are "transparent boxes" it can see through completely. Therefore, the Agent reads and edits these files with absolute accuracy.

The Agent cannot create a photo taken by a camera, nor can it create a Word or Excel file from scratch (it needs a template or tool). Therefore, when working with these files, the Agent must use a "key" or "look through frosted glass." Results may be correct or may be wrong, depending on conditions.

Simple solution: Convert data to "transparent" format before handing it to the Agent. For text: convert to TXT or Markdown. For spreadsheets: export to CSV. This turns "locked boxes" into "transparent boxes" so the Agent can read accurately.

2.5.4 Three Capability Groups by Reliability

Group 1: High Reliability

Files the Agent reads perfectly. Verify lightly or not at all.

Plain text files – any file you can open in Notepad and read: TXT, Markdown, CSV, JSON, YAML. The Agent reads every character accurately, understands structure (headers, content, data), and can read, edit, and create these files without any supporting tools.

US example: If you have a vendor invoice list in CSV format with columns Date, Vendor, Amount – the Agent will handle it flawlessly. It can calculate totals, filter by vendor, sort by date, and output a report with high accuracy.

Code files – all programming languages: Python, JavaScript, HTML, CSS, SQL. The Agent not only reads code but understands the logic inside. It can find bugs, suggest fixes, and write new features.

This means if you need the Agent to analyze complex data, it can write a script to do so, run it, and return the results.
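For a sense of what such a script looks like, here is a minimal sketch that totals and sorts a vendor invoice list with the Date, Vendor, and Amount columns from the example above. The sample rows and function names are invented for illustration. `Decimal` is used instead of floating point so that currency amounts add up exactly.

```python
from __future__ import annotations

import csv
import io
from collections import defaultdict
from decimal import Decimal

# Invented sample data in the Date, Vendor, Amount layout.
SAMPLE_CSV = """Date,Vendor,Amount
2026-10-03,Acme Supply,1500.00
2026-10-01,Northwind,820.50
2026-10-07,Acme Supply,310.25
"""


def totals_by_vendor(csv_text: str) -> dict[str, Decimal]:
    """Sum the Amount column for each vendor."""
    totals: dict[str, Decimal] = defaultdict(Decimal)  # missing vendors start at 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Vendor"]] += Decimal(row["Amount"])
    return dict(totals)


def rows_sorted_by_date(csv_text: str) -> list[dict[str, str]]:
    """Return rows ordered by date; ISO dates (YYYY-MM-DD) sort correctly as text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(rows, key=lambda row: row["Date"])


totals = totals_by_vendor(SAMPLE_CSV)
earliest = rows_sorted_by_date(SAMPLE_CSV)[0]
```

Every value the script touches is a visible character in the file, which is why Group 1 formats need little or no verification.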

Group 2: Medium Reliability

Files the Agent can work with but needs additional processing through supporting tools. Verify results for important files.

Modern Excel and Word files (.xlsx and .docx) – created with Microsoft Office 2007 or later. The Agent uses Python libraries to open and read them. This takes a few extra seconds and may encounter errors with special formatting.

With Excel files: the Agent can read data from sheets, calculate, and create new tables. With Word files: it can read text and make edits. However, complex formatting like nested tables, images within documents, or macros may cause difficulty.

Tip: If you have Excel or Word files the Agent will process regularly, export to CSV or Markdown first. This upgrades reliability from medium to high.

Digital-text PDFs – PDFs created from computer-generated text (not scanned from paper). When you export a Word document to PDF, that's a digital-text PDF. The Agent can read the content but cannot edit it directly.

Group 3: Low Reliability

Files the Agent must "see" rather than "read." Results depend heavily on file quality. Always verify results when the data is important.

Image files (PNG, JPG, GIF) – the Agent can view images and describe their contents. It can read text in images via OCR and recognize tables and charts. Accuracy depends entirely on image quality.

Good quality image: photographed straight-on, sufficient lighting, clear text, no blurring. With good-quality images, the Agent can read correctly 90%+ of the time.

Poor quality image: photographed at an angle, dim lighting, small or blurred text, lots of noise. With poor-quality images, the Agent may misread or skip entirely.

Practical tip: Before handing images to the Agent, flip through them yourself. Ask: "If I had to read this image, could I easily?" If you find it hard to read, so will the Agent.

Scanned PDFs – PDFs created from a scanner or smartphone camera photos. These are image files packaged together – treated exactly like photos, with the same limitations.

Video and audio files – need additional processing steps. Video is separated into individual frames and audio. Audio is converted to text via speech recognition. Processing time: approximately 1-3 minutes per minute of audio, and 2-5 minutes per minute of video.

2.5.5 Applying This to the Invoice Problem

Now let's revisit the opening scenario with a better approach.

The scenario involved 50 photo images of invoices – Group 3, low reliability. The Agent can handle them, but accuracy depends on image quality.

Better approach (short-term): Before handing images to the Agent, review them quickly. Re-shoot blurry or angled photos. For invoices with very small text, zoom in and shoot closer.

After the Agent processes the batch, spot-check 5-10 invoices: compare amounts in the original photos against Agent-extracted amounts. If there's significant discrepancy, ask the Agent to reprocess with more specific guidance.
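The spot-check can be made systematic with a small script: draw a random sample, type in the true amounts for just those invoices, and list any that disagree. The function names, invoice IDs, and one-cent tolerance are assumptions for illustration.

```python
from __future__ import annotations

import random


def pick_spot_check(invoice_ids: list[str], sample_size: int = 10,
                    seed: int | None = None) -> list[str]:
    """Choose up to `sample_size` invoices at random for manual verification."""
    rng = random.Random(seed)  # pass a seed to make the selection repeatable
    count = min(sample_size, len(invoice_ids))
    return sorted(rng.sample(invoice_ids, count))


def find_discrepancies(extracted: dict[str, float], verified: dict[str, float],
                       tolerance: float = 0.01) -> list[str]:
    """Return IDs whose extracted amount differs from the manually verified amount.

    An invoice missing from `extracted` is treated as 0.0, so skipped files are flagged too.
    """
    return [invoice_id for invoice_id, true_amount in verified.items()
            if abs(extracted.get(invoice_id, 0.0) - true_amount) > tolerance]


# Invented example: the Agent read 1,500 as 1,800 on one invoice.
agent_amounts = {"INV-001": 1800.0, "INV-002": 250.0}
checked_by_hand = {"INV-001": 1500.0, "INV-002": 250.0}
mismatches = find_discrepancies(agent_amounts, checked_by_hand)
```

If the sample turns up even one or two mismatches, that is a signal to improve the photos and reprocess rather than to trust the rest of the batch.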

Better approach (long-term): If this is a monthly recurring task, propose that your company require vendors to submit e-invoices in digital PDF or Excel format rather than paper invoices. Many US vendors already support electronic invoicing (and many are now required to for government contracts).

With digital Excel or PDF invoices, the Agent processes the data faster and far more accurately than it can from photos – no more worrying about misread digits.

This is the Digital Twin mindset: not just delegating tasks to the Agent, but redesigning workflows so the Agent operates most effectively.

2.5.6 Unsupported File Formats

Some formats the Agent currently doesn't support – convert before handing over:

Compressed files (ZIP, RAR) – contain multiple files inside. The Agent can't read them directly; extract first.

Specialty files (Photoshop .psd, AutoCAD .dwg, professional design formats) – not supported. Export to common formats like PNG, JPG, or PDF first.

Password-protected or encrypted files – cannot be read. Unlock first.
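The three reliability groups and the unsupported formats can be combined into a quick triage step that sorts incoming files before the Agent sees them. The sketch below is hypothetical and looks only at the file extension, so it cannot tell a scanned PDF from a digital one or detect password protection.

```python
from pathlib import Path

# Extension -> reliability, following the groups described in this section.
RELIABILITY_BY_EXTENSION = {
    # Group 1: plain text and code
    ".txt": "high", ".md": "high", ".csv": "high", ".json": "high", ".yaml": "high",
    ".py": "high", ".js": "high", ".html": "high", ".css": "high", ".sql": "high",
    # Group 2: packaged Office files
    ".xlsx": "medium", ".docx": "medium",
    # Group 3: images
    ".png": "low", ".jpg": "low", ".gif": "low",
    # Unsupported: extract or export first
    ".zip": "unsupported", ".rar": "unsupported",
    ".psd": "unsupported", ".dwg": "unsupported",
}


def reliability_group(filename: str) -> str:
    """Classify a file by extension; PDFs need a manual look."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        return "medium if digital text, low if scanned"
    return RELIABILITY_BY_EXTENSION.get(suffix, "unknown")


group = reliability_group("Invoices_October.CSV")
```

Anything that comes back as "low", "unsupported", or "unknown" is a candidate for conversion before processing.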

2.5.7 Practical Recommendations for US Businesses

When building a Digital Twin, standardizing file formats is one of the most important steps for efficient Agent operation.

Prioritize plain text files. CSV and Markdown are the two formats the Agent processes fastest and most accurately. If data is currently in Excel, export to CSV before handing to the Agent. If documentation is in Word, consider converting to Markdown.

Always preserve original files. Keep originals in a separate folder; let the Agent work on copies. This prevents data loss if something goes wrong. Make this a Rule for your Agent.

Use meaningful file names. File names should describe the content – for example, 2026-10-Acme-Invoice-001.csv rather than data.csv or file1.csv. The Agent better understands context when file names are descriptive.

Organize folders logically. Put related files in folders with clear names. For example, all October 2026 invoices in Invoices_2026_10, all reports in Reports_2026. This helps the Agent find and process files efficiently.

Build a conversion process. If your business regularly receives files from outside in non-optimal formats, build a conversion process. Example: when receiving an Excel file from a vendor, Step 1 is always to export to CSV before further processing.
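The "always preserve original files" recommendation is easy to automate. The sketch below copies every file from an originals folder into a separate working folder and never modifies the source. The function name and folder layout are assumptions for illustration.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def make_working_copies(originals_dir: str | Path, working_dir: str | Path) -> list[Path]:
    """Copy each file in `originals_dir` to `working_dir`; originals are left untouched."""
    originals = Path(originals_dir)
    working = Path(working_dir)
    working.mkdir(parents=True, exist_ok=True)

    copies = []
    for source in sorted(originals.iterdir()):
        if source.is_file():  # subfolders are skipped in this simple version
            # copy2 also preserves timestamps, which helps when auditing later.
            copies.append(Path(shutil.copy2(source, working / source.name)))
    return copies
```

Point the Agent at the working folder only. If a batch goes wrong, delete the copies and start again from the originals.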

Next: Part 3 – Designing AI Agents: Knowledge, Workflow, Skill, and Rule in Depth