A Practical Guide

Codex in VS Code
for Academic Economists

Agentic AI assistance for R, Stata, Python, LaTeX, and research workflows

by Claes Bäckman

VS Code is one of several places Codex can run. If you have not picked a surface yet, see where to run agentic AI for a short tour of the alternatives (terminal, desktop app, web, mobile) before working through this guide. For the Anthropic counterpart, see the companion Claude Code guide and the side-by-side comparison.

Installation in 5 Steps

Installing Codex in VS Code is straightforward. The important thing is to install the official OpenAI extension and open a full project folder before asking it to do research work.

Mac users: replace Ctrl with Cmd throughout, for example Cmd+Shift+X for Extensions and Cmd+Shift+P for the Command Palette.
  1. Open VS Code and open the Extensions panel with Ctrl+Shift+X.
  2. Search "Codex" and install Codex - OpenAI's coding agent from OpenAI. The extension ID is openai.chatgpt. You can also press Ctrl+P and run ext install openai.chatgpt.
  3. Restart VS Code if the Codex icon does not appear right away. In VS Code, Codex opens in the right sidebar by default.
  4. Sign in with your ChatGPT account, or use an API key if that is how your account is set up. ChatGPT Plus, Pro, Business, Edu, and Enterprise plans include Codex usage.
  5. Open your full project folder in VS Code (File > Open Folder), not just a single file. Codex is most useful when it can see the directory tree, scripts, output folders, and documentation together.
[Figure: VS Code window. The Explorer shows the project folder (data/raw/, code/01_clean.R, code/02_analysis.R, paper/main.tex, AGENTS.md), the editor shows paper/main.tex, and the Codex panel is in Agent mode at medium reasoning with the prompt "@code/02_analysis.R explain why the clustered SEs changed."]
A useful setup: full folder in the Explorer, source files in the editor, Codex in the side panel.
Windows users: Codex can run natively on Windows with the Windows sandbox. If your research code depends on Linux tooling, WSL2 is often the smoother route.

After Installing - What to Do Next

Codex works out of the box, but a few one-time setup steps make it much more useful for empirical research. Everything below is covered in detail later; this is the short roadmap.

  1. Install extensions for your research stack. Stata, Python, LaTeX, CSV viewers, and PDF viewers make it easier for Codex to run code, inspect errors, preview output, and iterate inside VS Code.
  2. Create an AGENTS.md file for each project. Tell Codex once about your data, sample definitions, file layout, and coding conventions. Codex reads these instructions before it starts work.
  3. Learn the mode switcher. Use Chat when you only want advice, Agent when Codex may edit files and run commands, and Agent (Full Access) only when you deliberately want broader autonomy.
  4. Add skills for workflows you repeat. A standard robustness check, referee-style read, or table-formatting routine can become a reusable skill you invoke with $skill-name.

A Starter Pack for Economists

Install these alongside Codex to cover the full research stack. Search for each by extension ID in the Extensions panel (Ctrl+Shift+X).

Stata

tmonk.stata-workbench
A VS Code-compatible extension that allows Stata code to be run directly from the editor. Useful when Codex writes or edits .do files and you want to execute them without leaving VS Code. It also exposes an MCP server, so if you configure that server for Codex, Codex can interact with Stata more directly. Built by Thomas Monk, London School of Economics.

Python

ms-python.python
The core Python extension. Syntax highlighting, IntelliSense, and a run button for .py files. Required for most Python work in VS Code.
ms-python.vscode-pylance
Fast, type-aware language server for Python. Gives Codex better static context about variable types, function signatures, and import errors.
ms-python.debugpy
Python debugger. Set breakpoints and inspect variables mid-run - essential when tracking down why a panel merge or regression loop is producing unexpected results.
ms-python.vscode-python-envs
Manages virtual environments such as conda, venv, and pipenv from inside VS Code. Keeps project dependencies isolated when different papers use different package versions.
Why install Python even if you do not write Python? A surprising share of everyday empirical-work friction (converting .docx or PDFs to markdown, reshaping a messy CSV, scraping a table, pulling a series from an API, cleaning a bibliography file) is solved most quickly by a short Python script. Ask Codex in plain English, and it can write, run, and debug that script for you.

LaTeX

james-yu.latex-workshop
The standard LaTeX extension. Compiles on save, renders a side-by-side PDF preview, shows compile errors inline, and provides autocomplete for commands and citations. Pairs well with Codex for table, notation, and bibliography fixes.
yzane.markdown-pdf
Converts Markdown files to PDF with one command. Useful for quick memos, referee responses, or notes that do not need full LaTeX treatment.

Data Files

mechatroner.rainbow-csv
Colour-codes CSV columns so you can visually verify structure at a glance. Also adds a lightweight SQL-like query tool (RBQL) for filtering rows without loading them into R or Python.
tomoki1207.pdf
Renders PDFs inline in VS Code. Useful when you want papers, codebooks, or appendices visible next to the code Codex is editing.
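
If co-authors share the project folder, one way to pass the starter pack along is a workspace recommendations file, which VS Code reads from .vscode/extensions.json. The sketch below is a minimal example, not part of the original setup: the function name is hypothetical, and it assumes any existing file is strict JSON without comments.

```python
import json
from pathlib import Path

# Extension IDs copied from the starter pack above, plus Codex itself.
STARTER_PACK = [
    "openai.chatgpt",
    "tmonk.stata-workbench",
    "ms-python.python",
    "ms-python.vscode-pylance",
    "ms-python.debugpy",
    "ms-python.vscode-python-envs",
    "james-yu.latex-workshop",
    "yzane.markdown-pdf",
    "mechatroner.rainbow-csv",
    "tomoki1207.pdf",
]


def write_recommendations(project_root: str) -> Path:
    """Create or update .vscode/extensions.json with the starter pack."""
    target = Path(project_root) / ".vscode" / "extensions.json"
    target.parent.mkdir(parents=True, exist_ok=True)

    config = {}
    if target.exists():
        # Assumption: the existing file is strict JSON (no comments).
        config = json.loads(target.read_text(encoding="utf-8"))

    existing = config.get("recommendations", [])
    # dict.fromkeys drops duplicates while preserving order.
    config["recommendations"] = list(dict.fromkeys(existing + STARTER_PACK))

    target.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return target
```

Calling write_recommendations(".") from the project root should make VS Code offer these extensions to anyone who opens the folder.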
LaTeX build-file cleanup. By default, latex-workshop leaves auxiliary files (.aux, .bbl, .log, .fls, .fdb_latexmk, and so on) next to your .tex source. To have VS Code delete them automatically after each build, add three settings to .vscode/settings.json:
"latex-workshop.latex.autoClean.run": "onBuilt",
"latex-workshop.latex.clean.method": "glob",
"latex-workshop.latex.clean.fileTypes": [
  "*.aux", "*.bbl", "*.blg", "*.log",
  "*.out", "*.toc", "*.fls", "*.fdb_latexmk",
  "*.nav", "*.snm", "*.vrb"
]
The first line triggers cleanup after every build; the second tells it to match files by glob pattern; the third lists the patterns to delete.
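
If the project already has a settings file, the three keys can be merged in rather than pasted by hand. The following is a sketch under two assumptions: the helper name is made up for illustration, and the existing .vscode/settings.json is strict JSON (VS Code tolerates comments in this file, but Python's json module does not).

```python
import json
from pathlib import Path

# The three latex-workshop settings shown above.
CLEANUP_SETTINGS = {
    "latex-workshop.latex.autoClean.run": "onBuilt",
    "latex-workshop.latex.clean.method": "glob",
    "latex-workshop.latex.clean.fileTypes": [
        "*.aux", "*.bbl", "*.blg", "*.log",
        "*.out", "*.toc", "*.fls", "*.fdb_latexmk",
        "*.nav", "*.snm", "*.vrb",
    ],
}


def add_cleanup_settings(project_root: str) -> dict:
    """Merge the cleanup settings into .vscode/settings.json, keeping other keys."""
    path = Path(project_root) / ".vscode" / "settings.json"
    path.parent.mkdir(parents=True, exist_ok=True)

    settings = {}
    if path.exists():
        # Assumption: strict JSON. A file with comments will raise an error here.
        settings = json.loads(path.read_text(encoding="utf-8"))

    settings.update(CLEANUP_SETTINGS)
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```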

What Codex Can Read - and What to Convert First

Codex works best with plain text. The more a file looks like characters on a page, as opposed to a rendered binary, the better Codex can inspect, edit, and reason about it. For economists this matters: a referee report in .docx, a codebook as a scanned PDF, or a Stata dataset in .dta can all be useful inputs, but usually only after conversion to a text-friendly form.

A markdown file (.md) is the simplest useful form: a plain-text document with lightweight formatting conventions - # for headings, *italic*, **bold**, bullet lists, and links written as [text](url). It renders nicely but remains fully readable as raw text.

A rough guide to the formats you are most likely to encounter:

Plain text (.md, .txt, .tex, .bib, .csv, .json, .yaml)
Native - no friction. Codex reads these directly and can edit them line by line. Markdown and LaTeX are the ideal substrate for writing tasks. CSV and TSV are fine up to moderate sizes; larger files should be sampled, summarized, or queried with code.
Code (.py, .R, .do, .jl, .sh)
Native. Codex reads, edits, and can run these when the relevant interpreter is installed. Stata do-files are text, so Codex can edit them even if Stata itself is not available.
Jupyter (.ipynb)
Usable. Notebooks are JSON under the hood. Codex can inspect them directly, though for large notebooks it is often cleaner to export to a script or markdown first.
Word (.docx)
Convert first. For referee reports or co-author comments, convert to markdown with pandoc file.docx -o file.md. Codex can do this for you if pandoc is installed.
Excel (.xlsx, .xls)
Convert or inspect with code. Ask Codex to write a Python or R script that lists sheet names, previews columns, and exports one CSV per sheet. Do not paste entire workbooks into a prompt.
Stata / R binaries (.dta, .rds, .sav)
Convert first. Ask Codex to write Stata, R, or Python code that dumps a sample, a schema, a codebook, or a CSV version. For large datasets, summary tables are more useful than raw rows.
PDF (.pdf)
Extract text first for serious work. Inline PDF preview is convenient for you, but Codex gets better results from markdown or plain text extracted from the PDF. See the note below.
Images (.png, .jpg, figures)
Useful as visual context. The Codex IDE extension accepts images in prompts. This is good for plots, screenshots, and UI references. For scanned documents, run OCR first if the text matters.
Qualtrics / survey (.qsf)
Convert first. A .qsf file is technically JSON, but deeply nested. Ask Codex to flatten it into a markdown outline of blocks, questions, and answer choices before reviewing it.
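
For the "sample or summarize large files" advice, a small profile script is often enough. The sketch below uses only the standard library and writes a markdown summary with the column names, a row count, and the first few rows. The function name is hypothetical, and it assumes a UTF-8 file with a header row and no pipe characters in the sampled cells.

```python
import csv
from itertools import islice
from pathlib import Path


def profile_csv(path: str, sample_rows: int = 5) -> str:
    """Return a short markdown summary of a CSV: columns, row count, sample."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)  # assumption: the first row holds column names
        sample = list(islice(reader, sample_rows))
        # Count the remaining rows without holding them in memory.
        n_rows = len(sample) + sum(1 for _ in reader)

    lines = [
        f"# {Path(path).name}",
        f"- rows: {n_rows}",
        f"- columns ({len(header)}): {', '.join(header)}",
        "",
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in sample]
    return "\n".join(lines) + "\n"
```

Saving the returned text as, say, panel_summary.md next to the data gives Codex something small to read instead of the full file.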

Handling PDFs

PDFs are where researchers most often get stuck. They look like documents, but they are layout files. Text may be stored as page-positioned glyphs with no clean paragraph, column, or table structure.

A practical rule of thumb:

Born-digital PDFs (from LaTeX, Word, etc.)
Usually fine for short documents after text extraction. Expect some table, footnote, and multi-column mangling.
Long PDFs (50+ pages)
Convert to markdown first. A clean markdown version is far easier for Codex to navigate and much cheaper in context.
Scanned PDFs (image-based)
Run OCR first, for example with ocrmypdf, then convert the OCR'd PDF to markdown or text.
PDFs with equations and tables
These are the hardest case. If you have the source .tex, use that instead. It will almost always be cleaner than re-parsing the compiled PDF.

For converting PDFs to markdown, a few options work well:

pdftotext / pandoc
Good first pass. pdftotext is a free command-line tool for extracting text from clean, born-digital PDFs. pandoc does not read PDFs itself, but it can convert the extracted text or other source formats to markdown.
marker / MinerU / docling
ML-based PDF-to-markdown tools. Often better on papers with tables, figures, captions, and equations. Codex can help install and run them.
$pdf-to-markdown skill
Roll your own. A Codex skill can wrap your preferred conversion tools with sensible fallbacks, then report word counts and suspicious extraction failures.
Rule of thumb: if a file format requires conversion, do it once up front and keep the markdown or CSV version alongside the original. Every later Codex session then starts from clean text.
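
The rule-of-thumb table above can be written down as a small decision function, which is also a reasonable starting point for a conversion skill. This is only a sketch of the guidance in this section: the class, the field names, and the 50-page threshold parameter are illustrative choices, and the function returns a plan rather than running any tool.

```python
from dataclasses import dataclass


@dataclass
class PdfInfo:
    pages: int
    has_text_layer: bool                # False for image-only scans
    has_tex_source: bool = False        # True if the original .tex is available
    heavy_math_or_tables: bool = False  # equations or tables matter for the task


def conversion_plan(info: PdfInfo, long_threshold: int = 50) -> list[str]:
    """Translate the rule of thumb into an ordered list of steps."""
    if info.heavy_math_or_tables and info.has_tex_source:
        return ["use the .tex source instead of the compiled PDF"]

    steps = []
    if not info.has_text_layer:
        steps.append("run OCR first (for example with ocrmypdf)")
    if info.heavy_math_or_tables:
        steps.append("convert with an ML-based tool (marker, MinerU, or docling)")
    elif info.pages >= long_threshold:
        steps.append("convert the whole document to markdown")
    else:
        steps.append("extract text (pdftotext is a reasonable first pass)")
    steps.append("keep the converted file alongside the original PDF")
    return steps
```

For example, conversion_plan(PdfInfo(pages=120, has_text_layer=False)) puts the OCR step first and the markdown conversion second.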

AGENTS.md and Skills

AGENTS.md - Persistent Project Context

An AGENTS.md file gives Codex persistent instructions and context. It is the place to state things that are always true about your project: folder layout, data rules, code conventions, model notation, and what not to touch.

There are two especially useful levels:

~/.codex/AGENTS.md
Computer-level. Applies to every project on your machine. Use it for personal defaults: preferred languages, regression conventions, LaTeX notation, writing style, or recurring cautions. On Mac/Linux, ~ is your home folder; on Windows it is typically C:\Users\yourname\.
your-project/AGENTS.md
Folder-level. Applies when Codex is opened in that project or repository. Use it for one paper: data structure, sample restrictions, script order, output folders, and co-author conventions.

Codex can also read nested AGENTS.md and AGENTS.override.md files as it walks from the project root to your current working directory. More specific instructions appear later and take precedence.
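
To see which instruction files would apply from a given working directory, the walk can be mimicked in a few lines. This sketch is illustrative and is not Codex's own implementation: it assumes that AGENTS.override.md wins over AGENTS.md within a folder and that at most one file per folder is used, and it ignores the computer-level file in ~/.codex/.

```python
from pathlib import Path

# Assumption: within one folder, the override file takes priority.
CANDIDATES = ("AGENTS.override.md", "AGENTS.md")


def instruction_chain(project_root: str, cwd: str) -> list[Path]:
    """List instruction files from the project root down to cwd, general first."""
    root = Path(project_root).resolve()
    current = Path(cwd).resolve()

    # relative_to raises ValueError if cwd is not inside the project root.
    folders = [root]
    for part in current.relative_to(root).parts:
        folders.append(folders[-1] / part)

    chain = []
    for folder in folders:
        for name in CANDIDATES:
            candidate = folder / name
            if candidate.is_file():
                chain.append(candidate)
                break  # assumption: at most one instruction file per folder
    return chain
```

The last entry in the returned list is the most specific file, which is the one whose instructions would take precedence.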

Unlike Claude Code, Codex does not use CLAUDE.md. For a new empirical project, create AGENTS.md manually or ask Codex to draft one after it has inspected the folder. A useful starting point:

# Project: [Paper title]

## Data
- Unit of analysis: firm-year panel, 2010-2022
- Raw data lives in data/raw/ - never edit these files
- Main dataset after cleaning: data/clean/panel.dta
- When data are large, create summaries or samples instead of loading full files into context

## Code conventions
- R scripts are numbered: 01_clean.R, 02_analysis.R, 03_tables.R
- All regressions cluster SEs at the firm level unless instructed otherwise
- Use fixest for panel regressions
- Save generated tables to paper/tables/

## LaTeX
- Main file: paper/main.tex
- Use \widehat{} rather than \hat{} for estimators
- Keep notation consistent with paper/notation.md

## Safety
- Do not overwrite raw data
- Ask before installing new packages
- Run the relevant script after editing analysis code

Writing as well as code: AGENTS.md is also the right place for instructions about how you want Codex to write, such as tone, voice, and phrases to avoid. Paul Goldsmith-Pinkham has a useful post, "Writing and thinking with AI assistance", on using these instructions to get AI assistance that improves rather than flattens your prose.

Skills - Reusable Workflows

Codex skills are folders containing a SKILL.md file, plus optional scripts, references, and assets. They package task-specific instructions so Codex can follow a workflow reliably. Skills are available in the Codex CLI, IDE extension, and Codex app.

Codex skills are not slash commands in the Claude sense. In Codex, slash commands such as /status, /review, /cloud, and /local control the session. Skills are invoked by mentioning them directly, for example $robustness, or by typing $ and selecting from the list. Codex may also invoke a skill automatically when your request matches its description.

Skills can live in several places:

your-project/.agents/skills/
Repository skills checked into a project. Good for paper-specific workflows that co-authors should share.
~/.agents/skills/
User skills available across projects. Good for general research workflows you personally reuse.
/etc/codex/skills/
Machine or container-level skills, more common in shared lab or teaching environments.

To create a $robustness skill manually:

mkdir -p .agents/skills/robustness
# then create .agents/skills/robustness/SKILL.md

A minimal SKILL.md for an economist:

---
name: robustness
description: Use when asked to run standard robustness checks on the main regression.
---

For the main specification in this project:
1. Identify the baseline regression script and output table
2. Re-run with alternative clustering: industry-year instead of firm
3. Re-run dropping the top and bottom 1% of the outcome variable
4. Re-run on the pre-2020 subsample only
5. Produce a summary table comparing coefficients across all specifications
6. Report which files changed and which commands were run
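
The folder and file above can also be scaffolded from a script, which is handy when setting up several skills at once. A minimal sketch, assuming the project-level .agents/skills/ location described earlier; the function name is hypothetical and the front matter is limited to the two fields used in the example.

```python
from pathlib import Path


def scaffold_skill(project_root: str, name: str, description: str, steps: list[str]) -> Path:
    """Create .agents/skills/<name>/SKILL.md with front matter and numbered steps."""
    skill_dir = Path(project_root) / ".agents" / "skills" / name
    skill_dir.mkdir(parents=True, exist_ok=True)

    skill_file = skill_dir / "SKILL.md"
    if skill_file.exists():
        # Refuse to overwrite a skill someone has already edited.
        raise FileExistsError(f"{skill_file} already exists; edit it by hand")

    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    skill_file.write_text(
        f"---\nname: {name}\ndescription: {description}\n---\n\n{numbered}\n",
        encoding="utf-8",
    )
    return skill_file
```

For instance, scaffold_skill(".", "robustness", "Use when asked to run standard robustness checks on the main regression.", ["Identify the baseline regression script and output table"]) writes a file with the same shape as the example above.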

Downloading Skills Others Have Written

Codex includes a $skill-installer skill for installing curated or external skills. For local experimentation you can also copy a skill folder into ~/.agents/skills/ or into a project's .agents/skills/ directory. If a new skill does not appear immediately, restart Codex.
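
When a copied skill does not show up, it can help to check what is actually on disk before restarting. The sketch below lists every folder that contains a SKILL.md and reads its name and description. It is a rough check rather than a Codex feature: the function names are made up, and the front-matter parser only handles simple single-line key: value pairs.

```python
from pathlib import Path


def read_front_matter(skill_file: Path) -> dict:
    """Parse simple 'key: value' lines between the two '---' markers."""
    lines = skill_file.read_text(encoding="utf-8").splitlines()
    meta = {}
    if not lines or lines[0].strip() != "---":
        return meta
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def list_skills(*roots: str) -> list[dict]:
    """Return one record per <root>/<skill>/SKILL.md found under the given roots."""
    found = []
    for root in roots:
        base = Path(root).expanduser()
        if not base.is_dir():
            continue  # skip locations that do not exist on this machine
        for skill_file in sorted(base.glob("*/SKILL.md")):
            meta = read_front_matter(skill_file)
            found.append({
                "folder": str(skill_file.parent),
                "name": meta.get("name"),
                "description": meta.get("description"),
            })
    return found
```

A typical call would be list_skills(".agents/skills", "~/.agents/skills"). A skill whose name or description comes back as None probably has malformed front matter.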

For economists, the existing Claude-skill ecosystem is still useful as a source of ideas, but paths and invocation syntax need to be adapted for Codex. A few starting points:

Research feedback workflows. Commands for referee-style reviews of papers, grant proposals, and code. Port the instruction files into Codex skill folders when you want them available as $skills.
Scott Cunningham's empirical-workflow tools. A useful model for structuring repeatable research tasks, even when adapting from Claude-style folders to Codex-style skills.
Chris Blattman's skill reference. A clear example of how a working economist structures daily AI-assisted research workflows. Translate slash-command conventions into Codex skills where useful.
Worth knowing: Codex may invoke skills implicitly when their descriptions match your request. If you want a skill to run only when explicitly mentioned, add optional skill metadata that disables implicit invocation.

Approval Modes, Reasoning, and Cloud Work

The most important Codex habit is choosing the right level of autonomy for the task. The mode switcher sits under the chat input in the Codex panel.

Chat
Advice only. Use this for explaining code, brainstorming identification checks, reviewing a table, or planning changes before any files are touched.
Agent
Default working mode. Codex can read files, edit files, and run commands in the working directory. It asks for approval before working outside the directory or using the network.
Agent (Full Access)
Use sparingly. Codex can run with fewer approval prompts, including broader network access depending on your setup. This is convenient for trusted maintenance tasks but deserves care.
Reasoning effort
Start at medium. Use higher reasoning effort for complex debugging, refactors, or ambiguous empirical logic. Higher effort is slower and consumes usage faster.
Cloud
Delegate longer jobs. Use /cloud or the cloud controls when you want Codex to run a larger task remotely, then review the changes locally before merging them.
Local first for private data: if a task touches confidential datasets, restricted administrative data, or files you are not allowed to upload, keep the work local and avoid cloud delegation.

Git Integration

Codex is git-aware. When you open a project that is a git repository, Codex can inspect diffs, staged changes, branches, and commit history. This makes it useful for version control tasks alongside coding ones.

  1. Write commit messages
     Write a commit message for my staged changes.
  2. Explain what changed
     Summarise what changed between the last two commits in code/02_analysis.R.
  3. Review uncommitted work
     /review
     Review my uncommitted changes and flag anything that could change the sample or regression specification.
  4. Resolve a merge conflict
     I have a merge conflict in code/03_tables.R. Read both versions and resolve it, keeping the newer variable names.
No git? That is fine for local work. A Dropbox folder, shared drive, or plain folder on your desktop can still be a useful project root. Git becomes more important when you want history, branches, cloud delegation, or pull requests.

Managing Context

Codex can only hold a limited amount of project information in its working memory at once. Large empirical projects with many scripts, logs, data extracts, and output tables can fill that space quickly. A few habits help:

  1. Use @-mentions deliberately

Point Codex at the specific files you want it to use instead of making it infer the whole project.

@code/02_analysis.R fix the clustering in the main spec.
  2. Add selected text as context

Select the exact equation, table, or error message in VS Code and use the Codex command to add the selection to the current thread.

Explain this selected LaTeX error and patch the table.
  3. Start new threads for new tasks

A fresh thread avoids dragging old assumptions into a new problem. In VS Code, Cmd+N or Ctrl+N creates a new Codex thread by default.

  4. Check status

Use /status to see thread information, context usage, and rate-limit information.

  5. Use Auto Context with care

/auto-context can include recent files and IDE context automatically. It is convenient for small projects, but explicit file mentions are often cleaner for large research folders.

AGENTS.md helps here too. Because Codex reads persistent project guidance before working, you do not need to re-explain the same data structure and file conventions in every prompt.
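
To find out which files are most likely to crowd the context window, a rough size survey is usually enough. The sketch below ranks text-like files by an approximate token count. The four-bytes-per-token ratio is a common rule of thumb and not Codex's actual tokenizer, and the suffix list and function name are illustrative.

```python
from pathlib import Path

# Text-like formats from the table earlier in this guide, plus logs.
TEXT_SUFFIXES = {".md", ".txt", ".tex", ".bib", ".csv", ".json", ".py", ".R", ".do", ".log"}
BYTES_PER_TOKEN = 4  # rough heuristic only, not a real tokenizer


def largest_text_files(project_root: str, top: int = 10) -> list[tuple[str, int]]:
    """Return (relative path, approximate tokens) for the largest text files."""
    root = Path(project_root)
    sizes = []
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in TEXT_SUFFIXES:
            approx_tokens = path.stat().st_size // BYTES_PER_TOKEN
            sizes.append((str(path.relative_to(root)), approx_tokens))
    # Largest files first.
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]
```

Files near the top of the list are good candidates for a summary or sample rather than a direct @-mention.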