March 13, 2026

Track ML Research Trends from Codex CLI with arXiv Search

Machine learning moves fast. New architectures, training techniques, and benchmark results appear on arXiv daily. If you’re building ML systems, keeping up helps you avoid reimplementing solved problems and adopt techniques that have results behind them. AgentPatch’s arXiv Search tool lets Codex query arXiv directly, so you can check recent work without leaving your terminal.

Why This Matters

There’s a gap between what gets published on arXiv and what shows up in blog posts and Twitter threads weeks later. If you’re waiting for the filtered version, you’re always behind. But checking arXiv manually every day isn’t realistic either — the volume is too high, especially in categories like cs.LG and cs.CL.

Having arXiv Search available in Codex means you can ask targeted questions about what’s been published recently. Instead of browsing the firehose, you query for exactly what’s relevant to your current work.

Setup

The AgentPatch CLI is designed for AI agents to use via shell access. Install it, and your agent can discover and invoke any tool on the marketplace.

Install (zero dependencies, Python 3.10+):

pip install agentpatch

Set your API key:

export AGENTPATCH_API_KEY=your_api_key

Example commands your agent will use:

ap search "web search"
ap run google-search --input '{"query": "test"}'

Get your API key from the AgentPatch dashboard.

To give Codex direct access to the tools over MCP, add AgentPatch to ~/.codex/config.toml:

[mcp_servers.agentpatch]
url = "https://agentpatch.ai/mcp"
bearer_token_env_var = "AGENTPATCH_API_KEY"

Make sure your API key is exported in the shell that launches Codex:

export AGENTPATCH_API_KEY=your_api_key

Replace your_api_key with your actual key from the AgentPatch dashboard. Codex discovers all AgentPatch tools automatically on next start.

Example

You’re working on a text generation service and want to know if there are new decoding strategies worth considering:

“Search arXiv for papers on decoding strategies for language models published in the last month. Category cs.CL, sorted by most recent, give me 15.”

Codex queries arXiv:

Searching arXiv for "decoding strategies language models" in cs.CL,
sorted by submittedDate descending, limit 15...

Found 15 results. Grouping by approach:

## Speculative Decoding Improvements (5 papers)
This is the most active area right now:
- "Multi-Draft Speculative Decoding with Optimal Acceptance" (2026-03-08)
  Uses multiple draft models in parallel instead of one. Gets 3.1x speedup
  over autoregressive decoding on Llama-3 70B.
- "Self-Speculative Decoding via Layer Skipping" (2026-03-02)
  No draft model needed — uses early exit from intermediate layers as the
  draft. 2.4x speedup with zero additional memory.

## Structured/Constrained Decoding (4 papers)
- "Grammar-Constrained Decoding with Adaptive Lookahead" (2026-02-28)
  Makes grammar-constrained generation faster by pruning the lookahead
  tree based on grammar state.

## Sampling Methods (3 papers)
- "Entropy-Adaptive Sampling for Open-Ended Generation" (2026-03-05)
  Dynamically adjusts temperature based on the model's token-level
  entropy. Reduces repetition without hurting creativity scores.

## Evaluation and Benchmarks (3 papers)
Mostly new benchmarks for measuring decoding quality.

The self-speculative decoding paper (#2) is probably the most directly useful
for your service — it gives a significant speedup without needing to train or
host a separate draft model. Want me to look at its implementation details?

“Yes. Also check if there’s a reference implementation.”

From the abstract and paper metadata:
- The approach works by routing tokens through only the first N layers for
  the draft phase, then verifying with the full model.
- The authors report results on Llama-3 and Mistral architectures.
- Comment field says "Code available at github.com/..."

I can search for the repo on GitHub if you want to check the implementation
directly.

Two prompts and you’ve gone from “what’s new in decoding?” to a specific technique and a pointer to code you can evaluate. That’s a research workflow compressed into a few minutes.
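For a sense of what the first search amounts to, here is a rough equivalent expressed as a direct request to arXiv's public API. How AgentPatch's tool builds its request is not described here, so treat this as a sketch under that assumption. The function name build_arxiv_query is hypothetical; the endpoint and the search_query, sortBy, sortOrder, start, and max_results parameters are the ones arXiv's API accepts.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query(terms: str, category: str, max_results: int = 15) -> str:
    """Build an arXiv API URL for the newest papers matching every word in `terms`.

    Hypothetical helper for illustration; it only builds the URL and sends nothing.
    """
    # Require each word to appear somewhere in the record, within one category.
    word_filters = " AND ".join(f"all:{word}" for word in terms.split())
    params = {
        "search_query": f"cat:{category} AND {word_filters}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


print(build_arxiv_query("decoding strategies language models", "cs.CL"))
```

The response is an Atom feed with one entry per paper. Note that this query has no date filter: "published in the last month" is satisfied here only by sorting newest first and reading the dates in the results.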

You can make this a recurring habit. Start each week with a quick check:

“What new papers appeared in cs.LG this week? Summarize the ones most relevant to my project.”

Codex runs the search and gives you a five-minute briefing before you start coding.
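A weekly check comes down to a date-bounded query. As a sketch of how that window could be expressed, the snippet below builds an arXiv search_query string using the API's submittedDate range filter, which takes timestamps in YYYYMMDDHHMM form in GMT. The function name weekly_window_query is hypothetical, and whether AgentPatch's tool exposes a date filter like this is an assumption.

```python
from datetime import datetime, timedelta, timezone


def weekly_window_query(category: str, now: datetime, days: int = 7) -> str:
    """Return an arXiv search_query limited to the last `days` days.

    Hypothetical helper; `now` should be timezone-aware so the GMT conversion is exact.
    """
    end = now.astimezone(timezone.utc)
    start = end - timedelta(days=days)
    fmt = "%Y%m%d%H%M"  # YYYYMMDDHHMM, the format the range filter expects
    return (
        f"cat:{category} AND "
        f"submittedDate:[{start.strftime(fmt)} TO {end.strftime(fmt)}]"
    )


# Fixed timestamp so the example is reproducible.
print(weekly_window_query("cs.LG", datetime(2026, 3, 13, 9, 0, tzinfo=timezone.utc)))
# → cat:cs.LG AND submittedDate:[202603060900 TO 202603130900]
```

cs.LG receives a large number of submissions each week, so in practice you would combine the window with topic terms rather than read the whole category.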

Wrapping Up

AgentPatch’s arXiv Search turns Codex into a research-aware coding tool. You can track what’s new in any ML subfield, check for techniques relevant to your current project, and stay ahead of trends — all from your terminal. The same connection gives Codex access to every other tool on the marketplace. See what’s available at agentpatch.ai.
