Resources

This is a deliberately opinionated set of references I come back to when building systems and when designing AI-heavy platforms. If a link feels "boring," it is usually because it survived contact with production.

Architecture and systems references

Designing Data-Intensive Applications - The baseline vocabulary for data systems: storage engines, replication, partitioning, stream processing, and the tradeoffs you will otherwise rediscover painfully.
The Datacenter as a Computer - Warehouse-scale computing fundamentals; clarifies why low tail latency and high utilization pull against each other.
Google SRE Book - Reliability principles and mechanisms (SLOs, error budgets, on-call hygiene) described by people who operate what they build.
Google SRE Workbook - The more actionable companion; especially useful when you need to translate reliability goals into engineering tasks.
AWS Builders' Library - High-signal essays on operational architecture, resilience, and failure economics; vendor-hosted, but broadly applicable.
Release It! (Michael Nygard) - Stability patterns that show up everywhere: timeouts, bulkheads, back-pressure, circuit breakers, and capacity as a first-class design constraint.
The Architecture of Open Source Applications - Deep dives into real systems by the people who built them; a great antidote to architecture-by-diagram.
ACM Queue - Practitioners writing for practitioners; excellent for calibrating your mental model of modern systems.
USENIX (OSDI/NSDI/FAST) - Primary sources for distributed systems design; when you want actual data instead of "best practice" folklore.
Papers We Love - A well-curated on-ramp into classic systems papers; useful for building a personal reference spine.
Architecture Decision Records (ADR) - A lightweight, durable way to make architectural intent legible over time (and to capture why you said "no" to the other options).
Martin Fowler - Delivery and architecture essays; treat as lenses, not scripture.
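
To make the SLO and error-budget vocabulary from the SRE entries above concrete, here is a minimal sketch of the arithmetic. It assumes a simple time-based availability SLO over a rolling window; the function names and the 30-day default are my own illustrative choices, not something taken from the books.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of full unavailability an availability SLO allows over the window."""
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be in (0, 1]")
    return (1.0 - slo) * window_days * 24 * 60


def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = error_budget_minutes(slo, window_days)
    if budget == 0:
        raise ValueError("a 100% SLO leaves no error budget")
    return 1.0 - downtime_minutes / budget


if __name__ == "__main__":
    # Hypothetical numbers: a 99.9% SLO and 10.8 minutes of downtime so far.
    print(f"{error_budget_minutes(0.999):.1f} minutes of budget")
    print(f"{budget_remaining(0.999, 10.8):.0%} of budget remaining")
```

A 99.9% target over 30 days works out to roughly 43 minutes of allowed downtime. That is usually the number that turns a reliability goal into a prioritization conversation.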

AI engineering toolchain

OpenAI Platform Docs - Core API patterns (tool use, structured outputs, streaming), safety guidance, and the constraints that matter when you move beyond demos.
OpenAI Cookbook - Practical integration patterns; best used as a menu of implementation sketches to adapt, not copy.
Anthropic Docs - A good comparative reference for tool use and message design; cross-checking vendors is a fast way to reveal hidden assumptions.
LangGraph - Agent orchestration as explicit state machines; helpful when "a chain" becomes a workflow with retries, branches, and audits.
LlamaIndex - Retrieval plumbing and connectors; useful when your system boundary is "knowledge," not just prompts.
LiteLLM - Provider abstraction and gateway patterns; reduces lock-in and makes multi-model routing and cost control less bespoke.
vLLM - High-throughput inference for open models; a practical default when you care about batching, latency, and GPU utilization.
Ollama - Local model runner for fast iteration; great for prototyping agent behavior without burning cloud cycles.
Transformers (Hugging Face) - The reference implementation ecosystem for open models; also a reality check on what "just run it" actually entails.
pgvector - Vector search inside Postgres; a strong option when operational simplicity beats specialized infra.
Qdrant - Vector database with clear operational docs; good when you need filtering, payloads, and predictable retrieval behavior.
OpenTelemetry - If it matters, instrument it: unify traces/metrics/logs across orchestration, retrieval, and model calls.
LangSmith - Tracing, dataset-driven evaluation, and regression testing for agent workflows.
Phoenix (Arize) - Open-source observability/evals for LLM apps; useful when you want transparency and local control.
Weights & Biases - Experiment tracking, artifact versioning, and run-level visibility; handy when you treat prompts/config as versioned assets.
promptfoo - A pragmatic evaluation harness; makes prompt/model changes behave more like real software changes (diffs, baselines, regressions).
Ragas - RAG evaluation metrics and pipelines; a starting point for measuring retrieval quality beyond vibes.
OpenAI Evals - An eval harness you can read end-to-end; useful for designing your own regression suite and scorer patterns.
FAISS - Core vector search library; worth knowing even if you never run it directly, because many stacks embed its assumptions.
Introduction to Information Retrieval - Retrieval fundamentals (scoring, ranking, evaluation) that make RAG systems more measurable and less mystical.
Retrieval-Augmented Generation (Lewis et al., 2020) - The canonical framing for RAG; useful for vocabulary, baselines, and conceptual boundaries.
ReAct (Yao et al., 2022) - A clean description of reasoning + tool-use loops; the ancestor of many agent orchestration patterns.
OWASP Top 10 for LLM Applications - Threat modeling and common failure modes; use it to design guardrails that survive adversarial input.
NIST AI Risk Management Framework - A structured way to talk about AI risk with grown-ups (security, safety, governance) without collapsing into theater.
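
As a small example of what "measuring retrieval quality beyond vibes" can look like, here is a sketch of two standard information-retrieval metrics, recall@k and reciprocal rank, in plain Python. It is not the API of Ragas or any other tool listed above, and the document IDs are hypothetical.

```python
from typing import Sequence, Set


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Share of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    # Deduplicate the top-k slice so a repeated ID is not counted twice.
    hits = len(set(ranked_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids)


def reciprocal_rank(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


if __name__ == "__main__":
    # Hypothetical retriever output and ground-truth labels for one query.
    ranked = ["d3", "d1", "d7", "d2"]
    relevant = {"d1", "d2"}
    print(recall_at_k(ranked, relevant, k=3))
    print(reciprocal_rank(ranked, relevant))
```

Averaging these over a fixed query set gives a baseline you can diff when the chunking, embedding model, or index changes.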

Writing and research workflow

Obsidian - Local-first knowledge base for long-form drafting; excellent when you treat notes as a system (links, maps, refactoring).
Readwise - Capture highlights and run a spaced review loop; good for turning "I read it" into "I can reuse it."
Zotero - PDF/library management for papers and specs; keeps citations, tags, and search sane once your library stops fitting in your head.
Better BibTeX for Zotero - Stable citation keys and export workflows; useful if you ever want your references to be reproducible.
Pandoc - The universal format converter; helpful for publishing pipelines and for keeping content portable over time.
Mermaid - Diagrams as text; works well when architectural diagrams should be versioned alongside code.
Raycast - Automation surface for repeatable workflows; small time savings that compound if you actually use them.

If you have a resource recommendation that consistently improves technical outcomes, send it through the Contact page.