<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>#reliability | CurrentStack</title><description>Articles tagged with #reliability on CurrentStack.</description><link>https://currentstack.io/</link><language>en-us</language><item><title>Inference Reliability in 2026: Vendor Verification, Multi-Provider Routing, and SLO-Aware Fallbacks</title><link>https://currentstack.io/stories/inference-vendor-verification-multi-provider-slo-2026-04-20/</link><guid isPermaLink="true">https://currentstack.io/stories/inference-vendor-verification-multi-provider-slo-2026-04-20/</guid><description>How teams should verify model provider claims and design resilient routing across heterogeneous inference backends.</description><pubDate>Mon, 20 Apr 2026 00:00:00 GMT</pubDate><category>ai</category><category>llm</category><category>cloud</category><category>reliability</category><category>observability</category></item><item><title>Windows 11 May 2026 Reliability Update: Enterprise Rollout Blueprint with AI Surface Controls</title><link>https://currentstack.io/stories/windows-11-may-2026-reliability-rollout-playbook/</link><guid isPermaLink="true">https://currentstack.io/stories/windows-11-may-2026-reliability-rollout-playbook/</guid><description>A practical deployment strategy for Windows core reliability updates while controlling AI-feature drift and endpoint risk.</description><pubDate>Mon, 20 Apr 2026 00:00:00 GMT</pubDate><category>reliability</category><category>security</category><category>enterprise</category><category>observability</category><category>automation</category></item><item><title>Dynamic Workers + Durable Objects: Stateful Agent Sandbox Patterns That Actually Hold in Production</title><link>https://currentstack.io/stories/dynamic-workers-durable-objects-agent-state-patterns-2026-04-14/</link><guid isPermaLink="true">https://currentstack.io/stories/dynamic-workers-durable-objects-agent-state-patterns-2026-04-14/</guid><description>An implementation playbook for combining fast sandbox startup with deterministic state control in agent workloads.</description><pubDate>Tue, 14 Apr 2026 00:00:00 GMT</pubDate><category>cloud</category><category>agents</category><category>serverless</category><category>architecture</category><category>reliability</category></item><item><title>Intel + Terafab and the New AI Chip Race: A Supply-Chain Risk Playbook for Platform Teams</title><link>https://currentstack.io/stories/intel-terafab-ai-chip-supply-chain-risk-playbook-2026-04-08-noon/</link><guid isPermaLink="true">https://currentstack.io/stories/intel-terafab-ai-chip-supply-chain-risk-playbook-2026-04-08-noon/</guid><description>How to prepare engineering and procurement strategy for a volatile AI compute supply chain as new mega-fabrication initiatives emerge.</description><pubDate>Wed, 08 Apr 2026 00:00:00 GMT</pubDate><category>cloud</category><category>finops</category><category>enterprise</category><category>architecture</category><category>reliability</category></item><item><title>GitHub Actions OIDC Custom Properties and Azure VNET Failover: Identity and Resilience by Design</title><link>https://currentstack.io/stories/github-actions-oidc-custom-properties-vnet-resilience-2026-04-07/</link><guid isPermaLink="true">https://currentstack.io/stories/github-actions-oidc-custom-properties-vnet-resilience-2026-04-07/</guid><description>A practical operating model for using repository custom property claims in OIDC tokens and Azure private networking failover in GitHub Actions.</description><pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate><category>ci/cd</category><category>cloud</category><category>identity</category><category>networking</category><category>reliability</category></item><item><title>GitHub Actions Service Container Entrypoints: A Cleaner Path to Deterministic CI Environments</title><link>https://currentstack.io/stories/github-actions-service-container-entrypoint-architecture-2026-04-07/</link><guid isPermaLink="true">https://currentstack.io/stories/github-actions-service-container-entrypoint-architecture-2026-04-07/</guid><description>How the new service container entrypoint/command overrides reduce CI glue code and improve reproducibility, security, and troubleshooting.</description><pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate><category>devops</category><category>platform</category><category>ci/cd</category><category>automation</category><category>reliability</category></item><item><title>Programmable DDoS Mitigation: Operating Custom UDP Protection Without Breaking Production</title><link>https://currentstack.io/stories/programmable-ddos-mitigation-magic-transit-playbook-2026-04-07/</link><guid isPermaLink="true">https://currentstack.io/stories/programmable-ddos-mitigation-magic-transit-playbook-2026-04-07/</guid><description>A practical rollout guide for programmable flow protection on global networks, including safety controls, test harnesses, and incident runbooks.</description><pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate><category>security</category><category>networking</category><category>site-reliability</category><category>reliability</category><category>architecture</category></item><item><title>When AI Vendors Issue Service Credits: Turning Incident Apologies into Procurement Signals</title><link>https://currentstack.io/stories/ai-vendor-credit-incident-slo-procurement-2026-04-06-c/</link><guid isPermaLink="true">https://currentstack.io/stories/ai-vendor-credit-incident-slo-procurement-2026-04-06-c/</guid><description>How to use credit events and compensation programs as structured input for SLO governance, vendor scoring, and renewal decisions.</description><pubDate>Mon, 06 Apr 2026 00:00:00 GMT</pubDate><category>ai</category><category>enterprise</category><category>finops</category><category>reliability</category><category>compliance</category><category>product</category></item><item><title>Local-First Is Back: Production Architecture Patterns with SQLite WASM and OPFS</title><link>https://currentstack.io/stories/local-first-sqlite-wasm-opfs-production-architecture-2026-04-03/</link><guid isPermaLink="true">https://currentstack.io/stories/local-first-sqlite-wasm-opfs-production-architecture-2026-04-03/</guid><description>How to adopt browser-side SQLite safely for offline-capable products without losing sync correctness or observability.</description><pubDate>Fri, 03 Apr 2026 00:00:00 GMT</pubDate><category>database</category><category>architecture</category><category>performance</category><category>reliability</category></item><item><title>GitHub Actions Timezone and Environment Controls: An Operations Playbook for Global Teams</title><link>https://currentstack.io/stories/github-actions-timezone-environment-governance-playbook-2026-04-02/</link><guid isPermaLink="true">https://currentstack.io/stories/github-actions-timezone-environment-governance-playbook-2026-04-02/</guid><description>A practical guide to redesigning CI/CD schedules and environment approvals after GitHub Actions timezone and environment behavior updates.</description><pubDate>Thu, 02 Apr 2026 00:00:00 GMT</pubDate><category>devops</category><category>ci/cd</category><category>platform-engineering</category><category>automation</category><category>enterprise</category><category>reliability</category></item><item><title>From Security Tab to Security &amp; Quality: A Better DevSecOps Operating Model</title><link>https://currentstack.io/stories/github-security-quality-tab-devsecops-metrics-2026-04-02/</link><guid isPermaLink="true">https://currentstack.io/stories/github-security-quality-tab-devsecops-metrics-2026-04-02/</guid><description>How to use GitHub’s Security &amp; quality surface to unify vulnerability response, code health, and engineering accountability.</description><pubDate>Thu, 02 Apr 2026 00:00:00 GMT</pubDate><category>security</category><category>devops</category><category>reliability</category><category>platform-engineering</category><category>compliance</category></item><item><title>Tailscale’s New macOS Architecture: Migration Lessons for Endpoint Networking Teams</title><link>https://currentstack.io/stories/tailscale-macos-network-extension-migration-operations-2026-04-02/</link><guid isPermaLink="true">https://currentstack.io/stories/tailscale-macos-network-extension-migration-operations-2026-04-02/</guid><description>Operational guidance for teams adapting to Tailscale’s updated macOS model, with rollout controls, support playbooks, and security validation.</description><pubDate>Thu, 02 Apr 2026 00:00:00 GMT</pubDate><category>networking</category><category>security</category><category>zero-trust</category><category>platform</category><category>reliability</category></item><item><title>Axios NPM Compromise Lessons: Transitive Dependency Risk Governance for 2026</title><link>https://currentstack.io/stories/axios-npm-compromise-transitive-risk-governance-2026-04-01/</link><guid isPermaLink="true">https://currentstack.io/stories/axios-npm-compromise-transitive-risk-governance-2026-04-01/</guid><description>A response framework for handling package compromise events with rapid containment, provenance checks, and policy hardening.</description><pubDate>Wed, 01 Apr 2026 00:00:00 GMT</pubDate><category>supply-chain</category><category>security</category><category>open-source</category><category>compliance</category><category>reliability</category></item><item><title>When the LLM Gateway Is Compromised: Enterprise Incident Response After LiteLLM-Type Events</title><link>https://currentstack.io/stories/litellm-compromise-enterprise-llm-gateway-response-2026-04-01/</link><guid isPermaLink="true">https://currentstack.io/stories/litellm-compromise-enterprise-llm-gateway-response-2026-04-01/</guid><description>A containment and recovery architecture for organizations relying on shared model gateways in production.</description><pubDate>Wed, 01 Apr 2026 00:00:00 GMT</pubDate><category>security</category><category>ai</category><category>supply-chain</category><category>platform-engineering</category><category>reliability</category></item><item><title>Code Verification Agents and the New Economics of AI-Generated Software</title><link>https://currentstack.io/stories/code-verification-agents-quality-economics-2026-03-31/</link><guid isPermaLink="true">https://currentstack.io/stories/code-verification-agents-quality-economics-2026-03-31/</guid><description>Why test/review verification agents are becoming core infrastructure as coding output scales, and how to adopt them without slowing delivery.</description><pubDate>Tue, 31 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>agents</category><category>testing</category><category>reliability</category><category>devops</category><category>engineering</category></item><item><title>MCP over gRPC in the Enterprise: Integration Contracts, SLOs, and Failure Design</title><link>https://currentstack.io/stories/mcp-grpc-enterprise-integration-contracts-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/mcp-grpc-enterprise-integration-contracts-2026/</guid><description>How to adopt MCP ecosystems without losing control of transport contracts, latency budgets, and incident handling.</description><pubDate>Tue, 31 Mar 2026 00:00:00 GMT</pubDate><category>agents</category><category>api</category><category>grpc</category><category>platform-engineering</category><category>reliability</category><category>observability</category></item><item><title>After Sora’s Reported Shutdown Signals: A Product-Risk Playbook for AI Video Teams</title><link>https://currentstack.io/stories/ai-video-sora-shutdown-product-risk-playbook-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/ai-video-sora-shutdown-product-risk-playbook-2026/</guid><description>What AI video teams should change in roadmap planning, vendor strategy, and reliability governance when flagship services face disruption.</description><pubDate>Sun, 29 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>product</category><category>startup</category><category>platform</category><category>reliability</category></item><item><title>Post-Quantum TLS Hybrid Migration: Operational Checklist for 2026</title><link>https://currentstack.io/stories/post-quantum-tls-hybrid-migration-ops-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/post-quantum-tls-hybrid-migration-ops-2026/</guid><description>A step-by-step migration model for hybrid post-quantum TLS with latency budgets, compatibility tests, and incident playbooks.</description><pubDate>Sun, 29 Mar 2026 00:00:00 GMT</pubDate><category>security</category><category>networking</category><category>performance</category><category>cloud</category><category>reliability</category></item><item><title>Kubernetes fsGroupChangePolicy and Restart SLOs: A 2026 Reliability Playbook</title><link>https://currentstack.io/stories/kubernetes-fsgroupchangepolicy-restart-slo-playbook-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/kubernetes-fsgroupchangepolicy-restart-slo-playbook-2026/</guid><description>How to reduce pod restart latency and protect rollout SLOs by applying fsGroupChangePolicy intentionally in Kubernetes production clusters.</description><pubDate>Sat, 28 Mar 2026 00:00:00 GMT</pubDate><category>kubernetes</category><category>site-reliability</category><category>platform-engineering</category><category>reliability</category><category>security</category><category>devops</category></item><item><title>Small Model Edge Voice Inference: Production Guide for 2026</title><link>https://currentstack.io/stories/small-model-edge-voice-inference-production-guide-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/small-model-edge-voice-inference-production-guide-2026/</guid><description>A practical architecture for deploying low-latency small voice models at the edge with observability, fallback strategy, and cost discipline.</description><pubDate>Sat, 28 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>edge</category><category>mlops</category><category>performance</category><category>platform-engineering</category><category>reliability</category></item><item><title>GitHub Actions Timezone Support: A Multi-Region Release Management Playbook</title><link>https://currentstack.io/stories/github-actions-timezone-change-management-multi-region-release-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/github-actions-timezone-change-management-multi-region-release-2026/</guid><description>How to redesign release, approvals, and incident ownership now that scheduled workflows can run in local business timezones.</description><pubDate>Tue, 24 Mar 2026 00:00:00 GMT</pubDate><category>devops</category><category>ci/cd</category><category>automation</category><category>enterprise</category><category>reliability</category></item><item><title>Workers Agents SDK v0.8: Idempotent Scheduling and Stateful Agent Operations Playbook</title><link>https://currentstack.io/stories/workers-agents-sdk-idempotent-scheduling-playbook-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/workers-agents-sdk-idempotent-scheduling-playbook-2026/</guid><description>A practical implementation guide for using readable state and idempotent scheduling in Cloudflare Agents SDK to run reliable production agents.</description><pubDate>Tue, 24 Mar 2026 00:00:00 GMT</pubDate><category>agents</category><category>cloud</category><category>edge</category><category>serverless</category><category>reliability</category><category>observability</category></item><item><title>Agentic Tooling in 2026: Channels, Session Events, and the New Reliability Baseline</title><link>https://currentstack.io/stories/agentic-tools-channels-and-session-event-architecture-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/agentic-tools-channels-and-session-event-architecture-2026/</guid><description>A systems design guide for teams adopting channel-based event injection and long-running agent sessions in production developer workflows.</description><pubDate>Fri, 20 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>agents</category><category>tooling</category><category>architecture</category><category>reliability</category></item><item><title>Hardware Price Shocks in 2026: Capacity Planning Patterns for Infra and Data Teams</title><link>https://currentstack.io/stories/hardware-price-shocks-and-infra-capacity-planning-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/hardware-price-shocks-and-infra-capacity-planning-2026/</guid><description>A playbook for handling sudden storage and device price swings without derailing delivery timelines, reliability targets, or budget discipline.</description><pubDate>Thu, 19 Mar 2026 00:00:00 GMT</pubDate><category>cloud</category><category>finops</category><category>platform</category><category>reliability</category><category>data</category></item><item><title>Robotaxi Capital Wave and the New Reliability Bar for Mobility Platforms</title><link>https://currentstack.io/stories/robotaxi-platform-sre-lessons-from-waymo-capex-wave/</link><guid isPermaLink="true">https://currentstack.io/stories/robotaxi-platform-sre-lessons-from-waymo-capex-wave/</guid><description>What engineering leaders can learn from large robotaxi funding rounds: reliability economics, safety SLOs, and city-by-city rollout control.</description><pubDate>Sun, 15 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>platform</category><category>site-reliability</category><category>reliability</category><category>enterprise</category></item><item><title>Stateful API Vulnerability Scanning: How to Connect Detection, Runtime Signals, and Triage</title><link>https://currentstack.io/stories/api-stateful-vulnerability-scanning-runtime-defense-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/api-stateful-vulnerability-scanning-runtime-defense-2026/</guid><description>A rollout model for stateful API scanning programs that avoid alert floods and produce actionable remediation queues.</description><pubDate>Sat, 14 Mar 2026 00:00:00 GMT</pubDate><category>security</category><category>api</category><category>observability</category><category>devops</category><category>reliability</category></item><item><title>Consumer AI and Psychosis Risk: A Safety Operations Framework for Product Teams</title><link>https://currentstack.io/stories/consumer-ai-psychosis-risk-safety-ops-framework-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/consumer-ai-psychosis-risk-safety-ops-framework-2026/</guid><description>Recent legal and media signals around AI-related psychosis demand concrete product safety operations, not just policy statements.</description><pubDate>Sat, 14 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>product</category><category>compliance</category><category>ux</category><category>security</category><category>reliability</category></item><item><title>Cloudflare Account Abuse Protection: A Practical Fraud-Defense Architecture for 2026</title><link>https://currentstack.io/stories/cloudflare-account-abuse-protection-fraud-defense-architecture-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/cloudflare-account-abuse-protection-fraud-defense-architecture-2026/</guid><description>How to combine behavioral signals, identity tiers, and response policies to reduce signup and login abuse without hurting conversion.</description><pubDate>Fri, 13 Mar 2026 00:00:00 GMT</pubDate><category>security</category><category>identity</category><category>reliability</category><category>cloud</category><category>observability</category></item><item><title>GitHub REST API 2026-03-10: A Migration Playbook for Stable Integrations</title><link>https://currentstack.io/stories/github-rest-api-version-2026-03-10-migration-playbook/</link><guid isPermaLink="true">https://currentstack.io/stories/github-rest-api-version-2026-03-10-migration-playbook/</guid><description>How platform teams should adopt the new GitHub REST API version with compatibility testing, endpoint inventorying, and rollout guardrails.</description><pubDate>Fri, 13 Mar 2026 00:00:00 GMT</pubDate><category>api</category><category>devops</category><category>platform-engineering</category><category>automation</category><category>tooling</category><category>reliability</category></item><item><title>Valkey Global Datastore DR Drills: Operating Cross-Region Failover Without Surprises</title><link>https://currentstack.io/stories/valkey-global-datastore-dr-failover-ops-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/valkey-global-datastore-dr-failover-ops-2026/</guid><description>A practical runbook for validating replication lag, failover timing, and application behavior in managed Valkey global setups.</description><pubDate>Fri, 13 Mar 2026 00:00:00 GMT</pubDate><category>cloud</category><category>caching</category><category>site-reliability</category><category>reliability</category><category>observability</category></item><item><title>RFC 9457 Error Contracts as a Cost Control Layer for AI Agents</title><link>https://currentstack.io/stories/rfc9457-error-contracts-agent-cost-optimization-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/rfc9457-error-contracts-agent-cost-optimization-2026/</guid><description>Using structured API errors to cut retry storms, reduce agent token burn, and improve reliability in tool-using AI systems.</description><pubDate>Thu, 12 Mar 2026 00:00:00 GMT</pubDate><category>api</category><category>backend</category><category>agents</category><category>reliability</category><category>performance</category><category>engineering</category></item><item><title>Turn Monthly Secret Scanning Pattern Updates into a Security Operating Model</title><link>https://currentstack.io/stories/secret-scanning-pattern-deltas-operating-model-2026/</link><guid isPermaLink="true">https://currentstack.io/stories/secret-scanning-pattern-deltas-operating-model-2026/</guid><description>How to operationalize monthly pattern updates from GitHub Secret Scanning with triage automation, ownership, and measurable response quality.</description><pubDate>Thu, 12 Mar 2026 00:00:00 GMT</pubDate><category>security</category><category>supply-chain</category><category>compliance</category><category>automation</category><category>devops</category><category>reliability</category></item><item><title>AI-Generated Code Flood: Building a Review Control Plane</title><link>https://currentstack.io/stories/ai-code-review-flood-control/</link><guid isPermaLink="true">https://currentstack.io/stories/ai-code-review-flood-control/</guid><description>How to redesign code review pipelines for the surge of machine-generated pull requests in 2026.</description><pubDate>Tue, 10 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>engineering</category><category>ci/cd</category><category>reliability</category><category>automation</category></item><item><title>Pingora Ingress Request Smuggling: An Operator Response Playbook</title><link>https://currentstack.io/stories/pingora-ingress-smuggling-response-playbook/</link><guid isPermaLink="true">https://currentstack.io/stories/pingora-ingress-smuggling-response-playbook/</guid><description>A practical response plan for teams running Pingora as ingress after newly disclosed request smuggling CVEs.</description><pubDate>Tue, 10 Mar 2026 00:00:00 GMT</pubDate><category>security</category><category>api</category><category>networking</category><category>reliability</category><category>open-source</category></item><item><title>Dynamic Path MTU + QUIC: A Reliability Playbook for Enterprise SASE Clients</title><link>https://currentstack.io/stories/dynamic-path-mtu-quic-enterprise-sase-reliability/</link><guid isPermaLink="true">https://currentstack.io/stories/dynamic-path-mtu-quic-enterprise-sase-reliability/</guid><description>How network and platform teams can reduce silent packet loss and improve remote user experience with adaptive MTU and QUIC-first transport.</description><pubDate>Mon, 09 Mar 2026 00:00:00 GMT</pubDate><category>networking</category><category>cloud</category><category>performance</category><category>reliability</category><category>site-reliability</category></item><item><title>AI Agents in Scrum: An Operating Model That Improves Throughput Without Gaming Metrics</title><link>https://currentstack.io/stories/ai-agent-scrum-operating-model/</link><guid isPermaLink="true">https://currentstack.io/stories/ai-agent-scrum-operating-model/</guid><description>How to integrate coding and documentation agents into sprint execution while preserving accountability, quality, and team learning.</description><pubDate>Sun, 08 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>agents</category><category>engineering</category><category>automation</category><category>reliability</category></item><item><title>Hardware-Aware LLM Selection: Turning Model Choice Into an SRE Discipline</title><link>https://currentstack.io/stories/hardware-aware-llm-selection-ops/</link><guid isPermaLink="true">https://currentstack.io/stories/hardware-aware-llm-selection-ops/</guid><description>Why teams need reproducible model-to-hardware routing policies as local inference and heterogeneous fleets expand.</description><pubDate>Sun, 08 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>mlops</category><category>platform-engineering</category><category>performance</category><category>reliability</category></item><item><title>IP Overlap Is the New Normal: Return Routing Patterns for Modern SASE</title><link>https://currentstack.io/stories/ip-overlap-return-routing-sase-patterns/</link><guid isPermaLink="true">https://currentstack.io/stories/ip-overlap-return-routing-sase-patterns/</guid><description>How to design resilient SASE client routing when enterprises collide on private address space and split-tunnel assumptions break.</description><pubDate>Sun, 08 Mar 2026 00:00:00 GMT</pubDate><category>networking</category><category>zero-trust</category><category>edge</category><category>reliability</category><category>enterprise</category></item></channel></rss>