The 20 worst Azure misdesigns and limitations

Here's a compilation of twenty significant Azure "gotchas" that can cause unexpected problems in production:

  1. Storage Account Name Constraints - Names must be globally unique across Azure, 3-24 characters long, and use lowercase letters and digits only (no hyphens), so automation breaks when a preferred name is already taken.

  2. Azure Function Consumption Plan Cold Starts - As with AWS Lambda, the first request after an idle period pays the startup cost, and the delay can be even more pronounced, particularly with .NET applications.

  3. Resource Group Deletion Cascading - Deleting a resource group deletes ALL contained resources without granular confirmation, making accidental mass-deletions possible.

  4. App Service Deployment Slots Surprises - Settings marked as "slot settings" stay with the slot during a swap while all other settings move with the app, which isn't always intuitive and can lead to configuration mismatches.

  5. ARM Template Resource Dependencies - Implicit dependencies aren't always detected correctly, requiring explicit dependsOn declarations that aren't obvious until deployment failures occur.

  6. Azure AD Guest User Permission Delays - Permissions granted to B2B guest users can take hours to fully propagate, causing access issues.

  7. App Service CORS Implementation - CORS is configured per app, not per App Service plan, so settings on a web app do not carry over to Azure Functions hosted within the same plan.

  8. IP Restriction Evaluation Order - Azure services evaluate IP restrictions before authentication, so authenticated users might still be blocked by IP restrictions.

  9. Azure SQL Connection Pool Exhaustion - Azure SQL caps concurrent sessions and workers by service tier, and client-side connection pools have their own limits; approaching either often shows up first as timeouts and performance degradation rather than a clear error.

  10. Activity Log Retention Period - The platform keeps Activity Log entries for 90 days at most; longer retention requires exporting them (for example to a Log Analytics workspace or a storage account), which often goes unnoticed until compliance audit time.

  11. Azure Front Door Cache Purging Limits - Limited to 100 purges per day per Front Door profile, making granular cache management difficult.

  12. Cosmos DB Request Units (RUs) Calculation - RU consumption varies significantly based on query patterns, with no easy way to predict costs accurately before deployment.

  13. Azure DevOps Build Pipeline Variable Groups - Not automatically linked to release pipelines, requiring manual linkage that's easy to miss.

  14. Logic Apps Connector Throttling - Different connectors have different undocumented throttling limits that can cause intermittent failures.

  15. Virtual Machine Reserved IP vs Static IP Confusion - Two similar concepts with different implementations and limitations, often leading to IP address changes during maintenance events.

  16. Azure Key Vault Secret Size Limitations - Limited to 25 KB per secret, which becomes problematic for certificates with chains or large configuration files.

  17. Event Grid Message Ordering - Event Grid does not guarantee delivery order, so subscribers can receive events out of sequence; this isn't always clear from the documentation and can break sequence-dependent processes.

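  18. API Management Policy Evaluation Order - Policies defined at global, product, API, and operation scope are combined according to where each scope places its <base /> element, so the effective order isn't immediately intuitive and can cause security or transformation issues.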

  19. Azure Functions Output Binding Execution - Output bindings execute even if your function throws an exception, potentially causing data inconsistency.

  20. Azure Kubernetes Service (AKS) Node Pool Upgrades - Upgrades cordon and drain nodes, so workloads without pod disruption budgets can lose all replicas at once and cause unexpected downtime, while overly strict budgets can stall the upgrade.
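
For item 1, one common workaround is to derive storage account names deterministically instead of relying on a preferred name being free. The following is a minimal Python sketch under stated assumptions: the function names and the `scope` parameter are hypothetical, and the check covers only the syntactic rules (3-24 lowercase alphanumeric characters). Global uniqueness still has to be confirmed against Azure itself.

```python
import hashlib
import re

# Naming rules from item 1: 3-24 characters, lowercase letters and digits only.
_VALID_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Check the syntactic rules only; global uniqueness needs an Azure call."""
    return _VALID_NAME.fullmatch(name) is not None


def make_storage_account_name(prefix: str, scope: str, suffix_len: int = 8) -> str:
    """Derive a deterministic candidate name from a readable prefix.

    ASSUMPTION: `scope` is a hypothetical disambiguator (for example a
    subscription ID plus an environment name). Hashing it makes a clash
    with someone else's account unlikely, though not impossible.
    """
    # Drop every character that is not a lowercase letter or digit.
    cleaned = re.sub(r"[^a-z0-9]", "", prefix.lower())
    # A hex digest contains only 0-9 and a-f, so it is always a legal suffix.
    digest = hashlib.sha256(scope.encode("utf-8")).hexdigest()[:suffix_len]
    # Truncate the prefix so that prefix + suffix fits in 24 characters.
    name = cleaned[: 24 - suffix_len] + digest
    if not is_valid_storage_account_name(name):
        raise ValueError(f"cannot derive a valid name from {prefix!r}")
    return name
```

Because the name depends only on its inputs, repeated deployments with the same prefix and scope produce the same candidate, which keeps infrastructure scripts idempotent.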

These issues aren't always prominent in the official Azure documentation and are typically discovered through production experience or community knowledge sharing.
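
For item 17, a subscriber can be made tolerant of reordering by ignoring any event older than the last one it applied for the same subject. The sketch below is illustrative only. It assumes the publisher puts a monotonically increasing `sequence` number and a `value` into the event `data`; Event Grid provides neither field, and the class name is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class LatestWinsHandler:
    """Apply an event only if it is newer than the last one seen for its subject."""

    # Highest sequence number applied so far, keyed by event subject.
    last_seen: Dict[str, int] = field(default_factory=dict)
    # Current value per subject, built from the events that were applied.
    state: Dict[str, Any] = field(default_factory=dict)

    def handle(self, event: Dict[str, Any]) -> bool:
        """Return True if the event was applied, False if stale or duplicate."""
        subject = event["subject"]
        # ASSUMPTION: the publisher supplies data["sequence"] as an integer
        # that only ever increases for a given subject.
        sequence = event["data"]["sequence"]
        if sequence <= self.last_seen.get(subject, -1):
            return False  # arrived late or was delivered twice; skip it
        self.last_seen[subject] = sequence
        self.state[subject] = event["data"]["value"]
        return True
```

This "latest wins" approach suits state-style events where only the newest value matters. Workflows that need every step processed in order would instead have to buffer events until the gaps are filled.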