Datadog

Featured · Paid

Enterprise observability: metrics, logs, traces, APM, and real user monitoring

📊 Monitoring & Observability

About Datadog

Datadog is the dominant observability platform for companies that take production reliability seriously. The platform unifies infrastructure metrics, application performance monitoring (APM), log management, distributed tracing, real user monitoring (RUM), and synthetics in one place with a single search experience. One dashboard can show a spike in error rates, the specific trace causing it, the logs from that request, and the infrastructure metrics from the affected host, all connected. Datadog Watchdog uses ML to detect anomalies automatically and surface them before they become incidents. The Agent installs on any host, container, or cloud function, and Datadog supports 650+ integrations.

Pricing is complex and can escalate: infrastructure monitoring starts at $15/host/month, APM adds $31/host/month, and logs are billed per GB ingested. Enterprise teams regularly spend $50k-500k/year. Alternatives include New Relic (more generous free tier, similar capability), Grafana (open source, self-managed), and Honeycomb (better for microservices debugging). Best for: mid-market to enterprise engineering teams running critical production workloads.
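To make the pricing arithmetic concrete, here is a rough Python sketch of a monthly estimate. It uses only the two per-host figures quoted above. The function name, the example fleet size, and the per-GB log rate are all hypothetical, and the log rate in particular is a placeholder rather than a real Datadog price, since the text gives no per-GB figure.

```python
# Per-host figures quoted in the text above; check current pricing before relying on them.
INFRA_PER_HOST = 15.0  # USD per host per month
APM_PER_HOST = 31.0    # USD per host per month, added on top of infrastructure


def estimate_monthly_cost(hosts, apm_hosts, log_gb, log_rate_per_gb):
    """Return a rough monthly cost breakdown in USD.

    log_rate_per_gb must be supplied by the caller: it is an assumption,
    not a published price.
    """
    infra = hosts * INFRA_PER_HOST
    apm = apm_hosts * APM_PER_HOST
    logs = log_gb * log_rate_per_gb
    return {
        "infrastructure": infra,
        "apm": apm,
        "logs": logs,
        "total": infra + apm + logs,
    }


# Hypothetical fleet: 100 hosts, 60 of them with APM, 2,000 GB of logs per month
# at a placeholder rate of $0.10 per GB.
estimate = estimate_monthly_cost(hosts=100, apm_hosts=60, log_gb=2000, log_rate_per_gb=0.10)
for item, cost in estimate.items():
    print(f"{item}: ${cost:,.2f}")
print(f"annualized: ${estimate['total'] * 12:,.2f}")
```

Under these assumed inputs the per-host charges alone come to $3,360 per month, which shows how a mid-size fleet approaches the annual figures mentioned above before log volume is even counted.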

What's Great

  • ✓ Unified platform: metrics, logs, traces, RUM, and synthetics in one product
  • ✓ Watchdog ML automatically detects anomalies and correlates signals
  • ✓ 650+ integrations covering every major cloud service and technology
  • ✓ Dashboards that connect infrastructure and application metrics seamlessly
  • ✓ Excellent documentation and out-of-the-box dashboards for common stacks

Watch Out For

  • ! Pricing can escalate dramatically: $50k+/year is common for mid-size teams
  • ! Complex pricing model makes forecasting costs difficult
  • ! Can require a dedicated SRE to manage Datadog configuration
  • ! Log retention and volume costs add up quickly for high-traffic applications

Common Use Cases

1. An SRE team gets alerted to a memory leak on a specific microservice via Watchdog before customers report slowness

2. A DevOps team builds a single dashboard correlating deploys with error spikes and response time degradation

3. An engineering team uses distributed tracing to identify which database query is causing p99 latency spikes

4. A company monitors real user metrics (Core Web Vitals) alongside server-side performance in one view
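
The tracing use case above boils down to grouping span durations by query and comparing their 99th percentiles. The Python sketch below illustrates that idea on plain dictionaries. It does not use Datadog's API or span schema: the `query` and `duration_ms` keys, the function names, and the sample data are all hypothetical.

```python
import math
from collections import defaultdict


def p99(values):
    """Nearest-rank 99th percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(99 * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def slowest_queries(spans, top=3):
    """Group span durations by query and return the top queries by p99, slowest first."""
    by_query = defaultdict(list)
    for span in spans:
        by_query[span["query"]].append(span["duration_ms"])
    ranked = sorted(((p99(d), q) for q, d in by_query.items()), reverse=True)
    return [(query, latency) for latency, query in ranked[:top]]


# Hypothetical spans: one query is fast except for a slow tail, the other is steady.
spans = (
    [{"query": "SELECT orders", "duration_ms": 8}] * 95
    + [{"query": "SELECT orders", "duration_ms": 950}] * 5
    + [{"query": "SELECT users", "duration_ms": 12}] * 100
)

for query, latency in slowest_queries(spans):
    print(f"{query}: p99 = {latency} ms")
```

Ranking by p99 rather than the mean matters here: in the sample data the slow query's average looks modest, while its tail is what users actually feel.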

Pricing Model

Paid

Paid subscription required. Check website for current pricing.

Category

Monitoring & Observability

Infrastructure monitoring, APM, and observability platforms for production systems.

Tags

apm, infrastructure monitoring, logs, distributed tracing, observability
