Monitoring Nodes (Geographic Locations)

A service might be down in Europe but accessible from North America, or slow for Australian users but fast for US visitors. Single-location monitoring misses these regional problems. Xitoring's distributed monitoring nodes check your services from 15 locations worldwide, giving you visibility into global availability and performance.

Why Multiple Nodes Matter

Single-node monitoring: Blind to regional outages (service fails in Asia, you don't know)
Multi-node monitoring: Global visibility (detect which regions affected)
Result: Understand user experience worldwide, not just from one location.

What Are Monitoring Nodes?

Distributed Probes

A monitoring node (also called a probe or monitoring location) is a server in a specific geographic location that runs uptime checks from that location.

How it works:

You create an uptime check (e.g., "Monitor https://example.com")
You select which nodes to monitor from (e.g., "US East, Europe, Australia")
Each selected node runs the check independently every interval
Each node reports results separately (response time, status, errors)
You see: Service is UP from US (200ms), UP from Europe (350ms), DOWN from Australia

Geographic diversity = comprehensive monitoring - Catch problems users experience in specific regions.

Available Monitoring Nodes

Xitoring operates monitoring nodes in 15 locations across 5 continents:

Country	City	Code Name	IPv4	IPv6
US	Los Angeles	LAX01	23.19.228.105	2607:f5b4:88:108:1c00:34ff:fe00:847
US	Jacksonville	JAX01	198.203.28.254	2605:8680:ffff:5::2
NL	Amsterdam	AMS01	85.17.162.72	2001:1af8:5301:132:1c00:73ff:fe00:1f4a
US	Virginia	VIR01	20.47.109.69	2603:1030:20e:3::35f
UK	London	LON01	173.234.79.120	2a0d:3003:b700:105:1c00:7dff:fe00:5d1
US	Chicago	CHI01	40.116.73.227	2603:1030:603::43d
CA	Montreal	MTL01	184.107.178.219	2607:f748:c2:106:1c00:e1ff:fe00:3a8
AU	Canberra	CBR01	4.197.41.211	2603:1010:2:2::2e
AU	Sydney	SYD01	173.234.109.13	2401:d040:1002:104:1c00:1ff:fe00:e4
AE	Dubai	DBX01	20.203.52.136	2603:1040:900:2::ab
DE	Frankfurt	FRA01	178.162.219.152	2a00:c98:400f:107:1c00:90ff:fe00:738
JP	Tokyo	JPN01	142.91.108.150	2401:d560:0:107:1c00:d5ff:fe00:46e
BR	São Paulo	BRA01	4.201.91.236	2603:1050:1:3::9a
US	Des Moines	DSM01	20.9.19.110	2603:1030:7:6::174
US	San Antonio	SAT01	4.151.79.14	2a01:111:f100:4002::9d37:c3cc

Why Multi-Node Monitoring Matters

Detect Regional Outages

Scenario: CDN misconfiguration routes European traffic to wrong backend

Without multi-node monitoring:

Only monitoring from US East
Check shows: ✅ Service UP, 150ms
You are not aware that European users are seeing errors

With multi-node monitoring (US + Europe):

Check from US East: ✅ Service UP, 150ms
Check from Amsterdam: ❌ Service DOWN, Connection refused
Check from London: ❌ Service DOWN, Connection refused
Result: 🚨 Alert "Service down from 2/3 nodes (Europe affected)"
Reality: You immediately know Europe has problems, investigate CDN

Global services need global monitoring - If your users are worldwide, your monitoring should be too.

Identify Performance Issues by Region

Scenario: Database hosted in US East, website users globally

Response times by node:

Node	Response Time	User Impact
US East (Virginia)	50ms	✅ Excellent
US West (Los Angeles)	120ms	✅ Good
Europe (Amsterdam)	450ms	⚠️ Slow
Australia (Sydney)	850ms	❌ Very slow
Asia (Tokyo)	920ms	❌ Very slow

Insight: US users are well served, but users in Asia and Australia experience poor performance.

Action: Deploy CDN or additional server region closer to Asia/Australia.

Without multi-node monitoring: You wouldn't know that non-US users are suffering.

Avoid False Positives from Network Issues

Scenario: Monitoring node's ISP has routing problem

With single location monitoring:

Node in Chicago experiences ISP outage
Can't reach your service
Creates incident: "Service Down"
False positive - Service is fine, monitoring node had problem

With multi-node monitoring (3+ nodes):

Node in Chicago: ❌ Can't reach service (ISP problem)
Node in Virginia: ✅ Service UP
Node in Amsterdam: ✅ Service UP
Smart logic: 2/3 nodes UP → Service is UP
No false positive incident created

Redundant monitoring = reliability - Network issues at monitoring locations don't cause false alerts.

Choosing Monitoring Nodes

Selection Strategies

1. Match Your User Base

Monitor from locations where your users are:

Your Users	Recommended Nodes
US-only	US East + US West + US Central (3 nodes for redundancy)
US + Canada	US East + US West + Canada Montreal
Europe-focused	Amsterdam + London + Frankfurt
Global service	Default (All) - monitor from all 15 locations
APAC-focused	Tokyo + Sydney + US West (closest US node)
Latin America	Brazil São Paulo + US East + US Central

Rule: If you have users in a region, monitor from that region.

2. Redundancy Within Region

Don't rely on a single node per region:

Example - US monitoring:

❌ Bad: Only US East (Virginia)
- If Virginia node has issues, false positive
✅ Good: US East (Virginia) + US West (Los Angeles) + US Central (Chicago)
- Node outage won't trigger false alert
- Detect coast-specific issues

Best practice: Minimum 3 nodes for production services (enables 2/3 majority voting).

3. Critical Services: Monitor from Everywhere

For revenue-critical or SLA-backed services:

Select Default (All) node group
Monitors from all 15 locations
Maximum visibility into global status
Worth the additional check usage

Example - E-commerce checkout:

US + Europe + Australia + Asia + South America
Any regional outage immediately detected
Can issue customer communications specific to affected geography

4. Testing/Internal Services: Single Region

For internal tools or dev environments:

Monitor from closest node only
Example: Internal admin panel → US East (where your office is)
Saves check quota and is appropriate for non-customer-facing services

Node Selection Groups

Pre-Configured Regional Groups

When creating an uptime check, you can select:

Default (All) - All 15 nodes worldwide
North America - All US + Canada nodes (7 total)
US West - Los Angeles
US Central - Chicago, Des Moines, San Antonio
US East - Virginia, Jacksonville
Canada - Montreal
Europe - Amsterdam, London, Frankfurt (3 total)
Australia - Sydney, Canberra (2 total)
Middle East - Dubai
Brazil - São Paulo
Japan - Tokyo

Custom Selection - Pick specific individual nodes
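
The groups above can be pictured as a simple mapping from group name to node code names. The following is a minimal sketch for illustration only: `NODE_GROUPS` and `expand_groups` are hypothetical names, not part of any Xitoring API, and the code names are taken from the node table earlier in this article.

```python
# Hypothetical mapping of the regional groups listed above to node code names.
# Illustration only; this is not a Xitoring API.
NODE_GROUPS = {
    "US West": ["LAX01"],
    "US Central": ["CHI01", "DSM01", "SAT01"],
    "US East": ["VIR01", "JAX01"],
    "Canada": ["MTL01"],
    "Europe": ["AMS01", "LON01", "FRA01"],
    "Australia": ["SYD01", "CBR01"],
    "Middle East": ["DBX01"],
    "Brazil": ["BRA01"],
    "Japan": ["JPN01"],
}

# Composite groups are built from the regional ones.
NODE_GROUPS["North America"] = (
    NODE_GROUPS["US West"]
    + NODE_GROUPS["US Central"]
    + NODE_GROUPS["US East"]
    + NODE_GROUPS["Canada"]
)
NODE_GROUPS["Default (All)"] = sorted(
    {code for codes in NODE_GROUPS.values() for code in codes}
)


def expand_groups(selected):
    """Return the de-duplicated, sorted node codes for the selected groups."""
    nodes = set()
    for name in selected:
        nodes.update(NODE_GROUPS[name])
    return sorted(nodes)


print(len(expand_groups(["Default (All)"])))
print(expand_groups(["Europe", "US East"]))
```

Selecting overlapping groups (for example "North America" plus "US East") yields each node only once, which matters because every selected node counts toward check usage.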

Recommendation by Service Type

Service Type	Recommended Group	Reasoning
Global SaaS application	Default (All)	Users worldwide, monitor from everywhere
US-based startup	North America	Most users in US/Canada, comprehensive US coverage
European business	Europe + US East	EU users primary, US fallback
API for mobile app	Default (All)	Don't know where app users are
Internal company VPN	Single node (closest)	Only accessed from office location
Status page	Default (All)	Users check from anywhere when service down
CDN-backed website	Default (All)	Need to verify CDN serving all regions correctly

How Multi-Node Checks Work

Check Execution

When check interval triggers (e.g., every 5 minutes):

All selected nodes run check simultaneously
Each node records result independently:
- Response time
- Status code (HTTP checks)
- Resolution result (DNS checks)
- Connection success/failure
Results reported to Xitoring dashboard
Dashboard aggregates: "5/5 nodes UP" or "3/5 nodes UP, 2 DOWN"

Independent execution - Nodes don't coordinate; each acts as independent validator.
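
The execution flow above can be sketched in a few lines. This is an illustration under stated assumptions, not Xitoring's implementation: real nodes are separate servers, whereas here threads stand in for them, and `fake_probe`, `run_check`, and `summarize` are hypothetical helpers with simulated results.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeResult:
    node: str
    up: bool
    response_ms: Optional[float]  # None when the check failed


def run_check(nodes, probe):
    """Run probe(node) for every node in parallel; results keep input order."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        return list(pool.map(probe, nodes))


def summarize(results):
    """Aggregate independent results into a dashboard-style summary line."""
    up = sum(1 for r in results if r.up)
    total = len(results)
    if up == total:
        return f"{up}/{total} nodes UP"
    return f"{up}/{total} nodes UP, {total - up} DOWN"


def fake_probe(node):
    """Simulated probe: three nodes succeed, every other node fails."""
    simulated_ms = {"VIR01": 45.0, "LAX01": 120.0, "AMS01": 380.0}
    if node in simulated_ms:
        return NodeResult(node, True, simulated_ms[node])
    return NodeResult(node, False, None)


results = run_check(["VIR01", "LAX01", "AMS01", "SYD01", "JPN01"], fake_probe)
print(summarize(results))
```

Each probe call knows nothing about the others, mirroring the point that nodes do not coordinate; aggregation happens only after all results are in.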

Incident Creation Logic

Majority voting for reliability:

Scenario	Incident Created?	Why
5/5 nodes DOWN	✅ YES	Service clearly down
4/5 nodes DOWN	✅ YES	Majority indicates outage
3/5 nodes DOWN	✅ YES	Majority down (possible regional outage)
2/5 nodes DOWN	❌ NO	Majority UP (likely node network issue)
1/5 nodes DOWN	❌ NO	Service UP, single node problem
0/5 nodes DOWN	❌ NO	Service fully operational

Configurable threshold: You can adjust how many nodes must report DOWN before an alert is raised.

Example configurations:

Conservative: Alert if 3+ nodes down (fewer false positives)
Aggressive: Alert if ANY node down (catch regional issues faster)
Custom: Alert if 2 specific nodes down (e.g., US East + US West = US completely down)
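
The voting table and the count-based thresholds can be expressed as one small function. This is a minimal sketch, assuming a strict majority by default; `should_create_incident` is a hypothetical name, and the "specific nodes" custom rule is not covered here.

```python
def should_create_incident(down, total, threshold=None):
    """Decide whether to open an incident.

    With no threshold, use strict majority voting: more than half of the
    nodes must report DOWN. Otherwise alert once `down` reaches `threshold`.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if threshold is None:
        return down > total / 2
    return down >= threshold


# Reproduce the majority-voting table for a 5-node check.
for down in range(5, -1, -1):
    print(f"{down}/5 nodes DOWN -> incident: {should_create_incident(down, 5)}")
```

In this sketch the "Conservative" configuration corresponds to `threshold=3` and the "Aggressive" one to `threshold=1`.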

Response Time Aggregation

Dashboard shows per-node response times:

Example HTTP check:

Recent Check Results:
✅ US Virginia (VIR01): 45ms
✅ US Los Angeles (LAX01): 120ms  
✅ Amsterdam (AMS01): 380ms
✅ Sydney (SYD01): 720ms
✅ Tokyo (JPN01): 650ms

Average: 383ms
Fastest: 45ms (US Virginia)
Slowest: 720ms (Sydney)

Insight: Service fast for US users, slow for APAC users. Consider APAC server deployment.
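
The summary figures in the example are plain arithmetic over the per-node values. A short sketch that reproduces them (the `aggregate` helper is hypothetical; the numbers are the ones from the example above):

```python
from statistics import mean

# Per-node response times from the example above, in milliseconds.
results_ms = {
    "US Virginia (VIR01)": 45,
    "US Los Angeles (LAX01)": 120,
    "Amsterdam (AMS01)": 380,
    "Sydney (SYD01)": 720,
    "Tokyo (JPN01)": 650,
}


def aggregate(results):
    """Return the average plus the fastest and slowest node with their times."""
    fastest = min(results, key=results.get)
    slowest = max(results, key=results.get)
    return {
        "average_ms": round(mean(results.values())),
        "fastest": (fastest, results[fastest]),
        "slowest": (slowest, results[slowest]),
    }


summary = aggregate(results_ms)
print(f"Average: {summary['average_ms']}ms")
print(f"Fastest: {summary['fastest'][1]}ms ({summary['fastest'][0]})")
print(f"Slowest: {summary['slowest'][1]}ms ({summary['slowest'][0]})")
```

Note that the average hides the spread: 383ms looks acceptable even though two nodes are well above 600ms, which is why the per-node view matters.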

Viewing Node-Specific Results

Per-Node Dashboard

To see results from individual nodes:

Dashboard → Uptime → [Your Check] → Results
View tabs:
- All Nodes - Aggregated view (5/5 UP)
- By Node - Individual node results
Select specific node: "Show only Amsterdam (AMS01)"
See historical data from that node specifically

Use for:

Investigating regional outages ("Which nodes saw the problem?")
Performance analysis by geography
Troubleshooting false positives ("Which node reported down?")

Historical Analysis

Check past performance by node:

Uptime → [Check] → Graphs
Filter: "Response Time by Node"
Multi-line graph shows each node's response time over time
Identify patterns:
- Did Sydney response time suddenly increase? (ISP peering issue)
- Did Amsterdam report intermittent failures? (CDN problem in Europe)

IP Whitelisting for Monitoring Nodes

Why You Might Need IPs

If your service uses IP-based access control:

Firewalls restricting access by IP
Cloud security groups (AWS Security Groups, Azure NSGs)
CDN WAF rules
API rate limiting by IP

You must whitelist monitoring node IPs so they can reach your service.

How to Whitelist

Option 1 - Whitelist All Node IPs: Add all 15 node IPs to your firewall/security group:

IPv4:

23.19.228.105
198.203.28.254
85.17.162.72
20.47.109.69
173.234.79.120
40.116.73.227
184.107.178.219
4.197.41.211
173.234.109.13
20.203.52.136
178.162.219.152
142.91.108.150
4.201.91.236
20.9.19.110
4.151.79.14

IPv6 (if applicable):

2607:f5b4:88:108:1c00:34ff:fe00:847
2605:8680:ffff:5::2
2001:1af8:5301:132:1c00:73ff:fe00:1f4a
2603:1030:20e:3::35f
2a0d:3003:b700:105:1c00:7dff:fe00:5d1
2603:1030:603::43d
2607:f748:c2:106:1c00:e1ff:fe00:3a8
2603:1010:2:2::2e
2401:d040:1002:104:1c00:1ff:fe00:e4
2603:1040:900:2::ab
2a00:c98:400f:107:1c00:90ff:fe00:738
2401:d560:0:107:1c00:d5ff:fe00:46e
2603:1050:1:3::9a
2603:1030:7:6::174
2a01:111:f100:4002::9d37:c3cc
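
If access control is enforced in application code rather than in a firewall, the same lists can back a simple membership check. The sketch below is one possible approach, not an official integration: `is_monitoring_node` is a hypothetical helper, and the addresses are copied from the lists above. Python's `ipaddress` module normalizes textual forms, so differently written IPv6 addresses still compare equal.

```python
import ipaddress

# Node addresses copied from the IPv4 and IPv6 lists above.
NODE_IPS = """
23.19.228.105 198.203.28.254 85.17.162.72 20.47.109.69 173.234.79.120
40.116.73.227 184.107.178.219 4.197.41.211 173.234.109.13 20.203.52.136
178.162.219.152 142.91.108.150 4.201.91.236 20.9.19.110 4.151.79.14
2607:f5b4:88:108:1c00:34ff:fe00:847
2605:8680:ffff:5::2
2001:1af8:5301:132:1c00:73ff:fe00:1f4a
2603:1030:20e:3::35f
2a0d:3003:b700:105:1c00:7dff:fe00:5d1
2603:1030:603::43d
2607:f748:c2:106:1c00:e1ff:fe00:3a8
2603:1010:2:2::2e
2401:d040:1002:104:1c00:1ff:fe00:e4
2603:1040:900:2::ab
2a00:c98:400f:107:1c00:90ff:fe00:738
2401:d560:0:107:1c00:d5ff:fe00:46e
2603:1050:1:3::9a
2603:1030:7:6::174
2a01:111:f100:4002::9d37:c3cc
""".split()

# Parse once so that lookups compare address values, not raw strings.
ALLOWLIST = {ipaddress.ip_address(ip) for ip in NODE_IPS}


def is_monitoring_node(client_ip):
    """Return True if client_ip parses to one of the monitoring node addresses."""
    try:
        return ipaddress.ip_address(client_ip) in ALLOWLIST
    except ValueError:
        return False  # not a valid IP address at all


print(is_monitoring_node("85.17.162.72"))               # Amsterdam node
print(is_monitoring_node("2605:8680:FFFF:5:0:0:0:2"))   # Jacksonville, expanded form
print(is_monitoring_node("203.0.113.10"))               # documentation-range address
```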

Option 2 - Whitelist Only Selected Nodes: If you only monitor from specific nodes (e.g., US East + Amsterdam), whitelist only those IPs.

Option 3 - Use Monitoring-Specific Endpoint: Create dedicated health check endpoint accessible without IP restrictions:

Main site: IP-restricted
/health endpoint: Publicly accessible for monitoring
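
A dedicated health endpoint can be very small. The following is a minimal sketch using only Python's standard library, assuming a JSON body of `{"status": "ok"}`; the handler, the response shape, and the port are illustrative choices, and a real endpoint would usually also verify dependencies such as the database.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def health_response(path):
    """Return (status_code, body_bytes) for a request path."""
    if path == "/health":
        return 200, json.dumps({"status": "ok"}).encode()
    return 404, json.dumps({"error": "not found"}).encode()


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = health_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def make_server(host="127.0.0.1", port=8080):
    """Build the server; call .serve_forever() on the result to start it."""
    return ThreadingHTTPServer((host, port), HealthHandler)
```

To try it locally, run `make_server().serve_forever()` and request `http://127.0.0.1:8080/health`. Keeping the endpoint free of sensitive data is what makes it reasonable to expose without IP restrictions.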

Security note: All monitoring node IPs are static and won't change without advance notice.

Testing Connectivity

Verify monitoring nodes can reach your service:

Create test uptime check
Select all nodes you plan to use
Run check immediately (don't wait for interval)
View results:
- ✅ Nodes showing UP: Connectivity works
- ❌ Nodes showing connection errors: IP not whitelisted

Best Practices

For Global Services

Start with "Default (All)" - Monitor from all 15 nodes
Review response times by region after 24 hours
Identify slowest regions (candidates for CDN or additional servers)
Set region-specific triggers:
- Alert if ANY node > 3000ms (very slow anywhere)
- Alert if Average across all nodes > 1000ms (global slowdown)
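
The two example triggers above amount to a per-node limit and a limit on the mean. A minimal sketch, assuming response times are collected as a node-to-milliseconds mapping; `evaluate_triggers` is a hypothetical helper and the sample values are made up:

```python
def evaluate_triggers(response_ms, any_node_limit=3000, average_limit=1000):
    """Return alert messages for the two example triggers above."""
    if not response_ms:
        return []
    alerts = []
    # Trigger 1: any single node slower than the per-node limit.
    slow = sorted(n for n, ms in response_ms.items() if ms > any_node_limit)
    if slow:
        alerts.append(f"slow nodes: {', '.join(slow)}")
    # Trigger 2: average across all nodes above the global limit.
    average = sum(response_ms.values()) / len(response_ms)
    if average > average_limit:
        alerts.append(f"global average {average:.0f}ms")
    return alerts


print(evaluate_triggers({"VIR01": 50, "AMS01": 450, "SYD01": 3200}))
print(evaluate_triggers({"VIR01": 50, "AMS01": 450}))
```

With only three nodes, one very slow node is enough to push the average over the global limit as well; with all 15 nodes selected, the average trigger is much less sensitive to a single outlier.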

For Regional Services

Select nodes matching user geography (US service → US nodes only)
Use 3+ nodes within region for reliability (US East + West + Central)
Add one external node as a "canary" (for example, a Europe node for a US service; it should always be somewhat slower but still reachable)
Alert only if majority of regional nodes down

For SLA Monitoring

Use all nodes in contract regions (SLA covers "US and Europe" → use all US + Europe nodes)
Set incident triggers based on SLA terms:
- SLA: "99.9% uptime in each region" → Alert if ANY node in a region down
- SLA: "99.9% global uptime" → Alert only if majority of all nodes down
Generate monthly reports filtered by node for SLA compliance proof
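
The two SLA wordings above lead to different alert rules. The sketch below contrasts them, assuming node status is grouped by region in a nested mapping; `sla_alerts` and the sample data are hypothetical.

```python
def sla_alerts(status_by_region, per_region=True):
    """status_by_region maps region -> {node: is_up}.

    per_region=True:  alert for every region where ANY node is down.
    per_region=False: alert only if a majority of ALL nodes is down.
    """
    if per_region:
        return sorted(
            region
            for region, nodes in status_by_region.items()
            if not all(nodes.values())
        )
    all_nodes = [up for nodes in status_by_region.values() for up in nodes.values()]
    down = all_nodes.count(False)
    return ["global"] if down > len(all_nodes) / 2 else []


status = {
    "US": {"VIR01": True, "LAX01": True, "CHI01": True},
    "Europe": {"AMS01": False, "LON01": True, "FRA01": True},
}
print(sla_alerts(status, per_region=True))
print(sla_alerts(status, per_region=False))
```

The same data produces an alert under the per-region rule and none under the global rule, which is why the trigger should follow the contract wording.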

For Cost Optimization

Each selected node counts toward your check usage:

1 check from 5 nodes = 5 check executions per interval
1 check from 1 node = 1 check execution per interval
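
To estimate usage over time, multiply the per-interval count by the number of intervals. This is a back-of-the-envelope sketch: the 30-day month and the `executions_per_month` helper are assumptions for illustration, and how usage is actually counted against a plan is not specified here.

```python
def executions_per_month(nodes, interval_minutes, days=30):
    """Estimate check executions: nodes x intervals per day x days."""
    intervals_per_day = (24 * 60) // interval_minutes
    return nodes * intervals_per_day * days


# One check at a 5-minute interval from 1, 5, and all 15 nodes.
for nodes in (1, 5, 15):
    print(nodes, "node(s):", executions_per_month(nodes, 5))
```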

Optimize:

Production services: Monitor from all relevant user regions (don't skimp)
Dev/test services: Single closest node (save quota)
Internal tools: Single node (no need for geographic diversity)

Troubleshooting

Check Failing from One Node Only

Symptoms: 4/5 nodes UP, 1 node consistently DOWN

Possible causes:

IP not whitelisted - Service blocking that specific node's IP
Node network issue - Rare, but monitoring node's ISP may have routing problem
Geo-blocking - Service intentionally blocks that country/region
CDN misconfiguration - CDN serving errors to specific geographic locations

Solutions:

Verify IP whitelist - Ensure the node's IP is in your firewall rules
Check service logs - Are requests from that node's IP arriving? What response was sent?
Test manually - Use VPN/proxy in that node's region to replicate issue
Disable problematic node - Remove from check if consistently false positive
Contact support - Report suspected monitoring node network issue
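
For the "check service logs" step, a quick tally of requests per node and status code often settles the question. The sketch below assumes an access log in Common Log Format (client IP first, status code as the ninth whitespace-separated field); the sample log lines are made up for illustration, and `count_node_requests` is a hypothetical helper.

```python
from collections import Counter

# A few node IPs from the table in this article, mapped to their code names.
NODE_IPS = {
    "40.116.73.227": "CHI01",
    "20.47.109.69": "VIR01",
    "85.17.162.72": "AMS01",
}


def count_node_requests(log_lines, node_ips=NODE_IPS):
    """Count requests per (node, HTTP status) in Common Log Format lines."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) < 9:
            continue  # not a complete access-log line
        ip, status = parts[0], parts[8]
        if ip in node_ips:
            counts[(node_ips[ip], status)] += 1
    return counts


# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    '20.47.109.69 - - [01/Jan/2025:10:00:00 +0000] "GET /health HTTP/1.1" 200 15',
    '40.116.73.227 - - [01/Jan/2025:10:00:01 +0000] "GET /health HTTP/1.1" 403 0',
    '203.0.113.10 - - [01/Jan/2025:10:00:02 +0000] "GET / HTTP/1.1" 200 1024',
]

counts = count_node_requests(SAMPLE_LOG)
for (node, status), n in sorted(counts.items()):
    print(node, status, n)
```

A node that appears with 403 responses points to an application-level block; a node that never appears in the log at all suggests the traffic is dropped earlier, for example by a firewall or a routing problem.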

Response Times Inconsistent Across Nodes

Symptoms: US nodes show 50ms, European nodes show 400ms

This is often normal:

Geographic distance - Physics: longer routes and more network hops mean higher latency
Server location - Service hosted in US, European requests cross Atlantic

When it's a problem:

European nodes suddenly spiked from 300ms → 4000ms
One European node fast, another slow (should be similar)

Solutions:

Deploy CDN or servers in slow regions
Investigate peering routes (is traffic taking inefficient path?)
Check logs for errors that delay responses in specific regions
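
To separate "normally slow because far away" from "suddenly slow", compare each node against its own history rather than against other nodes. A minimal sketch, assuming a list of recent samples per node and a spike defined as more than three times the median of earlier samples; the factor, the helper name `find_spikes`, and the sample data are all illustrative.

```python
from statistics import median


def find_spikes(history_ms, factor=3.0):
    """Flag nodes whose latest sample exceeds factor x their own baseline.

    The baseline is the median of all earlier samples for that node.
    Returns {node: (baseline_ms, latest_ms)} for flagged nodes only.
    """
    spikes = {}
    for node, samples in history_ms.items():
        if len(samples) < 2:
            continue  # not enough history to form a baseline
        baseline = median(samples[:-1])
        latest = samples[-1]
        if baseline > 0 and latest > factor * baseline:
            spikes[node] = (baseline, latest)
    return spikes


history = {
    "AMS01": [300, 310, 295, 4000],  # sudden jump: worth investigating
    "VIR01": [50, 48, 52, 55],       # steady
    "SYD01": [700, 720, 710, 730],   # slow but consistent: expected for distance
}
print(find_spikes(history))
```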

False Positives from Multiple Nodes

Symptoms: Service is UP (you can access it), but monitoring shows 3/5 nodes DOWN

Possible causes:

IP blocking - Firewall blocking monitoring IPs
Rate limiting - Service rate-limiting monitoring node IPs
DDoS protection - WAF flagging monitoring traffic as attack
Maintenance - Service actually was down, monitoring correct

Solutions:

Whitelist all node IPs in firewall/WAF/CDN
Exempt monitoring from rate limits (dedicated /health endpoint)
Review incident timeline - Does it match known issues?
Check service logs - Confirm requests arriving and responses correct

Common Scenarios

Scenario 1: CDN Regional Outage

Service: Website behind Cloudflare CDN

Incident:

10:00 AM - Alert: "Check DOWN from 2 nodes (Sydney, Tokyo)"
Dashboard shows:
- ✅ All US nodes: UP, 50-150ms
- ✅ All Europe nodes: UP, 200-350ms
- ❌ Sydney: DOWN, Connection timeout
- ❌ Tokyo: DOWN, Connection timeout
- ✅ Dubai: UP, 450ms

Diagnosis: APAC-specific outage, likely CDN issue in Asia-Pacific region
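
This kind of diagnosis can be automated by grouping node status by region. A minimal sketch, assuming a hand-written node-to-region mapping (grouping Sydney and Tokyo as "APAC" follows the wording of this scenario, not an official grouping); `regions_fully_down` is a hypothetical helper.

```python
# Hypothetical region labels for the nodes in this scenario.
REGION_OF = {
    "VIR01": "US",
    "LAX01": "US",
    "AMS01": "Europe",
    "LON01": "Europe",
    "SYD01": "APAC",
    "JPN01": "APAC",
    "DBX01": "Middle East",
}


def regions_fully_down(status):
    """Return the regions in which every monitored node reports DOWN."""
    by_region = {}
    for node, up in status.items():
        by_region.setdefault(REGION_OF[node], []).append(up)
    return sorted(region for region, ups in by_region.items() if not any(ups))


status = {
    "VIR01": True, "LAX01": True,
    "AMS01": True, "LON01": True,
    "SYD01": False, "JPN01": False,
    "DBX01": True,
}
print(regions_fully_down(status))
```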

Action:

Check Cloudflare status page (confirms Asia-Pacific degradation)
Post status update: "Users in Asia/Australia may experience issues due to CDN provider incident"
Monitor until Cloudflare resolves
Without multi-node monitoring: You wouldn't know that APAC users were affected

Scenario 2: Server Migration

Service: API migrating from US East to multi-region (US + Europe)

Before migration:

Monitored from: US East only
Response time: 45ms

After deploying Europe server:

Add Amsterdam + London nodes to monitoring
Results:
- US nodes: 50ms (similar to before)
- Europe nodes: 40ms (huge improvement from previous ~300ms)
Confirm migration successful, European users now served locally

Without multi-node monitoring: Couldn't verify European performance improvement.

Scenario 3: Firewall Misconfiguration

Service: Internal API with IP allowlist

Incident:

Add new uptime check, select 5 nodes
Results: 0/5 nodes can reach service (all show connection refused)
But service works when you test manually

Diagnosis: Monitoring node IPs not in firewall allowlist

Resolution:

Add all 15 node IPs to firewall rules
Re-run check
All nodes now reporting UP

Lesson: Always whitelist monitoring IPs when using IP-based access control.

Node Infrastructure & Reliability

Node Uptime

Monitoring nodes are maintained at 99.95%+ uptime:

Automated health checks of nodes themselves
Failover to backup probes if node degraded
Regular maintenance (scheduled, announced in advance)

If a node goes down:

Other nodes continue monitoring (redundancy)
Node marked "Maintenance" in dashboard
Checks temporarily redistributed to healthy nodes
No customer action needed

Network Quality

All nodes are connected to tier-1 ISPs:

Low-latency routing
High bandwidth (no bottlenecks from monitoring traffic)
IPv4 + IPv6 support
Sub-50ms latency to major internet exchanges

The nodes themselves are monitored:

Xitoring monitors its own monitoring infrastructure
Real-time detection of node issues
Proactive resolution before customer impact