On-Call Playbook for PCI timeserver hosts

Time servers are "connected systems" and are therefore managed in the PCI infrastructure. Use your .re (regulatory) account for all access.

Ecosystem

Time synchronization is critical for the campus. If time drifts too much between systems, basic authentication can fail.

Our three NTP time servers serve the credit card (“PCI”) systems, so access is strictly controlled. To ssh to these hosts you need a regulatory (.re) account and Duo two-factor authentication (Duo is the only accepted second factor), and even then you can connect only from pre-approved subnets.

Service Windows

Clocks do not drift very quickly, so routine maintenance can be safely performed on any individual time server, and short outages are easily tolerated, even by misconfigured clients.

Post an outage notification if either of the following occurs:

  • All three servers go down simultaneously.
  • A single host is down for more than a day in total.
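
The notification rule above can be sketched as a small function. This is only an illustration, not an existing tool: the function name, the dictionary layout, and the host names in the usage example are all hypothetical, and "more than a day" is assumed to mean strictly more than 24 hours.

```python
from datetime import timedelta

# Threshold taken from the playbook: more than a day of downtime on one host.
MAX_SINGLE_HOST_DOWNTIME = timedelta(days=1)


def should_post_outage(downtime_by_host):
    """Return True if an outage notification is warranted.

    downtime_by_host maps a host name to how long that host has been
    down (timedelta(0) if it is currently up). The names are hypothetical.
    """
    if not downtime_by_host:
        return False
    # Rule 1: every time server is down at the same time.
    all_down = all(d > timedelta(0) for d in downtime_by_host.values())
    # Rule 2: any single host has been down for more than a day.
    long_outage = any(
        d > MAX_SINGLE_HOST_DOWNTIME for d in downtime_by_host.values()
    )
    return all_down or long_outage
```

For example, `should_post_outage({"ts1": timedelta(0), "ts2": timedelta(hours=2), "ts3": timedelta(0)})` describes routine maintenance on one server and does not call for a notification.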

Firewall restrictions

You can access these hosts with ssh from VLAN 1108 and from the OIT VPN. Whether access is permitted from anywhere else is still to be confirmed.

Tests

The time servers get the same basic OS checks as almost all the other roles. On sysnews:

| Check | Check Interval | Warning | Critical |
| ----- | -------------- | ------- | -------- |
| ntp   | 5 minutes      | ??      | ??       |

This check is of little use: it only pokes the ntpd port and does not confirm that the server is actually keeping good time. JAK swears to destroy it someday.
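
As a point of comparison, a check that actually asks the server for the time could look something like the sketch below. It sends a single SNTP-style client request and computes the clock offset with the standard formula ((t2 - t1) + (t3 - t4)) / 2. This is not the existing sysnews check. The function names are hypothetical, the host you pass in is up to you, and the sketch assumes that UDP port 123 is reachable from wherever it runs. The warning and critical thresholds are not defined in this playbook, so the code leaves them out.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_DELTA = 2208988800


def build_request():
    """Build a 48-byte client request: LI=0, version=4, mode=3 (client)."""
    return bytes([0x23]) + bytes(47)


def _ntp_to_unix(seconds, fraction):
    """Convert a 64-bit NTP timestamp (two 32-bit halves) to Unix seconds."""
    return seconds - NTP_UNIX_DELTA + fraction / 2**32


def parse_response(packet, t1, t4):
    """Return (stratum, offset_seconds) from a server reply.

    t1 is the local send time and t4 the local receive time, both as
    Unix timestamps. Stratum 0 or 16 means the server is not synchronized.
    """
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    if packet[0] & 0x07 != 4:  # mode 4 = server
        raise ValueError("not a server reply")
    stratum = packet[1]
    # Receive timestamp is at bytes 32-39, transmit timestamp at 40-47.
    rx_s, rx_f, tx_s, tx_f = struct.unpack("!4I", packet[32:48])
    t2 = _ntp_to_unix(rx_s, rx_f)
    t3 = _ntp_to_unix(tx_s, tx_f)
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return stratum, offset


def check_server(host, timeout=2.0):
    """Query one server over UDP port 123 and return (stratum, offset)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(build_request(), (host, 123))
        packet, _ = sock.recvfrom(512)
        t4 = time.time()
    return parse_response(packet, t1, t4)
```

Calling `check_server(host)` for each of the three servers returns the stratum and the offset of the local clock relative to that server. A timeout or a stratum of 0 or 16 would be a more meaningful alert condition than an open port.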

First Actions

Basic RedHat Troubleshooting.

If the problem is unresolved.

TBD

Posting boilerplate

TBD

Tags: oncall