Welcome to The Long View—where we peruse the news of the week and strip it to the essentials. Let’s work out what really matters.
This week: Why Apple services were down, Linux gets a huge RNG overhaul, and we wonder if Okta was hacked again.
1. Rotten Apple Ops?
First up this week: Most of Apple’s services were down yesterday (at least in some regions). For at least two hours, very little was working, including the dev site and some internal apps.
Analysis: “It’s always DNS”
Despite rumors, it wasn’t a Russian hack. Nor was it a BGP attack. Although Cupertino hasn’t ‘fessed up, it looks very much like it was DNS (quelle surprise).
Joe Rossignol broke the story: iCloud and Many Other Apple Services Are Down
Affected services and apps include the App Store, iCloud, Siri, iMessage, iTunes Store, Apple Maps, Apple Music, Apple Podcasts, Apple Arcade, Apple Fitness+, Apple TV+, Find My, FaceTime, Notes, Stocks, and many others.
Apple’s developer website is also inaccessible. … Some of Apple’s internal systems are also down.
What went wrong? Animations adds:
Apple’s own DNS servers are redirecting developer.apple.com to something on “akadns.net”, which is operated by Akamai. But Apple’s own DNS servers refuse to resolve that, probably because it’s not in the apple.com zone.
It’s clearly a botched DNS configuration. Not clear what the intent was. … Anyway, this looks like an attempt to outsource something to Akamai that went badly wrong.
ofc, “It’s always DNS,” amirite? Sadiq Saif says:
As far as I can tell the issue was caused by a DNSSEC validation failure on aaplimg.com. I noticed a lot of DNSSEC BOGUS messages in my local pihole’s log for names like proxy.safebrowsing.apple which CNAME to aaplimg.com.
And here’s more dig’ing from lima:
Looks like their DNS servers are responsive, but refuse to serve records. … Most likely a configuration mistake that’ll be undone as soon as they figure out how to re-deploy their DNS servers while DNS is down.
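To make that failure mode concrete, here is a toy Python model of the behavior Animations describes: a server that answers only for its own zone hands out a CNAME pointing outside that zone, then refuses to chase it. This is a sketch, not a diagnosis. The record values, the akadns.net host label and the resolve() helper are all hypothetical, and Apple has not confirmed what actually happened.

```python
# Toy model only: these records are hypothetical, not Apple's real zone data.
ZONE = "apple.com"

RECORDS = {
    # (record type, value) keyed by lowercase name
    "developer.apple.com": ("CNAME", "developer.example.akadns.net"),
    "www.apple.com": ("A", "192.0.2.10"),  # 192.0.2.0/24 is reserved for docs
}


def in_zone(name: str, zone: str) -> bool:
    """True if name is the zone apex or a subdomain of it."""
    name = name.rstrip(".").lower()
    zone = zone.rstrip(".").lower()
    return name == zone or name.endswith("." + zone)


def resolve(name: str, max_hops: int = 8) -> str:
    """Follow CNAMEs, but refuse any name that falls outside ZONE."""
    for _ in range(max_hops):
        name = name.rstrip(".").lower()
        if not in_zone(name, ZONE):
            return f"REFUSED ({name} is outside {ZONE})"
        rtype, value = RECORDS.get(name, ("NXDOMAIN", ""))
        if rtype == "A":
            return value
        if rtype == "NXDOMAIN":
            return "NXDOMAIN"
        name = value  # CNAME: try again with the target name
    return "SERVFAIL (CNAME chain too long)"


if __name__ == "__main__":
    print(resolve("www.apple.com"))
    print(resolve("developer.apple.com"))
```

In this model, www.apple.com resolves normally, while developer.apple.com ends in a refusal because its CNAME target sits in a different zone. That is the shape of the breakage the commenters report seeing.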
2. Linux Random Number Generator Refactored
Proper, unguessable random numbers are key to DevOps intrinsics such as strong encryption and secure TCP. Linux has been falling behind other operating systems in areas such as entropy collection and VM security, and its RNG still relied on obsolete hashing.
Analysis: Technical debt—begone!
Not only has Jason Donenfeld improved these areas, but he’s also addressed a couple of decades’ worth of code cruft, improving the readability, maintainability and docs. I agree with him that such “unsexy improvements” are critically important investments.
Michael Larabel has been watching developments: Linux 5.18 To Bring Many Random Number Generator Improvements
WireGuard lead developer Jason Donenfeld has recently been spearheading many improvements to the Linux kernel’s random number generator. [These] RNG improvements [give] better VM security, massive performance improvements, and more.
The horse’s mouth would be Jason A. Donenfeld:
[It’s] an attempt to modernize both the code and the cryptography used. … The goal has been to shore up the RNG’s existing design with as much incremental rigor as possible, without, for now, changing anything fundamental. … The focus has been on evolutionarily improving the existing RNG design.
The underlying algorithms that turn … entropy sources into cryptographically secure random numbers have been overhauled. … The most significant outward-facing change is that /dev/random and /dev/urandom are now exactly the same thing. … I began by swapping out SHA-1 for BLAKE2s … with SHA-1 having been broken quite mercilessly, this was an easy change to make. … That change allowed us to improve the forward security of the entropy input pool from 80 bits to 128 bits [and] set the stage for us to be able to do more interesting things with hashing and keyed hashing … to further improve security [and] performance.
random.c was introduced back in 1.3.30 … and was a pretty impressive driver for its time, but after some decades of tweaks, the general organization of the file, as well as some coding style aspects were showing some age. … So a significant amount of work has gone into general code readability and maintainability, as well as updating the documentation. I consider these types of very unsexy improvements to be as important if not more than the various fancy modern cryptographic improvements.
But sinij worries about the new and shiny:
What about SP 800-90B compliance? These changes—especially the switch to BLAKE2s—all but guarantee that Linux would not be able to get NIST certified (and consequently adopted in government, healthcare, financial applications that require certification). So good job giving even more reasons to Red Hat to completely fork the kernel in RHEL 9.
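For readers who want a feel for the “hashing and keyed hashing” Donenfeld mentions, here is a minimal Python sketch of an entropy pool built on BLAKE2s. To be loud about it: this is an illustration only, not the kernel’s algorithm. The ToyPool class, its method names and the “output”/“next” labels are all made up for this example. Only hashlib.blake2s, hashlib.sha1 and os.urandom are real standard-library calls.

```python
import hashlib
import os


class ToyPool:
    """Toy entropy pool. NOT the Linux kernel's design, just an illustration."""

    def __init__(self) -> None:
        self._state = hashlib.blake2s()  # 32-byte (256-bit) digest by default

    def mix(self, sample: bytes) -> None:
        """Absorb an entropy sample into the running hash."""
        self._state.update(sample)

    def extract(self) -> bytes:
        """Return 32 output bytes and replace the pool state."""
        seed = self._state.digest()
        # Keyed BLAKE2s: derive the output and the next state from the
        # seed under different labels, so they are distinct values.
        out = hashlib.blake2s(b"output", key=seed).digest()
        next_state = hashlib.blake2s(b"next", key=seed).digest()
        # Start a fresh hash from next_state; the old seed is discarded.
        self._state = hashlib.blake2s(next_state)
        return out


if __name__ == "__main__":
    print("SHA-1 digest bytes:  ", hashlib.sha1().digest_size)
    print("BLAKE2s digest bytes:", hashlib.blake2s().digest_size)
    pool = ToyPool()
    pool.mix(os.urandom(32))  # on Linux, os.urandom draws on the kernel RNG
    print(pool.extract().hex())
```

The design point the sketch tries to show is that the state is replaced after every extraction, so output handed out earlier is not kept around in the pool.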
3. “Insiders-as-a-Service” Scrotes Claim Another Victim
The LAPSUS$ group says it’s hacked Okta, a huge identity and authentication provider. If true, it’s a troubling development for any DevOps team that uses Okta.
Analysis: Hack redux or APT?
So far, Okta’s public statements aim to minimize the problem, saying the screenshots shared are just a retread of data stolen in January. But the group claims it’s persistent inside Okta. Whatever the truth, the fact that groups such as LAPSUS$ can so easily bribe employees and contractors should focus the minds of every DevOps professional.
Raphael Satter: Okta probes report of digital breach
Hackers posted screenshots showing what they claimed was [Okta’s] internal company environment. A hack … could have major consequences because thousands of other companies, such as FedEx, Moody’s and T-Mobile, rely on the San Francisco-based firm to manage access to their own networks and applications. … Okta describes itself as the “identity provider for the internet” and says it has more than 15,000 customers on its platform [for] identity services such as Single Sign-On and Multi-factor Authentication.
The screenshots were posted by a group of ransom-seeking hackers known as LAPSUS$. [They included] pictures of what appeared to be Okta’s internal tickets and its … Slack.
In a statement, Okta official Chris Hollis said the breach could be related to an earlier incident in January, which he said was contained. Okta had detected an attempt to compromise the account of a third party customer support engineer at the time, said Hollis: “We believe the screenshots shared online are connected to this January event. … There is no evidence of ongoing malicious activity beyond the activity detected in January.”
How come this group keeps popping up recently? By bribing insiders, as nstart explains:
They use the weakest human link in the chain. … They specifically recruit people with access to VPNs/internal support systems. This Okta breach seems to have happened through similar means. The group has made specific calls for access to gaming companies, hosting providers, telcos, call centers, and BPM providers.
[Not] difficult to see how some overworked, underpaid support agent (or even a well paid disgruntled one) might decide to go with this. Just takes one well placed agent to give creds and this group has access to a huge attack surface instantly. This might sound like an overreaction, but corporations in the future might need to make least privilege access and access logging everything a priority.
So an inside job, with the perps in control for two or three months? Scary, thinks upuv:
This is going to be a savage mess. … My colleagues in many organizations and enterprises are now scrambling to identify and contain any breach. … My guess is all the other MFA providers are in full panic mode.
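What might nstart’s “least privilege access and access logging everything” look like in practice? Here is a minimal Python sketch, assuming a deny-by-default permission check that writes every attempt to an audit log. The SupportAgent class, the authorize() helper and the action names are hypothetical, invented for illustration. None of it reflects how Okta or any of its contractors actually work.

```python
import logging
from dataclasses import dataclass

audit_log = logging.getLogger("audit")


@dataclass(frozen=True)
class SupportAgent:
    name: str
    allowed_actions: frozenset  # grant only what the role needs


def authorize(agent: SupportAgent, action: str) -> bool:
    """Deny by default, and log every attempt, allowed or not."""
    allowed = action in agent.allowed_actions
    audit_log.info(
        "agent=%s action=%s allowed=%s", agent.name, action, allowed
    )
    return allowed


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    tier1 = SupportAgent("tier1-agent", frozenset({"view_ticket"}))
    authorize(tier1, "view_ticket")  # permitted and logged
    authorize(tier1, "reset_mfa")    # denied and logged
```

The point is less the code than the habit: a stolen or sold set of support credentials can then do only what that one role was granted, and every attempt leaves a trail.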
The Moral of the Story:
This above all—to thine own self be true.
You have been reading The Long View by Richi Jennings. You can contact him at @RiChi.
Image: Roman Bolozan (via Unsplash; leveled and cropped)