Part 1: Is Insider Risk a Likely Vector in the 16 Billion Credential Leak?

25 June 2025 Cyber and Information Security

By Prasanna Abeysekera

It was shocking news for the cybersecurity community in June 2025: "Researchers discover 16 billion new stolen passwords online." The sheer scale of the incident immediately raised questions about how so many credentials had been harvested. Was malware responsible? Credential stuffing? Brute-force cracking? Or is there a deeper systemic issue?
In this two-part technical series, we take readers beyond the headlines and into the actual repercussions of the breach. This first part argues that insider access or server-side compromise is the most probable explanation for a leak of this size occurring despite modern security measures such as hashing, salting, and two-factor authentication (2FA).

What Information Was Revealed in the Breach Files?

Note on the masked columns:

In the leaked screenshot, two columns have been visually redacted or masked. While we cannot confirm their exact content, based on typical breach patterns and data dump structures, they likely contain identifiers such as email addresses, usernames, IP addresses, or device and session metadata. However, the redaction does not weaken the case or invalidate the analysis. The presence of a plaintext password ("fabricio1234"), along with structured domain and login path data, provides sufficient evidence of insecure handling. The format and detail presented confirm that these records were extracted from a centralised system with backend access rather than being randomly collected through malware.

Screenshots of the exposed data show structured JSON objects, such as the following example:

Notable characteristics:

Cleartext passwords (e.g., fabricio1234)
Associated domain and login paths
High-resolution timestamps
No presence of hashes or salts

This observation is inconsistent with a dataset consisting of hashed credentials; instead, it appears to represent a direct export from internal systems.
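
The distinction between cleartext and hashed values can be checked mechanically. The following is a minimal sketch, assuming each leaked record is a dictionary with a `password` field; that field name and the pattern list are illustrative assumptions, since the original JSON layout is not reproduced here.

```python
import re

# Heuristic patterns for common stored-password formats (illustrative, not exhaustive).
HASH_PATTERNS = [
    re.compile(r"^[0-9a-fA-F]{32}$"),   # 128-bit hex digest (e.g. MD5)
    re.compile(r"^[0-9a-fA-F]{40}$"),   # 160-bit hex digest (e.g. SHA-1)
    re.compile(r"^[0-9a-fA-F]{64}$"),   # 256-bit hex digest (e.g. SHA-256)
    re.compile(r"^[0-9a-fA-F]{128}$"),  # 512-bit hex digest (e.g. SHA-512)
    re.compile(r"^\$2[aby]\$\d{2}\$[./A-Za-z0-9]{53}$"),  # bcrypt encoded string
    re.compile(r"^\$argon2(id|i|d)\$"),  # Argon2 encoded string prefix
]


def looks_hashed(value: str) -> bool:
    """Return True if the value matches a known hash format."""
    return any(pattern.match(value) for pattern in HASH_PATTERNS)


def cleartext_ratio(records, field="password"):
    """Fraction of records whose password field does not look like a hash.

    `field` is a hypothetical key name, not taken from the leaked data.
    """
    values = [record[field] for record in records if field in record]
    if not values:
        return 0.0
    return sum(not looks_hashed(value) for value in values) / len(values)
```

A value such as `fabricio1234` matches none of the patterns, so a dump dominated by such values would score close to 1.0. The check is only a heuristic: a user could in principle choose a password that happens to look like a hex digest.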

Why did this occur despite the existing security measures?

Even with best practices like salted password hashing and two-factor authentication (2FA), credentials can still be compromised through implementation or architectural failures. The usual explanations fall short for a leak of this kind:

Client-side malware isn't scalable to 16B records: While infostealers such as RedLine and Raccoon are capable of harvesting credentials, the claim that this type of malware infected between 3.5 and 16 billion endpoints lacks credibility. The uniform structure and metadata point to a centralised breach rather than widespread endpoint-based harvesting.
Credential reuse amplifies the impact of security breaches: Many users reuse passwords across platforms, so one compromised password can lead to breaches of multiple accounts. However, password reuse alone doesn't account for the disclosure of plaintext passwords; such incidents typically require unauthorised access to the server itself.
Brute-force cracking is too slow, given the volume of data: Brute-forcing strong hashes is impractical even on a supercomputer of the kind built by Cray (originally Cray Research Inc., now part of Hewlett Packard Enterprise). Extremely weak passwords can be cracked quickly one at a time, but repeating that effort across 16 billion entries remains impractical.

Two-factor authentication (2FA) protects logins, not stored data: 2FA (TOTP or other methods) protects login sessions but not stored credentials. If a database is compromised, hashed passwords, salts, and TOTP secrets could all be exposed.
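
The brute-force argument above can be put into rough numbers. This is a back-of-the-envelope sketch only: the guess rate `ASSUMED_RATE` is a hypothetical figure chosen for illustration, not a measured benchmark, and the alphabet sizes are assumptions about password composition.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of candidate passwords of exactly `length` characters."""
    return alphabet_size ** length


def years_to_exhaust(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Worst-case time, in years, to try every candidate for ONE salted hash."""
    return keyspace(alphabet_size, length) / guesses_per_second / SECONDS_PER_YEAR


# Hypothetical attacker throughput: ten billion hash guesses per second.
ASSUMED_RATE = 1e10

# Weak case: 8 characters drawn from lowercase letters and digits (36 symbols).
weak_years = years_to_exhaust(36, 8, ASSUMED_RATE)

# Strong case: 12 characters drawn from 94 printable ASCII symbols.
strong_years = years_to_exhaust(94, 12, ASSUMED_RATE)

print(f"weak password:   {weak_years * SECONDS_PER_YEAR:.0f} seconds")
print(f"strong password: {strong_years:.2e} years")
```

Under these assumptions a single weak password falls within minutes, while a strong one takes on the order of a million years. With a unique salt per record the work cannot be shared between records, so even the weak-password cost has to be paid again for each of the 16 billion entries.
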

What This Dataset Implies

The existence of cleartext passwords strongly suggests that the data was:
- Not hashed at all, or
- Captured before hashing (in logs, memory, or dev tools)

This brings us to the essential question:

Where did these credentials originate?

All indications suggest that a compromise has occurred on the server side or that someone has gained privileged access. Let's take a closer look at how this may have occurred:

Server-Side Vectors:

Memory-Level Access: Individuals (or attackers) with access to backend processes can retrieve passwords before they are hashed.
Log Leakages: Applications that are not configured properly may log passwords in plain text during the registration or login processes.
Database Misconfiguration: Some legacy or test systems continue to store passwords in an unencrypted format.
Staging Environments: Developers often disable security measures in test environments, leaving them vulnerable to attacks.
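
The log-leakage vector listed above is easy to reproduce. The sketch below, built on Python's standard `logging` module, shows a hypothetical login handler that writes the submitted form to the log, and a filter that masks sensitive fields. The logger names, the key list, and the username `alice` are assumptions made for illustration.

```python
import io
import logging

# Hypothetical list of field names that should never reach a log file.
SENSITIVE_KEYS = {"password", "passwd", "secret", "token"}


class RedactSensitive(logging.Filter):
    """Mask sensitive values in dict-style log arguments before they are emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        if isinstance(record.args, dict):
            record.args = {
                key: "***" if key.lower() in SENSITIVE_KEYS else value
                for key, value in record.args.items()
            }
        return True  # keep the record; only its arguments change


def build_logger(stream, redact: bool) -> logging.Logger:
    """Create an isolated demo logger writing to `stream`."""
    logger = logging.getLogger(f"auth.demo.{'redacted' if redact else 'raw'}")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    logger.filters.clear()
    logger.addHandler(logging.StreamHandler(stream))
    if redact:
        logger.addFilter(RedactSensitive())
    return logger


def log_login_attempt(logger: logging.Logger, form: dict) -> None:
    # Anti-pattern when no filter is installed: the raw form, password included, is logged.
    logger.info("login attempt user=%(username)s password=%(password)s", form)


form = {"username": "alice", "password": "fabricio1234"}
raw, safe = io.StringIO(), io.StringIO()
log_login_attempt(build_logger(raw, redact=False), form)
log_login_attempt(build_logger(safe, redact=True), form)
```

After running this, `raw` holds the password in plain text while `safe` holds a masked value. The filter only covers values passed as a dictionary of log arguments; a password already interpolated into the message string (for example with an f-string) would pass through untouched, which is one reason such leaks tend to go unnoticed.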

Insider Involvement Is Highly Probable

Imagine a scenario where an attacker manages to:

Gain access to neatly structured, well-organised data,
Export a massive trove of credentials from backend systems,
Classify the leaked information by domain and specific login paths…

In such a case, they likely possessed:

Admin-level access, giving them the keys to the kingdom,
Privileges to perform database dumps, allowing them to extract sensitive information,
Insider knowledge or valid credentials, enabling them to navigate the system with ease.

This structured dataset stands in stark contrast to typical random outputs from infostealers. It's evident that this was a well-planned and deliberate export, not just a haphazard grab of data.

What If These Systems Used SHA-256 + Salt?

If proper hashing and salting were applied, we would see hashed values such as:

However, we did not. So, the original system either:

Did not hash passwords,
Logged them before hashing, or
Stored them insecurely in staging or backups.
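
For comparison, here is a minimal sketch of what a salted SHA-256 record would look like if the password `fabricio1234` had been hashed before storage. The function names and the record layout (`salt`, `password_hash`) are assumptions for illustration, not the format of any affected system.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[str, str]:
    """Return (salt_hex, digest_hex) for a salted SHA-256 hash of the password."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random 128-bit salt per user
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt.hex(), digest


def verify_password(password: str, salt_hex: str, expected_digest: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, bytes.fromhex(salt_hex))
    return hmac.compare_digest(digest, expected_digest)


# What a stored record would contain instead of the cleartext password.
salt_hex, digest = hash_password("fabricio1234")
record = {"salt": salt_hex, "password_hash": digest}
print(record)
```

The stored record holds a 32-character hex salt and a 64-character hex digest, and the original password appears nowhere in it. This mirrors the SHA-256 + salt scheme named in this article; in practice a deliberately slow key-derivation function, such as `hashlib.pbkdf2_hmac` or `hashlib.scrypt` from the standard library, is generally preferred over a single SHA-256 pass.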

Final Recommendation

The absence of hashed values in the dataset suggests a failure to adhere to basic principles of cryptographic hygiene.

⚠️ Security extends beyond cryptography; it also encompasses system architecture, proper hygiene practices, and the impact of human factors.

Conclusion: Yes, Insider Risk is a Likely Vector

When cleartext passwords are compromised on a large scale, the cause is usually inadequate implementation and poor access control rather than a simple failure to prevent brute-force attacks. Had SHA-256 + Salt and two-factor authentication (2FA) been effectively implemented at every point where passwords were handled, this unfortunate dataset would not exist. The most plausible explanation for this breach is that these credentials were accessed from server memory, API logs, misconfigured databases, or through insider threats.

Stay tuned for Part 2, where we'll go over actionable defence measures for:

Hashing implementations
Insider threat monitoring
Secure application design
Detection of log and memory leaks
