Python: Insecure Password Detection

1. Generate a Password Policy

The following are sensible requirements for an organisational Password Policy:

  • 8+ characters, max of 20 characters (prevent buffer overflow/too-hard-to-remember passwords)

  • No character can be repeated more than 3 times in a row I forgot to implement this 😿

  • Use of UPPERCASE, lowercase, numbers and special symbols

  • Doesn’t follow password walking patterns

  • Doesn’t match common passwords used in dictionary attacks

  • Doesn’t contain obvious words associated with the company

2. Create Python script

Let’s start out by blueprinting our Python script to process each of these requirements individually.

# PASSWORD LENGTH CHECKER

# CONTAINS UPPERCASE, LOWERCASE, NUMBER and SPECIAL SYMBOL

# DOESN'T FOLLOW WALKING PATTERNS

# DOESN't MATCH COMMON THREAT ACTOR DICTIONARY(S)

# NO OBVIOUS COMPANY KEYWORDS (PREVENT CUSTOM DICTIONARY ATTACK)

2.0 Flow of Enforcement

Fail by default. I know it sounds weird, but in software engineering, it is a best practice to always fail, unless conditions can be proven to succeed. However, in many implementations of this code, we still want users to be able to create passwords if our script is buggy, so we will make an informed decision to succeed by default, and try to fail the chosen password at each check.

def isCompliantPassword():

    # PASSWORD LENGTH CHECKER

    # CONTAINS UPPERCASE, LOWERCASE, NUMBER AND SPECIAL SYMBOL

    # DOESN'T FOLLOW WALKING PATTERNS

    # DOESN'T MATCH COMMON THREAT ACTOR DICTIONARY(S)

    # NO OBVIOUS COMPANY KEYWORDS (PREVENT CUSTOM DICTIONARY ATTACK)

    # No tests fail, password compliant
    return True

Our code will return:

FALSE => Failed to meet Password Policy

TRUE => Meets Password Policy

2.1 Password Length Checker

I’m going to try to keep our coding consistent and succinct, by using REGEX (Regular Expression) patterns to run our checks. You could custom script functions that achieve these checks, but I don’t believe it is necessary (more than welcome to for learning opportunity). Python includes a built-in REGEX module, review the docs.

From Python Regex module docs

Cool, lets begin creating the actual REGEX pattern for this rule. Use a REGEX tester online to make the pattern generation easier.

  • The . character represents any possible character

  • The ^ character means that the string pattern is matched exactly from the start of the line, $ means it finishes matching exactly at the end of the line

  • We use {min_repetitions, max_repetitions} to specify how many of the preceding character types there must be in a row i.e. we need minimum of 8 of any characters and maximum of 20 of any character

Testing our REGEX pattern online

Looks good! Let’s create an example password string within a Python variable and then run the REGEX test using the Python module syntax.

From the REGEX Python module docs, it seems best for us to code using this second syntax example, since we only check each string against each REGEX check once. Therefore, caching doesn’t seem necessary and this example looks more readable.

Also - keep in mind, that re.match() returns None (NULL) for failed REGEX patterns and an object for successful patterns. We need to wrap in bool() to convert this falsey value to FALSE and truthy value to TRUE.

Code:

# PASSWORD LENGTH CHECKER 
isCorrectLength = bool(re.match(r"^.{8,20}$", test_pw)) 
print(isCorrectLength) # For testing

2.2 Contains Certain Characters

We just follow on and make REGEX tests for each of the character requirements.

# CONTAINS UPPERCASE, LOWERCASE, NUMBER AND SPECIAL SYMBOL
hasUppercase = bool(re.match(r"[A-Z]+", test_pw))
print(hasUppercase) # For testing

hasLowercase = bool(re.match(r"[a-z]+", test_pw))
print(hasLowercase)  # For testing

hasNumber = bool(re.match(r"\d+", test_pw))
print(hasNumber)  # For testing

hasSpecialSymbol = bool(re.match(r"[!@#$%^&*()]+", test_pw))
print(hasSpecialSymbol)  # For testing

But running this script is quickly giving me very confusing output, even if it works right.

Raw "True" and "False" print statements quickly lose meaning.

So let’s just quickly write up a function which returns the variable name, followed by the test result. We can use that in future instead of print statements.

Or not… it seems there isn’t a native way to print a variables name in a modular way. From my research. Maybe it’s time to structure each of these tests into an object which has the following:

  • Test name (same as variable name)

  • Test outcome (True vs False)

Then, we can put all of these objects into a list and traverse this list at the end of the script, outputting all of the test results one-by-one.

from re import match

class PasswordTest:
    def __init__(self, test_name, test_regex):
        self.test_name = test_name
        self.test_regex = test_regex

    def testOutcome(self, password_input):
        """
        Return the test outcome based on REGEX pattern for password input
        """
        return bool(match(self.test_regex, password_input))

2.3 Uppercase / Lowercase / Number / Special Char

At this point I noticed I was constantly performing REGEX evaluations, so made the decision to modularize a REGEX evaluation function and simply call it with the required parameters to perform similar but distinct checks.

    PasswordTest_Regex("hasValidChars", "match", r"^[a-zA-Z0-9!@#$%^&*()_+-=]*$"),
    PasswordTest_Regex("isValidLength", "match", r"^.{8,20}$"),
    PasswordTest_Regex("hasUppercase", "search", r"[A-Z]+"),
    PasswordTest_Regex("hasLowercase", "search", r"[a-z]+"),
    PasswordTest_Regex("hasNumber", "search", "\d+"),
    PasswordTest_Regex("hasSpecialSymbol", "search", r"[!@#$%^&*()]+"),
    PasswordTest_Dictionary("notInThreatActorDicts"),

2.4 Preventing Dictionary Attacks

One of the more interesting complexity checks - ensuring that the password isn’t part of a dictionary list - either of corporate keywords or common hitlists used by threat actors.

I included the option to enumerate common l33t (Leet) transformations of the provided dictionary, particularly useful for detecting common variations of obvious passwords. I thought this would be especially useful for organizational keyword dictionaries, as opposed to huge hitlists (as this may take a very long time to evaluate). My l33t mapping was relatively simple, but could be expanded to account for many other, rarer substitutions.

Eg. hitlist contains words like “<Company Name>”, “<CEO Name>” etc. that employees may use lazily

# Create dictionary to map common LEET mappings as character sets for regex
                            leetmap_common = {"a": "[a4@]", "e": "[e3]", "o": "[o0]", "i": "[i1!]", "l": "[l1!]",
                                              "s": "[s$5]"}

L33t mapping table provided by Wikipedia

Otherwise, by default I would not perform the l33tc0de transformations on the hitlist terms and simply enumerate the provided rockyou.txt, a true classic for cracking obvious passwords via a Dictionary Attack.

2.5 Preventing Keyboard Walks

Keyboard walk in a nutshell

The other creative check I thought I could implement is to detect Keyboard Walks. I constrained this to only horizontal keyboard walks like qwertyu or ghjkl as these are most common from my experience. Vertical would be like qazwsxedc or something and I just don’t feel it is nearly as common.

So I created an adjacency map for all UK keyboard layout keys, such that I could know what keys were directly neighboring them.

Eg. ‘a’ is directly adjacent to only ‘s’. While ‘s’ is directly adjacent to ‘a’ and ‘d’.

keyboard_horizontal_layout = {

"!": ["@"], "@": ["!", "#"], "#": ["@", "$"], "$": ["#", "%"], "%": ["$", "^"], "^": ["%", "&"], "&": ["^", "*"], "*": ["&", "("], "(": ["*", ")"], ")": ["(", "_"], "_": [")", "+"], "+": ["_"],

"1": ["2"], "2": ["1", "3"], "3": ["2", "4"], "4": ["3", "5"], "5": ["4", "6"], "6": ["5", "7"], "7": ["6", "8"], "8": ["7", "9"], "9": ["8", "0"], "0": ["9", "-"], "-": ["0", "="], "=": ["-"],

"q": ["w"], "w": ["q", "e"], "e": ["w", "r"], "r": ["e", "t"], "t": ["r", "y"], "y": ["t", "u"], "u": ["y", "i"], "i": ["u", "o"], "o": ["i", "p"], "p": ["o", "["], "["

: ["p", "]"], "]": ["["],

"a": ["s"], "s": ["a", "d"], "d": ["s", "f"], "f": ["d", "g"], "g": ["f", "h"], "h": ["g", "j"], "j": ["h", "k"], "k": ["j", "l"], "l": ["k", ";"], ";": ["l", "'"], "'": [";", "\\"], "\\": ["'"],

"z": ["x"], "x": ["z", "c"], "c": ["x", "v"], "v": ["c", "b"], "b": ["v", "n"], "n": ["b", "m"], "m": ["n", ","], ",": ["m", "."], ".": [",", "/"], "/": ["."]

}

The logic is: each for each character in the password, from first char to last char, if the next char in sequence is indeed adjacent to the current char, then add one to the counter for detecting horizontal keyboard walks. As soon as finding a char where this doesn’t hold true, return counter to zero. But keep track of the max. value which the counter hits in the evaluation. If this max. is above an acceptable threshold (eg. 5 to prevent False Positives) then consider the password walk check a FAIL.

3. Final Touches - CLI Usage

With my test executions returning as expected, I did experiment with trying to make the Python program nicer to execute via CLI using pyCLI, but honestly, I’m not bothered to push through the bugs I was getting as of now. Maybe later I will revisit and fix for better CLI usage.

Pull my final code from Github, or feel free to check out my other developed programs.

Possible Applications

Obviously, as it stands the script is just a proof of concept.

But some possible applications of this script could be:

  • Compile the Python script as a C-compatible Password Filter DLL to be used with LSA upon ActiveDirectory password creation

  • Provide via the company Intranet portal or as an open-source web application for users to make strong password decisions

  • Implement in the backend of proprietary web application or host-based application in addition to non-enforcing front-end password complexity checks

Next
Next

Pi Not Pie: Building my Bedroom Digital Dashboard