CY 2550 - Foundations of Cybersecurity

Project 2: Passwords

This project is due at 11:59pm on Friday, February 15, 2019.

Description and Deliverables

In this project, you will gain hands on experience cracking passwords, as well as creating secure, memorable passwords that are resistant to cracking. As such, this project has two distinct parts. I highly recommend that you start part one immediately: the necessary computations can take days to complete!

To receive full credit for this project, you will turn in the following two things:

  1. A file named cracked.txt that contains the usernames and cracked passwords for the 50 users contained in this leaked /etc/shadow file.
  2. A program that you will write called xkcdpwgen that can generate secure, memorable passwords using the XKCD method.
Each of these deliverables is described in greater detail below.

Part 1: Password Cracking

Linux systems typically store cryptographically hashed user passwords in crypt format in the /etc/shadow file. If you have sudo access to a Linux system, you can view this file on your own system (don't try to look at this file on systems you don't own, like the Khoury College Linux machines). The file format for the /etc/shadow file is described here.

In this part of the project, you will crack the hashed passwords contained in this leaked /etc/shadow file. There are 50 usernames and passwords in the file, meaning that it will take several days of compute power to crack all 50 passwords, so start this process early!

Cracking Tools

We recommend that students use well-known, heavily optimized cracking tools like John the Ripper or HashCat for this part of the project. Both tools are available for multiple platforms, although they are trivial to install on Debian-based Linux systems:
sudo apt install john
sudo apt install hashcat
Both tools have built-in support for the /etc/shadow file format, have the ability to pause and resume cracking sessions (a useful feature, since cracking can take hours/days), and support multiple different strategies for guessing passwords (e.g. brute force, word lists, etc.). We leave it to you to determine which tool you prefer and learn its command line syntax. Students are welcome to use whatever password guessing approach they want; many wordlists are available for free online, including from the John the Ripper homepage.

John the Ripper and HashCat both have the ability to run in multi-threaded configurations (i.e. they try to crack multiple passwords in parallel). We highly recommend that students utilize these features; for example, on a quad-core laptop, running John the Ripper with the "--fork=3" option to use three CPU cores is a reasonable approach. Alternatively, if your computer has a GPU, we highly recommend using the GPU-optimized, OpenCL modes available in both programs, since GPUs are several orders of magnitude faster at password cracking than CPUs.

Cracking Approach

The leaked shadow file is designed to have a sliding difficulty scale. Without doing anything fancy, roughly half of the passwords should crack in just a few minutes. Why do you think these passwords were so easy to crack?

With a reasonably comprehensive wordlist/dictionary (links to examples are provided in Part 2 of the project) combined with common permutation rules, another ~15 passwords should crack within 24 hours. For example, using John the Ripper the following command will attempt to crack the passwords using a wordlist of your choice and John's built-in permutation rules (e.g. capitalizing the first and last letters of words, adding random numbers to the end of words, etc.).

$ john --wordlist=[path to your wordlist] --rules --fork=3 [path to the shadow file]
The remaining ~10 passwords are more challenging, and require more expansive permutation rules (hint: symbols) or even raw brute force to crack. For example, using John the Ripper, you can attempt a brute force attack against the shadow file using all combinations of ASCII characters with length <14 using the following command:
$ john --incremental=ASCII --fork=3 [path to the shadow file]
Note that this kind of brute force approach will take a long time to complete.

File Format for Part 1

To complete part 1 of this project, you will turn in a file named cracked.txt that contains the usernames and cracked passwords for the 50 users in the leaked shadow file. Each user and corresponding password should appear on one line in cracked.txt separated by a colon. For example, the format of a valid submission might look like this:

Part 2: Generating Secure, Memorable Passwords

One big reason why people choose weak passwords that are easily cracked is because they have been taught that only confusing passwords are secure. People either reject this advice and leave themselves vulnerable, or adopt password creation heuristics that are not resilient to cracking in practice (e.g. English word plus one capital letter, one random number, and one random symbol).

In this part of the project, you will write a program that generates secure, memorable passwords using the XKCD method. Your program may be written in any language that is available on the Khoury College Linux machines (this includes C, C++, Python 2 and 3, Java, Racket, Ruby, Perl, Go, Rust, and possibly others). Regardless of which language you choose, your program must exactly obey the following command line syntax:

$ ./xkcdpwgen -h
usage: xkcdpwgen [-h] [-w WORDS] [-c CAPS] [-n NUMBERS] [-s SYMBOLS]
Generate a secure, memorable password using the XKCD method
optional arguments:
    -h, --help            show this help message and exit
    -w WORDS, --words WORDS
                          include WORDS words in the password (default=4)
    -c CAPS, --caps CAPS  capitalize the first letter of CAPS random words
    -n NUMBERS, --numbers NUMBERS
                          insert NUMBERS random numbers in the password
    -s SYMBOLS, --symbols SYMBOLS
                          insert SYMBOLS random symbols in the password
Note that your program does not need to print this exact help text. However:

Usage of xkcdpwgen

By default, if you run xkcdpwgen with no arguments, it should produce a password composed of four random English words, all characters in lowercase, without numbers or symbols, like this:

$ ./xkcdpwgen
$ ./xkcdpwgen
$ ./xkcdpwgen
You are free to use any English wordlist that you wish as part of this project. Some reasonable wordlists are available here, here, and here. Make sure to turn in a copy of your wordlist with your project! You may assume that your program will be invoked from the same directory that contains your wordlist, and you will need to hard-code the filename of your wordlist in your program.

The "-w" and "--words" arguments allow the user to override the number of words in the generated password. For example:

./xkcdpwgen -w 2
$ ./xkcdpwgen -w 2
The "-c" and "--caps" arguments capitalize the first letters of random words from the password. For example:
./xkcdpwgen -c 2
$ ./xkcdpwgen -c 2
The "-n" and "--numbers" arguments add random numerical characters into the password, either at the beginning, end, or in-between words. The "-s" and "--symbols" arguments do the same thing but for symbol characters (~!@#$%^&*.:;). For example:
$ ./xkcdpwgen -n 2 -s 2
$ ./xkcdpwgen -n 2 -s 4
$ ./xkcdpwgen -n 2 -s 4
$ ./xkcdpwgen -n 2 -s 4

You may add additional functionality to your program if you wish, but these arguments must be available and behave exactly as specified in this project description. You may handle errors however you see fit. For example the following invocation has an error; you may choose to display an error message, or generate a "best-effort" password.

$ ./xkcdpwgen -c 10

Packaging Your Submission for Part 2

Because you are allowed to program in whatever language you wish, we require that all students submit a Makefile. If you choose to use a compiled language, you must turn in your source code, and the Makefile must compile your program. For example, if you write your program in C/C++, the final product of the Makefile should be a program called xkcdpwgen.

If you choose to program in a compiled language that does not produce executable binaries (e.g. the Java compiler produces .class files), then you must include a shell script with your submission named xkcdpwgen that can (1) invoke your program and (2) forward any given command line arguments to your program. You must also include a Makefile that transforms your source code into compiled files (e.g. .java files into .class files).

If you choose to use a language that does not need compilation (e.g. Python, Perl), you may leave your Makefile blank. We encourage students that choose to program in scripting languages to adopt shebang syntax and submit an executable script named xkcdpwgen.

Submitting Your Project

Before turning in the project, you must register yourself for our grading system using the following command:
$ /course/cs2550sp19/bin/register-student [NUID]
NUID is your Northeastern ID number, including any leading zeroes. This command is available on all of the Khoury College lab machines.

The exact files that you submit for this assignment will vary depending on the programming language you choose to use for part 2. At a minimum, you will probably submit:

You submit your project by running the turn-in script as follows:
$ /course/cs2550sp19/bin/turnin project2 <project directory>
where <project directory> is the name of the directory with your submission. The script will print out every file that you are submitting, so make sure that it prints out all of the files you wish to submit! The turn-in script will not accept submissions that are missing a Makefile or cracked.txt. You may submit as many times as you wish; only the last submission will be graded, and the time of the last submission will determine whether your assignment is late.

At any time, you can run the following command to see all of your current grades for projects, essays, quizzes, and tests.

$ /course/cs2550sp19/bin/gradesheet


This project is worth 10% of your final grade, broken down as follows (out of 100): Points can be lost for turning in files in incorrect formats (e.g. not ASCII), failing to follow specified formatting or naming conventions, failing to compile, failing to follow specified command line syntax, insufficient or incorrect randomization, etc.