CY 2550 - Project 2

CY 2550 - Foundations of Cybersecurity

Project 2: Passwords

This project is due at 11:59pm on Friday, February 15, 2019.

Description and Deliverables

In this project, you will gain hands-on experience cracking passwords, as well as creating secure, memorable passwords that are resistant to cracking. As such, this project has two distinct parts. I highly recommend that you start part one immediately: the necessary computations can take days to complete!

To receive full credit for this project, you will turn in the following two things:

A file named cracked.txt that contains the usernames and cracked passwords for the 50 users contained in this leaked /etc/shadow file.
A program that you will write called xkcdpwgen that can generate secure, memorable passwords using the XKCD method.

Each of these deliverables is described in greater detail below.

Part 1: Password Cracking

Linux systems typically store cryptographically hashed user passwords in crypt format in the /etc/shadow file. If you have sudo access to a Linux system, you can view this file on your own system (don't try to look at this file on systems you don't own, like the Khoury College Linux machines). The file format for the /etc/shadow file is described here.
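
As a rough illustration of that format, the following Python sketch splits one shadow entry into its colon-separated fields and then splits the crypt-format password field on "$". The entry shown is made up for illustration (it is not taken from the leaked file), and the function name parse_shadow_line is hypothetical.

```python
def parse_shadow_line(line):
    """Split one /etc/shadow entry into username and crypt-format pieces."""
    fields = line.rstrip("\n").split(":")
    username, pw_field = fields[0], fields[1]
    # Crypt format looks like $id$salt$hash. Locked or empty accounts
    # (e.g. "*" or "!") do not follow this pattern, so report no hash.
    parts = pw_field.split("$")
    if len(parts) < 4 or parts[0] != "":
        return {"user": username, "scheme": None, "salt": None, "hash": None}
    # Some schemes put extra parameters (such as rounds=N) between the id
    # and the salt, so take the salt and hash from the end of the list.
    return {"user": username, "scheme": parts[1],
            "salt": parts[-2], "hash": parts[-1]}


# Made-up example entry; the salt and hash are placeholders, not real values.
example = "alice:$6$examplesalt$examplehashvalue:17940:0:99999:7:::"
print(parse_shadow_line(example))
```

The scheme identifier indicates the hash algorithm; for example, 1 is MD5-crypt, 5 is SHA-256-crypt, and 6 is SHA-512-crypt.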

In this part of the project, you will crack the hashed passwords contained in this leaked /etc/shadow file. There are 50 usernames and passwords in the file, meaning that it will take several days of compute power to crack all 50 passwords, so start this process early!

Cracking Tools

We recommend that students use well-known, heavily optimized cracking tools like John the Ripper or HashCat for this part of the project. Both tools are available for multiple platforms, and they are especially easy to install on Debian-based Linux systems:

sudo apt install john
sudo apt install hashcat

Both tools have built-in support for the /etc/shadow file format, have the ability to pause and resume cracking sessions (a useful feature, since cracking can take hours/days), and support multiple different strategies for guessing passwords (e.g. brute force, word lists, etc.). We leave it to you to determine which tool you prefer and learn its command line syntax. Students are welcome to use whatever password guessing approach they want; many wordlists are available for free online, including from the John the Ripper homepage.

John the Ripper and HashCat can both spread their work across multiple CPU cores (i.e. they test many candidate passwords in parallel). We highly recommend that students utilize these features; for example, on a quad-core laptop, running John the Ripper with the "--fork=3" option to use three CPU cores is a reasonable approach. Alternatively, if your computer has a GPU, we highly recommend using the GPU-optimized, OpenCL modes available in both programs, since GPUs can be orders of magnitude faster at password cracking than CPUs.

Cracking Approach

The leaked shadow file is designed to have a sliding difficulty scale. Without doing anything fancy, roughly half of the passwords should crack in just a few minutes. Why do you think these passwords were so easy to crack?

With a reasonably comprehensive wordlist/dictionary (links to examples are provided in Part 2 of the project) combined with common permutation rules, another ~15 passwords should crack within 24 hours. For example, using John the Ripper, the following command will attempt to crack the passwords using a wordlist of your choice and John's built-in permutation rules (e.g. capitalizing the first and last letters of words, adding random numbers to the end of words, etc.).

$ john --wordlist=[path to your wordlist] --rules --fork=3 [path to the shadow file]

The remaining ~10 passwords are more challenging, and require more expansive permutation rules (hint: symbols) or even raw brute force to crack. For example, using John the Ripper, you can attempt a brute force attack against the shadow file using all combinations of ASCII characters with length <14 using the following command:

$ john --incremental=ASCII --fork=3 [path to the shadow file]

Note that this kind of brute force approach will take a long time to complete.
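
To get a feel for why, the following Python sketch counts the candidates in a brute-force search over the 95 printable ASCII characters and converts that count into time at a given guessing rate. The rate of 10,000 guesses per second is an assumed placeholder, not a measurement; real rates depend on the hash scheme and on your hardware.

```python
PRINTABLE_ASCII = 95  # space through tilde


def keyspace(max_len, alphabet=PRINTABLE_ASCII):
    """Number of candidate strings with length 1..max_len."""
    return sum(alphabet ** n for n in range(1, max_len + 1))


def days_to_exhaust(max_len, guesses_per_second):
    """Worst-case days needed to try every candidate at the given rate."""
    return keyspace(max_len) / guesses_per_second / 86400


ASSUMED_RATE = 10_000  # hypothetical guesses per second
for length in (4, 6, 8):
    days = days_to_exhaust(length, ASSUMED_RATE)
    print(length, keyspace(length), round(days, 1))
```

Each extra character multiplies the search space by 95, which is why wordlists and permutation rules are worth trying before raw brute force.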

File Format for Part 1

To complete part 1 of this project, you will turn in a file named cracked.txt that contains the usernames and cracked passwords for the 50 users in the leaked shadow file. Each user and corresponding password should appear on one line in cracked.txt separated by a colon. For example, the format of a valid submission might look like this:

cbw:really_strong_password6@
alice:1337cr4ck1ngsk1llz
bob:weak1234
charlie:lalala
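
Before submitting, it may help to sanity-check the file. The following Python sketch is one possible checker, not part of the grading system; the function name check_cracked_lines is hypothetical, and it assumes that no username contains a colon and that no cracked password is empty.

```python
def check_cracked_lines(lines, expected_users=50):
    """Return a list of problems found in username:password lines."""
    problems = []
    seen = set()
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        # Split on the first colon only, since a password may contain colons.
        user, sep, password = line.partition(":")
        if not sep or not user or not password:
            problems.append(f"line {number}: expected username:password")
            continue
        if user in seen:
            problems.append(f"line {number}: duplicate user {user}")
        seen.add(user)
    if len(seen) != expected_users:
        problems.append(f"found {len(seen)} users, expected {expected_users}")
    return problems


sample = ["alice:1337cr4ck1ngsk1llz", "bob:weak1234"]
print(check_cracked_lines(sample, expected_users=2))
```

To check a real file, pass an open file object, e.g. check_cracked_lines(open("cracked.txt")).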

Part 2: Generating Secure, Memorable Passwords

One big reason why people choose weak passwords that are easily cracked is that they have been taught that only confusing passwords are secure. People either reject this advice and leave themselves vulnerable, or adopt password creation heuristics that are not resilient to cracking in practice (e.g. English word plus one capital letter, one random number, and one random symbol).

In this part of the project, you will write a program that generates secure, memorable passwords using the XKCD method. Your program may be written in any language that is available on the Khoury College Linux machines (this includes C, C++, Python 2 and 3, Java, Racket, Ruby, Perl, Go, Rust, and possibly others). Regardless of which language you choose, your program must exactly obey the following command line syntax:

$ ./xkcdpwgen -h
usage: xkcdpwgen [-h] [-w WORDS] [-c CAPS] [-n NUMBERS] [-s SYMBOLS]
                
Generate a secure, memorable password using the XKCD method
                
optional arguments:
    -h, --help            show this help message and exit
    -w WORDS, --words WORDS
                          include WORDS words in the password (default=4)
    -c CAPS, --caps CAPS  capitalize the first letter of CAPS random words
                          (default=0)
    -n NUMBERS, --numbers NUMBERS
                          insert NUMBERS random numbers in the password
                          (default=0)
    -s SYMBOLS, --symbols SYMBOLS
                          insert SYMBOLS random symbols in the password
                          (default=0)

Note that your program does not need to print this exact help text. However:

Your program must support all five of these command line options.
Your program must be named xkcdpwgen.
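
For students working in Python, one way to obtain this interface is the standard library's argparse module, which adds the -h/--help option automatically. The following is a sketch under that assumption, not the reference implementation; build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser for the five required command line options."""
    parser = argparse.ArgumentParser(
        prog="xkcdpwgen",
        description="Generate a secure, memorable password using the XKCD method",
    )
    parser.add_argument("-w", "--words", type=int, default=4,
                        help="include WORDS words in the password (default=4)")
    parser.add_argument("-c", "--caps", type=int, default=0,
                        help="capitalize the first letter of CAPS random words (default=0)")
    parser.add_argument("-n", "--numbers", type=int, default=0,
                        help="insert NUMBERS random numbers in the password (default=0)")
    parser.add_argument("-s", "--symbols", type=int, default=0,
                        help="insert SYMBOLS random symbols in the password (default=0)")
    return parser


# An explicit argument list is parsed here for demonstration; a real script
# would call build_parser().parse_args() with no arguments to read sys.argv.
args = build_parser().parse_args(["-w", "2", "--caps", "1"])
print(args.words, args.caps, args.numbers, args.symbols)
```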

Usage of xkcdpwgen

By default, if you run xkcdpwgen with no arguments, it should produce a password composed of four random English words, all characters in lowercase, without numbers or symbols, like this:

$ ./xkcdpwgen
guacamoleexamgallopedcrediting
$ ./xkcdpwgen
flockdolliescitizenrysource
$ ./xkcdpwgen
autumnsbooboomultipliesbandwagons

You are free to use any English wordlist that you wish as part of this project. Some reasonable wordlists are available here, here, and here. Make sure to turn in a copy of your wordlist with your project! You may assume that your program will be invoked from the same directory that contains your wordlist, and you will need to hard-code the filename of your wordlist in your program.
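
The default behavior might be sketched in Python as follows. The filename words.txt and the function names are hypothetical, and the short inline list stands in for a real wordlist so that the example runs on its own. The sketch uses the secrets module because it draws from the operating system's cryptographically secure random source.

```python
import secrets

WORDLIST_FILE = "words.txt"  # hypothetical filename; use your own wordlist


def load_words(path=WORDLIST_FILE):
    """Read one word per line, lowercased, skipping blank or non-alphabetic lines."""
    with open(path, encoding="utf-8") as handle:
        return [w.strip().lower() for w in handle if w.strip().isalpha()]


def pick_words(words, count=4):
    """Choose count words independently and uniformly at random."""
    return [secrets.choice(words) for _ in range(count)]


# Tiny stand-in list so the sketch runs without a wordlist file.
demo_words = ["guacamole", "exam", "galloped", "crediting", "flock", "source"]
print("".join(pick_words(demo_words)))
```

In a full program, pick_words(load_words()) would replace the stand-in list.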

The "-w" and "--words" arguments allow the user to override the number of words in the generated password. For example:

$ ./xkcdpwgen -w 2
studiesexaminer
$ ./xkcdpwgen -w 2
luridlypiers

The "-c" and "--caps" arguments capitalize the first letters of random words from the password. For example:

$ ./xkcdpwgen -c 2
GrenadehostelriesBirdcagedirectives
$ ./xkcdpwgen -c 2
warehousedfootbathJiffyGazebo
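
One possible way to implement this in Python is to sample distinct word positions and capitalize only those. This is a sketch with a hypothetical function name, and it assumes the input words are all lowercase.

```python
import secrets


def capitalize_random(words, caps):
    """Capitalize the first letter of `caps` distinct, randomly chosen words."""
    rng = secrets.SystemRandom()
    # sample() picks positions without repeats, so exactly `caps` words change.
    # It raises ValueError if caps is larger than the number of words.
    chosen = set(rng.sample(range(len(words)), caps))
    return [w.capitalize() if i in chosen else w for i, w in enumerate(words)]


words = ["grenade", "hostelries", "birdcage", "directives"]
print("".join(capitalize_random(words, 2)))
```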

The "-n" and "--numbers" arguments add random numerical characters into the password, either at the beginning, end, or in-between words. The "-s" and "--symbols" arguments do the same thing but for symbol characters (~!@#$%^&*.:;). For example:

$ ./xkcdpwgen -n 2 -s 2
@$3genteelpredatorcrickets9frustrates
$ ./xkcdpwgen -n 2 -s 4
^saltiness77checkersvulgarly$saturn^;
$ ./xkcdpwgen -n 2 -s 4
~pushes%barre^5pricksgosh$9
$ ./xkcdpwgen -n 2 -s 4
putrefying$~7polycyclic.enneads1unamended!
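
The insertion step could be sketched in Python as follows: treat the positions before, between, and after the words as slots, and drop each random digit or symbol into a randomly chosen slot. The function name insert_extras is hypothetical, and this is only one of several reasonable placement strategies.

```python
import secrets

SYMBOLS = "~!@#$%^&*.:;"
DIGITS = "0123456789"


def insert_extras(words, numbers=0, symbols=0):
    """Insert random digits and symbols at random word boundaries."""
    # slots[i] collects characters placed before word i; the final slot
    # holds characters placed after the last word.
    slots = [[] for _ in range(len(words) + 1)]
    extras = [secrets.choice(DIGITS) for _ in range(numbers)]
    extras += [secrets.choice(SYMBOLS) for _ in range(symbols)]
    # Shuffle so digits and symbols can land in any order within a slot.
    secrets.SystemRandom().shuffle(extras)
    for ch in extras:
        slots[secrets.randbelow(len(slots))].append(ch)
    pieces = []
    for i, word in enumerate(words):
        pieces.append("".join(slots[i]))
        pieces.append(word)
    pieces.append("".join(slots[-1]))
    return "".join(pieces)


print(insert_extras(["genteel", "predator", "crickets", "frustrates"], 2, 2))
```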

You may add additional functionality to your program if you wish, but these arguments must be available and behave exactly as specified in this project description. You may handle errors however you see fit. For example, the following invocation has an error because it asks to capitalize 10 words when the default password contains only 4; you may choose to display an error message, or generate a "best-effort" password.

$ ./xkcdpwgen -c 10
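
Both options can be sketched in a few lines of Python. The function name resolve_caps and its strict flag are hypothetical; either behavior is acceptable for this project.

```python
import sys


def resolve_caps(caps, words, strict=False):
    """Return a usable caps count, or exit with an error message when strict."""
    if 0 <= caps <= words:
        return caps
    if strict:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(f"xkcdpwgen: cannot capitalize {caps} of {words} words")
    # Best effort: capitalize as many words as exist, and never fewer than zero.
    return max(0, min(caps, words))


print(resolve_caps(10, 4))
```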

Packaging Your Submission for Part 2

Because you are allowed to program in whatever language you wish, we require that all students submit a Makefile. If you choose to use a compiled language, you must turn in your source code, and the Makefile must compile your program. For example, if you write your program in C/C++, the final product of the Makefile should be a program called xkcdpwgen.

If you choose to program in a compiled language that does not produce executable binaries (e.g. the Java compiler produces .class files), then you must include a shell script with your submission named xkcdpwgen that can (1) invoke your program and (2) forward any given command line arguments to your program. You must also include a Makefile that transforms your source code into compiled files (e.g. .java files into .class files).

If you choose to use a language that does not need compilation (e.g. Python, Perl), you may leave your Makefile blank. We encourage students who choose to program in scripting languages to adopt shebang syntax and submit an executable script named xkcdpwgen.

Submitting Your Project

Before turning in the project, you must register yourself for our grading system using the following command:

$ /course/cs2550sp19/bin/register-student [NUID]

NUID is your Northeastern ID number, including any leading zeroes. This command is available on all of the Khoury College lab machines.

The exact files that you submit for this assignment will vary depending on the programming language you choose to use for part 2. At a minimum, you will probably submit:

A cracked.txt file containing the cracked passwords
A Makefile, which may be empty
The source code for your password generation program
A wordlist file that is used by your password generation program

You submit your project by running the turn-in script as follows:

$ /course/cs2550sp19/bin/turnin project2 <project directory>

where <project directory> is the name of the directory with your submission. The script will print out every file that you are submitting, so make sure that it prints out all of the files you wish to submit! The turn-in script will not accept submissions that are missing a Makefile or cracked.txt. You may submit as many times as you wish; only the last submission will be graded, and the time of the last submission will determine whether your assignment is late.

At any time, you can run the following command to see all of your current grades for projects, essays, quizzes, and tests.

$ /course/cs2550sp19/bin/gradesheet

Grading

This project is worth 10% of your final grade, broken down as follows (out of 100):

40 points - cracking all the passwords in the leaked /etc/shadow file
10 points - turning in a password generation program that successfully compiles (if necessary) and runs on the command line, regardless of correctness
30 points - turning in a password generation program that has the correct default behavior, i.e. generates random passwords that are four words long
5 points each - correct support for the words, caps, numbers, and symbols arguments

Points can be lost for turning in files in incorrect formats (e.g. not ASCII), failing to follow specified formatting or naming conventions, failing to compile, failing to follow specified command line syntax, insufficient or incorrect randomization, etc.

Tips

Cracking passwords can take days, so start part 1 of the project as soon as possible!
How hard is the xkcdpwgen program to write? My reference implementation is 46 lines of Python, so not too bad.
If you've never written a command line driven program before, the first step is figuring out how to read command line arguments in your language of choice. All languages have this capability, although it's not always named the same thing. In C/C++, the command line is available as the argc and argv variables passed to main(). In Python, the sys module holds the command line arguments in the sys.argv variable. Take the time to look at some examples of command line parsing in your language of choice.
If you're not sure you've implemented the command line of your program correctly, have one of your friends test it out. Alternatively, post some example command lines to Piazza; we'll be happy to tell you if the formatting or behavior is incorrect.
If you're using a compiled language, triple check that your code compiles and that your Makefile is free of errors before you submit. Make sure to test your build on a Khoury College machine.