Your challenge: Create a directory called shell_fu and, inside it, create a file called 10_most_freq.txt.

10_most_freq.txt should contain the 10 most frequent words that appear in the book "The Dunwich Horror" by H. P. Lovecraft, which is available from the following URL:

    http://www.gutenberg.org/cache/epub/50133/pg50133.txt

You can only use shell commands to solve this challenge. Below, commands are shown inside double quotes (""). Type the command into the shell, but do not type the surrounding quotes.

If you are stuck, try out some of the examples below to see if you can understand each basic concept first. Then try to combine them to solve the task.

Shell concepts:

"> filename" sends the output of a program to a file called filename, e.g. "ls > filelist" (filelist will be a file containing a list of the files in your current directory)

"|" (called a pipe) takes the output of one program and feeds it to the input of another program, e.g. "ls | wc -l" sends the output of "ls" to another program, "wc -l", which counts the number of lines

"\n" stands for the newline character

Useful shell and Linux programs:

mkdir: creates a directory, e.g. "mkdir shell_fu"

cd: changes the working directory, e.g. "cd shell_fu"

pwd: prints your current working directory (Print Working Directory, or pwd for short)

curl: downloads a file from a URL and prints it; use > to save it, e.g. "curl http://www.gutenberg.org/cache/epub/50133/pg50133.txt > pg50133.txt"

cat: prints out a file or any input given to it, e.g. "cat instructions.txt"

less: shows long output one page at a time, pausing after the first page instead of scrolling past it, e.g. "less instructions.txt"

ls: lists all the files in the working directory

wc: counts the number of characters, words and lines. Select which you want using an option, e.g.
    "cat pg50133.txt | wc -l" prints the number of lines in pg50133.txt
    "cat pg50133.txt | wc -w" prints the number of words in pg50133.txt

sort: sorts the input and prints the lines in sorted order, e.g.
    "sort -n" sorts numerically
    "sort -r" sorts in reverse order
    "sort -nr" does both

tr: changes one character to another, or deletes characters, e.g.
    "tr ' ' '\n'" changes all spaces to newlines
    "tr -d '\r'" deletes all carriage returns (the invisible "\r" characters that some files place before each newline)
    "tr -d '[:cntrl:]'" deletes all invisible control characters (note that this includes the newline character itself)

grep: searches the input or a file for a pattern and prints the lines that match, e.g.
    "grep ." prints out only lines that have at least one character of any type
    "grep -i wilbur filename" prints out all the lines that contain Wilbur, or wilbur, or WiLbUr

uniq: collapses duplicate lines that are next to each other and prints only one of them, so sort the input first if the duplicates are scattered, e.g. "uniq -c" removes duplicates but also counts them, printing the count in front of each line

head -n: prints out the top n lines of the input, e.g. "cat most_freq.txt | head -n 10" prints the top 10 lines of most_freq.txt

cp file1 file2: copies a file (file1) to another file (file2)

mv file1 file2: moves (renames) a file (file1) to another file (file2), so the original name (file1) no longer exists

man: shows the manual for a program or concept, e.g. run "man ascii" to read about how text is represented

For each program you can find out more by running it with "--help", e.g. "sort --help"

The output of this shell challenge should look like this:

If you run "cat 10_most_freq.txt" to view the contents of that file, it should show:

    1387 the
    720 of
    668 and
    442 to
    416 a
    323 in
    222 was
    200 that
    173 he
    164 with

If you get the following instead:

    1272 the
    658 of
    596 and
    498
    413 to
    383 a
    307 in
    253
    198 was
    189 that

then you are almost there. The counts with no visible word next to them (498 and 253) come from invisible characters and empty lines, which we want to get rid of. Just remove '\r' from the file (using "tr -d '\r'") and ignore empty lines by printing only those lines that have at least one character (using "grep .").