Your challenge:

Create a directory called shell_fu and inside it create a file
called 10_most_freq.txt

10_most_freq.txt should contain the 10 most frequent words that
appear in the book "The Dunwich Horror" by H. P. Lovecraft which
is available from the following URL:

http://www.gutenberg.org/cache/epub/50133/pg50133.txt

You can only use shell commands to solve this challenge. 

In the examples below, commands are shown inside double quotes ("").
Type the command on the shell, but leave out the surrounding "s.

If you are stuck, try out some of the examples below to see if you
can understand each basic concept first.  Then try to combine them
together to solve the task.

Shell concepts:

"> filename" which sends the output of a program to a file called filename (overwriting that file if it already exists)
      e.g. "ls > filelist" (filelist will be a file containing a list of files in your current directory)

"|" which takes the output of one program and feeds it to the input of another program (called a pipe)
      e.g. "ls | wc -l" will send the output of "ls" to another program "wc -l" which counts the number of lines

"\n" which stands for the newline character

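To see the three concepts working together, here is a small sketch you can try. The file name numbers.txt and the temporary directory are made up for this illustration and are not part of the challenge.

```shell
# Work in a throwaway directory so no real files are touched (assumed setup).
demo_dir=$(mktemp -d)

# printf turns each "\n" into a real newline, so this writes three lines.
# The ">" sends that output into a file instead of the screen.
printf 'one\ntwo\nthree\n' > "$demo_dir/numbers.txt"

# The pipe feeds the file contents to "wc -l", which counts the lines.
line_count=$(cat "$demo_dir/numbers.txt" | wc -l)
echo "$line_count"
```

The count should be three, one for each "\n" that printf wrote.
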
Useful shell and Linux programs:

mkdir: creates a directory
      e.g. "mkdir shell_fu"
cd: changes the working directory
      e.g. "cd shell_fu"
pwd: prints your current working directory (Print Working Directory, or pwd for short)
curl: downloads a file from a URL and prints it to the screen; use > to save it to a file
      e.g. "curl http://www.gutenberg.org/cache/epub/50133/pg50133.txt > pg50133.txt"
cat: prints out a file or any input given to it
      e.g. "cat instructions.txt"
less: shows a file one page at a time when it is longer than one page (press space for the next page, q to quit)
      e.g. "less instructions.txt"
ls: lists all the files in the working directory
wc: counts the number of characters, words and lines. Select which you want using an option.
      e.g. "cat pg50133.txt | wc -l" prints the number of lines in pg50133.txt
           "cat pg50133.txt | wc -w" prints the number of words in pg50133.txt
sort: sorts the lines of the input and prints them in sorted order
      e.g. "sort -n" sorts numerically. "sort -r" sorts in reverse order. "sort -nr" does both
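tr: changes one character to another, or deletes characters with -d
      e.g. "tr ' ' '\n'" changes all spaces to newlines
           "tr -d '\r'" deletes all carriage returns ('\r', an invisible character found at the end of lines in some files)
           "tr -d '[:cntrl:]'" deletes all invisible control characters (this includes newlines and tabs)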
grep: searches the input or a file for a pattern and prints the lines that match
      e.g. "grep ." prints out only lines that have at least one character of any type
           "grep -i wilbur filename" prints out all the lines that contain Wilbur, or wilbur, or WiLbUr.
uniq: collapses repeated lines that are next to each other into one, so sort the input first to catch all duplicates
      e.g. "uniq -c" removes duplicates and also prints how many times each line occurred
head -n: prints out the first n lines of the input
      e.g. "cat most_freq.txt | head -10" prints the first 10 lines of most_freq.txt.
cp file1 file2: copies a file file1 to another file file2
mv file1 file2: moves (renames) file1 to file2, so the original name (file1) no longer exists
man: shows the manual page for a program or concept
      e.g. "man ascii" explains how text characters are represented
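
One detail that is easy to miss is that uniq only compares lines that sit next to each other. The sketch below uses a made-up three-line sample (not the book) to show the difference that sorting first can make.

```shell
# Made-up sample: "b" appears twice, but not on neighbouring lines.
# Without sorting, uniq finds no adjacent duplicates and keeps all three lines.
unsorted=$(printf 'b\na\nb\n' | uniq -c | wc -l)

# Sorting first puts the two "b" lines next to each other,
# so "uniq -c" merges them into a single counted line.
sorted=$(printf 'b\na\nb\n' | sort | uniq -c | wc -l)

echo "without sort: $unsorted lines, with sort: $sorted lines"
```

Drop the final "| wc -l" from either command to look at the counts that "uniq -c" puts in front of each line.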

For each program you can find out more by running it with "--help", e.g. "sort --help"

When you are done, running "cat 10_most_freq.txt" to view the contents of
that file should show:

1387 the
 720 of
 668 and
 442 to
 416 a
 323 in
 222 was
 200 that
 173 he
 164 with

If you get the following instead:

1272 the
 658 of
 596 and
 498 
 413 to
 383 a
 307 in
 253 
 198 was
 189 that

Then you are almost there. We want to get rid of invisible characters
and empty lines. Just remove '\r' from the file (using "tr -d '\r'")
and ignore empty lines by printing only those lines that have at
least one character (using "grep .").
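
If you want to see why both steps are needed, here is a small sketch on a made-up sample (not the book) in which every line ends in '\r' followed by a newline and the middle line is otherwise empty.

```shell
# "grep ." on its own keeps all three lines, because the invisible "\r"
# still counts as a character on the otherwise empty middle line.
before=$(printf 'the\r\n\r\nof\r\n' | grep . | wc -l)

# Deleting "\r" first leaves a truly empty middle line, which "grep ." drops.
after=$(printf 'the\r\n\r\nof\r\n' | tr -d '\r' | grep . | wc -l)

echo "lines kept without tr: $before, with tr: $after"
```

The order matters here: the sample suggests running "tr -d '\r'" before "grep ." so that lines holding only a '\r' become empty and are filtered out.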