Due Monday, September 20 at 10pm
This first lab should get you up to speed working with the command line, basic shell commands, an editor, and a small bash program.
Log in to the Thayer plank server (plank.thayer.dartmouth.edu) with your NetID and set up lab assignments, if you have not already:
[MacBook ~]$ ssh cs50
[plank ~]$ mkdir -p cs50-dev/labs
[plank ~]$ chmod go-rwx cs50-dev
[plank ~]$ cd cs50-dev/labs
These commands create the directory ~/cs50-dev/labs, remove read, write, and execute permissions on cs50-dev from the group and other users (i.e., prevent others from peeking at your work), and change the working directory to labs so you’re ready to start.
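To see the effect of these commands without touching your real home directory, the following sketch repeats them inside a throwaway directory and lists the result. The `demo_home` variable is an assumption made for illustration; on plank you would run the commands directly in your home directory as shown above.

```bash
# Sketch only: demo_home is a hypothetical stand-in for your home directory.
demo_home=$(mktemp -d)

mkdir -p "$demo_home/cs50-dev/labs"   # create cs50-dev and labs in one step
chmod go-rwx "$demo_home/cs50-dev"    # remove all group and other permissions

# List the directory itself (-d) in long format (-l) to inspect its mode.
ls -ld "$demo_home/cs50-dev"
```

If the permissions are set as intended, the mode string printed by ls -ld begins with drwx------, meaning only the owner can read, write, or enter the directory.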
Clone the starter kit: visit GitHub Classroom, accept the assignment, and clone the repository to your labs directory. It will look something like this, assuming your GitHub username is XXXXX:
$ git clone https://github.com/Dartmouth-CS50-Fall2021-Prioleau/lab-1-XXXXX
Cloning into 'lab-1-XXXXX'...
The clone step will create a new directory named after the repository (lab-1-XXXXX in the example above).
If you would prefer to work out the initial solutions on your laptop, run the above git clone command on your laptop (without logging in to the CS servers via ssh). Later, use scp to copy your solutions back to your Linux account, test them there, and then submit them from there.
First download a spreadsheet from the CDC and save it as vaccine.csv. You can use the following command to do both in one step:
wget -O vaccine.csv 'https://data.cdc.gov/api/views/8xkx-amqh/rows.csv?accessType=DOWNLOAD'
In the above line, the wget command fetches a file at a given URL. The -O option (an uppercase letter O) specifies the file name to save it as.
vaccine.csv is a comma-separated value (CSV) file from the Centers for Disease Control and Prevention (CDC) on COVID-19 vaccine administration presented at the county level. The CDC database is updated daily. Further description of the dataset can be found here.
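Before working with a downloaded CSV file, it can help to confirm that the download succeeded and to look at its header line. The sketch below does this on a tiny stand-in file; the rows are invented for illustration, the column list is a shortened assumption, and the `csv` variable is hypothetical. On the real file you would point the same commands at vaccine.csv.

```bash
# Stand-in for the real download: a tiny CSV with made-up rows.
csv=$(mktemp)
printf '%s\n' \
  'Date,Recip_County,Recip_State,Series_Complete_Pop_Pct' \
  '08/23/2021,Example County,NH,55.5' \
  '08/22/2021,Example County,NH,55.4' > "$csv"

# -s is true when the file exists and is not empty.
[ -s "$csv" ] && echo "file is non-empty"

head -n 1 "$csv"   # the header line, which names the data fields
wc -l < "$csv"     # total number of lines, header included
```

An empty or very short file usually means the download failed and is worth repeating before moving on.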
A. Write a single bash command or pipeline to print only the lines for the state of New Hampshire in the month of August. The output should not contain the current first line, which lists the names of data fields. (5 points)
B. Write a single bash command or pipeline to print only the county (Recip_County), state (Recip_State), and percentage of fully vaccinated people (Series_Complete_Pop_Pct) columns. The output should be comma separated and should not contain the current first line, which lists the names of data fields. (5 points)
C. Write a single bash command or pipeline to print only the lines from August 11 to August 13, including the data on August 11. (5 points)
D. Write a single bash command or pipeline to print the counties with zero percent of fully vaccinated people in the state of California. Note that the latest date will have the cumulative data. (10 points)
E. Write a single bash command or pipeline to print the number of counties with zero percent of fully vaccinated people in each state. Present this in decreasing order based on the number of counties. Each line of the output should contain the number of counties with zero percent of fully vaccinated people and the state name. Note that the latest date will have the cumulative data. (10 points)
F. Write a single bash command or pipeline to print the counties with the top-10 highest percentage of fully vaccinated people based on the latest data. Present this in decreasing order based on the fully vaccinated percent. Each line of the output should contain the county name, the state, and percent of fully vaccinated people, each separated with a comma. Note that the latest date will have the cumulative data. (10 points)
G. Extend the previous command line to edit each output line, adding a pipe (|) symbol at the beginning and the end, and replacing the comma(s) with a pipe symbol. Copy and paste that output into your solution.md markdown file. Prepend two lines to it to create a nice table like the one below (created with the data on August 23, 2021). You should not have to edit the output of your command line; you should just add the header row. (10 points)
You can read about Markdown tables here.
|Bristol Bay Borough|AK|87.3|
|San Juan County|CO|82.7|
|Santa Cruz County|AZ|81.5|
H. Write a bash script called query.sh that takes the name of a state and outputs the number of fully vaccinated people for this state based on the latest cumulative data. It can also take a date as an additional parameter, in which case it will output the number of fully vaccinated people on that date for the specified state. (40 points)
As in questions D, E, and F, you should think about how to get the latest date.
Here are some example outputs from running the script on August 23, 2021:
$ ./query.sh
Incorrect number of arguments. Usage: ./query.sh state [date]
$ ./query.sh Hanover
Hanover state does not exist
$ ./query.sh NH
NH: 805909
$ ./query.sh NH 06/01/2021
NH: 763898
$ ./query.sh CA
CA: 25731391
$ ./query.sh CA 20-3123
This date (20-3123) does not exist for CA
Things to note:
- Your script should have a brief header comment giving the script name, your name, the date, and a short summary of how someone can/should use the script.
- Your script should print an error message and exit non-zero if the number of arguments is less than 1 or greater than 2.
- Your script should print an error message and exit non-zero if vaccine.csv is not an existing, readable file.
- Your script should print an error message and exit non-zero if it does not find the state specified by the first parameter.
- Your script should print an error message and exit non-zero if it does not find the date specified by the second parameter.
- Otherwise, your script should exit with a zero status.
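The checks in the list above follow a common shell pattern: test a condition, print a message to standard error, and stop with a non-zero status. The sketch below shows that pattern for the first two checks only. The function name `validate`, the variable `datafile`, and the return codes 1 and 2 are assumptions for illustration, not a required design.

```bash
# Sketch only: validate, datafile, and the return codes are hypothetical.
# Usage: validate DATAFILE [script arguments...]
validate() {
    datafile=$1
    shift   # the remaining arguments are the ones the user typed

    # Require one or two arguments (state, and optionally a date).
    if [ "$#" -lt 1 ] || [ "$#" -gt 2 ]; then
        echo "Incorrect number of arguments. Usage: $0 state [date]" >&2
        return 1
    fi

    # -f: exists and is a regular file; -r: readable by the current user.
    if [ ! -f "$datafile" ] || [ ! -r "$datafile" ]; then
        echo "$datafile is not an existing, readable file" >&2
        return 2
    fi

    return 0
}
```

A script might call it near the top as validate vaccine.csv "$@" || exit $? so that any failed check ends the script with the same non-zero status. Sending the messages to standard error keeps them out of any pipeline that reads the script's normal output.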
Other items, such as following delivery-related instructions: (5 points)
What to hand in, and how
You should have three files in your lab repo:
- README.md, edited to remove the instructions and add your name and username.
- solution.md with the answers to items A-G. For each, include a subsection header and show the command line but do not include the command output. This is a “Markdown” file and you should use Markdown formatting. Notably, use code blocks to format the commands, like those you see below. You can preview it with various Markdown-rendering tools (see: Markdown resources) but we will read it on GitHub.com, so make sure it looks good there.
- query.sh with the script for item H.
You should add only these three files to your repo:
git add README.md solution.md query.sh
Please do not add your .csv file; it is large and, of course, we can download our own copy.
Commit your changes:
git commit -m "your commit message"
Push your changes to GitHub:
git push
Actually, if it is your first push, git will remind you to run
git push --set-upstream origin master
Make sure you left nothing unexpected behind:
git status
If you need to make updates, repeat the add, commit, and push steps above.
You can verify that it safely uploaded by visiting your private lab repo on GitHub.
If you need to submit after the deadline …
Your commit message should say “PLEASE GRADE THIS COMMIT.” Our graders will grade the last commit made before the deadline, unless they see that message on a late commit, in which case they will grade the latest such commit made less than 72 hours after the deadline. Late commits without such a message will be ignored.
You will find some of the following commands useful; use man [cmd] to read about any command. It’s best to run man inside Linux so you are sure to get the manual for the Linux version of the command (macOS can differ).
Tools such as sed depend on regular expressions.
It is helpful to remember that ^ anchors a pattern to the start of a line and $ anchors to the end of the line.
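A small demonstration of the two anchors, using grep on a few invented lines (the `sample` variable and its contents are made up for illustration and are not taken from the dataset):

```bash
# Made-up sample lines; not real data.
sample=$(printf '%s\n' 'NH,Grafton County' 'Grafton County,NH' 'VT,Windsor County')

# ^NH matches only lines that begin with NH.
printf '%s\n' "$sample" | grep '^NH'

# NH$ matches only lines that end with NH.
printf '%s\n' "$sample" | grep 'NH$'
```

The first command selects only the line that starts with NH and the second only the line that ends with it, even though both lines contain the same two letters. The single quotes keep the shell from interpreting $ before grep sees it.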
Most Unix tools work line by line. For some problem(s), I found it helpful to translate the CSV header line into a sequence of lines, on which I could operate with other tools.
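One possible way to do such a translation is to replace each comma with a newline using tr, then number the resulting lines. The sketch below uses a shortened stand-in header held in a hypothetical `header` variable; the real file has more fields, so the positions it reports will differ. On the real file, head -n 1 vaccine.csv would supply the header instead of printf.

```bash
# Stand-in header; the real file has more fields in a different layout.
header='Date,Recip_County,Recip_State,Series_Complete_Pop_Pct'

# tr turns each comma into a newline; cat -n numbers the resulting lines,
# so each field name appears next to its column position.
printf '%s\n' "$header" | tr ',' '\n' | cat -n

# grep -n reports the position of one field of interest as NUMBER:NAME.
printf '%s\n' "$header" | tr ',' '\n' | grep -n 'Recip_State'
```

Splitting on every comma is safe for a header of plain field names, but it is an assumption that would not hold for data lines containing quoted commas.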