This first lab should get you up to speed working with the command line, basic shell commands, an editor, and a small bash program.


Log in to the Thayer plank server with your NetID and set up a directory for your work in this course, if you have not already:

[MacBook ~]$ ssh plank
[plank ~]$ mkdir -p cs50/labs/
[plank ~]$ cd cs50/labs

These commands create a directory ~/cs50/labs, prevent others from peeking at your work, and change the working directory to labs so you’re ready for the work below.
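One common way to keep other users out of a directory is to remove its group and other permissions with chmod. The following is a minimal sketch, assuming the course intends something like chmod go-rwx on the cs50 directory; the variable base is a hypothetical scratch stand-in for your home directory so the example can be tried safely.

```shell
# Sketch only: 'base' is a scratch stand-in for your home directory (~).
base=$(mktemp -d)
mkdir -p "$base/cs50/labs"    # create cs50 and cs50/labs in one step
chmod go-rwx "$base/cs50"     # remove read, write, execute for group and others
ls -ld "$base/cs50"           # the mode column should now begin with drwx------
```

On plank you would use ~/cs50 in place of "$base/cs50".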

Clone the starter kit: visit GitHub Classroom, accept the assignment, and clone the repository to your labs directory. It will look something like this, assuming your GitHub username is XXXXX:

$ git clone
Cloning into 'lab1-XXXXX'...

The clone step will create a new directory ~/cs50/labs/lab1-XXXXX.

If you would prefer to work out the initial solutions on your laptop, repeat the above git clone command on your laptop (without logging in to the Thayer servers via ssh). Later, either use scp to transfer your solutions to your Linux account, or push them to your git repo and run git pull on the Thayer servers to get the latest version. Either way, test your solutions there.

Get data

First download a spreadsheet from

and save it as vaccine.csv. You can also use the following command to do both in one step:

wget -O vaccine.csv

Here, the wget command fetches the file at the given URL, and the -O option (an uppercase letter O) specifies the name of the file to save it as.

If you would like to access the file on Thayer servers, we have downloaded a copy into our shared workspace. Once you have logged in to the plank server, you can see it listed as one of the files:

$ ls /thayerfs/courses/22fall/cosc050/workspace/
activities/  aliases.txt  passwd  tse/  vaccine.csv  webpage/

Please do not make copies of the vaccine.csv file in your own home directory on Thayer servers. It is a very large file (401 MB as of August 11, 2022) and can cause quota issues for your Linux account.
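Instead of copying the file, you can read it in place by keeping its path in a shell variable. This is a minimal sketch; the variable name data is just an illustrative choice, and the path is the shared-workspace path listed above.

```shell
# 'data' is a hypothetical variable name; the path is the shared copy on Thayer servers.
data=/thayerfs/courses/22fall/cosc050/workspace/vaccine.csv
if [ -r "$data" ]; then
    head -n 2 "$data"    # read the shared file in place, without copying it
else
    echo "cannot read $data (are you logged in to a Thayer server?)" >&2
fi
```

Any later command can then take "$data" as its input file, so nothing counts against your quota.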

vaccine.csv is a comma-separated value (CSV) file released by the Centers for Disease Control and Prevention (CDC). It provides COVID-19 vaccine administration data at the county level and is updated daily. A fuller description of the dataset can be found at this link.

You can see the headers and the first few lines of the vaccine.csv file with:

$ head /thayerfs/courses/22fall/cosc050/workspace/vaccine.csv
08/03/2022,54025,31,Greenbrier County,WV,97.2,21686,62.6,21644,65.8,21229,70.2,20194,72.4,7477,91.9,19292,55.7,19281,58.6,1292,25.9,18931,62.6,17989,64.5,6817,83.8,9613,49.8,9611,49.8,9584,50.6,9377,52.1,7256,62,4748,69.6,C,10,10,9,10,10,12,Non-metro,6,6,5,6,6,8,11,12,12,11,7,8,8,7,34662,32879,4979,30238,27900,8136
08/03/2022,17043,31,DuPage County,IL,98.3,776606,84.1,771299,88.7,724164,91.8,662581,92.6,152022,95,718027,77.8,716794,82.5,101098,65.7,673654,85.4,615696,86.1,143405,95,427051,59.5,427043,59.6,421572,62.6,399296,64.9,237414,76.3,122053,85.1,A,3,4,4,4,4,4,Metro,3,4,4,4,4,4,4,4,4,4,4,4,4,4,922921,869134,153791,789146,715343,148998
08/03/2022,01017,31,Chambers County,AL,92.5,12875,38.7,12867,41,12689,44.3,12137,46.1,4038,60.1,10695,32.2,10695,34.1,581,11.5,10562,36.9,10114,38.4,3487,51.9,4159,38.9,4159,38.9,4143,39.2,4092,40.5,3395,50.9,2070,59.4,C,9,9,9,9,9,10,Non-metro,5,5,5,5,5,6,10,10,11,10,6,6,7,6,33254,31372,5031,28624,26341,6715
08/03/2022,36007,31,Broome County,NY,97.6,132779,69.7,132528,73.4,127951,76.9,120272,78.3,35130,95,122717,64.4,122709,67.9,11304,42,118587,71.3,111405,72.5,32839,88.8,70227,57.2,70226,57.2,69990,59,67876,60.9,45160,72.3,25831,78.7,C,10,11,11,11,11,12,Metro,2,3,3,3,3,4,12,12,12,11,4,4,4,3,190488,180602,26919,166375,153683,36980
08/03/2022,16023,31,Butte County,ID,97.7,1424,54.8,1424,58.1,1395,63.3,1326,67.4,466,75.9,1295,49.9,1295,52.9,89,18.4,1269,57.6,1206,61.3,444,72.3,509,39.3,509,39.3,507,40,494,41,393,50.7,259,58.3,A,1,2,1,2,2,3,Metro,1,2,1,2,2,3,2,3,3,2,2,3,3,2,2597,2450,484,2204,1966,614
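The head and tail commands can also separate a header line from the data lines. The sketch below uses a small toy file; its field names are illustrative assumptions, not the real vaccine.csv header.

```shell
# Toy CSV with a header line; the field names are made up for illustration.
toy=$(mktemp)
printf '%s\n' 'Date,County,State' \
              '08/03/2022,Butte County,ID' \
              '08/03/2022,Broome County,NY' > "$toy"
head -n 1 "$toy"     # only the first line (the header)
tail -n +2 "$toy"    # everything from line 2 onward (the data without the header)
```

Note that tail -n +2 means "start at line 2", which is different from tail -n 2 ("the last 2 lines").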


Edit to add your name, add your GitHub username and provide your code for questions A-G below. For each question, include a subsection header and show the command line solution you developed. (Do not include the command output.) This is a “Markdown” file and you should use Markdown formatting. Notably, use code blocks to format the commands, like those you see below. You can preview it with various Markdown-rendering tools (see: Markdown resources) but we will read it on, so make sure it looks good there.


A. Write a single bash command or pipeline to print only the lines for the state of New Hampshire in the month of February 2022. The output should not contain the current first line, which lists the names of data fields.

B. Write a single bash command or pipeline to print only the county (Recip_County), state (Recip_State), and percentage of fully vaccinated people (Series_Complete_Pop_Pct) columns, separated by commas. The output should not contain the current first line, which lists the names of data fields.

C. Write a single bash command or pipeline to print only the lines from Feb. 15, 2022 to Feb. 17, 2022 (including all of the data on Feb. 15).

D. Write a single bash command or pipeline to print the counties in the state of California where at least 80% of the population is fully vaccinated so far. List each county only one time.

E. Write a single bash command or pipeline to print, for each state, the number of counties where at least 80% of the population is fully vaccinated so far, in decreasing order of the number of counties. Each line of output should contain the number of such counties and the state name.

F. Write a single bash command or pipeline to print the 20 counties with the highest percentage of fully vaccinated population based on the latest data, in decreasing order of fully vaccinated percentage. Each line of output should contain the county name, the state, and the fully vaccinated percentage, separated by commas.

G. Extend that command line to edit each output line, adding a pipe (|) symbol at the beginning and the end, and replacing the comma(s) with a pipe symbol. If you copy-paste that output into a Markdown file and prepend the first two lines (which do not need to be generated from your command), it is turned into a nice table, like this one (based on the data set updated on August 3, 2022):

| County | State | Fully-Vaccinated Percentage |
| --- | --- | --- |
| Webb County | TX | 95 |
| Starr County | TX | 95 |
| Santa Cruz County | AZ | 95 |
| San Juan County | CO | 95 |
| Maverick County | TX | 95 |
| Lares Municipio | PR | 95 |
| Irion County | TX | 95 |
| Imperial County | CA | 95 |
| Culebra Municipio | PR | 95 |
| Chattahoochee County | GA | 95 |
| Bristol Bay Borough | AK | 95 |
| Apache County | AZ | 95 |
| Teton County | WY | 93.9 |
| Presidio County | TX | 93.8 |
| McKinley County | NM | 93.2 |
| Bayamon Municipio | PR | 92.6 |
| Guaynabo Municipio | PR | 92.2 |
| Aibonito Municipio | PR | 92.2 |
| Brooks County | TX | 91.7 |
| Big Horn County | MT | 90.8 |

You do not have to edit the output of your command line - you would just add the header row. Read about Markdown, and about Markdown tables.

H. Write a bash script called that takes the name of a state and outputs the number of fully vaccinated people (Column ‘Series_Complete_Yes’) for this state based on the latest cumulative data. It can also take date as an additional parameter, in which case it will output the number of fully vaccinated people on that date for the specified state. Here are some example outputs by running the script on Aug. 3, 2022:

$ ./
Incorrect number of arguments. Usage: ./ state [date]
$ ./ Hanover
Hanover state does not exist
$ ./ CA 2052x-xew
Date 2052x-xew does not exist
$ ./ NH 
NH: 979163
$ ./ CA
CA: 28924481
$ ./  NH 03/25/2022
NH: 925430

Hint: as in questions D, E, and F, you need to think about how to get the latest date.

  • Your script should print an error and exit non-zero if the number of arguments is less than 1 or greater than 2.
  • Your script should print an error and exit non-zero if vaccine.csv is not an existing, readable file (your script should directly access the /thayerfs/courses/22fall/cosc050/workspace/vaccine.csv file if it runs on Thayer servers).
  • Your script should print an error and exit non-zero if it does not find the state specified by the first parameter.
  • Your script should print an error and exit non-zero if it does not find the date specified by the second parameter.
  • Otherwise, your script should exit with zero status.
  • Your script should have a brief header comment giving the script name, your name, the date, and a short summary of how someone can/should use the script.
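The first two checks in the list above could be sketched as follows. This is only an illustration of the validation pattern, not a solution: the function name check_inputs, the variable datafile, the wording of the second message, and the choice to write errors to stderr are all assumptions, and the state and date lookups are left out.

```shell
# Sketch only: validate the argument count and the data file before doing any work.
# 'check_inputs' and 'datafile' are hypothetical names used for illustration.
datafile="vaccine.csv"

check_inputs() {
    # Accept exactly one or two arguments: state [date]
    if [ "$#" -lt 1 ] || [ "$#" -gt 2 ]; then
        echo "Incorrect number of arguments. Usage: $0 state [date]" >&2
        return 1
    fi
    # The data file must exist as a regular file and be readable
    if [ ! -f "$datafile" ] || [ ! -r "$datafile" ]; then
        echo "$datafile is not a readable file" >&2
        return 2
    fi
    return 0
}
```

Inside a script, one way to use it would be a line such as check_inputs "$@" || exit $? near the top, so that any failed check ends the script with a non-zero status.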

What to hand in, and how

You should have two files in your lab1-XXXXX directory:

  • edit as described above for questions A-G.

  • write with the script for question H.

You should add only these two files to your repo:

git add

Please do not add vaccine.csv; it is large and, of course, we can download our own copy.

Commit your changes:

git commit -m "your commit message"

Push your changes to GitHub:

git push

If it is your first push, git will remind you to run

git push --set-upstream origin main

Make sure you left nothing unexpected behind:

git status

If you need to make updates, repeat the add, commit, push sequence.

You can verify that your work was uploaded safely by visiting GitHub.

When you are satisfied with your solutions, follow these instructions to create a separate submit1 branch in Git for the code you are submitting. You can continue to edit your solution, but changes to the main branch will not change the submit1 branch. The graders will grade the submit1 branch and will pay attention to the date.

git commit                 # commit all the files necessary for this lab
git push origin main       # push branch 'main' to the remote 'origin' on GitHub
git branch submit1         # create a new branch 'submit1'
git push origin submit1    # push branch 'submit1' to the remote
git switch main            # switch back to the main branch (important)

See the lab-submission instructions page for more information. (We will discuss Git and GitHub in more detail soon!)

If you need to submit after the deadline …

See the lab-submission instructions for details of how to submit your lab late.


You will find some of the following commands useful; use man cmd to read about any command. It’s best to run man inside Linux so you are sure to get the manual for the Linux version of the command (the macOS version can differ).

  • less
  • cut
  • head
  • tail
  • grep (note -n)
  • wget
  • sort
  • uniq
  • tr
  • sed
  • wc (note -l)
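Several of these tools are often chained together. As a hedged illustration, here is the common sort, uniq -c, sort -rn idiom for counting how many times each line occurs, run on made-up input rather than the real data.

```shell
# Toy input: one state abbreviation per line (made-up values, not real results).
printf '%s\n' TX CA TX NH CA TX |
    sort |       # group identical lines together; uniq only collapses adjacent duplicates
    uniq -c |    # prefix each distinct line with its number of occurrences
    sort -rn     # order numerically by that count, largest first
```

The output lists TX with a count of 3, then CA with 2, then NH with 1; uniq -c pads each count with leading spaces.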

grep and sed depend on regular expressions. It is helpful to remember that ^ anchors a pattern to the start of a line and $ anchors to the end of the line.
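A short sketch of the two anchors in action, using a toy file whose layout is an assumption made for illustration (it is not the layout of vaccine.csv):

```shell
# Toy data: three comma-separated lines in a scratch file.
toy=$(mktemp)
printf '%s\n' 'NH,Grafton County,70.1' \
              'VT,Windsor County,NH' \
              'NH,Coos County,55.0' > "$toy"

grep '^NH,' "$toy"     # lines that START with "NH,": the first and third lines
grep ',NH$' "$toy"     # lines that END with ",NH": only the second line
sed 's/$/;/' "$toy"    # append ";" to the end of every line
```

Without the anchors, grep 'NH' would match all three lines, because the pattern could appear anywhere in the line.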

Most Unix tools work line-by-line. For some problem(s) I found it helpful to translate the csv header line into a sequence of lines, on which I could operate with other tools.
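One way to do that translation is with tr, after which grep -n reports each field's position. The sketch below uses a shortened toy header; the field names come from the questions above, but their order and count here are assumptions and do not match the real file.

```shell
# Toy header: the real vaccine.csv header has many more fields, in a different order.
header='Date,Recip_County,Recip_State,Series_Complete_Pop_Pct'

# Turn the comma-separated header into one field name per line,
# then let grep -n report which line (that is, which column) matches.
echo "$header" | tr ',' '\n' | grep -n 'Recip_State'
# → 3:Recip_State

# That column number can then be handed to cut (toy data row):
echo '08/03/2022,Butte County,ID,54.8' | cut -d, -f3
# → ID
```

This approach treats every comma as a field separator, so it would miscount if a field ever contained a quoted comma.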

Markdown example

Read about Markdown, and about Markdown tables. If you want to preview a Markdown file with a desktop app, you’ll have to either scp the file to your laptop, or copy-paste from your ssh terminal into an empty window in one of those apps. Another great tool is HackMD. We will view your on, so make sure it looks good there.

Here’s a quick example of a simple Markdown file.

	# this is a header
	Some normal text goes here.
	Markdown will join and wrap lines where needed.
	Use a blank line to indicate a new paragraph.
	## this is a subheader
	The following is the 'triple-ticks' notation, with optional language specifier to inform Markdown that the contents are bash.
	For best results, put a blank line between text and triple-ticks.

	```bash
	$ echo hello world
	hello world
	```

	## another subheader
	some more text.