In this lecture, we discuss the aim and schedule of the course, and take a brief look at Unix, C programming, and computer architecture.
The aim of this course is to develop the necessary systems programming skills in C and Unix as a foundation to tackle the design, implementation, and integration of a large software project working in small teams. The challenge of the course is to quickly get people up to speed so there is sufficient time to get into the details of a complex software design project. The first part of the course serves to develop design, programming and other systems skills such as source-code management, testing, and debugging. The second part of the course is all about the project and team work. Good team work will lead to success.
A note about Unix/Linux: Unix came first. All of the others, including Solaris, AIX, BSD and all the variants of Linux, derive from or are modeled on that original effort. While we use Linux on the Thayer servers, you may elect to use your own copy of Linux or the Unix that Mac OSX is based on. So, unless it really matters, I will use the term Unix to refer to them all.
The syllabus in a nutshell:
- Unix: shell, commands, shell programming;
- Git: source code management;
- C: structure, arrays, pointers, dynamic memory, files;
- Programming tools: gcc (GNU compiler), make (maintain groups of programs), gdb (GNU debugger), and valgrind (profiling and memory leak checker).
- Software development methodology: design, implement, unit test/debug, integration, demo, maintenance.
- Thematic programming assignments: building a search engine (crawler, indexer, query engine).
- Team project: it’s a surprise!
The course includes six Unix shell and C programming assignments for the first part of the course; these assignments are to be done individually. The last part (about 2 weeks) is devoted to the team project. There are no lectures in the last part of the course but the projects are run with design reviews and progress meetings where the team can brainstorm problems and come up with solutions. There will be a common project goal for all teams, but students are free to develop their own ideas beyond this common goal - show your entrepreneurial side!
Please take note of the weekly readings. While these are not required or compulsory, they are highly recommended and will help your understanding and appreciation of the course material.
Go over the Logistics information.
About engagement: you will learn more from this course if you are actively engaged, where engagement = preparation + participation. Read the lecture notes and weekly readings, practice at home, and participate in classroom activities. Also, visit office hours and learn from others. The Learning Fellows are here to help with classroom activities, and the TAs are here to help outside class.
About the programming: There is a significant amount of programming in this course requiring a considerable time commitment on the part of the student. You will need to be well organized to complete all the programming assignments and project. It will be challenging, but we hope it will be fun!
About the project:
You will be assigned to a team of two to four students and given about two weeks to complete a project requiring strong collaboration and a problem-solving mindset. Each member is responsible for contributing to the overall system design, implementation, testing, integration, and documentation. The goal of this activity is to help you develop the confidence, skills, and habits necessary to write large computer programs while being part of a multi-person team. You will become conversant in software engineering paradigms, such as source code control with git and other open source tools that ease the software development process. In addition, you will develop vital skills in self-directed learning, problem solving, and communication. The project concludes with a demo and a review of your design and implementation. All members of the team get the same base grade, plus a team-contribution grade determined in part by a short evaluation form completed by all members of each project team.
Goals for today’s lecture
We plan to cover the following in today’s lecture:
- The concept of a command line;
- Logging on to a Unix machine;
- Looking at the home directory and its files;
- Copying files to/from remote machines;
- Logging out; and
- Some housekeeping business.
In the first few lectures we cover Unix, the shell, and shell programming. This is not meant to be a detailed presentation of the Unix OS and its programming tools – it would require a whole term to cover all that material in detail. We need to know enough about Unix and its tools to be able to navigate our way around the system, write some basic shell scripts, and use its programming tools.
It is important that you use these notes as a starting point, but like any budding hacker you need to do some experimenting and read up on the details. You need to go on the web and find information if there are gaps in your knowledge, and visit office hours. There are many references on the Resources page.
Caveat: Please take note that lecture notes will not always be detailed. You will need to augment these notes with your own comments and by using the references and reading assignments so you can dive deeper into the topic.
The command line
Unix was originally developed for computers with hardwired ‘terminals’, each of which was basically an electronic typewriter - a printer with a keyboard. There were no graphical displays, and the concepts of ‘windows’ and ‘mouse’ had not yet been invented. To interact with the computer, the user types on the keyboard, and the computer echoes the keystrokes on the paper. The system is controlled by typing ‘commands’, most of which print results for the user to see. The interaction occurs over the ‘command line’.
Modern Unix systems support rich graphical user interfaces, but under the hood they all support the command line. In this class, I will demonstrate everything using the MacOS command line, accessed via the MacOS application called Terminal. If you have a Mac, you can follow along. If you use Windows, consider enabling Windows Subsystem for Linux (WSL), which allows running native Linux command-line tools directly on Windows. Check out this link on how to enable this feature. But all of us can (and will) also use Thayer Linux servers.
When I open a new Terminal window on my Mac, I can type Unix commands at the ‘prompt’. This prompt is from the ‘shell’ – a program – that interprets your command line and runs a program representing each command. Your prompt may look different - the shell allows users to customize prompts and you’ll see a few formats in the examples below.
    f00xxxx@plank:~$ ls
    cs50-dev
    f00xxxx@plank:~$ echo Hello CS50. Feeling Good?
    Hello CS50. Feeling Good?
    f00xxxx@plank:~$
The ls command is the “list directory contents” command.
The echo command is like a print statement - it echoes its arguments back.
Most commands quickly produce some output and then exit.
Some will run too long - perhaps printing too much output; you can stop (kill) the command, forcing it to exit, by typing control-C (^C) at the keyboard.
One silly program, yes, just prints an infinite sequence of y characters until you kill it:
    [temi:~]$ yes
    y
    y
    y
    ^C
    [temi:~]$
Some commands ask for your input, and continue to read input until they read an “end of file” (EOF); you can cause the program to detect an EOF by typing control-D (^D) at the beginning of an input line. Below I typed a few lines of text, then ^D at the start of the next input line:
    f00xxxx@plank:~$ cat > test.txt
    Hello from CS50
    This is day 1
    I am already having fun.
    ^D
    f00xxxx@plank:~$
Here, we used the cat command to create a test file and provide content to the file. When I finished, I typed ctrl-D, so the cat program detected an end of file (EOF) on its input (i.e., the keyboard), then exited. You will see a new file test.txt has been created in the current directory with the provided input as its content.
Notice the difference between ^C and ^D; the former kills the program immediately, whereas the latter causes it to detect EOF when it next reads input from the keyboard.
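The EOF behavior can also be observed without typing at a keyboard. The following is a minimal sketch, assuming a POSIX-style shell such as bash; the function name count_lines and the sample text are made up for illustration. When input comes from a here-document or a pipe, the reading program sees EOF when that input runs out, much as cat did when I typed ^D.

```shell
# count_lines (hypothetical helper): read standard input until EOF,
# then print how many lines were read.
count_lines() {
    n=0
    while IFS= read -r line; do   # read fails at EOF, which ends the loop
        n=$((n + 1))
    done
    echo "$n"
}

# A here-document supplies two lines and then EOF, much like typing
# two lines followed by ^D at the keyboard.
count_lines <<'EOF'
Hello from CS50
This is day 1
EOF

# A pipe behaves the same way: EOF arrives once printf has finished.
printf 'one\ntwo\nthree\n' | count_lines
```

The first call prints 2 and the second prints 3. In neither case is the reader killed; it simply notices that there is no more input and finishes normally.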
Logging into a remote machine using ssh
The Thayer School of Engineering operates a set of Linux servers. For this course, the primary server we will use is called plank.thayer.dartmouth.edu; you may also use equivalent servers called babylonX.thayer.dartmouth.edu, where X is 1 through 8. (You can check the status here.) Your laptop must first be on the campus network, or on Dartmouth VPN, to access these servers. You should plan to log in to one of those servers for doing the labs.
I’ll log in from my Mac using the secure shell (ssh) Unix command. The ssh command establishes a secure channel and uses public-key cryptography to authenticate the remote computer and the user. The ssh command replaces the archaic telnet (remote communications with another computer) and rlogin (remote login) commands because they lack security. These days ssh is used almost exclusively, because your session is encrypted when it’s transmitted over the network, rather than being sent in clear text.
Assuming your NetID is f00xxxx, you will log into the server named plank using its full hostname:
    temi@~> ssh f00xxxx@plank.thayer.dartmouth.edu
    f00xxxx@plank:~$ ls
    cs50-dev  test.txt
    f00xxxx@plank:~$ logout
    logging out...
    Connection to plank.thayer.dartmouth.edu closed.
    temi@~>
Tips for speedier login
If you are a Mac or Linux user, you can reduce typing by adding the following to your laptop’s .ssh/config file (using your own NetID); then you can type ssh cs50 and only have to enter your password to log in.
    Host cs50
      Hostname plank.thayer.dartmouth.edu
      User f00xxxx
On MacOS, this file is ~/.ssh/config, and you can create or append to it by opening that file in your favorite editor.
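If you would rather script that step than edit the file by hand, the following is a minimal sketch, assuming a POSIX-style shell. The helper name add_ssh_alias is hypothetical, and the demonstration writes to a scratch file created with mktemp rather than to your real ~/.ssh/config.

```shell
# add_ssh_alias (hypothetical helper): append a Host entry to an ssh
# config file unless an entry for that alias is already there.
#   $1 = config file, $2 = alias, $3 = hostname, $4 = user
add_ssh_alias() {
    config=$1; alias_name=$2; host=$3; user=$4
    mkdir -p "$(dirname "$config")"
    touch "$config"
    chmod 600 "$config"     # keep the file readable and writable by you only
    if grep -q "^Host $alias_name\$" "$config"; then
        return 0            # already present; leave the file alone
    fi
    printf 'Host %s\n  Hostname %s\n  User %s\n' \
        "$alias_name" "$host" "$user" >> "$config"
}

# Demonstrate on a scratch file, not the real ~/.ssh/config.
demo_dir=$(mktemp -d)
add_ssh_alias "$demo_dir/config" cs50 plank.thayer.dartmouth.edu f00xxxx
cat "$demo_dir/config"
```

To apply it for real you would pass "$HOME/.ssh/config" as the first argument, with your own NetID as the user. Calling the helper twice with the same alias leaves a single entry in the file.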
Every time you log in, the server reminds you about whether you are getting close to consuming all your disk-space quota:
<< You are currently using 47.39M of your 5.00G home directory quota. >>
That’s nice, but it is slow. You can turn off this message by creating a file in your home directory on plank. Although the mere presence of this file is sufficient, put some text in the file so you can remember why it is there:
echo The presence of this file disables login notification of your disk-quota usage. > ~/.notfsquota
If you later wish to check your disk usage, you can remove this file (then logout and login), or simply run
There is also a simple web portal to check your quotas.
It is also possible to connect to a remote Linux server using X-windows, an early form of graphical user interface that has the capability to run applications on a remote server but put the windows on your local computer. I won’t be using X in this course, but you may wish to explore the option.
There are hundreds of Unix commands – but you probably only need a few dozen to get by. Each Unix command has a short abbreviated command name (e.g., LiSt directory (ls) or Secure SHell (ssh)) and an associated syntax that typically includes various arguments and options; these options (or switches, as they are also known) are typically either a single letter preceded by a hyphen (e.g., -l) or one or more words preceded by two hyphens (e.g., --verbose). For example, the format of an ssh command line looks like this:
    ssh [options] [user@]hostname [command]
In the command ssh -l f00xxxx plank.thayer.dartmouth.edu, the switch -l informs the ssh command that the username of the user logging in is f00xxxx. (Alternately, I could have used the user@hostname form.)
This is a good time to look at the formatting of these command help texts. The [ ] are used to denote optional things, like options or command, while the things outside of the [ ], like hostname, must be specified.
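To see how a command might process switches and arguments like these, here is a minimal sketch using the shell's built-in getopts. The function describe_login and its usage line are hypothetical; it mimics the shape of the ssh syntax above, with an optional -l switch that takes a value, an optional -v switch, and a required hostname, but it does not connect anywhere.

```shell
# describe_login (hypothetical): usage is
#   describe_login [-v] [-l login_name] hostname
describe_login() {
    OPTIND=1                  # reset so the function can be called repeatedly
    login=guest               # default used when -l is not given
    verbose=no
    while getopts "l:v" opt; do
        case $opt in
            l) login=$OPTARG ;;       # "l:" means -l takes an argument
            v) verbose=yes ;;
            *) echo "usage: describe_login [-v] [-l login_name] hostname" >&2
               return 2 ;;
        esac
    done
    shift $((OPTIND - 1))     # drop the switches; what remains are arguments
    if [ $# -lt 1 ]; then     # hostname is outside [ ], so it is required
        echo "usage: describe_login [-v] [-l login_name] hostname" >&2
        return 2
    fi
    echo "login=$login host=$1 verbose=$verbose"
}

describe_login -l f00xxxx plank.thayer.dartmouth.edu
describe_login plank.thayer.dartmouth.edu
```

The first call prints login=f00xxxx host=plank.thayer.dartmouth.edu verbose=no; the second omits the optional switch and falls back to the default login name.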
Getting Information using the online manual (man)
If you want the detailed syntax of a Unix command you can use the man command followed by the command name, as in man ssh, which produces the following:
    SSH(1)                    BSD General Commands Manual                   SSH(1)

    NAME
         ssh -- OpenSSH SSH client (remote login program)

    SYNOPSIS
         ssh [-1246AaCfgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
             [-D [bind_address:]port] [-e escape_char] [-F configfile] [-I pkcs11]
             [-i identity_file] [-L [bind_address:]port:host:hostport]
             [-l login_name] [-m mac_spec] [-O ctl_cmd] [-o option] [-p port]
             [-R [bind_address:]port:host:hostport] [-S ctl_path] [-W host:port]
             [-w local_tun[:remote_tun]] [user@]hostname [command]

    DESCRIPTION
         ssh (SSH client) is a program for logging into a remote machine and for
         executing commands on a remote machine. It is intended to replace rlogin
         and rsh, and provide secure encrypted communications between two
         untrusted hosts over an insecure network. X11 connections and arbitrary
         TCP ports can also be forwarded over the secure channel.

         ssh connects and logs into the specified hostname (with optional user
         name). The user must prove his/her identity to the remote machine using
         one of several methods depending on the protocol version used (see
         below).

         If command is specified, it is executed on the remote host instead of a
         login shell.

         The options are as follows:

         -1      Forces ssh to try protocol version 1 only.
         -2      Forces ssh to try protocol version 2 only.
         -4      Forces ssh to use IPv4 addresses only.
         -6      Forces ssh to use IPv6 addresses only.
         -A      Enables forwarding of the authentication agent connection.
                 This can also be specified on a per-host
    ... and a whole lot more
This is just a snippet of the man ssh output (man is short for manual). The manual output includes all the nitty gritty details on options and about the command. For most commands you can use the common option --help (two hyphens) to get a brief breakdown of the command and its switches. This doesn’t work for all commands (including ssh, interestingly), but in that case the use of --help is interpreted as an invalid entry by ssh and it lists the options anyway.
You can use man -k keyword to search through the manual pages for matches on a keyword. For example:
    [f00xxxx@plank ~]$ man -k shell
    bash (1)             - GNU Bourne-Again SHell
    capsh (1)            - capability shell wrapper
    chroot (1)           - run command or interactive shell with special root directory
    chsh (1)             - change your login shell
    CPAN::Admin (3pm)    - A CPAN Shell for CPAN admins
    CPAN::Plugin (3pm)   - Base class for CPAN shell extensions
    csh (1)              - C shell with file name completion and command line editing
    ... and much more
Your home directory and its files
Each user has a ‘home directory’. After you have logged in using ssh you are in your home directory - that is, the shell’s notion of your ‘current working directory’ is your home directory.
We can look at our home directory ‘path’ using the pwd (print working directory) command. You can always use the man or info commands to get more information.
    f00xxxx@plank:~$ pwd
    /thayerfs/home/f00xxxx
    f00xxxx@plank:~$
The tilde (~) above is shorthand for ‘home’. Let’s take a look at the contents of my home directory (using the -l switch which means long format):
    f00xxxx@plank:~$ ls
    cs50-dev  test.txt
    f00xxxx@plank:~$ cs cs50-dev/
    Command 'cs' not found
    f00xxxx@plank:~$ cd cs50-dev/
    f00xxxx@plank:~/cs50-dev$ ls
    class_activities  dotfiles  examples  play  README.md  setup
    f00xxxx@plank:~/cs50-dev$ ls dotfiles/
    macos  plank
    f00xxxx@plank:~/cs50-dev$ ls -l
    total 167
    drwxr-sr-x 2 f00xxxx thayerusers   69 Jan 11  2021 class_activities
    drwxr-sr-x 4 f00xxxx thayerusers   74 Jan  9  2021 dotfiles
    drwxr-sr-x 7 f00xxxx thayerusers 1730 Feb  3  2021 examples
    drwxr-sr-x 2 f00xxxx thayerusers  154 Jan 27  2021 play
    -rw-r--r-- 1 f00xxxx thayerusers 6941 Jan  9  2021 README.md
    drwxr-sr-x 2 f00xxxx thayerusers  152 Jan  9  2021 setup
    f00xxxx@plank:~/cs50-dev$
Yes, I mistyped a command in that example: cs cs50-dev/. I wanted you to see what the system says when you make a mistake … just in case you make one yourself sometime. ;-)
The cd command changes the current working directory; I finally typed it right and the shell prompt shows that my working directory is ~/cs50-dev. The command ls -l then listed the content of that directory a second time, this time in long format.
Files can be plain files, directories, or special files (more later). We can see that each file has file permissions and other data associated with it; for example, the directory
drwxr-sr-x 4 f00xxxx thayerusers 74 Jan 9 2021 dotfiles
You can traverse directory trees assuming you have the appropriate permission.
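The mode string at the start of each ls -l line can be read mechanically. The following is a minimal sketch, assuming a POSIX-style shell; describe_mode is a hypothetical helper, not a standard command. It treats the first character as the file type and the next nine characters as three permission triplets for the owner, the group, and everyone else.

```shell
# describe_mode (hypothetical helper): interpret an `ls -l` mode string
# such as drwxr-sr-x or -rw-r--r--.
describe_mode() {
    mode=$1
    case $mode in
        d*) kind="directory" ;;
        -*) kind="plain file" ;;
        l*) kind="symbolic link" ;;
        *)  kind="special file" ;;
    esac
    owner=$(printf '%s' "$mode" | cut -c2-4)    # characters 2-4
    group=$(printf '%s' "$mode" | cut -c5-7)    # characters 5-7
    other=$(printf '%s' "$mode" | cut -c8-10)   # characters 8-10
    echo "$kind owner=$owner group=$group other=$other"
}

describe_mode drwxr-sr-x
describe_mode -rw-r--r--
```

For the dotfiles entry this prints directory owner=rwx group=r-s other=r-x. Within each triplet, r, w, and x stand for read, write, and execute (for a directory, x means permission to traverse it); the s in the group triplet is the set-group-ID bit combined with execute.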
Unix supports a number of shells (command line interpreters). If we use the echo command we can look at the environment variable that tells us which shell is running. For this course we will use the bash (Bourne Again SHell) shell.
Again, the shell is the command processor for Unix systems. To find out which shell you’re running, try this:
    f00xxxx@plank:~$ echo $SHELL
    /bin/bash
    f00xxxx@plank:~$
The shell substitutes the value of the variable SHELL for $SHELL before passing it to echo as its first parameter.
More on the bash shell later.
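Variable substitution is easy to experiment with. The following is a minimal sketch, assuming a POSIX-style shell; the variable names course, expanded, and literal are made up for illustration. It contrasts double quotes, inside which the shell substitutes variable values, with single quotes, inside which it does not.

```shell
course=CS50
expanded="Hello $course"     # double quotes: the shell substitutes the value
literal='Hello $course'      # single quotes: the text is left untouched
echo "$expanded"
echo "$literal"

# ${SHELL:-not set} falls back to the text "not set" if SHELL is
# unset or empty.
echo "My login shell is ${SHELL:-not set}"
```

The first echo prints Hello CS50, while the second prints Hello $course exactly as written. The last line depends on your environment, so its output will vary.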
Another useful command for copying files between machines is scp, the secure copy (remote file copy program) command. Below I copy a README.md file from my account on the plank server to my personal computer.
    temi@~> ssh cs50
    f00xxxx@plank:~$ ls
    cs50-dev  test.txt
    f00xxxx@plank:~$ cd cs50-dev
    f00xxxx@plank:~/cs50-dev$ ls
    class_activities  dotfiles  examples  play  README.md  setup
    f00xxxx@plank:~/cs50-dev$ logout
    logging out...
    Connection to plank.thayer.dartmouth.edu closed.
    temi@~> scp f00xxxx@plank.thayer.dartmouth.edu:~/cs50-dev/README.md .
    README.md                                  100% 6941   152.5KB/s   00:00
    temi@~> open README.md
Here ~ indicates the home directory, so ~/ is the home directory for the user. I logged in, which put me in my home directory, then changed into its cs50-dev subdirectory and listed the files there. Now I know the ‘path name’ for the desired file - the directory name followed by slash followed by the filename. The scp command allows me to specify the host and pathname for the source and destination of the copy; here the destination is ., a shorthand for ‘current working directory’. Finally, I used the MacOS command open to open the file.
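The way a path name splits into a directory part and a filename can be checked with the standard dirname and basename utilities. The following is a minimal sketch, assuming a POSIX-style shell; the path reuses the example home directory from above, and the final line only prints an scp command of the user@host:path form rather than running it.

```shell
path=/thayerfs/home/f00xxxx/cs50-dev/README.md
dir=$(dirname "$path")       # everything before the last slash
file=$(basename "$path")     # everything after the last slash
echo "directory: $dir"
echo "filename:  $file"

# Joining the two parts with a slash gives back the original path name.
[ "$dir/$file" = "$path" ] && echo "same path"

# Print (not run) an scp command; a relative remote path is taken
# relative to the remote home directory.
echo "scp f00xxxx@plank.thayer.dartmouth.edu:cs50-dev/$file ."
```

Neither utility looks at the file system, so the sketch works even on a machine where that directory does not exist.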
OK, we are ready to logout from our session.
    f00xxxx@plank:~/cs50-dev$ logout
    logging out...
    Connection to plank.thayer.dartmouth.edu closed.
Okay, this has been a good start. We have covered a number of important issues that we will revisit in the course.
Housekeeping - things you need to do:
If you haven’t already, do lab assignment 0 today!
Visit Canvas and connect to Ed Discussion to make sure your accounts are set up and linked.
Review all of the class materials on the website (Tentative Schedule, Lectures, Logistics, Readings, Resources).
Before next class, use your NetID to go through the above examples: execute all the commands and get a ‘feel’ for the shell.
Choose an editor and become very familiar with it. This is a very important step. See this week’s reading on “Learn to use a real editor”. Your first homework assignment will be released at the next class period, so don’t delay!
One more thing: Remotely accessing Unix machines
Note the following, depending on your personal computer’s operating system.
Linux: The Thayer systems are running GNU/Linux. Your Linux will likely be sufficiently compatible.
Mac: Mac OSX Unix conforms to the POSIX standard for the C API, shell utilities, and threads and can compile and run your existing code. This is really exciting for Unix/Linux development. The GNU tools we will be using, such as make, etc., are either included or freely available. (You may need to install command-line tools via Xcode; if so, you will be prompted to do so when you first type make or related commands.) You can use the Terminal application. You can also use the ssh command to remotely log on to Thayer servers, as discussed above.
Windows: Starting with Windows 10, Windows Subsystem for Linux (WSL) has been provided to run native Linux command-line tools directly on Windows. Check out this link on how to enable this feature. However, given this is a Unix course you would be far better off doing your assignments on the Thayer server. The most successful approaches have been to do your editing and documentation on the Windows system and to upload (using sftp, filezilla, etc.) and build/debug on the lab systems via ssh. Many students have adopted the Sublime Text editor with its handy sftp integration, or VS Code, as their text editor of choice.
The name “Terminal” is still commonly used in Unix parlance, right down to the MacOS app by that name. This term refers to the fact that early Unix computers, like other computers of their time, had a small number of interactive ‘terminals’ connected to them – each directly connected by a dedicated wire to a card within the computer. Each terminal was little more than a keyboard and a screen - or in the early days, a keyboard and a printer that would print each character as it was typed, and print the characters sent by the computer.
The first such terminals were “teletypes”, adapted from the device used by newsrooms around the country, which printed characters transmitted over the phone line from a distant device. This video shows a Teletype model ASR33. From the sound of this machine, can you tell why television news shows’ opening music usually has a staccato tap-tap-tap-tap-tap background theme?