Activity - Fetch a Web Page
In this activity, we experiment with the webpage module provided by the Lab 4 starter kit. We call its API to fetch the web page at a given URL and save it to a file: the first line of the file is the URL, and the rest is the HTML fetched from that URL.
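The output format described above can be sketched with plain standard I/O. This is only a minimal illustration: `save_page` and its parameters are hypothetical names, not part of the starter kit, and it assumes the URL and HTML are already available as C strings.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not part of the starter kit): write `url` on the
 * first line of `filename`, followed by `html`.
 * Returns true on success, false if an argument is NULL or on I/O error. */
bool save_page(const char* filename, const char* url, const char* html)
{
  if (filename == NULL || url == NULL || html == NULL) {
    return false;
  }

  FILE* fp = fopen(filename, "w");
  if (fp == NULL) {
    return false;
  }

  // First line: the URL; everything after it: the HTML, unmodified.
  bool ok = fprintf(fp, "%s\n", url) >= 0 && fputs(html, fp) != EOF;

  // fclose can also report a write error (e.g., a full disk).
  if (fclose(fp) != 0) {
    ok = false;
  }
  return ok;
}
```

In fetchweb.c, the two strings would presumably come from the webpage module after a successful fetch; check libcs50/webpage.h for the functions it actually declares.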
Copy the skeleton code directory, webpage, into your working directory:
$ cp -r /thayerfs/courses/22spring/cosc050/workspace/webpage .
$ cd webpage/
fetchweb.c is the skeleton code we will work on. Our task is to fill in the to-dos and finish the program.
To compile the code, first generate the libcs50.a library under libcs50/, then use the library to generate the executable:
$ cd libcs50/
$ make
cp libcs50-given.a libcs50.a
$ cd ..
$ mygcc -o fetchweb fetchweb.c libcs50/libcs50.a
Some example usage of the program:
$ ./fetchweb http://cs50tse.cs.dartmouth.edu/tse/letters/ letter
$ ./fetchweb http://www.google.com/ google
The first command fetches the letters index page and saves its URL and HTML to a file named letter; the second does the same for the Google home page (though we should not use URLs outside the cs50tse domain for Lab 4).
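Both examples pass exactly two arguments, a URL and an output filename. One way the program might validate them before fetching is sketched below; `parse_args` is a hypothetical helper, and the usage message is an assumption rather than anything the skeleton code specifies.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: check that exactly two non-empty arguments
 * (URL, filename) follow the program name, and hand them back through
 * the out-parameters. Prints a usage message to stderr on failure. */
bool parse_args(int argc, char* argv[], char** url, char** filename)
{
  if (argc != 3 || argv[1][0] == '\0' || argv[2][0] == '\0') {
    fprintf(stderr, "usage: %s URL filename\n",
            argc > 0 ? argv[0] : "fetchweb");
    return false;
  }

  *url = argv[1];       // e.g., http://cs50tse.cs.dartmouth.edu/tse/letters/
  *filename = argv[2];  // e.g., letter
  return true;
}
```

A main function could call this first and exit with a non-zero status when it returns false, so that a bad command line never reaches the fetch step.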
Work in groups to finish the program. Use gdb and valgrind for debugging if necessary.