P2: SEARCH ENGINE PROJECT
For the Web Crawler / Search Engine project, please follow the instructions in the Google Doc.
Resources & Tips
We have also compiled a list of resources to help you complete the project:
1. Database Functionality
For database tips, refer to Lecture 13 and specifically:
- How to insert a row to a table
- How to update a row in a table
- How to query by keyword
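As a rough illustration of the three operations listed above, here is a minimal sketch. It assumes SQLite through Python's built-in sqlite3 module, which may not be the database used in Lecture 13. The table name pages, its columns, and the helper functions insert_page, update_page, and search_pages are hypothetical names chosen for this example, not part of the project spec.

```python
import sqlite3

# Assumption: an in-memory SQLite database; the "pages" table and its
# columns are hypothetical and only serve this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT, content TEXT)"
)


def insert_page(conn, url, title, content):
    """Insert one row; the ? placeholders keep the values safely escaped."""
    conn.execute(
        "INSERT INTO pages (url, title, content) VALUES (?, ?, ?)",
        (url, title, content),
    )
    conn.commit()


def update_page(conn, url, title, content):
    """Update the row whose url matches."""
    conn.execute(
        "UPDATE pages SET title = ?, content = ? WHERE url = ?",
        (title, content, url),
    )
    conn.commit()


def search_pages(conn, keyword):
    """Return the URLs whose title or content contains the keyword."""
    pattern = f"%{keyword}%"  # % matches any run of characters in LIKE
    rows = conn.execute(
        "SELECT url FROM pages WHERE title LIKE ? OR content LIKE ? "
        "ORDER BY url",
        (pattern, pattern),
    )
    return [row[0] for row in rows]
```

A possible usage would be to call insert_page once per crawled URL, update_page when a page is crawled again, and search_pages to answer a user query. Note that LIKE in SQLite ignores case for ASCII letters by default, so searching for "cats" would also match "Cats".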
Take a look at these links:
2. HTML
Take a look at the HTML sample files from Lecture 16.
3. Errors
Sometimes the URL you’re trying to crawl doesn’t exist or doesn’t have any content. If this is the case, your soup variable will be empty (i.e. soup is None will be True). If this happens, just ignore the URL and move on to the next one.
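The skip-and-continue pattern described above might look like the sketch below. The function name crawl and its get_soup parameter are hypothetical: get_soup stands in for whatever function in your project fetches a URL and returns a soup object, and it is assumed to return None when the page is missing or empty.

```python
def crawl(urls, get_soup):
    """Visit each URL, skipping the ones that produce no soup.

    get_soup is a hypothetical callable: it takes a URL and returns a
    soup object, or None if the page does not exist or has no content.
    """
    crawled = {}   # url -> soup for pages that loaded
    skipped = []   # urls that were ignored
    for url in urls:
        soup = get_soup(url)
        if soup is None:
            # Nothing to index here; move on to the next URL.
            skipped.append(url)
            continue
        crawled[url] = soup
    return crawled, skipped
```

The check uses soup is None rather than a general truthiness test, which matches the condition stated above and avoids treating a valid soup object as missing.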