Activity - Cohesion and Coupling
With your group, revisit the Lab 5 indexer with the principles of cohesion and coupling in mind.
We ask ourselves this question: how do we decompose indexer.c into functions so that each function performs a cohesive set of operations while the coupling among functions stays loose?
Lab 5 review
From the DESIGN document: The indexer (indexer.c) will run as follows:
parse the command line, validate parameters
call indexBuild, passing pageDirectory
save index to file
clean up data structures
where indexBuild takes a pageDirectory parameter and returns an index data structure:
creates a new 'index' object
loops over document ID numbers, counting from 1
loads a webpage from the document file 'pageDirectory/id'
if successful,
passes the webpage and docID to indexPage
where indexPage:
steps through each word of the webpage,
skips trivial words (less than length 3),
normalizes the word (converts to lower case),
looks up the word in the index,
adding the word to the index if needed
increments the count of occurrences of this word in this docID
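The indexPage steps above suggest a few small, cohesive helpers. Here is a minimal sketch in C, not the Lab 5 solution. The names normalizeWord, isTrivial, and forEachWord are hypothetical; the text buffer and the callback stand in for the real webpage and index, which are not shown here; and the sketch assumes a word is simply a run of letters.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

// Converts word to lower case, in place.
void normalizeWord(char *word)
{
    for (char *p = word; *p != '\0'; p++) {
        *p = (char)tolower((unsigned char)*p);
    }
}

// A word is trivial if it has fewer than 3 characters.
bool isTrivial(const char *word)
{
    return strlen(word) < 3;
}

// Hypothetical stand-in for the indexPage loop: splits text (modified in
// place) into runs of letters, skips trivial words, normalizes the rest,
// and hands each one to report(). Returns the number of words reported.
int forEachWord(char *text,
                void (*report)(const char *word, void *arg), void *arg)
{
    int count = 0;
    char *p = text;
    while (*p != '\0') {
        // skip separators
        while (*p != '\0' && !isalpha((unsigned char)*p)) p++;
        if (*p == '\0') break;
        // scan one word
        char *start = p;
        while (isalpha((unsigned char)*p)) p++;
        bool atEnd = (*p == '\0');
        *p = '\0';                      // terminate the word in place
        if (!isTrivial(start)) {
            normalizeWord(start);
            if (report != NULL) report(start, arg);
            count++;
        }
        if (atEnd) break;
        p++;                            // step past the terminator we wrote
    }
    return count;
}
```

Given the text "The cat is ON a Mat!", forEachWord would report "the", "cat", and "mat". Each helper does one job, and the only connection to the index is the callback, which keeps the coupling loose.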
To implement this functionality, the Lab 5 documentation offers this advice under Hints and tips:
We strongly recommend you add an index module to the common library – a module to implement an abstract index_t type that represents an index in memory, and supports functions like index_new(), index_delete(), index_save(), and so forth. Tip: much of it is a wrapper for a hashtable.
The index module implements the index data structure. The Lab 5 DESIGN document describes it under major data structures:
The key data structure is the index, mapping from word to (docID, #occurrences) pairs. The index is a hashtable keyed by word and storing counters as items. The counters is keyed by docID and stores a count of the number of occurrences of that word in the document with that ID.
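To make the word to (docID, #occurrences) mapping concrete, here is a self-contained sketch of what such a wrapper might look like. It is an illustration under stated assumptions, not the course's index module: the entry_t and docCount_t lists are simplified stand-ins for the hashtable and counters modules, and index_increment and index_get are hypothetical names. index_save is omitted because the file format is not described here.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

// Stand-in for one counters node: a (docID, count) pair.
typedef struct docCount {
    int docID, count;
    struct docCount *next;
} docCount_t;

// Stand-in for one hashtable entry: a word and its (docID, count) pairs.
typedef struct entry {
    char *word;
    docCount_t *counts;
    struct entry *next;
} entry_t;

typedef struct index {
    size_t numSlots;
    entry_t **slots;
} index_t;

index_t *index_new(size_t numSlots)
{
    index_t *idx = malloc(sizeof *idx);
    if (idx == NULL || numSlots == 0) { free(idx); return NULL; }
    idx->slots = calloc(numSlots, sizeof *idx->slots);
    if (idx->slots == NULL) { free(idx); return NULL; }
    idx->numSlots = numSlots;
    return idx;
}

// Finds the entry for word; if absent and create is true, adds it.
static entry_t *getEntry(index_t *idx, const char *word, bool create)
{
    size_t h = 5381;                                  // djb2 string hash
    for (const char *s = word; *s != '\0'; s++) h = h * 33 + (unsigned char)*s;
    entry_t **slot = &idx->slots[h % idx->numSlots];
    for (entry_t *e = *slot; e != NULL; e = e->next)
        if (strcmp(e->word, word) == 0) return e;
    if (!create) return NULL;
    entry_t *e = malloc(sizeof *e);
    char *copy = malloc(strlen(word) + 1);
    if (e == NULL || copy == NULL) { free(e); free(copy); return NULL; }
    e->word = strcpy(copy, word);
    e->counts = NULL; e->next = *slot;
    return *slot = e;
}

// Finds the pair for docID; if absent and create is true, adds it at count 0.
static docCount_t *getCount(entry_t *e, int docID, bool create)
{
    if (e == NULL) return NULL;
    for (docCount_t *c = e->counts; c != NULL; c = c->next)
        if (c->docID == docID) return c;
    if (!create) return NULL;
    docCount_t *c = malloc(sizeof *c);
    if (c == NULL) return NULL;
    c->docID = docID; c->count = 0;
    c->next = e->counts;
    return e->counts = c;
}

// Hypothetical name: adds word if needed, increments its count for docID,
// and returns the new count (0 on error).
int index_increment(index_t *idx, const char *word, int docID)
{
    if (idx == NULL || word == NULL) return 0;
    docCount_t *c = getCount(getEntry(idx, word, true), docID, true);
    return (c == NULL) ? 0 : ++c->count;
}

// Hypothetical name: returns the count for (word, docID), or 0 if never seen.
int index_get(index_t *idx, const char *word, int docID)
{
    if (idx == NULL || word == NULL) return 0;
    docCount_t *c = getCount(getEntry(idx, word, false), docID, false);
    return (c == NULL) ? 0 : c->count;
}

void index_delete(index_t *idx)
{
    if (idx == NULL) return;
    for (size_t i = 0; i < idx->numSlots; i++) {
        while (idx->slots[i] != NULL) {
            entry_t *e = idx->slots[i];
            idx->slots[i] = e->next;
            while (e->counts != NULL) {
                docCount_t *c = e->counts;
                e->counts = c->next;
                free(c);
            }
            free(e->word); free(e);
        }
    }
    free(idx->slots);
    free(idx);
}
```

For example, calling index_increment(idx, "cat", 1) twice and then index_get(idx, "cat", 1) would yield 2. Callers never see the entry or docCount structures, so the indexer depends only on the index_* functions; that is the loose coupling this activity is after.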
Activity
Work with your group to discuss how you’d implement such a wrapper, and then how you would use it to implement the indexer described above. Also consider how you would further decompose indexBuild into functions, with cohesion and coupling in mind.
Solution
A potential solution
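As one illustration of decomposing indexBuild (a sketch for discussion, not the solution referenced above), the path construction and the loop over document IDs can each live in their own function. The names docPath and forEachDocument are hypothetical. The callback stands in for "load the webpage and pass it to indexPage", and stopping at the first file that cannot be opened is an assumption.

```c
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: builds "pageDirectory/docID" in a newly allocated
// string. The caller must free the result. Returns NULL on failure.
char *docPath(const char *pageDirectory, int docID)
{
    // First call only measures the length; second call writes the string.
    int len = snprintf(NULL, 0, "%s/%d", pageDirectory, docID);
    if (len < 0) return NULL;
    char *path = malloc((size_t)len + 1);
    if (path != NULL) {
        snprintf(path, (size_t)len + 1, "%s/%d", pageDirectory, docID);
    }
    return path;
}

// Sketch of the indexBuild loop: visits document files 1, 2, 3, ... and
// stops at the first one that cannot be opened (an assumption).
// Returns the number of documents visited.
int forEachDocument(const char *pageDirectory,
                    void (*visit)(FILE *fp, int docID, void *arg), void *arg)
{
    int docID = 1;
    for (;; docID++) {
        char *path = docPath(pageDirectory, docID);
        if (path == NULL) break;
        FILE *fp = fopen(path, "r");
        free(path);
        if (fp == NULL) break;          // no such document: we are done
        if (visit != NULL) visit(fp, docID, arg);
        fclose(fp);
    }
    return docID - 1;
}
```

Here docPath knows only about file naming and forEachDocument knows only about iteration; neither needs to know what an index is, because the index would travel through the opaque arg pointer.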