Introduction to Information Science and Technology
Special project: Web search for sites on Wool Felt
Creation of a personal website
This paper was
originally written under the name Amy Proni in November 2002 for the
Introduction to Library Science course, taught by Dr. Hak Joon Kim
at Southern Connecticut State University. As of January
links are no longer relevant to the topic. I have therefore included
links to some of my favorite felting sites (found outside of the
parameters of this lesson) at the end of this page.
Web search for sites on Wool Felt
This search looks specifically for directions on how to make wool felt. I have determined that any web site that includes directions (either written or through the use of images) will be valid. Sites that refer to classes, courses, or workshops on how to make felt will be treated as not relevant to the subject (false drops). I have selected three major search engines for this exercise: Alta Vista, Google, and Alltheweb. The meta-search engine I have decided to try is Vivisimo.
The results in this paper are the culmination of five searches, conducted over a period of several weeks. My early attempts at searching this subject were so poorly worded that I felt as if I were drowning in a sea of websites. I learned that using the correct keywords is really important, and sometimes the only way to figure out which words are right is to try the search differently several times. After the third ineffective search I decided to search the subject headings in the Library of Congress Authorities, and found a scope note on felting: “Here are entered works on the process of felt making. Works on the use of felt in handicraft are entered under Felt work.” Where I had first used the words “felted” and “hand” for searching, I now changed the search terms to “felting” and “technique.” The early searches also returned far too many hits for classes, courses, workshops, and books on the topic. Eventually I figured out that quotation marks around my terms would narrow the results without losing too many valid sites. It also seemed intuitive that the term “technique” would work better than either “directions” (which often yielded a map to the store) or the cumbersome phrases “how to make” or simply “how to.”
The Simple Search
A simple search on Alta Vista, using the keywords “felting technique,” resulted in 30 hits. Remarkably, instructions were found on the first hit. This search yielded a total of six valid sites, and four of those sites were found within the first ten hits (numbers 1, 5, 6, and 9). The other two valid sites were numbers 18 and 20. The majority of the results from this search were listings of available classes. The recall rate is 6:30, or 20%. In this instance the precision rate is the same, because I retrieved a total of thirty documents, and precision is equal to the number of relevant retrieved documents (6) divided by the number of retrieved documents (30).
Google returned twice as many hits as Alta Vista (60) on a simple search using the same keywords, again in quotation marks. I visited each of the first thirty hits. Of these, seven were valid hits (three of the initial ten sites: numbers 1, 6, and 10, followed by a cluster at 16, 17, and 18, concluding with number 24). The other twenty-three hits referred to books about felt making, to workshops, or to artists involved with the craft. This search produced different percentages from my previous effort. Recall is 7:60, or about 11.67%, because Google returned sixty hits; precision is 7:30, or about 23.33%, based on the thirty hits I examined.
Using the same parameters as in the previous searches, Alltheweb returned 49 hits, and again, I visited each of the first thirty hits. Six hits were valid: numbers 1, 8, 9, 19, 20, and 28. As before, three of the first ten hits were on target, there was a small cluster in the middle, and one near the end. The recall rate is 6:49, or 12.2%, and the precision rate is 6:30, or 20%.
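The percentages in the three simple searches can be reproduced with a short calculation. The following is a minimal sketch, not part of the original exercise: the function name `hit_ratios` is made up for illustration, and it simply divides the valid hits by the hits examined and by the hits returned, which are the two ratios used in this paper. (Textbook recall divides by the number of relevant documents in the whole collection, a figure that cannot be known for the open web.)

```python
def hit_ratios(valid, examined, returned):
    """Return (valid/examined, valid/returned) as percentages, rounded to 2 places."""
    of_examined = round(100 * valid / examined, 2)
    of_returned = round(100 * valid / returned, 2)
    return of_examined, of_returned


# (valid hits, hits examined, hits returned) as reported for the simple searches
simple_searches = {
    "Alta Vista": (6, 30, 30),
    "Google": (7, 30, 60),
    "Alltheweb": (6, 30, 49),
}

for engine, counts in simple_searches.items():
    of_examined, of_returned = hit_ratios(*counts)
    print(f"{engine}: {of_examined}% of hits examined, {of_returned}% of hits returned")
```

Keeping the two denominators as separate arguments makes it easy to see which figure changes when an engine returns more hits than were actually visited.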
As expected, there were overlaps in the results gathered by each search engine. I was intrigued to see that the first hit on each engine was the same site and page.
There were significant overlaps in the number of invalid sites found. In fact, a sizable number of false drops resulted from the simple search technique, but I have to attribute that to using rather common words. The lack of specificity was evident in these results. I did not encounter any truly egregious false drops, i.e., sites that had absolutely nothing to do with the subject. These simple searches also did not return any 404 (file not found) or 403 (access denied) errors.
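Overlap between engines is easy to check once each result list is written down. The sketch below is only an illustration under stated assumptions: the URLs are invented placeholders, not the sites actually retrieved, and the helper names are hypothetical.

```python
from itertools import combinations

# Placeholder result lists in rank order. These URLs are invented for
# illustration and are not the sites retrieved in the searches above.
results = {
    "Alta Vista": ["a.example/felt", "b.example", "c.example"],
    "Google": ["a.example/felt", "c.example", "d.example"],
    "Alltheweb": ["a.example/felt", "b.example", "d.example"],
}


def shared_by_all(ranked):
    """Return the set of sites that every engine retrieved."""
    return set.intersection(*(set(urls) for urls in ranked.values()))


def same_first_hit(ranked):
    """Return True when all engines rank the same page first."""
    return len({urls[0] for urls in ranked.values()}) == 1


def pairwise_overlap(ranked):
    """Map each pair of engines to the sites they have in common."""
    return {
        (a, b): set(ranked[a]) & set(ranked[b])
        for a, b in combinations(ranked, 2)
    }


print(shared_by_all(results))
print(same_first_hit(results))
```

Treating each list as a set ignores rank, so the first-hit comparison is kept as a separate check.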
The Advanced Search
I wanted to use the advanced search functions to focus on the topic of nuno felt. Although nuno is a Japanese word for cloth, it has come to signify a laminated fabric in the felting community. Nuno is a method of integrating wool fibers with a woven fabric (generally silk or cotton), so that the end result is a warm and lightweight material. But because the search engines vary in allowable parameters, my initial search results seemed like ‘apples and oranges,’ and I found this aspect of the lesson to be quite challenging. After spending time with the different engines, though, I became more familiar with the options, and eventually designed an advanced search that seemed to work well with all three.
For the advanced search in Alta Vista, the parameters were: (felting AND nuno) OR (felting AND laminated) NEAR instructions – in any language – in any region – on sites modified within the past year. Alta Vista returned 85 hits, and I checked the first thirty sites. Eight of those sites were valid; in this case, the recall was 8:85 or 9.4%, and precision was 8:30 or 26.67%.
For the advanced search in Google, I modified the parameters to include all of these words in the text of a site: felting instructions nuno OR laminated – in any language – no geographic limits – on sites modified within the past year. Google returned 89 hits; ten were valid (out of the first thirty). Recall was 10:89 or 11.2%; precision, 10:30 or 33.33%.
It was also necessary to modify the search parameters for the advanced search in Alltheweb. The parameters were defined as: must include nuno and felting in the text; should include laminated and instructions in the text – in any language – in any region – on sites modified between October 14, 2001 and October 14, 2002. The search engine returned 98 hits; of these, 8 were determined to be valid. Therefore, the recall was 8:98 (8.2%) and the precision was 8:30 (26.67%).
As had been theorized in the class lecture, my results showed that the advanced search was more precise. It’s interesting to see the numbers reflect that inverse relationship between precision and recall.
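The comparison can be made concrete by averaging the figures across the three engines. This is a rough sketch under one assumption: precision is taken as valid hits over the thirty hits examined, and the recall figure as valid hits over hits returned, which is the convention used for the advanced searches. The function name is hypothetical.

```python
from statistics import mean

# (valid hits, hits examined, hits returned), taken from the figures reported above
simple = [(6, 30, 30), (7, 30, 60), (6, 30, 49)]
advanced = [(8, 30, 85), (10, 30, 89), (8, 30, 98)]


def mean_percentages(searches):
    """Return (mean precision, mean recall figure) as percentages, 1 decimal place."""
    precision = mean(100 * valid / examined for valid, examined, _ in searches)
    recall = mean(100 * valid / returned for valid, _, returned in searches)
    return round(precision, 1), round(recall, 1)


print("simple:  ", mean_percentages(simple))
print("advanced:", mean_percentages(advanced))
```

On these numbers the advanced searches average higher precision and a lower recall figure than the simple searches, which matches the inverse relationship described above.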
There were valid overlapping sites in the advanced searches:
Interestingly, there were only three web sites that overlapped between the simple and advanced searches:
It is obvious that the Boolean expression AND used in the advanced searches [(felting AND nuno) OR (felting AND laminated)] was crucial to the results. It is also probably not a coincidence that the sites Mielkesfarm.com and Yurtboutique.com were found by both types of searches, as these sites provide instructions for a variety of felting techniques. The Wasatch Woolpack site not only provides links to a large number of sites, but also has instructions for different methods of felting. It is reasonable to think, however, that the parameters in the Alltheweb search (requiring the key words to be in the text) did not permit the site to be in the retrieved sets.
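To see how the Boolean expression filters pages, here is a minimal sketch that applies it to a few word lists. It is an illustration only: the page snippets are invented, the function names are hypothetical, and the sketch leaves out the NEAR operator and the date and language limits that the real engines applied.

```python
def matches(text):
    """Evaluate (felting AND nuno) OR (felting AND laminated) on a page's words."""
    words = set(text.lower().split())
    return ("felting" in words and "nuno" in words) or (
        "felting" in words and "laminated" in words
    )


def matches_factored(text):
    """Equivalent form: felting AND (nuno OR laminated)."""
    words = set(text.lower().split())
    return "felting" in words and ("nuno" in words or "laminated" in words)


# Hypothetical page snippets, invented for illustration
pages = {
    "a": "nuno felting on silk gauze",
    "b": "laminated felting with cotton",
    "c": "felting workshops and classes",
    "d": "nuno scarves for sale",
}

hits = [name for name, text in pages.items() if matches(text)]
print(hits)
```

Page "c" mentions felting but neither nuno nor laminated, and page "d" mentions nuno without felting, so the AND clauses exclude both. The factored form is the shape of the query later given to Vivisimo, and it selects the same pages.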
Final search using a meta engine
Vivisimo, a clustering engine, returned 97 documents using the simple query “felting instructions.” I was very pleased to find that fifteen of the first thirty hits retrieved were relevant, and two sites were new to me. Recall was 15:97 or 15.5%, while precision was 15:30 or 50%. One very nice feature of this search engine was its use of windows: a window on the left provided a list of clustered results. This clustering mechanism essentially broke the retrieved set of documents into subsets. I knew immediately that of the ninety-seven sites retrieved there were thirty-nine links for “instructions for making wool felt,” and eight links regarding “felting needles.” There were two links each for “hand felter” and “basic felting instructions.” These results were really exciting, especially after having spent hours looking for the proverbial needle in a haystack. The differences between meta-search technology and the more traditional web search engines were made clear in this simple search on Vivisimo.
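The idea of breaking a retrieved set into labelled subsets can be sketched in a few lines. This is not Vivisimo's method, which is not described here: the sketch assumes the labels are supplied by hand (three of the labels quoted above), the titles are invented, and each title goes to the first label whose phrase it contains.

```python
from collections import defaultdict

# Cluster labels quoted in the text above. A real clustering engine derives
# its labels from the results instead of receiving them by hand.
LABELS = ["felting needles", "hand felter", "basic felting instructions"]

# Hypothetical result titles, invented for illustration
titles = [
    "Choosing felting needles for beginners",
    "Basic felting instructions with photos",
    "Profile of a hand felter",
    "Felting needles: sizes and gauges",
    "Sheep farm newsletter",
]


def cluster(titles, labels):
    """Group titles under the first label phrase they contain, else 'other'."""
    groups = defaultdict(list)
    for title in titles:
        lowered = title.lower()
        label = next((name for name in labels if name in lowered), "other")
        groups[label].append(title)
    return dict(groups)


clusters = cluster(titles, LABELS)
for label, members in clusters.items():
    print(f"{label} ({len(members)})")
```

The counts printed beside each label play the same role as the numbers shown in the clustered list: they tell the searcher how large each subset is before any link is opened.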
I used the same advanced search query in Vivisimo as before, with some technical changes. The keywords sought were: felting (nuno OR laminated) instructions. As noted below, I instructed Vivisimo to cast the net very wide.
The search returned 55 hits, and I looked at the first 30 sites: fourteen were valid sites. The majority of the valid sites were familiar to me by this time, but again, there was one new site: Brainways.co.nz. Recall was 14:55, or 25.5%, and precision was 14:30, or 46.67%, not quite as high as the results from the simple search, but still considerably improved over the traditional search engine statistics.
Searching skills, like any other skills, take time and practice to develop. As a librarian, I can appreciate the variety of searching mechanisms available, but the value of those information tools depends on my ability to use each tool properly and appropriately. This exercise has been quite interesting in that I learned about the subtleties of three different search engines, and the power of a meta-search engine. I also learned that sometimes a keyword is just a keyword, and that is not the same as an authority-controlled subject heading. I have a much greater respect and appreciation for the authority control unit. The World Wide Web is an unregulated environment; it will never offer the level of sophistication or standards that can be found in library catalogs. For this reason alone, I think it is especially important to master the skills needed to search effectively and retrieve valid information, and I look forward to continuing that process.
Links to sites that provide instructions, explanations, or contribute to the body of knowledge regarding felt making: