pie and aphasia

Monday, December 04, 2006

Webs: Invisible and 2.0

Here is how useless I am: I never think about the "invisible web." Actually, that's not entirely true; sometimes I think about it while I'm at the Reference Desk. I think about the ways in which the library's online catalog and the databases are so much more specifically useful and generally harder to use than the "Google-web." This vague musing has never, however, progressed to wanting to do something about it, or to wondering at the algorithmic structures and limitations underlying web searches. I tend to just habituate and take what I get. So I've been participating in a kind of digital dual citizenship born of ignorance. Like Persephone in Hades, I spend half my time Googling acquaintances and shopping for boots on the "Publicly Indexable Web" and the rest of my time "underground" in subject-specific, password-protected "invisible" databases.

Ru and Horowitz observe the difficulty and necessity of making the "invisible web" accessible to the average user. They observe that invisible web indexes usually either index the surface of a website or index a portion of the site's contents. Both of these approaches have the disadvantage of relying primarily on human indexing, which is subject to the subjective experience and preferences of the person doing the indexing...

The Notess article "The Terrible Twos" was helpful to me in that it gives a context to the proliferation of 2.0's in the information world. Notess describes the idea of "Web 2.0" as a simple way of conceptually differentiating the current incarnation and capabilities of the Internet from previous iterations. He includes a discussion of some of the technology that has influenced and been influenced by 2.0, including Ajax and APIs. Certain concepts identified with Web 2.0, including tagging, clouds, "the long tail," wikis, blogs, RSS feeds, etc., are easily extensible to other branches of information science, including news and, of course, libraries. The idea of "Library 2.0" must certainly be a fairly limited play on the "Web 2.0" idea, as libraries have been through so many "versions" they must be on the 100.0's at least by now.

I am of a divided mind about the usefulness of 2.0 as a way of conceptualizing the different ways we use the web or libraries, or anything, today. I think the term is valuable and should certainly be "kept" http://en.wikipedia.org/wiki/Web_2.0#Criticism because it is currently in use among information professionals, who evidently feel the need to call the newest web something... My ambivalence about the term comes from my feeling that the Internet is so dynamic and such a work in progress that it is hard to know what you're naming when you assign it a name.

Wednesday, November 22, 2006

Digital Library archaeology and Search Interface Aesthetics: Nicholson and Rose

Digital resources are wildly fresh territory for "information science" and everyone else in so many ways. As technology becomes increasingly sophisticated and its applications increasingly prolific, we need new ways to measure, evaluate and improve the digital tools that have become so integral to almost every aspect of our lives.

We are only able to conceptualize things using and building on frameworks and tools we already use. In this tradition, Scott Nicholson makes a case for the appropriation of the practices and theories of physical archaeology for the study and evaluation of digital libraries. While this practice of "digital archaeology" is extensible to any digitally based resource, the article focuses on its applications to digital libraries. Information science can use the methodology and theory of other sciences to study analogous phenomena. Archaeology lends itself particularly well to this use because of its evolutionary history: it began with "gathering and describing" in traditional archaeology, progressed to pattern finding in "new archaeology," and finally supplemented gathering, describing and apophenia (making connections where none previously existed) with an increased focus on postprocessual archaeology, which examines the processes of study and the individuals involved.

Nicholson's argument is most interesting to me at the point he introduces postprocessual archaeology with its user-centered focus, allowing "the researcher to bring in qualitative elements to the quantitative bibliomining process." He notes the failure of pure descriptive and pattern-finding research to take into account the subjective needs of individual users. This move to user-focused research reminds me of the dialectic of ontologies versus collaborative tagging. Ontologies correspond to the descriptive and patterning phases of archaeology, while collaborative tagging assumes the user-centered approach of postprocessual archaeology.

The Rose article, "Reconciling Information-Seeking Behavior with Search User Interfaces for the Web," also addresses the issue of a need for researchers to take a more user-focused approach. Rose argues that the majority of search engine interfaces are still too reflective of the technology that powers them, and not reflective enough of the needs of the people who make use of them. Rose discusses three areas of user behavior which should be studied to improve interface design: 1. the variety of information-seeking goals 2. cultural and situational context and 3. the iterative nature of the search task. He suggests that studies in these areas should promote the evolution of web interfaces that guide use more efficiently and effectively.

Tuesday, November 14, 2006

Digital Context

Shifting our culture into a digital, "ubicomp" format profoundly changes the way we live. We are beyond the point where we are able to pretend that technology is a tool we use to perform certain tasks. Our relationship with the technological environment is much deeper and more reciprocal. We are created by the technology we have created. The allowances and limitations we find in being able to instantly find information, access almost any text, or communicate with people across the world, change the very shape of who we are in the world.

Both of the articles we read this week discuss an aspect of this shift. In "Reading behavior in the digital environment: changes in reading behavior over the past ten years," Ziming Liu studies the ways the proliferation of digital text impacts the quantity and quality of our reading behavior. Liu finds that although people in today's text-rich environment spend more time reading than they did in the print-only past, the depth and concentration applied to reading have declined. Reading online includes more scanning, keyword searching and following links and less careful, focused reading with annotation.

Liu includes discussion of the "printing to read" phenomenon we've discussed in class. I found the mention of the Strassman statement "the human nervous system has a special control mechanism for the coordination of the hand with the focusing muscles of the eye..." (Liu, 709) to be especially interesting. From the perspective of evolutionary biology, will this control mechanism begin to change if we habitually subject our eyes to screen-based print? In the more immediate future, what will happen to scholarship, writing and "the academy" as our future scholars, writers and professors are raised on a diet increasingly high in web-based print?

In "The public library as a meeting-place in a multicultural and digital context: The necessity of low-intensive meeting-places," Ragnar Audunson advocates public libraries as an ideal "low-context" public space for members of an increasingly diverse and fragmented public. Audunson describes "high-context" places as arenas of primary engagement, where people form their unique, exclusive identities. "Low-context" public spaces, by contrast, are neutral places where people can meet, observe, and co-exist with people outside of their primary identity subgroups.

Two phenomena contribute to our current need for low-context public spaces: 1) migration and the "globalization" of society and 2) the Internet, and its ironic consequence of cutting people off from their neighbors just as they are being connected with people who share their interests around the world. Audunson argues that the public library is perfectly primed as an answer to this need for a "third space" or low-context place, to increase understanding between cultures and interest groups. According to Audunson, democracy "presupposes a degree of cultural community." To perpetuate democracy in a diversified society, we must find a way to celebrate diversity, while simultaneously building bridges to create "cultural community."

One really interesting thing about this article is that it was written by a Norwegian, whose experiences with migration reflect the reality of Europe at a time when the United States is intensely involved in its own negotiation of integrating immigrant culture. There are amazing parallels between the role public libraries play in European democracies, and the role the public library is establishing in the United States.

I am also really interested in Audunson's idea of the public library as a "bridge between the virtual and the physical." The Internet has broadened our affiliations to identifications that vastly transcend geographical boundaries. Local government can seem irrelevant when all the news in the world is at your fingertips instantly. Audunson says using a public library is a local act of community involvement. Public libraries are also access points for the digital world, making them a balance of these two ways of belonging.

This reminds me of the Walter Benjamin essay "The work of art in the age of mechanical reproduction," http://bid.berkeley.edu/bidclass/readings/benjamin.html which, though written when newspapers were considered "technology," is deeply applicable to the issues the Internet raises in our lives. Benjamin discusses the difference between "knowledge" and "information." He argues that "knowledge" is local, gained through interaction with other people, and tells us how to do the things we need to be able to do to live in our immediate environments. "Information" is global, and disconnected from our immediate survival needs. Being a Marxist critic, Benjamin went on to assign value to these distinct terms, for reasons that still seem to apply.

Tuesday, November 07, 2006

Shirky: Ontology is Overrated http://www.shirky.com/writings/ontology_overrated.html

I took Thomas Steele's advice and chose this article for my second article. It presents collaborative tagging as a viable alternative to ontologies for web organization, and I am fascinated by the prospect of "good mob rule" that collaborative tagging seems to offer.

This article makes a very convincing case that ontologies are outmoded methods of organizing information, unsuited to a digital environment. Our most familiar ontologies, library catalogs, rely on methods designed to work with physical objects on a library shelf. Clay Shirky argues very coherently that ontologies work relatively well for the conditions of a physical library, where every book published falls into a pre-established physical hierarchy, such as the Dewey Decimal or Library of Congress classification systems. He goes on to show how utterly inadequate and limited these ontologies are for categorizing something as broad, unphysical and unrestricted as the Internet.

Collaborative classification systems, or folksonomies, allow nuance and individuality to permeate findability. Shirky gives the example of people looking for "movies" having distinct interests from those searching for "film" or "cinema." In ontological systems, "experts" decide these interests are all the same, and should be lumped together, but the sensitivity of collaborative classification lets these subtle differences be guides.

I've written earlier about how much folksonomies appeal to my basic nature and desire to believe in the wisdom of "the people." This reinforces that belief. Shirky addresses the unavoidable biases inherent in a classification system designed by a small team of experts deciding what an information object is "about." This is particularly dangerous and inappropriate in the context of organizing the web, because the "aboutness" of anything is so difficult to determine when links and interactive content make the "isness" of web content so difficult to define.

Friday, November 03, 2006

Digital Archiving: http://kopal.langzeitarchivierung.de/index.php.en

Here's my idea in development...I have a Colombian friend who is a professor of Communication here at OU, and her area of expertise is "citizen's media." She is one of the founders of an international citizen's media organization called OURMedia/NuestrosMedios http://www.ourmedianet.org/general/index.eng.html.

Apparently, there is a vast collection of citizen's media documents, in print, audio, and video formats, stacked in boxes in the basement of the Cultural ministry building in Bogota, Colombia. So... the big idea is that this summer, I am going to go to Bogota with her and check this out, and one of my friend's Ph.D. students and I are going to try to create a digital archive for this material, hosted through the Communication Department at OU. I've talked to my advisor about this, and she seems to think I should be able to count my work on this toward three hours of "Directed Project" credit... Though I don't yet have a SLIS faculty advisor for my part in the project (any suggestions? volunteers?)...

Anyway, this is pretty intimidatingly enormous for me, but it is causing me to think seriously about digital archiving, and its potential to give a voice to unheard populations and expand the range of scholarship. One of my biggest questions about digital archiving is the issue of longevity. How can we make sure that electronically archived information remains accessible?

I chose the kopal: data into the future website for myself, because they are a cooperative organization attempting to realize permanent solutions for long-term electronic archiving of documents and information. The kopal project is sponsored by the German Federal Ministry of Education and Research, and is partnered with the German National Library and IBM. Their goal is to find ways around the caveats traditionally associated with electronic document storage, including coping with software becoming "outdated" and finding stable hosts. Kopal will incorporate the IBM DIAS (Digital Information Archiving System) http://www-5.ibm.com/nl/dias/, as well as offering a range of other solutions for smaller archive projects. I've forwarded this site and some of the links it includes to other people involved in the project, and I think it may provide us with some good practical ideas.

Monday, October 23, 2006

Delicious! Organizing the web...Ding and Golder & Huberman articles

Before this week, I didn't know about del.icio.us. I knew it existed, but I didn't know what it was, and imagined it was a resource for people smarter or more organized than me. After reading this week's articles, I was inspired to check out the webpage and now I have an account and I am compulsively bookmarking and tagging all the time.

The fact that we were assigned the Ding article "A review of ontologies with the Semantic Web in view" and the Golder and Huberman article "Usage patterns of collaborative tagging systems" to read as a pair sets up a neat dialectic of possibilities for web organization. I am too ignorant of...things...to be sure, but it seemed to me that the Semantic Web and Collaborative Tagging (folksonomies) are being presented as two divergent visions for organizing the web into something more "controlled" than the current Google-"indexed" chaos.

The ideas of the Semantic Web and Collaborative Tagging/Folksonomy are particularly interesting in contrast to one another. The way I understand the Semantic Web is that it requires formalized ontologies, tools and special "languages," constructed by "experts." It is based on assigned meta-data and heuristics. Ontologies are meant to "help people and computers to access the information they need and to effectively communicate with each other." Achieving this end requires relatively traditional hierarchical structures of "knowledge" and levels of "expertness." Folksonomies are pure, best-case-scenario anarchy. The practice of collaborative tagging embodies everything optimistic about anarchy and, apparently, proves "the wisdom of the people." According to the Golder and Huberman article, over time, collaborative tagging arrives at stable and objective consensus, or a high level of useful "aboutness."
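To make the "stable consensus" idea concrete for myself, here is a minimal sketch in Python. It is not Golder and Huberman's actual method, and the tag stream is entirely made up for illustration; the function names `tag_proportions` and `max_shift` are my own hypothetical helpers. It only shows what it could mean for each tag's share of a bookmark's tags to settle down as more people tag the same page.

```python
from collections import Counter


def tag_proportions(tags):
    """Return each tag's share of all tag assignments seen so far."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


def max_shift(before, after):
    """Largest change in any single tag's share between two snapshots."""
    all_tags = set(before) | set(after)
    return max(abs(after.get(t, 0.0) - before.get(t, 0.0)) for t in all_tags)


# Hypothetical, invented tag stream for one bookmarked page,
# listed in the order users applied the tags.
stream = [
    "library", "web2.0", "library", "tagging",
    "library", "web2.0", "library", "tagging",
    "library", "web2.0", "library", "tagging",
]

# Snapshots of the tag shares after 3, 6, and 12 assignments.
early = tag_proportions(stream[:3])
middle = tag_proportions(stream[:6])
late = tag_proportions(stream[:12])

# If the shares are settling, the later shift should be the smaller one.
print("shift, early to middle:", round(max_shift(early, middle), 3))
print("shift, middle to late: ", round(max_shift(middle, late), 3))
```

With this toy data the shares move less between the later snapshots than between the earlier ones, which is the kind of settling the article describes. Real tagging data would of course be far messier.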

Neither ontologies nor collaborative tagging is without its caveats, as any attempt humans make to decide what something is about is subject to the influences of culture. I personally favor collaborative tagging over formal ontologies, but I am fully willing to admit that may only be because I don't understand ontologies!

Wednesday, October 18, 2006

Anarchists, Panhellenic reconstruction, and Aboutness: My meta-data blog entry

In the past week, I've had long lunches over pho and felafels, respectively, with an anarchist and a panhellenic reconstructionist, both of whom are currently working on their Ph.D.s in various fields. Neither of these people has my blog info, so I'm going to feel free to appropriate from these conversations with impunity.

The theme of much of both of these conversations was my traditional favorite: reality and by extension, truth, the nature of knowledge, human communication, etc...I love to make people tell me why they think "facts" are facts, why things are the way things are...These two come from VERY different ontological perspectives, but both seemed to believe that knowledge and the purpose of information and objects is decided by public consensus. Maybe they were just humoring me, because that is what I spend all of my time thinking about, or maybe I was projecting, but that's beside the point. My perception rules with an iron fist here...

Anyway, I was thinking of all of this as I read this week's articles on meta-data, meta-data schemes and meta-data optimization. One of the basic caveats of meta-data of any kind is that "data ABOUT data" requires that some person or computer decide what a document (book, picture, artifact, antelope, etc.) is ABOUT. Wait. Is aboutness an objective fact? An immutable quality an object is "born" possessing? Or is "aboutness" a very abstract idea, constructed out of the culture and experiences of the audience for that document? Of course, I think it is the latter...but I will even concede that within a perfectly homogeneous community, it would be possible to decide what a document is about with a fair degree of authority. However, are our libraries and research communities ever truly homogeneous? Even if they were, is it possible that an alternative "aboutness" meta-data tag might expand research and open up perspectives?

Digitization and the Internet add new aspects of this to consider. We can't do away with meta-data because it is the foundation of "findability" in libraries and on the Internet, so some determination of "aboutness" has to be made. The Internet allows this process to become increasingly democratic, with the creation of "folksonomies" and self-assigned tagging becoming more and more prevalent.

The MODAL (Metadata, Objectives and principles, Domains, and Architectural Layout) framework outlined in the Greenberg article provides a valuable tool for "getting meta" on meta-data. If we agree that meta-data is essential to locating information, items and documents in a digital environment, it follows that we need to have tools for studying and comparing these schemes, in order to figure out what types of meta-data work best and in what situations.

The Dawson and Hamilton article is worth the paper on which I printed it out, if only for introducing me to the term "data shoogling." How magnificent and vaguely obscene is that?

Anyway, before reading this article I was all but unaware of the phenomenon of "search engine optimization," of which "data shoogling" is a distinct offshoot. I am embarrassingly ignorant of the algorithmic complexity involved in "marketing" databases by embedding tags that improve their Google "position." This article discusses techniques to increase Google-accessibility without compromising website content or traditional "meta-data" attributes. The authors discuss the "meta-data v. Google" phenomenon, in which we find ourselves with an elaborate and long-established tradition of complex meta-data assignation that isn't used in the main information-seeking behavior ("Googling") practiced by most of the world. "Data shoogling" seeks to preserve meta-data while still optimizing the Google-ability of a website. I really like the idea presented here, that Google's supremacy may not be eternal, and that losing the depth of information meta-data provides could ultimately prove a terrible loss as technology becomes more sophisticated and the Internet evolves.