Rice University’s Student Newspaper — Since 1916

Friday, November 08, 2024 — Houston, TX

AI in the archives: Fondren Library explores new tech

Amy Cao / Thresher

By Amelia Davis     3/5/24 10:07pm

From Grammarly and Quizlet to SparkNotes and Spotify, artificial intelligence is now a major feature of nearly every website, and the archives of Fondren Library are no exception. The use of AI has been a notoriously hot-button topic for the last few years, figuring in debates over artist exploitation and in the terms of the Writers Guild of America strike. In the Woodson Research Center, though, its role has been to make many of the staff’s preservation and transcription processes easier and faster.

In the Woodson Research Center, AI has made possible a new digital collections website, which launched in May 2023. Norie Guthrie, the archivist and special collections librarian, said that Woodson staff previously used DSpace, now the Rice Research Repository. Under that workflow, they would send audio files to an off-campus service for transcription, then edit and upload the returned PDF. With AI, this is expedited: Transcribing hours of audio now takes only 10 minutes, when previously the process had taken days.

“We can push a button and then it creates a transcript. It is very helpful for audio because audio transcription takes time and this makes it so that the process can be a lot faster,” Guthrie said. “It’s a lot more user-friendly, and it’s a lot easier on us so we can be more efficient with our work.”



Now, the most laborious part of the audio transcription process is cleaning up and correcting the completed transcript, for example changing “Weiss” to “Wiess” every time the name is mentioned.
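A recurring correction like this one can, in principle, be scripted. The following is only a minimal sketch, not the Woodson staff's actual workflow: the correction table, the function name `clean_transcript` and the sample sentence are all invented for illustration.

```python
import re

# Hypothetical correction table: misspelling -> preferred spelling.
CORRECTIONS = {"Weiss": "Wiess"}

def clean_transcript(text, corrections=CORRECTIONS):
    """Replace each known misspelling wherever it appears as a whole word."""
    for wrong, right in corrections.items():
        # \b limits the match to whole words, so "Weissman" is left alone.
        text = re.sub(rf"\b{re.escape(wrong)}\b", right, text)
    return text

# Invented sample sentence, not taken from a real transcript.
sample = "I lived at Weiss College. Weiss was my home for four years."
print(clean_transcript(sample))
```

A table like this only catches errors someone has already noticed, so a human still has to read the transcript.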

AI has also been utilized by students in Fondren-associated research projects. Zoe Katz, a Will Rice College senior, applied to one such project through Fondren Fellows, a yearlong library program that provides research opportunities for Rice undergraduate and graduate students.

“I applied because I kind of want to go to grad school and I was interested in seeing what research options they had. They had one about topic modeling, which is a subfield of computational linguistics,” Katz said. “I’m a linguistics and computer science major, so I was really interested in how both of those two topics would overlap.” 

Topic modeling is a type of statistical modeling that uses unsupervised machine learning to identify clusters or groups of similar words within a body of text. This text mining method uses semantic structures in text to understand unstructured data without predefined tags or training data. 
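To make the idea concrete, here is a toy version of latent Dirichlet allocation (LDA), one widely used topic modeling algorithm, written as a collapsed Gibbs sampler in plain Python. The article does not say which method the Fondren project used, so the choice of LDA, the function name `lda_gibbs`, the parameter values and the four tiny sample documents are all assumptions made for illustration.

```python
import random
from collections import Counter

def lda_gibbs(docs, num_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA over tokenized documents.

    Returns one Counter per topic mapping word -> number of assignments.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Start by assigning every word occurrence to a random topic.
    assign = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = assign[d][i]
            doc_topic[d][t] += 1
            topic_word[t][w] += 1
            topic_total[t] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Take this occurrence's current assignment out of the counts.
                t = assign[d][i]
                doc_topic[d][t] -= 1
                topic_word[t][w] -= 1
                topic_total[t] -= 1
                # Weight each topic by how common it is in this document
                # and how strongly it is associated with this word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                t = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = t
                doc_topic[d][t] += 1
                topic_word[t][w] += 1
                topic_total[t] += 1
    return topic_word

# Invented sample documents; no labels or training data are supplied.
docs = [
    "library archive transcript audio archive".split(),
    "audio transcript archive library".split(),
    "election vote poll election campus".split(),
    "poll vote election campus vote".split(),
]
for k, counts in enumerate(lda_gibbs(docs)):
    top = [w for w, c in counts.most_common(3) if c > 0]
    print(f"topic {k}: {top}")
```

The sampler is never told which words belong together. It repeatedly reassigns each word to a topic based on the other words in its document and the words already in that topic, which is what "unsupervised" means here.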

Katz chose to focus her project on using OCR, or optical character recognition, to convert old issues of the Thresher from PDFs into text-searchable versions. OCR recognizes text in scanned documents to convert a physical document into an accessible electronic text version. Due to her experience writing for the Thresher, Katz understood the need for an easily searchable database of past publications both for student journalistic research and historical value. Because the project was only finished early last semester, Fondren Fellows is still in the process of publishing the data, but her work will soon be available for public access. 

Steven Loyd, the processing assistant at Woodson, uses ChatGPT to write Python scripts that perform basic tasks. He combines his knowledge of algorithmic thinking, and his ability to see where a program would be useful, with AI’s code-writing abilities. Often the code he collaborates with ChatGPT on is used to automate chores such as counting, organizing and renaming files and folders based on their contents and other variables, for thousands or tens of thousands of files at a time.
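For a sense of what such a script might look like, the sketch below counts files by extension and renames the files in a folder to a numbered pattern. It is not one of Loyd's actual scripts: the function names, the naming pattern and the example folder path are hypothetical, and it uses only Python's standard library.

```python
from collections import Counter
from pathlib import Path

def count_by_extension(folder):
    """Count files under `folder` (recursively) by lowercase extension."""
    return Counter(
        p.suffix.lower() or "(none)"
        for p in Path(folder).rglob("*")
        if p.is_file()
    )

def rename_with_prefix(folder, prefix, dry_run=True):
    """Rename top-level files to prefix_0001.ext, prefix_0002.ext, ...

    Returns the planned (old, new) pairs. Files are only renamed when
    dry_run is False, so the plan can be reviewed first.
    """
    files = sorted(p for p in Path(folder).iterdir() if p.is_file())
    plan = [
        (p, p.with_name(f"{prefix}_{i:04d}{p.suffix.lower()}"))
        for i, p in enumerate(files, start=1)
    ]
    if not dry_run:
        for old, new in plan:
            # Refuse to overwrite a different file that already has the name.
            if new.exists() and new != old:
                raise FileExistsError(new)
            old.rename(new)
    return plan

if __name__ == "__main__":
    folder = "scans"  # hypothetical folder name
    if Path(folder).is_dir():
        print(count_by_extension(folder))
        for old, new in rename_with_prefix(folder, "item"):
            print(f"{old.name} -> {new.name}")
```

The renaming function defaults to a dry run, so a person can check the planned names before any file is touched.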

“These are tasks that humans could do with enough time, but there is no shortage of things at the Woodson that only humans can do, so having useful code on hand saves time for more challenging, provocative work,” Loyd wrote in an email to the Thresher. “Ultimately, for our purposes at the archive, ChatGPT is a very helpful tool that nonetheless requires significant human input to function usefully … As far as the future goes, I see AI as a technological advancement akin to the internet, new storage formats, etc., that help archivists process materials more efficiently and reliably.”

The Fondren Library Artificial Intelligence Task Force was created to discuss possible limits on AI use where academic judgment is involved, as well as how AI can be better utilized in research and professional life. The task force’s meetings are open to students, who can contribute to the discussions as Rice continues to grapple with the ethical questions surrounding the use of artificial intelligence. These discussions will hopefully help find ways to use AI to support rather than supplant human experience and learning, Loyd said.

“Archives, I think, are ultimately humanistic, requiring [an] informed, passionate understanding of given materials and the cultural context surrounding them,” Loyd wrote. “I don’t foresee archival AI rising above its station as a timesaver to anything with actual responsibility.”


