If Text Then Code

Standardize, Organize, and Finalize: Neil and Ella Take on TEI

December 11, 2016 by Ella Ekstrom

Neil and I were part of the Editorial Staff responsible for TEI markup standards and revision. Our goal was to refine the markup with a focus on the clarity, consistency, and effectiveness of the tags themselves. For example, we found limited use of the tag <roleName> around terms like “Capt” or “Gen”. However, when it carried over to the extract file, it did not include the context of whom it referred to, and it was not used consistently throughout the journal. Because it proved neither clear nor consistent, we dropped it. Instead, we kept the role titles within the <persName> tag. Even so, we still found variation in the spelling of certain names, as well as general ambiguity about who each person referenced was. We therefore collected all the extract data under the <persName> tag, organized it, and researched Civil War records to determine whom Linn was referring to. Then we created TEI IDs, such as #CGS for Captain George Shorkley, to attach more information (birth, death, full name, and position) to the names referenced most often. This helped us settle on the correct spelling of names and gave us a clearer, more in-depth understanding of the context Linn was writing about.
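
A minimal sketch of how <persName> spellings could be gathered and grouped under a shared ID is given here in Python. It is an illustration only, not the process described above, which relied on the extract file and manual research. The sample XML, the variant spelling “Shorkly”, the list of rank abbreviations, and the names normalize, collect_names, and KNOWN_IDS are all assumptions made for the example; only the ID #CGS is taken from the text above.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical sample text; it is not a quotation from the journal.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<p><persName>Capt Shorkley</persName> came by. Later <persName>Capt. Shorkly</persName>
and <persName>Gen Burnside</persName> arrived.</p>
</body></text></TEI>"""

# Hypothetical lookup, built by hand after checking records:
# normalized name variant -> TEI ID.
KNOWN_IDS = {"shorkley": "#CGS", "shorkly": "#CGS"}

# Assumed list of rank abbreviations to ignore when comparing names.
RANKS = {"capt", "gen", "lieut", "col"}


def normalize(name):
    """Lowercase the name, drop punctuation, and drop rank abbreviations."""
    words = re.sub(r"[^\w\s]", "", name).lower().split()
    return " ".join(w for w in words if w not in RANKS)


def collect_names(xml_text):
    """Group every <persName> spelling under its ID, or 'unidentified'."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for el in root.iter(TEI_NS + "persName"):
        spelling = "".join(el.itertext()).strip()
        key = KNOWN_IDS.get(normalize(spelling), "unidentified")
        groups[key].append(spelling)
    return dict(groups)


groups = collect_names(SAMPLE)
for tei_id, spellings in groups.items():
    print(tei_id, spellings)
```

Grouping the variants this way makes it easy to see which spellings still need research: anything left under “unidentified” has no ID yet.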

Going through the text further, we found insufficient use of tags like <name type="event">, used to describe weather or meals, and the <trait> tag, so we decided to remove them given their inconsistency and general inapplicability to the rest of our project. We also decided to add a tag that we saw wasn’t being used, <orgName>, to describe army groups and other organizations, which were otherwise mislabeled under <persName>.

We also found the need to add specificity to certain tags, as we did with the TEI IDs, but this time using the “n” attribute within the tag. For example, to specify certain places, like Roanoke, we used <placeName n="Roanoke Island"> to clarify. We also decided to use n="pers" and n="weather" on the <state> tag, to distinguish descriptions of a person’s emotional or general state from the state of the weather. Finally, for objects, we used <objectType n="boat"> to indicate which were ships or boats.
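
As a rough illustration of the “n” attribute on <state>, the Python sketch below sets n="pers" or n="weather" from a hand-made word list. The sample sentence, the word list CATEGORY, and the function name add_n are assumptions for the example (the TEI namespace is left out for brevity); in practice each decision was an editorial judgment, not a lookup.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the TEI namespace is omitted to keep the example short.
SAMPLE = "<p>The weather is <state>beautiful</state>. I felt <state>weary</state>.</p>"

# Hypothetical editorial decisions: word -> value of the n attribute.
CATEGORY = {"beautiful": "weather", "weary": "pers"}


def add_n(xml_text):
    """Set n on each <state> whose text appears in the CATEGORY list."""
    root = ET.fromstring(xml_text)
    for el in root.iter("state"):
        word = (el.text or "").strip().lower()
        if word in CATEGORY:
            el.set("n", CATEGORY[word])
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


tagged = add_n(SAMPLE)
print(tagged)
```

Words that are not in the list are left untouched, so an editor can still review them by hand.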

After we decided on our overall goal and set the TEI standards, we split up the work accordingly. Neil revised the TEI markup in detail throughout the document, while I wrote the statement explaining our decisions and goals as the staff. I then took the data from the persName extract to determine the identities and correct spellings of those mentioned, using Civil War databases, and created and defined the TEI IDs that Neil then applied to the document. Together we discussed whether certain terms should be included under certain tags, and we each double-checked the spelling and formatting of the document when we finished.

This is a screenshot of the Excel sheet that I used to organize all the data I collected on each identity from different Civil War databases.

After determining which of the names were used most frequently, I created and defined TEI IDs for each, including information like their title, full name, birth, and death.

I found this project to be a culminating experience, in which the whole class worked together to finalize a topic that has been the focus of a large portion of this semester, with each of us using skills learned throughout the semester. My favorite part, however, was knowing that all of our work will go toward something larger than the class itself: a unique history that hasn’t been touched by any institution but our own. This unparalleled history is the key feature of this project, and I am so grateful to be a part of the process.

Filed Under: Prompt, Reflections

Markup Cleaner

December 10, 2016 by Neil Lin

At the beginning, I was more interested in taking on a challenge like CSS, and I thought this might be another chance to get more familiar with that useful tool. In class, however, the Professor got me thinking about standardizing all the markup for the Linn letters. I have to admit that she succeeded, and the idea drew me into another unexplored field of digital humanities.

Faced with 194 pages full of colorful markup, I spent a lot of effort before I even started: battling, struggling, and panicking. To my mind, creating a standard required a full understanding of the journal as a whole, so I tried to read through it to get a broad picture. But just as I was daydreaming about pounding the table, shutting down the computer, and leaving the room in disdain, the Professor gave me another way out by creating an Excel sheet that lists every tag. The rest of my job revolved around this “Excel Guide”.

I mainly focused on the tag <state>, which I believe is a relatively difficult one to conquer. Unlike <persName> or <objectType>, which cover almost nothing but nouns and names, <state> depends more on the individual editor, not only because it deals with adjectives but also because people’s tastes and perspectives vary a lot. For example, “beautiful” often describes the appearance of a person or the look of an object. In context, the text says “the weather is beautiful”, but the editor who tagged it thought it described the author’s mood as a reflection of the weather, given the surrounding context. Another example is “brilliant”, which would normally describe a person’s intelligence but here means “the weather is brilliant”. So I divided all of them into two categories, one for people and one for weather, based predominantly on each word’s most direct meaning in the passage; I tagged both “beautiful” and “brilliant” above as weather. A third type in the <state> column is a word like “crying”. I treated these more as events that illustrate a person’s feelings; in this case, it shows that the author is sad. But to keep everything simple, I deleted all of them decisively and resolutely, like a real leader.

Next, I looked through every markup and corrected them in various ways. The prime principle is “less is more”. Many tags on words like arms, wood, and coffee were deleted because they were not useful, although each of them appears more than 5 times in total. The second rule is “one for all”. The word “house” has different meanings in different situations and can be marked up as an object or a place. But because a noun is not as complex as an adjective, I marked up all of them as <placeName> for consistency. To some extent, this makes a lot of sense. The third rule is “linkup”. The purpose of marking up the Linn letters is to serve researchers, who should be able to grasp the basic idea easily and understand the author’s “idioms” or proper nouns that were introduced earlier. For instance, some people marked “battery” as an object for lack of background, when it ought to be a place. Another example is “Roanoke”, which stands for “Roanoke Island” and is a very frequent word. So “linkup” refers to a second layer of markup that gives a shared category to words with similar meanings, as for all the boats, and it is still amazing to me that “Inquirer” is the name of a boat.

Ella and I both worked really hard on this final project. She focused more on writing the wonderful and precise editorial page and on the tag <persName>. I focused on the rest of the tags. To be honest, in the middle of the project, I did regret what I had chosen. But in the end, everything worked out perfectly. I see myself as a markup cleaner who makes everything organized. Picking only the useful, flammable sticks makes the campfire burn even brighter.

Filed Under: Reflections

Extracting Personality Data from Linn’s Texts

December 9, 2016 by Yash Mittal

The final leg of the Linn project turned out to be more exciting than I anticipated. Jingya and I were tasked with creating one or more visualizations by analyzing Linn’s diary entries or letters. Of course, every visualization needs some underlying data, which in our case was Linn’s personality data. A major part of our challenge was to figure out a way to generate this data from his diary entries. Jingya and I thought we could employ an existing web Application Programming Interface (API). In layman’s terms, an API can be defined as a package of rules which, when given an input, produces the desired output in a structured format. In our scenario, the input was the text from the journal, whereas the output was the personality data associated with the text.

We started off slowly because we could not find a freely-available API which met our needs. We eventually stumbled upon IBM Watson’s Personality Insights Service. As stated on their website, “Personality Insights extracts and analyzes a spectrum of personality attributes to help discover actionable insights about people and entities.” This API was what we needed; however, we were required to use a server-side technology called node.js, which neither of us was familiar with.

My primary task then was to set up a skeleton node.js application. As I soon discovered to my disappointment, node.js, being a server-side technology, does not have access to the text displayed on web pages. So I wrote a script (shown below) to store the text in a form that I could easily feed into my node.js application.

Script (left) added to the Journal website to produce a dictionary (right) mapping each date to its diary entry content

Once I had access to the text in a readily parsable format, I used node.js to make API calls to retrieve the personality data for each of Linn’s diary entries. Initially, I was skeptical about the number of API calls I would be allowed to make with the free-tier subscription. Fortunately, though, I was able to make enough calls to test my application and store the needed data as a JSON file. Jingya is using this data to create a dynamically generated visualization using p5 and/or d3.
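
The overall flow (a date-to-entry dictionary goes in, one call is made per entry, and the results are stored as JSON) can be sketched as follows. The original application was written in node.js; this sketch uses Python purely for illustration. The entry text, the function names, and the file name profiles.json are assumptions, and analyze is a local stand-in that only counts words, not a call to the Personality Insights service.

```python
import json
import os
import tempfile

# Hypothetical date -> entry dictionary; the entry text is invented.
entries = {
    "1862-03-23": "Placeholder text for the first entry.",
    "1862-03-24": "Placeholder text for the second entry.",
}


def analyze(text):
    """Stand-in for the remote personality call; returns a fake 'profile'."""
    return {"word_count": len(text.split())}


def build_profiles(entries, analyze_fn):
    """Call analyze_fn once per entry and key each result by its date."""
    return {date: analyze_fn(text) for date, text in sorted(entries.items())}


def save_json(data, path):
    """Write the collected results to disk so they can be reused later."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)


profiles = build_profiles(entries, analyze)
out_path = os.path.join(tempfile.gettempdir(), "profiles.json")
save_json(profiles, out_path)
```

Saving the results once means the visualization can read the JSON file instead of repeating the calls, which matters when the number of calls is limited.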

node.js script to automate retrieval and storage of personality data for all diary entries

My experience while working on this project was fascinating. I got to learn about a well-established and popular technology, which I did not think I would at the outset of this course. I think I did a fair job of familiarizing myself with node.js and the Personality Insights API. However, I could have saved myself some trouble had I read the Node.js documentation carefully. I tried a few hacks to access the web page content, but could not find a workaround. Overall, I have picked up valuable skills such as transcription, text encoding, and analysis over the course of this project, and I look forward to applying them in my future endeavors.

Filed Under: Reflections

Where’s James Merrill Linn? Mapping Linn in New Bern, NC

December 9, 2016 by Maureen Maclean

My role in this project was to spatially mark up James Merrill Linn’s journal entries from March 23 to April 19, 1862, using ArcGIS Online. Using the transcribed journal entries and historical research, I plotted, on various maps from the Civil War era, the points where Linn mentioned he had gone in each of his journal entries (one point = one journal entry). Most of the points are around New Bern, North Carolina, where Linn spent most of his time (March 23 to April 9 and April 12 to 16, 1862) as part of the Union occupation of the town under General Burnside. On April 17, 1862, Linn and other Union troops left New Bern for South Mills in a (failed) effort to undermine Confederate transportation schemes by destroying the Dismal Swamp Canal locks. The Battle of South Mills (April 19, 1862) was one of only a few Union defeats in Burnside’s Expedition.

Screenshot of my work on ArcGIS

One of the hardest parts of mapping Linn was that he rarely named the places he visited or the streets they are on. For example, he mentions General Burnside’s headquarters as a white house in New Bern but doesn’t explain where it is or what the house was called. To find out which house it was so I could map it, I had to do some research online. I figured out that the house was the Stanly House. However, the house was moved from its original location in the 1960s, and I couldn’t figure out where it was originally located. After going through old land deeds, I eventually found the general area where the house first stood, but I hit another pitfall because the Stanly House isn’t listed on the 1866 map I was plotting the points on. The only building marked where the house first stood was listed on the map as the Washington Hotel. After even more research, I discovered that George Washington stayed at the Stanly House in 1791 and surmised that the map maker was referring to this moment in time when labeling the map.

General Burnside’s headquarters at Stanly House.

My struggle to find Gen. Burnside’s headquarters was probably one of my easier efforts to find a location Linn talks about, because at least it is a physical landmark. For many of the journal entries, Linn just talks about his time at camp, yet he doesn’t give the name of the camp he stays at and offers only hints of its general location. I used his hints, research, and the maps (both the 1866 and 1864 maps) to locate the camps he stayed at to the best of my ability. For example, on April 3, 1862, Linn mentions moving camp to the other side of the river (the two previous camps were on the south bank of the Trent), on the west side of town. Using a map from a newspaper article from that time, I surmised that the camp was located north of the Railroad Depot. Unfortunately, I cannot be entirely sure that the locations I plotted for Linn’s time at the camps, or for any of the points for that matter, are correct. For that reason, my spatial analysis is more of a communication device than a historical record of Linn’s location; the purpose of mapping Linn’s entries is to provide an engaging interface for readers to contextualize Linn’s situation. A large part of this goal is aesthetics and accessibility, two key elements of effective communication. To make the analysis clear and good-looking, I color-coded elements and labeled them. For example, instead of tracking Linn’s travels within New Bern, I used a blue circle to indicate the general area he traversed during the majority of this time and labeled it accordingly. I figured that if I tried to track his moves around the city, as I did with his journeys to South Mills and Pollocksville, the map would be too cluttered and visually overwhelming. Additionally, instead of including the whole transcribed entry for each point, I included only a quote from the entry that related to the location of the pin. I did this because 1) it is cleaner and more concise and 2) it makes the reader want to find out more and thus click the link to read the full entry.
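
The “one point = one journal entry” idea, with a short quote and a link attached to each pin, can be sketched as a small GeoJSON file built in Python. This is an illustration under stated assumptions, not the ArcGIS Online workflow described above: the function names, the coordinates (only roughly in the New Bern area), the quote, and the URL are all placeholders.

```python
import json


def entry_to_feature(date, lon, lat, quote, url):
    """Build one GeoJSON point for one journal entry."""
    return {
        "type": "Feature",
        # GeoJSON stores coordinates as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"date": date, "quote": quote, "url": url},
    }


def build_collection(rows):
    """Wrap all entry points in a single FeatureCollection."""
    return {
        "type": "FeatureCollection",
        "features": [entry_to_feature(*row) for row in rows],
    }


# Hypothetical row: approximate coordinates, placeholder quote and link.
rows = [
    ("1862-04-03", -77.04, 35.11, "placeholder quote",
     "https://example.org/entry/1862-04-03"),
]

collection = build_collection(rows)
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Keeping only a quote and a link in the properties mirrors the choice above: the pin stays uncluttered, and the full entry is one click away.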

Newspaper map I used to help determine where one of the camps Linn stayed at was located. The red circle marks the label of a camp according to the map maker.
Entry in which Linn talks about the camp. I used the context that Linn supplied, along with additional research (like the newspaper map), to determine the location of the camp.

Overall, I am satisfied with the work I did and I think that I was effective in achieving my goals. In terms of relating the process to the other work we did, I think spatial analysis can be considered another form of transcription, since I had to transcribe his words in a spatial fashion. Just as with the actual transcription of Linn’s writings, I had to negotiate between Linn’s actual words and what I thought Linn meant or, in my case, where Linn meant he was. I had to question the accuracy of Linn’s own locations and use historical research to try to determine where he was. As I mentioned previously, my spatial interpretation of Linn’s entries is just that: an interpretation. This relates back to what Pierazzo says about digital editions:

“We should simply say that the notion of objectivity is not very productive or helpful in the case of transcription and subsequently of diplomatic editions and that we should instead make peace with the fact that we are simply doing our works as scholars when transcribing and preparing a diplomatic edition.” (Pierazzo 466).

Just as I chose what to mark up in the various transcription and TEI work we did this semester, I decided which places I thought were most important to my edition and used research as well as Linn’s words to spatially mark up the entries. As Pierazzo says, we have to distance ourselves from the notion of objective truth: our editions will be inherently subjective, and we can’t get stuck on trying to make them “correct.” This is exactly what I did: I adjusted my knowledge and did the best that I could to make a scholarly edition of Linn’s journal entries.

Filed Under: Reflections Tagged With: 1862, arcGIS, burnside, final, Linn, map, maureen, new bern

Linn: “The Early Years” Revisited by Sarah and Julia

December 9, 2016 by Sarah Rosecky

For our final project, we, Julia and Sarah, worked on previously transcribed diary entries from James Merrill Linn’s 1850 diary. Using the Oxygen XML editor, we marked up the diary entries with TEI. Sarah marked up the first half of the diary, while Julia marked up the second half. This was an interesting choice because in this class and HUMN 100 we focused on Linn’s diaries and letters from his time in the Civil War. There was so much to mark up in just the diaries we chose, but we decided to focus on the people. We chose to do this because in his letters and diaries about the Civil War, he did not write about many people very often.

Our work is an interesting addition to the Linn project that the whole class is working on. The majority of the project is on his time fighting in the Civil War, but we wanted to add a different aspect to it. People who look at our website need to realize that this man is not just a Civil War veteran! He grew up in Lewisburg, and graduated from Bucknell. He had a life before he fought in the Civil War. Some people may think that his life during the war was much more interesting, but his life before was intriguing as well.

We extensively marked up his diary of 1850, becoming more and more interested as we dug deeper into the analysis. We noticed a couple of disturbing stories that Linn wrote about, including the murder of a baby. Linn also documented much of his social life, which was an amazing opportunity to learn more about the social life of people our age during the 1850s. One recurring theme that is similar to his Civil War diary is his meticulous attention to the weather and its documentation in the diary. Linn never fails to write about the weather wherever he is. Another aspect that we became more aware of was Linn’s preoccupation with other religions. We are not exactly sure which religion, if any, Linn identified with; however, Linn writes about his experiences with many different religious groups, including Methodist and Presbyterian meetings.

Blurry text showing Thursday 7th
Readable text showing Wednesday 27th

Of course, subsequent “7”s became easier to identify. Another aspect of the project that posed some difficulty was that the original transcribers failed to provide dates for the entries. After some digging into the original documents, we were able to identify the dates for the diary entries. As shown in the image, we were originally using transcribed material that lacked dates and line breaks.

Space indicates where diary entry/ date started

We worked really well together on this project. We are both familiar with James Merrill Linn through our first DH class and each of our independent study projects. We tended to agree on the types of semantic markup that we wanted to do, and our levels of interest in Linn are very similar. We think that our joint markup of Linn’s 1850 diary was a success.

Filed Under: Prompt, Reflections Tagged With: DH, James, james merrill linn, Julia, Linn, markup, Merrill, Oxygen, Sarah, transcribe, XML

HUMN 271

Bertrand 012
TR 9:30-11:20am
Dr. Diane Jakacki

Bucknell University Humanities 271 Course by Diane Jakacki is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
