Tuesday, 27 September 2016

Getting started

Latest Progress

Yesterday, Leonel and I met with Dominik Seliner. He gave me a detailed look at his code, explained the different parts of EggShell, and pointed out what to bear in mind when attempting to get EggShell to run (how to install it, work with Moose 5.1, use the configuration object to load the software, etc.).
Today, Leonel and I discussed possible directions for this project and decided that, for now, the main goals are implementing a domain-specific language (DSL) and extending the data model. The project's exact specifications will evolve over its course.

Next Steps

There are a couple of things I want to do in order to get started with my project:
  • Read and work through Pharo by Example and Deep Into Pharo (or at least as much of it as is needed to start my work)
  • Get EggShell up and running on either Linux or OS X, explore and understand the code
  • Set up a stable development environment
  • Read Dominik's thesis and the papers about CommunityExplorer and header metadata extraction
  • Possibly try to get Dominik's modified Xpdf version to run on Windows
  • Possibly search for further relevant papers
  • Possibly get familiar with Roassal
Once everything is up and running, my first priority is going to be the DSL. The goal is to provide methods to query the data model.

Current Project Outlook

As a first step, my project will pick up where Dominik's work left off. I'll attempt to equip EggShell with a domain-specific language (DSL) to query the data model, so that in a later step questions about scientific communities can be answered visually. Another goal is to extend the data model and structure recovery to extract more parts of the PDFs. Whether I'm going to stick with EggShell permanently, re-implement certain parts of it, or develop an entirely new tool remains to be seen.

Likely challenges

  • The modified Xpdf binaries are only available for OS X and Linux, and only the ones for OS X have been tested recently. I'll have to decide which platform I want to work on: either use Linux or OS X, or try to get the binaries to run on Windows.
  • Since I don't yet have any experience with Pharo or Moose, it might take me some time to get comfortable with them. This will lengthen the process of understanding EggShell well enough to start working with it. I'll have to make sure not to get caught up in less important details of Pharo in the beginning, but still cover as much of it as possible over the course of my project.
