NoraTop
The so-called WeScience0 project is a preparatory joint initiative with the [http://www.ub.uit.no/wiki/openaccess/index.php/NORA Norwegian Open Research Archives] (NORA) and the UiO [http://www.usit.uio.no/suf/ds/ Center for Information Technology] (USIT); the project is partially funded by NORA in 2009 and forms part of the larger [http://www.delph-in.net/wescience WeScience] initiative coordinated by the UiO [http://www.ifi.uio.no/research/groups/lns/lt.html Language Technology Group]. Some general motivation for the project and a preliminary project plan are available through the original [http://www.emmtee.net/nora/nora.20-apr-09.pdf project proposal]. Related initiatives include the [http://acl-arc.comp.nus.edu.sg/ ACL Anthology Reference Corpus], the [http://hylap.dfki.de/ HyLaP] project at DFKI Saarbrücken, and the UK [http://www.intute.ac.uk/irs Intute Repository Search].
This page and the other NORA sub-pages, at least as of August 2009, primarily serve for project-internal communication. Access to these pages is limited to registered wiki users, using the exact user name registered on the NoraGroup page. Please contact StephanOepen if you want additional NORA pages to be created, experience difficulties reading or editing these pages, or need assistance with wiki usage more generally.
The WeScience0 effort can be sub-divided by basic processing tasks. These are (a) PDF inspection (NoraInspection), (b) text extraction (NoraExtraction), (c) language identification (NoraIdentification), (d) text correction (NoraCorrection), (e) sentence boundary detection (NoraSegmentation), and (f) interfacing to the [http://lucene.apache.org/java/docs/ Lucene] search engine (NoraLucene). Please view the individual pages for details and the current state of play.