
Implementation

Page history last edited by Reeta 14 years, 1 month ago

This is the front page for implementation-related issues in the TamBiC project:

 

  • Overview - the three main components of the TamBiC search engine.

 

 

  • Corpus database design - describes the structure of the tables used in the corpus database and how to set up those tables.

 

  • SW project structure - describes the directory structure of the project, and how to set up Eclipse with SVN.

 

 

---------------------------------------------------------------------------------------------------------------------------------------------------------------------

 

 Overview of the TamBiC system

 

The TamBiC system consists of 3 parts:

 

1.  SQL database holding the contents of the two corpora. Features: 

  • Provides efficient methods for searches based on string pattern matching
  • Provides easy methods for inserting and editing the corpora contents
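
As a rough illustration of the pattern-matching searches mentioned above, the following minimal sketch maps a `*` wildcard onto SQL `LIKE`. Everything in it is an assumption made for illustration: the `sentences` table, its columns, the sample rows, the use of SQLite, and the helper names `wildcard_to_like`, `build_demo_db` and `search_sentences` are hypothetical and are not taken from the actual TamBiC corpus database design.

```python
import sqlite3


def wildcard_to_like(term: str) -> str:
    """Translate a user term with '*' wildcards into a SQL LIKE pattern."""
    # Escape the LIKE metacharacters first, then map '*' to '%'.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return escaped.replace("*", "%")


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical sentence table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sentences ("
        "id INTEGER PRIMARY KEY, text_code TEXT, lang TEXT, sentence TEXT)"
    )
    conn.executemany(
        "INSERT INTO sentences (text_code, lang, sentence) VALUES (?, ?, ?)",
        [
            ("T1", "en", "The children play in the garden."),
            ("T1", "fi", "Lapset leikkivät puutarhassa."),
            ("T2", "en", "She displayed the results."),
        ],
    )
    return conn


def search_sentences(conn: sqlite3.Connection, lang: str, term: str) -> list[str]:
    """Return sentences in one language that contain the term as a substring."""
    pattern = "%" + wildcard_to_like(term) + "%"
    rows = conn.execute(
        "SELECT sentence FROM sentences "
        "WHERE lang = ? AND sentence LIKE ? ESCAPE '\\' ORDER BY id",
        (lang, pattern),
    )
    return [row[0] for row in rows]
```

Note that `LIKE` matches substrings, so a search for `play` in this sketch also returns the sentence containing "displayed"; whole-word matching would need additional logic on top of the query. SQLite's `LIKE` is also case-insensitive for ASCII letters by default, which may differ in other database engines.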

 

2.  Administrative web application for developing the database. Features: 

  • Web form for making searches on rows (by text code)
  • Web form for editing and deleting rows
  • Web form for adding new data to the database
  • Web form for managing administrative user accounts

 

3.  Web application for searching the corpora. Features:

  • Basic searching, where the user selects a corpus and a language (English/Finnish)
  • Making a new search within your current search results (i.e. refining results) 
  • Versatile search options: 
    • Word matching with or without wildcards (e.g. play, play*, *play, *play*, …)
    • Phrase matching with or without wildcards (e.g. “the one”, “the *ones”, “the * one”, …)
    • Case sensitivity, which can be turned on or off (e.g. cat, Cat, CAT, …)
    • AND, OR and NOT syntax (e.g. cat AND dog, cat NOT dog, cat OR dog, …)
    • Showing more context, which displays the next and previous sentence in the text
    • THEN syntax, which finds sentences where the words occur in the given order
  • Graphical UI, where the results are shown with hit counts and the matching words in bold, so that the results are easy to verify
  • Having several searches open at the same time, since each set of results opens in a new tab
  • Saving results (in an .rtf file), with or without the sentence context
  • Printing results (from a new .html page), with or without the sentence context
  • “Help” page, which contains the user manual on how to use the system
  • “About TamBiC” page, where the user can find more information, e.g. on the text sources.
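
To make the search options listed above more concrete, here is a minimal sketch of how wildcard word matching, case sensitivity, the AND/OR/NOT/THEN operators and the bolding of hits could behave on a single sentence. It is not the TamBiC implementation: the function names are hypothetical, `*` is assumed to stand for zero or more word characters, and `<b>` tags are assumed as the bolding markup.

```python
import re


def word_pattern(term: str, case_sensitive: bool = False) -> re.Pattern:
    """Compile a wildcard term such as 'play*' into a whole-word regex."""
    # Assumption: '*' stands for zero or more word characters.
    body = r"\w*".join(re.escape(part) for part in term.split("*"))
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(r"\b" + body + r"\b", flags)


def matches_and(sentence: str, terms: list[str]) -> bool:
    """cat AND dog: every term occurs in the sentence."""
    return all(word_pattern(t).search(sentence) for t in terms)


def matches_or(sentence: str, terms: list[str]) -> bool:
    """cat OR dog: at least one term occurs in the sentence."""
    return any(word_pattern(t).search(sentence) for t in terms)


def matches_not(sentence: str, wanted: str, unwanted: str) -> bool:
    """cat NOT dog: the first term occurs and the second does not."""
    return bool(word_pattern(wanted).search(sentence)) and not word_pattern(
        unwanted
    ).search(sentence)


def matches_then(sentence: str, terms: list[str]) -> bool:
    """cat THEN dog: the terms occur in the given order."""
    pos = 0
    for term in terms:
        match = word_pattern(term).search(sentence, pos)
        if match is None:
            return False
        pos = match.end()  # the next term must start after this hit
    return True


def highlight(sentence: str, term: str) -> tuple[str, int]:
    """Wrap each hit in <b>...</b> and return the new text with the hit count."""
    return word_pattern(term).subn(lambda m: "<b>" + m.group(0) + "</b>", sentence)
```

For example, under these assumptions `word_pattern("play")` does not match "playing", while `word_pattern("play*")` does, and `matches_then("The cat chased the dog", ["dog", "cat"])` is false because the words appear in the opposite order.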

 


 
