Information Analysis Technologies

Emerald employs a variety of tools for the analysis and classification of data. Two of these tools serve as the backbone for many of our needs.

Scan Bot:

Emerald's Scan Bot analyzes web sites on multiple levels. It uses modular engines that can be activated individually, so that a conclusion can be reached based on whichever criteria are necessary. We currently employ a Regular Expression engine, a Script engine (which dynamically loads C# script files), a Keyword engine, and a Link analysis engine. Each engine is capable of reaching its own decisions using its internal scoring algorithms. The Scan Bot then uses its own core Result engine to determine the correct category from all the gathered information. The snap-in modular design allows us to rapidly develop custom engines for our clients without ever changing the Scan Bot itself or the other modules, which makes us extremely flexible. Engines currently in development include an OCR engine and an ActiveX control analysis engine. The results of this process can be stored in flat text files, XML files, a SQL database, or an xBase database.
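The engine arrangement described above can be illustrated with a minimal sketch. It is not Emerald's implementation: the names Engine, KeywordEngine, RegexEngine, and classify are hypothetical, and summing per-category scores is only an assumed stand-in for whatever the Result engine actually does.

```python
import re
from collections import defaultdict
from typing import Dict, List, Protocol


class Engine(Protocol):
    """Hypothetical plug-in interface: an engine scores a page per category."""

    def score(self, page_text: str) -> Dict[str, float]: ...


class KeywordEngine:
    """Counts occurrences of category keywords (assumed scoring rule)."""

    def __init__(self, keywords: Dict[str, List[str]]):
        self.keywords = keywords

    def score(self, page_text: str) -> Dict[str, float]:
        words = re.findall(r"[a-z0-9]+", page_text.lower())
        return {
            category: float(sum(words.count(k) for k in kws))
            for category, kws in self.keywords.items()
        }


class RegexEngine:
    """Counts regular expression matches per category (assumed scoring rule)."""

    def __init__(self, patterns: Dict[str, str]):
        self.patterns = {c: re.compile(p, re.IGNORECASE) for c, p in patterns.items()}

    def score(self, page_text: str) -> Dict[str, float]:
        return {c: float(len(rx.findall(page_text))) for c, rx in self.patterns.items()}


def classify(page_text: str, engines: List[Engine], default: str = "uncategorized") -> str:
    """Stand-in for the Result engine: sum every engine's scores, pick the top category."""
    totals: Dict[str, float] = defaultdict(float)
    for engine in engines:
        for category, value in engine.score(page_text).items():
            totals[category] += value
    if not totals or max(totals.values()) <= 0:
        return default
    return max(totals, key=totals.get)


engines = [
    KeywordEngine({"sports": ["football", "score"], "finance": ["stock", "bank"]}),
    RegexEngine({"finance": r"\$\d+"}),
]
label = classify("The bank raised its stock price to $25 and $30", engines)
```

Because classify only relies on the score method, a new engine can be added to the list without touching the others, which is the property the snap-in design is meant to provide.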

Site Review Tool:

The Site Review Tool (SRT for short) is used by technicians to review and classify sites when automated means cannot. The SRT is a self-contained web browser and voting tool. Users load a site list from disk or from an XML web service, then view each site one at a time and vote on its category. The tools in this application include the ability to automatically view the source of the page being displayed, the links the page references, and any information found by the Scan Bot. The tool runs on the individual workstations of the users classifying sites. Sites categorized with the SRT are transmitted back to a server, where the results are stored. The server-side component of this software assigns each technician a level of trust. This trust level is obtained by assigning the same sites to multiple technicians early in their usage history and verifying that they reach a consensus. Until a technician's trust level is high enough, any site they categorize must also be categorized by at least one other technician, or by as many as three, depending on each of their trust levels. If a consensus is not reached, the site is set for review by a technician in the top tier of the trust system.
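The trust rule above can be sketched as follows. This is an illustration under stated assumptions, not the server's actual logic: the function names, the 0 to 1 trust scale, the threshold values, and the choice to treat consensus as unanimous agreement are all hypothetical.

```python
from typing import List, Optional, Tuple


def extra_reviews_needed(trust: float) -> int:
    """Map a trust level in [0, 1] to 0-3 additional reviewers (assumed thresholds)."""
    if trust >= 0.9:
        return 0
    if trust >= 0.7:
        return 1
    if trust >= 0.4:
        return 2
    return 3


def resolve(votes: List[str], first_voter_trust: float) -> Tuple[str, Optional[str]]:
    """Decide what happens to a site given the category votes collected so far.

    Returns ("pending", None) while more votes are required,
    ("accepted", category) when all voters agree (assumed meaning of consensus),
    and ("escalate", None) when the site should go to a top-tier technician.
    """
    required = 1 + extra_reviews_needed(first_voter_trust)
    if len(votes) < required:
        return ("pending", None)
    if len(set(votes)) == 1:
        return ("accepted", votes[0])
    return ("escalate", None)
```

For example, under these assumed thresholds a single vote from a technician with trust 0.95 is accepted immediately, while a vote from a technician with trust 0.75 stays pending until one more technician has categorized the same site.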

Uncomplicated solutions for categorized URLs

Technology at work

We believe that technologies that are easy to deploy and manage are essential to our partners' success.