First Advisor

Mason, Robert T.

Second Advisor

Barnes, Stephen D.

Third Advisor

Plantz-Masters, Shari

College

College for Professional Studies

Degree Name

MS Database Technologies

School

School of Computer & Information Science

Document Type

Thesis - Open Access

Number of Pages

335 pages

Abstract

In this study, I intend to show how text-based unstructured data, such as word processor documents and e-mails, can be systematically parsed for instances of data classes. A data class can be any data type that can be fully described using the defined syntax of regular expressions. Examples include SHA-1 or MD5 hash values, Internet Protocol (IP) addresses, or any other data type whose format is well defined. Furthermore, possible correlations between datasets may be identified by grouping instances of equal or similar values within a data class. Because regular expressions define the search criteria, there is no need to predetermine which keyword to search for.
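The approach described in the abstract can be illustrated with a minimal sketch. It is not the implementation from the thesis: the patterns, the `DATA_CLASSES` table, and the `find_instances` and `correlate` helpers are hypothetical names chosen for illustration, and the sketch groups only equal values (after lowercasing), not similar ones.

```python
import re
from collections import defaultdict

# Hypothetical data classes. Each is defined only by a regular expression,
# so no keyword has to be chosen in advance. The patterns are assumptions.
DATA_CLASSES = {
    "md5": re.compile(r"\b[0-9a-fA-F]{32}\b"),
    "sha1": re.compile(r"\b[0-9a-fA-F]{40}\b"),
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\.){3}"
        r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\b"
    ),
}


def find_instances(text):
    """Return {class_name: set of lowercased matches} for one document."""
    found = {}
    for name, pattern in DATA_CLASSES.items():
        matches = {m.group(0).lower() for m in pattern.finditer(text)}
        if matches:
            found[name] = matches
    return found


def correlate(documents):
    """Map (class_name, value) to the documents sharing that value.

    Only values that occur in two or more documents are kept.
    """
    seen = defaultdict(set)
    for doc_id, text in documents.items():
        for name, values in find_instances(text).items():
            for value in values:
                seen[(name, value)].add(doc_id)
    return {key: docs for key, docs in seen.items() if len(docs) > 1}


# Made-up sample documents standing in for e-mails and word processor files.
documents = {
    "a.txt": "Host 10.0.0.5 sent file d41d8cd98f00b204e9800998ecf8427e",
    "b.eml": "Blocked 10.0.0.5; hash DA39A3EE5E6B4B0D3255BFEF95601890AFD80709",
    "c.doc": "hash D41D8CD98F00B204E9800998ECF8427E seen",
}
shared = correlate(documents)
```

In this made-up sample, `shared` links `a.txt` and `b.eml` through the IP address and `a.txt` and `c.doc` through the MD5 value, while the SHA-1 value appears in only one document and is left out.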

Date of Award

Spring 2012

Location (Creation)

Denver, Colorado

Rights Statement

All content in this Collection is owned by and subject to the exclusive control of Regis University and the authors of the materials. It is available only for research purposes and may not be used in violation of copyright laws or for unlawful purposes. The materials may not be downloaded in whole or in part without permission of the copyright holder or as otherwise authorized in the “fair use” standards of the U.S. copyright laws and regulations.
