Mason, Robert T.
Barnes, Stephen D.
College for Professional Studies
MS Database Technologies
School of Computer & Information Science
Thesis - Open Access
In this study, I intend to show how text-based unstructured data, such as word processor documents and e-mails, can be systematically parsed for instances of data classes. A data class is any data type that can be fully described using the defined syntax of regular expressions. Examples include SHA1 or MD5 hash values, Internet Protocol (IP) addresses, or any other data type whose format is well defined. Furthermore, possible correlations between datasets may be identified by grouping instances of equal or similar values within a data class. Because regular expressions define the search criteria, there is no need to predetermine which keywords to search for.
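The approach described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's own implementation: the pattern definitions, the names `DATA_CLASSES`, `extract_instances`, and `correlate`, and the sample document names are all hypothetical, and only exact matches (after lowercasing) are grouped, so "similar" values are not handled here.

```python
import re
from collections import defaultdict

# Hypothetical data-class definitions: each class is described purely by a
# regular expression. These patterns are illustrative assumptions.
DATA_CLASSES = {
    "md5": re.compile(r"\b[0-9a-fA-F]{32}\b"),
    "sha1": re.compile(r"\b[0-9a-fA-F]{40}\b"),
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
}


def extract_instances(text):
    """Return {class_name: set of lowercased matches} for one document."""
    found = {}
    for name, pattern in DATA_CLASSES.items():
        matches = {m.group(0).lower() for m in pattern.finditer(text)}
        if matches:
            found[name] = matches
    return found


def correlate(documents):
    """Group equal values within each data class across documents.

    Returns {(class_name, value): sorted list of document ids}, keeping only
    values that occur in two or more documents.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for name, values in extract_instances(text).items():
            for value in values:
                index[(name, value)].add(doc_id)
    return {key: sorted(ids) for key, ids in index.items() if len(ids) > 1}


# Hypothetical sample documents standing in for unstructured case data.
documents = {
    "case_a.txt": "Connection from 192.168.1.10, file hash D41D8CD98F00B204E9800998ECF8427E.",
    "case_b.eml": "Seen d41d8cd98f00b204e9800998ecf8427e on host 10.0.0.5",
    "case_c.txt": "da39a3ee5e6b4b0d3255bfef95601890afd80709 from 192.168.1.10",
}
links = correlate(documents)
```

In this sketch, no keyword is chosen in advance: the shared MD5 value links `case_a.txt` and `case_b.eml`, and the shared IP address links `case_a.txt` and `case_c.txt`, simply because both values fit a defined data class.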
© Alberto Lombardo
All content in this Collection is owned by and subject to the exclusive control of Regis University and the authors of the materials. It is available only for research purposes and may not be used in violation of copyright laws or for unlawful purposes. The materials may not be downloaded in whole or in part without permission of the copyright holder or as otherwise authorized in the “fair use” standards of the U.S. copyright laws and regulations.
Lombardo, Alberto D., "Development of Database and Software Modules to Identify Case Relationships Using Unstructured Data" (2012). All Regis University Theses. 625.