Scholarly Horizons: University of Minnesota, Morris Undergraduate Journal

Querying Large Databases


This paper investigates two approaches to improving query times on large relational databases. The first technique capitalizes on what is typically known about a database's structure and properties. It can execute some queries exactly in an amount of time bounded by a constant. When a query cannot be executed exactly in this way, we show how the technique can still be used to drastically lower the query's run time while returning a good approximation of the exact result. We also discuss the complexity of deciding whether a query can be evaluated in this way, both theoretically and practically. The second approach approximates aggregate queries by incorporating only part of the data the query pertains to, rather than all of it. We briefly investigate an established method that samples a random subset of the data, and then a newer method that partially reads every tuple and places deterministic error bounds on the results.
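
To make the second approach concrete, the following is a minimal Python sketch of the two ideas applied to a SUM aggregate. It is an illustration under stated assumptions, not the methods evaluated in the paper: the function names are hypothetical, the column is modeled as an in-memory list of non-negative integers, and "partially reads every tuple" is interpreted here as reading only the high-order bits of each value.

```python
import random


def sample_sum_estimate(values, sample_size, seed=0):
    """Estimate sum(values) from a uniform random sample (hypothetical helper).

    The sample is drawn without replacement and its mean is scaled up to the
    full table size. The result carries only a probabilistic guarantee.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    return len(values) * sum(sample) / sample_size


def truncated_sum_bounds(values, dropped_bits):
    """Bound sum(values) while ignoring the low-order bits of every value.

    Assumption: values are non-negative integers. Every tuple is visited, but
    only its high-order bits are used, so the true sum is guaranteed to lie
    between the returned lower and upper bounds.
    """
    max_lost = (1 << dropped_bits) - 1  # largest amount the unread bits can add
    lower = sum((v >> dropped_bits) << dropped_bits for v in values)
    upper = lower + max_lost * len(values)
    return lower, upper


# Synthetic 16-bit column standing in for a large table.
rng = random.Random(42)
values = [rng.randrange(1 << 16) for _ in range(10_000)]

exact = sum(values)
estimate = sample_sum_estimate(values, sample_size=1_000)
lower, upper = truncated_sum_bounds(values, dropped_bits=8)

print("exact sum:       ", exact)
print("sampled estimate:", round(estimate))
print("bit-slice bounds:", lower, "to", upper)
```

The sampled estimate may land above or below the exact sum and can only be qualified with a confidence level, whereas the truncated read always brackets the exact sum, at the cost of touching every tuple.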