Title: Statistical significance of large-scale regression models utilized in genome-wide association studies

Abstract: The model underlying genome-wide association studies is typically a large-scale regression model, where the trait of interest is modeled as a linear combination of many genetic markers. The goal is then to identify which genetic markers have non-zero coefficients in the regression model. Two complications of this process are that (1) the model is severely underdetermined, since the number of markers greatly exceeds the number of observed individuals, and (2) there is strong dependence among the markers and the error term due to population and environmental structure. I will present a new statistical significance test of the coefficients in this model, which is proven theoretically and shown in practice to provide a valid hypothesis test. The test involves a set of parameters that can be directly estimated from large-scale genotyping data. The resulting method, called the 'genotype-conditional association test' (GCATest), is shown to provide accurate association tests in populations with complex structure, manifested in both the genetic and non-genetic contributions to the trait. I will show results of the method applied to real and simulated data.
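
For orientation, the regression setup described in the abstract can be sketched in symbols. The notation here is an assumption introduced for illustration and is not taken from the talk: y_j is the trait of individual j, x_{ij} is the genotype of marker i in individual j, beta_i is the coefficient of marker i, epsilon_j is the error term, n is the number of individuals, and m is the number of markers.

```latex
% Illustrative notation only (assumed, not from the source).
% Trait of individual j as a linear combination of m genetic markers:
y_j = \alpha + \sum_{i=1}^{m} \beta_i x_{ij} + \varepsilon_j,
\qquad j = 1, \dots, n, \qquad m \gg n .

% Association test for marker i: is its coefficient non-zero?
H_0 : \beta_i = 0 \quad \text{versus} \quad H_1 : \beta_i \neq 0 .
```

In this notation, complication (1) is the condition m much larger than n, and complication (2) is that the x_{ij} and epsilon_j are not independent, because both reflect population and environmental structure.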