I needed to create a normally-distributed sample data set in SQL Server recently. I suspect that there are a few of you out there who need to be able to generate this sort of sample data from time to time.
Using the so-called “Polar Method”, the following code will generate 200,000 values with a sigma of one and a mean of zero. On my laptop running SQL Server 2005 Developer Edition, this only takes a few seconds to run.
set nocount on go drop table #temp create table #temp (x float) go declare @p float, @2p float, @u1 float, @u2 float declare @x float, @y float, @index int select @index = 0 select @p = pi() select @2p = 2.0 * @p while (@index < 100000) begin select @u1 = rand(), @u2 = rand() select @x = sqrt(-2.0 * log(@u1)) * cos(@2p * @u2) select @y = sqrt(-2.0 * log(@u1)) * sin(@2p * @u2) -- @x and @y are now normally distributed around zero (0), -- with a standard deviation of one (1) insert into #temp (x) values (@x) insert into #temp (x) values (@y) select @index = @index + 1 end select 'Average'=avg(x) from #temp select 'Standard Deviation'=stdev(x) from #temp
There are a couple variations that I might post about. One removes the trig functions for better performance, and another wraps this code in a function so you can do different sigma and mean values.
Hope this helps someone!