Human-induced earthquakes are becoming an important topic in political and scientific discussions and are now widely reported by the media. It is well known that surface and underground mining, reservoir depletion, and the injection and withdrawal of fluids and gas from the subsurface can induce slip on preexisting faults and potentially trigger earthquakes of various magnitudes. Since 2009, new drilling technologies have given access to oil and gas in previously unproductive geological formations.
The recent increase in earthquakes of all sizes in the central and eastern United States has fueled concern that hydraulic fracturing could be responsible for the rising rate of earthquake activity. Since 2009, the oil industry has expanded its use of hydraulic fracturing because the high price of oil has made the extraction of unconventional oil and gas economically viable. Microearthquakes (magnitude lower than 2 on the Richter magnitude scale, which is determined from the logarithm of the amplitude of waves recorded by seismographs) are routinely generated as part of the hydraulic fracturing process (fracking) used to stimulate oil and gas reservoirs. However, the protocol currently in practice carries a low risk of inducing destructive earthquakes. Of the 100,000 fracking wells drilled so far, the largest earthquake recorded was of magnitude 3.6, which is too small to pose a serious safety risk.
Nevertheless, the correlation between the growth of fracking and the increase in earthquake events has drawn considerable attention from both the scientific community and the media. Ellsworth [2013] argues that some of the seismicity is associated with the increased disposal of saltwater that comes from 'flow-back' water after multistage fracturing operations (see also the National Research Council report, Induced Seismicity Potential in Energy Technologies [2012]). Once the unconventional oil has been extracted along with the contaminated saltwater, the saltwater is reinjected into deeper sedimentary formations with high porosity and permeability via regulated Class II underground injection control (UIC) wells. Sometimes the saline water is instead reinjected as part of water-flooding enhanced oil recovery. The large increase in earthquake activity, however, is thought to be associated with the disposal wells.
Since 2009, Oklahoma has experienced a significant increase in earthquake events. The exponential increase in the number of earthquakes affects the whole range of magnitudes, from the smallest to the largest. Furthermore, the number of saltwater disposal wells in the state has increased dramatically. Sometimes, the sedimentary formation used to store the contaminated saltwater is hydraulically connected to the crystalline bedrock. An increase in pore fluid pressure on preexisting fault surfaces in crystalline rock weakens the fault and can potentially trigger earthquakes. For example, several of the largest earthquakes in the United States in 2011 and 2012 have been triggered by disposal wells. The largest, of magnitude 5.6, occurred in Prague, central Oklahoma, in 2011; it destroyed 12 homes and injured 2 people. However, only a small fraction of the 30,000 disposal wells appears to pose a safety risk. It is therefore important to understand the processes behind injection-induced seismicity. In this project, we examine the question: how well can we predict earthquakes in Oklahoma (the number of earthquakes, the time of the next earthquake, and the magnitude) with the features we have at our disposal?
We obtained information on water disposal wells for the years 2006 to 2012 from a public website. For each well, we looked at the volume of injected water in gallons and the location coordinates. In total, we studied 53,389 wells in the state of Oklahoma. The interactive map below shows the location of the wells, colored and sized by the volume of injected water in gallons and filtered by the date of injection. Scroll to select a year and see the injection sites and volumes for that year.
Using recorded data from 1980 to the present, we analyzed the magnitude and frequency of earthquakes in Oklahoma. The map below shows the earthquake events for each year, colored by magnitude and sized by depth. Scroll to select a year and see the earthquake events for that year. Red events are earthquakes in the selected year, while green events are earthquakes from earlier years. Because our analysis used only earthquakes of magnitude 3 or greater, the default map is filtered by this criterion. Use the slider to select a range of magnitudes.
Notice that there is a substantial rise in seismic events starting in 2006. Because water disposal through wells became more intense just a few years earlier, we suspect an association between the number of earthquakes and fracking activity (specifically, based on past research, the use of disposal wells).
The relationship between earthquake magnitude and frequency can be modeled by the Gutenberg-Richter law, log10(N) = a - b*M, where N is the number of earthquakes of magnitude M or greater and a and b are constants. Typically, in seismically active regions, the constant b is approximately equal to 1. The constant a carries little scientific information. This power-law relationship between event magnitude and frequency of occurrence is remarkably common, although the values of a and b may vary from region to region or over time. We model the seismic activity in Oklahoma before and after 2010.
Using the maximum likelihood method, the estimated value of b is 1.502 for pre-2010 earthquakes and 1.538 for post-2010 earthquakes. The b value after 2010 is slightly larger than that prior to 2010, suggesting a smaller ratio of large-magnitude to small-magnitude earthquakes after 2010 than before. However, after 2010 there are 295 earthquakes per year with magnitude greater than 3, whereas there were only 2 per year prior to 2010.
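
As an illustration of how a maximum likelihood b value can be computed, here is a minimal sketch using the Aki-Utsu estimator, b = log10(e) / (mean(M) - (M_min - dM/2)), for the Gutenberg-Richter law log10(N) = a - b*M. This particular estimator is an assumption, since the text only says "maximum likelihood method". The function name `b_value_mle`, the completeness threshold `m_min`, the bin width, and the synthetic catalog are all hypothetical and are not the project's data.

```python
import numpy as np

def b_value_mle(magnitudes, m_min, bin_width=0.1):
    """Aki-Utsu maximum likelihood estimate of the Gutenberg-Richter b value.

    m_min is the smallest magnitude assumed to be completely recorded;
    bin_width is the rounding interval of the catalog magnitudes
    (use 0.0 for continuous magnitudes).
    """
    mags = np.asarray(magnitudes, dtype=float)
    mags = mags[mags >= m_min]
    if mags.size < 2:
        raise ValueError("need at least two events at or above m_min")
    # Utsu's correction lowers the threshold by half a bin for rounded magnitudes.
    mean_excess = mags.mean() - (m_min - bin_width / 2.0)
    b = np.log10(np.e) / mean_excess
    # Aki's first-order standard error of the estimate.
    b_err = b / np.sqrt(mags.size)
    return b, b_err

# Synthetic catalog (illustrative only): under the Gutenberg-Richter law,
# magnitudes above m_min follow an exponential law with rate b * ln(10).
rng = np.random.default_rng(0)
true_b = 1.5
synthetic = 3.0 + rng.exponential(scale=1.0 / (true_b * np.log(10)), size=5000)

b_hat, b_err = b_value_mle(synthetic, m_min=3.0, bin_width=0.0)
print(f"estimated b = {b_hat:.3f} +/- {b_err:.3f}")
```

Applying the same function separately to the pre-2010 and post-2010 events would give the two b values being compared. The estimate is sensitive to the choice of `m_min`, so the same threshold should be used for both periods.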
We further investigate the change in earthquake frequency. One way of expressing our hypothesis of a difference between post-2010 and pre-2010 is as follows. The cumulative number of earthquakes will, of course, increase over time, and a reasonable first guess is that it increases linearly. If there is a real difference between pre-2010 and post-2010, we would expect that fitting lines to the two groups of data separately would produce lines with different slopes. We investigate this in the visualization below. We fit a line to the pre-2010 data and smooth the points for the post-2010 data. We continue the pre-2010 line in the background for comparison.
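
The slope comparison described above can be sketched as follows. This is only an illustration on synthetic event times, not the project's catalog; the helper `cumulative_slope` and the event counts used here are hypothetical.

```python
import numpy as np

def cumulative_slope(event_years):
    """Least-squares line through the cumulative event count over time.

    Returns (slope, intercept); the slope is in events per year.
    """
    t = np.sort(np.asarray(event_years, dtype=float))
    cumulative = np.arange(1, t.size + 1)
    slope, intercept = np.polyfit(t, cumulative, deg=1)
    return slope, intercept

# Synthetic event times in decimal years (illustrative values only).
rng = np.random.default_rng(1)
pre = rng.uniform(1980.0, 2010.0, size=60)     # sparse activity
post = rng.uniform(2010.0, 2015.0, size=1500)  # dense activity

pre_slope, _ = cumulative_slope(pre)
post_slope, _ = cumulative_slope(post)
print(f"pre-2010 slope:  {pre_slope:.1f} events/year")
print(f"post-2010 slope: {post_slope:.1f} events/year")
```

If the rate were unchanged, the two slopes would be similar, and extending the pre-2010 line forward would track the later points.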
The visualization indeed suggests that the line for the pre-2010 data does not fit the post-2010 data very well. We perform a two-sample Welch’s t-test on the log of the annual rate of earthquakes (since the rates are severely skewed to the right) between pre-2010 and post-2010 and obtain a p-value of 0.0003. In addition, we performed a bootstrap test that yielded an even smaller p-value. Thus, the evidence of a higher annual frequency of earthquakes after 2010 is overwhelming.
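
A minimal sketch of the two tests, assuming SciPy is available. The annual rates below are made-up illustrative numbers, not the project's data, and the bootstrap scheme (resampling after shifting both groups to a common mean) is one common choice rather than necessarily the one used here.

```python
import numpy as np
from scipy import stats

def welch_and_bootstrap(pre_rates, post_rates, n_boot=10000, seed=0):
    """Compare log annual rates with Welch's t-test and a bootstrap test.

    Rates must be strictly positive, because the test works on their logs.
    """
    pre = np.log(np.asarray(pre_rates, dtype=float))
    post = np.log(np.asarray(post_rates, dtype=float))

    # Welch's t-test: two-sample t-test without assuming equal variances.
    t_stat, p_welch = stats.ttest_ind(post, pre, equal_var=False)

    # Bootstrap under the null hypothesis: shift both groups to the pooled
    # mean, resample each with replacement, and compare mean differences.
    observed = post.mean() - pre.mean()
    pooled_mean = np.concatenate([pre, post]).mean()
    pre0 = pre - pre.mean() + pooled_mean
    post0 = post - post.mean() + pooled_mean
    rng = np.random.default_rng(seed)
    boot_post = rng.choice(post0, size=(n_boot, post0.size)).mean(axis=1)
    boot_pre = rng.choice(pre0, size=(n_boot, pre0.size)).mean(axis=1)
    diffs = boot_post - boot_pre
    # Add-one correction keeps the p-value strictly positive.
    p_boot = (np.sum(np.abs(diffs) >= abs(observed)) + 1) / (n_boot + 1)
    return t_stat, p_welch, p_boot

# Illustrative annual earthquake counts (hypothetical values).
pre_rates = [1, 2, 3, 1, 2, 4, 2, 1, 3, 2]
post_rates = [40, 60, 110, 580, 900]

t_stat, p_welch, p_boot = welch_and_bootstrap(pre_rates, post_rates)
print(f"Welch t = {t_stat:.2f}, p = {p_welch:.4f}, bootstrap p = {p_boot:.4f}")
```

The smallest p-value the bootstrap can report is 1 / (n_boot + 1), so a bootstrap p-value below a given level needs enough resamples to resolve it.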
We build regression models to see whether the number of earthquakes is associated with the number of wells present and the amount of reinjected fluid. First, we divide Oklahoma into a square grid, and for each grid cell we count the number of earthquakes, the number of wells, and the total volume of injected water. We then regress the number of earthquakes on the number of wells and the volume of injected water using different algorithms. Our baseline model, a linear regression of the number of earthquakes on the number of wells and the volume, did not prove to be effective, yielding an R-squared value of 0. Additionally, we perform three types of advanced regression, which we call ridge regression, ridge regression with interarrival, and ridge regression with cluster. For more details on these three algorithms, please visit our GitHub site. Below, we compare the performance of these three algorithms.
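
The gridding step can be sketched with NumPy's `histogram2d`. This is an illustrative sketch under stated assumptions: the bounding box is only a rough approximation of Oklahoma, the helper `grid_counts` is hypothetical, and the well and earthquake locations are random synthetic points.

```python
import numpy as np

def grid_counts(lon, lat, cell_deg, bounds, weights=None):
    """Count points (or sum weights) in each square cell of a lon/lat grid.

    bounds is (lon_min, lon_max, lat_min, lat_max); points outside it are
    ignored. Returns a flat array with one value per grid cell.
    """
    lon_min, lon_max, lat_min, lat_max = bounds
    # Edges extend far enough that the last cell covers the upper bound.
    lon_edges = np.arange(lon_min, lon_max + cell_deg, cell_deg)
    lat_edges = np.arange(lat_min, lat_max + cell_deg, cell_deg)
    counts, _, _ = np.histogram2d(
        lon, lat, bins=[lon_edges, lat_edges], weights=weights
    )
    return counts.ravel()

# Rough bounding box for Oklahoma (an assumption for illustration).
bounds = (-103.0, -94.0, 33.5, 37.0)
cell_deg = 1.5

# Synthetic wells and earthquakes (not real data).
rng = np.random.default_rng(2)
well_lon = rng.uniform(-103.0, -94.0, size=500)
well_lat = rng.uniform(33.5, 37.0, size=500)
well_vol = rng.lognormal(mean=10.0, sigma=1.0, size=500)  # gallons
quake_lon = rng.uniform(-103.0, -94.0, size=200)
quake_lat = rng.uniform(33.5, 37.0, size=200)

n_wells = grid_counts(well_lon, well_lat, cell_deg, bounds)
volume = grid_counts(well_lon, well_lat, cell_deg, bounds, weights=well_vol)
n_quakes = grid_counts(quake_lon, quake_lat, cell_deg, bounds)

# One row per grid cell: predictors X and response y for the regression.
X = np.column_stack([n_wells, volume])
y = n_quakes
print(X.shape, y.shape)
```

Changing `cell_deg` regenerates the per-cell table at a different resolution, which is what makes it possible to search over grid sizes.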
The simplest of these three models, ridge regression (Tikhonov regularization), yields the highest R-squared value. We ran this model over a series of grid sizes; the optimal grid area was 1.5 degrees squared for the events after 2010 and 2 degrees squared for the events before 2010. The model yielded a negative coefficient for the number of wells and a positive coefficient for the volume of injected water.
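
For readers unfamiliar with ridge regression, here is a minimal closed-form sketch in NumPy. It is not the project's implementation (which is on the GitHub site): the function `ridge_fit`, the penalty `alpha`, and the synthetic per-cell data, including the coefficient signs built into it, are assumptions chosen only to show the mechanics.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Ridge (Tikhonov-regularized) regression on standardized features.

    Solves (Xs^T Xs + alpha * I) coef = Xs^T (y - mean(y)), leaving the
    intercept unpenalized. Assumes no feature column is constant.
    Returns the standardized coefficients and the in-sample R-squared.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    y_mean = y.mean()
    gram = Xs.T @ Xs + alpha * np.eye(Xs.shape[1])
    coef = np.linalg.solve(gram, Xs.T @ (y - y_mean))
    pred = y_mean + Xs @ coef
    r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y_mean) ** 2)
    return coef, r2

# Synthetic per-cell data (illustrative only): two correlated predictors
# standing in for the number of wells and the injected volume.
rng = np.random.default_rng(3)
n_cells = 200
wells = rng.normal(size=n_cells)
volume = 0.6 * wells + 0.8 * rng.normal(size=n_cells)
quakes = -1.0 * wells + 3.0 * volume + 0.5 * rng.normal(size=n_cells)

coef, r2 = ridge_fit(np.column_stack([wells, volume]), quakes, alpha=1.0)
print(f"coef (wells, volume) = {coef.round(2)}, R-squared = {r2:.3f}")
```

Because the predictors are correlated, the penalty `alpha` stabilizes the coefficients at the cost of shrinking them toward zero. In a grid-size search, this fit would be repeated for each candidate cell size and the resulting R-squared values compared.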
The ridge regression algorithm has proven effective in modeling the relationship between earthquake events and the number of wells together with the volume of injected water. Given the poor state of our data, especially the wells data, in which many wells had missing years, we are satisfied with the performance of this model. The coefficient of the number of wells is negative because, when the total volume of injected water in an area is held constant, more wells means that the pressure at each well is lower than in another area with the same volume of injected water but fewer wells. We conclude that the number of earthquakes is in fact associated with the volume of water injected by these wells. Surprisingly, the grid sizes had to be fairly large (120.75 square km) to yield the optimal model. We recommend applying ridge regression in future studies of seismic events.