GitHub - lauraccunningham/homework-02: recent-quakes

lauraccunningham/homework-02

homework-02

Due October 21, 2013

Recent Quakes

Vertical Group, #2

Name ( GitHub Account Homepage )

Step-by-Step Instructions for Homework-02

GOAL

Using USGS's earthquake data, we will visualize the earthquakes of a specified time frame by examining the location, magnitude, and depth of each event. We have combined the roles of Curator and Producer into a singular analyzer of data and producer of a simple rubric for developing the data. A data frame will then be examined and recreated as a visualization by the Entrepreneur & Integrator.

Any questions we may have can be found here.

Parameters

  • Earthquake Information: Location, Magnitude, & Depth
  • Group Roles: Curators & Visualizers

Earthquake data can be found here; at the right of that page, the appropriate feed can be chosen for this assignment. For our purposes, we will be choosing the data for all earthquakes in the past 30 days.

To Begin, Preliminary Steps

You will first need to open your Virtual Machine. Log in, and get it running! From there, type the following command to make sure that you can run iPython Notebook and that all necessary packages are installed.

sudo apt-get install libgeos-3.3.3 python-mpltoolkits.basemap python-mpltoolkits.basemap-data python-mpltoolkits.basemap-doc python-pandas

Create a new directory within your virtual machine; this is where the files for homework-02 will be stored. Enter this directory, and from there we will open iPython Notebook.

mkdir homework-02
ls
cd homework-02

ipython notebook --ip=0.0.0.0 --no-browser --script --pylab=inline

From here we will be able to run and operate our iPython Notebook. In your browser, open 127.0.0.1:7777 to see your iPython Dashboard. You should now be all set to begin, in the correct location for homework-02, and able to create a new data file within the homework-02 folder.

Curation, Analysis, & Visualization

Within iPython Notebook we curated our data and provided a visual representation in our data.ipynb file. (We have also included a more partitioned version of the curation process in code here as data2.ipynb.) We imported various packages, and data from URLs associated with the earthquake data, to produce a data frame with correct and accurate information.

With our data frame, we produced various forms of visual representations that identify location, depth, and magnitude of earthquakes in various regions.

To run our data on your machine, you can open our data.ipynb or data2.ipynb after git cloning our repository to your virtual machine:

git clone https://github.com/lauraccunningham/homework-02.git

This command copies the repository to your machine. Change into the homework-02 directory and start the notebook server with the ipython notebook --ip=0.0.0.0 --no-browser --script --pylab=inline command. Opening 127.0.0.1:7777 in your browser will bring up iPython Notebook, and you can continue working. There is no need to create a directory this way, since creating the directory was only part of writing the code initially.

Our Final Project: Repository & Notebook Viewer Version 1 or 2

==========

Final Projects & Presentations

==========

Objective

The first objective of this assignment is to improve the data handling of the code by upgrading from the deprecated data source to the new data source which uses a different data format called JSON.

Since we are working with live data we also need to cache the data so that we can reliably re-run the code using the same data (or optionally with the live data).

The second objective is that we would like to be able to see data for earthquakes in states other than Alaska, so the next part of the assignment requires refactoring the code to parameterize the function definition instead of relying on the hard-coded values for latitude and longitude of the bounding box around the region of interest.

This assignment features two main roles: the Data Curator and the Visualizer. All 4 members of your vertical group should work together no matter what individual roles you have assigned.

Your task

  1. Data Curation

The USGS eqa7day-M1.txt data url used in this program has been deprecated.

As suggested by the warning message in the data feed:

This USGS data file has been deprecated. To continue receiving
updates for earthquake information you must switch to the new data
format [http://earthquake.usgs.gov/earthquakes/feed/]. In the
future, data feeds will be updated and deprecated following our
official deprecation policy
[http://earthquake.usgs.gov/earthquakes/feed/policy.php].

you need to upgrade this program to use the new USGS data feed:

http://earthquake.usgs.gov/earthquakes/feed/

The new data feed includes a link for Programmatic Access. You should use the pandas JSON parser to read the data instead of the read_csv function in the original code.
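As a rough sketch of what reading the new format involves, the snippet below flattens a GeoJSON document into one row per earthquake. It assumes the feed follows the GeoJSON layout (a `features` list whose items carry `properties.mag`, `properties.place`, and `geometry.coordinates` ordered as longitude, latitude, depth). The function name `features_to_rows` and the sample record are made up for illustration, and only the standard library is used so the idea can be tried without pandas.

```python
import json


def features_to_rows(geojson):
    """Flatten a GeoJSON FeatureCollection into a list of flat dicts."""
    rows = []
    for feature in geojson.get("features", []):
        props = feature.get("properties") or {}
        # Assumed coordinate order: longitude, latitude, depth in km.
        coords = (feature.get("geometry") or {}).get("coordinates") or [None, None, None]
        rows.append({
            "id": feature.get("id"),
            "place": props.get("place"),
            "mag": props.get("mag"),
            "lon": coords[0],
            "lat": coords[1],
            "depth": coords[2],
        })
    return rows


# Hypothetical sample record, not real feed data.
SAMPLE = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "id": "example0001",
      "properties": {"mag": 2.5, "place": "10km N of Exampleville, Alaska"},
      "geometry": {"type": "Point", "coordinates": [-150.0, 61.5, 35.0]}
    }
  ]
}
""")

rows = features_to_rows(SAMPLE)
```

A list of dicts like `rows` can be passed to `pandas.DataFrame(rows)` to obtain a data frame with one column per key.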

You will also need to find a way to cache the data locally so that your runs are exactly reproducible since the live data gets updated in real-time. You should write a program which can use the live data, but also optionally can store data from previous runs so that we can re-run the program in either mode using cached data or live data.

Start simple, keep the data isolated/separate from the source code, and remember that the goal is to make it reproducible by someone else.
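One possible shape for the cached/live switch is sketched below. It is only an illustration: `load_feed`, `fetch_live`, and the cache file name are hypothetical, and `FEED_URL` is assumed to be the GeoJSON address of the "all earthquakes, past 30 days" feed. The cache lives in its own data directory, separate from the source code, and the demo uses a stand-in fetcher so it runs without network access.

```python
import json
import os
import tempfile
from urllib.request import urlopen

# Assumed address of the "all earthquakes, past 30 days" GeoJSON feed.
FEED_URL = "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.geojson"


def fetch_live(url=FEED_URL):
    """Download the live feed and return the parsed JSON document."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def load_feed(cache_path, use_cache=True, fetch=fetch_live):
    """Return feed data from cache_path when allowed and present, else live.

    Freshly fetched data is always written to cache_path so that the
    same run can be repeated later in cached mode.
    """
    if use_cache and os.path.exists(cache_path):
        with open(cache_path) as handle:
            return json.load(handle)
    data = fetch()
    cache_dir = os.path.dirname(cache_path)
    if cache_dir:
        os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path, "w") as handle:
        json.dump(data, handle)
    return data


# Demo with a stand-in fetcher, so no network access is needed.
calls = []


def fake_fetch():
    calls.append(1)
    return {"features": [], "fetch_number": len(calls)}


cache_file = os.path.join(tempfile.mkdtemp(), "data", "all_month.json")
first = load_feed(cache_file, fetch=fake_fetch)                   # fetches and writes the cache
second = load_feed(cache_file, fetch=fake_fetch)                  # served from the cache
third = load_feed(cache_file, use_cache=False, fetch=fake_fetch)  # forces a refresh
```

Passing the fetcher as a parameter keeps the caching logic testable offline; in a real run the default `fetch_live` would be used instead of `fake_fetch`.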

The duty of the Data Curator is to write a new version of the code that reproduces the same output as the old version of the code, but using the new data source. Note that it may not be possible to do that exactly if the new data and the old data are substantially different, but hopefully they are identical in content and differ only in format (CSV vs JSON).

  2. Visualization

The definition of plot_ak() has a very bad code smell! What if we want to plot the earthquakes in California where I live now instead of in my home state of Alaska? Can you spot the code smell? How can you fix it so that given any arbitrary list of earthquakes you can plot the bounding box around the location (e.g. the whole state) where the quakes occurred?

The duty of the Visualizer is to refactor the hard-coded values out of the program and instead parameterize the function so that for any list of earthquakes in a particular region (e.g. California) it will generate a map showing the correct location. Instead of including the lat/lon bounding box values in the body of the program, you might consider storing that data in some format on disk and reading it back into the program as a dataframe to pass to the function as a parameter.
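One way to remove the hard-coded corners is to derive them from the quakes themselves. The sketch below is an assumption-laden illustration, not the assignment's reference solution: `bounding_box` and the sample coordinates are hypothetical, and the dictionary keys are chosen to match the corner keyword arguments that Basemap accepts (`llcrnrlon`, `llcrnrlat`, `urcrnrlon`, `urcrnrlat`). It does not handle regions that cross the 180° meridian, which matters for parts of Alaska.

```python
def bounding_box(lats, lons, pad=1.0):
    """Return map corners enclosing all points, padded by `pad` degrees."""
    if not lats or len(lats) != len(lons):
        raise ValueError("lats and lons must be non-empty and the same length")
    return {
        # Latitudes are clamped to the valid range after padding.
        "llcrnrlat": max(min(lats) - pad, -90.0),
        "urcrnrlat": min(max(lats) + pad, 90.0),
        "llcrnrlon": min(lons) - pad,
        "urcrnrlon": max(lons) + pad,
    }


# Hypothetical California epicenters, for illustration only.
ca_lats = [32.5, 36.0, 41.9]
ca_lons = [-124.3, -120.0, -114.2]
box = bounding_box(ca_lats, ca_lons, pad=0.5)
```

A plotting function could then take `box` as a parameter and forward it to the map constructor, so the same function works for any region.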

Also plot the quakes so that each dot shows the magnitude and depth of its quake, instead of only the location as they are plotted now. Currently all the dots are the same color and the same size, but these could be varied to represent more information in the same amount of space.
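A minimal sketch of one such encoding follows: marker size grows with magnitude and a 0-to-1 value derived from depth can drive a colormap. The helper names, the squared scaling, and the 700 km depth ceiling are all assumptions chosen for illustration; other scalings may read better on a real map.

```python
def marker_sizes(magnitudes, base=4.0):
    """Marker area that grows with the square of the magnitude."""
    # Negative magnitudes are treated as zero so sizes are never negative.
    return [base * max(m, 0.0) ** 2 for m in magnitudes]


def depth_fractions(depths_km, max_depth_km=700.0):
    """Scale depths to the range 0..1 (shallow to deep) for a colormap."""
    # Depths are clamped to [0, max_depth_km]; 700 km is an assumed ceiling.
    return [min(max(d, 0.0), max_depth_km) / max_depth_km for d in depths_km]


# Hypothetical magnitudes and depths, for illustration only.
mags = [1.0, 2.5, 5.0]
depths = [0.0, 35.0, 700.0]
sizes = marker_sizes(mags)
colors = depth_fractions(depths)
```

Lists like these can be passed to matplotlib's `scatter` through its `s` (size) and `c` (color) arguments, along with a colormap of your choice.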
