Add example dataflows by aakashkolli · Pull Request #41 · urban-toolkit/curio · GitHub

Add example dataflows #41


Merged
merged 15 commits into from
Jun 12, 2025
17 changes: 10 additions & 7 deletions docs/README.md
@@ -13,10 +13,13 @@

Icons indicate the complexity level of each example: 🟢 Easy, 🟡 Intermediate, 🔴 Advanced.

-- 🟢 [Visual analytics of heterogeneous data](examples/1-visual-analytics.md): Creating a dataflow that integrates climate models and sociodemographic data.
-- 🟡 [What-if scenario planning](examples/2-what-if.md): Performing what-if analyses with different sunlight access scenarios.
-- 🔴 [Expert-in-the-loop urban accessibility analysis](examples/3-expert-in-the-loop.md): Training and refining a computer vision model to assess urban infrastructure.
-- 🟢 [Accessibility analysis](examples/4-accessibility-analysis.md): Visual analysis of sidewalk accessibility data.
-- 🟢 [Flooding analysis](examples/5-flooding-complaints.md): Inspecting flooding complaints.
-- 🟢 [Interactions between Vega-Lite and UTK](examples/6-interaction.md): Adding interactions between Vega-Lite and UTK.

+- 🟢 [Visual analytics of heterogeneous data](examples/01-visual-analytics.md): Integrates raster, meteorological, and sociodemographic data to compute thermal indices and map urban heat exposure.
+- 🟡 [What-if scenario planning](examples/02-what-if.md): Simulates sunlight obstruction based on 3D building geometry and explores shadow changes through interactive height adjustments.
+- 🔴 [Expert-in-the-loop urban accessibility analysis](examples/03-expert-in-the-loop.md): Trains and evaluates a computer vision model for sidewalk material classification with human-in-the-loop inspection.
+- 🟢 [Accessibility analysis](examples/04-accessibility-analysis.md): Analyzes sidewalk accessibility features using severity and agreement metrics to visualize neighborhood patterns.
+- 🟢 [Flooding analysis](examples/05-flooding-complaints.md): Aggregates and visualizes 311 service requests for flooding by zip code using simple data transformation.
+- 🟡 [Interactions between Vega-Lite and UTK](examples/06-interaction.md): Demonstrates how to link user interactions between UTK map and Vega-Lite plots.
+- 🟡 [Speed camera violations](examples/07-speed-camera.md): Performs temporal aggregation and creates linked bar and line charts to analyze top camera violations over years.
+- 🟡 [Red-light traffic violation analysis](examples/08-red-light-violation.md): Builds a dataflow to analyze temporal, spatial, and seasonal trends in red-light violations using interactive dashboards.
+- 🟢 [Building energy efficiency](examples/09-energy-efficiency.md): Compares mean and median energy use intensity across building types to identify outliers and efficiency gaps.
+- 🟢 [Green roofs spatial analysis](examples/10-green-roofs.md): Visualizes the distribution and density of green roofs across Chicago using dot density maps and zip code aggregation.
File renamed without changes.
@@ -260,4 +260,4 @@ This workflow creates a visual analytics system for sidewalk accessibility data.
3. Prioritize neighborhoods for accessibility improvements
4. Make data-driven decisions for urban planning

-The interactive nature of the visualizations allows for exploration and discovery, giving insights into urban accessibility challenges. This can ultimately help make cities more accessible for everyone, particularly people with mobility impairments.
+The interactive nature of the visualizations allows for exploration and discovery, giving insights into urban accessibility challenges. This can ultimately help make cities more accessible for everyone, particularly people with mobility impairments.
File renamed without changes.
140 changes: 140 additions & 0 deletions docs/examples/07-speed-camera.md
@@ -0,0 +1,140 @@
# Example: Visual analytics of speed camera violations

Authors: Ameer Mustafa, Filip Petrev, Sania Sohail, Aakash Kolli

In this example, we use Curio to process and aggregate tabular records of speed camera violations across Chicago, then visualize how those violations trend over time.

Here is an overview of the entire dataflow pipeline:

![Dataflow](./images/7-1.png)

Before you begin, please familiarize yourself with Curio’s main concepts and functionalities by reading our [usage guide](https://github.com/urban-toolkit/curio/blob/main/docs/USAGE.md).

The data for this tutorial can be found [here](data/Speed_Camera_Violations.zip).

For completeness, we also include the template code in each dataflow step.

## Step 1: Load speed camera violations data

We begin by creating a Data Loading node to load the speed camera violations dataset into Curio.

```python
import pandas as pd

df = pd.read_csv("Speed_Camera_Violations.csv")
df.dropna(inplace=True)
return df
```

![](./images/7-2.png)
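Note that `dropna()` with no arguments removes every row that has a missing value in any column. The sketch below illustrates the difference between that and restricting the check to selected columns. It uses a small synthetic CSV; the camera IDs, values, and the choice of `required` columns are assumptions made for illustration and are not taken from the real dataset.

```python
import io

import pandas as pd

# Synthetic stand-in for Speed_Camera_Violations.csv (hypothetical rows).
csv_text = """CAMERA ID,VIOLATION DATE,VIOLATIONS,LATITUDE,LONGITUDE
CHI001,01/05/2020,10,41.88,-87.63
CHI002,01/05/2020,4,,
CHI003,01/06/2020,,41.90,-87.65
"""

df = pd.read_csv(io.StringIO(csv_text))

# Default behavior: drop a row if ANY column is missing.
all_clean = df.dropna()

# Alternative: only require the columns used for the yearly aggregation.
required = ["CAMERA ID", "VIOLATION DATE", "VIOLATIONS"]
subset_clean = df.dropna(subset=required)

print(len(df), len(all_clean), len(subset_clean))
```

Whether the stricter or the looser filter is appropriate depends on how many records in the real file lack coordinates, so it is worth comparing the row counts before and after cleaning.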

## Step 2: Data Pool

Next, we create a Data Pool node, which passes the cleaned DataFrame to downstream nodes for further transformation and visualization.

## Step 3: Data Transformation – Top 5 Cameras by Violations per Year

Now, we create a Data Transformation node connected to the output of Step 2:

```python
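import pandas as pd

# `arg` holds the DataFrame received from the upstream node (Step 2)
df = arg

# Parse the date strings (MM/DD/YYYY) and extract the calendar year
df['VIOLATION DATE'] = pd.to_datetime(df['VIOLATION DATE'], format='%m/%d/%Y')
df['Year'] = df['VIOLATION DATE'].dt.year

# Total violations per camera per year.
# Note: despite its name, 'avg_violations' holds a yearly sum, not an average;
# the name is kept because the Vega-Lite spec in Step 4 reads this field.
yr_sum = (df.groupby(['CAMERA ID', 'Year'])['VIOLATIONS']
            .sum()
            .reset_index()
            .rename(columns={'VIOLATIONS': 'avg_violations'}))

# IDs of the 5 cameras with the most violations across all years
top_ids = (df.groupby('CAMERA ID')['VIOLATIONS']
             .sum()
             .sort_values(ascending=False)
             .head(5)
             .index
             .tolist())

# Keep only the yearly rows that belong to those cameras
yr_sum = yr_sum[yr_sum['CAMERA ID'].isin(top_ids)]

# Mean recorded coordinates per camera
camera_pos = (df.groupby('CAMERA ID')[['LATITUDE', 'LONGITUDE']]
                .mean()
                .reset_index())

# Attach the coordinates to each yearly row
yr_sum = yr_sum.merge(camera_pos, on='CAMERA ID')

return yr_sum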
```

![](./images/7-3.png)

This step sums violations by camera and year, keeps only the five cameras with the highest total violations across all years, and attaches each camera's mean latitude and longitude.
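To see the aggregation logic outside Curio, here is a minimal standalone sketch of the same steps on a tiny synthetic table. The function name `top_cameras_by_year`, the parameter `n`, and the sample rows are hypothetical; the column names follow the transformation node above, and the coordinate merge is omitted for brevity.

```python
import pandas as pd


def top_cameras_by_year(df: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Yearly violation totals for the n cameras with the most violations overall."""
    df = df.copy()  # avoid mutating the caller's DataFrame
    df["VIOLATION DATE"] = pd.to_datetime(df["VIOLATION DATE"], format="%m/%d/%Y")
    df["Year"] = df["VIOLATION DATE"].dt.year

    # Sum per camera and year (field name kept as in the tutorial).
    yearly = (
        df.groupby(["CAMERA ID", "Year"])["VIOLATIONS"]
        .sum()
        .reset_index()
        .rename(columns={"VIOLATIONS": "avg_violations"})
    )

    # Cameras ranked by their total over all years.
    top_ids = df.groupby("CAMERA ID")["VIOLATIONS"].sum().nlargest(n).index
    return yearly[yearly["CAMERA ID"].isin(top_ids)].reset_index(drop=True)


# Synthetic sample: camera A totals 30, B totals 12, C totals 1.
sample = pd.DataFrame({
    "CAMERA ID": ["A", "A", "B", "B", "C"],
    "VIOLATION DATE": ["01/05/2020", "03/10/2021", "02/01/2020",
                       "02/02/2020", "07/04/2021"],
    "VIOLATIONS": [10, 20, 5, 7, 1],
})

result = top_cameras_by_year(sample, n=2)
print(result)
```

With `n=2`, camera C is dropped because its overall total is the lowest, while both yearly rows for camera A and the single yearly row for camera B are kept.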

## Step 4: Linked View Visualization – Interactive Exploration

We then create a linked view visualization using the 2D Plot (Vega-Lite) node. It places a stacked bar chart next to a line chart: brushing a range of years on the bar chart filters the line chart, which plots the total violations per year within the selected range.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": { "name": "table" },
  "config": { "bar": { "continuousBandSize": 18 } },
  "hconcat": [
    {
      "width": 320,
      "height": 260,
      "selection": { "yrBrush": { "type": "interval", "encodings": ["x"] } },
      "mark": { "type": "bar" },
      "encoding": {
        "x": { "field": "Year", "type": "quantitative", "title": "Year" },
        "y": {
          "aggregate": "sum",
          "field": "avg_violations",
          "type": "quantitative",
          "title": "Total Violations"
        },
        "color": {
          "field": "CAMERA ID",
          "type": "nominal",
          "legend": { "title": "Camera ID" }
        }
      }
    },
    {
      "width": 320,
      "height": 260,
      "transform": [
        { "filter": { "selection": "yrBrush" } },
        {
          "aggregate": [
            { "op": "sum", "field": "avg_violations", "as": "total" }
          ],
          "groupby": ["Year"]
        },
        { "sort": { "field": "Year" } }
      ],
      "mark": { "type": "line", "point": true },
      "encoding": {
        "x": {
          "field": "Year",
          "type": "quantitative",
          "title": "Year (brush range)"
        },
        "y": {
          "field": "total",
          "type": "quantitative",
          "title": "Total Violations"
        }
      }
    }
  ]
}
```

![](./images/7-4.png)
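To check what the line chart should show for a given brush range, the `filter` and `aggregate` transforms of the spec can be mirrored in pandas. This is only a rough equivalent for verification, not part of the Curio dataflow; the helper name `brushed_totals`, its parameters, and the sample values are hypothetical.

```python
import pandas as pd


def brushed_totals(yr_sum: pd.DataFrame, year_min: int, year_max: int) -> pd.DataFrame:
    """Total violations per year for rows whose Year lies in [year_min, year_max]."""
    # Mirrors the brush filter: keep only years inside the selected interval.
    in_range = yr_sum[yr_sum["Year"].between(year_min, year_max)]
    # Mirrors the aggregate transform: sum across cameras, grouped by Year.
    return (
        in_range.groupby("Year", as_index=False)["avg_violations"]
        .sum()
        .rename(columns={"avg_violations": "total"})
        .sort_values("Year")
        .reset_index(drop=True)
    )


# Synthetic output of the transformation step (hypothetical values).
yr_sum = pd.DataFrame({
    "CAMERA ID": ["A", "A", "B", "A"],
    "Year": [2019, 2020, 2020, 2021],
    "avg_violations": [3, 10, 12, 20],
})

totals = brushed_totals(yr_sum, 2020, 2021)
print(totals)
```

Here the 2019 row falls outside the range and is excluded, and the two 2020 rows are combined into a single yearly total.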

## Final result

This example demonstrates how Curio can be used for a detailed temporal analysis of urban safety data. By transforming and aggregating violation records, we can generate interactive visualizations like stacked bar charts and linked views to effectively identify and compare trends over time. This workflow allows for a deeper understanding of violation patterns and the performance of specific camera locations.