Data Pipeline (Extract, Load, Transform)
Description: This data pipeline runs automatically every 2 weeks, extracts all available data from our public data source, and stores it in its rawest form in our 'accidents_bronze' table. We then apply the necessary transformations to a copy of that data and save the result in our 'accidents_silver' table, making the data ready for visualization. Our 'log' table, which is written to during every run, comes in handy for auditing and debugging.
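
The bronze, silver, and log flow described above can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline code: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL and SQLAlchemy, the column names are hypothetical, and rebuilding the silver table from scratch on each run is an assumption made to keep the example short.

```python
import sqlite3
from datetime import datetime, timezone


def run_pipeline(conn, extracted_rows):
    """Load raw rows into bronze, rebuild silver from them, and log the run."""
    cur = conn.cursor()
    # Hypothetical schemas; the real tables have more columns.
    cur.execute("CREATE TABLE IF NOT EXISTS accidents_bronze (raw_date TEXT, raw_location TEXT)")
    cur.execute("CREATE TABLE IF NOT EXISTS accidents_silver (raw_date TEXT, location TEXT)")
    cur.execute("CREATE TABLE IF NOT EXISTS log (run_at TEXT, status TEXT, rows_loaded INTEGER)")
    try:
        # Load: store the extracted rows untouched.
        cur.executemany("INSERT INTO accidents_bronze VALUES (?, ?)", extracted_rows)
        # Transform: rebuild silver from a cleaned copy of bronze (assumed strategy).
        cur.execute("DELETE FROM accidents_silver")
        cur.execute(
            "INSERT INTO accidents_silver "
            "SELECT raw_date, TRIM(raw_location) FROM accidents_bronze"
        )
        status, rows_loaded = "success", len(extracted_rows)
    except sqlite3.Error:
        # Undo the partial load so bronze and silver stay consistent.
        conn.rollback()
        status, rows_loaded = "failure", 0
    # Every run writes a log row, whether it succeeded or not.
    run_at = datetime.now(timezone.utc).isoformat()
    cur.execute("INSERT INTO log VALUES (?, ?, ?)", (run_at, status, rows_loaded))
    conn.commit()
    return status
```

Calling run_pipeline(sqlite3.connect(":memory:"), rows) with a list of (date, location) tuples fills all three tables in one pass, which makes the sketch easy to try without a database server.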
Transformations Applied to Bronze Layer to create Silver Layer:
- Remove cross symbols if they exist
- Create complete date using partial date and season
- Exclude summary tables from our 'accidents_silver'
- Convert state abbreviations to full names
- Refine location text so that it can be geocoded reliably
- Geocode each location and state to obtain coordinates
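
A few of the transformations listed above can be sketched in plain Python. This is a hedged illustration rather than our production code: the input format ("MM/DD" dates, seasons written as "2019/2020"), the use of "†" as the cross symbol, the season boundary month, and the small state lookup table are all assumptions, and every function name is hypothetical. Geocoding is left out because it needs a Google Maps API key.

```python
from datetime import date

# Hypothetical lookup; the project uses the US package for the full mapping.
STATE_NAMES = {"CO": "Colorado", "UT": "Utah", "WA": "Washington"}
# Assumed: a season such as "2019/2020" runs from July 2019 to June 2020.
SEASON_START_MONTH = 7


def remove_cross(text):
    """Strip cross symbols (assumed to be '†') and surrounding whitespace."""
    return text.replace("†", "").strip()


def complete_date(partial, season):
    """Combine a partial 'MM/DD' date with a season like '2019/2020'."""
    month, day = (int(part) for part in partial.split("/"))
    start_year, end_year = (int(year) for year in season.split("/"))
    # Months from the season start onward belong to the first year.
    year = start_year if month >= SEASON_START_MONTH else end_year
    return date(year, month, day)


def full_state_name(abbreviation):
    """Return the full state name, or the input unchanged if it is unknown."""
    return STATE_NAMES.get(abbreviation.upper(), abbreviation)


def to_silver(row):
    """Apply the cleaning steps to one bronze row (a dict of strings)."""
    return {
        "date": complete_date(remove_cross(row["date"]), row["season"]),
        "state": full_state_name(row["state"]),
        "location": remove_cross(row["location"]),
    }
```

Keeping each step in its own small function makes it possible to apply the steps to a Pandas column one at a time and to test each rule in isolation.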
Methodologies Used: ELT, Web Scraping, Automation, Data Engineering, Medallion Architecture
Technologies Used: Python, PostgreSQL, Pandas, SQLAlchemy, Google Maps API, Beautiful Soup 4, US, Datetime, Requests
Back End Web Server
Description: The back end server connects to the 'avalanche_data' database, fetches data from 'accidents_silver', and serves it to our front end securely and asynchronously upon request. The API includes caching, CORS middleware, rate limiting, logging with rotation, security headers, and custom exception handling.
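
To illustrate the idea behind rate limiting, here is a minimal sliding-window limiter in plain Python. It is a generic sketch of the technique, not a description of how SlowApi works internally; the class name, parameters, and injectable clock are hypothetical and exist only to keep the example self-contained and testable.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        # Timestamps of accepted requests, kept per client.
        self._hits = defaultdict(deque)

    def allow(self, client_id):
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False
        hits.append(now)
        return True
```

A server would call allow() with something like the client's IP address before handling a request and respond with HTTP 429 when it returns False.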
API Endpoints and Their Use Cases:
- /api/accidents: Fetches all data from 'accidents_silver' and returns it to the front end
- /api/accidents/{accident_id}: Fetches the details of a specific accident and returns them to the front end
- /api/aws-credentials/: Fetches AWS Credentials needed for map display and returns them to the front end
- /api/invalidate-cache/: Clears the cache so that fresh data is served; triggered after each successful data load in the ELT process
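
The interplay between cached reads and cache invalidation can be sketched as below. This is an in-memory stand-in for Redis, written under assumptions: the class, its method names, and the one-hour default lifetime are hypothetical and are not taken from our API code.

```python
import time


class TTLCache:
    """Tiny in-memory cache with expiry and explicit invalidation."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, calling loader() only on a miss or expiry."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self):
        """Drop everything, as the invalidate-cache endpoint would after a load."""
        self._store.clear()
```

In this picture, /api/accidents would call get_or_load("accidents", query_database) and /api/invalidate-cache/ would call invalidate(), so the first request after an ELT run reads fresh rows from the database.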
Methodologies Used: Back End Development, API Development, Asynchronous Programming
Technologies Used: Python, FastAPI, PostgreSQL, Redis, SlowApi, Pydantic, Logging, Starlette
Front End
Description: The front end makes GET requests to our API endpoints to retrieve the credentials and data it needs, then displays the accident data on the map.
Methodologies Used: Data Visualization, Front End Development
Technologies Used: HTML, CSS, JavaScript, AWS Location (Esri map display)