Day 9 - Using Programs to Process Data
Learning Objectives
- DAT-2.D: Extract information from data using a program.
- DAT-2.C: Identify the challenges associated with processing data.
Essential Questions
- How can programs help us process and analyze large datasets?
- What programming techniques are useful for data processing?
- What challenges arise when processing data with programs?
Materials Needed
- Presentation slides on programmatic data processing
- Sample datasets for analysis
- Programming environment (Python recommended)
- Code templates for data processing
- Lab handout with programming tasks
Vocabulary
- Data processing
- Filtering
- Sorting
- Aggregation
- Transformation
- Iteration
- List/array
- Algorithm
- Data cleaning
Procedure (50 minutes)
Opening (8 minutes)
Review and Connection (3 minutes)
- Review manual data analysis from previous lesson
- Connect to today's focus on using programs to automate data processing
Warm-up Discussion (5 minutes)
- Ask: "What are the limitations of analyzing data manually?"
- Discuss: "How might programming help overcome these limitations?"
Main Activities (32 minutes)
Lecture: How Programs Help Process Data (10 minutes)
- Explain advantages of programmatic data processing:
- Speed and efficiency with large datasets
- Reproducibility and consistency
- Automation of repetitive tasks
- Complex calculations and transformations
- Introduce common data processing operations:
- Filtering: selecting data that meets specific criteria
- Transformation: changing data format or values
- Aggregation: combining data to calculate statistics
- Sorting: arranging data in a specific order
- Discuss how these operations can be combined in programs
- Explain the importance of data cleaning before analysis
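The four operations introduced in this lecture can be sketched in a few lines of Python. This is a minimal illustration, not a required solution: the `readings` list, its field names, and the 70-degree threshold are all made up for the example.

```python
# Hypothetical sample data: each record is one city's temperature reading.
readings = [
    {"city": "Austin", "temp_f": 95.0},
    {"city": "Boston", "temp_f": 68.0},
    {"city": "Chicago", "temp_f": 77.0},
    {"city": "Denver", "temp_f": 86.0},
]

# Filtering: iterate over the list and keep readings above 70 degrees F.
warm = []
for reading in readings:
    if reading["temp_f"] > 70:
        warm.append(reading)

# Transformation: build new records that store the temperature in Celsius.
converted = [
    {"city": r["city"], "temp_c": round((r["temp_f"] - 32) * 5 / 9, 1)}
    for r in warm
]

# Aggregation: combine the values into a single statistic (the average).
average_c = sum(r["temp_c"] for r in converted) / len(converted)

# Sorting: arrange the records from hottest to coolest.
ranked = sorted(converted, key=lambda r: r["temp_c"], reverse=True)

for r in ranked:
    print(f"{r['city']}: {r['temp_c']} C")
print(f"Average of warm cities: {average_c} C")
```

Each step feeds the next, which is one way to show students how the operations combine into a single program.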
Demo: Writing Simple Programs to Filter and Transform Data (10 minutes)
- Demonstrate writing code to:
- Load data from a file
- Filter data based on conditions
- Transform data (e.g., convert units, calculate new values)
- Aggregate data (e.g., calculate averages, find maximum values)
- Output results in a useful format
- Show how to handle common data issues (missing values, inconsistent formats)
- Explain how to structure a data processing program
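One possible shape for the demo program is sketched below, with one function per step (load, clean, transform, aggregate) and the output at the end. The column names and sample rows are hypothetical, and `io.StringIO` stands in for a real file so the sketch runs without one. In class, `open("heights.csv", newline="")` with an actual file would take its place.

```python
import csv
import io

# Hypothetical file contents with typical problems: a missing value,
# stray whitespace, and a non-numeric placeholder ("n/a").
RAW_CSV = """name,height_in
Ana,64
Ben,
Cara, 70
Dev,n/a
Eli,67
"""


def load_rows(file_obj):
    """Load: read each CSV row as a dictionary keyed by column name."""
    return list(csv.DictReader(file_obj))


def clean_rows(rows):
    """Clean: strip whitespace and skip rows without a usable height."""
    cleaned = []
    for row in rows:
        text = (row["height_in"] or "").strip()
        try:
            height = float(text)
        except ValueError:
            continue  # missing or non-numeric value, so skip the row
        cleaned.append({"name": row["name"].strip(), "height_in": height})
    return cleaned


def add_centimeters(rows):
    """Transform: convert inches to centimeters (1 in = 2.54 cm)."""
    return [{**row, "height_cm": round(row["height_in"] * 2.54, 1)} for row in rows]


def summarize(rows):
    """Aggregate: count the rows and find the average and maximum height."""
    heights = [row["height_cm"] for row in rows]
    return {
        "count": len(heights),
        "average_cm": round(sum(heights) / len(heights), 1),
        "max_cm": max(heights),
    }


rows = add_centimeters(clean_rows(load_rows(io.StringIO(RAW_CSV))))
summary = summarize(rows)

# Output: print one line per person, then the summary.
for row in rows:
    print(f"{row['name']}: {row['height_cm']} cm")
print(summary)
```

Skipping bad rows is only one cleaning policy. Depending on the dataset, it may be more appropriate to report them or fill in a default, which is worth raising with students.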
Lab: Create a Program to Analyze a Dataset (12 minutes)
- Students work individually or in pairs
- Provide a dataset and a set of questions to answer
- Students write a program to:
- Import and clean the data
- Process the data to answer the questions
- Generate summary statistics or visualizations
- Output the results in a clear format
- Circulate to assist with programming challenges
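A sample lab solution might look something like the sketch below. The survey data and the question it answers ("How does average sleep differ by grade level?") are invented for illustration, so the actual lab dataset and questions will differ. It uses Python's standard `statistics` module for the summary statistics.

```python
import statistics
from collections import defaultdict

# Hypothetical survey data as (grade_level, hours_of_sleep) pairs.
# None marks a missing answer.
SURVEY = [
    (9, 8.0), (9, 7.0), (10, 6.5), (10, None),
    (11, 6.0), (11, 7.0), (12, 5.5), (12, 6.5),
]


def clean(records):
    """Drop records whose sleep value is missing."""
    return [(grade, hours) for grade, hours in records if hours is not None]


def group_by_grade(records):
    """Collect the sleep values for each grade level."""
    groups = defaultdict(list)
    for grade, hours in records:
        groups[grade].append(hours)
    return dict(groups)


def summarize(groups):
    """Compute count, mean, and median per grade, ordered by grade."""
    return {
        grade: {
            "n": len(hours),
            "mean": statistics.mean(hours),
            "median": statistics.median(hours),
        }
        for grade, hours in sorted(groups.items())
    }


def report(summary):
    """Format the summary as an aligned text table."""
    lines = [f"{'Grade':<6}{'N':>3}{'Mean':>7}{'Median':>8}"]
    for grade, stats in summary.items():
        lines.append(
            f"{grade:<6}{stats['n']:>3}{stats['mean']:>7.2f}{stats['median']:>8.2f}"
        )
    return "\n".join(lines)


summary = summarize(group_by_grade(clean(SURVEY)))
print(report(summary))
```

Keeping each stage in its own function makes it easier to help a student find which step is producing an unexpected result.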
Closing (10 minutes)
Code Review and Discussion (5 minutes)
- Selected students briefly share their approach
- Discuss different programming techniques used
- Address common challenges encountered
- Highlight particularly effective solutions
Program Submission and Preview (5 minutes)
- Students complete and submit their programs and analysis
- Preview that next class will focus on data visualization
Assessment
- Formative: Quality of programming approach during lab
- Program Code and Analysis: Functionality, efficiency, and clarity of code; accuracy of results
Differentiation
For Advanced Students
- Provide more complex datasets requiring sophisticated processing
- Challenge them to optimize their code for efficiency
- Suggest additional analyses beyond the basic requirements
For Struggling Students
- Provide code templates with key sections to complete
- Offer more structured guidance on program design
- Allow use of simpler datasets with fewer variables
Homework/Extension
- Enhance the data analysis program with additional features
- Process a dataset of personal interest and report findings
- Research how professional data scientists use programming in their work
Teacher Notes
- Have sample solutions ready for common programming challenges
- Be prepared to help with syntax errors and logical problems
- Make connections to programming concepts from earlier units
- Emphasize that data processing is one of the most common real-world applications of programming
- Consider discussing the ethics of automated data processing