Final Projects

Students in my previous data science classes have built the following final projects. Want to make something like that? Take the course!

August 2025

Alfred Cheung: Flight Price Prediction. Flight prices increase with the number of stops, with a strong positive correlation of 0.621.

Luke Li: Credit Card Payment Behavior. For every 1-point increase in a standardized financial knowledge test score, the log-odds of someone paying off a credit card bill increase by 0.047, and the 95% confidence interval for this estimate ranges from -0.005 to 0.098.

Eric Butte: Goal Scoring in Hockey. For the hockey player Sam Reinhart, each additional foot of distance from the net when attempting a shot was associated with a change of -0.033 in the log-odds of scoring a goal, with a 95% confidence interval from -0.045 to -0.022.

Rishith Bethina: Restaurant thriving analysis. This analysis reveals that delivery capability became 192% more important for restaurant success during COVID-19 (coefficient increasing from 0.12 to 0.35), while price sensitivity intensified by 88%.

Surya Fraser: Internet Access in the US. While increases in income were associated with only a small increase in the log-odds of internet access, each one-year increase in age was associated with a 0.0339 decrease in the log-odds of internet access (95% confidence interval for the coefficient: -0.0341 to -0.0338).

Jacob Khaykin: AI Jobs Weighting Analysis. Weighting by log-population reveals that labor market tightness, not education, is the most consistent driver of AI job adoption across different county sizes.

Hassan Ali: Greenhouse gas emission. The analysis identified the predominant chemical species and highlighted significant facility-level and temporal trends in the industry by focusing on reported emissions in metric tons. A limited number of chemicals contribute disproportionately to total emissions, as indicated by exploratory data analysis, whereas annual patterns reveal variability associated with facility operations and manufacturing cycles.

Ansh Patel: Global Commodity Trade. Following the 2008 financial crisis, U.S. live animal imports dropped by over 25% between 2008 and 2010, and remained on average 15% lower in the years after, indicating a lasting shift in agricultural trade behavior.

Faran Abbas: Global Economic Growth Analysis (2000–2023). Investment is the single most powerful driver of economic growth, with each percentage point increase in gross capital formation (as % of GDP) associated with 0.2-0.3 percentage points higher GDP growth, making it the key policy lever for sustainable development.

Naveed Ahmad: Income Inequality in the U.S. Income inequality (Gini index) in the U.S. is predicted to be 0.48, with a 95% confidence interval of 0.46 to 0.50, in most counties.

Ayush Chandra: EV vs Gas Costs over 3 Years. The best EV model compared to its gas counterpart is the Honda Prologue, and the best states in which to buy an EV are Washington and California, with savings of approximately $3,850 in Washington.

Jack Xu: The Effects of Wildfires on Bird Migration Routes. For each one-degree counterclockwise increase in direction relative to the east, the model predicts that the distance to the nearest wildfire decreases by approximately 0.059 units with a 95% confidence interval ranging from -0.096 to -0.021.

Adit Rohit: California Housing. This project explores California’s housing market using a cleaned dataset containing neighborhood-level statistics on income, home values, and geographic location.

Uma Ravat: Relationship between food consumed and body mass index (BMI). I examine the relationship between food consumed and body mass index.

Paramanyu Gupta: Music Performance Analytics. The model predicts that for every additional 30 minutes of daily practice, a musician’s performance score increases by approximately 5% with confidence intervals from 3.5% to 6.5%, with this effect being consistent across all instruments examined.

Neelam Arshad: US Traffic Accidents Analysis. This analysis reveals that daily human activity, traffic density, and weather conditions (snow is associated with a 1.2% higher chance of accidents, and heavy rain with about 3% more accidents) play a significant role in determining the severity and timing of accidents in the U.S.

Sophia Yao: Titanic Survival. During the sinking of the Titanic, men had survival odds reduced by nearly 91% (95% confidence interval: 87%–95%), third-class passengers by 91% (95% confidence interval: 84%–95%), and each additional year of age lowered survival odds by about 3% (95% confidence interval: 2%–5%).

Faisal Jan: LEP on Income. Across all education levels, males earn 20–50% more than females, with the largest gender and language-based income gaps appearing at the graduate level where English Proficient males earn over $70,000, nearly 50% more than their female counterparts.

Inam Khan: Predicting Life Expectancy. A country’s income and resource capacity has the strongest impact on life expectancy: each 0.1-point increase predicts 1.1 additional years of life, outweighing health risks like HIV prevalence.

Shuntaro Kawakami: Bitcoin Price Trend Analysis. The Dow Jones Index exhibited a strong positive association with Bitcoin price (p < 0.001), with increases in stock market performance correlating with higher Bitcoin prices. The 10-Year Treasury Yield Granger-causes Bitcoin price (p = 0.032), suggesting it has a statistically significant temporal influence on Bitcoin price.

Sharjeel Jamal: Predicting Chess Outcomes by Recognizing Move Patterns. Using logistic regression on a dataset of 170,500 games, the model was able to predict win or loss outcomes with 86% accuracy.

Sajida Rehman: Teen Smartphone Usage and Addiction. Each additional hour spent on the phone increases the odds of addiction by approximately three times, whereas an extra hour of sleep lowers those odds by about 44%.

Jishnu Veerapaneni: Predicting Psychiatric Disorders. Bipolar disorder, panic disorder, social anxiety disorder, and schizophrenia all show significantly increased left temporal (F7) delta power, with bipolar disorder demonstrating the strongest effect at 0.080 units (95% CI: 0.018 to 0.143), suggesting that left-hemisphere temporal abnormalities could potentially aid in early detection and diagnosis of these specific psychiatric conditions, though the wide confidence intervals indicate substantial uncertainty in the true magnitude of these neurological signatures.

Abdul Hannan: Online Shopping Behavior. People who view more product pages are significantly more likely to make a purchase, highlighting the importance of keeping visitors engaged.

Umaira Nazar Hussain: Digital Inequality in U.S. Broadband Access. A $1,000 rise in median income is linked to a 2.4% increase in broadband access—highlighting deep digital inequality across 3,000+ U.S. counties.

January 2025

Tyaveon Levit: Educational Achievement in Florida.

Angelo Morelli: Citi Bikes

Tyler Zoucha: Football Analytics

August 2024

Alan Tao: Spotify Songs. A 25-point increase in danceability is associated with a 3-point (± 0.25) increase in popularity on a 100-point scale.

Lela Sengupta: Analysis of Hidden Stock Orders. Low volatility stocks experience about 48% of the hidden orders high volatility stocks experience, plus or minus 4%.

Gitanjali Sheth: Effect of physicochemical properties of wine on their quality. For each one-unit increase in alcohol content, the quality of the wine is expected to increase by 0.34 units, with a 95% CI of 0.31 to 0.37.

Tanay Janmanchi: 2020 United States Presidential Election results by region. Given that a voter is from the South, Biden's least popular region, there is a 52% chance that they will vote for Biden, although that number could be anywhere from 48% to 56%.

Aditya Narayan: Effect of Different Factors on Academic Performance. Increased stress is linked to poorer academic performance, with each additional unit of stress generally resulting in a decrease of about 1.4 units in academic performance, plus or minus 0.3 units.

Grant Quattlebaum: Quality of Magic: the Gathering Cards for Draft. For rare Magic cards, a one-mana increase in cost is associated with a 0.55% increase in the chance of winning the game, give or take 0.25%.

Sophie Zhu: Sepsis Survival Predictions. A 60-year-old female with 1 case of sepsis is predicted to have a 95.86% chance of survival, though this may be between 95.25% and 96.41%.

Harshil Kurapati: Income correlated to Race and Gender. The analysis reveals that both race and gender significantly influence income levels, with disparities evident across different racial groups and between genders.

Anna Shao: Health in the US. Compared to individuals without children, individuals with one to three children showed a significant decrease in the likelihood of having a health condition.

Melody Liu: Waste Recycled. There seems to be a strong correlation between waste recycled and incinerated, mismanaged, and landfilled waste.

Elaine Zhang: Maternal Mental Health on the Health of the Baby. Mothers with high Edinburgh Postnatal Depression Scale (EPDS) scores have children weighing 0-8% less than the average weight of 3,463 grams.

Roshan Ranganathan: Formula 1 data analysis for the 2023 season. Our analysis reveals that Max Verstappen, a driver for Red Bull, is statistically the best driver in the 2023 Formula 1 season.

Mihir Kaushal: Effect of Drugs on Memory Recall. One of the drugs has a negative impact on memory recall.

Akhil Bellamkonda: Mesa Arrests. Our analysis shows that males are statistically more likely to be arrested than females, with a predicted arrest probability that is 15% higher on average.

Wallstreet Petrus-Nihi: Heart Disease Prediction by Age and Gender. This project examines how variations in age and gender affect heart disease risk, aiming to develop a predictive model that enhances health assessments.

Lowell Ethan Xavier: Lung Cancer Prediction. A person who is exposed to a lot of air pollution is at a greater risk for lung cancer than a heavy smoker, with the estimated difference in their impacts being 0.11 (± 0.04) on an 8-point scale.

Ronit Dash: What Crimes Would Most Likely Find A Culprit in England? This project analyzes the types of crimes that find a culprit, finding that although total conviction rates are low in England, burglary and theft crimes have the highest rates, at 3.06% and 2.87% on average, respectively.

Tanish Thaker: OW2 Win Rates. Based on the collected data, the win rates of players tend to be highest when their statistics for each variable are in the 60th-80th percentile of that variable’s range.

Areeb Atif: BMI and Diabetes. BMI is correlated with the risk of diabetes.

Aaditya Gupta: Math, Reading, and Writing Exam Scores by Parental Education Level, Gender, and Test Preparation. On average, male students do 8% better than female students on math exams, plus or minus 3%.

Bingbing Luo: Environment impact by food. Modeling the relationships between types of food and negative impact on the environment.

Aedan Liao: S&P 500 Correlation w/ US Presidents. The stock market is affected by policies of certain presidential administrations.

Annika Shivam: What Makes a Book Popular? Fiction books tend to sell, on average, around 600 more units than nonfiction books.

Shyamali Sheth: student-GPA. How different learning factors affect the GPA of a student.

Jason Fisher: Alzheimer’s Diagnosis Prediction. Physical function has a greater association with Alzheimer’s diagnosis than mental function.

Rajeev Kumar yadav: Housing Prices. Preferred area housing prices are higher regardless of the furnishing status of the housing.

Hunter Stephens: Effect of Silver and Bronze Medals on Olympic Placing. Silver medals have a much stronger effect on placement than bronze medals.

Hao Lin: Wildfire Model. Modeling the impact of environmental conditions on wildfire incidents in California.

Pragnyasri Sankar: Indicators of Diabetes. This study uses factors including age, gender, BMI, and more to detect under which conditions diabetes is more common.

Ali Atif: Olympic historical Model. This project predicts the total number of medals based on the number of gold, silver, and bronze medals.

Nithura Thevarajah: Predicting Technological Evolution using Anthropology. Precision in motor dexterity is a major contributor to the need for technological development compared to cranial capacity.

Tulika Punia: Correlation Between Education Level and Income. Education levels may not influence income as significantly as commonly believed.

Judy Zhu: How ChatGPT Affects Global CO2 Emissions. Our new daily routine of using ChatGPT consumes over 50 gigawatt-hours a year, over 3% of global consumption, and this is expected to double before 2030.

Dinesh Satyavolu: New York City Bike Durations. On average, bike rides are shortest late in the day, and longest on Friday, with an approximate increase of 7.79 minutes compared to other days of the week.

Aleksandar Joksic: Income predictors. Individuals with higher income tend to be White, male, and more educated.

Ivy Spratt: Depression vs Music Genre. Gospel music seems to decrease depression levels the most, and listeners of the genre are the happiest.

Aashna Patel: Cardiac Clues and Causes. This analysis uncovers critical factors influencing heart attack risk, providing valuable insights for improving prevention and treatment strategies.

Jacob Hardin-Bernhardt: Effects of Nuclear Proliferation on Peace. The effects of nuclear proliferation as opposed to those of great power involvement on peace are negligible.

Rishith Ravi: Acceptance of Personal Loans. Customers with a higher income are 30% more likely to accept a personal loan compared to those with lower income.

Hans Zhang: Life by socio-economic. Life expectancy by socio-economic factors.

Sharav Joshi: Difference in growth between hispanic and non-hispanic populations. Both year and population group significantly influence growth trends, highlighting the dynamic demographic shifts within the U.S. population.

Zephan Shivam: Three Key Takeaways from The Game of Golf. The difficulty of courses in the PGA is well balanced, driving the ball over 300 yards is very tough while driving over 280 yards is fairly easy, and finally, yardage has a small effect on the overall score.

Andrew Ross: Happiness and Money. The happiness of a country is related to its GDP.

Nihal Neeraj: Amazon Sales Profit Prediction. Online sales are the most profitable for Amazon, with a 764-unit profit increase per year, but with uncertainty rates as low as 153.64% and as high as 307.29% due to external factors.

Ansh Kare: Chances of Kobe Passing the Ball. Kobe almost always relied on his own shot-making abilities to affect the game, but when his team was at a disadvantage, for every point their opponent was winning by, his chances of passing increased by 3%.

Amogh Patil: House Price Analysis. The area of the garage (if there is one) is the biggest factor in determining the price of a house, having a coefficient of 45.

Nicholas Oliver Silveira Powell: US Visas Given Out to Different Nations. Mexico is far and away the largest recipient of US visas of any country in the world and, with a 95% degree of certainty, receives 1.62 times more than the next largest, the Dominican Republic.

Madhavan Prasanna: Bitcoin Price Cyclicality Prediction. This project reveals Bitcoin price patterns using historical data and statistical modeling, highlighting cyclical trends.

Adhithan Manivassakam: What factors influence Depression? The analysis indicates that age, gender, suicidal intent, and PHQ score are key predictors of depression, though individual cases may differ.

Cheng Ma: Oil Prices Recession. Predicting the next recession from the relationship between Brent oil prices and US recessions.

Thomas Seoh: What Defines Happiness? This project explores the most important factor that impacts happiness, which has proven to be social support.

Satvika Upperla: Longevity of Olympic Athletes. This project explores the relationship between the total number of medals an athlete has won and the longevity of their career at the Olympics.

Ashitha C Kukatla: Electric Vehicle Population in Washington state. The number of electric vehicles bought in Washington state is higher in counties closer to major cities.

Dharshan Lakshmikanthan: Causal Effects of Education and Experience on Global Salaries. This project analyzes the impact of education level and years of experience on global salaries, using Bayesian modeling to quantify and visualize the causal effects of these factors.

Ashwika Katiyar: Conversion of Clinically Isolated Syndrome to Multiple Sclerosis. Patients with a presence of bands of immunoglobulins, known as Oligoclonal Bands, have about 84.5% (± 4.5%) lower odds of being diagnosed with MS.

July 2023

Emmanuel Yeboah: Social and Economic Factors Endangering Vulnerable Languages. Improving agricultural opportunities in rural areas preserves vulnerable languages, but higher literacy rates and urbanization increase odds of language extinction by 20%.

Avery Zheng: Wimbledon 2023 Prediction. Alcaraz, Medvedev, and Djokovic have the highest chances of winning the Men’s draw, and Swiatek has the highest chance for Women’s.

Daniel Cho: European Soccer Wages. Payroll rank within league is associated with end-of-year standings; an increase in payroll rank leads to a 0.69 increase in standing ± 0.02.

Alan Tao: Carbon Dioxide and Temperature Over Time. Carbon dioxide is projected to reach around 53,000 (± 2,000) metric tons in the year 2025, and the average land temperature is projected to be 20.6 °C (± 0.6).

Vaangmaya Rebba: Annual Income Predictions for Universities in Pennsylvania. Over 90% of the graduates make less than 100k.

Luit Deka: Education Expenditure on Urban and Rural Populations. Every 440 billion dollars spent on education correlates to about 0.1% urban population growth, while every 350 billion dollars of educational expenditure correlates to about 0.08% rural population growth.

Rishvanth Amsaraj: Analysis of AP Exam Data. The findings emphasize the need for targeted efforts and interventions to support and empower minority groups, including females and black individuals, within the education system.

Aadhira Satheesh: The Probability of Contracting Alzheimer’s Based on Different Lifestyles and Conditions. There are various lifestyles that you would think increase or decrease the risk of Alzheimer’s, and this project shows what can actually raise the risk of developing it.

Ethan Xiao: The Effect of Engine Displacement on Fuel Economy. The highest displacement engine has 6.375x less fuel efficiency than the lowest displacement engine.

Rishi Sandrana: Trends in Global Education Rankings. A country’s average IQ and test scores are almost perfectly correlated (r = 0.92), but there is only a moderate correlation (r = 0.65) between either of these factors and a country’s education ranking.

Nicholas Grant: NFL Point Spread: 1978-2013. Overall, most NFL games are decided by double-digit margins of victory, with a higher average margin of victory when the point spread exceeds 7 points.

Sririthvik Bellala: Outcomes of Infectious Diseases from 2001-2014. The total number of people infected with E. coli, HIV, and malaria has decreased over time, reaching close to 0 for some years.

Pranav Chivukula: Chelsea’s Transfer Market and its Effect on Their Season. If you spend like Chelsea did in the transfer market, your season will suffer as a result.

Melissa Ban: Influence of Patients’ Initial Treatment Age on OrthoK Lenses Effect. The effect of OrthoK lenses in treating myopia is not affected by patients’ age when receiving initial treatment.

Vibha Dara: Diabetes Analysis. Females have 0.0046 times lower odds of having diabetes with a range from 0.002 to 0.010 compared to males.

Rajarshi Mandal: Market Value of Fortune 500 Companies from Economic Indicators. Companies in the Fortune 500 with 100,000 employees have an expected market value of $75 billion, plus/minus $8 billion.

Aryan Kancherla: Chicago Bulls 2023-2024 Field Goal Percentage Prediction. For the 2023-2024 NBA Season, the Chicago Bulls’ predicted average field goal percentage is 48.7%.

Pratham Kancherla: NBA 2023-24 Free Throw % Prediction at Each Position. Each position has a predicted FT% for the NBA 2023-24 season.

Gaurinath Subash: Player Salaries vs Field Position. On average, forwards tend to earn the most money, with an average salary of around $145,000, which is more than any other position on the field for players in the MLS League during 2007.

February 2023

Ava Ransbotham: Great British Bake Off Analysis. Age does not affect how far people make it in the competition.

Dermot Curtin: The Indo-European Similarities Project. European branches are generally similar to each other and dissimilar to Asian branches, even though they are not always more closely related.

Yash Dhore: Analysis of Advanced Placement. Rural states with lower populations tend to have a lower number of AP Scholar Awards per capita.

Venkat Tatituri: Top 10 YouTube Channels in Subscribers. These are the top channels on YouTube! Let’s see who wins the race to 10 billion.

Matthew Ng: Coke Stocks. Coca Cola Stocks have a good trend line and it’s safe to say that soon, there may be a drop again but it will definitely rise back up in around 10-20 years.

Anish Bellamkonda: Impact of climate change on the world. Climate change is a complex and urgent global issue that requires collective action to mitigate its impacts and prevent further damage to the planet’s ecosystems and human societies.

Zayan Farooq: How Do Position and Age Correlate With Soccer Injuries? Position and age in the Premier League tend to affect the type of injuries that happen to players.

August 2022

Miriam Heiss: Classical vs Contemporary Music. Taylor Swift’s average song length is almost 3 minutes more than J. S. Bach’s, disproving the point ‘Classical Music is too long’.

Maxwell Lu: Heart Disease: Risk Factors and Demographics. I found that men contract heart disease significantly more than women, and that American Indians/Alaskan Natives make up a larger percentage of those with heart disease than any other race.

Krish Saluja: Variables that Accurately Predict the Likelihood of a Stroke. Heart disease and hypertension are the most effective factors that help to accurately predict whether or not a person will suffer from a stroke.

Sam Yeh: Comparing Cryptopunk Data With Cryptopunk Attributes. The prices of Cryptopunks vary based on many attributes, but collectors buy Cryptopunks that have favorable attributes, such as lighter skin color.

Elif Kuyuk: Factors that affect a Student’s Academic Performance. A higher range of scores was noted for students taking a standard lunch; female students were observed to have higher writing and reading scores.

Krish Badri: Growth of Small vs Large Market-Cap Stocks. The average small market cap stock in the S&P 500 is growing at a rate much faster than the average growth of high market cap stocks at the top of the S&P 500 Weighting List.

Rachel Heiss: These Prices are Defying Gravity: Broadway Trends 1985 - 2022. Today’s Broadway tickets cost much more than they did in the 80s, but more people are going to shows.

Dorothy Tang: How the COVID-19 Pandemic Changed Emotions. A survey measured people’s feelings, on 7 different emotions, during the COVID-19 pandemic in the U.S.

Varun Khedkar: Intelligence Quotient and Potential Indicators of Success. Intelligence Quotient does not directly correspond with what may be defined as success; however, it does hold a clear indirect impact on the key precursors to success.

Jihan Bhuiyan: An Analysis of Monkeypox Demographics. While it may appear that monkeypox only impacts gay and bisexual men, this doesn’t correspond with previous monkeypox outbreaks and is likely due to restrictions on testing set by governments.

Mann Talati: NCAA Women’s Volleyball Statistics. While several teams are able to attack the ball through their attack attempts, there is a very low chance of the team achieving the kill because of a block or a dig, leaving the ratio of kills to attack attempts close to 1:4, or a 25% chance of achieving a kill.

Selim Coskunuzer: Observing NFL Players’ Average Salary by Category over the years 2014-2020. Examines NFL players’ salary data over several years to see whether it has changed.

June 2022

Suvan Dommeti: Presence of Polarizing Words in Australian Media. Does Australian media excessively use the word “riot” over “protest” to convey a biased message?

Jackson Roe: March Madness: How to Pick a Perfect Bracket. Analysis of the trends in every March Madness bracket since the 1985 expansion to 64 teams.

Farhan Sadeek: Analysis of World Population. Life Index is proportional to the Per Capita GDP of a country.

Allison Bubar: Newton North High School Basketball Evaluation. The NNHS basketball team’s success was determined by FG% and defensive success.

Alex Kuai: Microchip Market. Dramatic increase in microchip market demand in the Asia Pacific region.

Ethan Hu: Breakdown of the S&P 500. Shows the basic trends that are seen in the top 500 public companies in the US.

Nathan Rothschild: Comparison of Three 33 y/o Professional Tennis Players’ Effectiveness. Professional players of the same age, career length, and similar career success had vastly different effectiveness trajectories during their career.

Shivangi Nadkar: COVID deaths by race and education level. Graphs that display COVID-19 deaths and are faceted by race and education level.

Henry Sippel: Fish Stocking Data in Lake Michigan by Species. Movement of fish stocking in Lake Michigan.

Pooja Kawatkar: Analysis of Super Bowl Statistics. In 1990, the San Francisco 49ers defeated the Denver Broncos by 45 points, the greatest point difference in Super Bowl history.

Samruddhi Naik: Wages by Gender and Occupation. The biggest wage gap is in the financial industry where men earn around $45,000 more than women do.

Awais Choudhary: Traffic Violations. What a driver should avoid on the road; an examination of traffic infraction records.

Sai Gottemukkula: An Exploration of COVID-19. Graphs that look at some under-explored information of the pandemic.

Akhilesh Sangaraju: Cars - CO2 Emissions and Efficiency. When buying an electric car, you can be confident that you are making the eco-friendly decision, and that it will satisfy all of your needs.

Amisha Jain: Karate Statistics Around the World. These graphs analyze how various countries are impacted by Martial Arts.

Nathan Nambiar: Progression of Agent Pick Rates in Valorant. The change in agent pick rates throughout Valorant history.

Hannah Adams: Per capita CO2 emissions. Shows trends in per capita CO2 emissions by country, which could inform equitable environmental policy.

Katie Zhong: Factors of Success in WEBTOON Originals. Viewer ratings and subscriber counts prove that Romance and Fantasy Originals are the most popular.

Noel Jung: Temperature and Vector Borne Diseases. Do changes in temperature affect the number of cases of West Nile and Lyme disease?

Lucas Cao: Number of Billionaires over Years. The number of billionaires has been increasing rapidly every year since 2000.

Andrew Zhuo: Chess Openings. Analyzing the most commonly used chess openings by 5 top chess Grandmasters.

Austin Zhuang: Adult Mental Health in the USA. The project analyzes how financial factors affect access to mental health care.

Suvan Chatakondu: Movies 2021 Analysis. The factors that made movies in 2021 successful and special.

June 2021

Sophia Zhu: BREAKING NEWS: FBI Says They Found ‘Man Child’ In The Woods! An analysis of the length and wording of 2018 fake news article titles.

Nuo Wen Lei: PAC Influence on US Politics. The trend of foreign donations seems to parallel party representation in the US Congress.

Kevin Xu: ATP Tour Analysis Over Time. The game of tennis on the ATP Tour has shifted tactically and demographically since its conception.

Stefan Arroyo-Cottier: The Time Hass Come. Avocado consumption and prices in the United States over time.

Osaretin Lawani: An Analysis of the Causes of Singapore’s Aging Population. This is an investigation of the relationship between Singapore’s aging population and the nation-state’s declining birth rates.

Dhruv Syngol: World Happiness Report Data Analysis and Visualizations. Data from the World Happiness Report over the years reveals interesting trends in a variety of factors, from GDP per Capita to the Freedom to Make Life Choices.

Abhay Paidipalli: The Evolution of Basketball. An analysis of some of the most noticeable trends in the history of basketball.

Marco Tchernychev: 20/21 EPL Data Visualizations. This project examines the correlation between age and goals, as well as nationality and passing.

Max Xu: Covid 19 Around the World. Examining Covid 19 cases and trends among the continents.

Ajay Malik: High School Performance and Future Salary, by College. The more students of a university who were at the top of their high school class, the more salary these students will make.

Shreeram Patkar: Covid-19 Socio-Economic Causal Analysis. Comparing different socio-economic metrics to Covid death and infection rates.

Z Schwab: GDP and Wealth Distribution: Not a Uniform Picture. GDP is an inaccurate indicator of quality of life because it doesn’t factor in income inequality.

Andre Arroyo-Cottier: Covid Cases and Vaccination Rates in the US. Graphs that show the effectiveness of vaccinations in the USA.

Daniel Chen: Population and Crime rate in the US. Population and crime rate have no correlation in the US.

Tejas Mundhe: United States Covid-19 Death Toll by State. Are more populated states impacted by the Covid-19 pandemic more than less populated ones?

Matthew Ru: An Analysis of Pro Tennis Players. An arrangement of graphs to show the relationship between height, weight, birth month, and country of origin in the tennis pro-scene.

Ronit Anandani: COVID-19 and its Impact on Sectors of the Stock Market. Visualizing the effect of COVID-19 pandemic on different stock market sectors.

Frank Li: Chance of Automation Replacement of Various Occupations. Exploring the likelihood of a variety of occupations being replaced by machine labor.

Zuhair Usmani: Movie Genres and if Producers are using the same genres. Gathering movie genres and analyzing if producers are using the same genres.

Bryan Li: Covid-19 Effect on Student Mental Health and Behavioral Change. Analyzing the effects of the pandemic on the mental and physical health of students and adults.

Mahima Malhotra: Diabetes Prevalence by State. A comparison of the prevalence of diabetes among adults by state to rates of depression amongst adults with diabetes and mortality rates.

Yuhan Wu: Housing Fluctuation Using Case-Shiller Index. The two key factors that influenced the housing price are income and housing supply; population matters little.

Nabiha Rabbani: Sci-Fi and Techno Orientalism. Measuring interest in science fiction movies over time and comparing that interest to economic success in East Asian countries.

Emmanuel Buabeng: Average Salary Per Department in Chicago. An analysis of spending across government departments in the city of Chicago.

Fahim Ahmed: Investment Analysis of Exchange Traded Funds (ETF). What types of ETFs would have made you the most returns if you invested in them last year?

Isaac Frank: Anxiety and the Need for Research. Examining the correlation between pharmaceutical spending and anxiety.

Anmay Gupta: Effect of Various Factors on 3D Print Strength. 3D prints can be strengthened by changing a variety of factors.

Srihith Garlapati: Government Department Earnings by Region. This is an investigation of the highest earning departments on average per each state in each region in the US.

Tom Pan: COVID-19 Cases/Vaccinations Analysis. Analyzing cases and vaccination progress in China, Canada, United States, and United Kingdom.

George Pentchev: Course Enrollment in Secondary Schools. Ranking secondary school courses by enrollment and examining the growth of the most popular courses.

Gabriel You: Sinoalice Power vs Cost Analysis. Looking at Sinoalice Weapons and comparing weapon total power to cost to see the correlation between them.

Joseph Jeiwan Kim: Change In Sea Level. An examination of climate change’s impact on the global sea level.

Varun Mittal: Impact of Covid On Home Court Advantage in the NFL and NBA. Examining how much home court/field advantage was affected by Covid-19 by measuring the number of times the home team won and other stats.

Shyam Sai Bethina: Motor Vehicle deaths by Age and Year. Is the message “Don’t drink and Drive” still prevalent in older people?

Oliver Altindag: Liverpool Football Club’s Best Eleven. This is an inquiry into the statistics of Liverpool FC and the best possible Eleven the club could field in future games.

Daniel Wang: Effects of Covid-19 on Student’s Educational Stress. Examination of the stress levels experienced by students before and during Covid-19.

Arghayan Jeiyasarangkan: How The Overall Gold Difference Changes Throughout Professional League of Legends Games. Examining the relationship between gold difference and game duration in professional League of Legends matches.

Arjun Velayutham: Earnings vs Inflation. Has the increase in a company’s earnings outpaced the inflation margin?

Shreya Sree Morishetty: Undergraduate Engineering Enrollment by Demographics. Exploring if the demographics have changed through the years as enrollment in engineering programs has risen.

Stephanie Saab: These Boots are made for Shopping. A comparison of men and women’s shoe shopping tendencies shows that women spend more money while men are fond of a sale.

Kyle Sabo: How Speed Affects Offensive Ability in Baseball. Determining the extent that speed enables players to get more hits.

Varun Dommeti: Prevalent Emotions in Mainstream Music, by Decade. Exploration of emotions in the Billboard top 100, sorted by decade.

Soham Gunturu: Mortality rate of babies under different variables. Relationship between mortality rate of children under 5 per 1,000 live births and year, country, and economic class.

Felix Cai: Relationship Between GDP per Capita and Percentage of Population with a Confirmed Case of COVID-19. Exploring the relationship between GDP per capita and percentage of population with a confirmed case of COVID-19.

Heather Li: An Overview of Toronto Shelter Data in 2020. Mapping the occupancy percentage of Toronto shelters during 2020.

Gov 1005, Spring 2021

Nosa Lawani: Towards Understanding Gun Homicide in the 50 States. The project compares correlates of gun violence across the 50 states.

Hana Kim: How NYC Students of Different Racial Groups Perform Academically Over Time. My project explores how students of different racial groups in NYC perform on benchmark exams over time.

Shai-Li Ron: Covid-19 Vaccination Rates for Cities of Different Socio-economic Status. This project investigates how socio-economic status impacts vaccination of cities in Israel, a country which has been leading with the highest vaccination rates.

Mohit Mandal: The Future of Cricket: the Evolution of Twenty20. An investigation of how T20 cricket has evolved since its inception in the early 2000s.

Trevor Cobb: Healthcare Costs and Barriers to Care in the United States. An exploration of the relationship between healthcare costs and barriers to care in the United States.

Alice Chen: Tech and Money: Global Access and Ownership of Personal Technology. This project explores the relationship and change in personal tech device ownership and internet access of people globally.

Fahad Alkhaja: Managing Expectations: xSoccer Data. I look at and analyze different soccer stats with a few sample cases, specifically xG, xPts and models I generated.

Alex Tsotadze: The Effects of COVID-19 on Domestic Disputes in US Cities. I analyze how new COVID-19 cases affect domestic dispute calls in certain US Cities, specifically Baltimore, Cincinnati, Los Angeles, Orlando and Seattle.

Caroline Behrens: Behind Bars: How Prison Statistics Vary Between Northern and Southern States. My project explores the differences between prisoners in northern and southern states.

Dennis Blyashov: Kittbio Labs: An Exploration into Customer Demographics. The application leverages the power of Google search trends to explore specific keywords and build customer demographic models.

Scott Bek: COVID-19 Death Rate and Air Pollution Level in China. My project explores whether air pollution level is related to COVID-19 death rate in China.

Christopher Snopek: Hitter Evaluations Regarding Barrels. This project investigates which MLB players are undervalued and overvalued based on the percentage of hits they have that are hit hard.

Shaked Leibovitz: Happiness, Freedom and Gender Inequality: Exploration Across Countries. My project explores levels of gender inequality in different countries all over the world and the relation to happiness and perceived freedom to make life choices.

Nana Koranteng: Coups and Stability: How the Middle East Compares to the Rest of the World. My project explores how Middle Eastern countries compare to the rest of the world in terms of coup related events.

Sarah Brashear: Student Achievement and Socioeconomic Status in U.S. Public Schools. I analyzed the relationship between student achievement and socio-economic status in U.S. public schools.

Gov 50, Fall 2020

James Fitz-Henley: Distributing Opportunities: Economic and Demographic Predictors of Opportunity Zone Designation. How do a census tract’s economic and demographic conditions predict its designation as an Opportunity Zone in the American southeast?

Katherine McPhie: Social Connectedness in the Harvard Class of 2024. Analyzing how Harvard first-years have been forming social connections during the COVID-19 pandemic. This project is joint work with Elliott Detjen, Giovanni Salcedo, and Ava Swanson.

Ryan Zhang: Strategic and Sincere Considerations in American Presidential Primaries. Exploring how often, and the extent to which, primary voters balance sincere concerns (candidate favorability and ideology) and strategic concerns (candidate viability/electability) in American presidential contests.

Ben Lee: A Minnesota Fisherman’s Guide to the Spring 2021 Season. This project gives recommendations for the best Northern Minnesota lakes for walleye fishing for the 2021 season.

Julia Blank: International Judging Biases in Ice Dance: Analyzing the 2018 Olympic Season. An investigation into whether a judge’s nationality bias is reflected in the scores of Olympic level Ice Dancers.

Akila Muthukumar: English Requirements in US Clinical Trials. An analysis of English and Spanish language requirements in eligibility criteria for neurological and mental health related clinical trials.

Alexa Jordan: Women in the Courts: Parties and Presidents. This project looks at the progress we have made since Sandra Day O’Connor in 1981 and investigates the role of sexism in the continuing lack of gender parity.

Carter Martindale: SCOTUS Political Leanings. A look at how the Supreme Court has ruled on various issues and an attempt at predicting how the current court will vote.

Sara Park: Polity and the Social Institutions of Gender Discrimination. An analysis on the relationship between polity and social institutions of gender discrimination.

Daniel Salgado-Alvarez: When All Work Means All Play: International Tourism in Mexico. Analyzing what different tourist attractions and government interventions generate the most tourism GDP using data from all the states of Mexico

Amaya Sizer: An analysis of foundational styles for ranked UFC fighters. An analysis of the different foundational styles of ranked fighters, and how they are related to number of wins and rankings.

Monica Chang: Justice Delayed is Justice Denied: Litigating Delays in Disability Accommodation for Homeless Families. My project investigates whether disability-related needs are being met for homeless individuals within a reasonable timeframe in support of the Greater Boston Legal Services’ class action lawsuit against the MA EA Shelter system.

Nick Maxwell: Examining the Ideology of the Roberts Court. An analysis of the voting history of the Roberts Court by ideology and issue.

Alexander Park: Home Field Advantage in the NFL, MLB, and NBA. An analysis of the importance of home field advantage in America’s three largest professional sports leagues.

Esther Kim: Modern Patterns in Korean Immigration. A look at modern patterns in Korean immigration through the relationship between types of visas and national economic growth.

Nick Brinkmann: Best To Test: Analyzing My Online Chess Games. Investigates trends in my online chess games over the past 2-3 years, including rating changes and performance in different openings; attempts to predict game outcomes based upon various characteristics.

Osvaldo Cervantes: An Analysis of Gaming, Gender, and Psychology. A study into how behavioral factors such as gender identity, anxiety, or other behavioral descriptors are revealed through gaming stats and real world metrics.

Will Rowley: Starting Pitchers: A pitch by pitch analysis of the 2019 MLB season. A look at how different starting pitchers compare to one another in pitch selection, pitching under pressure, and performance statistics, with an analysis of the effect of pitch metrics on WHIP and ERA.

Yifan Chen: Analyzing Trends in College Financial Aid. My project will look at how percentages in different types of financial aid have changed over time, and how they vary depending on student characteristics.

Jasmine Hyppolite: Hashtags As Social Movements: Can They Impact Legislation?. This project looks at how the popularity of hashtags for the Black Lives Matter Movement and Me Too movement correlates with the presence of bills regarding related topics in the NY Legislature to understand how impactful hashtag activism is.

Dash Chin: kanye omaRi west. An analysis of music streaming and writing trends through the lens of music by Kanye Omari West.

Sophie Bauder: The Impact of COVID-19 on Education. An analysis of how COVID-19 cases impact school reopenings, and how the pandemic as a whole affects educators.

Loic Tagne: The Splash Brothers. Analyzing Stephen Curry and Klay Thompson’s 2015-2016 NBA Season

Michelle Kurilla: Analyzing Case Salience and Majority Opinion Assignments by Chief Justices. A look at how case salience impacts the frequency of majority opinion assignments by Chief Justices to themselves

Ang Sonam Sherpa: Gender Representation in Politics - Case Study Nepal. Analyzing the political representation in Nepal along gender lines with regressions of various demographic, developmental and gender-related characteristics in a district

Hope Kudo: How the Average American Spends Their Time. Exploring the American time usage survey and looking at how demographic and socioeconomic factors influence how people spend their time

Josh Willcox: Natural Resources and State Structure. Exploring the relationship between types of state revenue and regime type

Neloy Shome: Covid-19: Domestic Impact and International Pandemic Management. Analyzing Covid-19 impact on mental health disorders and racial minorities.

John Chua: Food Insecurity in San Francisco. My project visualizes and predicts risk factors for food insecurity in San Francisco in support of Alemany Farm’s Community Food Project Program.

Ton-Nu Nguyen-Dinh: The Dam Building Boom. An interactive map visualization of dams and reservoirs in the world and analysis of factors influencing their construction.

Joshua Berry: Economic Indicators for G20 Attitudes towards China: How Trade Flows Predict Public Opinion. Analyzing how international trade flows and macro-economic data predict individual, micro-level attitudes towards China among the G20 countries.

Kai McNamee: Racial Geography of the MBTA. Examines the racial geography of Boston’s public transit system.

Ruby Huang: NCAA D1 Women’s Volleyball 2011-2019. This project looks at the average performance of top teams and individual players in the nation across every skill category from 2011 to 2019.

Taryn O’Connor: Analysis of US Senate Elections, 1980-2018. An analysis of the relationship between Senate campaign expenditures and election outcomes.

Justin Qi: Equitable Lending? Don’t Bank on It: Using Waiting Times to Measure Racial Disparities in the Paycheck Protection Program. Exploring the potential role of taste-based discrimination in Paycheck Protection Program loans.

Jack Murphy: UNGA Regional Voting Patterns. An interactive visualisation of regional voting patterns in the UN General Assembly, allowing comparisons at both the region- and country-level.

Hollyn Torres: NFL Ticket Price Data. Analyzing the average NFL ticket price in relation to regional location and success

Megan Mackey: The Economics Of The Premier league From 2009-2015. A look at how the league finish of a team in the premier league affects the payments they receive and the consequences that come with that

Nikita Lledo: What direction are we going?. Exploring data on the direction in which African citizens think their respective countries are going.

Matt Tynes: NBA MoneyBall. Analyzing advanced statistics and salary in the NBA

Matthew McGlone: Minesweeper!. An analysis of minesweeper stats between a pro and myself at several difficulties to compare potential record times

Andre Ferreira: The Reality of Climate Change. My project analyzes trends in temperature anomalies and their correlation with CO2 emission growth.

Diana Zhu: Analyzing Figure Skating Scores on the International Circuit. Using World Championship Figure Skating data to explore scoring trends and identify home-town advantage.

Emily He: U.S. Teacher Characteristics Throughout the Years. How have the qualifications of U.S. teachers evolved over the years?

Drake Johnson: Protest Risk in Different States and Countries. An analysis of the likelihood of violence and/or fatality by the police during a protest in any given state or country

Jasper Goodman: The Presidential Politics of COVID-19. This project provides a framework for understanding COVID-19’s impact on the 2020 presidential race.

Ruy Martinez: COVID Cases and Deaths as a Factor of Race. How does race factor in to who gets COVID and who dies — and what does it say about how we deal with disease?

Rena Cohen: First to Close, Last to Open: The Economic Effects of COVID-19 on Arts Organizations. An exploration of the financial impacts of COVID-19 on the arts and culture sector using survey response data from over 17,000 organizations collected by Americans for the Arts

Sammy Murrell: Work-Related Fatal Injuries in New Zealand (2000-2014). An exploration of work-related fatal injuries by demographics, working circumstances and location.

Yishak Ali: Fair or Exploitative Pricing: Pharmaceutical Drug Price Trends. An analysis of recent rising drug costs in the US in the context of R&D costs to determine if firms are raising prices for R&D reasons or profit-raising motives

James Joyce: Global Terrorism and Economic Inequality. An analysis of how economic inequality affects the frequency of terror attacks around the world.

Derek Chang: U.S. Economic Growth: A Look Back and a Look Forward. An exploration of the current economic environment and public policies amidst COVID-19 that begins with an examination of America’s recent economic history.

Sofie Fella: Female Athletes, Body Image and Societal Expectations. My project investigates data on women in sport, general female body image, how female athletes are seen in media and, most importantly, the relationship between female athletes and their body image.

Rom Blanco: Urban Population and Forested Area. A study of the relationship between urban population and forested area in 227 countries and territories between 1990 and 2019.

James Wolfe: U.S. Politics and Public Opinion in the Wake of Covid. My project analyzes the state of U.S. public opinion and political sentiment in the time of coronavirus, with a focus on Presidential politics and the 2020 election.

Khalid Thomas: PGA Tour Data: How do you match up against the pros, and what goes into a good golf game. An analysis of how to improve your golf game.

Ciara Duggan: Predicting World Bank Group Employees’ Overall Job Satisfaction. Which variables related to World Bank Group employees’ work and workplace environment are most predictive of their overall job satisfaction?

Hiren Lemma: A Multifaceted Analysis of Trends in Voter Data. An exploration of the extent to which various factors affect trends in voting data, based within the United States by county

Owen Asnis: The Battleground: Wisconsin, Michigan and Pennsylvania in Contemporary Presidential Elections. Studying the battleground states of Wisconsin, Michigan and Pennsylvania in contemporary presidential elections

Anthony Morales: Does Money Win You Titles in the Premier League?. An analysis of how the money spent by each Premier League team each season affects their performances.

Winona Guo: Climate Colonialism. This project explores how colonial histories have impacted countries’ climate risk today.

Abigail Skalka: EU Sanctions 2002 - 2020. This project examines the people and entities subject to EU sanction in a comparative framework with the United States.

Aidan Borguet: NFL’s Best Plays. It is an analysis of the NFL’s highlight plays for each team and what their tendencies are.

Geena Kim: NYC Crime Analysis 2019-2020. An analysis of how different factors and characteristics correlate with being a crime suspect or victim in NYC.

Bobby Current: Education and Political Leaning. I look at how much, if at all, education spending by district affects voting pattern in presidential elections, with both the total spending and the spending per capita being used.

Lavinia Teodorescu: How communism affected Women’s Rights. My project compares data from communist, post-communist and “never-communist” countries and analyzes differences in employment, education and healthcare

Uluc Kadioglu: UFO Sightings in the United States. An analysis of the different factors that might be correlated with UFO sightings across the US.

Ana Castaner: Does Political Affiliation Influence County-Level COVID-19 Infections in the United States?. An analysis of the relationship between county-level political ideology and COVID-19 case density across the United States.

Buddy Scott: Buddy Ball: Understanding NBA Finances Amidst COVID-19. Analyzing finances of NBA teams and how COVID-19 (specifically the lack of revenue and income generated by ticket sales) could affect the economics of the league.

Liam Hall: Oklahoma Demographics and Economic Status by School District and the Likelihood of a Special Olympic School Program. What is the correlation between demographic groups, economic status, and the presence of a Special Olympic School Program in the state of Oklahoma?

Sophie Li: Analyzing COVID-19’s Impact on Poverty in Sub-Saharan Africa. Analyzing COVID-19’s impact on poverty in sub-Saharan Africa

Satoshi Yanaizu: Comparison between Global Steel and Aluminium Production. My project analyses the global trend in steel and aluminium production

Reem Ali: Failed Punishment: How US Sanctions Have Impacted Development Worldwide. An investigation of how US Sanctions (1980 - 2015) have impacted development (evaluated through the indicators of public health, education, democracy scores, imports, and exports) worldwide.

Eleanor Fitzgibbons: Supreme Court Justices Over the Years. My project explores the voting history and ideological leanings of each Supreme Court Justice from 1946-2020.

Naomi Jennings: Misogyny in Rap. A look into trends of misogynistic sentiment in rap music over the decades

Felix Deemer: How does Politics impact Income Inequality?. My project investigates the link between income inequality and political institutions, and how certain political environments are correlated with greater levels of inequality.

Daiana Lilo: Analyzing the Emotional and Psychological “Tells” of a Supreme Court Justice. Looking to see if behavior is a good indicator in predicting how a Justice will vote during the oral argument stage of a case.

Anh Ton: Analysis of Prosecution Data in Middlesex County, MA. Analyzing the Efficacy and Racial Disparities in Prosecution

Lucas Gazianis: Exploring the Ideological Tendencies of the Supreme Court. This project explores the ideological consistency of Supreme Court justices, more specifically how frequently they vote out of lockstep from the other justices appointed by presidents of the same party.

Josiah Meadows: Forces Behind Florida Voter Registrations. Determining how closely rising unemployment, COVID-19 deaths and cases, and other pivotal events in 2020 correlate with increases in FL registration numbers.

Salomé Garnier: Sexual Education and Health Outcomes for Women and Girls. My project explores the relationship between sexual education and health outcomes for women and girls, using data from the Sustainable Development Goals.

Kendrick Foster: Predicting Corruption in Latin America. What economic and political factors are most useful in predicting corruption in Latin America?

Lukas Emge: NFL Combine: Do the Results Really Matter?. My project investigates and analyzes the data from the 2013 to 2017 NFL Combines, looking at which tests matter the most for different positions and what each test can tell us.

Prashanth “PK” Kumar: County-Analysis of COVID-19 by Education and Population. Here is a look at each county in the US and their COVID rates as affected by demographics including education, population, and population density.

Pierce Bausano: Harvard Wrestling Database. A database of Harvard wrestlers.

Charles Hua: Assessing the Landscape of Climate Political Fundraising. My project analyzes the landscape of climate political donations in the 2020 election, broken down by political party, political office, and state.

Janna Ramadan: Perpetuating Islamophobia in the United States: Examining the Relationship Between News, Social Media, and Hate Crimes. Exploring the role social media plays in perpetuating negative sentiments towards Muslims and the conflated Arab and South Asian communities and the relationship between hate crimes and media coverage.

Sreya Sudireddy: Effect of Social Distancing Policies on COVID-19 Outcomes. A look at how Massachusetts’ social distancing policies have affected COVID-19 outcomes.

Trisha Prabhu: Analyzing @realdonaldTrump: A Deep Dive Into Donald Trump’s Tweets. An analysis of what drives Donald Trump’s sentiment on Twitter.

Setu Mehta: Going the Social Distance in New York City. Exploring the links between social distancing violations and COVID-19 cases, deaths, and hospitalizations in New York City.

Katrina Keegan: Is Someone, Anyone, Leading Protests in Belarus?. The goal of this project is to understand how the 2020 protests in Belarus are organized and led based on information from the social media platform Telegram.

Andrew Jing: Equity in Education: Effectiveness of State-Level Policies. Using specific education policies to predict the equitability of state-wide systems.

Victoria Wang: Violence at 2020 Protests. Exploring factors that predict violence at nationwide protests in 2020.

Seth Filo: Predicting NFL Success by College, Combine, and Physical Traits. This project is a deep dive into which traits and past production have the most predictive power for NFL success; it studies each of the 22 offensive and defensive positions and strives to find key statistics that may be undervalued.

Noah Dasanaike: How is Democracy Changing?. Measuring changes in democracy over time with predictions for the future.

Vlad Ivanchuk: Life on the Frontline: People’s Access to Services Amidst an Armed Conflict in Eastern Ukraine. Analysis of access to basic services and reliance on local government among people who live in the midst of the military conflict in eastern Ukraine.

Zelin Liu: Elephants, Donkeys, Doves, and Hawks: Predicting U.S. Treaties. Examines the impacts of political party and military spending on how many treaties the Senate will receive in a given year.

Gov 1005, Spring 2020

Sophie Webster: HUDS Feedback Analysis. This project analyzes 2500 text messages of feedback sent to Harvard University Dining Services (HUDS) by students in 2019.

Andy Wang: Contributions In and Out of the Classroom: Harvard Faculty Political Spending. This project looks at political spending behavior of Harvard faculty since the 2016 election, as well as modeling for variables such as age, gender, and school

Jeremiah Kim: Harvard 2023 Social Connections. We analyzed social connections across the Harvard class of 2023 with respect to demographics, campus geography, and extracurriculars. Other members of the group: Helen Pang, Jack Kelly, Emily Ni, Kelsey Wu and Mark Stephens.

Jamal Nimer: Harvard Housing Project. We randomly assign blocking groups to one of the twelve houses and compare the distribution of identifiers (i.e., ethnicity, varsity athletes, legacy) to the actual distribution at the college. We also study patterns in blocking group formation. Other members of the group: Eliot Min, Lucy He, Austin Li, Sam Saba, Carina Peng, Angie Shin, Shojeh Liu and Ilyas Mardin.

Lainey Newman: Union Density & Democratic Vote Share in the U.S. 1976-2016. My final project for Gov 1005 looks at the relationship between union membership density and Democratic vote share in presidential elections.

Wyatt Hurt: Transboundary Water Conflict. An exploratory study that interrogates qualitative research claims about the causes of transboundary water conflict.

Jessica Edwards: Investigating Demographics within Higher Education Engineering Programs. This project analyzes racial and gender demographics for highly ranked US engineering programs within colleges and universities.

Paolo Pasco: Thinking Inside the Box: Analyzing Crosswords. Exploring the New York Times and Los Angeles Times crosswords for patterns, trends in difficulty, and changes to language over time.

Lindsey Greenhill: Human Trafficking and Exploitation. An analysis of the Counter Trafficking Data Collaborative’s publicly available data focusing on the demographics and movements of trafficked victims across the world.

Luke Kolar: Discovering ‘Discovery’. Examining samples, remixes, and covers of the 14 songs on Daft Punk’s Discovery, and exploring potential connections between use volume and song qualities.

Vivian Zhang: EU Remittance Flows. A visualization tool to analyze EU remittance inflows and outflows by country and by year from 2000 to 2018.

Jessica Wu: In the Time of Coronavirus. This project explores the state of life, love, and sorrow in the time of coronavirus and social distancing, from the rise of boredom and the deterioration of relationships to the manifestation of socioeconomic inequality.

Elias DeLeon: This Is Magic. Looking at what makes a Magic: The Gathering card “good,” specifically tournament viability and selling price on online marketplaces.

Liz Hoveland: Effect of Absentee Voting on Future Electoral Participation. A study looking at the effect first-time voting type has on future voting behaviors.

George Dalianis: Social Spending Programs in OECD Member Countries. Examines the effect of social spending programs on GDP per capita, economic inequality, life expectancy, and other social/economic variables in OECD member nations.

Saul Soto: A Decade of Dance. A comparison between the UK and Germany Top 100 Charts from 2000 - 2010.

Yong Lee: Effect of Attendance Policy on Synchronous Online Lecture Attendance. Survival analysis examining differential levels of attrition for synchronous lecture attendance in online classrooms.

Jason Yoo: Is Chess Solvable?. Looking at opening moves in chess and various other factors to determine their correlative effects on win percentages.

Nidal M.: Urgent Fury: Will Your Coup d’Etat Succeed?. A project that predicts the probability of coup success.

Alexandra Ubalijoro: Impact of Demographic Factors on Obesity Rates in the US. A project that looks at obesity rates in the US and how these are affected by social factors such as food access and income.

Michael Chen: Impact of Varying College Characteristics on Innovation Rates. This project is a comprehensive analysis on Opportunity Insights data on patent rates by college, as well as data on college characteristics from the College Scorecard Project.

Westley Cook: Payroll and Performance: Does Money Buy Wins?. An analysis of the relationship between payroll and performance using data from the NBA and MLB, demonstrating a moderately strong relationship between spending and success for most teams except the New York Knicks (surprising no one).

Jad Maayah: Women and Religion in the Middle East. This project analyzes public opinion behind policies and cultural norms that promote gender equality in the Middle East and explores the relationship between discriminatory values and respect for religious authority.

Michael Wu: Unemployment in the Era of Covid-19. Examining the effect of the coronavirus pandemic on US weekly unemployment claims across all fifty states.

Amy Zhou: U.S. Case Law Over the Years. Examines US officially published case law and the salience of various topics. Takes a special look at gender and the Supreme Court.

David Sutton: Switchers: Analyzing the Relationships Between Vote-Switching & Demographics, Policy Positions and Public Opinion. This project investigates switchers - voters who cast their ballot for a major party’s candidate in one election and then cast their ballot for the other major party’s candidate in the following election (same office).

Richard Zhu: Oscars So Local?: Film Awards by Demographics and Geography. Analysis of the Academy awards versus other film awards in terms of geography, demographics, and popularity over time.

Benjamin Villa: The Sacred and the Profane: Social, Religious & Medical Effects on Emotional Wellbeing. This project attempts to see how different social, economic, medical, and religious factors affect mental health in 89 Metropolitan and Micropolitan Statistical Areas in the United States in 2016.

Arnav Srivastava: Global Health Spending Trends. Exploring relationships between health spending and factors such as income and demographics for various countries, helping us better understand the similarities and differences in health spending across our global community.

Cassidy Bargell: Sport Perceptions. This project looks at how sports are searched for and talked about on the internet, specifically related to concussions and other injuries.

Scott Mahon: Effect of MLB Statistics on Team Winning Likelihood. A look at how different MLB statistics, such as runs, batting average, etc. impact a team’s total number of wins at the end of the season.

Raymond Hu: Online Grocer Case Study: Instacart. Analysis of Instacart shopping trends and a comparison with traditional brick and mortar grocery stores

Rachel Phan: Obesity and Food Insecurity in the US. This project analyzes how food insecurity might affect obesity rates in the US and across states. It also takes a look at how poverty and other demographics affect both of these factors.

Elias Abu Nuwara: The War on Drugs: Let the Numbers Speak. An analysis of data on the US War on Drugs. Examining how increasing the intensity of drug law enforcement could yield counterproductive results.

Linda Qin: The Language of Emojis 🤠. Analyzing the most important question of our generation: which emojis do Harvard students prefer?

Belinda Hu: How Startups Can Thrive in the US. An analysis of different factors that affect how startups operate in the US, like type of industry, location, and education in the area.

Karen Jiang: Accountable Care Organizations in Medicare Shared Savings Program. This project looks at the generated savings or losses from ACOs for Medicare based on ACO, patient, and provider factors.

James Hutt: Reshaping the United States. States are increasingly defined by their composition, rather than their geography - this project remaps the US based on flight paths, migration, and political, religious and racial homogeneity. Finally, the East and West Coast can be together, like they always wanted.

John Morse: HIV and the Potential of PrEP. A look at HIV diagnoses across the U.S., and the potential cost savings offered by PrEP.

Katherine Wang: Schools as Social Mirrors. This project analyzes the extent to which a school’s inequality mirrors that of its community.

Julia Englebert: Agriculture and Education in the Midwest, 1870-1960. This project uses census data and mapping to explore the history of agriculture in the Midwestern United States, paying particular attention to its relationship with education.

Hannah Phan: Effect of Public Infrastructure on Boston Crime Levels. Analysis of how public resources and amenities such as Bluebikes, streetlights, and trees influence crime levels in Boston from 2015 to present.

Connor Riordan: How Health, Income and Education Affect Voting. A look at how three separate variables, health, income and education, affect voting in the 2016 presidential election.

Hudson Miller: College and Military Demographic Comparison. Comparing the demographic breakdowns of members of the US military and undergraduate students.

Paddy Adams: Brexit Voting Demographics. Understanding the basis for demographic stereotypes following the 2016 Brexit referendum by analysing national census data and the correlations to Brexit voting.

Hamaad Mehal: NBA Fine Data. Looking to see if NBA fines work equitably and efficiently in changing certain behaviors in players.

Suruchi Ramanujan: Opioid Trends Across the United States. This project uses data from the CDC and chapter55.digital.mass.gov to examine trends in opioid death and treatment in the United States.

Hamid Khan: Cricket Analytics. Analyzing how changes in the rules of ODI cricket have impacted run-scoring and wicket-taking.

Leena Ambady: Organ Donor Registration and Demographic Trends in New York State. Looking at how demographic factors like age, income, and race might affect organ donation registration rates in New York state counties.

Lara Teich: Winning Early in Curling. Looking at the influence of winning the first end on winning the full game in curling for Olympic curlers and College curlers.

Jerrica Li: Coronavirus Up Close. I am creating visuals and maps for understanding the scope of underreported confirmed cases and the magnitude of COVID-19 in the US.

Matej Cerman: Educational Inequality in Slovakia. Exploring the ties between socioeconomic conditions and unequal educational outcomes among Slovak regions.

Ella Michaels: Goodwill Hunting. Looking at Goodwill locations throughout the US and understanding their relationship to neighborhood housing prices.

Rebecca Xi: The Covid-19 Data Project. A group project to analyze the spread of COVID-19, its economic impact, and the efficacy of government policy in mitigating the crisis. Other members of the group: Jun-Yong Kim, Katelyn Li and Nishu Lahoti.

Emma Freeman: Diversity of Upper Level Educational Institutions. Exploring factors, such as female population or admissions rate, that may contribute to racial diversity in upper level educational institutions.

Chase Souder: Modernization in Drum Corps International. Investigating the trends in musical selections of Drum Corps International World Class Finalists, as well as potential correlations between the modernity of a repertoire and score/placement.

Jacob Hansen: Is Arizona More Blue Because It’s Less White? Examining whether changes to Arizona’s racial demographics, particularly increasing Hispanic/Latinx populations, are correlated with increased Democratic vote share.

Grace Zhang: Harvard General Education Courses. This project aims to understand trends in general education courses at Harvard and their implications on general education course enrollment caps.

Josh Mathews: NBA All Star Game (1951-2020): Does Player Popularity Impact Performance? This project investigates the effects of popularity on minutes granted to players in the All Star game and their performance while also allowing for visualization of player metrics and shot charts over time.

Owen Bernstein: Content Analysis of Presidential Speeches. Identifying and quantifying the usage of populist, immigrant related, conservative, progressive, and environment related language in presidential campaign speeches in the United States.

Ishan Bhatt: Gender’s Effect on Speaker Points in National Circuit Lincoln-Douglas Debate. Using pretty much every national tournament from the past three years, I investigated to see if there’s a gender bias in the amount of speaker points judges assign to debaters.

Cameron Reaves: Predicted Net Migration from Sea Level Rise by 2100 for US Counties. This project visualizes a dataset containing predicted net migration from sea level rise by 2100 for US counties.

Jason Rose: Harvard College Courses. In this project, I visualize trends in Harvard’s academic offerings and course enrollment.

Daniela Teran: Gender Equality and the Informal Economy: An analysis of the Andean Community Countries. This project aims to visualize the correlation between the percentage of women working in the informal economy and gender inequality in the Andean Community countries.

Kiera O’Brien: Climate Policy & Public Opinion. I analyze a common misconception regarding American politics and the climate challenge: that Republicans are opposed to action of any sort.

Taylor Greenberg Goldy: Understanding E-commerce Analytics and Purchase Behaviors. This project aims to look at different purchasing behaviors of customers on an e-commerce website and predict and suggest what other products they may also want to buy.

Teddy Landis: Effect of First-Time Voting Method on Future Participation. We look at first-time voters in North Carolina and see if voting method (absentee ballot vs. in-person) has an effect on future participation in elections.

Brian Kim: Korean Parliamentary Elections from Democratization (1987) to 2016. This project looks at election results data from democratization in 1987 to 2016 to see nationwide PR voting behavior, regional voting behavior (based on single-member districts) and any relationship between senior population and vote share by ideology.

Jenna Moustafa: Common Core Standards and Racial Education Achievement Gaps. This project assesses the impact of the implementation of the Common Core Standards on narrowing the educational achievement gap for racial minorities.

Tivas Gupta: Interactions Between Inequality, Populism, Social Spending, and Happiness. This project looks at the relationship between inequality and populism, social spending, and happiness throughout countries worldwide.

Will Schrepferman: State of the Union Text Analysis. This project applies textual data science techniques, including sentiment analysis, topic modeling, and neural network-powered Natural Language Generation, to State of the Union addresses!

Arushi Saxena: COVID-19: Early Public Sentiment about Social Distancing. The purpose of the project is to understand public sentiment about social distancing via Twitter data, across the United States, and correlate it to the extent to which US states are effectively social distancing.

Diassa Diakité: College Football: Win Regression Model. A dive into which statistics in college football over the past ten seasons correlate best to success within a given conference, defined as conference win percentage, and throughout most of Division 1 FBS football, defined as total win percentage.

Thomas Weiss: Moral Hazard Among Lawyer-Legislators. This study examines whether the professional background as an attorney of many U.S. politicians has an influence on their voting behaviour.

Mak Famulari: FBI’s Top Ten Most Wanted Fugitives. This project investigates the FBI’s Top Ten Most Wanted Fugitive list since its inception; it consists of a breakdown of criminals and their crimes, as well as considerations of special cases.

James Bikales: Demographics of EV Charging Station Placement. This project examines more than 28,000 EV charging stations in the U.S., comparing their locations to Census demographics, such as median household income, of the county in which they are located.

Andrew Courtney: Trends from the 2018 Midterms in CA-48. I study the support for candidates in California’s 48th Congressional District during the 2018 Midterm elections by demographic breakdown.

Brendan Chapuis: How Accurate Are Prediction Markets? This project examines the accuracy of prediction markets as compared with polls in the context of the 2016 and 2020 Democratic primary elections.

Stephanie Cheng: Deep Dive into the Design Census. This project investigates the demographics of the US design industry and provides a predictor for job satisfaction and salary by a few metrics of interest.

Grace Pan: Need A Good Date Spot?: Analyzing Yelp Restaurant Reviews. This project uses a geographically diverse subset of Yelp data to analyze what type of restaurants make good date spots, the results of which are useful not only for individuals trying to impress their dates, but also for professionals in the restaurant business looking to attract more customers.

Julius Gunnemann: Berlin housing: A diff-in-diff assessment. Berlin has intervened in the apartment rental market in an unprecedented way, requiring landlords to cut prices to 2010 levels. Can this work?

Gabe Cederberg: Partisanship and COVID-19 Shelter-In-Place Adherence. Partisan leaning significantly impacts the degree to which people follow COVID-19 shelter-in-place orders.

Tahmid Ahmed: Weather Effects on NY Giants. This project focuses on the effects of weather on the NY Giants’ performance and attendance using regression modeling.

Asmer Asrar Safi: Government Censorship on Twitter: Requests and Response (2012-2019). This project tracks content withdrawal requests made by Governments on Twitter and Twitter’s response, in relation to a country’s GDP and overall Freedom Score.

Jonah Fried: Professor Ratings. My project is an analysis of the effect of different factors on the rating of professors at a variety of colleges.

Henry Austin: Visualizing Wealth Inequality in Michigan. This project analyzes the geographic movement of wealth across Michigan over the past several decades and its impact on educational attainment, as well as the impacts of past policies on the state we see today.

Ibraheem Khan: Chicago Public Schools. The project explores the interplay between Chicago Public Schools’ budget deficits, school closures, and overall enrollment.

Mohamed Mabizari: College Football: Win Regression Model. A dive into which statistics in college football over the past ten seasons correlate best to success within a given conference, defined as conference win percentage, and throughout most of Division 1 FBS football, defined as total win percentage.

John Mark Ozaeta: Worldwide Correlates to Happiness. An exploration of how different variables affect the happiness of countries around the world.

Shreyvardhan Sharma: Mapping COVID-19: Outbreak, Second Order Effects, and Public Opinion Analysis. An analysis of COVID-19 involving mapping the spread of the coronavirus across the U.S., measuring the second order effects of the disease, and analysing the change in public opinion over the course of the disease.

Julian Habermann: Immigration Ineligibilities. An analysis of immigrant ineligibilities that hopes to find the common reasons why immigrants are denied.

Micah Williams: Visualizing Covid-19 in the United States. Visualizing the spread of Covid-19 in the United States with graphs and animations.

Naina Tejani: It’s not a Myth: Climate Change and its Related Factors. This project traces the rise in global temperature over the past century and looks at factors that might be responsible.

Garrett Rolph: Students in Schools. I compared factors such as school expenditures, average teacher salaries, and school demographics to explore the relationship these variables have with standardized test scores (both SAT and ACT).

Yanghe Liu: COVID-19 in the World. This project explores visualizations of COVID-19 around the world from multiple perspectives, including but not limited to maps and histograms.

Fatma Al-Alawi: The Greenest Building. An exploration of how much energy is embodied in our built environment, and what it means to demolish existing buildings rather than reuse them.

Tate Huffman: MLB Pitch Sequencing. An exploration and analysis of pitch sequencing and its effects from 2015 through 2019 in Major League Baseball.

Rachel Auslander: Investigating Mainstream News Coverage of Presidential Candidates. An exploration of whether news source choice influences the likelihood to vote for a particular candidate.

Chase Bookin: High School Players and the Major League Baseball Draft. An analysis of the decision made by top high school baseball players to sign in the Major League Baseball draft.

Adelson Aguasvivas: Effects of Redlining in New York City. A study of the effect of redlining in New York City from 1937 to 1940, which mainly affected communities of color, on median household income and housing values from 1950 to 2010.

Kayla Manning: Undergraduate HUDS Traffic Patterns in 2017-2018 and 2018-2019. This project analyzes data on swipe counts for every day, location, and meal for each of the undergraduate HUDS facilities over the 2017-2018 and 2018-2019 academic years.

Gov 1006, Spring 2020

The final project for Gov 1006: Models at Harvard in the Spring of 2020 was a paper replication exercise. Inspiration for this project came from “Publication, Publication,” PS: Political Science and Politics, Vol. 39, No. 1 (Jan., 2006), pp. 119-125 and “How to Write a Publishable Paper as a Class Project,” both by Gary King. See the permanent repo for copies of all the papers.

Miroslav Bergam — (repo pdf) — Zelizer (2019) finds that the cues that legislators take from their peers, in addition to other credible sources of information like briefings, influence their policymaking decisions. I successfully replicated Zelizer’s results. Zelizer took a Bayesian approach to his findings, running 10,000 simulations for each table and figure using for-loops to produce estimates and standard deviations; however, this made his code inefficient and extremely slow. Using the rstanarm package, I was able to simplify his code while maintaining the Bayesian integrity of the study. This extension served as both a robustness test of his results and a simplification that makes the study more easily reproducible.

Maria Burzillo — (repo pdf) — Trounstine (2016) suggests that high levels of residential segregation are associated with increased political polarization and decreased public goods spending. In this analysis, I was able to successfully replicate Trounstine (2016)’s main results. I then attempted to better deal with the large amounts of missing data in the datasets used in the original analysis by multiply imputing missing values and re-running the original models using the resulting multiply imputed datasets. My results suggest that segregation is associated with increased racial political polarization, although maybe not as strongly as Trounstine (2016) originally suggested. Furthermore, I find that Trounstine (2016)’s conclusion that increases in segregation are associated with decreases in public spending holds for large cities, but that diversity is a better explanatory factor for small cities.

Evelyn Cai — (repo pdf) — Horiuchi, Smith, and Yamamoto (2020) found through differences between observational data and their conjoint survey results that Japanese voters’ voting preferences are dependent on external factors. The conjoint survey results yielded consistent voter preferences across priming on different electoral systems; however, observational data and theoretical explanations show that voters’ preferences do vary across electoral systems. I was able to successfully replicate these results and findings. My extension focuses on examining the difference in marginal means instead of the difference of average marginal component effects, which avoids setting an arbitrary baseline when calculating treatment effects.

Katie Cao — (repo pdf) — Ingram and da Costa (2019) find the effect of municipal politics and other socio-structural predictors of violence to be uneven and geographically varying across municipalities in Brazil. In a geographically weighted regression (GWR), the political party of the local mayor yielded inconsistent prediction coefficients for change in homicide rates, with party affiliations with the 3 major parties showing beneficial, harmful, and mixed effects on homicide across Brazil. After replicating the findings of this paper, I use a regression with interaction effects to investigate the authors’ claim that municipality-federal political party alignment can explain the effects. I do not find support for the municipality-federal alignment causal pathway and conclude with a discussion of the limitations of GWR.

Molly Chiang — (repo pdf) — Marshall (2015) shows the causal effect of additional years of schooling on voting conservative in his analysis of voting records before and after the British 1947 school-leaving age reform. I successfully replicated Marshall’s code, except that an update in the rdrobust package led to slightly different coefficients. In an extension of Marshall’s work, I investigated treatment effect heterogeneity between genders. Running rdrobust and creating regression discontinuity figures on male and female subsets of the data revealed that Marshall’s effect of more years of education increasing the likelihood of voting conservative was stronger in women than in men. This finding could complicate Marshall’s argument that more education leads to higher income and then to more conservative political opinions, and perhaps reveals something about the differing effect of education on men and women.

Ali Crump — (repo pdf) — Zoorob (2019) shows that geography and fentanyl exposure explain much of the variation in increased overdose mortality rates between 2011 and 2017. I successfully replicated most of Zoorob’s results, but I found discrepancies in the fentanyl exposure coefficients and the total death estimates for each model. My replication finds that the total death estimates are approximately 13% and 16% larger for each model and that the regression coefficients on fentanyl are slightly larger than those published. In addition to replicating Zoorob’s work, this paper provides a Rubin causal model table to better understand the model framework, focusing on explaining the ordinary least-squares model. Next, the bulk of the extension investigates alternative definitions of fentanyl exposure while keeping all of Zoorob’s other modeling choices the same. The original paper’s definition of fentanyl exposure explains more variation in age-adjusted mortality rates than those proposed in the extension. This finding is important because many different methods of defining fentanyl exposure exist; however, the proposed alternative definitions in this extension do not appear to improve Zoorob’s model.

Drake Deuel — (repo pdf) — Fuller et al. (2019) claimed that the disruption in public transit services caused by Philadelphia’s transit workers strike from November 1-7th led to a short-term increase in bikeshare use in Philadelphia when controlling for temperature, precipitation, and bikeshare use in similar cities during the same time. I successfully replicated Fuller’s results. I recreated the interrupted time-series model that the authors used to model this natural experiment, and the Bayesian time-series model using the CausalImpact R package. I used the same raw data with these structures to model bikeshare use per 100k population in Philadelphia. The models indicate that while the disruption to normal transit availability caused a short-term increase in bikeshare usage, usage returned to baseline within a short period. This may inform policy makers that short-term interventions to promote cycling in cities may not have long-term impacts.

Angela Fu — (repo pdf) — In their paper “Crossing the Line: Local Ethnic Geography and Voting in Ghana,” Nahomi Ichino and Noah Nathan found that the local ethnic demographics of the area in which Ghanaian voters live affected who they chose to support in the 2008 Ghana presidential election. I successfully replicated Ichino’s and Nathan’s results. I have also reexamined their models and added an additional variable to explore the effect of a voter’s trust in members of their own ethnicity. I found that the inclusion of this variable does not significantly alter the model. As a result, trust in one’s co-ethnics or in others of a different ethnicity does not impact one’s voting decisions as much as the demographics of one’s location do.

Michelle Gao — (repo pdf) — Broockman, Ferenstein, and Malhotra (2019) show that technology entrepreneurs have a unique set of political beliefs: they are liberal on social issues, globalism, and redistribution, but very conservative on government regulation. The replication succeeded with the exception of some discrepancies in scaling variables. An extension that used linear regression to formally test whether certain values or predispositions, such as cosmopolitanism and authoritarianism, could predict policy preferences in particular domains, such as support for globalism and social issues, added support to the authors’ ultimate claim that tech entrepreneurs’ unique political beliefs stem from unique underlying philosophical values. Analyzing technology elites’ political beliefs is an increasingly timely task as the tech industry becomes more influential in politics.

Debora Gonzalez — (repo pdf) — Hill, Hopkins, and Huber (2019) argue that demographic changes are not associated with increased Republican vote share at the precinct level between 2012 and 2016. I successfully replicated Hill, Hopkins, and Huber’s results. A robustness test using a subset analysis focusing on the state of Florida indicates, despite minor state-specific deviance and poor statistical significance due to reduced sample size, that the authors’ overall findings are valid. Another test focusing on the effect of any individual state on the overall outcome was also found to support the original paper’s robustness. In addition, I constructed a fixed effects model by state, which further confirmed original findings even under fixed effects assumptions. In a twist of analysis, I constructed a new model analyzing the association between Republican vote share and economic indicators while controlling for change in percent Hispanic. The latter models indicate a stronger relationship between unemployment proportion change and GOP vote share, which suggests an opportunity for further research with alternative theoretical pathways involving economic indicators.

Erin Guetzloe — (repo pdf) — Harris et al. (2020) find that local registration visits by election staff to Kenyan polling sites improve voter registration, while civic education and SMS messages reminding voters to register were less effective at increasing registration. I successfully replicated all of the major findings from Harris et al. (2020). My extension focused on using Bayesian modeling and posterior distributions to reproduce the frequentist models employed in the original paper, which showed that poverty, distance of registration offices from polling stations, and population sparseness were all negatively correlated with higher levels of voter registration. The results of the frequentist and Bayesian models were almost identical, providing corroboration to Harris et al.’s claim that attempts to increase levels of voter registration in Kenya must find ways to better interact with areas that are poorer, that have polling stations distant from registration offices, and that are more isolated.

Carine Hajjar — (repo pdf) — Barber and Pope (2018) show that most Republicans operate according to party loyalty over policy (ideological) loyalty. I successfully replicated Barber and Pope’s results. The results of the paper and the replication indicate that Republican party loyalists vote in line with President Trump, regardless of the political content of Trump’s cues. More specifically, Republicans with low political knowledge, high self-ranked ideology, high partisanship, and high approval of Trump are more likely to support their leader’s cues, regardless of the true ideological implications, even if they are not in line with the party’s traditional views. I looked at Barber and Pope’s regressions testing the causal effect of conservative and liberal cues from President Trump on Republicans, Democrats, and Independents with varying levels of political knowledge, partisanship, approval of Trump, and political ideology. I took the regression on partisanship and knowledge as well as the overall regression of average cue response among all political identities, ran a more robust binomial regression, and corrected a mistake in the first figure of the paper. I reiterate the fact that Republican party loyalists are not necessarily ideological loyalists and, more specifically, that many Republican Trump supporters respond positively to liberal or conservative cues from Trump but not necessarily from other Republicans. This finding forces Americans to rethink the importance of parties and the ideological strength of their positions.

Benjamin Hoffner-Brodsky — (repo pdf) — I’m replicating Jensenius (2015) Development from Representation, which appeared in the American Economic Journal: Applied Economics. Since 1950, the Indian Parliament and India’s state assemblies have guaranteed a minimum number of seats to Scheduled Castes (SCs). Ensuring ascriptive representation for the 16% of Indian citizens who belong to SCs was intended, in part, as a mechanism to equitably allocate resources along caste lines. To implement SC quotas, the federal government non-randomly selected constituencies in which only SC members can run for office, though all members of the constituency are allowed to vote. The paper uses a dataset of constituency-level data of 3,134 state assembly constituencies from the 15 largest Indian states to compare development levels across reserved and non-reserved constituencies in 1971 and 2001. As reserved constituencies were non-randomly determined, Jensenius forms pairs of reserved and non-reserved constituencies, matching based on pre-selection characteristics to mitigate the effect of selection bias. She finds a null constituency-level effect on overall development, redistribution to SCs, literacy rates, SC employment patterns, and village amenities.

Suriya Kandaswamy — (repo pdf) — Caughey and Warshaw (2018) show that in American states, dynamic responsiveness of policy liberalism to mass liberalism has increased over the years and, further, that partisan control in any given year has a minimal effect on the liberalism of that year’s policy. Replication of results was successful, except for some discrepancies in the magnitude of coefficients, which do not affect their main conclusions, but do affect some marginal conclusions drawn. While this paper focuses on the impact that different features of public opinion, geography, and legislative partisanship have on policy, this extension sought to determine whether partisan control of a state’s legislature impacts the responsiveness of its policy to public opinion. In other words, while the party in control may not have a large impact on the liberalism of implemented policy by itself, this extension shows that the party in control has a noticeable short-term effect - through variable interaction - on how responsive policy liberalism is to mass liberalism for economic issues but has a negligible long-term effect for all types of issues. Regardless of what party is in power in any state, or the country for that matter, it is important that the policy of an administration reflects the wants and needs of its populace and it should be clear that policies evolve with the needs of the people rather than the wants of the party.

Alexander Klueber — (repo pdf) — Pan (2017) shows that the emphasis on Chinese local government websites on either the competence or benevolence of county executives depends on where they are in the political tenure cycle. I was largely able to replicate these results. My extension confirms that this is the most likely explanation for the observed effect by comparing the statistical explanatory power of alternative models (e.g. cultural differences among regions, gender differences, etc.) through the leave-one-out method. In addition, I validate the geographical randomness of the sample through simulations of repeated sampling and the construction of confidence intervals. These simulations corroborate the findings of the paper by confirming the geographic randomness of the sample.

Samuel Lowry — (repo pdf) — Campbell et al. (2019) detail two separate studies which bolster the cue-based hypothesis of local roots. I successfully replicated all of their results. For my extension, with the first study, I used a Bayesian approach and found a negligible difference but did find heterogeneous effects based upon social class. With the second study, I looked at the interaction between the components and the respondent age groups as well as the interaction between the components themselves. I found that 18-24 year-olds were less critical of personal views influencing a member of Parliament’s policy than their older counterparts, and that local roots downplayed the negative impact of the trustee model of representation on voter preference.

Chelsea Marlborough — (repo pdf) — Stokes (2015) finds that voters do pay attention to climate policy and afterwards penalize the incumbent governments for facilities viewed as harmful to the communities. I successfully replicated Stokes’ results. In my replication, I tested the strength of Stokes’ model using a Bayesian regression model. Contrary to Stokes’ findings, I found that on average, voters were more likely to support the incumbent governments.

Diego Martinez — (repo pdf) — Campbell et al. (2019) find evidence that local roots serve as a cue for behavioral information; however, they still find local roots to possess a positive impact on candidate selection. I was successfully able to replicate all the results presented in the original article, “Why Friends and Neighbors? Explaining the Electoral Appeal of Local Roots,” by Rosie Campbell, Philip Cowley, Nick Vivyan, and Markus Wagner. To further their analysis, I modeled how local roots influence different sub-populations, including male vs. female voters, as well as different age groups. I found that the average treatment effect of local roots is not constant across subsets of the population. Furthermore, I found the average marginal effect of local roots to have substantial differences between males and females. Local roots affect groups of voters differently, and any further research regarding local roots should account for demographic differences.

Liz Masten — (repo pdf) — Findley, Piazza, and Young (2012) show that interstate rivalries are a positive predictor of transnational terrorist activity. The authors argue that terrorism is often a component of broader hostilities that can be empirically analyzed using a series of politically relevant directed dyads. In this paper, I successfully replicated all of their results and executed an extension which used dyadic quasi-Poisson models. These models are equally statistically significant with respect to rivalry, the main concern of the original paper, and thereby confirm that the original findings are robust. However, concerns exist regarding the use of dyadic analysis and the conclusions we can draw from such analysis in general.

Beau Meche — (repo pdf) — “The Distributive Politics of Enforcement” by Alisha Holland (2014) analyzes electoral behavior’s relationship with police action in opposition to low-income unlicensed street vendors in three cities in Latin America. I was successful in replicating the results, with minute variance due to apparent differences in regression output between R and Stata. In my extension I re-regressed the models from the original paper under Bayesian modeling methods in the interest of discovering any differences likely to arise. The regression outputs themselves were quite similar and model comparisons favored similar models to the author; however upon cross-validation model analysis I found that a majority of the models did contain ‘problematic’ values. This implies that the models showing statistically significant support for the author’s claim do not effectively model a subset of specific cases should any one case be removed. Should this prove to be a problem, it is only a problem of outliers which are not surprising in a small dataset. Otherwise, I find that Bayesian analysis supports Holland’s claims both in conclusion and in process.

Robert McKenzie — (repo pdf) — “The Achilles Heel of Multiparty Democracies”, by Ernesto Calvo and Jonathan Rodden (2014) shows that majoritarian biases increase with the number of parties, and majoritarian systems harm small parties when their vote is more dispersed than average, and large parties when their vote is more concentrated than average. They built a mathematical model of the relationship between geographic distribution and electoral representation, using MC methods, and then analyzed UK elections over the past 60 years to determine which parties benefit from majoritarian bias. I’ve extended their paper by updating data, and adjusting specifications to look at the impact of regional parties.

Prachi Naik — (repo pdf) — Enos, Kaufman, and Sands (2019) show that the 1992 Los Angeles Riot, one of the most well-known and documented instances of political violence in recent American history, caused a significant liberal shift in policy support at the polls due to the increased mobilization of Black and White voters, a mobilization that has endured over a decade later. The replication attempted in this project successfully found that White voters demonstrated a 0.028 increase in support for public school funding relative to university funding (CI: [0.018, 0.039]) and Black voters demonstrated a 0.073 increase (CI: [0.066, 0.081]). To extend the work of this paper, this project sought to examine the effects the Riot had on Asian American voters. The findings were inconsistent with what the authors found for White and Black voters: the riot appears to have caused a decrease in liberal policy support for Asians. This matters because heterogeneous treatment effects are worth further scrutiny and complication, especially given climates of racial polarization.

Alexandra Norris — (repo pdf) — Ejdemyr, Kramon, and Robinson (2017) show that ethnic segregation is a key determinant in whether ethnic favoritism plays a role in the provision of public goods. I was able to successfully replicate the authors’ results. For my extension, I used the same models as for the replication except instead of using boreholes (wells) as my proxy for public goods provision as the authors did, I used health clinics and schools. When using these other dependent variables, I found that the effects of segregation were different from those presented in the paper.

Cris Patvakanian — (repo pdf) — Frye, Reuter, and Szakonyi (2019) examine voter behavior in Russia and Venezuela and find different types of brokers, appeals, and targets have different effects on voter turnout. I successfully replicated all of their results. As a robustness test, I imputed missing values in the dataset and found results in line with those of the original study, but of a slightly different magnitude. These results confirm the authors’ original findings and suggest that the missing values in their sampled population do not bias the results. All analysis for this paper can be found in the original paper and Dataverse.

Pieter Quinton — (repo pdf) — Bauer (2018) finds that there is no consistent relationship between unemployment and one’s trust in government or their satisfaction with democracy. I successfully replicated Bauer’s results except for a minor discrepancy in a summary table which does not affect his conclusion. My extension evaluates how consistent or long-term unemployment may impact one’s views. Rather than examining the effects of just a single year of unemployment, as Bauer did, I track unemployment trends over longer periods of time to see if extended periods of unemployment have a stronger impact on one’s feelings towards the government and its institutions than shorter periods of unemployment do.

Timothy Ravis — (repo pdf) — Buntaine et al. (2015) find that donor-financed, government-implemented land tenure legalization efforts in forested areas have a negligible effect on the rate of deforestation versus that found in areas not subject to such an intervention. After matching treated plots with untreated ones, they find no significant treatment effect on deforestation rates. The major results of their analysis were successfully replicated in this study. In the present paper, I explore how spatial autocorrelation in both the treatment and the outcome affects the treatment effect found by the original authors. I attempt to address the issue by adding a local indicator of spatial autocorrelation to the matching process, and by using a spatial lag model to account for spatial dependency between observations. This illustrates the importance of including spatial relationships in models attempting to explain causality within explicitly spatial datasets.

Niel Schrage — (repo pdf) — Enos (2016) measures the shift in voter turnout for white voters living in Chicago near demolished public housing, occupied predominantly by African Americans, as compared to white voters living farther away; observing that white voters living in close proximity to demolished public housing had a 10 percentage point drop in voter turnout between 2000 and 2004, Enos concludes that this change in behavior was the result of the decline in racial threat from the change in size and proximity of the outgroup population. My replication effort was largely successful, although there were some challenges. For my extension, I expanded the parallel trends robustness check that Enos presents in his appendix; my results were consistent with his findings. These results are significant in two important ways: first, they illustrate the strength of the robustness checks that Enos conducted and, second, they suggest that his conclusions about the effect of racial threat on voting are even more robust than his paper suggests.

Daniel Shapiro — (repo pdf) — Lazarev (2019) uses data describing individuals’ choice between Russian state law, sharia law and customary law (adat) in Chechnya to show that gender can play a large role in societal splits in post-conflict societies. I was able to replicate all of the author’s results. In my extension to Lazarev’s paper, I used the rstanarm package in R to check Lazarev’s analysis with Bayesian regression, and the results confirmed the author’s findings. I also pointed out certain areas where I disagreed somewhat with Lazarev’s analysis and where I felt that the paper could improve. This paper’s successful replication of Lazarev’s findings helps strengthen Lazarev’s argument about the role of gender in choice of legal organ, and my additional comments on Lazarev’s analysis help to further discussion about postwar Chechnya and post-conflict society as a whole.

Mike Silva — (repo pdf) — Tingley et al. (2014) find that olfactory senses could explain assortative mating by ideology. My replication of this paper succeeded in most cases. It failed in a few ways including clustering standard errors around certain variables and recreating graphs that used 21 specific observations from the data set. Through my extension, I evaluate one of the three models Tingley et al. use by using a Bayesian fit instead of a regular linear model. I further create a posterior distribution of predictions on the outcome variable using the model and graph those predictions with the actual outcome values. Through this model, I confirm that Tingley’s model significantly explains the data.

Cian Stryker — (repo pdf) — Hager, Krakowski, and Schaub (2019) find that exposure to ethnic violence negatively affects prosocial behavior within and across ethnic groups in Osh, Kyrgyzstan. I was largely successful in replicating their main results and use their survey data of Kyrgyz and Uzbeks—the majority and minority ethnic groups of Osh—to expand upon their work. I find that the prosocial behavior of Kyrgyz towards Uzbeks is partially positively affected by exposure to violence. These results contradict the authors’ original findings that exposure to ethnic violence has a homogeneous treatment effect. My models demonstrate that on the contrary, ethnic violence can have a heterogeneous treatment effect, which warrants further analysis of ethnic violence’s influence on interethnic relations.

Amanda Su — (repo pdf) — So, Long, and Zhu (2019) determine that novelists marked as “white” versus “black” produce different narratological effects with respect to the interaction of race and religious authority, finding that black writers who cite the Bible are more likely to cite it in a social context compared to white writers who cite the Bible in their novels. I was able to successfully replicate the results of the authors’ paper. For my extension, I decided to reconstruct the paper’s primary model using a Bayesian approach. I found that the results of the model were largely the same as those of the original. This corroborates and strengthens the paper’s conclusions about how race and writing intersect across more than a century of U.S. fiction.

Abrar Trabulsi — (repo pdf) — In his paper ‘The Desire for Social Status and Economic Conservatism among Affluent Americans’, Thal (2020) shows that affluent Americans’ desire for social status drives conservative attitudes amongst them, and the will to advance economically conservative politics. Overall, I was successful in my replication efforts in this paper. Moreover, I extend Thal’s results by running logistic regression and appropriate analysis, such as a distribution of the posterior and predictive probabilities on his primary data. I find that Thal was indeed correct. Social status does in fact drive conservative attitudes amongst affluent Americans, and especially men.

Hannah Valencia — (repo pdf) — Levine and McKnight (2017) show that in the 5-month period following the Sandy Hook school shooting in December 2012, a large spike in gun sales contributed to an increase in accidental firearm deaths. Their findings conclude that there was a spike in accidental firearm deaths resulting from the increase in exposure, which is confirmed in this replication. I was able to successfully replicate most of Levine and McKnight’s results. As an extension to this paper, the original linear regression used to determine the increase in firearm sales per 100,000 population in the post-Sandy Hook period was changed to a Bayesian generalized linear model. Even after this change, the results showing increases in certain states hold, backing the authors’ claims. Even though the Sandy Hook shooting showed the need for stricter gun laws, the immediate aftermath of this realization led to the opposite effect as desired: more accidental firearm deaths.

Kevin Wang — (repo pdf) — Hopkins (2015) finds that exposing a representative sample of Americans to video of an immigrant speaking accented English prompts respondents to adopt more inclusionary attitudes. I successfully replicated Hopkins’s results, except for minor manipulation and robustness checks and the composite immigration index, which do not substantially affect his conclusions. As an extension, I included respondents’ self-reported frequency of contact with Spanish in Hopkins’s regression models and tested for heterogeneous effects of accented English among subgroups defined by level of Spanish familiarity. I found that while preexisting familiarity with Spanish is associated with more exclusionary baseline attitudes, there are no significant subgroup differences in the treatment effect of accented English. This suggests the difficulty of changing exclusionary attitudes developed through long-term encounters with culturally distinctive traits in real life.

Feven Yohannes — (repo pdf) — Kuipers (2019) explores the effect that the election of female candidates in the Indonesian legislature has on intimate partner violence attitudes. I replicated Nicholas Kuipers’ paper “The Effect of Electing Female Candidates on Attitudes Towards Intimate Partner Violence” and I found my results to be consistent with the results found in the paper. After running the code from the main models, I concurred that the results showed that the election of female candidates did, in fact, have a statistically significant effect on the IPV attitudes of female constituents. These results are particularly important because they show the possible effect that female candidates can have on decreasing IPV, by at least contributing to more condemnation of IPV. Thus, the election of female candidates can result in tangible responses to IPV, leading to safer and healthier communities for women.

Yao Yu — (repo pdf) — Lin et al. (2018) found that the time interval between mass shootings has been drastically decreasing in the past three decades, suggesting that the rate of shootings is increasing. I was able to replicate all of the results from Lin et al. (2018), but while I was able to replicate the inconclusive results in Table 1, I was not able to replicate the exact zero-inflated Poisson model. My extension broke down the interval trends between different venues of shootings shown in Figure 2 of Lin et al. (2018). I found that the interval trend of mass school shootings remained relatively steady while the interval between mass workplace shootings and other mass shootings has drastically decreased since 2015. This suggests that more research should be done looking at why workplace mass shootings have specifically increased drastically since 2015.

Ruth Zheng — (repo pdf) — Hankinson (2018) shows that renters exhibit “Not in My Back Yard” (NIMBY) behavior on par with homeowners in high-rent cities despite overall support for a housing supply increase. I successfully replicated Hankinson’s results and confirmed they are consistent with results using a logistic regression model. The increased likelihood for these renters to reject policy proposals that create new housing helps explain the affordable housing crisis in major American cities.

Gov 1005, Fall 2019

Yao Yu: Gun Violence Decrease in San Francisco and Oakland. A study looking at why gun violence in San Francisco and Oakland decreased from 2013 to 2017 while it increased in other US cities during that time.

Sydney Sorkin: NCAA on Twitter: Does the NCAA tweet about Men’s and Women’s sports differently?. This project analyzes the tendency of NCAA affiliated Twitter accounts to tweet about male or female sports and athletes.

Billy Koech: Kibuon Project Data Analysis. Determining an optimal well location and water usage trends for a community in the southwest region of Kenya.

Bridger Gordon: Social Connectedness in America. Analysis of Facebook’s Social Connectedness Index shows that geographic closeness is a significant factor in who we know and interact with, which is one possible explanation for demographic similarity in social circles.

Molly Chiang: New York City Airbnb and Housing Prices. In general housing prices and Airbnb prices are very slightly positively correlated, but there is lots of variation within and between boroughs.

Katherine Enright: Public Opinion Surrounding the 2019 Hong Kong Protests. This project analyses underlying public opinion factors behind the Hong Kong pro-Democracy protest movement as well as exploring a Twitter-based Chinese government operation to influence international opinion about the protests.

Carine Hajjar: Diego Arias. We analyzed how different characteristics influence Harvard students’ choices of friend groups.

Aysha Emmerson: Project Resilience. The project downloads tweets pertaining to “resilience” and performs a series of analyses, including a sentiment analysis and word cloud plot, to investigate what feelings, concepts, and words the general public associates with this concept.

Michelle Gao: Performance of Political Ads on Google. I look at the relationship between Democratic presidential candidates’ Google political ad spending and their primary polling results over 2019.

Mari Jones: Race & Gender Implications of the Criminal Justice System. Analyzed race and gender associations within the criminal and youth incarceration systems.

Mengxi Tan: US Immigration Explorer. This website examines immigration into the US, and answers the following two questions: 1. Where are the immigrants coming from, and through which channels are they admitted? 2. Once the immigrants are in the US, how well are they fitting into society?

Mitsue Guerrero: Water stress in Mexico City. Given the large population and limited water resources available in Mexico City, we visualize the consumption at block level to identify the biggest consumers and observe the water inequality gap that is making the city run out of water by 2050.

Lewis Zou: Predicting the Results of LoL Games. I analyze the 2018 League of Legends World Championships and find what factors are most important in determining the outcome of a game.

Amal Abdi: Exploring Evictions and Rent Burden in Ohio. I look at county-level evictions data throughout Ohio and rent burden by race and ‘ruralness’.

Miroslav Bergam: Donations of Harvard-Employed Individuals to 2020 Presidential Campaigns. Elizabeth Warren receives the most support among Harvard-employed individuals by most measures, and other factors like overall campaign size, political ideology, employer, and occupation are correlated with the size of the donation and the campaign being donated to.

Feven Yohannes: Ethiopia Economic Development. In this project, I’m using data from the World Bank and the UN that shows the social and economic changes that have been occurring in Ethiopia in the last 60 years.

Alexandra Norris: Visualizing Migration. I visualize migration data, looking at where refugees are coming from, where they are going, and the relationship between the number of refugees entering a country and different GDP indicators.

Cian Stryker: China and the Belt and Road Initiative. This website is an introduction to China’s Belt and Road Initiative, which is one of the most important geopolitical phenomena of the 21st century.

Hanif Wicaksono: Cambridge Energy Use. A visualization tool to understand building makeups and how the City of Cambridge uses energy.

Jackson Kehoe: Orchids Around the World. A multi-faceted approach to better understanding the global distribution and trade of orchids.

Daniel Shapiro: Russian Regional Demographic Change. This project breaks down Russian demographic data by region and analyzes trends and patterns over time.

Sanjana Ramrajvel: Homelessness in the U.S.. This project seeks to examine how well our country meets the shelter needs of its homeless population.

George Guarnieri: The Harvard Shop Sales. My project performs an analysis of sales data from the Harvard Shop, specifically focusing on web sales.

Jessica Scazzero: Is Cash Here to Stay? An Exploration of the Factors that Drive Individual Cash Use. This project used the Fed Consumer Payment Diary surveys to analyze individual cash usage by individual level characteristics, transaction level characteristics and time variables.

Chloe Shawah: Fingerprints of Colonization. The project seeks to determine if there are effects of foreign colonization/occupation traceable today by tracking indicators of the economic prosperity, health, and education in world nations over time.

Grace Rotondo: Fixing the Flaws of Networking: An Alumni Directory of the Harvard Women’s Lacrosse Program. This is an alumni directory of the Harvard Women’s Lacrosse Program, a platform for Harvard Women’s Lacrosse affiliates to easily access accurate alumni information.

Joshua Pan: Dunk on Some Stats. This project analyzes trends in professional and college basketball and comes up with models, focusing primarily on player positions/roles.

Angela Fu: United Nations Resolutions. The project analyzes UN resolutions that were voted upon dating back to 1986.

Emily Axelsen: More Permits More Problems? Tracing Factors Correlated to Gun Violence. I analyzed trends in the number of permits granted and the resulting number of gun violence incidents and noticed that the per capita number of gun violence incidents is similar among states that have gun violence policies (such as required gun registration and waiting periods) and states that do not.

Katie Cao: The Billboard Top 100: An Analysis of Timelessness and Lyrical Content. How does the lyrical content of songs predict their timelessness?

Diego Flores: Just How Great (Truly) is Democracy?. This project aims to determine whether or not Democracy is truly the superior form of government by assessing potential relationships between its implementation and variables associated with gauging the prosperity of a society.

Sophia Zheng: Housing Prices in Three Major US Cities. This project aims to compare historical housing data over the past thirty years in New York, San Francisco, and Seattle in order to see correlations with income, and study distribution by zip code.

Dominic Skinnion: Whom Does the Electoral College Benefit?. The electoral college appears to benefit Republicans more than Democrats.

Rick Brown: Football Wins and College Applications. I analyze the change in applications to colleges based on the change in wins of each college’s football team and find a very weak correlation between the two.

Mariah Dimalaluan: Can you make it to next year’s Billboard Hot 100 Chart?. This project attempted to study what song characteristics – especially the key it was written in – make it likely to have a higher ranking on the Billboard Hot 100 chart.

Jeremy Ornstein: American Creative Class. How do members of the creative class – artists and engineers – relate to populations and incomes of American counties?

Rucha Joshi: Analysis of Seasons and Characters of The Office. This project analyzes information about the characters’ frequency and emotions during the show.

Elizabeth Pachus: Firearms and Suicide in America. This project investigates the correlation between suicide rate and firearm death rates while also exploring which groups of people are being primarily affected.

Abrar Trabulsi: Outlining the Relationship Between Regime and Economic Development. Exploring the relationship between regime and economic development, especially with regards to autocratic and democratic regimes.

Andy Price: MLB Pitcher Raw ‘Stuff’. This project predicts pitch outcomes based only on a pitch’s velocity and movement.

Hannah Valencia: Analysis of Queen’s Music. This website takes a look at Queen’s music and analyzes the audio features in their 15 studio albums.

Pieter Quinton: Housing Market. An examination of the housing market, starting at the national level and narrowing the scope all the way to just one city.

Bernadette Stadler: Equal Work, Equal Pay? The State of Women’s Soccer in 2019. This project allows users to explore the numbers behind the USWNT discrimination lawsuit against U.S. Soccer, as well as to look at the state of gender equality in professional soccer worldwide.

Hoda Abdalla: The Effect of Media on Presidential Primary Candidate Performance. I look at the relationship between mainstream media mentions for major 2020 Democratic Presidential Candidates and their performance outcomes, measured in polling percentage and betting prices.

Cristopher Patvakanian: Armenian Diaspora Project. This project showcases the different Armenian Diaspora communities in the world and provides information with regards to their locations and sizes.

Alexander Klueber: An Investment in the Past. Carbon footprint of the Harvard Endowment and break-down to individual students.

Ryan Graff: NBA Statistical Trade Machine. My project calculates the average statistics of NBA draft picks and combines them with current player data to create an NBA trade/Value comparison machine for the user, while also performing regression on draft picks and their advanced stats.

Amanda Su: Trends in International Student Choices and Motivations in the U.S.. This project looks to analyze various patterns in the experiences and motivations of international students in the United States, specifically examining country of origin indicators, fields of study, and funding sources.

Camila Sanmiguel: The Violence Crisis in Mexico: Public Perceptions of Safety Alongside Border Apprehensions. Visualizing the spikes and falls in violent crimes in different Mexican states allows us to track the movement of the Mexican drug war; this project also examines the Mexican public’s perceptions of general danger beside Border Patrol apprehensions along the southern border.

Harrison Burke: Exploring Olympic Rowing Success. I look at various factors that influence performance at the Olympics.

Jake Schonberger: Unclaimed Property. This project focused on diving into datasets related to “Unclaimed Property,” specifically California’s unclaimed property division.

Victor Chen: Cincinnati Bengals 2018 Season Analysis. I look at play-by-play data for the Cincinnati Bengals in 2018 to find key drivers of expected points and win probability.

Anan Hafez: An Analysis of 3-Pointers in the NBA. Professional basketball is obsessed with the 3-pointer. How did we get here and what has changed because of it?

Oren Rimon Or: Inventors in the US. This project explores the relationship between economic mobility, income distribution and invention rate in the US.

Sam Lowry: The Effects of IMF Structural Adjustment Programs on Angola. Using World Bank and WHO data, I analyzed the effects of IMF loans on Angola in order to better understand how the IMF influences developing countries.

Chelsea Marlborough: Spotify Top Tracks Chart. This project analyzes trends in audio features found throughout Spotify’s Top 100 Chart of 2018.

Kevin Wang: United Nations General Assembly Voting Patterns. This project analyzes the frequency at which different countries vote in the majority in the UN General Assembly.

Liz Masten: Fatalities in Yemen’s Civil War. This project attempts to make sense of the chaos of Yemen’s civil war through mapping and analyzing instances, methods, and attributability of attacks.

Drake Deuel: Strava Leaderboards. This project looks at the relationship between Strava KOM ranking and climbing time.

Olly Gill: The History of The Olympic Games. I looked at data from the past 120 years of Olympic History in order to learn more about what has both changed and stayed the same for the Games and the athletes that make them so special.

Gayatri Balasubramanian: Distribution of Ethnicity and Industries in Indiana. Ethnicities tend to collect in pockets across the state, and this project overlays industries and ethnicities to see if perhaps one is more closely located near the other.

Ali Crump: Visualizing NHL Statistics. Analyzing NHL statistics which date all the way back to 1917.

Morgan Booker: Catching Criminals: A Criminal Minds Analysis. An in-depth analysis of the creative elements of the first five seasons of Criminal Minds and the twisted criminals the team chases.

Madeleine Snow: It’s A Hit: Tony Awards for Best Plays and Musicals. This project examines Tony Award-nominated and Tony Award-winning Broadway plays and musicals, 1948-2019.

Isheka Agarwal: Perspective on Historical and Future Consequences of Climate Change. I analyzed the historical consequences of climate change, future projections of its consequences, and the opinions about climate change of people living in various regions of the United States.

Grace Kim: Believers in the Divine: The Religions of South Korea. This project explores the various religious groups in South Korea, more specifically the relationship between the rise of Christianity compared to Buddhism.

Togo Kida: An Analysis of Creative Class in the United States. Analyzed the socioeconomic status of designers and creatives in the United States.

Cade Knox: NFL Big Data Bowl. This project aims to create a model that will predict how many yards per carry will happen in an NFL run play.

Prachi Naik: Making the Case for Investment in School-Based Mental Health. I analyze the 2016 School Survey on Crime and Safety to understand the circumstances surrounding schools’ ability to provide mental health services.

Emmanuel A. Calivo: Income and Transit Access in the San Francisco Bay Area. An analysis of the relationship between household income and access to public transportation in the SF Bay Area.

Parker Mas: Behind Billboard: Exploring the Audio Features of Pop Music. Analyzes Billboard Hot 100 chart data and song audio features in order to provide interesting data visualizations and model peak song popularity.

Margaret Butler: Gothic Literature and Monstrosity. I attempted to look at monstrosity in five different classic gothic lit books through word analysis.

Minjue Wu: Analysis of novel fly model for X-linked Dystonia Parkinsonism. A review of a novel animal model used to simulate a neurodegenerative disease, analyzing relationships between knockdown gene pathway, viability, sex, and recovery for potential recommendations in future research.

Amy Tan: Academic Achievement in the U.S.. This project looks at various socioeconomic covariates’ correlations with academic achievement across the U.S. as measured by standardized test scores.

Sophia Freuden: Growing Pains in Portland: A Story of Crime, Unemployment, and Population. An exploration of crime data, unemployment, and population growth in Portland, Oregon over the last ten years.

Erin Guetzloe: Boston Gun Violence. Considering that Massachusetts has some of the most stringent restrictions on gun ownership in the United States, why is gun violence on the rise in Boston?

Elizabeth Guo: Demographics and Votes of the U.S. Supreme Court. Evaluating model accuracy indicates that demographics and party affiliations of Supreme Court justices cannot be used to predict their vote patterns, and that justices are not “politicians in robes” who always vote along party lines.

Gov 1005, Spring 2019

Shivani Aggarwal: How Couples Meet. Visualizing the ways in which different kinds of U.S. couples meet and enter into relationships.

Neil Khurana: Harvard Dining. Archiving Harvard menus and exploring variations and repetition in meal choices.

Dasha Metropolitansky: First-Year Blocking Group Project. Harvard says it fosters a diverse community; trends in students’ housing indicate otherwise. This was a group project. The other group members were: Adiya Abdilkhay, Ilkin Bayramli, April Chen, Alistair Gluck, Christopher Milne, Neil Schrage and Stephanie Yao.

Christopher Onesti: Course Enrollment Statistics. This project presents an inside look and trend visualization regarding fall and spring undergraduate course enrollment data at Harvard.

Margaret Sun: Beyond The Stage. Various insights into the music group BTS.

Ruoqi Zhang: Settling the Dust: Censorship & Environmental Activism in China, 2012. What does social media data tell us about environmental awareness and censorship in China, 2012?

Hemanth Bharatha Chakravarthy: Twitter in the Biggest Elections in the World. Sentiment analysis of the biggest Twitter election campaign in the world and a breakdown of the role Twitter farms played in it.

Evelyn Cai: Survivor. Outwitting, outplaying, and outlasting: What does it take to become the Sole Survivor?

Sabrina Chok: US Federal Crime Data.

Simone Chu: Presidential Speeches. Text analysis of inaugural addresses, State of the Union Speeches, and news conferences.

Celia Concannon: Tesla Stock and Elon Musk Tweets. Interactive plot showing Tesla stock volume and tweets, table to search Elon Musk tweets by date or keyword, and an about the app tab.

Andres de Loera-Brust: Exploring the Medicare Shared Savings Program. Explore the details of Medicare’s experimental new way of paying for healthcare.

Alexandra Dobbins: Lyme Disease in the United States: a Historical Perspective. Awareness of Lyme Disease as a debilitating illness has increased in recent years – take a look at where the most cases are, and how they’ve changed over time.

Nicholas Dow: Mueller Report Text Analysis. Creates a better way to look for important information in the 448-page Mueller report.

Tanner Gildea: 2020 Democratic Candidates’ Tweets. More than 20 Democrats are running for president in 2020. But how are they using Twitter to do so?

Debi Gonzalez: The Sunshine State Turns Purple on Election Day. Visual representation of Florida’s political distribution over time accompanied by county-level demographic data.

Tate Green: Game of Thrones Analysis. In-depth analysis of Game of Thrones seasons 1-7.

Benjamin Hoffner-Brodsky: Asylum Seekers. Tracking which countries are most likely to accept asylum applications, based on applicants’ countries of origin.

Jefferey Huang: Money, Efficiency, and Education: California High Schools. Analyzing efficiency and college-readiness test (SAT, ACT, AP) outcomes in California school districts.

Taehwan Kim: An Analysis of Crime in Chicago. Visualizing crime in Chicago over the 10-year period from 2008 to 2018.

Andrea Lamas-Nino: Crimson Analytics. Understand engagement with the Crimson’s content.

Jennifer Li: New York Apartment Hunt. Price Analysis of 2-Bedroom Rental Units in Manhattan between 2000 and 2019

Diego Martinez: Baseball Aging Curves. Analyzing age’s effect on performance of MLB players.

Beau Meche: Census: Population Mobility Post-Trump. Looking at apparent movement / growth of the US population in Trump’s first year in office. Did young people with degrees change location?

Igor Morzan: Corruption in Latin America. Visualizing corruption across all Latin American countries and their institutions.

Seeam Noor: Seeam & EPL. Get interesting stats on your favorite English Premier League teams from the last decade

Shafi Rubbani: Organ Donations and Transplants. Different countries have different trends in donation and transplantation rates over time.

Albert Shin: Chicago Ride Share Comparisons. Does Christmas affect ride-share and rider behavior?

Mike Silva: Harvard Football Defense Analysis. Analyzing the 2018 season for the Harvard Football Defense

William Smiles: POTUS & PRICING: How Trump’s Tweets Affect Intraday Trading. President Trump’s tweets and their effects on financial markets

Céline Vendler: Zillow Data Explorer. Using data from Zillow over the past 20+ years, this app visualizes historical and forecasted trends in real estate to compete with Redfin’s data visualizations.

Henry Zhu: Newark Airport Flight Destinations. Visualizing where flights from Newark are headed and delay patterns

Gov 1005, Fall 2018

Kemi Akenzua: Analysis of death row executions in Texas with an emphasis on sentiment analysis.

Ghada Amer: Mapping of global armed conflict post-Cold War.

Ryan Michael Antonellis: Analysis of the most highly represented colleges in the NFL draft over the past 10 years.

Esteban Arellano: Analysis of levels of upward economic mobility achieved by race and gender per county.

John Ball: Analysis of rates of mental illness among comedians.

Rana Chandra Bansal: Analysis of Indian exports between 2014 and 2017.

Katherine Elizabeth Binney: Analyzing Massachusetts public school quality.

Charlie Chatman Booker: Analyzing apartment complex data in the city of Houston.

Enxhi Buxheli: Russian tweets and their effect on the 2016 US Presidential Election.

Michael Calabro: Analysis of Batting and Pitching data in the MLB in relation to a drastic rise in strikeouts.

Cayanne Chachati: Analysis of the deaths over the course of the Syrian Civil War.

Maddie Chai: Analysis of the decline in American marriage from 1960 to 2012.

Holly Jaime Christensen: Analysis of NYC rental unit data in Manhattan – looking at number of units, median asking price, and housing violations.

Oliver E Cordeiro: Analysis of strokes gained data from the PGA Tour in 2018.

Sofía Corzo: Making a clear descriptive interface to generate data on post-conflict transitional justice mechanisms from 1946 to 2006.

Cunhonghu Ding: Which college in the United States offers you the best chance to climb the ladder?

Donovan Mac Doyle: I look at NFL gambling data from 1979-2017.

Robert Drysdale: I project Gordon Hayward’s statistics if he hadn’t been injured in 2017.

Annika Engstrom: Analysis of social liberalism across US demographics, with a focus on gay rights issues.

Steven Espinoza: Looking at data concerning sanitary violations near Harvard Square.

Grant Fairbairn: Analysis of Tiger Woods PGA Tour data for 2018 season.

Maclaine Fields: Analysis of Harvard Women’s Volleyball 2018 Season.

Charles Elliot Flood: Do Bilateral FTAs actually help boost trade?

Claire Fridkin: Nicolas Cage movie analysis and sentiment analysis.

Melissa Gayton: Analysis of intake interview data for Access to Justice Lab’s Divorce Study.

Peter George: Visualization of strokes gained by player per round across several 2018 PGA Tour tournaments.

Hannah Elizabeth Hardenbergh: How often do artworks move at the Harvard Art Museum?

Matti Harrison: Analysis of population and its effect on home prices in LA and CT.

Stone Alexander Nicholas Hart: Analysis of Pokemon statistics spread over generations.

Hannah Ella Hess: What is the Impact of Ramadan on Pornography Consumption?

Claire Hotchkin: Analyzing Boston Marathon race times.

Sean Hughes: Why are children’s chances of earning more than their parents falling?

Justin Hunter: Recent Trends in Building Permits in Detroit.

Tauheed Islam: Hip-Hop and Rap References of the 2016 Presidential Candidates.

Sonya Kalara: Analysis of Hate Crimes in New York State from 2010 to 2016 by county, year, and crime type.

Shriank Kanaparti: Housing Price analysis.

Saiyaz Kazi: Tracking of Immigrant Voting Preferences in the 2016 General Election.

Sara Marie Kvaska: What Influences Public Transportation Coverage?

Alex LaPolice: Julian Edelman.

Molly Kathryn Leavens: I built tools to break down and visualize data from a survey of Ghanaian cacao farmers.

Jack Luby: Visualizing the effects of and compliance with the 2014 Gulf of Panama IMO TSS Regulations.

Miranda Lupion: Visualizing crime data for Russia’s federal subjects (administrative entities) from 1990 to 2010.

Keeley Rose Macafee: Is the U.S. facing a food revolution?

Sofia Marie Mascia: Bridging the Gap: An analysis of Income, Violence and Drug use in a socio-economically divided Illinois.

Ethan Robert McCollister: Visual Exploration of Trends in Pitch Data from 2012 to 2018.

Robert McKenzie: NYC Taxi Pickup and Dropoff Visualization.

Michael Montella: US-China Trade.

Junho Moon: Looking at key military data of different countries over time.

Kodi Obika: Analyzing Ariana Grande’s songs, albums, and lyrics via song length, lexical density, and word frequency/significance.

Charlie Olmert: Visualization of the Harvard Men’s Lacrosse team’s shots from the 2018 season with heat maps and filters for shot clock satisfaction.

Annabelle Paterson: New Zealand Wine Exploration: Growth of the New Zealand wine industry.

John Pirrmann: Eli Manning.

Kai Potter: Sephora Skincare Bestsellers Explorer: Making smarter, more informed purchases.

Richard Qiu: Visualization of effects of Medicaid expansion on immunization rates.

Noah Reimers: Analyzing NFL Player Ratings and Draft data between 2008 and 2012.

Tanya Rohatgi: Excavating racial bias in false convictions, and the role of Conviction Integrity Units in overturning them, using exoneration data from 1989 to present.

Teresa Noelle Rokos: Analyzing services, patients, and expenditures at emergency rooms.

Allie Russell: Analysis of offensive production in the 2017-18 NHL season.

Richard Ryan: This project takes data from Fandango via FiveThirtyEight and compares movie ratings from various popular sites. You can look at the correlation between scores among these sites.

Connor Sakmar: Visualizing trends in death totals and rates from the US leading causes of death.

Jack Schroeder: Visualizing the San Diego Padres’ Batter Data by Venue.

Dillon Smith: Measures of Winter Olympic success are more strongly positively correlated with voter turnout rates than measures of Summer Olympic success.

Serhiy Sokhan: Analyzing The Harvard Shop’s Web Fulfillment Data.

Sydney Alexandra Steel: Gender Employment Patterns.

Jordan Topoleski: What makes a top hit on the Billboard charts?

Meaghan Townsend: How do non-school factors influence literacy and social development outcomes for preschoolers?

Max George Vumbaca: A look at demographic shifts and trends in Cambridge’s homeless community across time, compared to other cities, and according to living situation.

Gabriel Walker: A tool for exploring trends in Chinese elite diplomacy and overseas financial flows.

Max Weiss: Presidential and Senate Twitter Activity: Trump’s First 9 Months in Office.