Data science for service change - DataSF | Office of the ...
Presented by DataSF | datasf.org/science
City and County of San Francisco

What is data science?
Data Science: Applying advanced statistical tools to existing data to generate new insights
Service Change: Converting new data insights into (often small) changes to business processes
Smarter Work: More efficient and effective use of staff and resources

What complements data science? (and is really good stuff to do)

Approach: Performance Management
Process: Define, visualize (often using dashboards), and manage to KPIs
Outcome: Meet goals and KPI targets
Examples: SF Scorecard, PublicWorks Stat & Stat starter kit

Approach: Evaluation
Process: Assess a project, program or policy design or results
Outcome: Better investment of resources; better policy decisions
Examples: Evaluation of transitional kindergarten in SF

Approach: Policy Analysis
Process: Define and assess alternatives using a broad range of tools
Outcome: Report or memo with policy or program recommendations
Examples: Shape Up SF Policy Analysis

Approach: Open Data
Process: Publish civic data for use by the City and the public
Outcome: Easier data sharing and reporting, new tools or services built on data
Examples: SFPUC Adopt a Drain

Approach: DataScienceSF
Process: Identify insights using advanced statistics tied to a service change
Outcome: Smarter work on the ground in real time
Examples: See rest of deck!

All approaches can lead to service improvement. It's about choosing the right tool for the job (and sometimes combining them)!

What's in the DataScienceSF Toolkit?

Statistical Methods: Sentiment analysis, Time series analysis, Multilevel modeling, Survival analysis, A/B testing, Data mining, Missing data imputations, Pattern recognition; SciPy, Pandas, Scikit-learn, GPText, OpenNLP, Mahout, + many others

Tools
Data Engineering: Profiling, ETL, Job notices, APIs, Optimized data pipelines, Optimized data storage/access
Visualization: D3.js, Gephi, R, Leaflet, PowerBI, ggplot2, shiny

User Experience Research: Iterative Prototyping, Photo journaling and documenting, Service blueprinting, Journey mapping, Ride-alongs, Ethnographic field research and user observation, Process mapping, Usability testing

What is NOT data science?
This, not that:
  Service change, not academic research
  Small changes, not major overhauls / service disruptions
  Use existing data, not collecting new data (mostly ;)

Data Science Project Types

Project Type: Find the needle in the haystack
What to target?
Target areas, target categories, target individuals
Service Issue: Difficult to identify targets in a population
Data Science Process: Use existing data and predictive modeling to identify targets
Service Change: Engage with target subset of population
Result: Department resources are spent where most needed

Examples: Free fire alarms in New Orleans
Service Issue: Fire alarms offered to homes that already have them
Data Science: ID homes with high prob. of no alarm
Service Change: Use list to shape outreach
Result: 2x increase in hit rate

Examples: Find the needle in the haystack

New Orleans Fire Alarms
Service Issue: New Orleans Fire Department (Nola FD) distributes free fire alarms to homes. But many homes they visited already had them, wasting Nola FD's resources.
Data Science: Nola's analytics team used public data to identify homes with a high probability of not having a fire alarm and provided Nola FD with a list.
Service Change: Nola FD used the list to determine where to offer fire alarms.
Result: With no increase in resources or patrols, Nola FD increased the hit rate of homes needing smoke alarms by 2x.

New York City Tax Compliance
Service Issue: New York City (NYC) conducts corporate tax audits. They are time consuming and 37% have no findings. NYC wants to increase findings but maintain its number of audits.
Data Science: NYC analyzed historical audit records and identified patterns of businesses. Outliers were flagged as possible audit targets.
Service Change: The audit team targeted the flagged cases for audits.
Result: With the same staff levels, the audit team decreased the percent of cases with no finding from 37% to 22%, leading to increased revenues.

Project Type: Prioritize your backlog
What to prioritize?
Service Issue: Backlog is tackled via first in, first out (FIFO)
Data Science Process: Create a model to categorize and group past and current cases
Service Change: Prioritize cases based on categories in order of risk, need or opportunity
Result: Department addresses high priority cases first
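
The difference between a FIFO backlog and a prioritized one can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model any city actually used: the `Case` fields, the category names, and the `CATEGORY_PRIORITY` ranking are all hypothetical, and in a real project the category would come from a model trained on past cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    case_id: str
    days_in_backlog: int
    category: str  # hypothetical label, e.g. produced by a model of past cases


# Hypothetical ranking: a lower number is handled sooner.
CATEGORY_PRIORITY = {"high_risk": 0, "medium_risk": 1, "low_risk": 2}


def fifo_order(cases):
    """Baseline: oldest case first, regardless of category."""
    return sorted(cases, key=lambda c: -c.days_in_backlog)


def prioritized_order(cases):
    """Highest-priority category first; oldest first within a category."""
    return sorted(
        cases,
        key=lambda c: (CATEGORY_PRIORITY[c.category], -c.days_in_backlog),
    )


backlog = [
    Case("A", 90, "low_risk"),
    Case("B", 30, "high_risk"),
    Case("C", 60, "medium_risk"),
    Case("D", 10, "high_risk"),
]

print("FIFO:       ", [c.case_id for c in fifo_order(backlog)])
print("Prioritized:", [c.case_id for c in prioritized_order(backlog)])
```

The service change here is only the sort key: the same staff work the same list, but high-risk cases reach the top instead of waiting behind older low-risk ones.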
Examples: Blight backlog in New Orleans
Service Issue: Backlog in blight enforcement
Data Science: Use data to grade cases per prior decisions
Service Change: Created abatement tool
Result: 1500+ case backlog gone in 100 days

Examples: Prioritize your backlog

Boston Complaints
Service Issue: In Boston, they have a large list of residences with anti-social complaints filed against them.
Data Science: The analytics team pooled data from housing, police, and tax agencies to gauge the nature of complaints and identify the biggest contributors to complaints.
Service Change: The Air Pollution Control Commission expedited enforcement with the biggest contributors.
Result: With no change in resources, Boston saw a 55% reduction in police calls associated with the targeted residences.

New Orleans Blight
Service Issue: New Orleans (Nola) faced a significant backlog in blight enforcement due in part to bottlenecks in the decision making process and missing information.
Data Science: Nola used data on the outcomes of previous blight cases to grade cases in the backlog and to recommend additional data to collect by field teams.
Service Change: The enforcement team used the results as an abatement decision tool to speed the decision-making process of whether to demolish or foreclose a home.
Result: Nola eliminated the 1,500+ case backlog in less than 100 days.

Project Type: Flag stuff early
How to detect?
Service Issue: Hard to predict future condition which leads to reactive services
Data Science Process: Use historical and current data to create estimate ranges for potential outcomes
Service Change: Use estimates to change and tailor intervention points
Result: Department provides pro-active early interventions

Examples: Use of force alerts in Charlotte
Service Issue: Excessive force has neg. impact on community
Data Science: Identify patterns to refine early warning
Service Change: Flagged recurring complaints
Result: Accuracy up 20%; False positives down 55%

Examples: Flag stuff early

Charlotte Police Violence
Service Issue: Excessive force violations by police officers have huge negative repercussions in the community and for police careers.
Data Science: The analytics team refined an early warning system, identifying patterns that often led to officers having negative interactions with the public.
Service Change: The department flagged recurring complaints against officers and notified supervisors when certain thresholds were reached.
Result: The CMPD system increased accuracy by 15-20% while reducing false positives by 55%.

Lead Poisoning in Chicago
Service Issue: In Chicago, a large number of children are thought to be exposed to lead paint in older houses.
Data Science: The analytics team built a model of exposure using data on homes, history of children's exposure at that address and conditions of neighborhood.
Service Change: They conducted targeted inspections and provided remediation funding to homes identified in the model.
Result: Chicago reached the most vulnerable families before severe health effects from lead contamination manifest.

Project Type: A/B test something
Which form?
Illustration: one form gets 62% response, the other 78%
Service Issue: Costly outreach methods are not tested before implementation
Data Science Process: Statistical testing on outreach methods to identify which, when, and to whom to send
Service Change: Use statistically validated outreach method
Result: Department increases response rates

Examples: NYC Summons Redesign
Service Issue: 40% cited no-show leading to costly arrest
Data Science: Redesigned and tested summons form
Service Change: Deployed new form and rescheduled timelines
Result: Currently evaluating impact

Examples: A/B test something

NOLA Community Health Program
Service Issue: In New Orleans, they have a low take up rate of free primary care appointments.
Data Science: The analytics team tested different SMS reminders to those eligible for appointments.
Service Change: The department implemented the most successful SMS text.
Result: 60% increase in clients using free primary care appointments

NYC Summons Redesign
Service Issue: 40% of those cited for low-level violations did not take required next steps, leading to issuance of arrest warrants.
Data Science: Experiment and test redesign of summons process
Service Change: Reschedule court timelines to facilitate greater access
Result: Evaluating impact on use of costly arrest warrants (Project currently in progress)

Project Type: Optimize your resources
How to distribute?
Service Issue: Difficult to identify where to place or distribute resources to be most effective
Data Science Process: Use geospatial and/or other data to identify optimal distribution of resources
Service Change: Re-allocate resources to optimal distribution
Result: Department decreases response times; increases volume

Examples: Chicago Pest Control
Service Issue: Challenging to predict outbreaks
Data Science: Analyze data associated with outbreaks
Service Change: Proactive targeting of leading indicators
Result: 15% drop in requests for service

Examples: Optimize your resources

Chicago Pest Control
Service Issue: Chicago's rodent baiting program finds it challenging to predict rodent outbreaks and locations, leading to spikes in 311 complaints.
Data Science: Predicted potential danger of outbreaks by using leading indicators and other data correlated with previous outbreaks.
Service Change: Directed rodent baiting to areas identified by leading indicators, including events, like water main breaks.
Result: Resident requests for rodent control services dropped by 15%.

NOLA Ambulance Stand-by Location
Service Issue: In New Orleans, ambulance standby locations are chosen based on dispatcher habits or instincts.
Data Science: Analytics team used city wide analysis of data on accident patterns, traffic patterns, and crew readiness to identify optimal standby locations.
Service Change: Ambulances deployed at new optimized locations.
Result: Targeting short response times to EMS calls (Project currently in progress)

What was the service change?
From that -> To this
  Fire Alarms: Random list -> Prioritized list
  Blight: Staff evaluates all cases -> Tool evaluates easy cases
  Early Warning: Focus on that set of officers -> Focus on this set of officers
  Summons: Send original form -> Send new form
  Pest Control: Arrive at location X too late -> Arrive at location X early
Service Change = Small Business Process Change

Summary: The five project types
  Find the needle in the haystack
  Prioritize your backlog
  Flag stuff early
  A/B test something
  Optimize your resources
Also possible: some combination, or something else

DataScienceSF Cohort 1

ASR: Increase property tax revenues
Service Issue: When a property sells in SF, we either accept the sales price or modify it to collect property taxes. So which sales should you accept and which should you dig into?
Data Science: Our regression model identifies which sale prices are unusual for the location, time and property details
http://www.markersf.com/blog/
Service Change: The model splits properties into two lists: normal sale prices to enroll directly in tax collection and outlier sales for manual review by appraisers
Result: Expected: Increased revenue and faster time to revenue, reduced backlog, and more consistency in assessments
Project type: Prioritize your backlog
Full write up at datasf.org/showcase/datascience/

Evictions: Pro-actively prevent evictions
Service Issue: How can we make eviction prevention more proactive by identifying the most problematic eviction notices in real time?
Data Science: An algorithm combines data sources to identify eviction notice filings that are outside the norm
Service Change: A list of flagged eviction notices is sent to eviction prevention services to proactively review for service outreach
Result: Expected: Targeted eviction prevention that keeps residents in their homes
Project types: Find the needle in the haystack, Flag stuff early
Full write up at datasf.org/showcase/datascience/

ENV: Find new clients to help green our City
Service Issue: SF Environment offers financial incentives and technical assistance to help our constituents upgrade their lighting & refrigeration systems. But their list of leads is dwindling - how can they find new leads?
Data Science: Mashed together multiple data sources to identify characteristics of stronger leads
Service Change: New and longer list of property leads with enriched data for targeting marketing campaigns
Result: Expected: New customers and increased uptake of green subsidies
Project types: Find the needle in the haystack, Optimize your resources
Full write up at datasf.org/showcase/datascience/

DPH WIC: Help moms and babies stay in nutrition program
Service Issue: Since 2011, DPH has seen an increase in mothers dropping out of their nutrition program. Which moms are most at risk of dropout?
Data Science: Built a predictive model that identified moms and infants who are at greatest risk for dropping out
Service Change: Using the high-risk client profiles to conduct targeted interviews to identify program barriers and make service changes
Result: Expected: Reduce the dropout rate of moms, infants and children, leading to healthier outcomes for both
Project type: Flag stuff early
Full write up at datasf.org/showcase/datascience/

DPH BHS: Improve results and reduce costs in mental health care
Service Issue: A small fraction of mental health patients use a large % of resources. Can we identify high users early to improve their outcomes and reduce costs?
Data Science: Build predictive model to identify clients at greatest risk for becoming high users
Service Change: Expected: Targeted service model to direct high users to more stable and preventative services
Result: Expected: Reduction in high cost clients and use of high cost emergency services
Project types: Find the needle in the haystack, Flag stuff early

TTX: Increase response to tax letter
Service Issue: TTX wanted to use behavioral economics and A/B test to increase effectiveness of collection letter for unsecured personal property (a difficult type to collect on).
Data Science: DataSF helped organize a Behavioral Insights Training (BIT) workshop and provided guidance on A/B test
Service Change: Use whichever letter gets the best response
Result: Improved response rate by 17%. TTX continuing to apply BIT principles to other taxpayer communications
Project type: A/B test something
Full write up at datasf.org/showcase/datascience/
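
"Use whichever letter gets the best response" is only safe if the difference is larger than chance would produce. A common way to check this is a two-proportion z-test, sketched below in Python. The sketch is an assumption about how such a test could be run, not a description of TTX's actual analysis: the deck does not report sample sizes, so the counts used here (200 recipients per letter) are hypothetical, and the 62% and 78% response rates are borrowed from the illustrative figures on the A/B test project-type slide.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two response rates.

    Returns (z, p_value). Uses the pooled-proportion standard error,
    which is the usual choice under the null hypothesis of equal rates.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 200 recipients per letter,
# 62% respond to letter A and 78% respond to letter B.
z, p_value = two_proportion_z_test(124, 200, 156, 200)
print(f"z = {z:.2f}, p = {p_value:.4f}")

if p_value < 0.05:
    print("Difference is unlikely to be chance; adopt the better letter.")
else:
    print("Difference could be chance; keep testing.")
```

With much smaller groups the same 16-point gap could fail this check, which is why the sample size matters as much as the headline response rates.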
ART: Preserve City art for the future
Service Issue: The Arts Commission needs to accurately and efficiently project long-term costs to budget for art preservation
Data Science: Revised cost formula and new tool to provide long-term projections and prioritization of conservation projects on demand
Service Change: Use tool to model cost scenarios instead of manual, one time process
Result: Expected: Reduction in staff time, more accurate cost estimates, and earlier identification of pieces in need of conservation
Project type: Optimize your resources
Full write up at datasf.org/showcase/datascience/

Overview of Phases
Cohort 2: Jan - June
  Solicitation: Oct - Nov
  Application due: Nov 22
  Selection: Nov 27 - Dec 13
  Notify applicants: Dec 13
  Project refining: Dec
  Analysis & service change: January - May
  Present: June

Phase: Solicitation
Opportunities to learn more
  Brown bags
  Office hours
  Invited presentations
Dates at datasf.org/science
Timeline: April, May, May, May, Mid May, June, July - November, Dec

Phase: Solicitation
How to prepare
  Brainstorm projects using the project types
  Identify possible service changes
  Review data that could help
  Identify key staff members
Learn more at datasf.org/science

Phase: Application
Available at datasf.org/science
Brief online form
  Problem statement (200 word max)
  Impact statement (100 words max)
  Service change statement
  Data overview
  Project champion

Phase: Application
Criteria to keep in mind
Above all else: A viable path to service change
  Question / problem answerable by data science
  Solvable within cohort time frame
  Impact
  Department commitment
  Data readiness

Phase: Selection
Process
  Initial review
  Criteria assessment
  Application scoring
  Department follow-ups, as needed
Be available for questions (email or in person)
Estimating 5-10 projects per Cohort

Phase: Winners Announced
And gentle off-ramps for the rest
Some projects may not be appropriate for data science or for our timeline. We will help identify other opportunities that may be a better fit:
  Civic Bridge pro bono opportunities via the Mayor's Office of Civic Innovation
  STIR startup technology engagements via the Mayor's Office of Civic Innovation
  DataSF Dashboarding Services
  Controller's Performance Unit
  Data Academy classes
  External Data Science groups or volunteers
  Other technical assistance

Phase: Project refining
During this phase, we will:
  Meet to refine the scope
  Optionally, do initial site visits/interviews
  Prepare data for analysis
Outputs
  Project charter
  Data exchanges and agreements, as needed

Phase: Analysis and service change
During this phase, we will:
  Conduct site visits, ride-alongs and interviews, as appropriate
  Conduct iterative analysis
  Implementation testing
  Handoff and training
Service Analysis Plan Review

Phase: Analysis and service change
What DataSF Brings
  Statistical Methods
  Tools
  User Experience Research
  Issue expertise
What You Bring
  A good question & data
  Project champion
Final Product is Algorithm + Tool: Algorithms that are scripted and automated (real time if needed), tied to some service change tool (e.g. list, service, alert), implemented together and maintained by department

Phase: Present (& Disseminate)
During this phase, we will:
  Present and celebrate the results with cohort
  As appropriate, write an article for DataSF Speaks (datasf.org/blog) and/or other venues
  Disseminate method and approach (not data) for other departments and cities to learn
Data Scientist will continue to be available during office hours for continued support

Visit datasf.org/science
At datasf.org/science:
  This powerpoint
  1 pager
  Sign up for office hours
  Sign up for brown bag
  Apply!
Other Resources: Civic Bridge

THANK YOU
@datasf | datasf.org | datasf.org/blog

Activity
Take 5 minutes by yourself
  Brainstorm ideas
  Take your best idea and complete the form
With your neighbors
  Review each top idea and refine/iterate