Reducing Workforce Turnover using Anomaly Detection and NLP

Maintaining an engaged workforce is essential to any organization looking to not only minimize the costs associated with hiring new personnel but also maximize productivity through engaged employees. Our senior data scientist, Jeremiah Lowhorn, partnered with one of our clients to analyze the risk factors that lead to employee turnover and how to mitigate them. We sat down with him to learn about how he was uniquely suited to solve such a complex problem.

Can you tell us about your background and current role at Data Tapestry?

My title is senior data scientist, and I’m currently working on my second master of science, this time in information management. Before Data Tapestry, I worked as a senior software engineer at Cigna, focusing mainly on big data and data science projects. Prior to that role, I worked at U.S. Cellular as a data scientist. While there, I focused mainly on time series analysis and predictive modeling.

Tell me about the problem you were asked to solve and what the client’s expectations were.

The scope of the project was essentially to figure out why physicians were turning over. They didn’t really have any other objectives outside of that. I started with a query to generate a time series for the physician data: one record for each physician ID and every day that physician was employed. Additionally, there was an opportunity to enrich the time series with salary information and other physician attributes. I was able to incorporate an additional 200 fields into the analysis.
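
To make the shape of that time series concrete, here is a minimal Python sketch of expanding employment spans into one record per physician per day and then attaching attributes such as salary. It is only an illustration under stated assumptions: the original work used a database query, and the function names, field names, and sample values here are hypothetical.

```python
from datetime import date, timedelta


def expand_employment_days(spans):
    """Return one (physician_id, day) pair for every day a physician was employed.

    `spans` is an iterable of (physician_id, start_date, end_date) tuples,
    with the end date treated as inclusive (an assumption for this sketch).
    """
    rows = []
    for physician_id, start, end in spans:
        day = start
        while day <= end:
            rows.append((physician_id, day))
            day += timedelta(days=1)
    return rows


def enrich(rows, attributes):
    """Attach per-physician attributes (for example salary) to each daily record."""
    return [
        {"physician_id": pid, "day": day, **attributes.get(pid, {})}
        for pid, day in rows
    ]


# Hypothetical sample data, for illustration only.
spans = [("p1", date(2020, 1, 1), date(2020, 1, 3))]
attributes = {"p1": {"salary": 250000, "specialty": "cardiology"}}
daily = enrich(expand_employment_days(spans), attributes)
```

Each entry in `daily` is a dictionary keyed by physician and day, so further fields can be merged in the same way.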

So how did the results turn out?

Things turned out well! We were able to provide an end product that gave the client additional insights into a critical part of their business. The client wanted to look at turnover over time. I decided to extend that goal with trend detection to answer questions like: Have we seen a spike in turnover over time, and how can we identify it? So, I built an anomaly detection algorithm that detects whether there has been a larger-than-usual increase in turnover over a three-to-four-month period.
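
The interview does not describe how the anomaly detection algorithm works, so the following is only a sketch of one common approach, not the method used on the project. It assumes monthly departure counts as input, sums them over a trailing three-month window, and flags a month when that total sits more than a chosen number of standard deviations above the totals from earlier, non-overlapping windows. The function name, parameters, and sample counts are all hypothetical.

```python
from statistics import mean, stdev


def flag_turnover_spikes(monthly_departures, window=3, baseline=12, threshold=2.0):
    """Return indices of months whose trailing `window`-month departure total
    is more than `threshold` standard deviations above the mean of the
    `baseline` most recent window totals that do not overlap the current window.
    """
    # totals[j] is the sum of months j .. j + window - 1.
    totals = [
        sum(monthly_departures[i : i + window])
        for i in range(len(monthly_departures) - window + 1)
    ]
    flagged = []
    for j, current in enumerate(totals):
        end = max(0, j - window + 1)  # last history index that shares no month with j
        history = totals[max(0, end - baseline) : end]
        if len(history) < baseline:
            continue  # not enough history to judge this window
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and (current - mu) / sigma > threshold:
            flagged.append(j + window - 1)  # index of the window's final month
    return flagged


# Hypothetical counts: a steady baseline followed by a three-month jump.
monthly = [2, 3] * 9 + [9, 10, 9]
spikes = flag_turnover_spikes(monthly)
```

Comparing against non-overlapping history keeps the spike itself from inflating the baseline it is measured against. The window length and threshold would need tuning against real data.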

I also saw an opportunity to incorporate the physician exit surveys, which supplied self-reported reasons for leaving. This led to creating an NLP model that reads physicians’ exit surveys and classifies each response into a specific category as well as classifying the sentiment of the response.
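
As a rough illustration of the two outputs described above (a category and a sentiment for each response), here is a deliberately simple keyword-matching sketch in Python. It is a stand-in, not the NLP model built for the client, which the interview does not detail. The categories, keyword lists, and sentiment words are invented for this example.

```python
import re

# Hypothetical categories and keywords, invented for illustration only.
CATEGORY_KEYWORDS = {
    "compensation": {"salary", "pay", "compensation", "bonus"},
    "workload": {"hours", "workload", "burnout", "schedule"},
    "relocation": {"moving", "relocating", "relocation", "family"},
}
# Hypothetical sentiment lexicons, also invented for illustration.
NEGATIVE_WORDS = {"unhappy", "frustrated", "burnout", "poor", "unfair", "low"}
POSITIVE_WORDS = {"grateful", "enjoyed", "great", "supportive", "happy"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def classify_response(text):
    """Return (category, sentiment) for one free-text survey response."""
    tokens = tokenize(text)
    scores = {
        category: sum(token in keywords for token in tokens)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    category = best if scores[best] > 0 else "other"
    polarity = sum(t in POSITIVE_WORDS for t in tokens) - sum(
        t in NEGATIVE_WORDS for t in tokens
    )
    if polarity > 0:
        sentiment = "positive"
    elif polarity < 0:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    return category, sentiment


result = classify_response("The pay was low and I felt frustrated.")
```

A trained text classifier would generalize far better than fixed keyword lists, but the input and output shapes would look much the same.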

How did you partner with the client to achieve these results?

My stakeholder did not offer a ton of direction because he trusted my recommendations. We also wanted to let the data tell the story. I defined the roadmap on how to approach the problem to maintain forward progress. After our initial results, the client and I studied the outputs to tie their organizational knowledge to the analysis. This allowed us to refine the models to achieve even better results.

If you are interested in learning more about Data Tapestry or this project, email us or visit our website.
