Automating Visualizations and Implementing Standardized Data Collection Practices

Creating automated visualizations can be difficult when working with data that has not been standardized. In a large hospital system, standardization requires multi-level communication across many departments as well as strict adherence to those standards so that processes can be implemented. Alex Ratliff talks us through how he created a dashboard around ever-changing standardization issues.


Tell me about your background and role at Data Tapestry.

I’m currently a data scientist at Data Tapestry. My background is in math and analytics. During the first three years of my career, I worked at a company that contracted with the Department of Defense. While I was there, I built dashboards that monitored data flow to make sure the data was being processed correctly. Different sites made data transfers, and our job was to make sure each transfer job had the proper amount of bandwidth. I mostly used R and SQL to manage that.

 

Tell me about the dashboard project you worked on.

When I started working on the project, there were a few Tableau dashboards in place. The idea was to have a centralized location where they could view performance metrics as well as benchmarks for specific hospital facilities. The metrics were self-reported by the hospital. Previously, they had been using spreadsheets to track this information. So, I needed to aggregate several data sources, ensure standardized metrics across all the facilities, and eventually automate the process.

 

Can you describe the relationship you had with the stakeholders?

The team was very interested in having a data-driven solution. However, I had to manage expectations around how long some of the work would take. Much of that work was coordinating many individuals to commit to a standardized way of collecting data. Once those standards were implemented, I was able to automate data refresh feeds so that the dashboard was as close to real-time as possible.

 

How did you handle data quality issues?

One of the first things I encountered was that the data had been ingested incorrectly. Some values were being recorded in the wrong field. I had to constantly verify the data with the stakeholders and then monitor the data refresh to make sure the quality was consistent.
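
As an illustration of the kind of check described here, the sketch below flags values that look like they landed in the wrong field. It is not the project's actual code: the field names (`facility`, `report_date`, `metric_value`), the date format, and the sample rows are all hypothetical, chosen only to show the idea.

```python
from datetime import datetime


def _is_date(value):
    """True if value parses as a YYYY-MM-DD date (assumed format)."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except (TypeError, ValueError):
        return False


def _is_number(value):
    """True if value can be read as a number."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


# Hypothetical schema: each field maps to a check for a plausible value.
EXPECTED = {
    "facility": lambda v: isinstance(v, str) and v.strip() != "" and not _is_number(v),
    "report_date": _is_date,
    "metric_value": _is_number,
}


def find_misplaced_fields(rows):
    """Return (row_index, field, value) for every value that fails its field's check."""
    problems = []
    for i, row in enumerate(rows):
        for field, check in EXPECTED.items():
            value = row.get(field)
            if not check(value):
                problems.append((i, field, value))
    return problems


# Made-up sample: the second row has its values shifted into the wrong fields.
rows = [
    {"facility": "North Campus", "report_date": "2021-03-01", "metric_value": "4.2"},
    {"facility": "4.2", "report_date": "North Campus", "metric_value": "2021-03-01"},
]

for problem in find_misplaced_fields(rows):
    print(problem)
```

A report like this could run after every refresh, so that flagged rows go back to the stakeholders for verification instead of reaching the dashboard.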

 

What kind of techniques did you use to aggregate all of the data?

I combined several different data sets and spreadsheets. The metrics for one facility might be spread across the different data sets. So, in addition to standardizing metric names, we also had to create a facility crosswalk to ensure we were identifying the correct facility across various data sources.

 

Were there any interesting challenges that you encountered during the project?

I had to automate the data refresh script so that the stakeholders could see the dashboard with the latest data. This involved some code conversion, error handling, and a lot of code cleanup.
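
The error handling in a refresh script often comes down to running steps in order, retrying a step that fails, and stopping before bad data is published. The sketch below shows one way to structure that, under stated assumptions: the step names (`extract`, `transform`, `publish`), the retry count, and the function `run_refresh` are hypothetical and are not taken from the project.

```python
import logging
import time

logger = logging.getLogger("dashboard_refresh")


def run_refresh(steps, retries=2, delay_seconds=0.0):
    """Run (name, callable) steps in order and report what happened to each.

    A failing step is retried up to `retries` times. If it still fails, it is
    marked "failed" and every later step is marked "skipped", so a broken
    extract never leads to a publish.
    """
    results = {}
    failed = False
    for name, step in steps:
        if failed:
            results[name] = "skipped"
            continue
        for attempt in range(1, retries + 2):
            try:
                step()
                results[name] = "ok"
                break
            except Exception:
                # Log the full traceback so the failure can be diagnosed later.
                logger.exception("step %r failed (attempt %d)", name, attempt)
                if attempt <= retries:
                    time.sleep(delay_seconds)
        else:
            # The loop finished without a successful attempt.
            results[name] = "failed"
            failed = True
    return results


def _extract():
    pass  # placeholder for pulling the source files


def _transform():
    raise ValueError("unexpected column layout")  # simulated failure


def _publish():
    pass  # placeholder for refreshing the dashboard extract


status = run_refresh(
    [("extract", _extract), ("transform", _transform), ("publish", _publish)],
    retries=1,
)
print(status)
```

Returning a status per step, instead of raising on the first error, makes it easy to send stakeholders a short summary of each refresh.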
