Processing for Historical Reports is Delayed due to DST Shift
Incident Report for MachineMetrics
Postmortem

We are writing to share additional context around the reporting issues that you may have experienced over the course of this incident.

MachineMetrics aggregates data for historical reporting in our data warehouse. Around the start of Daylight Saving Time, some time periods were created with the same start time and end time. These zero-duration periods led to anomalous aggregation and reporting behavior, which has since been resolved.

Time periods with zero duration have been created shortly after the start of Daylight Saving Time for the past few years. They had no impact previously; however, due to other changes made within the past year, these empty periods caused problems with data aggregation during this time change. The underlying data was reported and recorded properly; only the aggregation around these time periods was problematic. Over the past few days, we've changed the code so that it no longer creates these meaningless time periods, which corrects the aggregation problem.
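
The following is an illustrative sketch only, not MachineMetrics' production code. It assumes, hypothetically, that reporting periods are built from local wall-clock hour boundaries in the America/New_York time zone and then converted to UTC. Under that assumption, the nonexistent 2:00AM on March 8, 2020 resolves to the same instant as 3:00AM, which yields one period whose start and end are identical. The function names `hourly_periods` and `drop_empty` are invented for this example.

```python
from datetime import date, datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

# Assumption: periods are defined by local wall-clock hours in this zone.
EASTERN = ZoneInfo("America/New_York")
UTC = timezone.utc


def hourly_periods(day, first_hour, last_hour, tz):
    """Build (start, end) pairs in UTC for consecutive wall-clock hours."""
    periods = []
    for hour in range(first_hour, last_hour):
        start = datetime(day.year, day.month, day.day, hour, tzinfo=tz)
        end = datetime(day.year, day.month, day.day, hour + 1, tzinfo=tz)
        # Convert to UTC before comparing; arithmetic on two datetimes that
        # share the same tzinfo ignores the DST shift.
        periods.append((start.astimezone(UTC), end.astimezone(UTC)))
    return periods


def drop_empty(periods):
    """Keep only periods that cover a positive amount of time."""
    return [(start, end) for start, end in periods if end > start]


# Midnight to 4AM local time on the day of the spring-forward shift.
raw = hourly_periods(date(2020, 3, 8), 0, 4, EASTERN)
cleaned = drop_empty(raw)

for start, end in raw:
    print(start.isoformat(), end.isoformat(), end - start)
```

In this sketch the 2:00AM to 3:00AM wall-clock hour collapses to a zero-length period in UTC, and `drop_empty` removes it before any aggregation would run. Whether the real pipeline produced its empty periods in exactly this way is not stated in this report.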

To summarize, no data was compromised, and we were able to restore the data aggregates underlying the historical reports once we identified the problem and found a sustainable solution.

Again, thank you for your patience.

Posted Mar 10, 2020 - 12:55 EDT

Resolved
We appreciate your patience over the last 24 hours as we have been working diligently to fix the reporting issues you may have experienced. We are happy to report all data is now reporting as normal.

By noon yesterday, we had a process in place to keep all data available in real time so that it reported as normal while we continued to identify and resolve the issue. At 7:00PM yesterday, all but a small subset of machines (around 15%) were reporting normally. That 15% experienced issues with two data points, unplanned and planned downtime, which temporarily affected the downtime pareto on the Job Report as well as the Production Report.

We are writing to inform you that as of this afternoon, Monday, March 9, 2020, all of the above issues have been resolved and all data is reporting as normal and in real time. We'll update this incident with further details once they are available.
Posted Mar 09, 2020 - 16:06 EDT
Update
There is a small number of machines that are still processing more slowly than expected. We are still investigating the issue with the remaining machines and will post here with more information as it becomes available.
Posted Mar 08, 2020 - 21:20 EDT
Update
We have a temporary workaround in place which is designed to keep data up-to-date and accurate within 15 minutes of real-time while we identify the root cause.

We will post here with more information as it becomes available.
Posted Mar 08, 2020 - 13:44 EDT
Investigating
We are currently investigating an issue with the system that manages data for historical reporting. The Utilization, OEE, and Production reports are affected. The Production API, which is often used with BI tools such as Klipfolio and PowerBI, is also affected. The issue appears to be related to the shift in Daylight Saving Time that occurred last night at 1AM.

Reporting is accurate prior to 9:30AM EST, but processing is currently delayed after that time. Delayed data will cause some data points to be reported inaccurately.

We will post here with more information as it becomes available.
Posted Mar 08, 2020 - 10:40 EDT
This incident affected: Data Processing Services.