ETL stands for Extract, Transform and Load. These are the three vital steps most analysts take before they Analyze, Investigate and Storytell.
Most of the time, the ET’ing is done for us. You log into a tool and hit export. The Loading part, getting the data into a format where it can be statistically analyzed or presented in a culturally acceptable way, is longer. And it’s where we spend too much time.
But not today. I’m unpacking a tricky T problem.
I'm trying to automate an algorithm further and unlock an area of possibility, and that involves flattening lists of lists of lists, which, sadly for me, are also composed of lists. It's tricky. There are functions that do that. It's a matter of finding the right one and using it the right way. I've found plenty of functions that do it the wrong way. (Oh yes.)
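To make the problem concrete, here is a minimal Python sketch of one way to flatten lists nested to any depth. It is an illustration under assumptions, not the function I ended up using: the name `flatten` and the sample data are made up, and it only treats lists and tuples as containers. It also shows one plausible flavor of "the wrong way": `itertools.chain.from_iterable` from the standard library removes exactly one level of nesting and leaves the rest alone.

```python
from itertools import chain


def flatten(nested):
    """Yield the leaf items of an arbitrarily nested list, left to right."""
    for item in nested:
        # Assumption: only lists and tuples count as containers.
        # Strings, numbers, dicts, etc. are treated as leaf values.
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item


# Hypothetical sample data: lists of lists of lists (of lists).
data = [[1, [2, 3]], [[4, [5, ["six"]]]], 7]
flat = list(flatten(data))
print(flat)  # [1, 2, 3, 4, 5, 'six', 7]

# One-level flattening: the inner lists survive.
one_level = list(chain.from_iterable([[1, [2, 3]], [[4]]]))
print(one_level)  # [1, [2, 3], [4]]
```

Because the sketch recurses once per level of nesting, extremely deep structures could hit Python's recursion limit; for ordinary exported data that is unlikely to matter.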
Transformation can be especially difficult, even for an analyst who understands programming. The basic data structures, logic, and iteration functions are enough to get started. Solving more specific problems takes a few additional skills, like research, planning, testing, and becoming incredibly familiar with extended libraries of functions. Usually, a solid knowledge of algorithms and complexity is required to solve the trickier problems. Thankfully I can run downstairs and keyword hunt with the real programmers.
And that’s where I’m at this morning. Right at T.