Okay, I have a blog post for the first time in…years. Today seemed like a good day to dust this off.
I’m back on the T-SQL Tuesday grind, brought to you this month by Louis Davidson ( l | t ). This was an excellent question: What advice do you wish Current You could go back and give past you as you were starting your first data platform job?
Let’s start at the present. In late summer 2021, I entered consulting. It was a different world than I was used to as an FTE at different companies, but I became good at it. Most crucially, I also learned a few new tools, since we are vendor-independent and help clients decide which tech stack is best for them. That especially has me thinking about wisdom for me as a mid-20s data analyst…before he knew about community and went beyond T-SQL queries to do much more. So here is the text I would hand over if I ever went back in time to give half-baked advice.
A message to my early 2010s self:
Data technology will evolve throughout your career. Keep track and adapt. Discern and allow healthy skepticism. Yet above all, understand the basics behind the tech.
Sure, you’ve done a lot of SQL and SSRS, and you started whipping up some SSIS packages, and probably remember some Java and HTML from college. This will last you for a bit, but it’s not all going to be there forever. By 2024, you will be used to hearing about new technology often, with vendors entering the field every year with their new and niche solutions. You’ll still be using a lot of SQL if you follow the same path, but don’t feel like you have to learn all the things. I can tell you from experience that your attention span won’t allow this. Here’s what I am talking about.
Learn Python now and not very slowly six years later. (Understand the basics…still)
The underlying technology might be the most important part. When 2021 hit, I barely knew what the “modern data stack” was versus the “modern data warehouse” that Azure promoted. Oh…you’re wondering about what constitutes modern? Just wait until you hear about dbt, Snowflake, and Fivetran. That’s how the “stack” term came about. Well, those are built on a Python foundation. The language will be useful not only for web scraping but also for data science. You may consider both R and Python, each of which has its strengths, but go ahead and start on the latter first.
Also, data modeling. Read those Kimball and Inmon books too.
Some trends will have a moment, and some will last longer. (Discern and allow healthy skepticism)
The example of something which had a moment? Well, big data is going to be a big deal by the middle of this decade. Hadoop will look amazing…for a few minutes. I didn’t go far with learning about Hadoop, or big data clusters, or more…and that sort of worked out. Not saying it won’t have its place, either. Someone will refute the “big data is dead” trope when you hear about the petabytes of data they work with daily.
For an opposite perspective, you’ll see someone hype that database administrators are a dying breed. Not true. They will evolve with the times and start doing systems work in the cloud. They will handle source control and code releases. You’ll be doing some of this as an engineer. It’s hard to predict what the future will be, but try to discern, and understand that even those at the top of your organizations have their preferences as well. Don’t chase the shiny object immediately.
Okay, you ask, “how will this apply with this Microsoft Community business you want me to get into today?” (Keep track and adapt)
The 2023-24 hype for Microsoft is Fabric. On one end, I’ve been excited about its potential. On the other, I have a tough time shaking off how much it reminds me of Databricks.
*takes moment to explain how Databricks is a data lakehouse, and what makes it similar to a data warehouse, redacted from this entry*
You can argue that Fabric initially is a lakehouse in the same regard, and will need to do a bit to differentiate itself. That does not mean it should be dismissed, either. To me, it’s worth paying attention to where Fabric goes, and whether an organization should move to it. This is where I am now trying to apply both paying attention and remaining skeptical. Earlier it was about Synapse, and SQL Data Warehouse before then. I worked on both briefly…you will as well. It looks like this might stick, but the idea is that by now you will be well-prepared. The cloud will be huge, so get ready for all the Azure references; the cloud is mandatory for the “modern” concept.
Ultimately…
What I’m trying to tell you is that tech will come fast, and it’s best to get perspectives from everywhere. You’re going to work with a lot of data, and a lot of tools, over the years as you start getting into the cloud…I’ll tell you more about clouds later. Keep up, but don’t learn something just because it seems hip; learn it because it seems practical and moves you forward.