I’m hopping back in for this month, as we get into a topic I have dabbled with but have only recently used more often: notebooks.
This month’s T-SQL Tuesday post is from a figure you might have heard of, Steve Jones (b | t). He’s asking if we use notebooks, how we use notebooks, and how we want to use notebooks in the future. Simple answer: I write in notebooks when I need to jot things down by hand. Needed them all through school.
Okay, we aren’t talking about those notebooks.
These notebooks are interactive computer environments for “literate programming.” We can show code, results, and visuals all in one place to share insights. Notebooks are very popular with data scientists and analytics groups because of how data can be displayed. Likewise, a big focus in this particular data platform community is Jupyter Notebooks.
I am not a data scientist myself. My first taste of the notebook was when Azure Notebooks were the thing, as I became familiar with Azure Data Studio (to this day, I often use SQL Server Management Studio, but have started interspersing ADS usage for my T-SQL work…helps that more Redgate toolbelt features are on there, too). ADS made notebook extensions and installations really easy, which is essential for people who get lost in the concept without a structure to learn it. Thing is, I only scratched the surface with notebooks around spring/summer 2019, and only recently started to use them more. When I start working with a new technology, I may be able to learn it quickly, but my application is often extraordinarily basic until I have excuses to use it most days.
My best use for the notebook
If ya basic, what’s the most common use? For me, it’s to keep related standalone scripts I have in one place. In my work, I sometimes have scripts that can’t easily be set as objects on their databases due to permissions headaches, nor are these repeatable scripts that we can put in our current source control for similar reasons. That being said, the presentation of documentation and code in one place is what sells me on notebook usage. It is easy enough to present a notebook in a similar manner to an outline, with instructions before running individual code snippets. I can set up the notebook so that all scripts can be run at once and documented properly.
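To make the outline idea concrete, here is a minimal sketch in Python that builds a notebook file with an instruction cell ahead of each script. It assumes the standard Jupyter notebook JSON layout (nbformat 4). The helper names (`markdown_cell`, `code_cell`, `build_outline`) and the two sample queries are made up for illustration, and the kernel metadata is left empty, so the kernel would still have to be picked when the file is opened.

```python
import json


def markdown_cell(text):
    """Return a markdown (documentation) cell in nbformat 4 layout."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Return an unexecuted code cell with no saved output."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def build_outline(title, steps):
    """Build a notebook dict: a title, then instructions before each script.

    `steps` is a list of (instructions, script) pairs. This is a hypothetical
    helper for illustration, not part of any notebook tool.
    """
    cells = [markdown_cell(f"# {title}")]
    for number, (instructions, script) in enumerate(steps, start=1):
        cells.append(markdown_cell(f"## Step {number}\n{instructions}"))
        cells.append(code_cell(script))
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 2}


# Sample scripts, chosen only as placeholders.
steps = [
    ("Check the server version before anything else.", "SELECT @@VERSION;"),
    ("List the databases on the instance.", "SELECT name FROM sys.databases;"),
]

notebook = build_outline("Standalone scripts", steps)
notebook_json = json.dumps(notebook, indent=1)  # text to save as a .ipynb file
```

Writing `notebook_json` to a file with an `.ipynb` extension should give a notebook where each script sits directly under its own instructions.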
There are also little tricks that I recently saw which kept notebooks engaging. I had a bit of fun and linked one notebook to another as part of a demonstration, per a post about adding links easily from Julie Koesmarno (who, for the record, has a series of great posts on using notebooks). In the case that we need to use more than SQL, there is the pointer ‘%%lang’, which can be used at the top of a code cell to switch to Python or R while using the SQL Server kernel. For instance, using “%%lang_py” allows me to use a Python script in a notebook, which is awesome. Now I don’t need separate scripts with environment switches; that seems practical.
Considering that the notebook can save code and result text, it has provided a good how-to on expected results for a script or series of them. This has helped when I need to reference something I previously did, and I can show a colleague what the expected results would be for a given code cell. If this feature gets voted up, soon we’ll have the chance to include graphical execution plans, which I will appreciate adding for case studies, and which others more gifted at tuning can use to show me where code builds can improve. The caveat is that saving the result set takes up more space. That is part of why I wasn’t saving notebooks at first: they took up a lot of space when I was hellbent on showing results to explain how something worked. However, clearing the output saves that space, and now my notebooks are all under 3 KB.
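For anyone curious why clearing output shrinks the file so much, here is a rough sketch in Python of what that amounts to. It assumes the standard Jupyter notebook JSON layout (nbformat 4), where saved results live in each code cell's `outputs` list. The `clear_outputs` function and the sample notebook are made up for illustration. This is not how ADS does it internally, just the same idea done by hand.

```python
import copy
import json


def clear_outputs(notebook):
    """Return a copy of a notebook dict with saved results removed.

    Hypothetical helper: empties `outputs` and resets `execution_count`
    on every code cell, leaving markdown cells and the code itself alone.
    """
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny made-up notebook with one documented query and its saved result text.
sample = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "Check the version."},
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [
                {
                    "output_type": "stream",
                    "name": "stdout",
                    "text": "(1 row affected)\n" * 50,
                }
            ],
            "source": "SELECT @@VERSION;",
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 2,
}

size_before = len(json.dumps(sample))
size_after = len(json.dumps(clear_outputs(sample)))
print(f"with output: {size_before} characters, cleared: {size_after} characters")
```

The code and documentation survive untouched; only the stored results go, which is where most of the size comes from.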
How would I like to use it in the future, you ask?
First, I’d just like to use them at a more frequent clip than I currently do. However, I see two significant uses in the short term that would help me personally.
I have a recent GitHub repo, and now I need to do something with it. I’m thinking that migrating to Jupyter notebooks will help for any demos or personal projects. I’ve read about an extension to get around the GitHub HTML rendering, and may want to put that into play, along with turning some of my scripts from demos and my two technical talks into their own notebooks for safekeeping. That integration would be a fine use.
The other use has to do with development for business intelligence. As I mentioned at the start, data scientists swear by the notebook quite often, and in my flirtations with data science, I’ve found that notebooks work extremely well with Anaconda, which is how I figured out Python over time. From there, I would move that work into ADS for familiarity, depending on what I learn about notebooks with Python or R. The graphical element allows for trends and reports to be seen clearly, which ties back into the space concern, but will also be essential to showing results and helping to guide predictions.
There are even more uses that I look forward to seeing and discovering, including better ways to use them myself. Like I said early on in the post, I’m still very basic when it comes to how notebooks work, so let’s see how this plays out.