In “What ‘Pondering’ and ‘Reasoning’ Really Mean in AI and LLMs,” you address the semantic gap between human and machine reasoning. How does understanding this distinction impact the way you approach model development and interpretation in your professional work?
AI has generated huge hype recently. Seemingly overnight, many old-school ML-based products have been rebranded as AI, and there appears to be renewed demand for anything with AI slapped on it. For this reason, I believe it’s now essential for everyone to have a basic technical understanding of what AI is and how it actually works, so that they’re able to judge what it can and cannot do for them.
The reality is that we carry a lot of baggage about the very nature of AI, originating in narratives from our sci-fi legacy. This baggage makes it easy to get carried away by all of AI’s exciting and promising potential and forget its actual current capabilities, ultimately misjudging it as some kind of magic solution that will solve all our problems. Non-technical business users are the most susceptible to this overexcitement about AI, sometimes imagining it as a black-box superintelligence capable of providing correct answers and solutions to anything.
For better or for worse, this couldn’t be further from the truth. LLMs — the core scientific breakthrough all the AI fuss is really about — are impressively good at certain things (for example, generating emails or summaries), but not so good at others (for instance, performing complex calculations or analysing multilevel cause-and-effect relationships).
Having a technical understanding of what AI is and how it fundamentally works has helped me immensely in my professional work. Primarily, it allows me to identify valid AI use cases and to manage business users’ expectations of what can and cannot be done. On a more technical level, it allows me to pinpoint the specific components that need to be used in each context, so that the delivered solution has real value for the business.
For instance, if a RAG application is required to search specific technical documentation and perform calculations based on information found in that documentation, then a code-execution component needs to be included in the application to perform the calculations (instead of letting the model answer directly).
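To make the idea concrete, here is a minimal sketch of that kind of routing. It is not an actual production implementation: the retriever and the LLM call are hypothetical stubs, and the point is only the control flow, in which the calculation is carried out by code rather than by the model itself.

```python
# Minimal sketch: in a RAG app, route calculations to a code-execution step
# instead of letting the model compute the answer directly.
# The retriever and the LLM client below are hypothetical stand-ins.

import ast
import operator


def retrieve_documents(query: str) -> list[str]:
    """Stub retriever: would normally query a vector store of technical docs."""
    return ["Pump A flow rate: 120 l/min", "Pump B flow rate: 95 l/min"]


def llm_generate_expression(query: str, context: list[str]) -> str:
    """Stub LLM call: would normally ask the model to emit a pure arithmetic
    expression grounded in the retrieved context, not the final number."""
    return "120 + 95"


# A tiny, safe arithmetic evaluator acting as the "code execution" component.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def safe_eval(expression: str) -> float:
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("Unsupported expression")
    return _eval(ast.parse(expression, mode="eval").body)


def answer(query: str) -> str:
    context = retrieve_documents(query)
    expression = llm_generate_expression(query, context)
    result = safe_eval(expression)  # the calculation is done by code, not the model
    return f"Combined flow rate: {result} l/min"


print(answer("What is the combined flow rate of pumps A and B?"))
```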
Where do you draw the initial inspiration for your articles, especially the more philosophical ones like the “Water Cooler Small Talk” series?
The initial inspiration for my “Water Cooler Small Talk” series came from actual discussions I’ve experienced in an office, as well as from friends’ stories. I believe that, due to people’s tendency to avoid unnecessary conflict in corporate settings, some really outrageous opinions can be voiced in casual discussions around a water cooler. And typically, nobody calls out incorrect facts, simply to avoid conflict or challenging their colleagues.
Even though such conversations are benevolent and well-intended — really just a casual break from work — they often end up perpetuating incorrect scientific claims. Especially for complex, not-so-intuitive topics like statistics and AI, we can easily oversimplify things and perpetuate invalid opinions.
The very first opinion that pushed me to write a whole piece about it was a casual claim about how gambling odds work. Now, if you’ve ever taken a statistics class, you know that this is not how it works; but if you haven’t had that statistics class, and nobody calls it out, you might leave the discussion with some strange ideas about how gambling works. So, my initial inspiration for that series was mainly misunderstood statistics topics.
Nonetheless, the same — if not more — misunderstandings nowadays apply to topics related to AI. The huge hype that AI has generated has led people to imagine and spread all kinds of misinformation about how AI works and what it can do, often with incredible confidence. That is why it’s so essential to educate ourselves on the basics, no matter whether the topic is statistics, AI, or anything else.
Can you walk us through your typical writing process for a detailed technical article, from initial research to final draft? How do you balance deep technical accuracy with accessibility for a general audience?
Every technical post starts with a technical concept that I want to write about — for example, demonstrating how to use a particular library or how to structure a certain problem in Python. For instance, in my Pokémon post, the goal was to explain how to structure an operations research problem in Python. After identifying the core technical concept I want to cover, my next step is usually to look for a suitable dataset that can be used to demonstrate it.
I believe this is the most difficult and time-consuming part — finding a good, open-source dataset that can be freely used in your analysis. While there are plenty of datasets out there, it is not so trivial to find one that is freely available, has complete data, and is interesting enough to tell a good story.
In my opinion, the flavour of the dataset you use can have a huge impact on the popularity of your post. Structuring an operations research problem using Pokémon sounds far more fun than using worker shifts (eww!). Overall, the dataset should thematically fit the technical topic I’ve chosen and make for a reasonably coherent story.
Having identified the technical topic of the post and the dataset I’m going to use, I then write the actual code. This is a fairly straightforward step: write the code using the dataset and get it to run and produce correct results.
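For the Pokémon example mentioned earlier, a generic sketch of what that kind of formulation can look like is shown below. It is not the formulation from the actual post — the stats and constraints are made up for illustration — but it shows the typical structure of a small operations research problem in Python using the PuLP library.

```python
# Generic sketch: select a Pokémon team as a small integer optimization problem.
# Data, constraints, and the budget are hypothetical; only the structure matters.

import pulp

# Hypothetical data: name -> (attack, stamina cost)
pokemon = {
    "Pikachu": (55, 20), "Charizard": (84, 60), "Snorlax": (110, 90),
    "Gengar": (65, 40), "Gyarados": (125, 80), "Eevee": (55, 25),
    "Machamp": (130, 95), "Alakazam": (50, 35),
}

problem = pulp.LpProblem("pokemon_team_selection", pulp.LpMaximize)

# One binary decision variable per Pokémon: 1 if it joins the team, 0 otherwise.
pick = {name: pulp.LpVariable(f"pick_{name}", cat="Binary") for name in pokemon}

# Objective: maximize the total attack of the selected team.
problem += pulp.lpSum(pokemon[n][0] * pick[n] for n in pokemon)

# Constraints: at most 6 team members, within a (made-up) stamina budget.
problem += pulp.lpSum(pick[n] for n in pokemon) <= 6
problem += pulp.lpSum(pokemon[n][1] * pick[n] for n in pokemon) <= 250

problem.solve(pulp.PULP_CBC_CMD(msg=False))
team = [n for n in pokemon if pick[n].value() == 1]
print("Selected team:", team)
```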
Once I’ve finished the code and made sure it runs properly, I begin to draft the actual post. I normally start my posts with a brief intro on what initially sparked my interest in this specific topic (for instance, I wanted to create a complex visualization for my PhD, and the searoute Python library made my life easier), and how the topic can be useful to the reader (reading my tutorial explaining API calls to the Pokémon data API can help you understand how to write calls to any API).
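As a quick illustration of that kind of API call, here is a minimal example against the public PokéAPI using the requests library. The endpoint is real, but the particular fields extracted are just an arbitrary choice for the example, not taken from the tutorial itself.

```python
# Minimal example of calling a public REST API (PokéAPI) and parsing the JSON response.

import requests


def get_pokemon(name: str) -> dict:
    """Fetch a few basic fields for a Pokémon from the PokéAPI."""
    response = requests.get(
        f"https://pokeapi.co/api/v2/pokemon/{name.lower()}", timeout=10
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    data = response.json()
    return {
        "name": data["name"],
        "base_experience": data["base_experience"],
        "types": [t["type"]["name"] for t in data["types"]],
    }


print(get_pokemon("pikachu"))
```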
I also add some brief general explanations, wherever appropriate, of the underlying theoretical premise of the use case I’m demonstrating, as well as a short introduction to the code libraries I will be using.
In the main part of the technical post, I typically show how to structure the code with Python snippets, and present step-by-step explanations of how everything plays out and the results you should expect if everything runs correctly.
I also like to add GIF screenshots demonstrating any interactive diagrams incorporated in the code — I believe they make the posts much more interesting, easier to understand, and visually appealing to the reader.
And there you have it! A technical tutorial!
What initially motivated you to start sharing your knowledge and insights with the broader data science community, and what does the process of writing give back to your professional practice?
Back in 2017, while writing my diploma thesis, I stumbled upon Medium and the Towards Data Science publication for the very first time. After reading a few posts, I remember being completely mesmerized by the abundance of technical material, the range of topics, and the creativity of the posts. It felt like a data science community, with writers of diverse backgrounds and at different technical levels — there were articles for every level and for various domains.
But apart from appreciating the technical depth of the tutorials, which allowed me to learn and understand more about data science, I also liked the creativity and storytelling of the posts. Unlike a GitHub page or a Stack Overflow answer, there was a certain creativity and artistry in many of the posts. I really enjoyed reading them — they helped me learn a lot about data science and machine learning, and over time, I quietly developed the desire to write such posts myself.
After thinking about it for a while, I hesitantly drafted and submitted my very first post, and that is how I published with TDS for the first time in early 2023. Since then, I’ve written several more posts for TDS, enjoying each one as much as that first post.
One thing I really enjoy about writing technical pieces for TDS is sharing things that I personally found difficult to grasp or especially interesting. Sometimes complex topics like operations research, probabilities, or AI can feel scary and intimidating, discouraging people from even beginning to read and learn more about them — I am personally guilty of this myself.
By creating a simplified, straightforward, even seemingly fun version of a complex topic, I feel I enable people to start reading and learning more about it with a gentle, not-so-formal introduction, and to see for themselves that it is not so scary after all.
On the flip side, writing has greatly helped me on a personal and professional level. My written communication has improved considerably. Over time, it has become easier for me to present complex technical topics in a way that non-technical business audiences can grasp. Ultimately, putting yourself in the position of explaining a topic to someone else in simple terms forces you to fully understand it and avoid leaving ambiguous spots.
Looking back at your career progression, what’s a non-technical skill you wish you had focused on earlier?
In a data career, the most important non-technical skill is communication.
While communication is useful in any field, it is especially critical in data roles. It is what really bridges the gap between complex technical work and practical business understanding, and it helps make you a well-rounded data professional.
This is because, no matter how strong your technical skills are, if you cannot communicate the value of your deliverables to business users and management, those skills won’t take you very far.
It is important to be able to explain the value of your work to non-technical audiences, speak their language, understand what matters to them, and communicate your findings in a way that shows how your work benefits them.
Data and math, as useful as they are, can often feel intimidating or incomprehensible to business users. Being able to translate data into meaningful business insights, and then communicate those insights effectively, is ultimately what allows your data analysis projects to have a real impact on an organization.
To learn more about Maria’s work and stay up to date with her latest articles, you can follow her on TDS or LinkedIn.
