With another year nearly behind us, it’s time to sit back and consider what we’ve just been through. It’s been another active year in the big data space, with plenty of news for the intrepid big data reader.
We’ve had an eventful year here at Datanami, which will soon complete the transition to BigDATAwire (keep your eyes out for that change in January). With that in mind, it’s worth taking a look at the top story in each of the past 12 months. The rankings are based on pageviews.
January: All Eyes on Snowflake and Databricks in 2022
The new year kicked off with a lot of anticipation for what Databricks and Snowflake would do. The two companies didn’t disappoint, with a bunch of new capabilities and continued strong growth (although the much-anticipated Databricks IPO never materialized). These two data giants will be interesting to watch in 2023 too, although it will be tough to cover their respective user conferences in June, which happen on the same days (with Databricks in San Francisco and Snowflake in Las Vegas).
February: Snowflake, AWS Warm Up to Apache Iceberg
Apache Iceberg, the new open table format that solves a lot of consistency problems in big data lakehouses, came on strong in late 2021, and its usage grew through 2022. We named Ryan Blue, the co-creator of Iceberg, as one of our people to watch. Databricks, for what it’s worth, announced support for Iceberg later in the year (it also open sourced its Delta table format, providing competition to Iceberg, along with Apache Hudi).
March: Home Depot Finds DIY Success with Vector Search
Vector search was one of the most compelling new technologies to find traction in 2022. We got an inside view of how the technology (often deployed using vector databases) helped home improvement giant Home Depot supercharge its customers’ Web and mobile searches by using neural networks to infer what they’re looking for instead of maintaining a huge dictionary of commonly misspelled words.
April: The Modernization of Data Engineering at Capital One
Democratization of data science and data analysis may be the goal, but data engineering is often the path to get there. The folks at Capital One realize this, which is why the company has poured resources into data engineering to streamline access to data. Its internal data marketplace combines a data catalog, an automated data pipeline development tool, data governance, and data quality, and it’s held together with a fine data mesh.
May: Anaconda Unveils PyScript, the ‘Minecraft for Software Development’
Python has become the lingua franca for data science. That’s not news. But with Anaconda’s new PyScript, which CEO Peter Wang unveiled at the PyCon 2022 conference, the company helped to lower the barrier to developing data science applications in the comfort of a Web browser.
June: EMR Serverless Now Available from AWS
Apache Hadoop has long ceased being the center of gravity of the big data world. But Hadoop’s legacy lives on, including at AWS, where its Amazon EMR offering continues to be a smash hit among customers using Apache Spark, Apache Flink, Apache Hive, Presto, and even MapReduce code. And with its new serverless option, Amazon EMR (which used to stand for Elastic MapReduce but doesn’t officially anymore) helped to eliminate one of the big usability hurdles of that old elephant, Hadoop.
July: Mathematica Helps Crack Zodiac Killer’s Code
Sometimes, stories languish on Datanami for months before readers finally realize what they’ve been missing. Such was the case with this January 2022 story, which described how a trio of men from Virginia, Australia, and Belgium used the Mathematica statistical package from Wolfram to crack the Zodiac Killer’s code. Discover Magazine gets credit for first reporting this story. Unfortunately, the identity of the Zodiac Killer, the serial killer who terrorized Northern California more than half a century ago, remains unresolved.
August: Datanami People to Watch 2022
We first announced the 12 Datanami People to Watch back in February, and ran interviews with the group over the course of the year. It’s an incredible group of leaders, including Yu Xu (TigerGraph), Lauren Woodman (DataKind), Venkat Venkataramani (Rockset), Adam Selipsky (AWS), Matthew Scullion (Matillion), Satyen Sangani (Alation), Andrew Ng (LandingAI), Tristan Handy (dbt Labs), Susan Gregurick (NIH), Zhamak Dehghani (Thoughtworks), Joy Buolamwini (MIT Media Lab), and Ryan Blue (Tabular). Keep an eye out in early 2023 for the next batch.
September: Walmart Gives Data and Analytics Monetization A Try
As the world’s largest retailer, Walmart knows a thing or two about selling. With the launch of its new Walmart Data Ventures arm earlier this year, the company launched new offerings in its Walmart Luminate line, such as Shopper Behavior, Channel Performance, and Customer Perception. The retail giant is not only selling its partners data about its store sales (2 billion market baskets per quarter, the company says), but selling them prepackaged analytics insights, too.
October: Data Mesh Vs. Data Fabric: Understanding the Differences
There’s no denying it: Data fabrics and data meshes are hot. There’s also no denying that there’s a lot of confusion around these two concepts, which share some similarities but also have important differences. This article, which was published in October 2021, took a year to bubble up to the top and become the most-viewed story for a month, showing just how much demand there is for information on data meshes and data fabrics. Expect more interest in data meshes and data fabrics in the new year.
November: What Does Data and Analytics Need for 2023? Forrester Shares Predictions
Up to this point, Datanami had one ironclad rule: No new year predictions stories before Thanksgiving. (It was the only way to keep the PR folks at bay.) For whatever reason, we broke the rule this year when we interviewed Forrester analyst Kim Herrington and published her analyst group’s predictions for 2023, and the result was the top grossing story for the month. Go figure.
December: UC Berkeley Launches SkyPilot to Help Navigate Soaring Cloud Costs
One of the biggest emerging trends in 2022 was the rising cost of cloud computing. The folks running the computer science program at UC Berkeley realized this, which is why they created Sky Computing as the follow-on to RISELab (which succeeded AMPLab). One of Sky Computing’s first creations is SkyPilot, which lets users run batch machine learning workloads on any cloud. There’s no telling whether it will be as hugely successful as Ray, which came out of RISELab, or Spark, which came out of AMPLab. But considering the attention staff writer Jaime Hampton’s story received, we’re not betting against it.
That’s it from us this year at Datanami. Happy holidays, and we’ll see you back here in 2023.