Speaker’s Corner: Featuring Debu Sinha, Senior Solutions Architect, AI & ML at Databricks
Tell me about Databricks, its work and projects…
Databricks is a data and AI company. More than 5,000 organizations worldwide — including Comcast, Condé Nast, H&M, and over 40% of the Fortune 500 — rely on the Databricks Lakehouse Platform to unify their data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe. Founded by the original creators of Apache Spark™, Delta Lake and MLflow, Databricks is on a mission to help data teams solve the world’s toughest problems. To learn more, follow Databricks on Twitter, LinkedIn and Facebook.
Databricks has received many awards and recognitions over the years:
- Leader in the Gartner Magic Quadrant for Data Science and Machine Learning 2021 (Leader 2020, Visionary 2019)
- Visionary in the Gartner Magic Quadrant for Cloud Database Management Systems 2020
- Forbes Cloud 100 2019 & 2020 (#5) & 2021 (#2)
- Forbes AI50 2021
- Fortune Best Places to Work in Technology 2021
- Fortune Best Places to Work in NYC 2021
- Fortune Best Places to Work for Millennials 2021
- Inc Best Workplaces 2020, 2021
- CNBC Disruptor 2020 and 2021
- Forbes Best Startup Companies for 2021
- Glassdoor Top CEOs 2021
- Computing AI and ML Awards 2021: Best ‘Outstanding Data Analytics Solution’
- LinkedIn Top Startups 2019 & 2020
- Infoworld Technology of the Year 2020
- Ventana Research Digital Innovation Award
- Datanami Readers’ Choice Top 5 Vendors to Watch
- Infoworld 2019 BOSSIE Awards (open source)
- Deloitte Tech Fast 500 2019
- WayUp Top 100 Internship Programs 2020
Databricks is positioned as a Leader in the 2021 Gartner Data Science and Machine Learning Magic Quadrant and was named a Visionary in the 2020 Gartner Cloud Database Management Systems Magic Quadrant.
What are the challenges within the computer software industry today? How is your company working towards dealing with or solving these challenges?
- We are the data and AI company helping companies solve their toughest problems with data. We do this through a lakehouse, a category we have created and are actively expanding.
- A lakehouse architecture is the ideal data architecture for data-driven organizations. It combines the best qualities of data warehouses and data lakes to provide a single solution for all major data workloads and supports use cases from streaming analytics to BI, data science, and AI.
- Simple: unify your data, analytics, and AI on one platform.
- Open: unify your data ecosystem with open standards and formats.
- Collaborative: unify your data teams to collaborate across the entire data and AI workflow.
- We saw the shift to a lakehouse architecture coming for a long time, because customers’ needs across analytics and AI were converging and their current architecture was too complicated to keep up. Recently, we’ve seen other vendors in our space begin to champion the lakehouse too, but our focus has allowed us to be the first to make it a reality.
- In the past, customers had to maintain proprietary data warehouses for BI workloads and data lakes for data science and machine learning workloads, often across multiple cloud platforms. This led to complicated, expensive architecture that slowed down customers’ ability to get value from their data.
Has the COVID-19 pandemic led to an increase in data & AI solutions? What are the trends within the sector?
Absolutely! COVID-19 has accelerated the adoption of data & AI solutions by several years, and many of the changes are here for the long term. At Databricks, we have seen tremendous growth across industry verticals.
More than 40% of the Fortune 500 are already using Databricks to innovate and produce business value in their verticals. Here are some examples.
- Retail & CPG: Databricks has over 500 customers in the Retail and CPG industry including 7 of the 10 largest retailers globally. Our customers—including industry leaders such as H&M, CVS Health, Starbucks, Mars and 7-Eleven—use Databricks for a broad range of use cases such as customer personalization, granular forecasting, inventory management, and ad optimization.
- Financial Services: Databricks has over 600 customers in the Financial Services industry across banking, insurance, capital markets and fintech. Customers such as HSBC, S&P Global, Credit Suisse and ABN AMRO leverage Databricks for a broad range of use cases, from fraud detection, risk management and customer analytics to ESG.
- Communications, Media & Entertainment: Databricks has over 350 customers in the Media & Entertainment industry including 8 of the 10 largest media companies globally. Our customers, including industry leaders such as Comcast, Disney, T-Mobile, Condé Nast and Riot Games, use Databricks for a broad range of use cases such as customer analytics, audience personalization, quality of service/network analytics and advertising optimization.
- Life Sciences: Databricks has over 350 customers in the Healthcare and Life Sciences industry including 9 of the 10 largest pharmaceutical companies globally. Our customers—including industry leaders such as AstraZeneca, GSK, Amgen, Sanofi and Biogen—use Databricks for a broad range of use cases such as genomics, translational research, clinical trial optimization, inventory management and commercial analytics.
- Healthcare: Databricks has over 350 customers in the Healthcare and Life Sciences industry including 8 of the 10 largest healthcare companies. Our customers—including industry leaders such as Humana, NHS, Optum, CMS and Milliman—use Databricks for a broad range of use cases such as medical IoT analytics, population health, precision medicine and fraud detection.
- Public Sector: Databricks has over 150 customers in the Public Sector representing some of the largest state, local and federal entities in the United States. Our customers, including the DOD, HHS, DOJ, DHS, State of California, and State of New York, use Databricks for a broad range of data analytics and AI use cases such as cyber threat detection, geospatial analytics, disease spread modeling, fraud detection, predictive maintenance, and more.
What is your biggest objective as a speaker?
My biggest objective is to keep my audience engaged and provide value in my sessions. I always work to make sure that the audience connects with the central idea of the talk and leaves with immediately actionable insights for their projects.
As a leader, what are the factors both professional and personal that drive you? What keeps you going?
AI and its impact on the world excite me. While working and providing solutions to challenging AI/ML/Data business problems for enterprises, I continuously keep learning about new developments in this field. Being able to participate and contribute meaningfully to the AI revolution motivates me. Personally, leading by example and with integrity while communicating effectively with everyone around me is essential to me. Treating everyone with humility and respect is one of my top values.
COVID has also made me realize that it’s essential to be grateful for what we have and be empathetic to others.
In your opinion, do digital events give you a similar level of feedback/result vis-à-vis the live versions? What would you say were the biggest pros and cons of both formats? Which do you prefer?
I do miss the personal connection while physically delivering a talk to a group of people. On the flip side, being able to present at digital events makes it possible to reach a wider audience. Feeling safe and comfortable to attend an in-person event is everyone’s personal choice. In the future, I would prefer to have hybrid events in which people can attend sessions both in person and virtually.
What is your take on in-person events? Do you prefer in-person events as compared to hybrid or virtual? How soon do you think in-person events would return?
In your opinion, what are the top 3 challenges to returning to ‘In-Person’ events? How could we mitigate risks?
- People are still recovering from the after-effects of COVID, and even after being vaccinated, it will take time to accustom ourselves to living with COVID for the long term.
- Traditionally, the in-person venues are jam-packed with an audience. I don’t see us returning to this setting any time soon. People will not feel comfortable sitting very close to a large group of other people in a closed room.
- The effectiveness of virtual events has already shown that most events can be successful without being in person. This change in mindset will likely remain long after COVID.
Eventible has recently launched a B2B Interactive in Person Event Tracker. Tell us what you think. Do you think this is useful?
Coming from a Data Science background, I love dashboards, as a well-designed dashboard can communicate valuable insights visually and more effectively than reports. Eventible's dashboard is engaging: it shows the number of in-person vs. virtual events, which helps in understanding the state of COVID in various countries. It would be helpful to also show trends over time and to be able to shortlist upcoming conferences that are looking for speakers, along with their actual scheduled dates.
Eventible.com is a review platform catering to B2B events. Given how review-driven our lives have become today, do you think reviews will bring in a level of transparency to the events industry? Would you rely on event reviews from other speakers if you had to make a speaking decision?
Interestingly, my talk at Data Con LA was centred around detecting spam reviews using NLP on Databricks.
I do see genuine reviews being helpful, but it’s also true that bad actors have significant incentives to game the review and rating system to promote their events and products.
I believe that review platforms like Eventible will benefit from investing (if they have not already) in AI/ML approaches to detect and prevent the proliferation of fake reviews on their platform. This will enhance the credibility of their systems and provide real value to their users.
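To make the idea of ML-based fake-review detection a little more concrete, here is a minimal sketch of one classic text-classification approach: a bag-of-words Naive Bayes classifier in plain Python. This is not the method from the Data Con LA talk, nor anything Eventible is known to use. The class name, the labels, and the tiny training set are all invented for illustration, and a real system would need far more data and richer signals (reviewer history, timing, and so on).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesReviewClassifier:
    """Hypothetical toy classifier: multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)  # number of reviews per class
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def log_scores(self, text):
        """Return the unnormalized log-probability of each class for the text."""
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            score = math.log(self.doc_counts[c] / total_docs)  # class prior
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[c][tok] + 1) / denom)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy training data, for illustration only.
TRAIN = [
    ("Best event ever, buy tickets now, amazing amazing deal", "spam"),
    ("Click the link for a free pass, best conference guaranteed", "spam"),
    ("Amazing deal, free tickets, click now", "spam"),
    ("The keynote on streaming pipelines was detailed and practical", "genuine"),
    ("Good speakers, but the venue was crowded and sessions ran late", "genuine"),
    ("Practical workshop on model deployment, the speakers answered questions", "genuine"),
]

texts = [text for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
clf = NaiveBayesReviewClassifier().fit(texts, labels)

print(clf.predict("Free tickets, click now for an amazing deal"))
print(clf.predict("The speakers were practical and the sessions detailed"))
```

The classifier simply learns which words are more common in each class, so it is easy to fool; it is meant only to show the general shape of a supervised approach to flagging suspicious reviews.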
Finally, do you have a favourite mocktail or drink? We’d be delighted to know.
I have never been too interested in alcoholic beverages, but my favourite drink is a peanut and banana protein smoothie.
About Debu: Debu is a Senior Solutions Architect at Databricks focused on implementing and optimizing machine learning and deep learning capable pipelines at scale. Previously, he co-founded a real-time identity graph management and analytics company called Throtle onboarding. Before that, he founded a nonprofit organization in India that aimed to increase access to education in remote parts of India using virtual classrooms. In his current role, he interacts and aligns strategically with the technical and business leadership of Databricks Enterprise customers. He leverages his strong technical background, love for public speaking, and effective communication with customers to understand their business and technical strategy and challenges. He regularly engages in architectural design and whiteboarding sessions with customers, who see him as a trusted advisor. As a Senior Solutions Architect at Lifion by ADP, Senior Engineer at V12 group, and Bank of America, he has spearheaded multiple projects involving the creation and optimization of streaming and machine learning capable pipelines. His passion for cloud computing, machine learning, and distributed systems began while working on his Master’s research thesis on Machine Translation at Johns Hopkins University.