
Who spoke at Data Day Texas 2025

We continue to announce speakers and sessions. For the latest speaker / session updates, follow us on Linkedin.

Metadata Keynote
Ole Olesen-Bagneux (Copenhagen) @olesenbagneux

Ole Olesen-Bagneux (Linkedin) rethinks data and tech by providing perspectives from Library and Information Science. He holds a PhD in Information Science from the University of Copenhagen, Denmark, where he lectured in courses pivotal for data cataloging, such as Knowledge Organization and Information Retrieval. Ole is author of The Enterprise Data Catalog (O’Reilly). Ole is also author of the upcoming Fundamentals of Metadata Management (O'Reilly, 2025), in which he introduces a completely new architecture for metadata that he calls the Meta Grid. Standing on the shoulders of microservices, which liberated operational data, and data mesh, which liberated analytical data, the Meta Grid aims to liberate metadata. Follow Ole on Medium, and learn more about Meta Grid at Searching For Data.

Ole will present the Metadata Keynote:
Meta Grid - metadata management as an understanding of what already is and embracing it

AI Engineering Keynote
Chip Huyen (San Francisco) @chiphuyen

Chip Huyen (Linkedin) is a writer, computer scientist, and traveler.
Most recently, Chip founded Claypot AI, which was acquired by Voltron Data. Previously, she built machine learning tools at NVIDIA, Snorkel AI, and Netflix. Chip graduated from Stanford University, where she taught CS 329S: Machine Learning Systems Design. Her lectures became the foundation for the book Designing Machine Learning Systems, which, after two years, continues to be a #1 bestseller in multiple Amazon categories. Advance copies of Chip's upcoming book, AI Engineering, also from O'Reilly, will be available at Data Day Texas for your perusal. Follow her on GoodReads.

Chip will present the AI Engineering Keynote:
From ML Engineering to AI Engineering

Data Quality Keynote
Mark Freeman (Sacramento)

Mark Freeman (Linkedin) is a data scientist turned data engineer with a deep obsession for data quality. As the Tech Lead at Gable, Mark builds internal systems and data products that drive go-to-market strategies, leveraging his extensive experience in creating robust, scalable data solutions. He is also the first employee at Gable where he aims to help bring a data contract solution to market. Mark is co-author of the upcoming O’Reilly book: Data Contracts, in which he shares insights and best practices on ensuring reliable, high-quality data flows within organizations. With a passion for turning complex data challenges into actionable solutions, Mark is committed to advancing the field of data engineering and fostering a culture of trust in data across the industry. Check out Mark's courses on Linkedin Learning.

Mark will present the Data Quality Keynote:
Introduction to Data Contracts

Closing AI Keynote
Jonathan Mugan (Austin)

Jonathan Mugan (Linkedin), Principal Scientist at De Umbra, is a researcher specializing in artificial intelligence, machine learning, and natural language processing. His current research focuses on deep learning for natural language generation and understanding. Dr. Mugan received his Ph.D. in Computer Science from the University of Texas at Austin. His thesis centered on developmental robotics, an area of research that seeks to understand how robots can learn about the world in the same way that human children do. Dr. Mugan also held a post-doctoral position at Carnegie Mellon University, where he worked at the intersection of machine learning and human-computer interaction. One of the most requested speakers at the Data Day Texas conferences, he also recently spoke on the topic of NLP at the O’Reilly AI conference, and is the creator of the O’Reilly video course Natural Language Text Processing with Python. Dr. Mugan is also the author of The Curiosity Cycle: Preparing Your Child for the Ongoing Technological Explosion.

Jonathan will present the Closing AI Keynote:
What Superintelligence Will Look Like

Eevamaija Virtanen (Helsinki) @eevamaija

Co-founder of Helsinki Data Week and founder of the DataTribe Collective, Eevamaija Virtanen is at the center of Finland's exploding data community. Currently Data Engineer and Co-Founder at Invinite Oy, Eevamaija began her career as a flight attendant, where she mastered interpersonal skills in high-pressure environments, before transitioning to business process outsourcing, where she learned project management and business development. She pursued her data engineering and analytics education, while exercising her creative instincts as a photographer and videographer, exploring storytelling and design. Eevamaija's broad experience has strengthened her belief in collaboration, trust and building systems that align with purpose. Sharing her passion for community and mentoring, Eevamaija also serves on the board of Finland’s Information Technology Association (TIVIA).

Special thanks to the folks at DataGalaxy for funding Eevamaija's first US speaking engagement.

Eevamaija will be presenting the following session:
Bridge Skills: The Hardest Problem Tech Still Can’t Solve

Bethany Lyons (London)

If you follow data podcasts, no doubt you’ve already heard of Bethany Lyons. In the last year, she has appeared on The Joe Reis Show, Catalog and Cocktails, CDO Matters with Malcolm Hawker, Better Together with Timo Dechau, How to Get an Analytics Job with John David Ariansen, and many others.
Now in her second decade in the data space, Bethany has already held a diverse set of roles. She’s done everything from pre-sales consulting to product management to implementation consulting to building trading algorithms for a hedge fund. Beginning her career with a 7-year stint at Tableau, Bethany was most recently Chief Product Officer at KAWA Analytics, and Senior Product Manager at Salesforce. Currently, she is a Principal Consultant at Assured Insights.

Bethany will host the following session:
Automating Financial Reconciliation with Linear Programming and Optimization

Xinran Waibel (SF Bay Area)

Xinran Waibel (Linkedin) is a Senior Data Engineer at Netflix, where she builds batch and event-streaming data systems to enable personalization. Prior to Netflix, she was a Data Engineer at Confluent and Target, where she leveraged big data technologies to enable data-driven decision making in the marketing and membership space. An active writer and blogger known for her commitment to data education, Xinran is founder of the 5000+ member Data Engineer Things community. Check out their YouTube channel.

Vin Vashishta (Reno)

With over 25 years' experience in technology, Vin Vashishta, Founder and technical strategist at V Squared, is a recognized AI thought leader. Named a LinkedIn Top Voice, Gartner Ambassador, and an IBM and SAP insider, Vin has built a community of over 200K followers across social media, including tech leaders: Uber, Microsoft, Salesforce, NVIDIA, and Intel. His recently published book, From Data to Profit, is considered the go-to playbook for monetizing data and AI. Check out his YouTube channel, The High ROI Data Scientist.

Vin will be presenting the following session:
The Outcomes Economy: A Technical Introduction To AI Agentic Systems, Multi-Simulations, & Ontologies

MF Joe Reis (Salt Lake City) @joereis

MF Joe Reis (Linkedin) is a “recovering data scientist,” and a business-minded data nerd who’s worked in the data industry for 20 years. His responsibilities have ranged from statistical modeling, forecasting, and machine learning to data engineering, data architecture, and everything in between. Joe was co-host of the popular Monday Morning Data Chat (Spotify / Apple) and is currently the host of the Joe Reis Show (Apple / Spotify). Joe is also co-author of the bestselling O'Reilly book: Fundamentals of Data Engineering. Joe also teaches at the University of Utah and runs several meetups, including The Utah Data Engineering Meetup and SLC Python. When he’s not busy running a company, teaching, or creating content, Joe often finds himself DJing/making music, rock climbing, or trail running in the mountains around Salt Lake City, Utah.

Annie Nelson (United States)

Annie Nelson is a Data Analyst at GitLab, content creator, and author of How to Become a Data Analyst. She has a background in psychology, and was previously a nanny and an occupational therapist before teaching herself data analytics and switching careers. Annie now creates content about data careers, when she's not working, traveling, or spending time outdoors.
Check out her interview with John David Ariansen on the How to Get an Analytics Job Podcast, and her YouTube channel: Annie's Analytics.

Annie will be presenting the following session:
The human side of data: Using technical storytelling to drive action

LLM Keynote
Vaibhav Gupta (Seattle)

Vaibhav Gupta (Linkedin) is the Founder and CEO of Boundary, a Y Combinator startup developing a new programming language (BAML) that makes LLMs both easier and more efficient for developers. Across nearly a decade in software engineering, Vaibhav has built predictive pipelines at D. E. Shaw, Face Id at Google, and real-time 3D reconstruction at Microsoft HoloLens. In his free time, Vaibhav dabbles in competitive table tennis and board games, and various aspects of compilers.

Vaibhav will be presenting the LLM Keynote:
LLMs in Production - How to Keep Them from Breaking
#ai

Anne-Claire Baschet (Paris)

Anne-Claire Baschet is the Chief Data & Artificial Intelligence Officer at Mirakl, where she leads the Data and AI strategy, working with teams to leverage AI and machine learning technologies to create value in products, services, and daily operations. Anne-Claire began her career as a Data Scientist at Mercer and AXA, then held various leadership roles in Data at AXA France and in Data & Product at Voyages SNCF, where she led the development of products like the TGVPro mobile app, TGV Max, and the WiFi portal on TGV trains. Before joining Mirakl, Anne-Claire served as Chief Product & Data/AI Officer at Aramis Group.
With nearly two decades of experience in Data, Anne-Claire has spent the past ten years merging Data and Product roles to develop impactful Data & AI products. In June 2018, she received the 'Digital Transformation of the Year' award from Netexplo for her significant contributions to leveraging digital technologies for transformative impact. Her career progression reflects her talent for identifying and harnessing the value of Data and AI to drive company success and align teams for impactful results.
Anne-Claire is co-author of the upcoming title: Crafting impactful AI and Data Products. This will be Anne-Claire’s first appearance speaking in the United States.

Anne-Claire will be co-presenting the following session:
Escape the Data & AI Death Cycle, Enter the Data & AI Product Mindset

Keith Belanger (New Hampshire)

With over 28 years in data management and architecture, Keith Belanger is passionate about all things data. He brings a business-focused approach to designing and leading data solutions, specializing in data modeling across Conceptual, Logical, and Physical layers—from highly normalized 3NF to Dimensional and Data Vault. A recognized Snowflake Data Superhero and the Product Evangelist at SqlDBM, Keith is dedicated to advancing the value of data modeling within modern data practices. Check out Keith's recent discussion regarding the art of data modeling on The Joe Reis Show.

Keith will be presenting the following session:
Data Modeling in the Age of AI

Yoann Benoit (Paris)

Yoann Benoit is currently Co-Founder and Head of Data at Hymaïa, a French consulting and knowledge sharing hub dedicated to crafting impactful Data & AI Products and guiding organizations in developing effective data strategies. He is a regular contributor on topics related to Data Culture and Data & AI Products, and organizes quarterly conferences on subjects like Data Mesh, MLOps and Data / AI Products. His latest endeavor is as Co-Organizer of the Forward Data Conference, which holds its first edition in Paris in October 2024.
Yoann is co-author of the upcoming title: Crafting impactful AI and Data Products. This will be Yoann’s first appearance speaking in the United States.

Yoann will be co-presenting the following session:
Escape the Data & AI Death Cycle, Enter the Data & AI Product Mindset

Jordan Morrow (Salt Lake City)

Jordan Morrow is known as the Godfather of Data Literacy, having helped invent and pioneer the entire field. He is also the founder and CEO of Bodhi Data and currently is the Senior Vice President of Data & AI Transformation for AgileOne, where he helps to utilize data and AI in the total talent management space. He served as the Chair of the Advisory Board for The Data Literacy Project, has spoken at numerous conferences around the world, and is an active voice in the data and analytics community. Jordan also helps companies and organizations around the world, including the United Nations, build and/or understand data literacy.
Outside of his work in data, Jordan is married with 5 kids. Jordan loves fitness and has run multiple ultra marathons. He loves to travel with his wife and family. Jordan loves to read, often reading (or using Audible) to go through multiple books at a time. Jordan is the author of three books, Be Data Literate, Be Data Driven, and Be Data Analytical, as well as Business 101 for the Data Professional, published in December 2024.

Jordan will be presenting the following session:
Elevating Data in the Business - Bring Data and AI Skills to Life

Arthur Delaitre (Paris)

Arthur Delaitre is the AI Catalog Manager at Mirakl. He leads the AI efforts in developing features for catalog onboarding and management on both the Mirakl Marketplace Platform and Mirakl Connect. Passionate about leveraging cutting-edge technologies, he applies Generative AI, multimodal models, and fine-tunes custom Large Language Models (LLMs) to solve complex problems. Result-oriented, he delivers production-ready solutions at scale. This year, his team conceived and developed the Catalog Transformer—an innovative solution that enables sellers to automatically onboard their catalogs. This dramatically reduces onboarding time and sets Mirakl apart in the industry.

Arthur will be co-presenting the following session:
Deployment at scale of an AI system based on custom LLMs: technical challenges and architecture

Andrew Nguyen (SF Bay Area)

Dr. Andrew Nguyen has been working at the intersection of healthcare data and AI for more than a decade. He quickly discovered graph databases and has been using them to harmonize disparate data sources for nearly as long. He has worked for a variety of organizations ranging from academia to startups. Andrew is currently the Head of Data & AI at Best Buy Health, where he leads teams focused on digital biomarkers and generative AI applications in healthcare delivery. Previously, he was the Section Head for Medical Informatics & Data Engineering at Genentech/Roche and also served as co-chair of the Data Working Group of the Alliance for AI in Healthcare.
Before jumping back into industry, he served as chair of the Department of Health Professions and director of the MS in Health Informatics program at the University of San Francisco. He also held an adjunct research associate professor appointment at Temple University. Andrew holds a PhD in biological and medical informatics from the University of California, San Francisco (UCSF) and a BS in electrical and computer engineering from the University of California, San Diego (UCSD). In his spare time, he enjoys photography, hiking/backpacking, and SCUBA diving, and serves as a search manager for his local Search and Rescue team. Andrew is author of the recently published O'Reilly book: Hands-On Healthcare Data.

Andrew will present the following session:
Context Engineering: A Framework for Data Intelligence

Adam Sroka (Edinburgh)

Adam Sroka (LinkedIn) is co-founder and director of Hypercube data & AI consultancy, host of the energy, utilities and trading Hypercube Podcast, board member of the data science & AI innovation centre The Data Lab, author of the Beyond Data Community & Newsletter with 3000+ weekly subscribers, and a LinkedIn Top Voice for data strategy, leadership and management. With an MSc in Photon Science and a PhD in Engineering high-power peak lasers for defence applications, Adam has built deep expertise in machine learning and data strategy with a strong mathematical background. He has over 10 years of industry experience, specialising in solving complex technical problems and building high-performing data teams in the energy sector.

Adam will be presenting the energy session:
Optimisation Platforms for Energy Trading

Lisa Cao (San Francisco) @lisancao

Lisa Cao (LinkedIn) is a former data analyst, now data engineer and software engineer interested in observability, validation, and reliability in data systems. She is a Google Women TechMakers Ambassador, Linux Foundation LiFT recipient for Women in Open Source, founder and chair of the Vancouver Datajam, and lead maintainer of the BiocSwirl project. Currently, Lisa makes her home in the San Francisco Bay Area where she leads project management at DataStrato and is a co-organizer at Data Engineer Things.

Lisa will be presenting the following two sessions:
History and Future of Iceberg REST Catalogs
Fundamentals of DataOps

Serg Masís (Raleigh-Durham-Chapel Hill) @serg-dot-ai

Serg Masís (LinkedIn) is a Climate and Agronomic Data Scientist at Syngenta, a leading agribusiness company with a mission to improve global food security. Whether it pertains to leisure activities, plant diseases, or customer lifetime value, Serg is passionate about providing the often-missing link between data and decision-making. Serg is author of Interpretable Machine Learning with Python, now in its 2nd edition. Serg is also working on two upcoming titles: DIY AI: Step-By-Step Artificial Intelligence Projects for Makers and Hackers, and Building Responsible AI with Python. Learn more at Serg.ai.

Serg will present the following session:
Harvesting Trust: A Case Study on Developing Dependable Specialist Chatbots

Data Mesh Keynote
Jean-Georges Perrin (Albany, New York) @jgp

Jean-Georges Perrin (Wikipedia / LinkedIn) is an IT software engineer, lecturer, and serial entrepreneur from Alsace, France. The first French citizen to become an IBM Champion in 2009, he became a Lifetime IBM Champion in 2021, and a PayPal champion in 2024. Formerly Intelligence Platform Lead at PayPal, Jean-Georges is currently co-founder and Chief Innovation Officer at Abea Data. Jean-Georges is author of Spark in Action from Manning, and co-author of the upcoming Implementing Data Mesh from O'Reilly. Check out his thoughts on Data Mesh on YouTube.

JGP will present the following #dataquality session:
Data Mesh is the Grail, Bitol is your Journey

Weidong Yang (San Francisco)

Weidong Yang is the founder and CEO of Kineviz. He holds a doctorate in Physics and a Master's in Computer and Information Science. After conducting theoretical and experimental research on quantum dots, Weidong worked for 10 years as a product manager and R&D scientist in the semiconductor industry, where he invented Diffraction-based Overlay technology to improve the manufacturing precision of silicon wafers. He has been awarded 11 US patents and has contributed to 20+ peer-reviewed publications.
Weidong also co-founded Kinetech Arts, a non-profit organization that brings dancers and engineers together to explore the creative potential of making art via new technologies.

Weidong will present the following #graphday session:
GraphBI: Expanding Analytics to All Data Through the Combination of GenAI, Graph, and Visual Analytics

Michael Hunger (Dresden) @mesirii.de

Michael Hunger (LinkedIn) has been passionate about software development for more than 25 years. Since 2010, he has been working on the open source Neo4j graph database, filling many roles, most recently leading the Neo4j Labs efforts. As caretaker of the Neo4j community and ecosystem, he especially loves to work with graph-related projects, users, and contributors. As a developer, Michael enjoys many aspects of programming languages, learning new things every day, participating in exciting and ambitious open source projects, and contributing to and writing software-related books and articles. Michael has spoken at numerous conferences and helped organize several of them. His efforts got him accepted to the JavaChampions program. Michael helps kids learn to program by running weekly girls-only coding classes at local schools.

Michael will present the GraphRAG session:
The Power of GraphRAG - Successful Architectures and Patterns

Jess Haberman (Boston) @jesshaberman

Jess Haberman is Director of Product Content at Anaconda, where she leads content strategy and education. Previously, Jess was an acquisitions editor at O’Reilly Media, collaborating with tech industry leaders to develop instructional books and online content in data science and data engineering. She has presented at and facilitated technology conferences (O’Reilly’s Strata and Data Superstreams, PyCon US, Scale by the Bay, DataCon LA), webinars, live training courses, podcasts, publishing seminars, and writing retreats. Jess earned her BA in English Literature from Denison University and spent 14 years in nonfiction book publishing.

Jess will be leading the following panel:
The Future of Data Education

Bill Inmon (Castle Rock, Colorado)

Bill Inmon (Wikipedia / LinkedIn) is an American computer scientist, recognized by many as the father of the data warehouse. Inmon wrote the first book, held the first conference, wrote the first column in a magazine and was the first to offer classes in data warehousing. Inmon created the accepted definition of what a data warehouse is - a subject oriented, nonvolatile, integrated, time variant collection of data in support of management's decisions. Bill is among the most prolific and well-known authors in the big data analysis, data warehousing and business intelligence arena. In addition to authoring more than 50 books and 650 articles, Bill has been a monthly columnist with the Business Intelligence Network, EIM Institute and Data Management Review. In 2007, Bill was named by Computerworld as one of the “Ten IT People Who Mattered in the Last 40 Years” of the computer profession.

Bill will be presenting the following session:
How to become a hero - the journey to text

Susan Shu Chang (Toronto) @susan-shu-chang

Susan Shu Chang (Linkedin) is currently Principal Data Scientist at Elastic. Originally trained in Economics, Susan is a 5x PyCon speaker, founder of Indie game studio Quill Game Studios and organizer of the 3700+ member Toronto Women's Data Group. Susan is also author of the upcoming O'Reilly book: Machine Learning Interviews. To learn how she finds time for all this and more, check out her personal site, susanshu.com, for her writings on focus optimization and daily routines.

Susan will present the following session:
Improve your RAG pipelines with semantic re-ranking

Michelle Yi (SF Bay Area) @michelle-yi

Michelle Yi is a technology leader who specializes in machine learning and cloud computing. She has 15 years of experience in the technology industry, contributed to the original IBM Watson showcased on Jeopardy, and enjoys building and leading teams that develop and deploy AI solutions to solve real-world problems. Michelle is passionate about diversity and STEM education/careers for our minority communities, and serves both on the board of Women in Data and as an avid volunteer for Girls Who Code.

Michelle will host the following AI session:
All Your Base Are Belong To Us: Adversarial Attack and Defense
and the following AI Causal Graph workshop:
Causal Graphs in Practice
Michelle will also participate in the following panel:
The Future of Data Education

Amy Hodler (Kettle Falls, Washington)

Amy Hodler is an evangelist for graph analytics, network science, and responsible AI. Amy has decades of experience in emerging tech at companies such as Microsoft, Hewlett-Packard (HP), Hitachi IoT, Neo4j, Cray, and Relational AI. Amy has a love for science history and a fascination for complexity studies. Amy is the co-author of the O'Reilly book: Graph Algorithms, as well as co-author of an upcoming volume on the history of graph analytics.

Amy will co-host the following AI Causal Graph workshop:
Causal Graphs in Practice
Amy will also co-host the following Sunday Data Discussion:
Hyperdimensional Horizons: Exploring Neuromorphic Intelligence and Graph Applications

Clair Sullivan (Breckenridge, Colorado) @cjlovesdata1

Dr. Clair Sullivan is currently the Founder and CEO of Clair Sullivan and Associates, a company dedicated to providing data science consulting services. Prior to starting her company, she was the Director of Data Science at Vail Resorts leading a team of data scientists and machine learning engineers providing production models for operations and marketing. Previously she was a data science advocate at Neo4j, working to expand the community of data scientists and machine learning engineers using graphs to solve challenging problems. She received her doctorate degree in nuclear engineering from the University of Michigan in 2002. After that, she began her career in nuclear emergency response at Los Alamos National Laboratory where her research involved signal processing of spectroscopic data. She spent 4 years working in the federal government on related subjects and returned to academic research in 2012 as an assistant professor in the Department of Nuclear, Plasma, and Radiological Engineering at the University of Illinois at Urbana-Champaign. While there, her research focused on using machine learning to analyze the data from large sensor networks. Deciding to focus more on machine learning, she accepted a job at GitHub as a machine learning engineer while maintaining adjunct assistant professor status at the University of Illinois. In 2021 she joined Neo4j as a Graph Data Science Advocate. Additionally, she founded a company, La Neige Analytics, whose purpose is to provide data science expertise to the ski industry. She has authored 4 book chapters, over 20 peer-reviewed papers, and more than 30 conference papers. Dr. Sullivan was the recipient of the DARPA Young Faculty Award in 2014 and the American Nuclear Society's Mary J. Oestmann Professional Women's Achievement Award in 2015.

Clair will host the following sessions:
Empowering Change: Building and Sustaining a Data Culture from the Ground Up
From Office Cubicles to Independent Success: How to Create a Career and Thrive as a Freelance Data Scientist

Hala Nelson (Alexandria, Virginia)

Hala Nelson (Linkedin) is an Associate Professor of Mathematics at James Madison University. She has a Ph.D. in Mathematics from the Courant Institute of Mathematical Sciences at New York University. Prior to her work at James Madison University, she was a postdoctoral Assistant Professor at the University of Michigan, Ann Arbor. Her research is in the areas of Materials Science, Statistical Mechanics, Inverse Problems, and the Mathematics of Machine Learning and Artificial Intelligence. Her favorite subjects are Optimization, Numerical Algorithms, Mathematics for AI, Mathematical Analysis, Numerical Linear Algebra and Probability Theory. She likes to translate complex ideas into simple and practical terms. To her, most mathematical concepts are painless and relatable, unless the person presenting them either does not understand them very well, or is trying to show off. Other facts: Hala Nelson grew up in Lebanon, during the time of its brutal civil war. She lost her hair at a very young age in a missile explosion. This event and many that followed shaped her interests in human behavior, the nature of intelligence, and AI. Her father taught her Math, at home and in French, until she graduated high school. Her favorite quote from her father about math is, "It is the one clean science." Hala is author of the recent O'Reilly book: Essential Math for AI.

Hala will host the following session:
Adopting AI in a Large Complex Organization - Aspiration vs Reality
Hala will also participate in the following panel:
The Future of Data Education

Jessica Talisman (Santa Cruz) @jtalisman

Jessica Talisman is a taxonomist, information architect, and professional data wrangler. Over her 25 years of experience in the information & data architecture world, Jessica has worked as an academic librarian and for The US Department of Justice, Overstock.com, Pluralsight and Amazon. She currently works as a Senior Information Architect for Adobe. Jessica holds a Master of Library and Information Science and a Masters in Teaching. Check out Jessica’s recent interviews on the Monday Morning Data Chat, Discovering Data, Knowledge Graph Insights, Blueprints for Success, Catalog and Cocktails, Data Dialogues, Data Democracy, The AI Digest, and How AI is Built.

Jessica will present the following session:
We Are All Librarians, Systems for Organizing in the Age of AI

Juan Sequeda (Austin) @juansequeda

Dr. Juan Sequeda is the Principal Scientist and Head of the AI Lab at data.world. He holds a PhD in Computer Science from The University of Texas at Austin. Juan’s research and industry work has been on the intersection of data and AI, with the goal to reliably create knowledge from inscrutable data, specifically designing and building Knowledge Graphs for enterprise data and metadata management. Juan is the co-author of the book “Designing and Building Enterprise Knowledge Graphs” and the co-host of Catalog and Cocktails, an honest, no-bs, non-salesy data podcast.
Juan has researched and developed technology on semantic data virtualization, graph data modeling, schema mapping and data integration methodologies. He pioneered technology to construct knowledge graphs from relational databases, resulting in W3C standards, research awards, patents, software and his startup Capsenta, which was acquired by data.world in 2019. Juan strives to build bridges between academia and industry as former co-chair of the LDBC Property Graph Schema Working Group, member of the LDBC Graph Query Languages task force, and standards editor at the World Wide Web Consortium (W3C). Juan continues to be an active member of the scientific community through academic research partnerships, advising students, and serving as a member of data and AI scientific conference committees.

Juan will present the following session:
How to Start Investing in Semantics and Knowledge: A Practical Guide

Ryan Dolley (Detroit)

Ryan Dolley is Vice President of Product Strategy at GoodData, and one half of the Super Data Brothers. Check out his discussion on the evolution of BI and moving beyond dashboards on a recent episode of the Joe Reis Show.

Max De Marzi (Chicago)

Max De Marzi (Linkedin) is addicted to graphs. You may consider him a graph database enthusiast. He spent 8 years at Neo4j and recently made the switch to AWS Neptune. He is a blogger and an open source contributor, both activities that stem from his passion: teaching people about graphs. He is always open to talk graphs, always learning, and nothing thrills him more than finding easy graph solutions to hard relational problems. He has been helping people get to the "graph epiphany" for over a decade. He is an avid graph database modeler, leveraging his knowledge of mechanical sympathy and experience to deliver dozens of graph use cases over the years.

Max will be presenting the following #graphday session:
Modeling in Graph Databases

Malcolm Hawker (Melbourne Beach) @malhawker

Former Gartner Analyst and Profisee Head of Data Strategy, Malcolm Hawker is a recognized thought leader and one of the industry’s foremost authorities on the topics of data strategy, data governance, and master data management. As the co-author of the last three Gartner MDM Magic Quadrant™ documents, Malcolm has consulted with thousands of CDOs and other data leaders from across the globe on their biggest data-related challenges. In a career that spans three decades, Malcolm has held executive-level IT and Product leadership roles at F500 companies, and has a unique combination of experience as a leader, implementer, vendor, and consultant for enterprise-class data solutions. Having lived in Austin for a big portion of his professional life, Malcolm has deep ties to Texas and the amazing data professionals that call it home.

Malcolm will be presenting the following Data Governance session:
Data Governance – It’s Time to Start Over

William Lyon (SFBay) @lyonwj

William Lyon (LinkedIn) is an AI Engineer at Hypermode, where he works to improve the developer experience of building full stack data applications. Previously he worked as a software developer at Neo4j and other startups. William holds a Master's degree in Computer Science from the University of Montana. William is author of the Manning publication Full Stack GraphQL Applications With React, Node.js, and Neo4j. You can find him online at lyonwj.com.

Will will be presenting the following #graphday session:
WTF Is A Triple? My Journey From Neo4j To Dgraph

Alex Dean (London)

Alex Dean is CEO of Snowplow, which he co-founded with Yali Sassoon in 2012. Snowplow is the Customer Data Infrastructure for AI: the product generates AI-ready first-party behavioral data to power advanced analytics, ML and GenAI, all delivered in real-time with shift-left data governance and quality. Alex has been working at the intersection of data platforms, analytics & AI, and consumer behavior his whole career; prior to Snowplow, Alex worked on Business Intelligence projects at Deloitte Consulting, and on clickstream data pipelines at adtech pioneer OpenX.
Alex is the author of Event Streams in Action.

Alex will be presenting the following session:
Towards a sensory system for AI agents

David Hughes (Seattle)

David Hughes is the Principal Graph Consultant for Graphable. He has 10 years of experience designing and building graph solutions which surface meaningful insights. His background includes clinical practice, medical research, software development, and cloud architecture. David has worked in healthcare and biotech within the intensive care, interventional radiology, oncology, cardiology, and proteomics domains. He enjoys endurance running, hiking, and spending time with his family in the outdoors when he is not enabling clients to have data epiphanies from their complex data.

David will be co-presenting the following #graphday session:
Unleashing the Power of Multimodal GraphRAG: Integrating Image Features for Deeper Insights

Patrick McFadin (SF Bay) @patrickmcfadin

Patrick McFadin (Linkedin) is the VP of Principal Technical Strategist at DataStax, working on the intersection of distributed data and AI. He is also a committer on the Apache Cassandra project and a co-author of the O’Reilly book “Managing Cloud-Native Data on Kubernetes.” Before his current role, Patrick worked as Chief Evangelist for Apache Cassandra and as a consultant for DataStax, where he helped build some of the largest and most exciting deployments in production. Before joining DataStax, he held positions as Chief Architect, Engineering Lead, and Database DBA/Developer.

Patrick will present the following session:
Moving Beyond Text-to-SQL: Reliable Database Access through LLM Tooling

Ryan Wisnesky (San Francisco) @gremlinmorgoth

Ryan Wisnesky obtained B.S. and M.S. degrees in mathematics and computer science from Stanford University and a Ph.D. in computer science from Harvard University, where he studied the design and implementation of provably correct software systems. While at IBM Research Almaden he contributed to the Clio, Orchid, and HIL projects. While a postdoctoral associate in the MIT department of mathematics, he developed the CQL query language for ontology manipulation based on category theory. He is currently exploring applications of CQL to safe AI as CTO of Conexus AI.

Ryan will present the following two sessions:
1. Ontologies vs Ologs vs Graphs
2. Validating LLM-Generated SQL Code: A mathematical approach

Chris Tabb (London)

Chris Tabb, co-founder of LEIT DATA, started his career in the Business Intelligence/Analytics domain 30 years ago. He began at Cognos in the 90’s, working in the back office before becoming an expert in all their products, and left to become an independent BI consultant in 1998. Chris has followed the evolution of the analytics industry, working hands-on with all the technologies in the ecosystem: Databases, ETL/ELT, BI/OLAP/Visualisation Tools, Big Data Technologies, and Infrastructure (On-premises / Cloud) across many vendors, some old, some new. Recently, with a focus on the Modern Data Stack Evolution, Chris has started many movements centered on Business Value, using a number of hashtags to raise awareness (#bringbackdatamodelling / #bringbackdatamodeling, #bringbackdocumention) under the umbrella of #meandatastreets, which focuses on simplifying Data Platform architecture and on Business Value.

Chris will present the following session:
The Force multiplier effect. How data platform foundations drive efficiency

Matthew Housley (Salt Lake City)

Matthew Housley, “Recovering Data Scientist”, is Co-Founder / CTO of Ternary Data. Also a “Reformed Academic,” Matthew holds a PhD in Math and dual Masters degrees in both Math and Physics. It was only natural that he began his career in Academia as a Professor of Mathematics, before joining one of the largest e-commerce companies as a data scientist. Matt's STEM background in combination with his knack for teaching makes him a mastermind at overhauling processes, improving teamwork, and incorporating engineering best practices so that real value is delivered to companies. While making the journey from data scientist to data engineer, Matt began to focus more on data & cloud engineering, working extensively with Amazon Web Services, Google Cloud Platform, Containers, Apache Airflow and GPUs, among other technologies. Matt (or should we say, “Dr. Housley”) is an adjunct faculty member in the David Eccles School of Business at The University of Utah. Matt is co-host of the popular Monday Morning Data Chat (Spotify / Apple) and co-author of the bestselling O'Reilly book: Fundamentals of Data Engineering.

Leann Chen (Minneapolis)

Leann Chen (LinkedIn / YouTube / GitHub) is a Generative AI Developer Advocate at Diffbot, where she specializes in creating educational content to leverage the power of knowledge graphs to improve LLM-based systems. Leann first hit our radar when she created a knowledge-graph-powered RAG to use as a recommendation assistant for Data Day Texas 2024 (video). Check out Leann's recent interview on the Neo4j channel, and her latest video: Reliable Graph RAG with Neo4j and Diffbot.

Leann will present the following session:
Back to the basics of LLM pipeline for production