Take advantage of our conference discount and book your room at the AT&T Conference Hotel.

Who will speak at Data Day Texas 2025

We continue to announce speakers and sessions. For the latest speaker / session updates, follow us on LinkedIn.

Metadata Keynote
Ole Olesen-Bagneux (Copenhagen)

Ole Olesen-Bagneux (Linkedin) rethinks data and tech by providing perspectives from Library and Information Science. He holds a PhD in Information Science from the University of Copenhagen, Denmark, where he lectured in courses pivotal for data cataloging, such as Knowledge Organization and Information Retrieval. Ole is author of The Enterprise Data Catalog (O’Reilly). Ole is also author of the upcoming Fundamentals of Metadata Management (O'Reilly, 2025), in which he introduces a completely new architecture for metadata that he calls the Meta Grid. Standing on the shoulders of microservices, which liberated operational data, and data mesh, which liberated analytical data, the Meta Grid aims to liberate metadata. Follow Ole on Medium, and learn more about Meta Grid at Searching For Data.

Bethany Lyons (London)

If you follow data podcasts, no doubt you’ve already heard of Bethany Lyons. In the last year, she has appeared on The Joe Reis Show, Catalog and Cocktails, CDO Matters with Malcolm Hawker, Better Together with Timo Dechau, How to Get an Analytics Job with John David Ariansen, and many others.
Now in her second decade in the data space, Bethany has already held a diverse set of roles. She’s done everything from pre-sales consulting to product management to implementation consulting to building trading algorithms for a hedge fund. Beginning her career with a 7 year stint at Tableau, Bethany was most recently Chief Product Officer at KAWA Analytics, and Senior Product Manager at Salesforce. Currently, she is a Principal Consultant at Assured Insights.

Bethany will host the following 90-minute deep dive:
Automating Financial Reconciliation with Linear Programming and Optimization

AI Engineering Keynote
Chip Huyen (San Francisco) @chipro

Chip Huyen (Linkedin) is a writer and computer scientist, currently at Voltron Data, where she works on GPU-native data processing and open data standards (Ibis, Apache Arrow, Substrait). Previously, Chip built machine learning tools at NVIDIA, Snorkel AI, and Netflix. She also founded Claypot AI, which was acquired. Chip graduated from Stanford University, where she taught CS 329S: Machine Learning Systems Design. Her lectures became the foundation for the book Designing Machine Learning Systems, which after two years, continues to be a #1 bestseller in multiple Amazon categories. Advance copies of her upcoming book, AI Engineering, also from O'Reilly, will be available at Data Day Texas for your perusal. In her free time, Chip travels, writes, and reads. Follow her on GoodReads.

Xinran Waibel (SF Bay Area)


Xinran Waibel (LinkedIn) is a Senior Data Engineer at Netflix, where she builds batch and event-streaming data systems to enable personalization. Prior to Netflix, she was a Data Engineer at Confluent and Target, where she leveraged big data technologies to enable data-driven decision making in the marketing and membership space. An active writer and blogger known for her commitment to data education, Xinran is founder of the 5000+ member Data Engineer Things community. Check out their YouTube channel.

Vin Vashishta (Reno)

With over 25 years' experience in technology, Vin Vashishta, Founder and technical strategist at V Squared, is a recognized AI thought leader. Named a LinkedIn Top Voice, Gartner Ambassador, and an IBM and SAP insider, Vin has built a community of over 200K followers across social media, including tech leaders: Uber, Microsoft, Salesforce, NVIDIA, and Intel. His recently published book, From Data to Profit, is considered the go-to playbook for monetizing data and AI. Check out his YouTube channel, The High ROI Data Scientist.

Vin will be presenting the following session:
The Outcomes Economy: A Technical Introduction To AI Agentic Systems, Multi-Simulations, & Ontologies

Joe MF Reis (Salt Lake City)


Joe MF Reis (LinkedIn), Co-Founder and CEO of Ternary Data, is a “recovering data scientist,” and a business-minded data nerd who’s worked in the data industry for 20 years. His responsibilities have ranged from statistical modeling, forecasting, and machine learning to data engineering, data architecture, and everything else in between. Joe is co-host of the popular Monday Morning Data Chat (Spotify / Apple) as well as the newly launched Joe Reis Show (Apple / Spotify). Joe is also co-author of the bestselling O'Reilly book: Fundamentals of Data Engineering. Joe also teaches at the University of Utah and runs several meetups, including The Utah Data Engineering Meetup and SLC Python. When he’s not busy running a company, teaching, or creating content, Joe often finds himself DJing/making music, rock climbing, or trail running in the mountains around Salt Lake City, Utah.

Annie Nelson (United States)

Annie Nelson is a Data Analyst at GitLab, content creator, and author of How to Become a Data Analyst. She has a background in psychology, and was previously a nanny and an occupational therapist before teaching herself data analytics and switching careers. Annie now creates content about data careers, when she's not working, traveling, or spending time outdoors.
Check out her interview with John David Ariansen on the How to Get an Analytics Job Podcast, and her YouTube channel: Annie's Analytics.

Annie will be presenting the following session:
The human side of data: Using technical storytelling to drive action

Anne-Claire Baschet (Paris)

Anne-Claire Baschet is the Chief Data & Artificial Intelligence Officer at Mirakl, where she leads the Data and AI strategy, working with teams to leverage AI and machine learning technologies to create value in products, services, and daily operations. Anne-Claire began her career as a Data Scientist at Mercer and AXA, then held various leadership roles in Data at AXA France and in Data & Product at Voyages SNCF, where she led the development of products like the TGVPro mobile app, TGV Max, and the WiFi portal on TGV trains. Before joining Mirakl, Anne-Claire served as Chief Product & Data/AI Officer at Aramis Group.
With nearly two decades of experience in Data, Anne-Claire has spent the past ten years merging Data and Product roles to develop impactful Data & AI products. In June 2018, she received the 'Digital Transformation of the Year' award from Netexplo for her significant contributions to leveraging digital technologies for transformative impact. Her career progression reflects her talent for identifying and harnessing the value of Data and AI to drive company success and align teams for impactful results.
Anne-Claire is co-author of the upcoming title: Crafting impactful AI and Data Products. This will be Anne-Claire’s first appearance speaking in the United States.

Anne-Claire will be co-presenting the following session:
Escape the Data & AI Death Cycle, Enter the Data & AI Product Mindset

Yoann Benoit (Paris)

Yoann Benoit is currently Co-Founder and Head of Data at Hymaïa, a French consulting and knowledge sharing hub dedicated to crafting impactful Data & AI Products and guiding organizations in developing effective data strategies. He is a regular contributor on topics related to Data Culture and Data & AI Products, and organizes quarterly conferences on subjects like Data Mesh, MLOps and Data / AI Products. His latest endeavor is as Co-Organizer of the Forward Data Conference, which holds its first edition in Paris in October 2024.
Yoann is co-author of the upcoming title: Crafting impactful AI and Data Products. This will be Yoann’s first appearance speaking in the United States.

Yoann will be co-presenting the following session:
Escape the Data & AI Death Cycle, Enter the Data & AI Product Mindset

Arthur Delaitre (Paris)

Arthur Delaitre is the AI Catalog Manager at Mirakl. He leads the AI efforts in developing features for catalog onboarding and management on both the Mirakl Marketplace Platform and Mirakl Connect. Passionate about leveraging cutting-edge technologies, he applies Generative AI, multimodal models, and fine-tunes custom Large Language Models (LLMs) to solve complex problems. Result-oriented, he delivers production-ready solutions at scale. This year, his team conceived and developed the Catalog Transformer—an innovative solution that enables sellers to automatically onboard their catalogs. This dramatically reduces onboarding time and sets Mirakl apart in the industry.

Arthur will be co-presenting the following session:
Deployment at scale of an AI system based on custom LLMs : technical challenges and architecture

Adam Sroka (Edinburgh)

Adam Sroka (LinkedIn) is co-founder and director of Hypercube data & AI consultancy, host of the energy, utilities and trading Hypercube Podcast, board member of the data science & AI innovation centre The Data Lab, author of the Beyond Data Community & Newsletter with 3000+ weekly subscribers and a LinkedIn Top Voice for data strategy, leadership and management. With an MSc in Photon Science and PhD in Engineering high-power peak lasers for defence applications, Adam has built deep expertise in machine learning and data strategy with a strong mathematical background. He has over 10 years of industry experience, specialising in solving complex technical problems and building high-performing data teams in the energy sector.

Adam will be presenting the energy session:
Optimisation Platforms for Energy Trading

Lisa Cao (San Francisco)

Lisa Cao (LinkedIn) is a former data analyst, now data engineer and software engineer interested in observability, validation, and reliability in data systems. She is a Google Women TechMakers Ambassador, Linux Foundation LiFT recipient for Women in Open Source, founder and chair of the Vancouver Datajam, and lead maintainer of the BiocSwirl project. Currently, Lisa makes her home in the San Francisco Bay Area where she leads project management at DataStrato and is a co-organizer at Data Engineer Things.

Lisa will be presenting the following two sessions:
History and Future of Iceberg REST Catalogs
Fundamentals of DataOps

Serg Masís (Raleigh-Durham-Chapel Hill)

Serg Masís (LinkedIn) is a Climate and Agronomic Data Scientist at Syngenta, a leading agribusiness company with a mission to improve global food security. Whether it pertains to leisure activities, plant diseases, or customer lifetime value, Serg is passionate about providing the often-missing link between data and decision-making. Serg is author of Interpretable Machine Learning with Python, now in its 2nd edition. Serg is also working on two upcoming titles: DIY AI: Step-By-Step Artificial Intelligence Projects for Makers and Hackers, and Building Responsible AI with Python. Learn more at Serg.ai.

Data Mesh Keynote
Jean-Georges Perrin (Albany, New York)

Jean-Georges Perrin (Wikipedia / LinkedIn) is an IT software engineer, lecturer, and serial entrepreneur from Alsace, France. The first French citizen to become an IBM Champion in 2009, he became a Lifetime IBM Champion in 2021, and a PayPal champion in 2024. Formerly Intelligence Platform Lead at PayPal, Jean-Georges is currently co-founder and Chief Innovation Officer at Abea Data. Jean-Georges is author of Spark in Action from Manning, and co-author of the upcoming Implementing Data Mesh from O'Reilly. Check out his thoughts on Data Mesh at Youtube.

Weidong Yang (San Francisco) @wdyang

Weidong Yang is the founder and CEO of Kineviz. He holds a doctorate in Physics and a Masters in Computer and Information Science. After conducting theoretical and experimental research on quantum dots, Weidong worked for 10 years as a product manager and R&D scientist in the Semiconductor industry where he invented Diffraction-based Overlay technology to improve the manufacturing precision of silicon wafers. He has been awarded 11 US patents and has contributed to 20+ peer-reviewed publications.
Weidong also co-founded Kinetech Arts, a non-profit organization that brings dancers and engineers together to explore the creative potential of making art via new technologies.

Weidong will lead the following #graphday session:
GraphBI: Expanding Analytics to All Data Through the Combination of GenAI, Graph, and Visual Analytics

Jess Haberman (Boston) @JessHaberman

Jess Haberman is Director of Product Content at Anaconda, where she leads content strategy and education. Previously, Jess was an acquisitions editor at O’Reilly Media, collaborating with tech industry leaders to develop instructional books and online content in data science and data engineering. She has presented at and facilitated technology conferences (O’Reilly’s Strata and Data Superstreams, PyCon US, Scale by the Bay, DataCon LA), webinars, live training courses, podcasts, publishing seminars, and writing retreats. Jess earned her BA in English Literature from Denison University and spent 14 years in nonfiction book publishing.

Jess will be leading the following panel:
The Future of Data Education

Jonathan Mugan (Austin) @jmugan

Jonathan Mugan (LinkedIn), Principal Scientist at De Umbra, is a researcher specializing in artificial intelligence, machine learning, and natural language processing. His current research focuses on deep learning for natural language generation and understanding. Dr. Mugan received his Ph.D. in Computer Science from the University of Texas at Austin. His thesis centered on developmental robotics, an area of research that seeks to understand how robots can learn about the world in the same way that human children do. Dr. Mugan also held a post-doctoral position at Carnegie Mellon University, where he worked at the intersection of machine learning and human-computer interaction. One of the most requested speakers at the Data Day Texas conferences, he recently also spoke on the topic of NLP at the O’Reilly AI conference, and is the creator of the O’Reilly video course Natural Language Text Processing with Python. Dr. Mugan is also the author of The Curiosity Cycle: Preparing Your Child for the Ongoing Technological Explosion.

Jonathan will present the following session:
What Superintelligence Will Look Like

Bill Inmon (Castle Rock, Colorado)

Bill Inmon (Wikipedia / LinkedIn) is an American computer scientist, recognized by many as the father of the data warehouse. Inmon wrote the first book, held the first conference, wrote the first column in a magazine and was the first to offer classes in data warehousing. Inmon created the accepted definition of what a data warehouse is - a subject oriented, nonvolatile, integrated, time variant collection of data in support of management's decisions. Bill is among the most prolific and well-known authors in the big data analysis, data warehousing and business intelligence arena. In addition to authoring more than 50 books and 650 articles, Bill has been a monthly columnist with the Business Intelligence Network, EIM Institute and Data Management Review. In 2007, Bill was named by Computerworld as one of the “Ten IT People Who Mattered in the Last 40 Years” of the computer profession.

Susan Shu Chang (Toronto) @susan_shuc

Susan Shu Chang (Linkedin) is currently Principal Data Scientist at Elastic. Originally trained in Economics, Susan is a 5x PyCon speaker, founder of Indie game studio Quill Game Studios and organizer of the 3700+ member Toronto Women's Data Group. Susan is also author of the upcoming O'Reilly book: Machine Learning Interviews. To learn how she finds time for all this and more, check out her personal site, susanshu.com, for her writings on focus optimization and daily routines.

Michelle Yi (SF Bay Area) @YulleYi

Michelle Yi is a technology leader who specializes in machine learning and cloud computing. She has 15 years of experience in the technology industry, contributed to the original IBM Watson showcased on Jeopardy, and enjoys building and leading teams that develop and deploy AI solutions to solve real-world problems. Michelle is passionate about diversity and STEM education/careers for our minority communities, and serves both on the board of Women in Data and as an avid volunteer for Girls Who Code.

Michelle will host the following AI session:
All Your Base Are Belong To Us: Adversarial Attack and Defense
Michelle will also participate in the following panel:
The Future of Data Education

Amy Hodler (Kettle Falls, Washington) @amyhodler

Amy Hodler is an evangelist for graph analytics, network science, and responsible AI. Amy has decades of experience in emerging tech at companies such as Microsoft, Hewlett-Packard (HP), Hitachi IoT, Neo4j, Cray, and Relational AI. Amy has a love for science history and a fascination for complexity studies. Amy is the co-author of the O'Reilly book: Graph Algorithms, as well as co-author of an upcoming volume on the history of graph analytics.

Clair Sullivan (Breckenridge, Colorado) @ProfCJSullivan


Dr. Clair Sullivan is currently the Founder and CEO of Clair Sullivan and Associates, a company dedicated to providing data science consulting services. Prior to starting her company, she was the Director of Data Science at Vail Resorts leading a team of data scientists and machine learning engineers providing production models for operations and marketing. Previously she was a data science advocate at Neo4j, working to expand the community of data scientists and machine learning engineers using graphs to solve challenging problems. She received her doctorate degree in nuclear engineering from the University of Michigan in 2002. After that, she began her career in nuclear emergency response at Los Alamos National Laboratory where her research involved signal processing of spectroscopic data. She spent 4 years working in the federal government on related subjects and returned to academic research in 2012 as an assistant professor in the Department of Nuclear, Plasma, and Radiological Engineering at the University of Illinois at Urbana-Champaign. While there, her research focused on using machine learning to analyze the data from large sensor networks. Deciding to focus more on machine learning, she accepted a job at GitHub as a machine learning engineer while maintaining adjunct assistant professor status at the University of Illinois. In 2021 she joined Neo4j as a Graph Data Science Advocate. Additionally, she founded a company, La Neige Analytics, whose purpose is to provide data science expertise to the ski industry. She has authored 4 book chapters, over 20 peer-reviewed papers, and more than 30 conference papers. Dr. Sullivan was the recipient of the DARPA Young Faculty Award in 2014 and the American Nuclear Society's Mary J. Oestmann Professional Women's Achievement Award in 2015.

Clair will host the following sessions:
Empowering Change: Building and Sustaining a Data Culture from the Ground Up
From Office Cubicles to Independent Success: How to Create a Career and Thrive as a Freelance Data Scientist

Hala Nelson (Alexandria, Virginia)

Hala Nelson (LinkedIn) is an Associate Professor of Mathematics at James Madison University. She has a Ph.D. in Mathematics from the Courant Institute of Mathematical Sciences at New York University. Prior to her work at James Madison University, she was a postdoctoral Assistant Professor at the University of Michigan, Ann Arbor. Her research is in the areas of Materials Science, Statistical Mechanics, Inverse Problems, and the Mathematics of Machine Learning and Artificial Intelligence. Her favorite subjects are Optimization, Numerical Algorithms, Mathematics for AI, Mathematical Analysis, Numerical Linear Algebra and Probability Theory. She likes to translate complex ideas into simple and practical terms. To her, most mathematical concepts are painless and relatable, unless the person presenting them either does not understand them very well, or is trying to show off. Other facts: Hala Nelson grew up in Lebanon, during the time of its brutal civil war. She lost her hair at a very young age in a missile explosion. This event and many that followed shaped her interests in human behavior, the nature of intelligence, and AI. Her father taught her Math, at home and in French, until she graduated high school. Her favorite quote from her father about math is, "It is the one clean science". Hala is author of the recent O'Reilly book: Essential Math for AI.

Hala will participate in the following panel:
The Future of Data Education

Jessica Talisman (Santa Cruz)


Jessica Talisman is a taxonomist, ontologist, information architect, and professional data wrangler. Over her 25 years of experience in the information & data architecture world, Jessica has worked in galleries, libraries, museums, the federal government, e-commerce, and tech, and is currently the Information Architect for Sellers Platform at Amazon. Jessica holds a Master of Library and Information Science and Masters in Teaching. Check out Jessica’s recent interviews on the Monday Morning Data Chat and Discovering Data.

Jessica will present the following session:
We Are All Librarians, Systems for Organizing in the Age of AI

Juan Sequeda (Austin) @juansequeda

Dr. Juan Sequeda is the Principal Scientist and Head of the AI Lab at data.world. He holds a PhD in Computer Science from The University of Texas at Austin. Juan’s research and industry work has been on the intersection of data and AI, with the goal to reliably create knowledge from inscrutable data, specifically designing and building Knowledge Graph for enterprise data and metadata management. Juan is the co-author of the book “Designing and Building Enterprise Knowledge Graph” and the co-host of Catalog and Cocktails, an honest, no-bs, non-salesy data podcast.
Juan has researched and developed technology on semantic data virtualization, graph data modeling, schema mapping and data integration methodologies. He pioneered technology to construct knowledge graphs from relational databases, resulting in W3C standards, research awards, patents, software and his startup Capsenta, which was acquired by data.world in 2019. Juan strives to build bridges between academia and industry as former co-chair of the LDBC Property Graph Schema Working Group, member of the LDBC Graph Query Languages task force, and standards editor at the World Wide Web Consortium (W3C). Juan continues to be an active member of the scientific community through academic research partnerships, advising students, and membership on data and AI scientific conference committees.

Juan will present the following session:
How to Start Investing in Semantics and Knowledge: A Practical Guide

Ryan Dolley (Detroit)

Ryan Dolley is a data consultant specializing in BI and analytics, author of the Super Data Blog, and one half of the Super Data Brothers. Check out his discussion on the evolution of BI and moving beyond dashboards on a recent episode of the Joe Reis Show.

Max De Marzi (Chicago) @maxdemarzi

Max De Marzi (LinkedIn) is addicted to graphs. You may consider him a graph database enthusiast. He spent 8 years at Neo4j and recently made the switch to AWS Neptune. He is a blogger and an open source contributor, both activities which stem from passion: teaching people about graphs. He is always open to talk graphs, always learning, and nothing thrills him more than finding easy graph solutions to hard relational problems. He has been helping people get to the "graph epiphany" for over a decade. He is an avid graph database modeler, leveraging his knowledge of mechanical sympathy and experience to deliver dozens of graph use cases over the years.

Max will be presenting the following #graphday session:
Modeling in Graph Databases

William Lyon (SF Bay) @lyonwj

William Lyon (LinkedIn) is an AI Engineer at Hypermode where he works to improve the developer experience of building full stack data applications. Previously he worked as a software developer at Neo4j and other startups. William holds a Masters degree in Computer Science from the University of Montana. William is author of the Manning publication Full Stack GraphQL Applications With React, Node.js, and Neo4j. You can find him online at lyonwj.com.

Will will be presenting the following #graphday session:
WTF Is A Triple? My Journey From Neo4j To Dgraph

David Hughes (Seattle)

David Hughes is the Principal Graph Consultant for Graphable. He has 10 years of experience designing and building graph solutions which surface meaningful insights. His background includes clinical practice, medical research, software development, and cloud architecture. David has worked in healthcare and biotech within the intensive care, interventional radiology, oncology, cardiology, and proteomics domains. He enjoys endurance running, hiking, and spending time with his family in the outdoors when he is not enabling clients to have data epiphanies from their complex data.

David will be co-presenting the following #graphday session:
Unleashing the Power of Multimodal GraphRAG: Integrating Image Features for Deeper Insights

Patrick McFadin (SF Bay) @patrickmcfadin

Patrick McFadin (Linkedin) is the VP of Principal Technical Strategist at DataStax, working on the intersection of distributed data and AI. He is also a committer on the Apache Cassandra project and a co-author of the O’Reilly book “Managing Cloud-Native Data on Kubernetes.” Before his current role, Patrick worked as Chief Evangelist for Apache Cassandra and as a consultant for DataStax, where he helped build some of the largest and most exciting deployments in production. Before joining DataStax, he held positions as Chief Architect, Engineering Lead, and Database DBA/Developer.

Patrick will present the following session:
Moving Beyond Text-to-SQL: Reliable Database Access through LLM Tooling

Ryan Wisnesky (San Francisco)

Ryan Wisnesky obtained B.S. and M.S. degrees in mathematics and computer science from Stanford University and a Ph.D. in computer science from Harvard University, where he studied the design and implementation of provably correct software systems. While at IBM Research Almaden he contributed to the Clio, Orchid, and HIL projects. While a postdoctoral associate in the MIT department of mathematics, he developed the CQL query language for ontology manipulation based on category theory. He is currently exploring applications of CQL to safe AI as CTO of Conexus AI.

Ryan will present the following two sessions:
1. Ontologies vs Ologs vs Graphs
2. Validating LLM-Generated SQL Code: A mathematical approach

Chris Tabb (London)

Chris Tabb, co-founder of LEIT DATA, started his career in the Business Intelligence/Analytics domain 30 years ago. He began at Cognos in the 90’s, working in the back office before becoming an expert in all their products, and left to become an independent BI consultant in 1998. Chris has followed the evolution of the analytics industry, working hands-on with all the technologies in the ecosystems: Databases, ETL/ELT, BI/OLAP/Visualisation Tools, Big Data Technologies, and Infrastructure (On premises / Cloud) across many vendors, some old, some new. Recently, with a focus on the Modern Data Stack Evolution, Chris has started many movements centered on Business Value, using a number of hashtags to raise awareness (#bringbackdatamodelling / #bringbackdatamodeling, #bringbackdocumention) under the umbrella of #meandatastreets, which is focused on simplifying the Data Platform architecture and on Business Value.

Matthew Housley (Salt Lake City)

Matthew Housley, “Recovering Data Scientist”, is Co-Founder / CTO of Ternary Data. Also a “Reformed Academic,” Matthew holds a PhD in Math and dual Masters degrees in both Math and Physics. It was only natural that he began his career in Academia as a Professor of Mathematics, before joining one of the largest e-commerce companies as a data scientist. Matt's STEM background in combination with his knack for teaching makes him a mastermind at overhauling processes, improving teamwork, and incorporating engineering best practices so that real value is delivered to companies. While making the journey from data scientist to data engineer, Matt began to focus more on data & cloud engineering, working extensively with Amazon Web Services, Google Cloud Platform, Containers, Apache Airflow and GPUs, among other technologies. Matt (or should we say, “Dr. Housley”) is an adjunct faculty member in the David Eccles School of Business at The University of Utah. Matt is co-host of the popular Monday Morning Data Chat (Spotify / Apple) and co-author of the bestselling O'Reilly book: Fundamentals of Data Engineering.