Description
Quickly detect, troubleshoot, and prevent a wide range of data issues through data observability, a set of best practices that enables data teams to gain greater visibility of data and its usage. If you're a data engineer, data architect, or machine learning engineer who depends on the quality of your data, this book shows you how to focus on the practical aspects of introducing data observability in your everyday work. Author Andy Petrella helps you build the right habits to identify and solve data issues, such as data drifts and poor quality, so you can stop their propagation in data applications, pipelines, and analytics. You'll learn ways to introduce data observability, including setting up a framework for generating and collecting all the information you need.
Learn the core principles and benefits of data observability
Use data observability to detect, troubleshoot, and prevent data issues
Follow the book's recipes to implement observability in your data projects
Use data observability to create a trustworthy communication framework with data consumers
Learn how to educate your peers about the benefits of data observability
About the Author
Andy Petrella has been in the data industry for almost 20 years, starting his career as a software engineer and data miner in the GIS space. He has evangelized big data for more than a decade, especially Apache Spark, for which he created the Spark-Notebook (which has 3,100 stars on GitHub). While evangelizing Spark and helping hundreds of companies in the US and the EU work on their data pipelines and models, he has witnessed the lack of visibility and control over data jobs after they are deployed in production. Since 2015, he has been talking to tech and data-savvy people to build a sustainable solution for this problem.