Description
Machine learning and artificial intelligence (AI) are powerful tools for building predictive models, extracting information, and supporting complex decisions. They typically do this by examining an enormous quantity of labeled training data to find patterns too complex for human observation. However, in many real-world applications, well-labeled data can be difficult, expensive, or even impossible to obtain. In some cases, such as identifying rare objects like new archeological sites or secret enemy military facilities in satellite images, acquiring labels could require months of work by trained human observers at great expense. In other cases, such as predicting disease infection during a pandemic such as COVID-19, reliable true labels may be nearly impossible to obtain early on due to a lack of testing equipment or other factors. In that scenario, identifying even a small amount of truly negative data may be impossible because of the high false negative rate of available tests. In such problems, it is possible to label a small subset of the data as belonging to the class of interest, but it is impractical to manually label all data not of interest. We are left with a small set of positive labeled data and a large set of unlabeled data. Readers will explore this Positive and Unlabeled learning (PU learning) problem in depth. The book rigorously defines the PU learning problem, discusses several common assumptions made about it and their implications, and considers how to evaluate solutions before describing several of the most popular algorithms for solving it. It explores several uses for PU learning, including applications in biology and medicine, business, security, and signal processing. The book also provides high-level summaries of several related learning problems, such as one-class classification, anomaly detection, and noisy learning, and their relation to PU learning.