From f69589d1bc50857b874b42a3d1bab7c891275e96 Mon Sep 17 00:00:00 2001
From: Jonas Mueller <1390638+jwmueller@users.noreply.github.com>
Date: Thu, 18 May 2023 10:14:28 -0700
Subject: [PATCH] add cleanlab to awesome-transformers tools list (#23440)

* add tool to awesome-transformers list

* add keyword list

* sgugger wording suggestion

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---
 awesome-transformers.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/awesome-transformers.md b/awesome-transformers.md
index beec5379a5..36059eafa6 100644
--- a/awesome-transformers.md
+++ b/awesome-transformers.md
@@ -582,3 +582,9 @@ Keywords: IR, Information Retrieval, Dense, Sparse
 
 Keywords: Active Learning, Research, Labeling
 
+## [cleanlab](https://github.com/cleanlab/cleanlab)
+
+[cleanlab](https://github.com/cleanlab/cleanlab) is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels. For text, image, tabular, audio (among others) datasets, you can use cleanlab to automatically: detect data issues (outliers, label errors, near duplicates, etc), train robust ML models, infer consensus + annotator-quality for multi-annotator data, suggest data to (re)label next (active learning).
+
+Keywords: Data-Centric AI, Data Quality, Noisy Labels, Outlier Detection, Active Learning
+