Prodigy · An annotation tool for AI, Machine Learning & NLP

Radically efficient machine teaching.
An annotation tool powered
by active learning.

From the makers of spaCy

pip install ./prodigy.whl
Successfully installed prodigy

prodigy ner.manual reviews_ner en_core_web_sm ./data.jsonl --label PRODUCT,PERSON,ORG

✨ Starting the web server on port 8080...
Open the app in your browser and start annotating!
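
The command above reads its examples from ./data.jsonl. As a rough sketch, assuming the input is newline-delimited JSON with one object per line and the raw text stored under a "text" key, such a file could be produced with the standard library. The helper name write_jsonl and the sample sentences are made up for illustration.

```python
import json
from pathlib import Path


def write_jsonl(path, texts):
    """Write one JSON object per line, each holding the raw text under "text"."""
    with Path(path).open("w", encoding="utf-8") as f:
        for text in texts:
            f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")


# Hypothetical sample reviews, purely illustrative.
sample_texts = [
    "The new Pixel phone from Google arrived yesterday.",
    "Tim Cook presented the latest MacBook at the Apple event.",
]

if __name__ == "__main__":
    write_jsonl("data.jsonl", sample_texts)
```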

Train a new AI model in hours

Prodigy is a scriptable annotation tool so efficient that data scientists can do the annotation themselves, enabling a new level of rapid iteration.

Today’s transfer learning technologies mean you can train production-quality models with very few examples. With Prodigy you can take full advantage of modern machine learning by adopting a more agile approach to data collection. You'll move faster, be more independent and ship far more successful projects.

How it works

The missing piece in your data science workflow

Prodigy brings together state-of-the-art insights from machine learning and user experience. With its continuous active learning system, you're only asked to annotate examples the model does not already know the answer to. The web application is powerful, extensible and follows modern UX principles. The secret is very simple: it's designed to help you focus on one decision at a time and keep you clicking – like Tinder for data.
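
One common way to pick "examples the model does not already know the answer to" is uncertainty sampling: score each example, then only ask about the ones whose score sits close to 0.5. The sketch below is a generic illustration of that idea, not Prodigy's internal implementation. The function names, the threshold and the scores are all assumptions.

```python
def uncertainty(score):
    """Return how unsure a binary score is: 1.0 at 0.5, 0.0 at 0 or 1."""
    return 1.0 - abs(score - 0.5) * 2.0


def prefer_uncertain(scored_stream, threshold=0.5):
    """Yield only the examples uncertain enough to be worth asking about.

    scored_stream yields (score, example) pairs, with score in [0, 1].
    """
    for score, example in scored_stream:
        if uncertainty(score) >= threshold:
            yield example


# Hypothetical model scores, for illustration only.
scored = [
    (0.98, {"text": "clearly positive"}),
    (0.52, {"text": "hard to say"}),
    (0.03, {"text": "clearly negative"}),
    (0.40, {"text": "borderline"}),
]
to_annotate = list(prefer_uncertain(scored))
```

With these made-up scores, only the "hard to say" and "borderline" examples would be queued for annotation. The confident predictions are skipped.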

Everyone knows data scientists should spend more time looking at their data. When good habits are hard to form, the trick is to remove the friction. Prodigy makes the right thing easy, encouraging you to spend more time understanding your problem and interpreting your results.

Try the demo

Named Entities · Text Classification · Images · Free-form

Try it live and highlight entities!

Try it live and select text categories!

Try it live and draw bounding boxes!

Try it live and type some text!

Try out new ideas quickly

Annotation is usually the part where projects stall. Instead of having an idea and trying it out, you start scheduling meetings, writing specifications and dealing with quality control. With Prodigy, you can have an idea over breakfast and get your first results by lunch. Once the model is trained, you can export it as a versioned Python package, giving you a smooth path from prototype to production.

What others say

Andy Halterman
@ahalterman
Mordecai would not have been possible without @explosion_ai's Prodigy. A lack of labeled data held geoparsing back for years. It took a week to fix that with Prodigy.

Raphael Cohen
@cohenrap
Prodi.gy is by far the best ROI we had on any tool!

FullFact
@FullFact
We've collected 25,000+ annotations from 90 fantastic volunteers, to support our automated factchecking work thanks to Prodigy, an annotation tool created by @explosion_ai.

Andrew Trask
@iamtrask
I'm a huge fan of everything @explosion_ai does... they're brilliant... and their new annotation tool is the best I've ever seen.

Oliver Beavers
@oliverbeavers
just finishing up first major #nlp project with @explosion_ai's prodigy active learning platform. in 3 hours, did what took > 10 volunteers, painful google sheets nonsense, and weeks worth of time. game changer. #yesimshilling

David Campion
@Orbis_21
“Text Classification: Be lazy, use Prodi.gy (a tool by @explosion_ai) !”. This tool (prodi.gy) is fantastic and really help us to speed-up and build our models.

Ajinkya Kale
@ajinkyakale
Its amazing, every time i try to build something in house these guys beat me at it providing an awesome solution out of the box!

Fully scriptable and extensible

Prodigy is fully scriptable, and slots neatly into the rest of your Python-based data science workflow. As the makers of spaCy, a popular library for Natural Language Processing, we understand how to make tools programmers love. The simple secret is this: programmers want to be able to program. Good developer tools need to let you in, not lock you out. That's why Prodigy comes with a rich Python API, elegant command-line integration, and a super productive Jupyter extension. Using custom recipe scripts, you can adapt Prodigy to read and write data however you like, and plug in custom models using any of your favourite frameworks.

recipe.py

import prodigy
from prodigy.components.loaders import JSONL

@prodigy.recipe("custom")
def custom_recipe(dataset, source):
    return {
        "dataset": dataset,
        "stream": JSONL(source),
        "view_id": "classification"
    }

Command-line usage
prodigy custom my_dataset ./data.jsonl -F recipe.py
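
As a sketch of what "read data however you like" could look like: assuming a recipe's "stream" may be any iterable of task dictionaries, a plain generator could stand in for the JSONL loader used in recipe.py. The function name custom_stream and the min_length filter are hypothetical and not part of Prodigy's API.

```python
import json


def custom_stream(path, min_length=1):
    """Yield task dicts from a JSONL file, skipping blank lines and short texts."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate empty lines in the input file
            task = json.loads(line)
            # Keep only tasks whose "text" is at least min_length characters.
            if len(task.get("text", "")) >= min_length:
                yield task
```

Under that assumption, the recipe would return "stream": custom_stream(source) in place of JSONL(source). The command-line call would stay the same.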

Browse features

Named Entity Recognition

Text Classification

Dependencies & Relations

Computer Vision

Audio & Video

A/B Evaluation
