ML.NET Tutorial - Get started in 10 minutes

Download and add data

Download the Wikipedia detox dataset and save it as wikipedia-detox-250-line-data.tsv in the myMLApp directory you created.

Each row in wikipedia-detox-250-line-data.tsv represents a different comment left by a user on Wikipedia. The first column holds the sentiment of the text (0 is non-toxic, 1 is toxic), and the second column holds the comment itself. The columns are separated by tabs, and the first row is a header. The data looks like the following:

wikipedia-detox-250-line-data.tsv
Sentiment	SentimentText
1	        ==RUDE== Dude, you are rude upload that carl picture back, or else.
1	        == OK! ==  IM GOING TO VANDALIZE WILD ONES WIKI THEN!!!
0	        I hope this helps.
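
If you want to sanity-check the file layout before loading it into Model Builder, a short script outside the tutorial can do it. The following is a minimal Python sketch, not part of ML.NET. It parses the three sample rows shown above from an in-memory string. The names SAMPLE_TSV and read_rows are illustrative assumptions. To check the real file, pass an open file handle instead.

```python
import csv
import io

# Sample rows copied from the excerpt above; the real file is assumed to
# follow the same two-column, tab-separated layout with a header row.
SAMPLE_TSV = (
    "Sentiment\tSentimentText\n"
    "1\t==RUDE== Dude, you are rude upload that carl picture back, or else.\n"
    "1\t== OK! ==  IM GOING TO VANDALIZE WILD ONES WIKI THEN!!!\n"
    "0\tI hope this helps.\n"
)


def read_rows(handle):
    """Yield (sentiment, text) pairs from a tab-separated file with a header."""
    # QUOTE_NONE keeps quote characters inside comments as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield int(row["Sentiment"]), row["SentimentText"].strip()


rows = list(read_rows(io.StringIO(SAMPLE_TSV)))
for sentiment, text in rows:
    label = "toxic" if sentiment == 1 else "non-toxic"
    print(f"{label}: {text}")
```

To run the same check against the downloaded file, replace io.StringIO(SAMPLE_TSV) with open("wikipedia-detox-250-line-data.tsv", encoding="utf-8", newline="").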

Add data

In Model Builder, you can add data from a local file or connect to a SQL Server database. In this case, you will add wikipedia-detox-250-line-data.tsv from a file.

In the drop-down, select File as the input data source. Then, in Select a file, find and select wikipedia-detox-250-line-data.tsv.

Under Column to predict (Label), select "Sentiment". The Label is what you are predicting, which in this case is the Sentiment value found in the first column of the dataset.

The columns that are used to help predict the Label are called Features. In this case, the comment text is the only Feature, so leave "SentimentText" checked as the Input Column (Feature).
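
The Label/Feature split can also be illustrated in code. The sketch below is plain Python, not Model Builder's internals. It takes the three sample rows and separates the column to predict from the input columns. The function name split_label_and_features and its label_index parameter are hypothetical.

```python
# Hypothetical in-memory rows: (Sentiment, SentimentText) pairs taken from
# the sample data shown earlier in this tutorial.
sample_rows = [
    (1, "==RUDE== Dude, you are rude upload that carl picture back, or else."),
    (1, "== OK! ==  IM GOING TO VANDALIZE WILD ONES WIKI THEN!!!"),
    (0, "I hope this helps."),
]


def split_label_and_features(rows, label_index=0):
    """Separate the column to predict (Label) from the input columns (Features)."""
    labels = [row[label_index] for row in rows]
    features = [
        tuple(value for i, value in enumerate(row) if i != label_index)
        for row in rows
    ]
    return labels, features


labels, features = split_label_and_features(sample_rows)
print(labels)          # the Sentiment column: what the model learns to predict
print(features[2][0])  # the SentimentText of the third row: the model's input
```

With only two columns, each feature tuple holds a single value. A dataset with more input columns would produce longer tuples.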

After adding your data, go to the Train step.
