{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Text Classification Assessment - Solution\n",
"This assessment closely follows the Text Classification Project we just completed, and it uses a very similar dataset.\n",
"\n",
"The **moviereviews2.tsv** dataset contains the text of 6000 movie reviews: 3000 positive and 3000 negative. The text has been preprocessed and stored as a tab-delimited file. As before, labels are given as `pos` and `neg`.\n",
"\n",
"We've included 20 reviews that either contain `NaN` data or consist only of whitespace.\n",
"\n",
"For more information on this dataset visit http://ai.stanford.edu/~amaas/data/sentiment/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Task #1: Perform imports and load the dataset into a pandas DataFrame\n",
"For this exercise you can load the dataset from `'../TextFiles/moviereviews2.tsv'`."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"| | label | review |\n",
"|---|---|---|\n",
"| 0 | pos | I loved this movie and will watch it again. Or... |\n",
"| 1 | pos | A warm, touching movie that has a fantasy-like... |\n",
"| 2 | pos | I was not expecting the powerful filmmaking ex... |\n",
"| 3 | neg | This so-called \"documentary\" tries to tell tha... |\n",
"| 4 | pos | This show has been my escape from reality for ... |\n", "