Text Analysis Using Turicreate

Last Updated: 23 May, 2024

What is Text Analysis?

Text is a group of words or sentences. Text analysis is the process of examining text and extracting information from it. Text data can be a major factor in how well a company performs. For example, people buy things on an e-commerce website; with text analysis, the website can learn what its customers like and use that data to improve its productivity. Voice assistants such as Alexa and Google Home Mini work using text analysis and machine learning algorithms; both are based on Natural Language Processing. Text analysis can also be used to decide whether an e-mail is spam or not.

Text analysis can be done using text mining. Text data can be structured as well as unstructured, and text mining techniques help us differentiate between the two.

Now let's do some text analysis using Turicreate. We will build a model that classifies a message as spam or ham.

Step 1: Import the Turicreate library.

```python
import turicreate as tc
```

Step 2: Load the data set.

```python
data = tc.SFrame("data.csv")
```

Step 3: Explore the data first.

```python
# Print the first few rows of the data
data.head()
```

Output: (image of the dataset preview)

Step 4: Add the word count to the data set. The data has only two columns, Category and Message, so adding a word count column gives the model a feature to learn from.

```python
# The text analytics module has a count_words function.
# It counts the words separately for each row
# of the Message column.
data['word_count'] = tc.text_analytics.count_words(data['Message'])

# The data now has one more column, word_count.
data.head()
```

Output: one more column, word_count, has been added to the data set.

Step 5: Split the data into a training set and a test set.
```python
train_data, test_data = data.random_split(.8, seed=0)
```

Step 6: Build a model for classifying messages as spam or ham.

```python
# We use the word count as our feature and
# 'Category' (spam or ham) as our target.
model = tc.logistic_classifier.create(
    train_data,
    target='Category',
    features=['word_count'],
    validation_set=test_data)
```

Step 7: Check the accuracy of the model.

```python
model.evaluate(test_data)
```

Output: The accuracy is 0.975, which means 97.5%.

Step 8: We can check a prediction manually by comparing it against a row of the test data.

```python
test_data.head()
# We will pick a row that is labelled spam (here, the row at index 1)
# and check whether the model also predicts spam for it.
```

Step 9: Predict on the test data.

```python
model.predict(test_data[1])
```

Output: The result is spam, so the model predicts this row correctly.

Article by abhisheksrivastaviot18
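Step 4 builds its feature with `tc.text_analytics.count_words`, and Step 7 reports an accuracy figure. To make both ideas concrete, here is a minimal sketch in plain Python, with no Turicreate dependency. It is only an approximation under stated assumptions: the helper names `count_words_sketch` and `accuracy` are hypothetical, the tokenization is a simple lowercase-and-split-on-whitespace, and the sample messages and labels are made up for illustration. It is not Turicreate's actual implementation.

```python
from collections import Counter


def count_words_sketch(message):
    """Return a word -> count dictionary for one message.

    Assumption: words are lowercased and split on whitespace only,
    which roughly mirrors a bag-of-words feature.
    """
    return dict(Counter(message.lower().split()))


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


# Made-up example message, for illustration only.
features = count_words_sketch("Free entry free prize")
print(features)

# Made-up labels: three of the four predictions match.
score = accuracy(["spam", "ham", "ham", "spam"],
                 ["spam", "ham", "spam", "spam"])
print(score)
```

Each message becomes a dictionary of word counts, which is the kind of value stored in the `word_count` column, and accuracy is simply the share of test rows whose predicted category matches the true one.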