One platform for social media data ingestion, pre-processing, and analysis.

Searching Social Media

SMILE empowers researchers to search content from multiple social media platforms all at once. Find tweets, individual user accounts, and comments that match specific criteria, and analyze content with just a few clicks. Both live and historical data are available for search.

Named Entity Recognition

Named-entity recognition (NER), also known as entity identification, entity chunking, and entity extraction, is a subtask of information extraction that locates named-entity mentions in unstructured text and classifies them into predefined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, and percentages.

Automated Phrase Mining

As one of the fundamental tasks in text analysis, phrase mining aims to extract quality phrases from a text corpus. Phrase mining is important in tasks such as information extraction and retrieval, taxonomy construction, and topic modeling.

Text Classification

Text classification is one of the most important and typical tasks in supervised machine learning (ML). It assigns one or more classes to a document according to its content. Classes are selected from a previously established taxonomy (categories or classes), which is usually built by human hand labeling.
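
As a rough illustration of supervised text classification, the sketch below trains a small multinomial Naive Bayes classifier on hand-labeled examples. Naive Bayes is only one possible technique and is not necessarily what SMILE uses; the function names, the labels, and the training sentences are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(labeled_docs):
    """Count words per class and documents per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in labeled_docs:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab

def classify(text, model):
    """Return the class with the highest log-probability (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior: share of training documents carrying this label.
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical hand-labeled training data (the "taxonomy" has two classes).
training = [
    ("flood warning issued for the river", "weather"),
    ("heavy rain and storm tonight", "weather"),
    ("team wins the final match", "sports"),
    ("coach praises the team after match", "sports"),
]
model = train(training)
prediction = classify("storm and rain near the river", model)
```

With this toy data the unseen sentence is assigned to the "weather" class, because more of its words were seen in the weather examples.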

Network Analysis

Social network analysis is the process of investigating social structures through the use of networks and graph theory. It characterizes networked structures in terms of nodes (individual actors, people, or things within the network) and the ties, edges, or links (relationships or interactions) that connect them.
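
To make the node-and-edge view concrete, here is a minimal sketch that computes degree centrality (the share of other nodes each node is directly connected to) for an undirected graph. The edge list and user names are hypothetical, and degree centrality is just one of many possible network measures.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Return each node's degree divided by (n - 1), for an undirected graph."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Hypothetical mention network: each pair means two users interacted.
edges = [("alice", "bob"), ("alice", "carol"), ("alice", "dave"), ("bob", "carol")]
centrality = degree_centrality(edges)
most_central = max(centrality, key=centrality.get)
```

Here "alice" is connected to every other node, so she has the highest centrality in this toy network.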

Natural Language Preprocessing

Tokenization is the process of dividing written text into meaningful units, such as words, sentences, or topics. Lemmatization and stemming reduce word forms to a common base form. Part-of-speech tagging is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context.
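
The first two steps can be sketched in a few lines. This is a deliberately crude illustration: the tokenizer is a single regular expression, and the stemmer strips a handful of suffixes rather than implementing a real algorithm such as Porter stemming. The function names and suffix list are assumptions made for the example, and part-of-speech tagging is not covered.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", text.lower())

SUFFIXES = ("ing", "ed", "es", "s")  # illustrative, far from complete

def crude_stem(token):
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

tokens = tokenize("Researchers searched tweets and comments.")
stems = [crude_stem(t) for t in tokens]
```

After this runs, `tokens` holds the five lowercase words of the sentence and `stems` holds their shortened forms, for example "searched" becomes "search".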

Sentiment Analysis

Sentiment analysis (sometimes known as opinion mining or emotion AI) refers to the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information.
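
One simple way to quantify sentiment is a lexicon-based score, sketched below. The word list, its weights, and the negation rule are hypothetical and far smaller than any real sentiment lexicon; this is an illustration of the idea, not a description of the method SMILE applies.

```python
import re

# Tiny hypothetical lexicon; real lexicons contain thousands of scored terms.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "hate": -2}
NEGATIONS = {"not", "never", "no"}

def sentiment_score(text):
    """Sum word polarities, flipping the sign of a word that follows a negation."""
    score = 0
    negate = False
    for token in re.findall(r"[a-z']+", text.lower()):
        if token in NEGATIONS:
            negate = True
            continue
        polarity = LEXICON.get(token, 0)
        score += -polarity if negate else polarity
        negate = False  # a negation only affects the next token
    return score

positive = sentiment_score("I love this great platform")
negated = sentiment_score("not good at all")
```

A positive total suggests positive sentiment and a negative total suggests negative sentiment; words outside the lexicon contribute nothing.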

Topic Modeling

One of the primary applications of natural language processing is to automatically extract the topics people are discussing from large volumes of text. Topic modeling is a type of statistical modeling for discovering the abstract topics that occur in a collection of documents. Latent Dirichlet Allocation (LDA) is an example of a topic model and is used to assign the text in a document to topics. It builds a topics-per-document model and a words-per-topic model, both modeled as Dirichlet distributions.
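
The two Dirichlet-distributed components can be written out as the standard LDA generative process. The symbols below follow common textbook notation and do not appear in the text above: \alpha and \beta are the Dirichlet hyperparameters, \theta_d is the topic mixture of document d, \phi_k is the word distribution of topic k, and z_{d,n} is the topic assigned to the n-th word w_{d,n} of document d.

```latex
\begin{aligned}
\theta_d &\sim \mathrm{Dirichlet}(\alpha) && \text{topics per document} \\
\phi_k &\sim \mathrm{Dirichlet}(\beta) && \text{words per topic} \\
z_{d,n} &\sim \mathrm{Categorical}(\theta_d) && \text{topic of the } n\text{-th word} \\
w_{d,n} &\sim \mathrm{Categorical}(\phi_{z_{d,n}}) && \text{observed word}
\end{aligned}
```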

Clowder

Clowder is a research data management system. You can choose to add search results and analytics outputs to Clowder from within SMILE. A cluster of extraction services then processes the data to extract metadata and create web-based data previews and visualizations.
