Production · Mrityunjay Kumar

Text Clustering at scale

13 Apr 2018, 15:54

nlp / production / text / clustering / bigdata

Hello there! Today we will get an overview of Databricks clusters and see how to run the model using a community account.

Prerequisite

- Basic understanding of programming in Python or Scala.
- Knowledge of or experience with Java, SQL, or PySpark is beneficial but not essential.

Objective

After reading this blog, readers will be able to:

- Use the core Spark APIs to operate on text data.
- Build data pipelines and query large data sets using Spark SQL and DataFrames.