
{
  "name": "Data Commons",
  "tagline": "Enabling working with data at scale",
  "body": "### The Data Commons Project\r\nData commons community is a group of passionate data engineers and data scientists at ThoughtWorks. Our goal is to provide a collection of rich, high-performance libraries to automate various data processing tasks at scale. We are currently building these tools to work with Apache Spark platform.\r\n\r\n### Active Projects\r\nWe are building a set of scalable, high-performance libraries that address an array of data processing concerns such as data quality assurance, data preparation for machine learning, data anonymization and data security. Here is a list of currently active projects \r\n* [**prep-buddy**](http://data-commons.github.io/prep-buddy) - A Scala / Java / Python library for cleansing, transforming and preparing large datasets for ML operations on Apache Spark.\r\n* [**protectr**](http://data-commons.github.io/protectr) - A Scala / Java / Python library for anonymization, encryption and redaction operations for large datasets on Apache Spark.\r\n\r\n\r\n### Support or Contact\r\nCatch up with us at our google group data-commons-toolchain@googlegroups.com\r\n",
  "note": "Don't delete this file! It's used internally to help with page regeneration."
}