"body": "### The Data Commons Project\r\nThe Data Commons community is a group of passionate data engineers and data scientists at ThoughtWorks. Our goal is to provide a collection of rich, high-performance libraries that automate data processing tasks at scale. We are currently building these tools to work with the Apache Spark platform.\r\n\r\n### Active Projects\r\nWe are building a set of scalable, high-performance libraries that address data processing concerns such as data quality assurance, data preparation for machine learning, data anonymization, and data security. The currently active projects are:\r\n* [**prep-buddy**](http://data-commons.github.io/prep-buddy) - A Scala / Java / Python library for cleansing, transforming, and preparing large datasets for ML operations on Apache Spark.\r\n* [**protectr**](http://data-commons.github.io/protectr) - A Scala / Java / Python library for anonymization, encryption, and redaction of large datasets on Apache Spark.\r\n\r\n\r\n### Support or Contact\r\nCatch up with us at our Google Group: data-commons-toolchain@googlegroups.com\r\n",
"note": "Don't delete this file! It's used internally to help with page regeneration."