Pinned
-
meta-ads-medallion-pipeline (Public)
Production-grade Meta Ads analytics pipeline: Airbyte → S3 Bronze → Apache Hudi Silver → PostgreSQL Gold star schema, built on PySpark and AWS Glue.
Python
-
YT_ViralityScore_AWS_Glue- (Public)
End-to-end YouTube trending analytics on AWS: S3 + Glue ETL + Data Catalog + Athena + CloudWatch, with a custom virality score computed in PySpark.
Python
-
Truth-Seeker (Public)
End-to-end fake-news detection: DistilBERT fine-tuned with LoRA adapters on AWS SageMaker, served via Flask with a Groq LLM reasoning layer and a React frontend.
Python
-
aws-production-data-platform (Public)
Production-grade AWS data platform on EKS: Terraform, EMR-on-EKS, Airflow, CI/CD, and AI-driven monitoring. Bronze/Silver/Gold data lake, one-command deploy, IRSA auth.
HCL
-
ecommerce-analytics-gcp (Public)
End-to-end GCP analytics: GCS → BigQuery → dbt staging and marts → Looker Studio. Star-schema modelling of the Olist Brazilian e-commerce dataset.
-
real-time-ecommerce-event-streaming-pipeline (Public)
Real-time e-commerce event pipeline: Kafka + PySpark Structured Streaming with exactly-once semantics, watermarked windowed aggregations, and dual-sink output to PostgreSQL.
Python