Passionate about the intersection of Data Science, Statistical Inference, and AI.
Currently building a rock-solid theoretical foundation while getting my hands dirty with real-world data — from raw sensor noise to actionable insight.
- The Full Data Pipeline — I enjoy the grind of cleaning messy data, performing EDA, and drawing conclusions through statistical inference
- Industrial Time Series — Analyzing patterns, detecting anomalies, and forecasting in real industrial environments
- Statistical Rigor — Using hypothesis testing, regression, and mathematical reasoning to build honest, reproducible analyses
Languages
Libraries & Tools
Anomaly detection and predictive modeling on real industrial sensor data
Two end-to-end analyses on public industrial datasets — one for a solar generation plant, one for a water treatment facility.
| | Solar Plant | Water Treatment Plant |
|---|---|---|
| Goal | Identify underperforming inverters & estimate post-repair ROI | Build a Digital Twin for 3 critical sensors during scheduled panel replacement |
| Key techniques | IQR outlier removal, time interpolation, inverter performance ranking, Random Forest regressor | Isolation Forest (multivariate anomaly detection), lag hypothesis testing, Linear Regression vs. Random Forest pipeline |
| Highlight | Estimated yearly revenue recovery per faulty inverter repaired | 76% of DQO-S (chemical oxygen demand) predictions within a 10% error margin using Random Forest |
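
Two of the ideas in the table can be sketched in a few lines. This is a minimal illustration on made-up numbers, not the code from either project: `clean_sensor_series` and `share_within_tolerance` are hypothetical helper names, the 1.5 × IQR fence is the conventional default rather than a value taken from the analyses, and "within 10%" is assumed to mean relative error against the true value.

```python
import numpy as np
import pandas as pd


def clean_sensor_series(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Mask IQR outliers as NaN, then fill the gaps by time-based interpolation.

    Assumes `series` has a DatetimeIndex; k=1.5 is the usual fence multiplier.
    """
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Values outside [lower, upper] become NaN instead of being dropped,
    # so the time grid stays intact for interpolation.
    masked = series.where(series.between(lower, upper))
    return masked.interpolate(method="time")


def share_within_tolerance(y_true, y_pred, tol: float = 0.10) -> float:
    """Fraction of predictions whose relative error is at most `tol`.

    Assumption: error is measured relative to the true value, which must be non-zero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rel_err = np.abs(y_pred - y_true) / np.abs(y_true)
    return float(np.mean(rel_err <= tol))


# Synthetic 15-minute readings with one obvious spike (made-up values).
index = pd.date_range("2020-01-01 10:00", periods=8, freq="15min")
readings = pd.Series([10.0, 11.0, 500.0, 13.0, 14.0, 15.0, 16.0, 17.0], index=index)
cleaned = clean_sensor_series(readings)
print(cleaned)
```

Masking rather than dropping the outliers keeps the timestamps evenly spaced, which matters if later steps (lag features, for example) assume a regular grid.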
Full-scale statistical study on student habits and academic performance — designed, collected, and analyzed from scratch
A team research project where we surveyed 100 university students and put 4 popular myths to the test.
Hypotheses tested (with t-tests, α = 0.05):
- Students perceive themselves as more stressed than the scale midpoint (confirmed, p = 0.007)
- High-performing students sleep less (not confirmed — they actually sleep slightly more)
- Top students don't do sports (not confirmed — sport habits are nearly identical across grade groups)
- Students who study more consume more stimulant drinks (not confirmed)
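
The first hypothesis compares a sample mean against a fixed value, which is the setting of a one-sample t-test. Below is a minimal sketch using `scipy.stats.ttest_1samp`. The 1-to-5 scale, its midpoint of 3, the one-sided alternative, and the example answers are all assumptions for illustration, not the survey data, and `mean_exceeds_midpoint` is a hypothetical helper name.

```python
from scipy import stats


def mean_exceeds_midpoint(scores, midpoint, alpha=0.05):
    """One-sided, one-sample t-test of H0: mean <= midpoint vs H1: mean > midpoint.

    Returns (t statistic, p-value, whether H0 is rejected at level alpha).
    """
    result = stats.ttest_1samp(scores, popmean=midpoint, alternative="greater")
    return float(result.statistic), float(result.pvalue), bool(result.pvalue < alpha)


# Made-up answers on a hypothetical 1-5 stress scale (midpoint 3).
example_scores = [4, 4, 5, 3, 4, 5, 4, 4, 3, 5]
t_stat, p_value, reject = mean_exceeds_midpoint(example_scores, midpoint=3)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```

The other three hypotheses compare groups of students, so they would presumably call for a two-sample test such as `scipy.stats.ttest_ind` instead.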
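Best regression model: `grade ~ study_hours + sleep_hours` → adjusted R² = 0.25, meaning study and sleep hours together explain roughly a quarter of the variance in grades.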
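
For reference, adjusted R² penalizes plain R² for the number of predictors p given n observations: adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1). The sketch below fits an ordinary least squares model with NumPy and reports both values. The data are randomly generated stand-ins, not the survey responses, and the function names are hypothetical.

```python
import numpy as np


def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R² for n observations and p predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, R², adjusted R²)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    return coef, r2, adjusted_r2(r2, n, p)


# Synthetic stand-in for (study_hours, sleep_hours) -> grade; values are made up.
rng = np.random.default_rng(0)
X_demo = rng.uniform(low=[0.0, 5.0], high=[6.0, 9.0], size=(100, 2))
y_demo = 5.0 + 0.4 * X_demo[:, 0] + 0.2 * X_demo[:, 1] + rng.normal(0.0, 1.0, size=100)
coef, r2, r2_adj = fit_ols(X_demo, y_demo)
print(f"R² = {r2:.3f}, adjusted R² = {r2_adj:.3f}")
```

With only two predictors and 100 observations the adjustment is small, so the two values should sit close together.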
📍 Puertollano / Ciudad Real · Open to internships and collaborations