About me

I am an Applied Scientist at Amazon, where I build and deploy large-scale machine learning systems to optimize global hiring decisions as part of the Intelligent Talent Acquisition (ITA) team.

Previously, I earned a Double Master’s Degree with Honors in Machine Learning and Data Science from Politecnico di Milano and Universidad Politécnica de Madrid through the EIT Digital program. My academic track blended strong technical training with a focus on innovation and entrepreneurship, which deepened my interest in startups and applied AI.

My research spans generative AI, uncertainty estimation, information retrieval, NLP, recommender systems, and LLM optimization, with a hands-on emphasis on practical machine learning. I’ve published multiple peer-reviewed papers and was awarded 1st place in the 2023 ACM RecSys Challenge. I enjoy working at the intersection of applied research and scalable system design, especially in areas where LLMs and retrieval-based architectures meet real-world constraints.

I’m particularly passionate about innovation, early-stage product development, and using AI to solve real business problems, often drawing on my experience contributing to AI-powered startup prototypes and RAG pipelines.

Selected Publications

  • Monte Carlo Temperature: a robust sampling strategy for LLM’s uncertainty quantification methods
    TrustNLP @ NAACL 2025 and Quantify Uncertainty and Hallucination in Foundation Models @ ICLR 2025 Workshops
    with Andrea Bacciu, Ignacio Fernández Tobías, Amin Mantrach
    Paper

  • Leveraging Semantic Embeddings of User Reviews with Off-the-Shelf LLMs for Traditional Recommender Systems
    IIR 2024
    with Andrea Pisani, Maurizio Ferrari Dacrema, Paolo Cremonesi
    Paper

  • Pre-Trained LLM Embeddings of Product Reviews for Recommendation
    IIR 2024 (co-author)
    Paper

  • Pessimistic Rescaling and Distribution Shift of Boosting Models for Impression-Aware Online Advertising Recommendation
    RecSys Challenge 2023 (co-author)
    Paper

Experience

  • Applied Scientist @ Amazon
    Edinburgh, UK (May 2025 – Present)
    • Build and deploy ML models (GBDTs, two-tower, contextual bandits) to improve hiring outcomes at scale
    • Develop large-scale data pipelines with Apache Spark and AWS
    • Write production-grade Python code in cross-functional teams
  • Corporate Relations & Software Engineer @ Lead The Future
    Remote (Mar 2025 – Present)
    • Lead corporate sponsorships and fundraising initiatives
    • Develop a RAG-based solution for enhanced information access
  • Applied Scientist Intern @ Amazon
    Madrid, Spain (Sep 2024 – Feb 2025)
    • Researched uncertainty estimation and diverse generation in LLMs
    • Reduced batch inference time from 24h to 2h through optimization
    • First-author paper accepted at NAACL ’25 and ICLR ’25 workshops
    • Received an “Inclined to Offer” outcome and a full-time return offer
  • Machine Learning Research Student @ Politecnico di Milano – ContentWise – RecSys Lab
    Milan, Italy (Dec 2023 – Aug 2024)
    • Developed LLM-based recommendation algorithms
    • Published two research papers, including a first-author contribution
    • Awarded honors for thesis on personalized content recommendation
  • Founding AI Engineer @ Mosaic (Remote Startup)
    Nov 2023 – Sep 2024
    • Designed scalable RAG systems for investment banking
    • Built training pipelines and led document analysis research
    • Contributed UI prototypes (Figma) and market validation strategy

Education

  • M.Sc. in Computer Science and Engineering
    Politecnico di Milano & Universidad Politécnica de Madrid (2022–2024)
    Graduated with honors from both universities.
    • Focus: Machine Learning, Data Science, Infrastructure for Large-Scale Data
    • EIT Digital specialization in Innovation and Entrepreneurship
    • Summer School: Technical University of Munich – Siemens
  • B.Sc. in Computer Science and Engineering
    Politecnico di Milano (2019–2022)

Projects & Competitions

  • 1st Place – ACM RecSys Challenge 2023 (Academic leaderboard)
    GitHub
    • Participated in a global competition hosted by ACM and ShareChat (400M+ users)
    • Designed ML pipelines to predict ad installs based on user impressions
  • Stock Market Time Series Prediction
    GitHub
    • Grade: 10/10 with honors
    • Forecasted Apple stock prices using LSTMs and CNNs over S&P 500 data
  • Massively Parallel ML Algorithms
    GitHub
    • Built a scalable logistic regression classifier (1M-row botnet dataset)
    • Implemented parallel K-Means on MNIST using Apache Spark and MapReduce
  • Summer School: TUM – Siemens (Data Trading Project)
    • Managed industrial IoT data workflows and designed data architecture
    • Delivered a business creation and investor pitch
  • Recommender Systems – PoliMi Competition
    GitHub
    • Predicted user interaction with TV content
    • Ranked 2nd (public) and 3rd (private) out of 100+ teams
  • Artificial Neural Networks & Deep Learning – PoliMi Competition
    GitHub
    • Graded 5/5 for all tasks
    • Built a BLSTM for multivariate time series classification
    • Created a CNN-based leaf species classifier with transfer learning

Honors & Awards

  • 1st Place, ACM RecSys Challenge 2023
  • EIT Digital Master’s Scholarship
  • Sergio Maffezzoni Merit Scholarship (2024)
  • 3rd Place, Politecnico RecSys Competition (2022)

Contact

I’m open to collaborations and discussions. 📧 cecere DOT nicola2000 AT gmail DOT com