2025 Berkeley Analytics Lab Showcase
From posters to app demos and interactive features, the Berkeley Analytics Lab Showcase is a culmination of hands-on analysis and research conducted by Berkeley Analytics students, who utilize cutting-edge analytical methods and quantitative tools to tackle real-world business and industry challenges.
This showcase explored the transformative power of analytics across an array of industries, from sports and entertainment to the forefront of fashion, finance, generative AI, healthcare, and beyond.
Guest judges
A special thanks to our guest judges:
- Serah Varghese, Data Scientist, Visa, c/o 2023
- Allen Zhou, Data Engineer, Supply Chain Analytics, CelLink Corporation, c/o 2024
- Megha Setia, MTS Info Arch & BI Engineering, Rambus, c/o 2024
- Pryanka Ravichandran, Data Analyst, Inara AI, c/o 2024
- Kashin Shah, Data Engineer, Chartmetric, c/o 2023
- Apurva Arni, Senior Program Manager, Tesla, c/o 2023
- Sonia Sun, Analyst, Mercy Hospital, c/o 2023
- Ian Howard, Data Scientist, BASF, c/o 2023
- Hyunjoon Kweon, Info Architecture & Business Intelligence Engineer, Rambus, c/o 2024
- Rishi Banerjee, Business Analyst, Nihilent Inc., c/o 2023
- Jack O’Donoghue, Lead of Data Science & Analytics, Hotel Trader, c/o 2023
- Lixin Pan, Data Analyst, Tesla, c/o 2023
2025 Winners
First Place
Group 6: Predicting Bond Liquidity
Team: Vaishali Senthil, Jiayi Mo, Jingyi Zhou, Yuhan Duan, Xiaoqian Ding
Description: This project addresses the challenge of modeling and incorporating bond market liquidity risk and regime shifts into investment decisions—an area often overlooked due to limited real-time indicators. We develop a Bond Liquidity Index using a hybrid approach that combines queuing theory and Hidden Markov Models (HMMs) to identify liquidity regimes and simulate market dynamics, along with LSTM models to forecast liquidity patterns. In parallel, we explore sentiment-based signals from financial news and social media to enhance market awareness. Designed for portfolio managers, institutional investors, and policymakers, the index is integrated into an optimal portfolio framework that accounts for return, volatility, and liquidity. Market stress tests are visualized for investor benefit.
Second Place
Group 10: Intelligent Job Recommendation System
Team: Yuanjun Lin, Jinyi Xu, Xinye Guo, Jingxing Gao, Danielle Yang, Xi Zhang
Description: Our project is called HireBot, a smart and easy-to-use job recommendation tool built for job seekers who want better matches based on their skills and salary needs. We use GloVe word embeddings and cosine similarity to turn resumes and job postings into numbers, then calculate a score that balances skill fit and salary expectations using a parameter λ, which is improved over time with reinforcement learning. The tool also uses a genetic algorithm to simulate how people apply for jobs and learn better search strategies. Users just upload their resume through a simple Streamlit web interface, and the system automatically pulls out key info like skills and experience, scores the matches, and shows the top 3 jobs. The results are clear, easy to understand, and require no tech background. In short, HireBot helps job seekers find better opportunities faster and smarter.
Third Place
Group 1: Wildfire Risk Analysis & Prediction Platform
Team: Mark Li, Hanqi Wang, Veer Arora, Patrick Connor, Guoqian Zeng, Yi Ouyang
Description: Our project focuses on identifying and predicting wildfire risk in California. By training our model on relevant climate factors, such as temperature and humidity indices, we can identify geographic areas of higher fire risk. This model, along with our analysis of historical CalFire data and an image classifier, was incorporated into a user-friendly interactive tool, so that users can identify local fire risk and risk factors. All of these features, with built-in AI assistance, allow even non-technical audiences to access relevant wildfire risk information that previously might have been inaccessible, or only available to local administrators and fire departments. Additionally, the interactive model allows for simulation planning, because users can input custom climate conditions, future climate conditions, or even current weather data at their current location.
2025 Student Projects
Team: Mark Li, Hanqi Wang, Veer Arora, Patrick Connor, Guoqian Zeng, Yi Ouyang
Description: Our project focuses on identifying and predicting wildfire risk in California. By training our model on relevant climate factors, such as temperature and humidity indices, we can identify geographic areas of higher fire risk. This model, along with our analysis of historical CalFire data and an image classifier, was incorporated into a user-friendly interactive tool, so that users can identify local fire risk and risk factors. All of these features, with built-in AI assistance, allow even non-technical audiences to access relevant wildfire risk information that previously might have been inaccessible, or only available to local administrators and fire departments. Additionally, the interactive model allows for simulation planning, because users can input custom climate conditions, future climate conditions, or even current weather data at their current location.
Team: Xinpeng Qu, Yun Zhang, Jiarui Wen, Yuqi Chen, Chenxiao Wang, Ziling Feng
Description: Our project introduces a personalized itinerary recommendation system that helps users plan the perfect day out—whether they’re tourists, locals, or planning a date. By analyzing Yelp business data and user reviews, our tool suggests 10–15 locations—restaurants, cafés, theaters, and more—and generates an optimized travel route based on individual preferences. Behind the scenes, we combine collaborative filtering and natural language processing with semantic embeddings to model user interests. To fuse insights from different recommendation models, we apply an optimal transport approach, producing a final list that balances behavioral patterns and semantic meaning. Tested in Philadelphia, our solution demonstrates real-world potential to support both users and local businesses through intelligent, data-driven planning.
Team: Evelyn Liang, Boyuan Lai, Tiantuo Wang, Jingyi Ying, Tianyu Qi, Yihan Yan
Description: Our project is a rental property search and recommendation system designed for prospective renters and property seekers in California, offering intuitive filters for ZIP code, square footage, bedrooms, bathrooms, garage, and amenities to narrow down options by location, size, and lifestyle preferences. Our system begins with extensive data preprocessing, transforming ZIP codes into continuous socio‑economic features like median income and population density, and calculating distances to downtown via the Haversine formula to enrich model inputs. We then evaluated a wide range of models—including linear regressions with Lasso and Ridge, Decision Trees, Random Forests, XGBoost, SVR, MLPs, and RNNs—using MAE, RMSE, and MAPE to balance accuracy, generalization, interpretability, and computational efficiency. After rigorous comparison, Random Forest stood out with a MAPE of 6.82%, offering the best trade‑off between low error and robustness to noise. Leveraging this optimized model, we predict fair market rents, compute a Deal Score as the ratio of predicted to actual prices, and rank listings to spotlight underpriced opportunities.
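The distance feature and Deal Score described above can be illustrated with a minimal sketch. It assumes coordinates in decimal degrees and a listing represented as a plain dictionary; the function names, the dictionary keys, and the sample values used with them are hypothetical and are not taken from the team's code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def deal_score(predicted_rent, actual_rent):
    """Ratio of predicted to actual rent; values above 1 suggest an underpriced listing."""
    if actual_rent <= 0:
        raise ValueError("actual_rent must be positive")
    return predicted_rent / actual_rent


def rank_listings(listings):
    """Sort listings (dicts with hypothetical 'predicted' and 'actual' keys) by Deal Score, best first."""
    return sorted(
        listings,
        key=lambda item: deal_score(item["predicted"], item["actual"]),
        reverse=True,
    )
```

Under this reading, a listing whose predicted fair rent is 2,200 against an asking rent of 2,000 scores 1.1 and ranks ahead of a listing scoring below 1.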
Description: Exploring the intersection of crime rates and rental prices. This analysis sheds light on how safety influences housing markets.
Team: Chen Liang, Abhiraj Singh, Winnie Wu, Qinchen Yao, Xinmeng Huang
Description: ResumeTailor is a browser-based assistant that transforms a candidate’s resume to match specific job postings in seconds. Users paste their resume and job description into a clean interface; our fine-tuned Qwen-2.5-7B model (enhanced through Low-Rank Adaptation) then intelligently rewrites work experience bullets to mirror the role’s language while preserving authentic career narratives. A split-screen view displays original and tailored text side-by-side, with multiple metrics quantifying the improvement in alignment. Because the model is lightweight enough for consumer GPUs, the solution is accessible for campus career centers and small HR teams without steep cloud costs. Strict prompt engineering prevents fabrication, ensuring every line remains grounded in the user’s actual experience while still optimizing for job description relevance.
Team: Vaishali Senthil, Jiayi Mo, Jingyi Zhou, Yuhan Duan, Xiaoqian Ding
Description: This project addresses the challenge of modeling and incorporating bond market liquidity risk and regime shifts into investment decisions—an area often overlooked due to limited real-time indicators. We develop a Bond Liquidity Index using a hybrid approach that combines queuing theory and Hidden Markov Models (HMMs) to identify liquidity regimes and simulate market dynamics, along with LSTM models to forecast liquidity patterns. In parallel, we explore sentiment-based signals from financial news and social media to enhance market awareness. Designed for portfolio managers, institutional investors, and policymakers, the index is integrated into an optimal portfolio framework that accounts for return, volatility, and liquidity. Market stress tests are visualized for investor benefit.
Team: Lingxin Li, Ye Zhou, Mitchell Wu, Wencong Wang, Saivenkata Nagavyjayanthi Polapragada, Raj Sunil Peswani
Description: A commercial real estate recommendation system designed to help users discover high-value property listings across the U.S. The system generates personalized recommendations based on inputs such as location, budget, ROI, and proximity to transit and airports, using TF-IDF and cosine similarity. An unsupervised deal score model, built with a Random Forest and quantile-based price estimates, highlights fairly priced listings. The solution is delivered through an interactive web application that combines recommendation and deal evaluation features for efficient decision-making.
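To make the TF-IDF and cosine similarity step concrete, here is a small pure-Python sketch. It assumes listings and the user query are short free-text strings split on whitespace, and it uses a common smoothed IDF; the function names are hypothetical, and the team's actual feature construction is not described in the source.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using a smoothed IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(query, listings, top_k=3):
    """Rank listing texts by similarity to the query text, most similar first."""
    vectors = tfidf_vectors(listings + [query])
    query_vec = vectors[-1]
    scored = [
        (cosine_similarity(query_vec, vec), text)
        for vec, text in zip(vectors, listings)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]
```

A listing that shares no terms with the query scores exactly 0, so it falls to the bottom of the ranking.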
Team: Xie Wu, Lucy Lin, Vivian Yeh, Jiayi Jiang, Yidan Ma, Yongkang Zhang
Description: San Francisco Commercial Site Recommendation is a data-driven tool that helps users identify ideal commercial properties based on six input features: commercial type, max price, tax liability, preferred population, minimum businesses, and AGI. It combines a K-Nearest Neighbors baseline model with a Path-Based Knowledge Graph to capture both feature similarity and geographic connectivity. The system returns top-matching locations enriched with insights comparing local and city-wide averages. This dual approach supports real estate agents and investors in making informed, location-sensitive decisions aligned with client needs.
Team: Chicheng Xu, Hengzhou Li, Haoxuan Qu, Meixuan Li, Olivia Zhang, Siddharth Sunil Salian
Description: This project supports football club managers in making data-driven recruitment and budgeting decisions by building a recommendation and forecasting system. Using historical FIFA data, we trained an XGBoost model to predict player market value and computed a value ratio (FIFA overall rating divided by predicted market value) to identify high-efficiency players. A collaborative filtering model based on key performance features suggests similar players with strong value potential. To aid long-term financial planning, we employed ARIMA models to forecast each player’s market value over the next five years. An interactive Streamlit dashboard enables users to filter by club, position, and budget while exploring value-optimized player recommendations and future market value trends. This tool empowers clubs to make informed, cost-effective decisions in player acquisition and retention.
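The value ratio described above (FIFA overall rating divided by predicted market value) can be sketched as follows. The player records, the unit of market value (millions), and the function names are hypothetical; the predicted values would come from the team's XGBoost model, which is not reproduced here.

```python
def value_ratio(overall_rating, predicted_value_millions):
    """FIFA overall rating divided by predicted market value (assumed to be in millions)."""
    if predicted_value_millions <= 0:
        raise ValueError("predicted market value must be positive")
    return overall_rating / predicted_value_millions


def top_value_players(players, budget_millions, k=3):
    """Return up to k affordable players, highest value ratio first.

    Each player is a dict with hypothetical keys 'name', 'overall', and 'predicted_value'.
    """
    affordable = [p for p in players if p["predicted_value"] <= budget_millions]
    return sorted(
        affordable,
        key=lambda p: value_ratio(p["overall"], p["predicted_value"]),
        reverse=True,
    )[:k]
```

A higher ratio means more rating per unit of predicted cost, which is one plausible reading of a "high-efficiency" player.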
Team: Yuanjun Lin, Jinyi Xu, Xinye Guo, Jingxing Gao, Danielle Yang, Xi Zhang
Description: Our project is called HireBot, a smart and easy-to-use job recommendation tool built for job seekers who want better matches based on their skills and salary needs. We use GloVe word embeddings and cosine similarity to turn resumes and job postings into numbers, then calculate a score that balances skill fit and salary expectations using a parameter λ, which is improved over time with reinforcement learning. The tool also uses a genetic algorithm to simulate how people apply for jobs and learn better search strategies. Users just upload their resume through a simple Streamlit web interface, and the system automatically pulls out key info like skills and experience, scores the matches, and shows the top 3 jobs. The results are clear, easy to understand, and require no tech background. In short, HireBot helps job seekers find better opportunities faster and smarter.
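One way to picture a score that balances skill fit and salary expectations with a parameter λ is a weighted average of the two. The sketch below is an assumption, not HireBot's actual formula: the convex combination, the `salary_fit` definition, and the toy vectors standing in for GloVe-based embeddings are all hypothetical, and λ is fixed here rather than learned with reinforcement learning.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def salary_fit(offered, expected):
    """Hypothetical salary term: 1.0 when the offer meets the expectation, lower otherwise."""
    if expected <= 0:
        raise ValueError("expected salary must be positive")
    return min(offered / expected, 1.0)


def match_score(resume_vec, job_vec, offered, expected, lam=0.7):
    """Assumed form: lam * skill similarity + (1 - lam) * salary fit."""
    return lam * cosine(resume_vec, job_vec) + (1 - lam) * salary_fit(offered, expected)


def top_jobs(resume_vec, expected_salary, jobs, lam=0.7, k=3):
    """Return the k best-scoring jobs; each job is a dict with 'title', 'vec', and 'salary'."""
    scored = [
        (match_score(resume_vec, job["vec"], job["salary"], expected_salary, lam), job["title"])
        for job in jobs
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
```

With λ close to 1 the ranking is driven almost entirely by skill similarity; lowering λ gives salary expectations more weight.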
Team: Adithya Bhat, Jingying Liu, Gerson Aaron Morales Deras, Juncheng Xu, Yingyu Zhu
Description: Soccer is the world’s most popular sport, and chances are you know at least one star player by name. Many people have dreamed of becoming professional players. However, have you ever wondered what happens behind the scenes? Do you know how teams scout talent, negotiate transfers, and balance budgets? In this project, you will become a soccer club manager: using our predictive model to estimate player market values, you will build your ideal European team with an imaginary budget in mind.
Are you ready to discover which factors influence a player’s price tag? Explore our interactive tool to experiment with different attributes, assemble a lineup, and see what it takes to run a professional soccer club!
Team: Zumin Chen, XuangYu, Yuqi Shen, Wenfei Tan, Pratham Gupta
Description: This interactive mapping tool helps future store owners in San Francisco make data-driven location decisions. By combining foot traffic data from the city’s 10 most popular streets with Yelp business information, users can explore dynamic trends, analyze competitor presence, and visualize neighborhood business composition through adjustable filters and visualizations. Our goal is to turn complex location planning into an intuitive, insight-rich experience. Ultimately, our platform bridges data analytics with real-world planning, empowering entrepreneurs to choose store locations that align with their goals and market strategy.
Team: Simeng Wang, Fujia Sun, Jingwen Jia, Tianze Liu, Yihan Zhou
Description: Our project focuses on forecasting short-term return trends for the three major currency pairs (GBP, EUR, JPY) to support trading and investment decisions. Using data from Dewey Global FX Rates, Wall Street Journal FX data, and Central Charts Forex Exchange Data, we processed and standardized exchange rates to ensure reliable trend detection. We also designed and compared traditional models such as ARIMA, GARCH-ARIMA, and VAR with a Long Short-Term Memory (LSTM) neural network. LSTM models successfully captured both linear and non-linear patterns in the highly volatile FX market, outperforming traditional econometric methods. In addition, we designed a trading dashboard that translates model outputs into actionable strategies and is convenient even for non-technical users. Our approach demonstrates the practical potential of deep learning to enhance currency trading strategies.
Team: Haoning Wang, Shuo Chen, Yilin Chen, Sirui Huang, Wushuang Li, Yifei Li
Description: Our project is aimed at investors interested in event forecasting and offers a quantitative approach to event-driven investing. We focus on the gas market, using event predictions from the platform Kalshi to guide investment decisions. By forecasting gas prices, we provide users with actionable insights that can help them profit from events listed on Kalshi. In addition, we screen a variety of gas-related financial instruments, such as stocks, bonds, and derivatives, to support investment strategies. To better meet our clients’ needs and maximize their benefits, our models optimize portfolios while balancing risk and return.
Team: Jim Cao, Shiyunyang Zhao, Yuqin Yang, Luis Schmitz, Atharv Raturi, Monika Voutov
Description: FinOpt is an interactive portfolio optimization tool built as a graduate capstone project at UC Berkeley. The app helps users design customized investment portfolios across diverse asset classes, including stocks, bonds, real estate, gold, and automobiles. Using predicted returns, risk metrics, and user-defined preferences, it solves for optimal allocations that balance return and volatility. The tool supports indivisible assets, buy-in constraints, and utility-based optimization. Users can explore asset performance, visualize portfolio breakdowns, and compare outcomes against market benchmarks like the S&P 500. Built with Python, Gurobi, and Streamlit, FinOpt transforms complex financial modeling into an intuitive, user-friendly experience. The tool also includes simulation features and benchmarking insights to support long-term investment decisions.
Team: Honglin Zhu, Zhaoling Zou, Ruofei Fan, Yunyang Zhang, Zhuojin Yu, Sizhe Tang
Description: Our project, Salary Scope, aims to predict salary ranges based on job-related features such as industry, location, remote status, and company size. By leveraging regression and classification models—including Linear Regression, Random Forest, and XGBoost—we estimate salaries and uncover disparities across industries and regions. A key innovation lies in our data preprocessing with log transformation and segmentation, which improves model accuracy and interpretability. Our final model, an XGBoost Classifier, achieved ~65% accuracy and excelled at identifying the “Medium” salary class. We developed an interactive web tool where users input job and personal details to receive tailored salary predictions and position recommendations. This project supports HR professionals and job seekers in understanding salary benchmarks and navigating compensation expectations. Future plans include expanding the dataset and integrating real-time scraping for more dynamic insights.
Team: Vedaant Agarwal, Vandana Mathi, Jingwen Liu, Rohit Pugazhendi, Lingye Chen, Tzu-Yang Lin
Description: Our project, S.N.A.P. (Spectrogram-based Neural Accent Predictor), focuses on classifying English accents using deep learning. By converting voice recordings into mel spectrograms—visual representations of audio—we train a ResNet34 model to distinguish different regional English accents. This enables applications in customer service, language research, and voice-enabled technology. The model is designed with adaptability and fairness in mind, aiming to reduce AI bias by learning from diverse voice data. Ultimately, the project enhances voice-based interactions by adding cultural and regional awareness to AI systems.
Team: Sankalp N V, Hongye (Hypatia) Pan, Harper Li, Leslie Hu, Konstantin Zhivotov
Description: TubeSense is an AI-driven platform built to empower the next generation of YouTube creators. By analyzing historical video data, we uncover what makes content stand out — from compelling titles to attention-grabbing thumbnails. Our system uses advanced models like DistilBERT and a custom CNN to deliver high-accuracy predictions on video popularity. TubeSense doesn’t just offer data; it delivers smart, real-time recommendations that creators can act on immediately. Whether you’re refining a title or choosing between thumbnails, TubeSense provides clear guidance to boost engagement. In a rapidly changing digital landscape, TubeSense gives creators a competitive edge. Our mission is simple: help creators grow smarter, faster, and more sustainably.
Team: Qilian Wu, Chuyun Deng, Jiaqi Cheng, Jiayi Li, Keyou Wang, Cindy Christina Yang
Description: Our project analyses previous market data on rental prices in California to find trends and make predictions. By preprocessing data, doing exploratory data analysis, and feeding our features into different models, we were able to gather surprising insights into what affects rental prices in California. For example, we learned that older properties actually had higher rental prices (contrary to expectations), due to their historical significance.
Our end product is an interactive dashboard where you can toggle different features of a typical house/apartment (number of bedrooms, bathrooms, amenities, doormen, area, etc.) and the model will output a prediction of the rental price based on your customization. Given the unpredictability of the economy and housing market, our model gives deeper insight into the expected value of properties and brings comfort to those who are looking to secure a roof over their head.
Team: Haotian Chen, Vincent Karpf, Sri Lahari Dwadasi, Yunpu Zhao, Guang Yang, Xuanru Yue
Description: Our project builds a personalized career learning roadmap system that helps users transition into their dream job. Users begin by uploading their resume and selecting a target job title. The system analyzes their current skill set, identifies skill gaps using job-specific knowledge graphs, and recommends targeted skills with focus and confidence scores. Based on this, we curate learning modules with relevant online resources and generate a weekly schedule tailored to the user’s availability. The platform supports continuous progress tracking and dynamic updates as the user advances. Ultimately, we aim to bridge the gap between career aspirations and actionable learning paths through intelligent planning and personalized guidance.
Team: Daniel Huang, Huiyi He, Yuchen Zhang, Xingjian Liu, Corey Lin, Luowei Wang
Description: As part of UC Berkeley’s INDENG 243 capstone with Almanax, our team developed a real-time risk dashboard for NEAR’s decentralized AI ecosystem. NEAR AI enables developers to create and deploy AI agents on-chain, but the rapid growth of these agents introduces critical governance challenges—ranging from smart contract vulnerabilities to misuse of user data. To address this, we engineered a robust machine learning pipeline that combines on-chain activity, external security metrics, and Almanax analytics to quantify behavioral risk.
Our final output is a transparent, explainable dashboard powered by a stacked XGBoost ensemble, which classifies agents into Low/Medium/High risk tiers with SHAP-based interpretability. The dashboard empowers ecosystem stakeholders—including governance councils, auditors, and developers—with tools to filter, rank, and investigate agent behavior based on risk indicators. This not only enhances decision-making and enforcement transparency, but also builds trust in the broader AI-on-blockchain movement by ensuring human-in-the-loop accountability and preventing unjust penalties. Ultimately, the dashboard acts as a foundation for ethical, scalable governance in decentralized AI systems.
Team: Linzhe Wu, Anzhou Wang, Zixuan Li, Chenyang Wang, Huei-Sin Liu
Description: BerkeleyChoice is a personalized course recommendation platform developed for UC Berkeley students to optimize academic planning and support career development. The system generates comprehensive student profiles by analyzing resumes, demographic information, and career-oriented survey responses. It employs a fine-tuned SBERT-based NLP model alongside clustering and multi-target classification to predict skill gaps and recommend courses tailored to individual goals. Each recommendation is transparent, showing clear connections between skills and course content. This streamlined and adaptive system empowers students to make informed decisions, avoid redundant coursework, and efficiently acquire career-aligned skills.
Description: Recommending products through influencer insights to enhance online shopping experiences.
Team: Derek Shih, Jingwen Zhang, Shelly Wei, Karlie Shao, Miranda Du
Description: Our project establishes a data-driven platform that connects the micro-level consumer experience (prospective car buyers) with macro-level dealership operations (car dealerships and inventory managers). On the consumer side, we offer personalized vehicle recommendations: a multi-output XGBoost classifier delivers tailored suggestions on Brand, Vehicle Class, and Fuel Type simultaneously. Beyond recommendations, we also built a “Find Your Community” feature that uses an autoencoder plus clustering approach to group similar buyers based on their personal information and demographic profiles. On the dealer side, we built a Sales Forecasting & Inventory Planning system to help decision makers minimize the cost of over- and under-stocking. We first apply SARIMA models to forecast state-by-state car purchase trends. After tuning, SARIMA delivers low Mean Absolute Error (MAE) while accurately projecting next-year demand. Building on this model, we implement Neural Collaborative Filtering (NCF) to recommend which vehicle Brands and Styles dealerships should stock in each region (based on the dealership’s ZIP code).